Polish orthography

Polish orthography is the system of writing the Polish language. The language is written using the Polish alphabet, which derives from the Latin alphabet, but includes some additional letters with diacritics. The orthography is mostly phonetic, or rather phonemic – the written letters (or combinations of them) correspond in a consistent manner to the sounds, or rather the phonemes, of spoken Polish. For detailed information about the system of phonemes, see Polish phonology.

Polish alphabet

Main article: Polish alphabet

The diacritics used in the Polish alphabet are the kreska (graphically similar to the acute accent) in the letters ć, ń, ó, ś, ź; the kreska ukośna (stroke) in the letter ł; the kropka (overdot) in the letter ż; and the ogonek ("little tail") in the letters ą, ę. There are 32 letters in the Polish alphabet: 9 vowels and 23 consonants.

The Polish alphabet. Grey indicates letters not used in native words.

Letter		Name
A	a	a
Ą	ą	ą
B	b	be
C	c	ce
Ć	ć	cie
D	d	de
E	e	e
Ę	ę	ę
F	f	ef
G	g	gie
H	h	ha
I	i	i
J	j	jot
K	k	ka
L	l	el
Ł	ł	eł
M	m	em
N	n	en
Ń	ń	eń
O	o	o
Ó	ó	ó or o z kreską
P	p	pe
R	r	er
S	s	es
Ś	ś	eś
T	t	te
U	u	u
W	w	wu
Y	y	y or igrek
Z	z	zet
Ź	ź	ziet
Ż	ż	żet

The letters q (named ku), v (named fau), and x (named iks) do not belong to the Polish alphabet, but are used in some foreign words and commercial names. In loanwords they are often replaced by kw, w, and ks, respectively (as in kwarc "quartz", weranda "veranda", ekstra "extra").

When giving the spelling of words, certain letters may be said in more emphatic ways to distinguish them from other identically pronounced characters. For example, H may be referred to as samo h ("h alone") to distinguish it from CH (ce ha). The letter Ż may be called żet (or zet) z kropką ("Ż with a dot") to distinguish it from RZ (er zet). The letter U may be called u otwarte ("open u", a reference to its graphical form), to distinguish it from Ó, which is sometimes called u zamknięte ("closed u").

Note that (unlike in languages such as French) Polish letters with diacritics are treated as fully independent letters in alphabetical ordering. For example, być comes after bycie. The diacritic letters also have their own sections in dictionaries (words beginning with ć are not usually listed under c).
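The ordering rule can be illustrated with a minimal Python sketch. The names ALPHABET, RANK and polish_sort_key are illustrative assumptions, not part of any standard library, and real applications would normally rely on locale-aware collation instead. Characters outside the 32-letter alphabet (such as q, v, x) are simply placed after all Polish letters here.

```python
# The 32 letters of the Polish alphabet in dictionary order.
ALPHABET = "aąbcćdeęfghijklłmnńoóprsśtuwyzźż"
RANK = {letter: index for index, letter in enumerate(ALPHABET)}

def polish_sort_key(word):
    """Map a word to a list of letter ranks (hypothetical helper)."""
    # Characters outside the alphabet sort after every Polish letter,
    # ordered among themselves by code point.
    return [RANK.get(ch, len(ALPHABET) + ord(ch)) for ch in word.lower()]

words = ["być", "bycie", "łapa", "lupa"]
print(sorted(words, key=polish_sort_key))
# ['bycie', 'być', 'lupa', 'łapa']
```

Because each letter is ranked individually, ć sorts after c and ł after l, which is why być follows bycie.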

Polish additionally uses the digraphs ch, cz, dz, dź, dż, rz, and sz. Digraphs are not given any special treatment in alphabetical ordering. For example, ch is treated simply as c followed by h, and not as a single letter as in Czech.

Spelling rules

Vowels
Grapheme	Usual value	Other values
a	/a/
ą	/ɔ̃/	[ɔn], [ɔŋ], [ɔm]; merges with /ɔ/ before /w/ (see below)
e	/ɛ/
ę	/ɛ̃/	[ɛn], [ɛŋ], [ɛm]; merges with /ɛ/ before /w/ and often word-finally (see below)
i	/i/	[j] before a vowel; marks palatalization of the preceding consonant before a vowel (see below)
o	/ɔ/
ó	/u/
u	/u/	sometimes [w] after vowels
y	/ɨ/

Consonants
Grapheme	Usual value	Voiced or devoiced
b	/b/	[p] if devoiced
c¹	/t͡s/	[d͡z] if voiced
ć¹	/t͡ɕ/	[d͡ʑ] if voiced
cz	/t͡ʂ/	[d͡ʐ] if voiced
d	/d/	[t] if devoiced
dz¹	/d͡z/	[t͡s] if devoiced
dź¹	/d͡ʑ/	[t͡ɕ] if devoiced
dż	/d͡ʐ/	[t͡ʂ] if devoiced
f	/f/	[v] if voiced
g	/ɡ/	[k] if devoiced
h	/x/	[ɣ] if voiced²
ch	/x/	[ɣ] if voiced²
j	/j/
k	/k/	[ɡ] if voiced
l	/l/
ł	/w/
m	/m/
n¹	/n/
ń¹	/ɲ/
p	/p/	[b] if voiced
r	/r/
s¹	/s/	[z] if voiced
ś¹	/ɕ/	[ʑ] if voiced
sz	/ʂ/	[ʐ] if voiced
t	/t/	[d] if voiced
w	/v/	[f] if devoiced
z¹	/z/	[s] if devoiced
ź¹	/ʑ/	[ɕ] if devoiced
ż	/ʐ/	[ʂ] if devoiced
rz	/ʐ/	[ʂ] if devoiced

Voicing and devoicing

Voiced consonant letters frequently come to represent voiceless sounds (as shown in the above tables). This is due to the neutralization that occurs at the end of words and in certain consonant clusters; for example, the ⟨b⟩ in klub ("club") is pronounced like a ⟨p⟩, and the ⟨rz⟩ in prze- sounds like ⟨sz⟩. Less frequently, voiceless consonant letters can represent voiced sounds; for example, the ⟨k⟩ in także ("also") is pronounced like a ⟨g⟩. The conditions for this neutralization are described under Voicing and devoicing in the article on Polish phonology.
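As a rough illustration of word-final devoicing only, the following Python sketch respells the last consonant of a word with its voiceless counterpart from the consonant table. The function name devoice_final and the respelling approach are assumptions made for this example. It works on letters rather than sounds, ignores devoicing inside consonant clusters, and does not handle longer voiced clusters at the end of a word.

```python
# Voiceless digraphs that must be left alone even though they end in "z" or "h".
VOICELESS_DIGRAPHS = ("cz", "sz", "ch")

# Voiced spellings and their voiceless counterparts.
# Digraphs come first so that they are matched before single letters.
FINAL_DEVOICING = [
    ("dz", "c"), ("dź", "ć"), ("dż", "cz"), ("rz", "sz"),
    ("b", "p"), ("d", "t"), ("g", "k"), ("w", "f"),
    ("z", "s"), ("ź", "ś"), ("ż", "sz"),
]

def devoice_final(word):
    """Return an approximate respelling with the final consonant devoiced."""
    if word.endswith(VOICELESS_DIGRAPHS):
        return word
    for voiced, voiceless in FINAL_DEVOICING:
        if word.endswith(voiced):
            return word[: -len(voiced)] + voiceless
    return word

print(devoice_final("klub"))    # klup
print(devoice_final("lekarz"))  # lekasz
print(devoice_final("kosz"))    # kosz (already voiceless)
```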

Palatal and palatalized consonants

The spelling rule for the alveolo-palatal sounds /ɕ/, /ʑ/, /t͡ɕ/, /d͡ʑ/ and /ɲ/ is as follows: before the vowel ⟨i⟩ the plain letters ⟨s z c dz n⟩ are used; before other vowels the combinations ⟨si zi ci dzi ni⟩ are used; when not followed by a vowel the diacritic forms ⟨ś ź ć dź ń⟩ are used. For example, the ⟨s⟩ in siwy ("grey-haired"), the ⟨si⟩ in siarka ("sulphur") and the ⟨ś⟩ in święty ("holy") all represent the sound /ɕ/.

Sound	Word ending position or before a consonant	Before a vowel other than ⟨i⟩	Before ⟨i⟩
/t͡ɕ/	ć	ci	c
/d͡ʑ/	dź	dzi	dz
/ɕ/	ś	si	s
/ʑ/	ź	zi	z
/ɲ/	ń	ni	n
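The three-way rule in the table can be written as a small lookup. This is a minimal Python sketch; the names PALATAL_SPELLINGS, VOWEL_LETTERS and spell_palatal are hypothetical, and the function only chooses the spelling of the consonant itself, given the letter that follows it.

```python
# Spellings of each alveolo-palatal sound in three contexts:
# (not before a vowel, before a vowel other than i, before i).
PALATAL_SPELLINGS = {
    "t͡ɕ": ("ć", "ci", "c"),
    "d͡ʑ": ("dź", "dzi", "dz"),
    "ɕ": ("ś", "si", "s"),
    "ʑ": ("ź", "zi", "z"),
    "ɲ": ("ń", "ni", "n"),
}
# Vowel letters other than i that can follow these sounds.
VOWEL_LETTERS = set("aąeęoóu")

def spell_palatal(sound, next_letter=""):
    """Pick the spelling of an alveolo-palatal sound from its context."""
    final, before_vowel, before_i = PALATAL_SPELLINGS[sound]
    if next_letter == "i":
        return before_i
    if next_letter in VOWEL_LETTERS:
        return before_vowel
    return final  # word-final or before a consonant

print(spell_palatal("ɕ", "i") + "iwy")    # siwy
print(spell_palatal("ɕ", "a") + "arka")   # siarka
print(spell_palatal("ɕ", "w") + "więty")  # święty
```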

Special attention should be paid to ⟨n⟩ before ⟨i⟩ plus a vowel. In words of foreign origin the ⟨i⟩ both palatalizes the preceding ⟨n⟩ to /ɲ/ and is itself pronounced as /j/. This occurs when the corresponding genitive form ends in -nii, pronounced /ɲji/, rather than in -ni, pronounced /ɲi/ (the pattern typical of words of Polish origin). For examples, see the table in the next section.

Similar principles apply to the palatalized consonants /kʲ/, /ɡʲ/ and /xʲ/, except that these can only occur before vowels. The spellings are thus ⟨k g (c)h⟩ before ⟨i⟩, and ⟨ki gi (c)hi⟩ otherwise. For example, the ⟨k⟩ in kim ("whom", instr.) and the ⟨ki⟩ in kiedy both represent /kʲ/.

Other issues with i and j

Except in the cases mentioned in the previous paragraph, the letter ⟨i⟩, if followed by another vowel in the same word, usually represents /j/, but it also has a palatalizing effect on the previous consonant. For example, pies ("dog") is pronounced /pʲjɛs/. Some words with ⟨n⟩ before ⟨i⟩ plus a vowel also follow this pattern (see below). In fact, ⟨i⟩ is the usual spelling of /j/ between a preceding consonant and a following vowel. The letter ⟨j⟩ normally appears in this position only after ⟨c⟩, ⟨s⟩ and ⟨z⟩ if the palatalization effect described above has to be avoided (as in presja "pressure", Azja "Asia", lekcja "lesson", and the common suffixes -cja "-tion", -zja "-sion": stacja "station", wizja "vision"). The letter ⟨j⟩ after consonants is also used in the concatenation of two words if the second word in the pair starts with ⟨j⟩, e.g. wjazd "entrance" originates from w + jazd(a). The pronunciation of the sequence wja (in wjazd) is the same as the pronunciation of wia (in wiadro "bucket").

The ending -ii, which appears in the inflected forms of some nouns of foreign origin that have -ia in the nominative case (always after ⟨g⟩, ⟨k⟩, ⟨l⟩, and ⟨r⟩; sometimes after ⟨m⟩, ⟨n⟩, and other consonants), is pronounced as /ji/, with palatalization of the preceding consonant. For example, dalii (genitive of dalia "dahlia"), Bułgarii (genitive of Bułgaria "Bulgaria"), chemii (genitive of chemia "chemistry"), religii (genitive of religia "religion"), amfibii (genitive of amfibia "amphibia"). In common speech, however, the ending is pronounced /i/. This is why children commonly misspell inflected forms such as armii and Danii with a single -i, or hypercorrectly write ziemii instead of ziemi.

In some rare cases, however, when the consonant in question is preceded by another consonant, -ii may be pronounced as /i/, but the preceding consonant is still palatalized. For example, Anglii (genitive etc. of Anglia "England") is pronounced /anɡlʲi/. (The spelling Angli, very frequently met with on the Internet, is simply an error in orthography caused by this pronunciation.)

Words of Polish origin do not have the ending -ii but simply -i, e.g. ziemi (genitive of ziemia).

A special case applies to ⟨n⟩: it is fully palatalized to /ɲ/ before -ii, which is pronounced /ji/. This occurs only when the corresponding nominative form in -nia is pronounced /ɲja/, not /ɲa/.

For example (note the upper- and lower-case letters):

Case	Word	Pronunciation	Meaning	Word	Pronunciation	Meaning
Nominative	dania	/daɲa/	dishes (plural)	Dania	/daɲja/	Denmark
Genitive	(dań)	(/daɲ/)	(of dishes)	Danii	/daɲji/	of Denmark
Nominative	Mania	/maɲa/	Mary (diminutive of "Maria")	mania	/maɲja/	mania
Genitive	(Mani)	(/maɲi/)	(of Mary)	manii	/maɲji/	of mania

The ending -ji is always pronounced as /ji/. It appears only after c, s and z. Pronouncing it as a simple /i/ is considered a pronunciation error. For example, presji (genitive etc. of presja "pressure") is /prɛsji/; poezji (genitive etc. of poezja "poetry") is /pɔɛzji/; racji (genitive etc. of racja "reason") is /rat͡sji/.

Nasal vowels

The letters ⟨ą⟩ and ⟨ę⟩, when followed by plosives and affricates, represent an oral vowel followed by a nasal consonant, rather than a nasal vowel. For example, ⟨ą⟩ in dąb ("oak") is pronounced /ɔm/, and ⟨ę⟩ in tęcza ("rainbow") is pronounced /ɛn/ (the nasal assimilates with the following consonant). When followed by ⟨l⟩ or ⟨ł⟩ (and in the case of ⟨ę⟩, often at the end of words) these letters are pronounced as just /ɔ/ or /ɛ/.
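The context rule for ⟨ą⟩ and ⟨ę⟩ can be sketched in Python as follows. The function realize_nasal and its letter-based classification are assumptions made for illustration. The sketch looks only at the first letter of whatever follows (so cz, dz, dź and dż are covered by c and d), uses only the three nasal values listed in the vowel table, and does not model the optional loss of nasality in word-final ⟨ę⟩.

```python
ORAL = {"ą": "ɔ", "ę": "ɛ"}
NASAL = {"ą": "ɔ̃", "ę": "ɛ̃"}

def realize_nasal(letter, following=""):
    """Approximate realization of ą/ę given the next letter."""
    oral = ORAL[letter]
    if following in ("l", "ł"):
        return oral              # nasality is lost entirely
    if following in ("p", "b"):
        return oral + "m"        # labial plosive
    if following in ("k", "g"):
        return oral + "ŋ"        # velar plosive
    if following in ("t", "d", "c", "ć"):
        return oral + "n"        # other plosives and affricates (simplified)
    return NASAL[letter]         # elsewhere: a nasal vowel

print(realize_nasal("ą", "b"))  # ɔm, as in dąb
print(realize_nasal("ę", "c"))  # ɛn, as in tęcza
```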

Homophonic spellings

Apart from the cases resulting from the sections above, there are three sounds in Polish that can be spelt in two different ways, depending on the word:

/x/ can be spelt either ⟨h⟩ or ⟨ch⟩; the spelling ⟨h⟩ appears mostly in loanwords.
/u/ can be spelt ⟨u⟩ or ⟨ó⟩; the spelling ⟨ó⟩ usually indicates that the sound developed from a historical long ⟨o⟩ (see Historical development in the article on Polish phonology).
/ʐ/ can be spelt either ⟨ż⟩ or ⟨rz⟩; the spelling ⟨rz⟩ usually indicates that the sound developed from palatalization of a historical /r/.

Other points

Occasionally the letter ⟨u⟩ represents /w/ (pronounced [u̯]) after a vowel. For example, autor ("author") is [ˈau̯tɔr].

Notice that doubled letters represent separate occurrences of the sound in question; for example Anna is pronounced /ˈanna/ in Polish. In practice a doubled consonant is often realized as a single sound pronounced in a prolonged manner.

There are certain clusters where a written consonant would not normally be pronounced. For example, the ⟨ł⟩ in the words mógł ("could") and jabłko ("apple") is omitted in ordinary speech.

Capitalization

Names are generally capitalized in Polish as in English. Polish does not capitalize the months and days of the week, nor adjectives and other forms derived from proper nouns (for example, angielski "English").

Titles such as pan ("Mr"), pani ("Mrs/Ms"), doktor etc. and their abbreviations are not capitalized, except in polite address. Pronouns (chiefly those of the second person) are often capitalized out of politeness when they refer to the person one is writing to.

Punctuation

Polish punctuation is similar to that of English. However, there are more rigid rules concerning use of commas—subordinate clauses are almost always marked off with a comma, while it is normally considered incorrect to use a comma before a coordinating conjunction with the meaning "and" (i or oraz).

Abbreviations (but not acronyms or initialisms) are followed by a period when they end with a letter other than the one which ends the full word. For example, dr has no period when it stands for doktor, but takes one when it stands for an inflected form such as doktora, and prof. has a period because it comes from profesor (professor).
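The period rule reduces to comparing final letters, as this minimal Python sketch shows. The function name needs_period is hypothetical, and the sketch covers only ordinary abbreviations, not acronyms or initialisms.

```python
def needs_period(abbreviation, full_form):
    """True if the abbreviation ends in a different letter than the full word."""
    return abbreviation[-1] != full_form[-1]

print(needs_period("dr", "doktor"))      # False: both end in r
print(needs_period("dr", "doktora"))     # True: the inflected form ends in a
print(needs_period("prof", "profesor"))  # True: f differs from r
```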

Apostrophes do not, as is commonly (and erroneously) stated, separate foreign-spelt stems from Polish inflectional endings. Rather, they mark that the final letter or letters of the foreign word are not pronounced before the Polish ending, as in Tony'ego (genitive of "Tony"; the "y" is silent, so the form is pronounced Tonego), in contrast to Johna (genitive of "John"; the final "n" is pronounced). Compare the genitive of Charles: the English name gives Charlesa (the final "s" is pronounced; Czarlza), while the French name gives Charles'a (the final "s" is silent; Szarla).

History

Main article: History of Polish orthography

Poles adopted the Latin alphabet in the 12th century. However, that alphabet was ill-equipped to represent certain Polish sounds, such as the palatal consonants and nasal vowels. Consequently, Polish spelling in the Middle Ages was highly inconsistent, as different writers used different systems to represent these sounds. For example, in early documents the letter c could signify the sounds now written c, cz, k, while the letter z was used for the sounds now written z, ż, ś, ź. Writers soon began to experiment with digraphs (combinations of letters), new letters (φ and ſ, no longer used), and eventually diacritics.

The Polish alphabet was one of two major forms of Latin-based orthography developed for Slavic languages, the other being Czech orthography, characterized by carons (háčeks), as in the letter č. The other major Slavic languages which are now written in Latin-based alphabets (Slovak, Slovene, Bosnian, and Croatian) use systems similar to the Czech. However, a Polish-based orthography is used for Kashubian and usually for Silesian, while the Sorbian languages use elements of both systems.

Computer encoding

There are several different systems for encoding the Polish alphabet for computers. All letters of the Polish alphabet are included in Unicode, and thus Unicode-based encodings such as UTF-8 and UTF-16 can be used. The Polish alphabet is completely included in the Basic Multilingual Plane of Unicode. The standard 8-bit character encoding for the Polish alphabet is ISO 8859-2 (Latin-2), although both ISO 8859-13 (Latin-7) and ISO 8859-16 (Latin-10) encodings include glyphs of the Polish alphabet. Microsoft's format for encoding the Polish alphabet is Windows-1250.
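As a minimal sketch, Python's built-in codecs can be used to compare several of these encodings. The codec names below are the Python standard library's aliases, and the sample string is simply a phrase containing all nine diacritic letters.

```python
TEXT = "Zażółć gęślą jaźń"

# Python codec aliases for some of the encodings mentioned above.
CODECS = ["utf-8", "utf-16-le", "iso8859_2", "cp1250",
          "iso8859_13", "iso8859_16"]

for codec in CODECS:
    data = TEXT.encode(codec)
    # Every listed codec can round-trip the full Polish alphabet.
    assert data.decode(codec) == TEXT
    print(f"{codec:10} {len(data):2d} bytes  {data.hex(' ')}")
```

The 8-bit encodings use one byte per character (17 bytes here), while UTF-8 needs two bytes for each of the nine diacritic letters (26 bytes in total).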

The Polish letters which are not present in the English alphabet have the following HTML codes and Unicode codepoints:

Upper case	Ą	Ć	Ę	Ł	Ń	Ó	Ś	Ź	Ż
HTML entity	&#260; &Aogon;	&#262; &Cacute;	&#280; &Eogon;	&#321; &Lstrok;	&#323; &Nacute;	&#211; &Oacute;	&#346; &Sacute;	&#377; &Zacute;	&#379; &Zdot;
Unicode	U+0104	U+0106	U+0118	U+0141	U+0143	U+00D3	U+015A	U+0179	U+017B
Result	Ą	Ć	Ę	Ł	Ń	Ó	Ś	Ź	Ż

Lower case	ą	ć	ę	ł	ń	ó	ś	ź	ż
HTML entity	&#261; &aogon;	&#263; &cacute;	&#281; &eogon;	&#322; &lstrok;	&#324; &nacute;	&#243; &oacute;	&#347; &sacute;	&#378; &zacute;	&#380; &zdot;
Unicode	U+0105	U+0107	U+0119	U+0142	U+0144	U+00F3	U+015B	U+017A	U+017C
Result	ą	ć	ę	ł	ń	ó	ś	ź	ż
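The relationship between code points, numeric character references and named entities can be checked with Python's standard html module. This is a minimal sketch; the helper numeric_reference and the NAMED_ENTITIES dictionary are illustrative names introduced here.

```python
import html

# Named entities for the upper-case letters, as listed in the table above.
NAMED_ENTITIES = {
    "Ą": "&Aogon;", "Ć": "&Cacute;", "Ę": "&Eogon;",
    "Ł": "&Lstrok;", "Ń": "&Nacute;", "Ó": "&Oacute;",
    "Ś": "&Sacute;", "Ź": "&Zacute;", "Ż": "&Zdot;",
}

def numeric_reference(ch):
    """Decimal numeric character reference for a single character."""
    return f"&#{ord(ch)};"

for letter, named in NAMED_ENTITIES.items():
    # Both forms must decode back to the same letter.
    assert html.unescape(named) == letter
    assert html.unescape(numeric_reference(letter)) == letter
    print(letter, f"U+{ord(letter):04X}", numeric_reference(letter), named)
```

The lower-case entities follow the same pattern with a lower-case initial, for example &aogon; for ą.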

For other encodings, see the following table. Numbers in the table are hexadecimal.

Other encodings

character set	Ą	Ć	Ę	Ł	Ń	Ó	Ś	Ź	Ż	ą	ć	ę	ł	ń	ó	ś	ź	ż
ISO 8859-2	A1	C6	CA	A3	D1	D3	A6	AC	AF	B1	E6	EA	B3	F1	F3	B6	BC	BF
Windows-1250	A5	C6	CA	A3	D1	D3	8C	8F	AF	B9	E6	EA	B3	F1	F3	9C	9F	BF
IBM 852	A4	8F	A8	9D	E3	E0	97	8D	BD	A5	86	A9	88	E4	A2	98	AB	BE
Mazovia	8F	95	90	9C	A5	A3	98	A0	A1	86	8D	91	92	A4	A2	9E	A6	A7
Mac	84	8C	A2	FC	C1	EE	E5	8F	FB	88	8D	AB	B8	C4	97	E6	90	FD
ISO 8859-13 and Windows-1257	C0	C3	C6	D9	D1	D3	DA	CA	DD	E0	E3	E6	F9	F1	F3	FA	EA	FD
ISO 8859-16	A1	C5	DD	A3	D1	D3	D7	AC	AF	A2	E5	FD	B3	F1	F3	F7	AE	BF
IBM 775	B5	80	B7	AD	E0	E3	97	8D	A3	D0	87	D3	88	E7	A2	98	A5	A4
CSK	80	81	82	83	84	85	86	88	87	A0	A1	A2	A3	A4	A5	A6	A8	A7
Cyfromat	80	81	82	83	84	85	86	88	87	90	91	92	93	94	95	96	98	97
DHN	80	81	82	83	84	85	86	88	87	89	8A	8B	8C	8D	8E	8F	91	90
IINTE-ISIS	80	81	82	83	84	85	86	87	88	90	91	92	93	94	95	96	97	98
IEA-Swierk	8F	80	90	9C	A5	99	EB	9D	92	A0	9B	82	9F	A4	A2	87	A8	91
Logic	80	81	82	83	84	85	86	87	88	89	8A	8B	8C	8D	8E	8F	90	91
Microvex	8F	80	90	9C	A5	93	98	9D	92	A0	9B	82	9F	A4	A2	87	A8	91
Ventura	97	99	A5	A6	92	8F	8E	90	80	96	94	A4	A7	91	A2	84	82	87
ELWRO-Junior	C1	C3	C5	CC	CE	CF	D3	DA	D9	E1	E3	E5	EC	EE	EF	F3	FA	F9
AmigaPL	C2	CA	CB	CE	CF	D3	D4	DA	DB	E2	EA	EB	EE	EF	F3	F4	FA	FB
TeXPL	81	82	86	8A	8B	D3	91	99	9B	A1	A2	A6	AA	AB	F3	B1	B9	BB
Atari Club (Atari ST)	C1	C2	C3	C4	C5	C6	C7	C8	C9	D1	D2	D3	D4	D5	D6	D7	D8	D9
CorelDraw!	C5	F2	C9	A3	D1	D3	FF	E1	ED	E5	EC	E6	C6	F1	F3	A5	AA	BA
ATM	C4	C7	CB	D0	D1	D3	D6	DA	DC	E4	E7	EB	F0	F1	F3	F6	FA	FC

A common test sentence containing all the Polish diacritic letters is the nonsensical "Zażółć gęślą jaźń".
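That the sentence covers every diacritic letter can be confirmed with a short set comparison. This is a minimal Python sketch with illustrative variable names.

```python
DIACRITIC_LETTERS = set("ąćęłńóśźż")
SENTENCE = "Zażółć gęślą jaźń"

# Letters that the sentence fails to include (expected to be none).
missing = DIACRITIC_LETTERS - set(SENTENCE.lower())
print(sorted(missing))  # []
```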


This article is issued from Wikipedia - version of the Tuesday, March 29, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.