What is Digraph (orthography)?

A digraph, bigraph or digram is a pair of characters used to write one phoneme (distinct sound), or a sequence of phonemes that does not correspond to the normal values of the two characters combined. The sound is often, but not necessarily, one which cannot be expressed using a single character in the orthography used by the language. Normally, the term "digraph" is reserved for graphemes whose pronunciation is always or nearly always the same.

When digraphs do not represent a special sound, they may be relics from an earlier period of the language when they did have a different pronunciation, or represent a distinction which is made only in certain dialects, like wh in English. They may also be used for purely etymological reasons, like rh in English.

In some languages, digraphs are considered individual letters: they have their own place in the alphabet in the standard orthography and cannot be separated into their constituent graphemes, for example when sorting, abbreviating or hyphenating. In others, like English, this is not the case.

Some schemes of Romanization make extensive use of digraphs (e.g. Cyrillic to Roman for English readers), while others rely solely on diacritics (e.g. Cyrillic to the modified Roman used for Turkish). To avoid ambiguity, transliteration based on diacritics is generally preferred in academic circles. Some orthographies, like Turkish and Serbian (in its Cyrillic script), have no digraphs, and so transliterations into them also cannot use digraphs.
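
The kind of ambiguity that diacritics avoid can be illustrated with a minimal sketch. The two tables below are small, simplified mappings chosen for illustration (an assumption of this example, using Ukrainian letter values, not a complete or official romanization standard), and the function name `romanize` is hypothetical. With the digraph table, the single letter ж and the two-letter sequence зг both come out as "zh", whereas the diacritic table keeps them apart.

```python
# Simplified, illustrative tables for three Cyrillic letters (Ukrainian values).
# They are assumptions for this sketch, not a complete or official standard.
DIGRAPH_TABLE = {"ж": "zh", "з": "z", "г": "h"}
DIACRITIC_TABLE = {"ж": "ž", "з": "z", "г": "h"}


def romanize(text, table):
    """Replace each character via the table; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)


# One letter (ж) and a two-letter sequence (зг) collide under the digraph scheme...
print(romanize("ж", DIGRAPH_TABLE), romanize("зг", DIGRAPH_TABLE))
# ...but stay distinct when a diacritic is used instead.
print(romanize("ж", DIACRITIC_TABLE), romanize("зг", DIACRITIC_TABLE))
```

A reader who sees "zh" in the digraph output cannot tell which Cyrillic spelling produced it, which is why a one-to-one scheme is easier to reverse.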

Types of digraphs

There are two main kinds of digraphs: sequences and double letters.

Sequences

A sequence is a pair of different letters in a specific order. Consonant examples in English include ch (as in chip), ck (as in duck), gh and ng. Digraphs may also be composed of vowels. Common examples in English are:
  • ea usually pronounced /iː/, /ɛ/ or /eɪ/.
  • ie usually pronounced /iː/ or /aɪ/.
  • ai usually pronounced /ɛ/ or /eɪ/.
  • ei usually pronounced /iː/, more rarely /aɪ/.
  • au usually pronounced /ɔ/.
  • eu usually pronounced /ju/.
  • ou usually pronounced /aʊ/, more rarely /uː/.
  • aw usually pronounced /ɔ/.
  • ew usually pronounced /ju/.
  • ow usually pronounced /oʊ/ or /aʊ/.
For further information on English, see English orthography.

In Dutch, the digraph ij, which often resembles a y (or a ÿ) in handwriting, represents the diphthong /ɛɪ/. Opinions are divided on whether it should be considered part of the alphabet.

Double letters

These are pairs of identical letters, often with a special pronunciation. In some languages they indicate consonant length or vowel length, a stressed syllable or a new sound, but in other cases they are just part of the spelling convention. Ll is the most common in English, though it does not represent a different sound from l, being essentially an etymological digraph. In Welsh, however, it stands for a voiceless lateral fricative, and in Spanish it stands for a palatal consonant. Ee and oo are common English digraphs made up of vowels. Some more examples:
  • In several languages of western Europe, including English and French, ss is used between vowels for the voiceless sibilant /s/ (voiceless alveolar fricative), since an s alone between vowels is normally voiced, /z/ (voiced alveolar fricative). In German, an archaic version of this digraph gave rise to the letter ß.
  • In Romance languages such as Spanish or Italian, rr is used between vowels for the alveolar trill /r/, since an r alone between vowels represents an alveolar flap /ɾ/ (the two are different phonemes in these languages).
  • In Italian, zz (as in the word pizza) is an affricate, /ts/ or /dz/.
  • In several Germanic languages, including English, a doubled consonant CC (where C stands for a given consonant) is pronounced like a single C and signifies that the preceding vowel is short.

Ambiguity

Some letter pairs should not be interpreted as digraphs, but appear as a result of compounding, as in hogshead and cooperate. This is often not marked in any way (it is an exception which must simply be memorized), but some authors indicate it either by breaking up the apparent digraph with a hyphen, as in hogs-head and co-operate, or with a diaeresis mark, as in coöperate, though this usage is rare in English.
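
The problem can be made concrete with a minimal sketch of a greedy grapheme splitter. The digraph inventory and the function name `split_graphemes` are hypothetical choices for this illustration, not a real library API. The splitter wrongly finds sh in hogshead, and the hyphenated spelling is what tells it where the boundary lies.

```python
# A tiny, illustrative digraph inventory; a real one would be language-specific.
DIGRAPHS = {"sh", "ch", "th", "ng", "ck", "oo", "ee"}


def split_graphemes(word, digraphs=DIGRAPHS):
    """Greedily split a word into graphemes, preferring known digraphs.

    A hyphen is treated as an explicit boundary and is dropped from the output.
    """
    graphemes = []
    i = 0
    while i < len(word):
        if word[i] == "-":
            i += 1  # boundary marker: never part of a grapheme
        elif word[i:i + 2].lower() in digraphs:
            graphemes.append(word[i:i + 2])
            i += 2
        else:
            graphemes.append(word[i])
            i += 1
    return graphemes


print(split_graphemes("hogshead"))   # ['h', 'o', 'g', 'sh', 'e', 'a', 'd']
print(split_graphemes("hogs-head"))  # ['h', 'o', 'g', 's', 'h', 'e', 'a', 'd']
```

Without the hyphen, only a word list or morphological knowledge could tell the splitter that the s and the h belong to different parts of the compound.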

Similarly, in Czech (and analogously in other Slavic languages), double letters may appear in compound words, but they are not considered digraphs. Examples: bezzubý (bez + zubý, toothless), cenný (cen + ný, valuable), černooký (černo + oký, black-eyed).

Digraphs versus letters

In some languages, digraphs and trigraphs are counted as distinct letters in themselves, and assigned to a specific place in the alphabet, separate from that of the sequence of characters which composes them, in orthography or collation. Other languages, such as English, make no such convention, and split digraphs into their constituent letters for collation purposes. For example, ch is treated as a letter of its own in the Czech, Slovak and Welsh alphabets, among others.
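
The effect on collation can be sketched in a few lines. The example assumes a simplified alphabet in which ch is one letter placed after h, as in Czech; letters with diacritics are left out for brevity, and the names `ALPHABET`, `RANK` and `collation_key` are hypothetical. Production software would normally rely on a locale-aware collation library instead.

```python
# Illustrative alphabet in which "ch" is one letter placed after "h",
# as in Czech collation; diacritic letters are omitted to keep the sketch short.
ALPHABET = ["a", "b", "c", "d", "e", "f", "g", "h", "ch", "i", "j", "k", "l",
            "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]
RANK = {letter: position for position, letter in enumerate(ALPHABET)}


def collation_key(word):
    """Turn a word into a list of letter ranks, reading "ch" as one letter."""
    key = []
    i = 0
    word = word.lower()
    while i < len(word):
        if word[i:i + 2] == "ch":
            key.append(RANK["ch"])
            i += 2
        else:
            # Letters outside the sketch alphabet sort after everything else.
            key.append(RANK.get(word[i], len(ALPHABET)))
            i += 1
    return key


words = ["cena", "chata", "hrad", "had", "ihned"]
print(sorted(words))                     # ['cena', 'chata', 'had', 'hrad', 'ihned']
print(sorted(words, key=collation_key))  # ['cena', 'had', 'hrad', 'chata', 'ihned']
```

In the first result chata is filed under c, as an English dictionary would do; in the second it follows every word beginning with h.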

In non-Latin alphabets

Digraphs also exist in languages that are not written with the Latin alphabet. For example, modern Greek has the following:
  • αι (ai) represents /e̞/
  • ει (ei) represents /i/
  • οι (oi) represents /i/
  • ου (ou) represents /u/
  • υι (yi) represents /i/
  • γγ (gg) represents /ɡ/
  • γκ (gk) represents /ɡ/
  • μπ (mp) represents /b/
  • ντ (nt) represents /d/
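
As a rough sketch, the list above can be stored as a lookup table and applied with a left-to-right scan. The function name `mark_digraphs` is hypothetical, the table contains only the values listed above, and single letters are deliberately left without a sound value because the list does not give one. Accented vowels are not handled, so a digraph whose second vowel carries an accent mark would not be matched.

```python
# Digraph values taken from the list above; nothing else is assigned a sound.
GREEK_DIGRAPHS = {
    "αι": "e̞", "ει": "i", "οι": "i", "ου": "u", "υι": "i",
    "γγ": "ɡ", "γκ": "ɡ", "μπ": "b", "ντ": "d",
}


def mark_digraphs(word):
    """Return (grapheme, sound) pairs; sound is None for non-digraph letters."""
    pairs = []
    i = 0
    while i < len(word):
        chunk = word[i:i + 2]
        if chunk in GREEK_DIGRAPHS:
            pairs.append((chunk, GREEK_DIGRAPHS[chunk]))
            i += 2
        else:
            pairs.append((word[i], None))
            i += 1
    return pairs


# The initial μπ is reported as one grapheme with the value "b".
print(mark_digraphs("μπάλα"))
```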
