What is Arabic Transliteration?

Different approaches and methods for romanizing Arabic exist. They vary in the way that they address the inherent problems of rendering written and spoken Arabic in the Latin alphabet; they also use different symbols for Arabic phonemes that do not exist in English or other European languages.

Romanization Issues

Any transliteration system has to make a number of decisions that depend on its intended field of application. One basic problem is that written Arabic is normally unvocalized: many of the vowels are not written out and must be supplied by a reader familiar with the language. Unvocalized Arabic writing therefore does not give a reader unfamiliar with the language enough information for accurate pronunciation. An exact letter-for-letter equivalent of, for example, صدام حسين would be ṣdʾm ḥsyn, which is meaningless to an untrained reader. The "full transliteration" ṣaddām ḥusayn adds information that is not in the text and has to be supplied by a speaker of Arabic. Newspapers and popular books usually use not a transliteration but a transcription: instead of transliterating each written letter, they try to reproduce the sound of the words according to the orthography rules of the target language, giving Saddam Hussein.
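The letter-by-letter rendering described above can be sketched in a few lines of Python. This is only an illustration, not the procedure of any particular standard: the table covers just the eight letters of the example, with the pairings read off the rendering ṣdʾm ḥsyn given in the text, and the names STRICT and transliterate are made up for this sketch.

```python
# Hypothetical table covering only the letters of the example;
# the pairings are taken from the rendering "ṣdʾm ḥsyn" in the text.
STRICT = {
    "ص": "ṣ",
    "د": "d",
    "ا": "ʾ",
    "م": "m",
    "ح": "ḥ",
    "س": "s",
    "ي": "y",
    "ن": "n",
}


def transliterate(text: str) -> str:
    """Replace each written letter with one Latin symbol and add nothing else.

    Characters that are not in the table (such as the space) pass through
    unchanged. No vowels are supplied, because none are written.
    """
    return "".join(STRICT.get(ch, ch) for ch in text)


print(transliterate("صدام حسين"))
```

Going from this output to ṣaddām ḥusayn, or to Saddam Hussein, needs knowledge of the language that a lookup table like this does not contain.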

Most issues around the romanization of Arabic concern transliterating versus transcribing; others concern what should be romanized:
  • Transliteration ignores assimilation (sandhi) of the article before the "sun letters" and may easily be misread by non-Arabs. For instance, an-nur (or an-nuur, or an-noor) would be more literally transliterated along the lines of alnnur. In the transcription an-nur, a hyphen is added and the unpronounced 'l' removed for the convenience of the uninformed non-Arab reader, who would otherwise pronounce an 'l', probably fail to recognize the word as nur, pronounce only one 'n', and be confused by the role of the double 'n'. Alternatively, if the shadda is not transliterated (since it is strictly not a letter), a hypercorrect transliteration would be alnur, which presents similar problems for the uninformed non-Arab reader.
  • A transliteration must render the "tied tāʼ" (tāʼ marbūṭa, ة) faithfully, while a transcription must render the sound: either "a" like any other "a", or "at" like any other "at" (in a vocalized text, nothing versus t).
  • For tāʼ marbūṭa, ISO 233 has a unique symbol, ẗ; ISO/R 233 uses superscript h and t.
  • "Broken alif" (alif maqṣūra, ى) must be transliterated with a special symbol, but is transcribed like an ordinary alif when it stands for a long a (ā).
  • Nunation follows the same principle as everything else: transliteration renders what you see, transcription what you hear.
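The treatment of the article before sun letters can be illustrated with a short Python sketch. It works on already-romanized nouns and assumes a romanization with one symbol per consonant (DIN 31635 style); the function names are hypothetical, and qamar is an illustrative moon-letter noun that does not appear in the text above.

```python
import unicodedata

# The fourteen sun letters, written with one symbol per consonant.
# NFC normalization keeps each letter-plus-diacritic as a single character.
SUN_LETTERS = set(unicodedata.normalize("NFC", "tṯdḏrzsšṣḍṭẓln"))


def transliterate_definite(noun: str) -> str:
    """Render what is written: the article is always 'al'; shadda is ignored."""
    return "al" + noun


def transcribe_definite(noun: str) -> str:
    """Render what is heard: the 'l' assimilates to a following sun letter."""
    noun = unicodedata.normalize("NFC", noun)
    first = noun[0]
    if first in SUN_LETTERS:
        return f"a{first}-{noun}"
    return f"al-{noun}"


print(transliterate_definite("nur"))  # alnur
print(transcribe_definite("nur"))     # an-nur
print(transcribe_definite("qamar"))   # al-qamar
```

The hyphen is a convention of the transcription only; nothing in the Arabic spelling corresponds to it.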
A transcription may reflect the language as spoken, for example, by the people of Baghdad, or the official standard as spoken by a preacher in the mosque or a TV news reader. A transcription is free to add phonological (such as vowels) or morphological (such as word boundaries) information. Transcriptions will also vary depending on the writing conventions of the target language; compare English Omar Khayyam with German Omar Chajjam, both for عمر خيام (unvocalized ʿmr ḫyʾm, vocalized ʿumar ḫayyām).

A transliteration is ideally fully reversible: a machine should be able to convert it to Arabic script and back without loss. A transliteration may be criticized as flawed for any of the following reasons:
  • A "loose" transliteration is ambiguous: it renders several Arabic phonemes with an identical transliteration, or it uses digraphs for a single phoneme (such as sh) that may be confused with two adjacent phonemes;
  • Symbols representing phonemes may be considered too similar (e.g., ` and ' or ʿ and ʾ for ayin and hamza);
  • ASCII transliterations using capital letters to disambiguate phonemes are easy to type but may be considered unaesthetic.
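The digraph problem in the first point can be checked mechanically. The Python sketch below is a toy under stated assumptions: two hypothetical three-letter tables, one writing shīn as the digraph sh and one writing it as the single symbol š, applied to bare letter sequences rather than real words. A table counts as reversible here if no two different inputs produce the same output.

```python
# Hypothetical mini-tables for sīn, hāʼ and shīn only.
LOOSE = {"س": "s", "ه": "h", "ش": "sh"}   # digraph for shīn
STRICT = {"س": "s", "ه": "h", "ش": "š"}   # one symbol per letter


def romanize(text: str, table: dict) -> str:
    """Replace every letter by its entry in the given table."""
    return "".join(table[ch] for ch in text)


def is_reversible(samples: list, table: dict) -> bool:
    """Return False as soon as two different inputs share one output."""
    seen = {}
    for sample in samples:
        latin = romanize(sample, table)
        if seen.setdefault(latin, sample) != sample:
            return False
    return True


# Letter sequences chosen to collide: sīn followed by hāʼ, and shīn alone.
samples = ["سه", "ش"]

print(romanize("سه", LOOSE), romanize("ش", LOOSE))    # sh sh
print(romanize("سه", STRICT), romanize("ش", STRICT))  # sh š
print(is_reversible(samples, LOOSE))                  # False
print(is_reversible(samples, STRICT))                 # True
```

A check over a handful of samples can only expose a collision; it cannot prove that a full table is reversible for every possible input.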
A fully accurate transcription may not be necessary for native Arabic speakers, who can pronounce names and sentences correctly anyway, but it can be very useful for readers who know the Roman alphabet and are not fully familiar with spoken Arabic. An accurate transliteration serves as a valuable stepping stone for learning, for pronouncing correctly, and for distinguishing phonemes. It is a useful tool for anyone who is familiar with the sounds of Arabic but not fully conversant in the language.

One criticism is that a fully accurate system requires special training, which most readers lack, before it helps them pronounce names correctly, and that in the absence of a universal romanization system non-native speakers will not pronounce them correctly anyway. The precision is lost if the special characters are not reproduced or if the reader is not familiar with Arabic pronunciation.

Transliteration standards

A table comparing romanizations using DIN 31635, ISO 233, ISO/R 233, UN, ALA-LC, and Encyclopaedia of Islam systems is available here: [1].

Comparison table

Letter Name SATTS UNGEGN ALA-LC DIN ISO ISO/R Qalam SAS SM IPA
hamzaEʼ, ——, ’ʾˈ, ˌ—, ’'ʾ'/ʔ/
ʼalifAāʾāaaa, i, u; āaa/a(ː)/
bāʼBbbbbb/b/
tāʼTttttt/t/
ṯāʼCththç/θ/
ǧīm, jīm, gīmJjǧjŷj/ʤ/ / /g/
ḥāʼHH/ħ/
ḫāʼOkhkhjx/x/
dālDddddd/d/
ḏālZdhdhd/ð/
rāʼRrrrrr/r/
zāy;zzzzz/z/
sīnSsssss/s/
šīn:shšshšš/ʃ/
ṣādXşS/sˁ/
ḍādVD/dˁ/
ṭāʼUţT/tˁ/
ẓāʼYZđ?/zˁ/
ʻayn`ʻʿ`ʿr/ʕ/
ġaynGghġghgg/ɣ/
fāʼFfffff/f/
qāfQqqqqq/q/
kāfKkkkkk/k/
lāmLlllll/l/
mīmMmmmmm/m/
nūnNnnnnn/n/
hāʼ~hhhhh/h/
wāwWwwww; ūw; o/w/, /uː/
yāʼIyyyy; īy; e/j/, /iː/
ʼalif mamdūdaAEAāā, ʼāʾāʾâā, ʾāā'aa/ʔaː/
tāʼ marbūṭa@h, th, th, th, tt; —t/a/, /at/
ʼalif maqṣūra/yāaeàà/aː/
lām ʼalifLAlaʾla; laa/laː/
الʼalif lāmALal-al-ʾˈalal-alal-al-; ál-var.

Online

Main article: Arabic Chat Alphabet
Online communication is sometimes restricted to an ASCII environment in which neither the Arabic letters themselves nor Roman characters with diacritics are available. Even when Arabic letters and Roman characters with diacritics are available, they are often difficult to type. This problem is faced by most speakers of languages that use non-Roman alphabets, or heavily modified ones. An ad hoc solution consists of using Arabic numerals that mirror or resemble the relevant Arabic letters.
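As a rough illustration of the numeral convention, the Python sketch below expands three digits into diacritic symbols. The three pairings (2 for hamza, 3 for ʿayn, 7 for ḥāʼ) are widely used chat conventions but are not listed in the text above, so they are assumptions of this example, as are the sample words and the function name.

```python
# Assumed chat conventions, not taken from the text above:
# 2 resembles hamza (ء), 3 mirrors ʿayn (ع), 7 resembles ḥāʼ (ح).
DIGIT_TO_SYMBOL = {
    "2": "ʾ",
    "3": "ʿ",
    "7": "ḥ",
}


def expand_digits(chat_text: str) -> str:
    """Replace chat digits with romanization symbols; leave the rest as typed."""
    return "".join(DIGIT_TO_SYMBOL.get(ch, ch) for ch in chat_text)


print(expand_digits("3arabi"))   # ʿarabi
print(expand_digits("mar7aba"))  # marḥaba
```

A real converter would also have to tell these letter-digits apart from digits used as actual numbers, which a per-character lookup cannot do.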
