Information about Arabic Transliteration
Different approaches and methods for romanizing Arabic exist. They vary in the way that they address the inherent problems of rendering written and spoken Arabic in the Latin alphabet; they also use different symbols for Arabic phonemes that do not exist in English or other European languages.
Romanization Issues
Any transliteration system has to make a number of decisions, depending on its intended field of application. One basic problem is that written Arabic is normally unvocalized: many of the vowels are not written out and must be supplied by a reader familiar with the language. Unvocalized Arabic writing therefore does not give a reader unfamiliar with the language sufficient information for accurate pronunciation. An exact equivalent of, e.g., صدام حسين would be ṣdʾm ḥsyn, which is meaningless to an untrained reader. The "full transliteration" ṣaddām ḥusayn adds information that is not in the text and has to be supplied by a speaker of Arabic. Newspapers and popular books usually use not a transliteration but a transcription: instead of transliterating each written letter, they try to reproduce the sound of the words according to the orthography rules of the target language: Saddam Hussein.
Most issues around the romanization of Arabic concern transliterating versus transcribing; others concern what should be romanized:
- transliteration ignores assimilation (sandhi) of the article before the "sun letters," and may be easily misread by non-Arabs. For instance an-nur (or an-nuur, or an-noor) would be more correctly transliterated along the lines of alnur. In the transcription an-nur, a hyphen is added and the unpronounced 'l' removed for the convenience of the uninformed non-Arab reader, who would otherwise pronounce an 'l', probably not understand the word to be nur, pronounce only one 'n', and be confused by the role of the double 'n'. Alternatively, if the shadda is not transliterated (since it is strictly not a letter), a hypercorrect transliteration would be alnur, which presents similar problems for the uninformed non-Arab reader.
- a transliteration must render the "tied tā" (ta marbouta ة) faithfully, whereas a transcription must render the sound ("a" like any other "a" or "t" like any other "at"; in a vocalized text, nothing vs. t)
- ISO 233 has a unique symbol, ẗ; ISO/R 233 uses a superscript h or t.
- "broken alif" (alif maqṣura, ى) must be transliterated with a special symbol, but is transcribed like standing alif, when it stands for a long a (ā)
- Nunation: what is true elsewhere is also true for nunation: transliteration renders what you see, transcription what you hear.
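
The difference between rendering what is written and rendering what is heard can be sketched in a few lines of Python. This is a minimal illustration, not an implementation of any standard: `LETTER_MAP` is a partial, assumed mapping that covers only the letters of the example name, and `with_article` is a hypothetical helper that works on an already romanized noun and writes the unassimilated form with a hyphen.

```python
# Illustrative, partial letter map (assumed for this sketch; not a full standard).
# Keys are written as escapes to keep right-to-left text out of the source lines.
LETTER_MAP = {
    "\u0635": "ṣ",  # ṣād
    "\u062f": "d",  # dāl
    "\u0627": "ʾ",  # alif, rendered as in the ṣdʾm example
    "\u0645": "m",  # mīm
    "\u062d": "ḥ",  # ḥāʼ
    "\u0633": "s",  # sīn
    "\u064a": "y",  # yāʼ
    "\u0646": "n",  # nūn
    " ": " ",
}

# Romanized consonants that assimilate the l of the article ("sun letters").
SUN_LETTERS = set("tṯdḏrzsšṣḍṭẓln")


def transliterate(text: str) -> str:
    """Render each written letter as one Latin symbol, supplying no vowels."""
    return "".join(LETTER_MAP[ch] for ch in text)


def with_article(noun: str, assimilate: bool = True) -> str:
    """Prefix the definite article to an already romanized noun."""
    first = noun[0]
    if assimilate and first in SUN_LETTERS:
        return f"a{first}-{noun}"  # transcription: what is heard
    return f"al-{noun}"            # transliteration: what is written


saddam_husayn = "\u0635\u062f\u0627\u0645 \u062d\u0633\u064a\u0646"
print(transliterate(saddam_husayn))           # ṣdʾm ḥsyn
print(with_article("nur"))                    # an-nur
print(with_article("nur", assimilate=False))  # al-nur
print(with_article("qamar"))                  # al-qamar
```

The vowels and the doubled d of ṣaddām ḥusayn cannot come out of `transliterate`, because they are not in the written input; a transcription has to take them from a speaker or a dictionary.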
A transliteration is ideally fully reversible: a machine must be able to translate it into Arabic and back. A transliteration may be criticized as flawed for any of the following reasons:
- A "loose" transliteration is ambiguous: it renders several Arabic phonemes with an identical transliteration, or it uses digraphs for a single phoneme (such as sh) that may be confused with two adjacent phonemes;
- Symbols representing phonemes may be considered too similar (e.g., ` and ' or ʿ and ʾ for ayin and hamza);
- ASCII transliterations using capital letters to disambiguate phonemes are easy to type but may be considered unaesthetic.
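
The first criticism can be checked mechanically. The sketch below, under assumed data, tests a scheme for the two kinds of ambiguity named in the list: several letters sharing one romanization, and digraphs that could also be read as two adjacent letters. `LOOSE` and `STRICT` are hypothetical five-letter fragments made up for illustration, not excerpts from any published standard.

```python
from collections import defaultdict

# Hypothetical fragments: sīn, ṣād, hāʼ, ḥāʼ, šīn.
LOOSE = {"\u0633": "s", "\u0635": "s", "\u0647": "h", "\u062d": "h", "\u0634": "sh"}
STRICT = {"\u0633": "s", "\u0635": "ṣ", "\u0647": "h", "\u062d": "ḥ", "\u0634": "š"}


def collisions(scheme: dict) -> dict:
    """Return each romanization that stands for more than one Arabic letter."""
    by_latin = defaultdict(list)
    for arabic, latin in scheme.items():
        by_latin[latin].append(arabic)
    return {latin: letters for latin, letters in by_latin.items() if len(letters) > 1}


def ambiguous_digraphs(scheme: dict) -> list:
    """Return digraphs that could also be spelled by two single-letter symbols."""
    values = set(scheme.values())
    singles = {v for v in values if len(v) == 1}
    return sorted(
        v for v in values
        if len(v) == 2 and v[0] in singles and v[1] in singles
    )


def is_reversible(scheme: dict) -> bool:
    """A scheme can be mapped back to Arabic only if neither ambiguity occurs."""
    return not collisions(scheme) and not ambiguous_digraphs(scheme)


print(sorted(collisions(LOOSE)))   # ['h', 's']
print(ambiguous_digraphs(LOOSE))   # ['sh']
print(is_reversible(LOOSE))        # False
print(is_reversible(STRICT))       # True
```

The check is deliberately narrow: it looks only at the letter table, so it says nothing about vowels, shadda, or other marks that a full system also has to handle.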
One criticism is that a fully accurate system requires special training that most readers do not have, so it does not actually help them pronounce names correctly; and in the absence of a universal romanization system, non-native speakers will not pronounce the names correctly anyway. The precision is lost if the special characters are not reproduced or if the reader is not familiar with Arabic pronunciation.
Transliteration standards
- Deutsche Morgenländische Gesellschaft (1936): Adopted by the International Convention of Orientalist Scholars in Rome. It is the basis for the very influential Hans Wehr dictionary (ISBN 0-87950-003-4). http://www.dmg-web.de/
- ISO/R 233 (1961). Replaced by ISO 233 in 1984 but still encountered.
- BS 4280 (1968): Developed by the British Standards Institute. http://www.bsi-global.com/index.xalter
- SATTS (1970s): One-to-one mapping to Latin Morse equivalents; used by US military.
- UNGEGN (1972): http://www.eki.ee/wgrs/rom1_ar.pdf
- DIN-31635 (1982): Developed by the Deutsches Institut für Normung (German Institute for Standardization).
- ISO 233 (1984).
- Qalam (1985): A system that focuses upon preserving the spelling, rather than the pronunciation, and uses mixed case. http://eserver.org/langs/qalam.txt
- ISO 233-2 (1993). Simplified transliteration.
- Buckwalter Transliteration (1990s): Developed at Xerox by Tim Buckwalter http://www.qamus.org/transliteration.htm; doesn't require unusual diacritics. http://www.xrce.xerox.com/competencies/content-analysis/arabic/info/buckwalter-about.html
- ALA-LC (1997). http://www.loc.gov/catdir/cpso/romanization/arabic.pdf
- SAS: Spanish Arabists School (José Antonio Conde and others, early 19th century onwards). http://www.sumadrid.es/ariza/alandalus/Transli.htm
Comparison table
| Letter | Name | SATTS | UNGEGN | ALA-LC | DIN | ISO | ISO/R | Qalam | SAS | SM | IPA |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ﺀ | hamza | E | ʼ, — | —, ’ | ʾ | ˈ, ˌ | —, ’ | ' | ʾ | ' | /ʔ/ |
| ﺍ | ʼalif | A | ā | ʾ | ā | aa | a, i, u; ā | aa | /a(ː)/ | ||
| ﺏ | bāʼ | B | b | b | b | b | b | /b/ | |||
| ﺕ | tāʼ | T | t | t | t | t | t | /t/ | |||
| ﺙ | ṯāʼ | C | th | ṯ | th | ṯ | ç | /θ/ | |||
| ﺝ | ǧīm, jīm, gīm | J | j | ǧ | j | ŷ | j | /ʤ/ / /g/ | |||
| ﺡ | ḥāʼ | H | ḩ | ḥ | ḥ | H | ḥ | ḥ | /ħ/ | ||
| ﺥ | ḫāʼ | O | kh | ḫ | ẖ | kh | j | x | /x/ | ||
| ﺩ | dāl | D | d | d | d | d | d | /d/ | |||
| ﺫ | ḏāl | Z | dh | ḏ | dh | ḏ | d | /ð/ | |||
| ﺭ | rāʼ | R | r | r | r | r | r | /r/ | |||
| ﺯ | zāy | ; | z | z | z | z | z | /z/ | |||
| ﺱ | sīn | S | s | s | s | s | s | /s/ | |||
| ﺵ | šīn | : | sh | š | sh | š | š | /ʃ/ | |||
| ﺹ | ṣād | X | ş | ṣ | ṣ | S | ṣ | ṣ | /sˁ/ | ||
| ﺽ | ḍād | V | ḑ | ḍ | ḍ | D | ḍ | ḍ | /dˁ/ | ||
| ﻁ | ṭāʼ | U | ţ | ṭ | ṭ | T | ṭ | ṭ | /tˁ/ | ||
| ﻅ | ẓāʼ | Y | z̧ | ẓ | ẓ | Z | ẓ | đ? | /zˁ/ | ||
| ﻉ | ʻayn | ` | ʻ | ʿ | ` | ʿ | r | /ʕ/ | |||
| ﻍ | ġayn | G | gh | ġ | ḡ | gh | g | g | /ɣ/ | ||
| ﻑ | fāʼ | F | f | f | f | f | f | /f/ | |||
| ﻕ | qāf | Q | q | q | q | q | q | /q/ | |||
| ﻙ | kāf | K | k | k | k | k | k | /k/ | |||
| ﻝ | lām | L | l | l | l | l | l | /l/ | |||
| ﻡ | mīm | M | m | m | m | m | m | /m/ | |||
| ﻥ | nūn | N | n | n | n | n | n | /n/ | |||
| ﻩ | hāʼ | ~ | h | h | h | h | h | /h/ | |||
| ﻭ | wāw | W | w | w | w | w; ū | w; o | /w/, /uː/ | |||
| ﻱ | yāʼ | I | y | y | y | y; ī | y; e | /j/, /iː/ | |||
| ﺁ | ʼalif mamdūda | AEA | ā | ā, ʼā | ʾā | ʾâ | ā, ʾā | ā | 'aa | /ʔaː/ | |
| ﺓ | tāʼ marbūṭa | @ | h, t | h, t | ẗ | h, t | h, t | t; — | t | /a/, /at/ | |
| ﻯ | ʼalif maqṣūra | / | y | ā | ỳ | ae | à | à | /aː/ | ||
| ﻻ | lām ʼalif | LA | lā | lā | laʾ | lā | la | lʾ; lā | laa | /laː/ | |
| ال | ʼalif lām | AL | al- | al- | ʾˈal | al- | al | al- | al-; ál- | var. | |
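
As a rough illustration of a one-to-one scheme, the single-character entries of the SATTS column above can be loaded into a pair of translation tables and used for a round trip. This is a sketch under stated assumptions: it uses the base Unicode letters (U+0621 to U+064A) rather than the presentation forms shown in the table, it omits the multi-character entries (AEA, LA, AL), and the names `to_satts` and `from_satts` are hypothetical.

```python
# Letters in table order: hamza, alif, bāʼ ... yāʼ, tāʼ marbūṭa, alif maqṣūra.
ARABIC = (
    "\u0621\u0627\u0628\u062a\u062b\u062c\u062d\u062e\u062f\u0630"
    "\u0631\u0632\u0633\u0634\u0635\u0636\u0637\u0638\u0639\u063a"
    "\u0641\u0642\u0643\u0644\u0645\u0646\u0647\u0648\u064a\u0629\u0649"
)
# SATTS column of the comparison table, single-character entries only.
SATTS = "EABTCJHODZR;S:XVUY`GFQKLMN~WI@/"

assert len(ARABIC) == len(SATTS) == 31

TO_SATTS = str.maketrans(ARABIC, SATTS)
FROM_SATTS = str.maketrans(SATTS, ARABIC)


def to_satts(text: str) -> str:
    """Replace each Arabic letter with its SATTS symbol; other characters pass through."""
    return text.translate(TO_SATTS)


def from_satts(text: str) -> str:
    """Invert to_satts for text that contains only mapped symbols and spaces."""
    return text.translate(FROM_SATTS)


name = "\u0635\u062f\u0627\u0645 \u062d\u0633\u064a\u0646"  # the example name from the text
encoded = to_satts(name)
print(encoded)                      # XDAM HSIN
print(from_satts(encoded) == name)  # True
```

The round trip works because no two letters share a symbol and no symbol is a digraph. The result is still unvocalized, so it is reversible without being readable to someone who does not know the language.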
See also
- Arabic language
- Arabic alphabet
- Arabic grammar
- Arabic names
- Romanization
- Arabic Chat Alphabet
- Transliteration
External links
- Online en > ar transliteration tool, with Arabic and English interfaces: http://www.dhu-yazan.com/
- SATTS: Roman-to-Arabic mappings
- Omniglot: Arabic alphabet, pronunciation and language
- J'raxis·Com: The Arabic Script
- Table comparing Romanization systems
- Learn the Arabic Script Online