Miscellaneous
OCR Supported Languages
Get the list of languages the Copyleaks OCR engine supports for extracting text from images and scanned documents.
GET https://api.copyleaks.com/v3/miscellaneous/ocr-languages-list
This list applies only to OCR file scans. It is not the list of languages supported by the API as a whole.
Response
- 200
200 OK - The supported language codes in ISO-639-1 standard.
OCR Supported Languages
These are the language codes supported by our OCR scan, in the ISO-639-1 standard:
| Code | Language | Code | Language |
|---|---|---|---|
| af | Afrikaans | am | Amharic |
| ar | Arabic | az | Azerbaijani |
| be | Belarusian | bg | Bulgarian |
| bn | Bengali | bs | Bosnian |
| ca | Catalan | ceb | Cebuano |
| co | Corsican | cs | Czech |
| cy | Welsh | da | Danish |
| de | German | el | Greek |
| en | English | eo | Esperanto |
| es | Spanish | et | Estonian |
| eu | Basque | fa | Persian |
| fi | Finnish | fr | French |
| fy | Frisian | ga | Irish |
| gd | Scottish Gaelic | gl | Galician |
| gu | Gujarati | ha | Hausa |
| haw | Hawaiian | hi | Hindi |
| hmn | Hmong | hr | Croatian |
| ht | Haitian Creole | hu | Hungarian |
| hy | Armenian | id | Indonesian |
| ig | Igbo | is | Icelandic |
| it | Italian | iw | Hebrew |
| ja | Japanese | jw | Javanese |
| ka | Georgian | kk | Kazakh |
| km | Khmer | kn | Kannada |
| ko | Korean | ku | Kurdish |
| ky | Kyrgyz | la | Latin |
| lb | Luxembourgish | lo | Lao |
| lt | Lithuanian | lv | Latvian |
| ma | Marathi | mg | Malagasy |
| mi | Maori | mk | Macedonian |
| ml | Malayalam | mn | Mongolian |
| mr | Marathi | ms | Malay |
| mt | Maltese | my | Burmese |
| ne | Nepali | nl | Dutch |
| no | Norwegian | ny | Chichewa |
| pl | Polish | ps | Pashto |
| pt | Portuguese | ro | Romanian |
| ru | Russian | sd | Sindhi |
| si | Sinhala | sk | Slovak |
| sl | Slovenian | sm | Samoan |
| sn | Shona | so | Somali |
| sq | Albanian | sr | Serbian |
| st | Sesotho | su | Sundanese |
| sv | Swedish | sw | Swahili |
| ta | Tamil | te | Telugu |
| tg | Tajik | th | Thai |
| tl | Tagalog | tr | Turkish |
| uk | Ukrainian | ur | Urdu |
| uz | Uzbek | vi | Vietnamese |
| xh | Xhosa | yi | Yiddish |
| yo | Yoruba | zh-CN | Chinese (Simplified) |
| zh-TW | Chinese (Traditional) | zu | Zulu |
Frequently asked questions
What are OCR supported languages used for?
They apply only to OCR scans, where Copyleaks extracts text from images and scanned documents. This is not the general language list for plagiarism or AI detection.
How do I get the current list of OCR languages?
Call GET https://api.copyleaks.com/v3/miscellaneous/ocr-languages-list. It is a public endpoint that needs no authentication. Copyleaks keeps adding languages, so load the list at runtime instead of hardcoding it.
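As a rough illustration of loading the list at runtime, the Python sketch below requests the endpoint named above with the standard library only. The response shape is an assumption: this page says only that the response contains the supported language codes, so the sketch expects a JSON array of code strings and raises an error otherwise. The function and constant names are hypothetical.

```python
import json
import urllib.request

# URL as given in this FAQ; the endpoint is described as public (no auth).
OCR_LANGUAGES_URL = "https://api.copyleaks.com/v3/miscellaneous/ocr-languages-list"


def parse_language_codes(body: str) -> list[str]:
    """Parse the response body.

    ASSUMPTION: the body is a JSON array of code strings such as
    ["af", "am", ...]. Adjust this if the real response is shaped differently.
    """
    data = json.loads(body)
    if not isinstance(data, list) or not all(isinstance(c, str) for c in data):
        raise ValueError("unexpected response shape: expected a JSON array of strings")
    return data


def fetch_ocr_language_codes(timeout: float = 10.0) -> list[str]:
    """Fetch the current OCR language codes without sending credentials."""
    with urllib.request.urlopen(OCR_LANGUAGES_URL, timeout=timeout) as response:
        return parse_language_codes(response.read().decode("utf-8"))
```

Calling `fetch_ocr_language_codes()` once at startup and caching the result is one way to avoid a hardcoded list. Keeping the parsing in its own function makes the shape assumption easy to change and to test without network access.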
What language code format does OCR use?
ISO-639-1 codes (for example en, fr, ar), with zh-CN for Simplified Chinese and zh-TW for Traditional Chinese. A few entries in the table, such as ceb, haw, and hmn, are three-letter codes.
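Because the codes mix lowercase two-letter forms with region-tagged forms such as zh-CN, a caller may want to map user input onto the exact spelling in the list. The sketch below is one possible approach, not part of the Copyleaks API: `resolve_ocr_code` is a hypothetical helper, the case-insensitive match is a design choice of this example, and `SAMPLE_CODES` is a small subset copied from the table for illustration only.

```python
from typing import Iterable, Optional

# Illustrative subset copied from the table on this page; in real use,
# pass in the list loaded from the endpoint at runtime instead.
SAMPLE_CODES = ["en", "fr", "ar", "hi", "zh-CN", "zh-TW"]


def resolve_ocr_code(candidate: str, supported_codes: Iterable[str]) -> Optional[str]:
    """Return the supported code matching `candidate`, or None.

    Matching ignores case and surrounding whitespace (a choice made for
    this sketch); the returned value keeps the spelling used in the list.
    """
    lookup = {code.lower(): code for code in supported_codes}
    return lookup.get(candidate.strip().lower())
```

With the sample subset, `resolve_ocr_code("zh-cn", SAMPLE_CODES)` returns `"zh-CN"`, and an unlisted code such as `"xx"` returns `None`.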
Does OCR support non-Latin scripts like Arabic, Chinese, and Hindi?
Yes. The OCR engine supports 100+ languages, including Arabic (ar), Chinese (zh-CN, zh-TW), Hindi (hi), Japanese (ja), Korean (ko), and many more.
