# OCR Supported Languages

> Get the list of languages the Copyleaks OCR engine supports for extracting text from images and scanned documents.

<RequestExample>

```bash title="cURL" icon="terminal"
curl --request GET \
  --url https://api.copyleaks.com/v3/miscellaneous/ocr-languages-list
```

```python title="Python" icon="python"
from copyleaks.copyleaks import Copyleaks

# Public endpoint: no authentication required.
languages = Copyleaks.get_ocr_supported_languages()
print(languages)
```

</RequestExample>

<ResponseExample>

```json 200 OK
["af", "sq", "az", "...", "zu"]
```

</ResponseExample>

Returns the list of languages supported for OCR.

<Note>
This list applies only to OCR scans of image-based files. It is not the list of languages supported by the API as a whole.
</Note>

## Response

<Tabs>
  <Tab title="200">
    <Check>**200 OK** - The supported language codes in ISO-639-1 standard.</Check>

    ```json
    ["af", "sq", "az", "...", "zu"]
    ```
  </Tab>
</Tabs>

---

## OCR Supported Languages

These are the language codes supported by our OCR scan in `ISO-639-1` standard:

<Tip>
  We keep updating the list with new languages, so we recommend [loading the list at runtime](/reference/actions/miscellaneous/ocr-supported-languages) rather than copying it into your code.
</Tip>

| Code   | Language        | Code   | Language        |
|--------|---------------|--------|---------------|
| af     | Afrikaans     | am     | Amharic       |
| ar     | Arabic        | az     | Azerbaijani   |
| be     | Belarusian    | bg     | Bulgarian     |
| bn     | Bengali       | bs     | Bosnian       |
| ca     | Catalan       | ceb    | Cebuano       |
| co     | Corsican      | cs     | Czech         |
| cy     | Welsh         | da     | Danish        |
| de     | German        | el     | Greek         |
| en     | English       | eo     | Esperanto     |
| es     | Spanish       | et     | Estonian      |
| eu     | Basque        | fa     | Persian       |
| fi     | Finnish       | fr     | French        |
| fy     | Frisian       | ga     | Irish         |
| gd     | Scottish Gaelic | gl  | Galician     |
| gu     | Gujarati      | ha     | Hausa         |
| haw    | Hawaiian      | hi     | Hindi         |
| hmn    | Hmong        | hr     | Croatian      |
| ht     | Haitian Creole | hu    | Hungarian    |
| hy     | Armenian      | id     | Indonesian    |
| ig     | Igbo         | is     | Icelandic     |
| it     | Italian      | iw     | Hebrew        |
| ja     | Japanese     | jw     | Javanese      |
| ka     | Georgian     | kk     | Kazakh        |
| km     | Khmer        | kn     | Kannada       |
| ko     | Korean       | ku     | Kurdish       |
| ky     | Kyrgyz       | la     | Latin         |
| lb     | Luxembourgish | lo    | Lao           |
| lt     | Lithuanian   | lv     | Latvian       |
| ma     | Punjabi      | mg     | Malagasy      |
| mi     | Maori        | mk     | Macedonian    |
| ml     | Malayalam    | mn     | Mongolian     |
| mr     | Marathi      | ms     | Malay         |
| mt     | Maltese      | my     | Burmese       |
| ne     | Nepali       | nl     | Dutch         |
| no     | Norwegian    | ny     | Chichewa      |
| pl     | Polish       | ps     | Pashto        |
| pt     | Portuguese   | ro     | Romanian      |
| ru     | Russian      | sd     | Sindhi        |
| si     | Sinhala      | sk     | Slovak        |
| sl     | Slovenian    | sm     | Samoan        |
| sn     | Shona        | so     | Somali        |
| sq     | Albanian     | sr     | Serbian       |
| st     | Sesotho      | su     | Sundanese     |
| sv     | Swedish      | sw     | Swahili       |
| ta     | Tamil        | te     | Telugu        |
| tg     | Tajik        | th     | Thai          |
| tl     | Tagalog      | tr     | Turkish       |
| uk     | Ukrainian    | ur     | Urdu          |
| uz     | Uzbek        | vi     | Vietnamese    |
| xh     | Xhosa        | yi     | Yiddish       |
| yo     | Yoruba       | zh-CN  | Chinese (Simplified) |
| zh-TW  | Chinese (Traditional) | zu | Zulu |

## Frequently asked questions

### What are OCR supported languages used for?

They apply only to OCR scans, where Copyleaks extracts text from images and scanned documents. This is not the general language list for plagiarism or AI detection.

### How do I get the current list of OCR languages?

Call `GET https://api.copyleaks.com/v3/miscellaneous/ocr-languages-list`. It is a public endpoint that needs no authentication. Copyleaks keeps adding languages, so load the list at runtime instead of hardcoding it.
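
As a rough illustration of loading the list at runtime, the sketch below fetches the endpoint once per process and keeps the result in memory. It uses only the Python standard library. The names `parse_ocr_languages` and `load_ocr_languages` are hypothetical helpers, not part of the Copyleaks SDK. The timeout and the once-per-process caching policy are assumptions to adapt to your application.

```python
import json
import urllib.request
from functools import lru_cache
from typing import FrozenSet

OCR_LANGUAGES_URL = "https://api.copyleaks.com/v3/miscellaneous/ocr-languages-list"


def parse_ocr_languages(body: bytes) -> FrozenSet[str]:
    """Parse the response body, which is expected to be a JSON array of strings."""
    codes = json.loads(body)
    if not isinstance(codes, list) or not all(isinstance(c, str) for c in codes):
        raise ValueError("expected a JSON array of language codes")
    return frozenset(codes)


@lru_cache(maxsize=1)
def load_ocr_languages(timeout: float = 10.0) -> FrozenSet[str]:
    """Fetch the list on first call; later calls reuse the cached result.

    Hypothetical helper: caching for the lifetime of the process is an
    assumption, not a documented recommendation.
    """
    with urllib.request.urlopen(OCR_LANGUAGES_URL, timeout=timeout) as response:
        return parse_ocr_languages(response.read())
```

A caller could then check a language before submitting an OCR scan with `"ar" in load_ocr_languages()`. Failed requests raise an exception and are not cached, so the next call retries.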

### What language code format does OCR use?

Mostly two-letter ISO-639-1 codes (for example `en`, `fr`, `ar`). A few languages in the table use three-letter codes (`ceb`, `haw`, `hmn`), and Chinese uses `zh-CN` for Simplified and `zh-TW` for Traditional.
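
The table above lists `iw` for Hebrew and `jw` for Javanese, which are older codes. The current ISO 639-1 codes for these languages are `he` and `jv`. If your application stores the current codes, a small normalization step may help before comparing against the OCR list. The sketch below assumes the endpoint expects codes exactly as they appear in the table. Whether it also accepts `he` or `jv` is not stated here, and `normalize_ocr_code` is a hypothetical helper.

```python
# Current ISO 639-1 codes mapped to the legacy codes shown in the table above.
LEGACY_ALIASES = {"he": "iw", "jv": "jw"}


def normalize_ocr_code(tag: str) -> str:
    """Return `tag` in the casing and legacy form used by the OCR language table.

    Hypothetical helper: lowercases the language part, uppercases an optional
    region part (as in `zh-CN`), and maps `he`/`jv` to `iw`/`jw`.
    """
    language, _, region = tag.strip().replace("_", "-").partition("-")
    language = language.lower()
    language = LEGACY_ALIASES.get(language, language)
    return f"{language}-{region.upper()}" if region else language
```

For example, `normalize_ocr_code("zh_cn")` yields `zh-CN`, and `normalize_ocr_code("he")` yields `iw`.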

### Does OCR support non-Latin scripts like Arabic, Chinese, and Hindi?

Yes. The OCR engine supports 100+ languages, including Arabic (`ar`), Chinese (`zh-CN`, `zh-TW`), Hindi (`hi`), Japanese (`ja`), Korean (`ko`), and many more.
