This page describes the technical specifications of the Copyleaks API.
Page Definition
A page is defined as up to 250 words. This means that every 250 words (or portion thereof) in your document counts as one page for billing purposes.
How Page Counting Works:
- 1-250 words = 1 page
- 251-500 words = 2 pages
- 501-750 words = 3 pages
- etc.
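The counting rule above is a ceiling division by 250. A minimal sketch, assuming you already have the document's word count (how Copyleaks tokenizes words is not specified here, and `count_pages` is a hypothetical helper name, not part of the API):

```python
WORDS_PER_PAGE = 250  # page size stated on this page


def count_pages(word_count: int) -> int:
    """Return billable pages: every started block of 250 words is one page."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    # Integer ceiling division avoids floating-point rounding.
    # Assumption: an empty document (0 words) counts as 0 pages.
    return (word_count + WORDS_PER_PAGE - 1) // WORDS_PER_PAGE


if __name__ == "__main__":
    for words in (1, 250, 251, 500, 501, 750):
        print(words, "words ->", count_pages(words), "page(s)")
```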
Supported Plagiarism File Types
| Type | File Types List |
|---|---|
| Textual: | html, htm, txt, csv, rtf, xml, md |
| Non-Textual: | pdf, docx, doc, pptx, ppt, odt, chm, epub, odp, ppsx, pages, xlsx, xls, LaTeX |
| Source code: | ts, py, go, cs, c, h, idc, cpp, hpp, c++, h++, cc, hh, java, js, swift, rb, pl, php, sh, m, scala, css |
You can also access this list programmatically.
Supported Textual File Types
All supported plagiarism file types are also supported when submitted online by URL.
Supported Image Types (OCR)
The supported image file types are pdf, docx, gif, png, bmp, jpg, and jpeg. The files must contain textual content. Images are supported by upload only.
You can also access this list programmatically.
Supported Plagiarism Languages
| Setting | Description |
|---|---|
| Supported Languages | All languages supported by Unicode, including English, Spanish, French, Portuguese, Arabic, Russian, German, Greek, Chinese, Japanese, and more. |
| Supported OCR Languages | See full list here. |
| Supported Cross Languages | See full list here. |
| Maximum Document Length | The maximum length allowed is 2000 pages (500K words). |
File Size
| Description | Max Upload File Size |
|---|---|
| HTML files (html, htm, …) | 5 MB |
| Text files (txt, csv) and source-code | 3 MB |
| Non-Textual Documents (pdf, doc, docx, …) | 50 MB |
| Image Types (jpg, png, bmp, …) | 25 MB |
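A client can check a file against these limits before uploading. The following is a rough sketch only: the helper names are hypothetical, the extension lists cover just the types named on this page (elided entries are not filled in), pdf and docx are treated as documents rather than images, and "MB" is assumed to mean 1,000,000 bytes, which is the stricter reading.

```python
from pathlib import Path
from typing import Optional

MB = 1_000_000  # assumption: decimal megabytes, the stricter reading of "MB"

# Only extensions named on this page; the table's elided entries are not guessed.
_HTML = ("html", "htm")
_TEXT_AND_SOURCE = ("txt", "csv", "py", "js", "ts", "java", "go", "cs", "c", "cpp")
_DOCUMENTS = ("pdf", "doc", "docx")
_IMAGES = ("jpg", "png", "bmp")

MAX_UPLOAD_BYTES = {}
for extensions, limit_mb in (
    (_HTML, 5),
    (_TEXT_AND_SOURCE, 3),
    (_DOCUMENTS, 50),
    (_IMAGES, 25),
):
    for ext in extensions:
        MAX_UPLOAD_BYTES[ext] = limit_mb * MB


def max_upload_bytes(filename: str) -> Optional[int]:
    """Return the size limit in bytes, or None if the extension is not in this sketch."""
    ext = Path(filename).suffix.lower().lstrip(".")
    return MAX_UPLOAD_BYTES.get(ext)


def fits_upload_limit(filename: str, size_bytes: int) -> bool:
    """Return True if a file of size_bytes is within the limit for its type."""
    limit = max_upload_bytes(filename)
    if limit is None:
        raise ValueError(f"no size limit known for {filename!r}")
    return size_bytes <= limit
```

Returning `None` for unknown extensions keeps the sketch from silently applying the wrong limit to a type it does not cover.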
Rate Limit
By default, an account is limited to 10 requests per second. If you need a higher rate, contact us.
Rate limit exceeded: if your host has reached its API limit, you will receive HTTP error 429 (Too Many Requests) and will be unable to authenticate with the Copyleaks API for 5 minutes.
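Because exceeding the limit blocks authentication for 5 minutes, it can be worth throttling on the client side. Below is a minimal sketch of one way to do that, spacing calls at least 1/10 of a second apart. `Throttle` is a hypothetical class, not part of any Copyleaks SDK, and even spacing is stricter than the stated limit, since it never allows bursts.

```python
import time


class Throttle:
    """Space calls at least 1/max_per_second seconds apart (single-threaded sketch)."""

    def __init__(self, max_per_second=10, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock  # injectable for testing
        self._sleep = sleep  # injectable for testing
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next request may be sent; return the seconds slept."""
        now = self._clock()
        delay = self._next_allowed - now
        if delay > 0:
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return max(delay, 0.0)
```

Call `throttle.wait()` immediately before each API request. The class is not thread-safe; a multi-threaded client would need a lock around `wait()`.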
Maintenance Periods
When our servers are under maintenance, you will receive a 503 HTTP status code. Please wait a full minute and try again.
For more information about the service status, see Copyleaks System Status.
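The two waiting periods on this page (5 minutes after a 429, one minute after a 503) can be folded into a small retry helper. This is a sketch under assumptions: `send` is a hypothetical callable that performs one request and returns its HTTP status code, and the retry count is arbitrary.

```python
import time
from typing import Callable, Optional

RETRY_DELAY_SECONDS = {
    429: 5 * 60,  # rate limit exceeded: authentication is blocked for 5 minutes
    503: 60,      # maintenance: wait a full minute before retrying
}


def retry_delay(status_code: int) -> Optional[int]:
    """Return how long to wait before retrying, or None if no retry applies."""
    return RETRY_DELAY_SECONDS.get(status_code)


def call_with_retry(send: Callable[[], int], max_attempts: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status = send()
    for _ in range(max_attempts - 1):
        delay = retry_delay(status)
        if delay is None:
            break
        sleep(delay)
        status = send()
    return status
```

The caller still has to inspect the returned status, since the last attempt may itself have ended in 429 or 503.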
Time
| Setting | Value |
|---|---|
| Time Format | dd/MM/yyyy HH:mm:ss |
| Time Zone | UTC |
| Default HTTP Request Timeout | 110 seconds |
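The `dd/MM/yyyy HH:mm:ss` pattern corresponds to `%d/%m/%Y %H:%M:%S` in Python's `strptime` notation. A minimal sketch of parsing and formatting such timestamps as UTC follows; the function names are hypothetical, and which API fields carry this format is not stated on this page.

```python
from datetime import datetime, timezone

# strptime/strftime equivalent of dd/MM/yyyy HH:mm:ss
TIME_FORMAT = "%d/%m/%Y %H:%M:%S"


def parse_timestamp(value: str) -> datetime:
    """Parse a dd/MM/yyyy HH:mm:ss string and mark it as UTC."""
    return datetime.strptime(value, TIME_FORMAT).replace(tzinfo=timezone.utc)


def format_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as dd/MM/yyyy HH:mm:ss in UTC."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime(TIME_FORMAT)
```

Note that the day comes first, so `03/04/2024` is 3 April, not 4 March.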
Scan Expiration
Scans created using the /v3/submit endpoints are stored on Copyleaks servers for a limited time. You can control the expiration of your scans in your submit request. Make sure you save your data before it expires:
| Type | Hours |
|---|---|
| Max Expiration | 2880 |
| Default Expiration | 2880 |
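Since 2880 hours is 120 days, a client can compute when a scan will expire and schedule its own export before then. A minimal sketch, assuming the expiration clock starts at a creation time you record yourself; `expiry_time` is a hypothetical helper, and the lower bound of 1 hour is an assumption, as no minimum is given here.

```python
from datetime import datetime, timedelta, timezone

MAX_EXPIRATION_HOURS = 2880      # from the table above (120 days)
DEFAULT_EXPIRATION_HOURS = 2880  # from the table above


def expiry_time(created_at: datetime,
                hours: int = DEFAULT_EXPIRATION_HOURS) -> datetime:
    """Return the moment a scan created at created_at would expire."""
    # Assumption: at least 1 hour; the page only states the maximum.
    if not 1 <= hours <= MAX_EXPIRATION_HOURS:
        raise ValueError(f"hours must be between 1 and {MAX_EXPIRATION_HOURS}")
    return created_at + timedelta(hours=hours)


if __name__ == "__main__":
    created = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(expiry_time(created))      # default: 120 days later
    print(expiry_time(created, 48))  # a shorter, two-day expiration
```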
Frequently Asked Questions
How does Copyleaks count pages for billing?
A page is defined as up to 250 words. Every 250 words (or portion thereof) counts as one page, so 1-250 words is 1 page, 251-500 words is 2 pages, and so on.
What is the maximum file size I can submit?
It depends on the file type: 50 MB for non-textual documents (PDF, DOC, DOCX), 25 MB for images submitted to OCR, 5 MB for HTML files, and 3 MB for text and source-code files.
What is the maximum document length?
2000 pages, which is approximately 500,000 words.
What is the Copyleaks API rate limit?
10 requests per second by default. Exceeding it returns HTTP 429 (Too Many Requests) and blocks authentication for 5 minutes. Contact Copyleaks if you need a higher rate.
How long are scans stored before they expire?
Scans are stored for 2880 hours (120 days) by default, which is also the maximum. You can set a shorter expiration in the submit request, so save your results before they expire.