This page describes the technical specifications of the Copyleaks API.

Page Definition

A page is defined as up to 250 words. Every 250 words (or portion thereof) in your document counts as one page for billing purposes.

How Page Counting Works:
  • 1-250 words = 1 page
  • 251-500 words = 2 pages
  • 501-750 words = 3 pages
  • etc.
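
The counting rule above is a ceiling division by 250. A minimal sketch in Python follows; the function name `page_count` is hypothetical, and how Copyleaks splits text into words is not specified here, so the word count is taken as an input.

```python
WORDS_PER_PAGE = 250  # from the page definition above


def page_count(word_count: int) -> int:
    """Return the number of billable pages for a document of `word_count` words."""
    if word_count <= 0:
        # Assumption: a document with no words counts as zero pages.
        return 0
    # Ceiling division: any partial block of 250 words counts as a full page.
    return (word_count + WORDS_PER_PAGE - 1) // WORDS_PER_PAGE
```

For example, `page_count(251)` returns 2, matching the second bullet above.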

Input Limits

Supported Plagiarism File Types

Type          File Types List
Textual       html, htm, txt, csv, rtf, xml, md
Non-Textual   pdf, docx, doc, pptx, ppt, odt, chm, epub, odp, ppsx, pages, xlsx, xls, LaTeX
Source code   ts, py, go, cs, c, h, idc, cpp, hpp, c++, h++, cc, hh, java, js, swift, rb, pl, php, sh, m, scala, css
You can also access this list programmatically.

Supported Textual File Types

All supported plagiarism file types are also supported when submitted online by URL.

Supported Image Types (OCR)

The supported image file types are pdf, docx, gif, png, bmp, jpg, and jpeg. The files must contain textual content. Images are supported for file upload only.
You can also access this list programmatically.

Supported Plagiarism Languages

Setting                     Description
Supported Languages         All languages supported by Unicode, including English, Spanish, French, Portuguese, Arabic, Russian, German, Greek, Chinese, Japanese, and more.
Supported OCR Languages     See full list here.
Supported Cross Languages   See full list here.
Maximum Document Length     The maximum length allowed is 2000 pages (500K words).

File Size

Description                                 Max Upload File Size
HTML files (html, htm, …)                   5 MB
Text files (txt, csv) and source-code       3 MB
Non-Textual Documents (pdf, doc, docx, …)   50 MB
Image Types (jpg, png, bmp, …)              25 MB
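
A client can check a file against these limits before uploading. The sketch below is illustrative only: the helper name `max_upload_bytes` is hypothetical, 1 MB is assumed to mean 1,048,576 bytes, and only the extensions spelled out in the table (plus the source-code extensions listed earlier) are mapped. Extensions covered by the "…" in the table are left unmapped and return `None`.

```python
import os
from typing import Optional

MB = 1024 * 1024  # assumption: the limits use binary megabytes

# Source-code extensions from the supported file types list.
SOURCE_CODE = {
    "ts", "py", "go", "cs", "c", "h", "idc", "cpp", "hpp", "c++", "h++",
    "cc", "hh", "java", "js", "swift", "rb", "pl", "php", "sh", "m",
    "scala", "css",
}

# Only extensions named explicitly in the file size table are mapped.
LIMITS = {ext: 3 * MB for ext in SOURCE_CODE}
LIMITS.update({"txt": 3 * MB, "csv": 3 * MB})
LIMITS.update({"html": 5 * MB, "htm": 5 * MB})
# Assumption: pdf and docx are treated as documents, not as OCR images.
LIMITS.update({"pdf": 50 * MB, "doc": 50 * MB, "docx": 50 * MB})
LIMITS.update({"jpg": 25 * MB, "png": 25 * MB, "bmp": 25 * MB})


def max_upload_bytes(filename: str) -> Optional[int]:
    """Return the size limit in bytes for `filename`, or None if unmapped."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return LIMITS.get(ext)
```

A caller might compare `os.path.getsize(path)` with the returned limit and treat `None` as "check the table manually".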

Rate Limit

By default, an account is limited to 10 requests per second. If you need a higher rate, contact us.
Rate Limit Exceeded: if your host reaches its API limit, you will receive HTTP error 429 (Too Many Requests) and will be unable to authenticate with the Copyleaks API for 5 minutes.
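
One way to stay under the default limit is to space requests on the client side. The following is a minimal sketch, not part of any Copyleaks SDK: the `Throttle` class is hypothetical, and it simply enforces a minimum gap of 1/10 second between calls. The clock and sleep functions are injectable so the behavior can be checked without real waiting.

```python
import time


class Throttle:
    """Space calls at least 1/rate_per_second seconds apart."""

    def __init__(self, rate_per_second=10.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0  # earliest time the next call may proceed

    def wait(self) -> float:
        """Block until the next request may be sent; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Reserve the next slot one interval after this call's slot.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

Calling `throttle.wait()` immediately before each API request keeps a single-threaded client at or below 10 requests per second; a multi-threaded client would additionally need a lock around `wait()`.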

Maintenance Periods

When our servers are under maintenance, you will receive a 503 HTTP status code. Please wait a full minute and try again. For the current service status, see Copyleaks System Status.
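
The waiting periods described for HTTP 429 and HTTP 503 can be captured in a small lookup that a retry loop consults. This is a sketch under the assumption that the client retries only on these two status codes; the name `retry_delay` is hypothetical.

```python
from typing import Optional

RATE_LIMIT_LOCKOUT_SECONDS = 5 * 60  # 429: authentication is blocked for 5 minutes
MAINTENANCE_WAIT_SECONDS = 60        # 503: wait a full minute before retrying


def retry_delay(status_code: int) -> Optional[int]:
    """Return how many seconds to wait before retrying, or None if no retry applies."""
    if status_code == 429:
        return RATE_LIMIT_LOCKOUT_SECONDS
    if status_code == 503:
        return MAINTENANCE_WAIT_SECONDS
    return None  # other status codes are not covered by this sketch
```

A retry loop would sleep for the returned number of seconds and resend the request, giving up after a bounded number of attempts.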

Time

Setting                        Value
Time Format                    dd/MM/yyyy HH:mm:ss
Time Zone                      UTC
Default HTTP Request Timeout   110 seconds
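
In Python, the pattern dd/MM/yyyy HH:mm:ss corresponds to the `strptime` format `%d/%m/%Y %H:%M:%S`. The sketch below parses and formats timestamps in that layout and attaches UTC explicitly; the helper names are hypothetical.

```python
from datetime import datetime, timezone

# strptime/strftime equivalent of dd/MM/yyyy HH:mm:ss
TIME_FORMAT = "%d/%m/%Y %H:%M:%S"


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp string and mark it as UTC."""
    return datetime.strptime(value, TIME_FORMAT).replace(tzinfo=timezone.utc)


def format_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC timestamp string."""
    return moment.astimezone(timezone.utc).strftime(TIME_FORMAT)
```

Note that the day comes first, so "03/04/2025 10:00:00" is 3 April 2025, not 4 March. `format_timestamp` expects a timezone-aware value; a naive datetime would be interpreted as local time by `astimezone`.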

Scan Expiration

Scans created through the /v3/submit endpoints are stored on Copyleaks servers for a limited time. You can control the expiration of a scan in your submit request. Make sure you save your data before it expires:
Type                 hours
Max Expiration       2880
Default Expiration   2880
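
To know when results must be saved, a client can compute the expiry moment from the scan's creation time. This sketch assumes the expiration is counted from the moment of submission and that the value must be at least 1 hour; neither detail is stated above, and the function name `expiration_time` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_EXPIRATION_HOURS = 2880      # 120 days
DEFAULT_EXPIRATION_HOURS = 2880


def expiration_time(created_at: datetime, hours: int = DEFAULT_EXPIRATION_HOURS) -> datetime:
    """Return the moment a scan created at `created_at` expires."""
    # Assumption: valid values run from 1 hour up to the documented maximum.
    if not 1 <= hours <= MAX_EXPIRATION_HOURS:
        raise ValueError(f"expiration must be between 1 and {MAX_EXPIRATION_HOURS} hours")
    return created_at + timedelta(hours=hours)
```

With the default, a scan created on 1 January 2025 at 00:00 UTC would expire on 1 May 2025 at 00:00 UTC.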

Frequently asked questions

How does Copyleaks count pages for billing?

A page is defined as up to 250 words. Every 250 words (or portion thereof) counts as one page, so 1-250 words is 1 page, 251-500 words is 2 pages, and so on.

What is the maximum file size I can submit?

It depends on the file type: 50 MB for non-textual documents (PDF, DOC, DOCX), 25 MB for images submitted to OCR, 5 MB for HTML files, and 3 MB for text and source-code files.

What is the maximum document length?

2000 pages, which is approximately 500,000 words.

What is the Copyleaks API rate limit?

10 requests per second by default. Exceeding it returns HTTP 429 (Too Many Requests) and blocks authentication for 5 minutes. Contact Copyleaks if you need a higher rate.

How long are scans stored before they expire?

Scans are stored for 2880 hours (120 days) by default, which is also the maximum. You can set a shorter expiration in the submit request, so save your results before they expire.