Cross-Language Detection

Copyleaks Cross-Language Detection identifies plagiarism across languages. It detects content that has been translated from one language to another, which helps catch plagiarism attempts in which text is copied, translated, and presented as original work.

This guide explains how to use Cross-Language Detection and what it can do.

For example, if someone takes content written in English, translates it into Spanish, and presents it as original work, Cross-Language Detection can identify the match against the English original.

Cross-Language Detection combines translation and semantic matching technology, and works in four stages:

  1. Analyze Source Document: The system processes your submitted document in its original language.

  2. Translation Analysis: The content is analyzed across different language databases.

  3. Semantic Matching: Beyond direct translation, the system looks for semantic similarities that might indicate translated plagiarism.

  4. Results Compilation: Findings are compiled into a comprehensive report that identifies potential matches across languages.

Cross-Language Detection offers the following benefits:

  • Catch Sophisticated Plagiarism: Identify plagiarism attempts that involve translation, which traditional plagiarism checkers would miss.
  • Multi-Language Support: Support for multiple source languages and an even wider range of result languages.
  • Seamless Integration: Use the same API workflow as regular plagiarism checks with additional parameters.
  • Detailed Reporting: Get precise information about cross-language matches with the same detailed reporting as standard plagiarism detection.
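
To run a scan with Cross-Language Detection, follow these steps:

  1. Before you start, ensure you have a Copyleaks account and an API key. Both are used to log in in step 3.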

  2. Choose your preferred method for making API calls.

    You can interact with the API using any standard HTTP client.

    For a quicker setup, we provide a Postman collection. See our Postman guide for instructions.

  3. To perform a scan, first generate an access token by calling the login endpoint with your account email and API key. The API key can be found on the Copyleaks API Dashboard.

    Upon successful authentication, you will receive a token that must be attached to subsequent API calls via the Authorization: Bearer <TOKEN> header. This token remains valid for 48 hours.

    POST https://id.copyleaks.com/v3/account/login/api
    Headers
    Content-Type: application/json
    Body
    {
      "email": "<YOUR_EMAIL>",
      "key": "00000000-0000-0000-0000-000000000000"
    }

    Response

    {
      "access_token": "<ACCESS_TOKEN>",
      ".issued": "2025-07-31T10:19:40.0690015Z",
      ".expires": "2025-08-02T10:19:40.0690016Z"
    }

  4. Submit your document with Cross-Language Detection enabled by listing the result languages to check under properties.scanning.crossLanguages.languages. Each submission requires a unique scanId for tracking and identification.

    POST https://api.copyleaks.com/v3/scans/submit/file/{scanId}
    Content-Type: application/json
    Authorization: Bearer YOUR_LOGIN_TOKEN
    Request Body:
    {
      "base64": "<BASE64_ENCODED_FILE>",
      "filename": "my-document.pdf",
      "properties": {
        "sandbox": true,
        "scanning": {
          "crossLanguages": {
            "languages": [
              { "code": "es" },
              { "code": "fr" }
            ]
          }
        }
      }
    }
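
The login and submission steps above can be sketched in Python using only the standard library. This is a minimal sketch rather than an official client: the helper names (build_login_request, build_submit_payload, build_submit_request, send) are hypothetical, the URLs, HTTP methods, and field names are copied from the requests shown in steps 3 and 4, and sending the submission body with a JSON content type is an assumption based on the body being JSON.

```python
import base64
import json
import urllib.request

# Endpoints copied from steps 3 and 4 of this guide.
LOGIN_URL = "https://id.copyleaks.com/v3/account/login/api"
SUBMIT_URL = "https://api.copyleaks.com/v3/scans/submit/file/{scan_id}"


def build_login_request(email: str, key: str) -> urllib.request.Request:
    """Prepare (without sending) the login call from step 3."""
    body = json.dumps({"email": email, "key": key}).encode("utf-8")
    return urllib.request.Request(
        LOGIN_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_submit_payload(file_bytes: bytes, filename: str,
                         languages: list[str], sandbox: bool = True) -> dict:
    """Build the step 4 body with one {"code": ...} entry per result language."""
    return {
        "base64": base64.b64encode(file_bytes).decode("ascii"),
        "filename": filename,
        "properties": {
            "sandbox": sandbox,
            "scanning": {
                "crossLanguages": {
                    "languages": [{"code": code} for code in languages]
                }
            },
        },
    }


def build_submit_request(token: str, scan_id: str,
                         payload: dict) -> urllib.request.Request:
    """Prepare (without sending) the submission call from step 4."""
    return urllib.request.Request(
        SUBMIT_URL.format(scan_id=scan_id),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: the body is JSON, so it is labeled as JSON here.
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",  # HTTP method as shown in step 4
    )


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode a JSON response, if there is one."""
    with urllib.request.urlopen(request) as response:
        raw = response.read()
    return json.loads(raw) if raw else {}
```

A typical flow would call send(build_login_request(email, key)), read "access_token" from the result, and pass it to build_submit_request together with a unique scan ID. Because the token stays valid for 48 hours, it can be reused across submissions instead of logging in before every scan.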

Cross-Language Detection supports a wide range of languages:

  • Source Languages: The document you upload can be in one of the supported source languages (Danish, Dutch, English, French, German, Italian, Portuguese, Russian, Spanish).

  • Result Languages: Copyleaks can detect plagiarism in over 30 result languages, including Albanian, Bulgarian, Chinese, Czech, German, Greek, Hindi, Japanese, Korean, and many more.
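
Since only some languages are accepted for the uploaded document, it can help to check the language choices before submitting. The sketch below is illustrative: the function names are hypothetical, and the two-letter codes are ISO 639-1 codes for the source languages listed above. Their use by the API is an assumption, inferred from the "es" and "fr" codes in the sample request.

```python
# Source languages listed in this guide, keyed by ISO 639-1 code.
# Assumption: the API identifies languages by these two-letter codes.
SOURCE_LANGUAGES = {
    "da": "Danish",
    "nl": "Dutch",
    "en": "English",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
    "ru": "Russian",
    "es": "Spanish",
}


def check_source_language(code: str) -> str:
    """Return the language name, or raise ValueError if it is not a source language."""
    normalized = code.strip().lower()
    if normalized not in SOURCE_LANGUAGES:
        raise ValueError(f"{code!r} is not a supported source language")
    return SOURCE_LANGUAGES[normalized]


def normalize_result_languages(codes: list[str]) -> list[str]:
    """Lowercase the codes and drop blanks and duplicates, keeping the first-seen order."""
    seen: set[str] = set()
    result: list[str] = []
    for code in codes:
        normalized = code.strip().lower()
        if normalized and normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result
```

The full list of result languages is not reproduced in this guide, so the sketch only tidies the requested result codes and does not validate them.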

Cross-Language Detection uses additional credits based on the following model:

  1. Base Scan: The base scan in the document’s original language counts as normal (1 credit per 250 words).

  2. Additional Languages: Each additional language selected for cross-language detection will incur the same credit cost as the base scan.

For example, if your document is 1,000 words (4 credits) and you select two additional languages for cross-language detection (Spanish and French), the total cost would be:

  • Base scan: 4 credits
  • Spanish: 4 credits
  • French: 4 credits
  • Total: 12 credits
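
The same arithmetic can be written as a small helper for estimating cost before a scan. This is a sketch with a hypothetical function name. The rate of 1 credit per 250 words comes from the model above, while rounding a partial block of 250 words up to a full credit is an assumption.

```python
import math

WORDS_PER_CREDIT = 250  # rate stated in the credit model above


def estimate_credits(word_count: int, extra_languages: int) -> int:
    """Estimate total credits for a base scan plus cross-language scans."""
    if word_count < 0 or extra_languages < 0:
        raise ValueError("word_count and extra_languages must be non-negative")
    # Assumption: a partial block of 250 words is billed as a full credit.
    base = math.ceil(word_count / WORDS_PER_CREDIT)
    # Each additional language costs the same as the base scan.
    return base * (1 + extra_languages)


# The 1,000-word example with Spanish and French selected.
print(estimate_credits(1000, 2))  # 12
```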

Cross-Language Detection is particularly valuable in several scenarios:

  • Academic Institutions: Universities with international student bodies can detect plagiarism regardless of the original content’s language.

  • Global Publishing: Publishers that operate in multiple regions can ensure content originality across language barriers.

  • Research Verification: Researchers can verify the originality of work when citing sources from different languages.

  • Content Licensing: Media companies can protect their intellectual property from unauthorized translations.

To maximize the effectiveness of Cross-Language Detection:

  1. Select Relevant Languages: Choose only the languages that are relevant to your use case to optimize credit usage.

  2. Use with Regular Plagiarism Detection: Cross-Language Detection works best as a complement to standard plagiarism detection.

  3. Review Results Carefully: Because translation can alter sentence structure and word choice, review cross-language matches with special attention to semantic similarity rather than exact matches.