Cross-Language Detection
Copyleaks Cross-Language Detection identifies plagiarism across languages. It detects content that has been translated from one language to another, which helps catch plagiarism attempts where text is copied, translated, and presented as original work.
This guide explains how to use Cross-Language Detection and what it can do.
Overview
Cross-Language Detection identifies content that has been translated from one language to another. For example, if someone takes content written in English, translates it to Spanish, and presents it as original work, Cross-Language Detection can identify this plagiarism attempt.
How It Works
Cross-Language Detection combines translation and semantic matching technology in four steps:
- Analyze Source Document: The system processes your submitted document in its original language.
- Translation Analysis: The content is analyzed across different language databases.
- Semantic Matching: Beyond direct translation, the system looks for semantic similarities that might indicate translated plagiarism.
- Results Compilation: Findings are compiled into a report that identifies potential matches across languages.
Key Benefits
- Catch Sophisticated Plagiarism: Identify plagiarism attempts that involve translation, which traditional plagiarism checkers would miss.
- Multi-Language Support: Support for multiple source languages and an even wider range of result languages.
- Seamless Integration: Use the same API workflow as regular plagiarism checks with additional parameters.
- Detailed Reporting: Get precise information about cross-language matches with the same detailed reporting as standard plagiarism detection.
🚀 Get Started
Before you begin
Before you start, ensure you have the following:
- An active Copyleaks account. If you don’t have one, sign up for free.
- Your API key, which you can find on the API Dashboard.
Installation
Choose your preferred method for making API calls.
You can interact with the API using any standard HTTP client.
For a quicker setup, we provide a Postman collection. See our Postman guide for instructions.
cURL:
    sudo apt-get install curl    # Debian/Ubuntu
    brew install curl            # Homebrew
  Alternatively, download it from curl.se
Python:
    pip install copyleaks
Node.js:
    npm install plagiarism-checker
Login
To perform a scan, first generate an access token using the login endpoint. Your API key can be found on the Copyleaks API Dashboard.
Upon successful authentication, you will receive a token that must be attached to subsequent API calls via the Authorization: Bearer <TOKEN> header. This token remains valid for 48 hours.

    POST https://id.copyleaks.com/v3/account/login/api

    Headers
    Content-Type: application/json

    Body
    {
      "email": "your-email@example.com",
      "key": "00000000-0000-0000-0000-000000000000"
    }

cURL:
    export COPYLEAKS_EMAIL="your-email@example.com"
    export COPYLEAKS_API_KEY="your-api-key-here"
    curl --request POST \
      --url https://id.copyleaks.com/v3/account/login/api \
      --header 'Accept: application/json' \
      --header 'Content-Type: application/json' \
      --data "{\"email\": \"${COPYLEAKS_EMAIL}\", \"key\": \"${COPYLEAKS_API_KEY}\"}"

Python:
    from copyleaks.copyleaks import Copyleaks

    EMAIL_ADDRESS = "your-email@example.com"
    API_KEY = "your-api-key-here"

    # Log in to Copyleaks and obtain an access token
    auth_token = Copyleaks.login(EMAIL_ADDRESS, API_KEY)
    print("Logged in successfully!\nToken:", auth_token)

Node.js:
    const { Copyleaks } = require("plagiarism-checker");

    const EMAIL_ADDRESS = "your-email@example.com";
    const API_KEY = "your-api-key-here";
    const copyleaks = new Copyleaks();

    // Log in and return the login result, which carries the access token
    function loginToCopyleaks() {
      return copyleaks.loginAsync(EMAIL_ADDRESS, API_KEY).then(
        (loginResult) => {
          console.log("Login successful!");
          console.log("Access Token:", loginResult.access_token);
          return loginResult;
        },
        (err) => {
          console.error("Login failed:", err);
          throw err;
        }
      );
    }

    loginToCopyleaks();

Java:
    import com.copyleaks.sdk.api.Copyleaks;

    String EMAIL_ADDRESS = "your-email@example.com";
    String API_KEY = "00000000-0000-0000-0000-000000000000";

    // Log in to Copyleaks
    try {
        String authToken = Copyleaks.login(EMAIL_ADDRESS, API_KEY);
        System.out.println("Logged in successfully!\nToken: " + authToken);
    } catch (CommandException e) {
        System.out.println("Failed to login: " + e.getMessage());
        System.exit(1);
    }

Response:
    {
      "access_token": "<ACCESS_TOKEN>",
      ".issued": "2025-07-31T10:19:40.0690015Z",
      ".expires": "2025-08-02T10:19:40.0690016Z"
    }
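Because the token stays valid for 48 hours, a client can cache it and log in again only when it is close to expiring. The following is a minimal sketch, assuming the login response has the shape shown above. The helpers `parse_timestamp` and `token_is_valid`, and the five-minute safety margin, are hypothetical names chosen for illustration and are not part of the Copyleaks SDK.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp such as '2025-07-31T10:19:40.0690015Z'."""
    # The sample response carries 7 fractional digits, but %f accepts at most 6,
    # so the fraction is padded or truncated to exactly 6 digits.
    base, _, fraction = value.rstrip("Z").partition(".")
    fraction = (fraction + "000000")[:6]
    parsed = datetime.strptime(f"{base}.{fraction}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def token_is_valid(login_response: dict, now: datetime,
                   margin: timedelta = timedelta(minutes=5)) -> bool:
    """Return True if the token will still be valid `margin` after `now`."""
    return now + margin < parse_timestamp(login_response[".expires"])


# Sample login response copied from this guide
sample_response = {
    "access_token": "<ACCESS_TOKEN>",
    ".issued": "2025-07-31T10:19:40.0690015Z",
    ".expires": "2025-08-02T10:19:40.0690016Z",
}

issued = parse_timestamp(sample_response[".issued"])
print(token_is_valid(sample_response, now=issued))                        # True
print(token_is_valid(sample_response, now=issued + timedelta(hours=48)))  # False
```

In a real client, `now` would be `datetime.now(timezone.utc)`; it is passed in explicitly here so the check is easy to test.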
Submit for Cross-Language Analysis
Submission Methods
You can submit content for analysis using multiple methods. This guide demonstrates file submission. Each submission requires a unique scanId for tracking and identification.

    POST https://api.copyleaks.com/v3/scans/submit/file/{scanId}
    Content-Type: application/json
    Authorization: Bearer YOUR_LOGIN_TOKEN

    Request Body:
    {
      "base64": "<BASE64_ENCODED_FILE>",
      "filename": "my-document.pdf",
      "properties": {
        "sandbox": true,
        "scanning": {
          "crossLanguages": {
            "languages": [
              { "code": "es" },
              { "code": "fr" }
            ]
          }
        }
      }
    }

cURL:
    curl -X POST "https://api.copyleaks.com/v3/scans/submit/file/my-scan-123" \
      -H "Authorization: Bearer <YOUR_AUTH_TOKEN>" \
      -H "Content-Type: application/json" \
      -d '{
        "base64": "<BASE64_ENCODED_FILE>",
        "filename": "my-document.pdf",
        "properties": {
          "sandbox": true,
          "scanning": {
            "crossLanguages": {
              "languages": [{ "code": "es" }, { "code": "fr" }]
            }
          }
        }
      }'

Python:
    from copyleaks.copyleaks import Copyleaks
    from copyleaks.models.submit.document import FileDocument
    from copyleaks.models.submit.properties.scan_properties import ScanProperties

    scan_id = "my-scan-123"
    file_path = "my-document.pdf"

    # Create the document to scan
    file_submission = FileDocument(file_path)
    file_submission.set_sandbox(True)

    # Configure cross-language detection (Spanish and French)
    properties = ScanProperties()
    properties.set_scanning_cross_languages(["es", "fr"])
    file_submission.set_properties(properties)

    # Submit for scanning; auth_token comes from the login step
    response = Copyleaks.submit_file(auth_token, scan_id, file_submission)
    print(response)

Node.js:
    const { Copyleaks, CopyleaksFileSubmissionModel } = require("plagiarism-checker");

    const EMAIL_ADDRESS = "your-email@example.com";
    const API_KEY = "your-api-key-here";

    async function submitWithCrossLanguage() {
      try {
        // Initialize Copyleaks
        const copyleaks = new Copyleaks();

        // Log in to get the authentication token
        const loginResult = await copyleaks.loginAsync(EMAIL_ADDRESS, API_KEY);
        const scanId = `cross-lang-scan-${Date.now()}`;

        // Create a file submission model
        const fileToSubmit = "./my-document.pdf";
        const submission = new CopyleaksFileSubmissionModel(fileToSubmit);
        submission.sandbox = true;

        // Set cross-language properties
        submission.properties = {
          scanning: {
            crossLanguages: {
              languages: [{ code: "es" }, { code: "fr" }]
            }
          }
        };

        // Submit the file for scanning
        const response = await copyleaks.submitFileAsync(loginResult, scanId, submission);
        console.log("Submission successful:", response);
      } catch (error) {
        console.error("An error occurred:", error);
      }
    }

    submitWithCrossLanguage();
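The request body shown in this step can also be assembled programmatically before it is sent. The sketch below uses only the Python standard library to build that JSON body and does not send anything. `build_submission_payload` and `new_scan_id` are hypothetical helper names, and using a UUID hex string as the scanId is an assumption, since the API may impose its own format rules.

```python
import base64
import json
import uuid


def new_scan_id() -> str:
    """Generate a unique scan ID (assumption: a UUID hex string is acceptable)."""
    return uuid.uuid4().hex


def build_submission_payload(file_bytes: bytes, filename: str,
                             language_codes: list[str],
                             sandbox: bool = True) -> dict:
    """Build the JSON body for a file submission with cross-language detection."""
    return {
        # File contents are sent as a base64-encoded string
        "base64": base64.b64encode(file_bytes).decode("ascii"),
        "filename": filename,
        "properties": {
            "sandbox": sandbox,
            "scanning": {
                "crossLanguages": {
                    # One {"code": ...} object per additional language
                    "languages": [{"code": code} for code in language_codes]
                }
            },
        },
    }


payload = build_submission_payload(b"Hello, world!", "my-document.txt", ["es", "fr"])
print("scanId:", new_scan_id())
print(json.dumps(payload, indent=2))
```

Serializing with `json.dumps` guarantees quoted keys such as `"code"`, which hand-written JSON can easily get wrong.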
Supported Languages
Cross-Language Detection supports a wide range of languages:
- Source Languages: The document you upload can be in one of the supported source languages (Danish, Dutch, English, French, German, Italian, Portuguese, Russian, Spanish).
- Result Languages: Copyleaks can detect plagiarism in over 30 target languages, including Albanian, Bulgarian, Chinese, Czech, German, Greek, Hindi, Japanese, Korean, and many more.
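A client can check the source language before submitting a document. This is a minimal sketch, assuming ISO 639-1 codes for the nine source languages listed above (the "es" and "fr" codes used in the submission step follow that standard). The `SOURCE_LANGUAGES` table and `is_supported_source` are hypothetical and should be verified against the official language list.

```python
# Assumed ISO 639-1 codes for the source languages named in this guide
SOURCE_LANGUAGES = {
    "da": "Danish",
    "nl": "Dutch",
    "en": "English",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
    "ru": "Russian",
    "es": "Spanish",
}


def is_supported_source(code: str) -> bool:
    """Return True if `code` is one of the assumed source-language codes."""
    return code.strip().lower() in SOURCE_LANGUAGES


print(is_supported_source("EN"))  # True
print(is_supported_source("ja"))  # False: Japanese is listed only as a result language
```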
Pricing
Cross-Language Detection uses additional credits based on the following model:
- Base Scan: The base scan in the document’s original language is charged at the standard rate (1 credit per 250 words).
- Additional Languages: Each additional language selected for cross-language detection incurs the same credit cost as the base scan.
For example, if your document is 1,000 words (4 credits) and you select two additional languages for cross-language detection (Spanish and French), the total cost would be:
- Base scan: 4 credits
- Spanish: 4 credits
- French: 4 credits
- Total: 12 credits
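The same arithmetic can be expressed as a small estimator. This is a sketch under one stated assumption: a partial block of 250 words is rounded up to a full credit, which this guide does not confirm. `estimate_credits` is a hypothetical helper, not part of the Copyleaks SDK.

```python
import math

WORDS_PER_CREDIT = 250  # from the pricing model: 1 credit per 250 words


def estimate_credits(word_count: int, extra_languages: int) -> int:
    """Estimate total credits for a base scan plus cross-language scans."""
    if word_count < 0 or extra_languages < 0:
        raise ValueError("word_count and extra_languages must be non-negative")
    # Assumption: partial blocks of 250 words round up to a whole credit
    base_credits = math.ceil(word_count / WORDS_PER_CREDIT)
    # Each additional language costs the same as the base scan
    return base_credits * (1 + extra_languages)


print(estimate_credits(1000, 2))  # 12, matching the example above
```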
Use Cases
Cross-Language Detection is particularly valuable in several scenarios:
- Academic Institutions: Universities with international student bodies can detect plagiarism regardless of the original content’s language.
- Global Publishing: Publishers that operate in multiple regions can ensure content originality across language barriers.
- Research Verification: Researchers can verify the originality of work when citing sources from different languages.
- Content Licensing: Media companies can protect their intellectual property from unauthorized translations.
Best Practices
To maximize the effectiveness of Cross-Language Detection:
- Select Relevant Languages: Choose only the languages that are relevant to your use case to optimize credit usage.
- Use with Regular Plagiarism Detection: Cross-Language Detection works best as a complement to standard plagiarism detection.
- Review Results Carefully: Because translation can alter sentence structure and word choice, review cross-language matches with special attention to semantic similarity rather than exact matches.