Understanding the Problem
When working with document databases, you may encounter scenarios where:
- An author’s current document matches against their previous work from earlier submissions.
- Multiple versions or drafts of the same document are flagged against each other.
- Legitimate self-referencing or building upon previous work is incorrectly identified as plagiarism.
Prevention Strategy: Smart Scan ID Structure
The most effective approach is to design a structured scanId for each submission. A well-structured ID makes it easy to include or exclude specific groups of documents from a scan.
Example ID Structures
Basic Structure: <AUTHOR_ID>-<DOCUMENT_ID> (e.g., author123-essay1, emp456-report2)
With an organization prefix: <ORGANIZATION_ID>-<AUTHOR_ID>-<DOCUMENT_ID> (e.g., acmeuni-author123-essay1, techcorp-emp456-proposal)
- Exclude by author: Use author123-* or emp456-*.
- Include by organization: Use acmeuni-* or techcorp-*.
- Focus on document types: Use *-final or *-report.
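The ID structures above can be assembled with a small helper. This is a minimal sketch, not part of any Copyleaks SDK: build_scan_id and MAX_SCAN_ID_LENGTH are hypothetical names, and the length check assumes the 36-character limit mentioned under Best Practices applies to the whole scan ID.

```python
from typing import Optional

# Assumption: the 36-character limit from this guide applies to the full ID.
MAX_SCAN_ID_LENGTH = 36


def build_scan_id(author_id: str, document_id: str,
                  organization_id: Optional[str] = None) -> str:
    """Join the parts as [<ORGANIZATION_ID>-]<AUTHOR_ID>-<DOCUMENT_ID>."""
    parts = [organization_id, author_id, document_id]
    # Skip the organization part when it is not supplied.
    scan_id = "-".join(part for part in parts if part)
    if len(scan_id) > MAX_SCAN_ID_LENGTH:
        raise ValueError(
            f"scan ID {scan_id!r} is {len(scan_id)} characters; "
            f"the limit is {MAX_SCAN_ID_LENGTH}"
        )
    return scan_id


print(build_scan_id("author123", "essay1"))
print(build_scan_id("emp456", "proposal", organization_id="techcorp"))
```

Validating the length when the ID is built surfaces an over-long ID before submission rather than after.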
Using Exclude Patterns
Use the properties.scanning.exclude.idPattern parameter to exclude specific patterns from your scan results. The * character acts as a wildcard.
Exclude by ID Pattern
For example, the pattern author123-* excludes every scan whose ID starts with author123-.
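Before relying on a pattern, it can help to preview locally which scan IDs it would cover. The sketch below approximates the * wildcard with Python's standard fnmatch module; preview_matches is a hypothetical helper, and the service's own matching rules may differ (fnmatch also treats ? and [ ] specially, for instance).

```python
from fnmatch import fnmatchcase


def preview_matches(pattern: str, scan_ids: list[str]) -> list[str]:
    """Return the scan IDs that a '*' wildcard pattern matches locally."""
    # fnmatchcase compares case-sensitively on every platform.
    return [scan_id for scan_id in scan_ids if fnmatchcase(scan_id, pattern)]


existing_ids = ["author123-essay1", "author123-essay2", "emp456-report2"]
print(preview_matches("author123-*", existing_ids))
```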
Exclude by Domain
Exclude by Text Phrases
Using Include Patterns
Use the properties.scanning.include.idPattern parameter to include only specific patterns in your scan results. This is useful for limiting comparisons to specific groups, like an organization or a class.
For example, the pattern acmeuni-* limits results to scans whose ID starts with acmeuni-.
Implementation Examples
Example 1: Exclude Same Author’s Previous Work
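A minimal sketch of this case, assuming scan IDs follow the basic <AUTHOR_ID>-<DOCUMENT_ID> structure. The function name exclude_author_payload is hypothetical, and the nested JSON shape and single-string value are assumptions derived from the dotted parameter path properties.scanning.exclude.idPattern; check them against the submit endpoint's reference before use.

```python
import json


def exclude_author_payload(scan_id: str) -> dict:
    """Build a properties fragment that excludes the author's other scans."""
    # In <AUTHOR_ID>-<DOCUMENT_ID>, the author is the text before the first hyphen.
    author_id, separator, _ = scan_id.partition("-")
    if not separator or not author_id:
        raise ValueError(f"expected <AUTHOR_ID>-<DOCUMENT_ID>, got {scan_id!r}")
    return {
        "properties": {
            "scanning": {
                # Assumed shape: nested objects mirroring the dotted path.
                "exclude": {"idPattern": f"{author_id}-*"}
            }
        }
    }


print(json.dumps(exclude_author_payload("author123-essay2"), indent=2))
```

Deriving the pattern from the scan ID itself keeps the exclusion in step with the ID convention, so each new submission skips the same author's earlier work without extra bookkeeping.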
Example 2: Compare Only Within Same Organization
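A minimal sketch of this case, assuming scan IDs follow the <ORGANIZATION_ID>-<AUTHOR_ID>-<DOCUMENT_ID> structure. The function name include_organization_payload is hypothetical, and the nested JSON shape and single-string value are assumptions derived from the dotted parameter path properties.scanning.include.idPattern.

```python
import json


def include_organization_payload(scan_id: str) -> dict:
    """Build a properties fragment limiting results to one organization."""
    # Require at least three hyphen-separated parts: org, author, document.
    if scan_id.count("-") < 2:
        raise ValueError(
            f"expected <ORGANIZATION_ID>-<AUTHOR_ID>-<DOCUMENT_ID>, got {scan_id!r}"
        )
    organization_id = scan_id.split("-", 1)[0]
    if not organization_id:
        raise ValueError(f"missing organization prefix in {scan_id!r}")
    return {
        "properties": {
            "scanning": {
                # Assumed shape: nested objects mirroring the dotted path.
                "include": {"idPattern": f"{organization_id}-*"}
            }
        }
    }


print(json.dumps(include_organization_payload("acmeuni-author123-essay1"), indent=2))
```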
Best Practices
- **Plan your ID structure**: Design scan ID patterns from the beginning.
- **Be specific**: Use precise patterns to avoid excluding too much or too little.
- **Test patterns**: Verify your patterns work correctly with sample data.
- **Document conventions**: Maintain clear documentation of your ID structure for your team.
- **Keep it short**: Remember the 36-character limit.
Next Steps
Submit File Documentation
Learn how to submit files with your custom scan IDs.
Compare Multiple Documents
Learn about cross-document comparison strategies.
Support
If you need help implementing author conflict prevention, contact Copyleaks Support or ask a question on Stack Overflow with the copyleaks-api tag.
