Detect AI-Generated Images in PDFs

The Copyleaks AI Image Detection API determines whether a given image was fully or partially generated by AI.

As a consumer, you might have a PDF with images. If you want to scan those images separately from the PDF, this guide is for you.

This guide walks you through extracting images from a PDF file, submitting them for AI detection, and understanding the results.

🚀 Get Started

Before you begin

Before you start, ensure you have the following:
- An active Copyleaks account. If you don’t have one, sign up for free.
- Your API key, which you can find on the API Dashboard.
Installations

Install the relevant packages using pip install -U PyMuPDF Pillow copyleaks.
Login

To perform a scan, we first need to generate an access token. For that, we will use the login endpoint. The API key can be found on the Copyleaks API Dashboard.

Upon successful authentication, you will receive a token that must be attached to subsequent API calls via the Authorization: Bearer <TOKEN> header. This token remains valid for 48 hours.
POST https://id.copyleaks.com/v3/account/login/api

Headers
Content-Type: application/json

Body
{
    "email": "your-email@example.com",
    "key": "00000000-0000-0000-0000-000000000000"
}
# Log in with cURL
export COPYLEAKS_EMAIL="your-email@example.com"
export COPYLEAKS_API_KEY="your-api-key-here"

curl --request POST \
  --url https://id.copyleaks.com/v3/account/login/api \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --data "{
    \"email\": \"${COPYLEAKS_EMAIL}\",
    \"key\": \"${COPYLEAKS_API_KEY}\"
  }"
# Python SDK
from copyleaks.copyleaks import Copyleaks

EMAIL_ADDRESS = "your-email@example.com"
API_KEY = "your-api-key-here"

# Login to Copyleaks
auth_token = Copyleaks.login(EMAIL_ADDRESS, API_KEY)
print("Logged in successfully!\nToken:", auth_token)
// Node.js SDK
const { Copyleaks } = require("plagiarism-checker");

const EMAIL_ADDRESS = "your-email@example.com";
const API_KEY = "your-api-key-here";
const copyleaks = new Copyleaks();

// Login function
function loginToCopyleaks() {
  return copyleaks.loginAsync(EMAIL_ADDRESS, API_KEY).then(
    (loginResult) => {
      console.log("Login successful!");
      console.log("Access Token:", loginResult.access_token);
      return loginResult;
    },
    (err) => {
      console.error('Login failed:', err);
      throw err;
    }
  );
}

loginToCopyleaks();
// Java SDK
import com.copyleaks.sdk.api.Copyleaks;

String EMAIL_ADDRESS = "your-email@example.com";
String API_KEY = "00000000-0000-0000-0000-000000000000";

// Login to Copyleaks
try {
    String authToken = Copyleaks.login(EMAIL_ADDRESS, API_KEY);
    System.out.println("Logged in successfully!\nToken: " + authToken);
} catch (CommandException e) {
    System.out.println("Failed to login: " + e.getMessage());
    System.exit(1);
}
Response
```
{
    "access_token": "<ACCESS_TOKEN>",
    ".issued": "2025-07-31T10:19:40.0690015Z",
    ".expires": "2025-08-02T10:19:40.0690016Z"
}
```
Save this token! It’s valid for 48 hours and can be reused for subsequent API calls.
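Because the token can be reused until it expires, one option is to keep the login response and check its `.expires` field before logging in again. The following is a minimal sketch that assumes the token is the JSON object shown in the response above; the helper names (`parse_expiry`, `is_token_valid`, `auth_headers`) and the five-minute safety margin are illustrative choices, not part of the Copyleaks SDK.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_expiry(value: str) -> datetime:
    """Parse a timestamp like '2025-08-02T10:19:40.0690016Z' as UTC."""
    # Drop the trailing 'Z' and the fractional seconds (7 digits in the sample
    # response), which strptime's %f directive cannot parse directly.
    seconds_part = value.rstrip("Z").split(".")[0]
    parsed = datetime.strptime(seconds_part, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def is_token_valid(token: dict, now: Optional[datetime] = None,
                   margin: timedelta = timedelta(minutes=5)) -> bool:
    """Return True if the token is still usable, keeping a safety margin."""
    now = now or datetime.now(timezone.utc)
    return now + margin < parse_expiry(token[".expires"])


def auth_headers(token: dict) -> dict:
    """Build the Authorization header described in the login section."""
    return {"Authorization": f"Bearer {token['access_token']}"}
```

A caller could store the login response in memory or on disk and request a new token only when `is_token_valid` returns `False`.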

Extracting images from a PDF file

Next, extract all the images from the PDF. The function below takes a PDF file path and writes every embedded image to the output_folder directory (Extracted-Images by default).

import os
import fitz  # PyMuPDF is imported under the name fitz
from pathlib import Path

def extract_images(pdf_path: str, output_folder: str = "Extracted-Images") -> list[str]:
    """
    Extract all images from a PDF file.

    Args:
        pdf_path: Path to PDF file
        output_folder: Output folder for images

    Returns:
        List of extracted image paths as strings
    """
    os.makedirs(output_folder, exist_ok=True)

    extracted = []
    pdf_name = Path(pdf_path).stem

    pdf = None
    try:
        pdf = fitz.open(pdf_path)

        print(f"Processing: {pdf_path}")
        print(f"Pages: {len(pdf)}")

        image_count = 0

        for page_num in range(len(pdf)):
            page = pdf[page_num]
            images = page.get_images(full=True)

            print(f"Page {page_num + 1}: {len(images)} image(s)")

            for img_index, img in enumerate(images):
                xref = img[0]
                base_image = pdf.extract_image(xref)
                image_bytes = base_image["image"]
                ext = base_image["ext"]

                filename = f"{pdf_name}_page{page_num + 1}_img{img_index + 1}.{ext}"
                path = os.path.join(output_folder, filename)
                try:
                    with open(path, "wb") as f:
                        f.write(image_bytes)
                except OSError as e:
                    # Skip images that could not be written instead of
                    # reporting them as extracted
                    print(f"  ✗ Error writing {filename}: {e}")
                    continue

                image_count += 1
                extracted.append(path)
                print(f"  ✓ {filename}")

    except Exception as e:
        print(f"✗ Error: {e}")
        return []
    finally:
        if pdf is not None:
            pdf.close()
    print(f"\n✓ Extracted {image_count} images")
    return extracted

if __name__ == "__main__":
    extract_images("my_file.pdf", "output_dir")
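If a PDF reuses the same image on several pages (a logo in a page header, for example), the loop above may write that image once per page. As an optional step before submitting, the sketch below drops files whose bytes are identical so each distinct image is scanned only once. The `unique_images` helper is a hypothetical addition for illustration, not part of PyMuPDF or the Copyleaks SDK.

```python
import hashlib
from pathlib import Path


def unique_images(image_paths: list[str]) -> list[str]:
    """Return paths whose file contents are distinct, keeping first occurrences."""
    seen_digests = set()
    unique = []
    for path in image_paths:
        # Hash the raw bytes so byte-identical copies share one digest
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        if digest in seen_digests:
            continue  # identical bytes were already queued for scanning
        seen_digests.add(digest)
        unique.append(path)
    return unique
```

It could be applied directly to the list returned by `extract_images`. Note that this only catches exact duplicates; the same picture stored at a different resolution or encoding is treated as a separate image.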

Submit for Analysis

Once the images are extracted, you can submit them for analysis.

We are going to use the AI Image Detector Endpoint to send an image for analysis.

AI Detection scan

This function takes your image, converts it to base64, and submits it via the SDK’s ImageDetectionClient. The SDK handles authentication and HTTP transport.

import os
import base64
import uuid
from copyleaks.clients.image_detection_client import ImageDetectionClient
from copyleaks.models.ai_image_detection import (
    CopyleaksAiImageDetectionRequestModel,
    CopyleaksAiImageDetectionModels,
)

def detect(image_path: str, auth_token: str):
    """Detect AI content in image using the Copyleaks SDK."""
    try:
        with open(image_path, 'rb') as f:
            image_data = base64.b64encode(f.read()).decode('utf-8')
    except Exception as e:
        print(f"  ✗ Error reading image: {e}")
        return None

    scan_id = str(uuid.uuid4())
    payload = CopyleaksAiImageDetectionRequestModel(
        base64=image_data,
        filename=os.path.basename(image_path),
        model=CopyleaksAiImageDetectionModels.AI_IMAGE_1_ULTRA,
        sandbox=False,
    )

    client = ImageDetectionClient()
    return client.submit(auth_token, scan_id, payload)

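if __name__ == "__main__":
    from pathlib import Path

    auth_token = "auth_token"  # replace with the token returned by the login step
    path = Path("directory_path")  # folder containing the extracted images
    for entry in path.iterdir():
        if entry.is_file():
            # Pass the full path; entry.name alone would drop the directory
            print(detect(str(entry), auth_token))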
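To connect the extraction and detection steps, a small batch helper can run the detection function over every extracted image and collect the results by path. This is a sketch only: `scan_images` is a hypothetical helper, and it takes the detection function as a parameter (for instance the `detect` function above) so that it makes no assumptions about the SDK's return type.

```python
from typing import Any, Callable, Optional


def scan_images(image_paths: list[str],
                detect_fn: Callable[[str, str], Optional[Any]],
                auth_token: str) -> dict[str, Any]:
    """Run detect_fn on every image and collect the results that succeeded."""
    results = {}
    for image_path in image_paths:
        try:
            result = detect_fn(image_path, auth_token)
        except Exception as e:
            # One failed submission should not stop the rest of the batch
            print(f"Failed: {image_path}: {e}")
            continue
        if result is not None:
            # detect() returns None when the image file cannot be read
            results[image_path] = result
    return results
```

With the functions from this guide, a call such as `scan_images(extract_images("my_file.pdf"), detect, auth_token)` would return a dictionary mapping each image path to its detection result.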

Interpreting The Response

See the Interpreting The Response section of the Detecting AI-Generated Images guide.

🎉 Congratulations!

You have successfully submitted images from a PDF for AI detection.

You are free to adapt this code to your needs.

🗺️ Next Steps

- API Reference: Explore the full API reference for the AI Detection endpoint.
- AI Logic: Learn how AI Logic can help you interpret the results of AI text detection.
- Accuracy & 3rd Party Evaluations: Discover how Copyleaks AI Detector maintains top accuracy in third-party evaluations.
