# Batch processing

Both OpenAI and Google Gemini provide a Batch API that processes requests asynchronously at roughly 50% of real-time pricing. Knwler ships a dedicated batch processor for each provider. Give either one a directory of documents and it will run the complete knowledge-graph pipeline (chunking, schema discovery, extraction, consolidation, community labelling) and write per-document `graph.json` and `index.html` files when done.

| | OpenAI Batch | Gemini Batch |
|---|---|---|
| Command | `batch-openai run` | `batch-gemini run` |
| API key env var | `OPENAI_API_KEY` | `GEMINI_API_KEY` / `GOOGLE_API_KEY` |
| State database | `batch.db` | `batch_gemini.db` |
| Extra dependency | none | `pip install google-genai` |
| Cost vs real-time | ~50% | ~50% |
| SLA | 24 h | 24 h |

## How it works

Both processors share the same three-round architecture. All LLM calls within a round are batched into a single API job, which is submitted and polled until complete before the next round begins.

```mermaid
flowchart TD
    A([document directory]) --> S[Scan and chunk all files\nSQLite state init]

    S --> R1

    subgraph "Round 1 - Discovery"
        R1[Build JSONL\nlanguage + schema per doc]
        R1 -->|submit batch job| API1[Batch API]
        API1 -->|poll until complete| P1[Parse responses\nstore language + schema]
    end

    P1 --> R2

    subgraph "Round 2 - Processing"
        R2[Build JSONL\ntitle + summary + rephrase + extraction per chunk]
        R2 -->|submit batch job| API2[Batch API]
        API2 -->|poll until complete| P2[Parse responses\nstore per-chunk graphs]
    end

    P2 --> R3

    subgraph "Round 3 - Finalisation"
        R3[Build JSONL\nconsolidation summaries + community labels]
        R3 -->|submit batch job| API3[Batch API]
        API3 -->|poll until complete| P3[Parse responses\nbuild final graph]
    end

    P3 --> OUT[Write graph.json + index.html\nper document]
    OUT -->|optional| R4[Round 4 - Cross-file consolidation]
```

Each round is resumable — if the process is interrupted, re-running the same command picks up exactly where it left off using the SQLite state database.
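Conceptually, each round in the diagram reduces to the same submit-and-poll loop. A minimal sketch of that loop follows; the function names `build_jsonl`, `submit`, `poll`, and `parse` are placeholders for illustration, not knwler's actual internal API:

```python
import time

def run_round(build_jsonl, submit, poll, parse, interval=30):
    """Run one batch round: build requests, submit one job, poll, parse."""
    requests = build_jsonl()      # one JSON object per LLM call in this round
    job_id = submit(requests)     # all calls go into a single batch job
    while True:
        status = poll(job_id)
        if status in ("completed", "failed", "cancelled", "expired"):
            break
        time.sleep(interval)
    if status != "completed":
        raise RuntimeError(f"batch job {job_id} ended in state {status!r}")
    return parse(job_id)          # results are stored before the next round starts
```

Because each round only starts after the previous one is fully parsed and persisted, a crash between rounds loses no work.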


## OpenAI Batch Processing

### Setup

```bash
export OPENAI_API_KEY="your-key-here"
```

No extra dependencies are needed beyond knwler’s standard install.

### Commands

Run (or resume) the full pipeline:

```bash
python main.py batch-openai run \
    --input ./documents \
    --output ./results
```

Check the status of a running or completed pipeline:

```bash
python main.py batch-openai status \
    --input ./documents \
    --output ./results
```

Run cross-file consolidation on an already-processed output directory:

```bash
python main.py batch-openai consolidate \
    --input ./documents \
    --output ./results
```

### Options

| Flag | Default | Description |
|---|---|---|
| `--input` / `-i` | | Directory of source documents |
| `--output` / `-o` | | Output directory |
| `--discovery-model` | `gpt-4o-mini` | Model for schema/language discovery |
| `--extraction-model` | `gpt-4o-mini` | Model for extraction and consolidation |
| `--template` | `default` | HTML report template |
| `--consolidate` | off | Also run a 4th round to merge all document graphs |

### Examples

```bash
# Use larger models for discovery and extraction
python main.py batch-openai run \
    -i ./docs -o ./out \
    --discovery-model gpt-4o \
    --extraction-model gpt-4o-mini

# Process and consolidate all documents in one go
python main.py batch-openai run \
    -i ./pdfs -o ./batching \
    --consolidate

# Consolidate after the fact
python main.py batch-openai consolidate \
    -i ./pdfs -o ./batching
```

### API limits

| Limit | Value |
|---|---|
| Requests per batch | 50,000 |
| Batch file size | 200 MB |
| Token budget per batch | 2,000,000 (with 10% safety buffer) |

OpenAI commits to completing batch jobs within 24 hours; in practice most finish in minutes.
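Corpora that exceed these limits must be split across several batch jobs. A sketch of greedy packing under the request-count and token limits, applying the 10% safety buffer from the table above (illustrative only; the real processor also enforces the 200 MB file-size cap):

```python
def split_into_batches(requests, max_requests=50_000,
                       token_budget=2_000_000, safety=0.10):
    """Greedily pack (request, token_count) pairs into batches that
    respect the per-batch request-count and token limits."""
    budget = int(token_budget * (1 - safety))  # apply the safety buffer
    batches, current, used = [], [], 0
    for req, tokens in requests:
        # Start a new batch if adding this request would exceed a limit.
        if current and (len(current) >= max_requests or used + tokens > budget):
            batches.append(current)
            current, used = [], 0
        current.append(req)
        used += tokens
    if current:
        batches.append(current)
    return batches
```

With the buffer applied, three requests of one million tokens each would land in three separate jobs, since any two together exceed the effective 1.8-million-token budget.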


## Gemini Batch Processing

### Setup

Install the Google AI SDK:

```bash
pip install google-genai
```

Export your API key:

```bash
export GEMINI_API_KEY="your-key-here"
# or
export GOOGLE_API_KEY="your-key-here"
```

### Commands

Run (or resume) the full pipeline:

```bash
python main.py batch-gemini run \
    --input ./documents \
    --output ./results
```

Check the status of a running or completed pipeline:

```bash
python main.py batch-gemini status \
    --input ./documents \
    --output ./results
```

Run cross-file consolidation on an already-processed output directory:

```bash
python main.py batch-gemini consolidate \
    --input ./documents \
    --output ./results
```

### Options

| Flag | Default | Description |
|---|---|---|
| `--input` / `-i` | | Directory of source documents |
| `--output` / `-o` | | Output directory |
| `--discovery-model` | `gemini-3.1-flash-lite-preview` | Model for schema/language discovery |
| `--extraction-model` | `gemini-3.1-flash-lite-preview` | Model for extraction and consolidation |
| `--template` | `default` | HTML report template |
| `--consolidate` | off | Also run a 4th round to merge all document graphs |

### Examples

```bash
# Use a larger model
python main.py batch-gemini run \
    -i ./docs -o ./out \
    --discovery-model gemini-3-flash-preview \
    --extraction-model gemini-3-flash-preview

# Process and consolidate in one shot
python main.py batch-gemini run \
    -i ./pdfs -o ./batching \
    --consolidate

# Consolidate a completed run
python main.py batch-gemini consolidate \
    -i ./pdfs -o ./batching
```

### How Gemini batches are submitted

The Gemini processor uses the `google-genai` SDK and the Gemini Batch API. For each round it:

1. Serialises all prompts to a JSONL file (one JSON object per request).
2. Uploads the file via the Gemini File API.
3. Creates a batch job referencing the uploaded file.
4. Polls with exponential backoff (initial 30 s, max 5 min) until the job reaches a terminal state.
5. Downloads and parses the output JSONL.

Terminal states: `JOB_STATE_SUCCEEDED`, `JOB_STATE_FAILED`, `JOB_STATE_CANCELLED`, `JOB_STATE_EXPIRED`.
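The backoff in step 4 can be sketched as follows, with `poll_job` standing in for the SDK's job-status call (the loop shape is illustrative, not knwler's exact code):

```python
import time

TERMINAL_STATES = {
    "JOB_STATE_SUCCEEDED", "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED", "JOB_STATE_EXPIRED",
}

def wait_for_job(poll_job, initial=30.0, maximum=300.0, sleep=time.sleep):
    """Poll until the job reaches a terminal state, doubling the delay
    after each attempt from `initial` (30 s) up to `maximum` (5 min)."""
    delay = initial
    while True:
        state = poll_job()
        if state in TERMINAL_STATES:
            return state
        sleep(delay)
        delay = min(delay * 2, maximum)
```

Doubling with a cap keeps early polls responsive for jobs that finish in minutes while avoiding hammering the API during multi-hour runs.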

### API limits

| Limit | Value |
|---|---|
| Batch file size | 2 GB |

## Supported file types

Both processors accept: `.pdf`, `.txt`, `.md`, `.text`, `.markdown`


## State database & resumability

Each processor writes a SQLite database to the output directory (`batch.db` for OpenAI, `batch_gemini.db` for Gemini). It tracks:

- Per-document: text content, chunks, language, schema, per-chunk extraction results, consolidated graph.
- Per-round: batch job ID / job name, status, request count, timestamps.

If the process is killed or a round fails, simply re-run the same command — it will skip completed rounds and resume from the first incomplete one.

> [!IMPORTANT]
> The database is designed for crash recovery, not incremental updates. If you add new documents or want to reprocess from scratch, delete the output directory first.


## Cross-file consolidation (Round 4)

When `--consolidate` is passed (or the `consolidate` sub-command is used), a fourth batch round merges all per-document graphs into a single `consolidated_graph.json`:

1. All entity and relation descriptions across all documents are aggregated.
2. Duplicate descriptions are summarised via a batch LLM call.
3. A final merged graph with community detection is written to the output directory.
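Steps 1 and 2 can be sketched as grouping descriptions by entity name, then selecting the entities that need a summarisation call (the graph structure shown here is illustrative, not knwler's exact on-disk format):

```python
from collections import defaultdict

def aggregate_descriptions(graphs):
    """Collect every description for each entity name across all
    per-document graphs."""
    merged = defaultdict(list)
    for graph in graphs:
        for entity in graph["entities"]:
            merged[entity["name"]].append(entity["description"])
    return dict(merged)

def needs_summary(merged):
    """Entities that appear in more than one document accumulate
    multiple descriptions and get a batched LLM summarisation call."""
    return {name: descs for name, descs in merged.items() if len(descs) > 1}
```

Only the entities with multiple descriptions generate LLM requests in Round 4, which keeps the consolidation batch small relative to the extraction rounds.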

You can also consolidate after the fact using knwler’s standard consolidation command:

```bash
python main.py consolidate --dir ./results --output ./merged
```

## Choosing between real-time and batch

| Scenario | Recommendation |
|---|---|
| Single document, interactive use | `extract` (real-time) |
| A few documents | `extract` (real-time, parallel) |
| Tens or hundreds of documents | `batch-openai run` or `batch-gemini run` |
| Cost is the primary concern | Batch (either provider) |
| Lowest latency required | Real-time |