Knwler CLI
Knwler can be used via your terminal or as a Python package. See the API page for more details on how to use individual methods.
The CLI offers a lot of functionality; rather than listing every possible action, you will find concrete examples below. Note that if you have installed Knwler via pipx you can simply use `knwler ...` rather than `uv run main.py ...`.
There is a demo which uses the Human Rights declaration as a short PDF:

```shell
uv run main.py demo
```

This will run the whole pipeline with all options:
- download the pdf
- parse the pdf to markdown
- chunk the text
- infer a schema
- rephrase the chunks
- extract a little knowledge graph for every chunk
- consolidate the graphs
- extract a title
- extract a summary
- render a report (html).
The `cli_demo.py` script contains all of these steps and can serve as a guide on how to use Knwler as a Python package. You can use any of the steps individually or leave out what you don't need (e.g. the rephrasing of chunks).
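The step order above can be sketched schematically in Python. Every function below is a stand-in written for this sketch, not Knwler's actual API; consult `cli_demo.py` for the real calls.

```python
# Schematic sketch of the demo pipeline. All functions are illustrative
# placeholders (NOT Knwler's real API; see cli_demo.py for that).

def parse_pdf(path):            # step: pdf -> markdown
    return f"# markdown from {path}"

def chunk_text(md, size=40):    # step: split the text into chunks
    return [md[i:i + size] for i in range(0, len(md), size)]

def extract_graph(chunk):       # step: a little graph per chunk
    return {"nodes": [chunk[:10]], "edges": []}

def consolidate(graphs):        # step: merge the per-chunk graphs
    merged = {"nodes": [], "edges": []}
    for g in graphs:
        merged["nodes"] += g["nodes"]
        merged["edges"] += g["edges"]
    return merged

md = parse_pdf("HumanRights.pdf")
graphs = [extract_graph(c) for c in chunk_text(md)]
graph = consolidate(graphs)
print(len(graph["nodes"]))
```

The point of the sketch is only the data flow: each stage consumes the previous stage's output, so any stage (e.g. rephrasing) can be dropped without touching the others.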
Of course, the main reason you would try Knwler is to extract your own document:
```shell
uv run main.py -f https://knwler.com/pdfs/HumanRights.pdf
```

and if you have a local file:

```shell
uv run main.py -f ./HumanRights.pdf
```

This will output lots of files in a results directory:
- `graph.gml`: a format convenient for graph visualization and graph analytics
- `graph.json`: contains everything you need for downstream tasks
- `index.html`: a rendering of the `graph.json` data
- `log.html`: the log in HTML format
- `log.txt`: the log in text format.
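For downstream tasks, `graph.json` can be loaded with the standard library alone. The node/edge layout used below is purely hypothetical (a minimal sketch, written for illustration); inspect your own `graph.json` before relying on any field names.

```python
import json
from collections import Counter
from pathlib import Path

# Toy stand-in for a results file; the real schema of graph.json
# may differ -- check your own results directory first.
Path("results").mkdir(exist_ok=True)
toy = {
    "nodes": [{"id": "Right to life", "type": "Right"},
              {"id": "Article 3", "type": "Article"}],
    "edges": [{"source": "Article 3", "target": "Right to life"}],
}
Path("results/graph.json").write_text(json.dumps(toy))

# Load it back and count nodes per type, as a downstream task might.
graph = json.loads(Path("results/graph.json").read_text())
by_type = Counter(n["type"] for n in graph["nodes"])
print(by_type)
```

The `graph.gml` file serves the same graph in GML, which common graph tools (Gephi, networkx) can read directly.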
You can output things to a different directory with:

```shell
uv run main.py -f ./HumanRights.pdf --output ./stuff
```

Every time you run the same document a new directory will be created; you can override this with:

```shell
uv run main.py -f ./HumanRights.pdf --output ./stuff --overwrite
```

The above commands will all use Ollama as the backend. If you want OpenAI instead, simply use:

```shell
uv run main.py -f ./HumanRights.pdf --output ./stuff --overwrite --openai
```

and make sure the API key is in your environment. If not, use:

```shell
export OPENAI_API_KEY=sk-....
```

and similarly for Anthropic:

```shell
uv run main.py -f ./HumanRights.pdf --output ./stuff --overwrite --anthropic
```

which, according to our benchmarks, will give you the highest-quality output.
The rendered HTML uses a template and you can switch to another one via:
```shell
uv run main.py -f ./HumanRights.pdf --output ./stuff --overwrite --anthropic --template columns
```

Note that this will use the cached LLM exchanges and you won't have to pay again for a different style.
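Why a re-render is free can be pictured as a prompt-keyed cache: only the rendering stage changes when you switch templates, so every stored LLM exchange is reused. The class below is an illustrative sketch, not Knwler's actual cache implementation.

```python
import hashlib

class LLMCache:
    """Illustrative prompt-keyed cache (not Knwler's real implementation)."""

    def __init__(self):
        self.store = {}
        self.calls = 0  # counts actual (paid) LLM calls

    def complete(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode()).hexdigest()
        if key not in self.store:
            self.calls += 1                            # cache miss: pay once
            self.store[key] = f"response-to:{prompt}"  # stand-in for an API call
        return self.store[key]                         # cache hit: free

cache = LLMCache()
cache.complete("extract a graph from chunk 1")
cache.complete("extract a graph from chunk 1")  # same prompt: no new call
print(cache.calls)  # -> 1
```

Because the rendering step reads only cached responses, switching `--template` never triggers a new paid call.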
If you wish to understand the quality of a model's output you can use the benchmark utility:
```shell
uv run main.py benchmark run
```

The `grid.json` file contains the items which will be benchmarked. This will render a comprehensive report and sort the results based on a knowledge yield score.
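The ranking in the report amounts to a descending sort on the score. The field names below are invented for illustration and do not reflect the actual `grid.json` format.

```python
# Hypothetical benchmark results; the field names are illustrative
# only and do not match grid.json's real format.
results = [
    {"model": "model-a", "knowledge_yield": 0.61},
    {"model": "model-b", "knowledge_yield": 0.74},
    {"model": "model-c", "knowledge_yield": 0.58},
]

# Sort best-first, as the rendered report does.
ranked = sorted(results, key=lambda r: r["knowledge_yield"], reverse=True)
print([r["model"] for r in ranked])  # -> ['model-b', 'model-a', 'model-c']
```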
Performance Tips
- Run Ollama with `OLLAMA_NUM_PARALLEL=8 ollama serve` to fully saturate `--concurrent` requests locally.
- Tune `--max-tokens` (default 400): smaller values create more chunks and finer-grained graphs at the cost of more LLM calls; larger values do the opposite.
- The cache is enabled by default. Re-runs with different export flags (e.g. adding `--html-report`) are instant and free.
- For large document sets, use `--dir` with `--consolidate` to batch-process and merge in a single command.
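The `--max-tokens` trade-off is easy to see with a toy chunker. This sketch splits on words rather than tokens (an assumption for simplicity; Knwler's real chunking presumably counts model tokens), but the shape of the trade-off is the same.

```python
def chunk(words, max_tokens):
    """Greedy fixed-budget chunking; a word-based stand-in for token-based chunking."""
    return [words[i:i + max_tokens] for i in range(0, len(words), max_tokens)]

doc = ["word"] * 2000            # a 2000-word toy document

small = chunk(doc, 100)          # small budget  -> many small chunks
large = chunk(doc, 400)          # default-sized -> fewer, coarser chunks
print(len(small), len(large))    # -> 20 5
```

Each chunk triggers its own graph-extraction call, so halving `--max-tokens` roughly doubles both the LLM cost and the granularity of the resulting graph.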