Frequently Asked Questions

Is pdf2md free to use?

Yes. pdf2md is presented as an open-source project, so the tool itself is free to use. You still need your own Docker host and GPU-capable infrastructure, so the real cost is the compute and operations layer around it.

How does pdf2md compare to Marker?

pdf2md is more opinionated about local inference because it wraps VLM backends inside Docker and exposes a pure Go CLI. Marker is the better choice if you want a broader community baseline and a simpler single-tool workflow, while pdf2md is stronger when you want explicit control over model backends and container boundaries.

Does pdf2md support scanned PDFs?

Yes. pdf2md supports OCR-oriented backends such as `dots-ocr` and `paddleocr-vl-1.5-gguf`, so it can handle scanned or image-heavy PDFs better than text-only extractors. Output quality still depends on scan quality, page DPI, and layout complexity, so poor source PDFs will produce weaker Markdown.

Can pdf2md run without a GPU?

Not in practice. pdf2md is designed around GPU-backed Docker inference, and the repository lists `nvidia-container-toolkit` as a prerequisite. It is not intended as a CPU-only document converter, so a GPU host is the practical deployment path.

What does pdf2md output?

pdf2md produces Markdown and JSON from the same PDF conversion pipeline. That makes pdf2md useful when you need human-readable text for docs and machine-readable structure for downstream indexing or automation.

When should I use the paddleocr-vl-1.5-gguf model?

Use `paddleocr-vl-1.5-gguf` when you want the two-stage path: ONNX layout detection followed by `llama.cpp` recognition. Choose it when you care about block-level structure and want to avoid installing `onnxruntime` directly on the host.

pdf2md: Best PDF Conversion CLI Tools for Developers in 2026

pdf2md turns PDFs into structured Markdown through local Docker-backed VLM pipelines, so you can keep documents on-prem while extracting layout-aware text with minimal host dependencies.

What Is pdf2md?

pdf2md is a Go-based PDF-to-Markdown command-line tool built by ninehills. It targets developers, researchers, and ops teams that need to convert PDF documents into structured Markdown with local Docker-backed VLM inference. It ships with 3 model backends, prebuilt binaries for 6 platforms, and 78 tests across 13 packages.

The core idea is simple: keep the host thin, push model execution into containers, and return Markdown plus JSON without forcing Python or on-host CUDA setup. That design makes pdf2md a good fit for batch processing, reproducible document pipelines, and private workflows where PDFs cannot leave your infrastructure.

Quick Overview

pdf2md is a narrow, engineering-first converter, not a general document platform.

Attribute	Details
Type	PDF Conversion CLI Tools
Best For	developers, researchers, and platform teams
Language/Stack	Go, Docker, vLLM, llama.cpp, ONNX, MuPDF
License	N/A
GitHub Stars	N/A
Pricing	Open-Source
Last Release	v0.1 (release date: N/A)

Who Should Use pdf2md?

pdf2md fits teams that want deterministic, local PDF extraction rather than SaaS OCR.

Solo developers building document ingestion pipelines who want a single binary and do not want to assemble a Python stack.
Platform and infra teams that need repeatable batch conversion jobs running through Docker with explicit model selection and port control.
Research teams processing papers, reports, or scans that need layout-aware output in Markdown and JSON for downstream NLP or indexing.
Security-conscious orgs that cannot send PDFs to third-party APIs and want inference to stay inside their own GPU host.

Not ideal for:

CPU-only laptops that cannot run the Docker GPU path or do not have nvidia-container-toolkit available.
Teams wanting a hosted GUI instead of a terminal workflow and container orchestration.
Users who need zero container setup because pdf2md still depends on Docker even though the host binary is pure Go.

Key Features of pdf2md

Three inference backends — pdf2md supports dots-ocr, logics-parsing-v2, and paddleocr-vl-1.5-gguf. That gives you three different trade-offs between layout-aware OCR, HTML-structured parsing, and two-stage block recognition.
Pure Go single binary — the CLI is compiled from Go and does not require Python, onnxruntime, or a local CUDA toolkit install. The host footprint stays small, while model execution happens in containers.
Docker-managed model serving — pdf2md launches and talks to vLLM, llama.cpp, and an ONNX service over HTTP. That makes the runtime boundary explicit and keeps model-specific dependency drift out of your workstation.
Two-stage PaddleOCR-VL pipeline — the paddleocr-vl-1.5-gguf path renders PDF pages, runs ONNX-based layout detection, crops blocks, then sends them to llama.cpp for recognition. The result is merged into Markdown plus JSON.
Multi-platform releases — the project publishes prebuilt binaries for linux, macOS, and Windows across amd64 and arm64. That is a practical fit for CI runners, desktops, and self-hosted build agents.
Tunable batch execution — flags like --dpi, --concurrency, --timeout, --port, and --model-dir make it usable for both ad hoc runs and large document queues. The defaults are sensible for local GPU inference but still configurable for production jobs.
Explicit project structure — the repo separates pdf rendering, docker orchestration, inference clients, layout mappings, and markdown merge logic. That separation makes the codebase easier to audit than a monolithic shell script.
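
The three backends above map onto three kinds of documents. As a rough illustration, a small wrapper could pick the backend before calling the CLI. This is a minimal sketch, not part of pdf2md: the `pick_model` helper and its category names (`layout-ocr`, `html-structure`, `block-level`) are hypothetical, and only the model names come from the feature list.

```shell
# Hypothetical helper: map a rough document category to one of the three
# backends named in the feature list. The category names are assumptions
# made for this sketch; they are not pdf2md options.
pick_model() {
  case "$1" in
    layout-ocr)     echo "dots-ocr" ;;               # layout-aware OCR
    html-structure) echo "logics-parsing-v2" ;;      # HTML-structured parsing
    block-level)    echo "paddleocr-vl-1.5-gguf" ;;  # two-stage block recognition
    *)              echo "unknown category: $1" >&2; return 1 ;;
  esac
}
```

With that function loaded, a call such as `./pdf2md --model "$(pick_model block-level)" --output ./output paper.pdf` would select the two-stage path. Keeping the mapping in one place makes it easy to change as you learn which backend suits your corpus.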

pdf2md vs Alternatives

pdf2md is best when you want a local, containerized conversion pipeline with model choice and minimal host dependencies.

Tool	Best For	Key Differentiator	Pricing
pdf2md	Local PDF-to-Markdown batch conversion	Pure Go CLI that orchestrates Dockerized VLM backends and outputs Markdown + JSON	Open-Source
Marker	General PDF-to-Markdown extraction	Strong open-source baseline with a broader community footprint and simpler workflow	Open-Source
Docling	Document conversion and downstream parsing	Broader document processing library mindset, better if you need more than a CLI	Open-Source
Nougat	Scientific paper conversion	Research-oriented OCR-to-Markdown pipeline tuned for academic PDFs	Open-Source

Pick Marker if you want the most widely discussed open-source alternative and can accept its opinionated pipeline. Pick Docling if your document workflow needs a richer conversion library rather than a focused CLI.

Pick Nougat if your corpus is mostly scientific papers and you care about academic-text extraction patterns. Pick pdf2md when you want the cleanest split between a Go orchestrator and containerized inference, especially for private or batch-heavy jobs.


How pdf2md Works

pdf2md uses a pipeline architecture instead of a single monolithic parser. The Go binary renders PDF pages, sends page images to a model backend over HTTP, and then merges the returned structure into Markdown and JSON. That separation matters because the CLI stays portable while the heavy inference layers remain isolated in Docker containers.

The design is intentionally backend-driven. dots-ocr routes through vLLM, logics-parsing-v2 uses a similar VLM path, and paddleocr-vl-1.5-gguf adds an ONNX layout detector before a llama.cpp recognition stage. The main abstraction is page-level document structure: render, detect blocks, infer text, then stitch the result back together with layout hints preserved where possible.

# get the binary and run a first conversion
curl -sL https://github.com/ninehills/pdf2md/releases/download/v0.1/pdf2md_0.1_linux_amd64.tar.gz | tar xz
./pdf2md --model paddleocr-vl-1.5-gguf --output ./output paper.pdf

Those two commands download the prebuilt binary, then run the two-stage model path against paper.pdf and write output files into ./output. If the model weights are not yet present in the configured model directory, expect the first run to be slower than later runs.

Pros and Cons of pdf2md

Pros:

Low host dependency count — the repo explicitly avoids requiring Python, local onnxruntime, or a manual CUDA installation on the host.
Multiple model paths — you can choose between OCR-heavy and parsing-heavy backends depending on document layout and accuracy needs.
Deterministic CLI workflow — flags such as --model, --dpi, --concurrency, and --output make automation straightforward.
Docker isolation — model serving is containerized, which reduces environment drift across developer machines and CI runners.
Cross-platform binaries — prebuilt releases cover linux, macOS, and Windows on amd64 and arm64.
Good for batch jobs — the architecture is shaped around repeatable conversions rather than one-off interactive use.

Cons:

Docker is mandatory — you do not get a truly standalone host-only binary path because inference depends on containers.
GPU-centric setup — the documented prerequisite is nvidia-container-toolkit, so CPU-only environments are not the intended primary use case.
Model weight management — you still need to manage local model directories and container images, which adds operational overhead.
Not a GUI product — users who want drag-and-drop workflows or browser-based review will need a different tool.
Accuracy depends on document quality — scanned PDFs, complex tables, and low-resolution pages still require tuning --dpi and model choice.
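
Since tuning --dpi is largely trial and error, one way to approach it is to generate the same conversion at several DPI values and compare the results. The sketch below only prints the commands so you can review them first. It assumes that --dpi accepts an integer and that --output can point at a per-run directory; the `dpi_sweep` name and the out/dpi-N layout are illustrative, not part of pdf2md.

```shell
# Print one pdf2md command per DPI value so the outputs can be compared
# side by side. Nothing is executed here.
# Assumptions: --dpi takes an integer; the out/dpi-N directory layout is
# illustrative and not something pdf2md prescribes.
dpi_sweep() {
  pdf=$1
  model=$2
  shift 2
  for dpi in "$@"; do
    printf './pdf2md --model %s --dpi %s --output "out/dpi-%s" "%s"\n' \
      "$model" "$dpi" "$dpi" "$pdf"
  done
}
```

For example, `dpi_sweep scan.pdf dots-ocr 150 200 300` prints three commands. Once they look right, run them one at a time and compare the Markdown for the pages that matter most.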

Getting Started with pdf2md

The fastest path is to download the release binary, confirm Docker and GPU access, then run the CLI against one PDF. The project also supports source builds with go build, which is useful if you want to pin a commit or modify the pipeline.

# prerequisites
docker --version && nvidia-smi

# install the released binary
curl -sL https://github.com/ninehills/pdf2md/releases/download/v0.1/pdf2md_0.1_linux_amd64.tar.gz | tar xz
./pdf2md --help

# first conversion
./pdf2md paper.pdf

On that first run, expect pdf2md to pull the selected model container if it is not already present and to write the converted artifacts into the working directory unless you pass -o. If you are processing many files, set --concurrency deliberately and point --model-dir at a persistent path so repeated runs do not re-fetch weights.
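
For a directory of PDFs, the advice above can be wrapped in a small loop. This is a minimal sketch under stated assumptions: the `convert_dir` function, the `MODEL_DIR`, `CONCURRENCY`, and `DRY_RUN` variables, the ./models default, and the one-output-directory-per-file layout are all hypothetical. Only the --model-dir, --concurrency, and --output flags come from the project description, and what --concurrency parallelizes is not specified there, so treat the value as a starting point.

```shell
# Convert every PDF in a directory, one pdf2md invocation per file.
# Assumptions: MODEL_DIR, CONCURRENCY, and DRY_RUN are variables invented for
# this sketch; the ./models default and per-file output directories are
# illustrative. Set DRY_RUN=1 to print commands instead of running them.
convert_dir() {
  in_dir=$1
  out_dir=$2
  model_dir=${MODEL_DIR:-./models}
  for pdf in "$in_dir"/*.pdf; do
    [ -e "$pdf" ] || continue          # skip the literal pattern when nothing matches
    name=$(basename "$pdf" .pdf)
    # Build the command once so the dry-run and real paths stay identical.
    set -- ./pdf2md --model-dir "$model_dir" --concurrency "${CONCURRENCY:-2}" \
      --output "$out_dir/$name" "$pdf"
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "$*"
    else
      "$@" || echo "failed: $pdf" >&2  # report the failure and keep going
    fi
  done
}
```

Running `DRY_RUN=1 convert_dir ./pdfs ./output` first shows exactly what would be executed. A failed file is reported on stderr without stopping the rest of the queue, which tends to matter more than speed in long batch jobs.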

Verdict

pdf2md is a strong option for local PDF-to-Markdown conversion when Docker and a GPU are acceptable. Its biggest strength is the clean separation between a pure Go orchestrator and multiple inference backends; its biggest caveat is the operational cost of running containerized models. Choose pdf2md if you want private, reproducible document extraction.
