1. Choose the engine that matches the document
Each engine has a clear sweet spot. Default to Standard; switch up only when you can name the reason. When to use each engine:
- Standard: typed PDFs, born-digital invoices, text-layer PDFs from accounting systems. Cheapest at 1 credit per page.
- Vision: handwriting, low-quality scans, photos of paper, complex multi-column layouts, unusual fonts. 2 credits per page.
- KeyValuePair: forms with explicit label/value pairs (W-9s, ACORD forms, intake forms) where you want layout-aware key extraction. 1 credit per page plus LLM cost.
If a Standard parse already produces accurate text, switching to Vision adds cost without improving downstream results. Test cheap first.
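To make the price gap concrete, here is a minimal sketch that estimates base parse credits for a batch using the per-page rates listed above. The helper name `estimate_parse_credits` is hypothetical, and the LLM cost for KeyValuePair is left out because its size is not stated here.

```python
# Base credits per page, taken from the engine list above.
# KeyValuePair also carries an LLM cost that is not quantified here,
# so this sketch reports base credits only.
CREDITS_PER_PAGE = {"Standard": 1, "Vision": 2, "KeyValuePair": 1}


def estimate_parse_credits(engine: str, pages: int) -> int:
    """Return base parse credits for `pages` pages (hypothetical helper)."""
    if engine not in CREDITS_PER_PAGE:
        raise ValueError(f"unknown engine: {engine}")
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return CREDITS_PER_PAGE[engine] * pages


if __name__ == "__main__":
    for name in CREDITS_PER_PAGE:
        print(name, estimate_parse_credits(name, 500))
```

For a 500-page batch, the base cost doubles when moving from Standard to Vision, which is why a small Standard test run is worth doing first.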
2. Don’t reach for parse before extract by default
The extract node already runs OCR internally. Adding a parse step in front of it is only useful when:
- You want to route documents based on raw text content before deciding whether to extract them.
- You need the parsed text as a separate output for search, archiving, or audit.
- You are running the same document through several extract schemas and want to OCR once.
3. Tier up only when results justify the cost
Document quality is usually skewed: in a typical batch, roughly 90% of incoming documents are clean and 10% are messy. Running every document on the most expensive engine wastes money on the easy cases. A more economical pattern:
- Default to Standard parse for everything.
- Add a validation step downstream that checks for missing required fields or malformed values.
- On validation failure, route the document through a review node where a human can re-run with Vision or correct the data.
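The pattern above can be sketched as a small validation check plus the credit arithmetic behind it. Everything named here is an assumption for illustration: the field names, the amount format, the step labels `review` and `continue`, and the premise that every failing page gets one Vision re-run.

```python
import re

REQUIRED_FIELDS = ("vendor_name", "invoice_total")  # hypothetical schema fields
AMOUNT_RE = re.compile(r"^\d+(\.\d{2})?$")  # assumed format for invoice_total


def validation_problems(fields: dict) -> list:
    """Return a list of problems; an empty list means the record passes."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if not fields.get(name)]
    total = fields.get("invoice_total")
    if total and not AMOUNT_RE.match(str(total)):
        problems.append("malformed invoice_total")
    return problems


def next_step(fields: dict) -> str:
    """Send failing records to review; let clean ones continue."""
    return "review" if validation_problems(fields) else "continue"


def tiered_credits(pages: int, failure_rate: float) -> float:
    """Standard (1 credit) on every page, plus a Vision re-run (2 credits)
    on the share of pages that fail validation."""
    return pages * 1 + pages * failure_rate * 2
```

Under these assumptions, 1,000 pages with a 10% failure rate come to about 1,200 credits, against 2,000 credits for running Vision on every page.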
The Precision dial (Small, Medium, High) only affects the KeyValuePair engine, where it trades cost for the LLM tier used to read fields. Standard and Vision are flat-priced.
4. Use parse to enable content-based routing
Parse becomes a clear win when the next step is a route or classify node deciding what to do with the document. Examples:
- Drop documents that don’t contain a vendor name on page 1.
- Split mailroom batches: route invoices to the AP pipe, contracts to legal, everything else to a generic intake.
- Skip extract entirely for documents that look like cover pages or duplicates.
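As an illustration of the mailroom split, the sketch below picks a destination from parsed page text. It is not the route node's actual configuration: the function name, the keyword rules, and the destination labels are all hypothetical, and it assumes the parsed text is available as one string per page.

```python
def route_document(page_texts: list) -> str:
    """Choose a destination from parsed text (keyword rules are illustrative)."""
    # Nothing readable on page 1: drop rather than spend extract credits.
    if not page_texts or not page_texts[0].strip():
        return "drop"
    first_page = page_texts[0].lower()
    # Cover pages carry no extractable fields, so skip extract for them.
    if "cover sheet" in first_page:
        return "skip_extract"
    if "invoice" in first_page:
        return "ap"
    if "agreement" in first_page or "contract" in first_page:
        return "legal"
    return "intake"
```

In practice the keyword checks would be replaced by whatever filter or classifier the route step supports; the point is that the decision runs on parse output before any extract cost is incurred.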
5. Keep an eye on whitespace and layout artifacts downstream
OCR preserves visual layout, including page headers, footers, and watermarks. If a downstream node uses that text for matching, those artifacts can throw off:
- Regex matches in route filters (a header copyright line catching every page).
- Embedding similarity in AI classify (boilerplate dominating the signal).
- Field descriptions that try to “find a value near the heading X”.
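One way to reduce these artifacts before matching is to drop lines that repeat across most pages, since running headers and footers tend to recur verbatim. The sketch below is a generic heuristic, not a feature of the parse node; the function name and the `min_share` threshold are assumptions, and it expects the parsed text as one string per page.

```python
from collections import Counter


def strip_repeated_lines(pages: list, min_share: float = 0.6) -> list:
    """Drop lines that recur on at least `min_share` of pages
    (likely headers, footers, or watermarks)."""
    counts = Counter()
    for page in pages:
        # Count each distinct non-empty line once per page.
        counts.update({line.strip() for line in page.splitlines() if line.strip()})
    # Require at least two occurrences so single-page documents are untouched.
    threshold = max(2, min_share * len(pages))
    boilerplate = {line for line, n in counts.items() if n >= threshold}
    return [
        "\n".join(l for l in page.splitlines() if l.strip() not in boilerplate)
        for page in pages
    ]
```

This only catches lines that repeat exactly, so a footer containing a changing page number would slip through and need a looser comparison.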
Common pitfalls
Parse + extract on the same document = double OCR
Adding parse upstream of extract runs OCR twice and bills both. Only chain them when you need parse’s text output for a separate purpose (routing, archiving, multiple schemas off one OCR pass). For a single-shot extract, drop parse.
Vision engine on clean digital PDFs wastes credits
Vision is for handwriting, photos, and ambiguous layouts. Born-digital PDFs with a real text layer parse correctly on Standard at half the cost. Run a small batch on Standard before committing to Vision globally.
KeyValuePair on free-form prose returns empty pairs
KeyValuePair expects label/value structure (forms, intake sheets). Letters, contracts, and reports have no consistent labels and produce a sparse or empty result. Use Standard or Vision and let extract pull the fields you need.
Expensive parse engine paired with a sloppy schema
Switching from Standard to Vision improves OCR accuracy but cannot fix a schema with vague descriptions or missing enums. If extract results are wrong on clean documents, the schema is the bottleneck, not parse. See schema design.
Related
Parse action
Configuration reference for the parse node
Extract best practices
How to design schemas that produce reliable results
Route action
Use parsed text to filter or branch
Credits
Per-engine credit costs