The parse node converts document pages into raw text. The principles below help you choose an engine, manage cost on long documents, and feed clean text to downstream extract or route nodes.

1. Choose the engine that matches the document

Each engine has a clear sweet spot. Default to Standard, and move to another engine only when you can name the reason. When to use each engine:
  • Standard: typed PDFs, born-digital invoices, text-layer PDFs from accounting systems. Cheapest at 1 credit per page.
  • Vision: handwriting, low-quality scans, photos of paper, complex multi-column layouts, unusual fonts. 2 credits per page.
  • KeyValuePair: forms with explicit label/value pairs (W-9s, ACORD forms, intake forms) where you want layout-aware key extraction. 1 credit per page plus LLM cost.
If a Standard parse already produces accurate text, switching to Vision adds cost without improving downstream results. Test cheap first.
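The selection rule above can be sketched as a small helper. This is an illustration only: `DocTraits`, `choose_engine`, and `base_credits` are hypothetical names rather than part of the product, the per-page credits are the figures quoted in this guide, and the LLM cost that KeyValuePair adds is not modeled. Sending a handwritten form to Vision rather than KeyValuePair is an assumption of this sketch.

```python
from dataclasses import dataclass

# Per-page credits as quoted in this guide.
# KeyValuePair also incurs LLM cost, which is not modeled here.
CREDITS_PER_PAGE = {"Standard": 1, "Vision": 2, "KeyValuePair": 1}


@dataclass
class DocTraits:
    """Hypothetical description of a document; not a platform type."""
    handwritten: bool = False
    poor_scan_or_photo: bool = False
    complex_layout: bool = False
    labeled_form: bool = False


def choose_engine(doc: DocTraits) -> str:
    """Default to Standard; move up only for a nameable reason."""
    # Assumption: image-quality problems outrank form structure.
    if doc.handwritten or doc.poor_scan_or_photo or doc.complex_layout:
        return "Vision"
    if doc.labeled_form:
        return "KeyValuePair"
    return "Standard"


def base_credits(engine: str, pages: int) -> int:
    """Base parse credits for a document, excluding any LLM cost."""
    return CREDITS_PER_PAGE[engine] * pages
```

For a 10-page born-digital invoice, `choose_engine(DocTraits())` returns `"Standard"` and `base_credits("Standard", 10)` is 10, against 20 for the same document on Vision.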

2. Don’t reach for parse before extract by default

The extract node already runs OCR internally. Adding a parse step in front of it is only useful when:
  • You want to route documents based on raw text content before deciding whether to extract them.
  • You need the parsed text as a separate output for search, archiving, or audit.
  • You are running the same document through several extract schemas and want to OCR once.
If none of those apply, skip parse and let extract handle the OCR. You’ll save credits and keep the pipeline simpler.
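As a quick check, the three conditions above reduce to a single boolean test. The function and parameter names here are hypothetical and only restate the list. Treating "several extract schemas" as "more than one" is an assumption.

```python
def needs_parse(route_on_text: bool, keep_text_output: bool, extract_schemas: int) -> bool:
    """Return True when a parse node in front of extract earns its cost."""
    # Assumption: "several extract schemas" means more than one.
    return route_on_text or keep_text_output or extract_schemas > 1
```

A single-schema pipeline with no routing and no archived text gives `needs_parse(False, False, 1) == False`, so parse can be dropped.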

3. Tier up only when results justify the cost

Document quality is usually skewed: most incoming documents are clean and only a small share are messy (a 90/10 split, for example). Running every document on the most expensive engine wastes money on the easy cases. A more economical pattern:
  • Default to Standard parse for everything.
  • Add a validation step downstream that checks for missing required fields or malformed values.
  • On validation failure, route the document through a review node where a human can re-run with Vision or correct the data.
This keeps the average cost close to 1 credit per page, because only the documents that fail validation incur a second, more expensive pass, while still recovering the hard cases.
The Precision dial (Small, Medium, High) only affects the KeyValuePair engine, where it selects the LLM tier used to read fields and so trades cost for accuracy. Standard and Vision are flat-priced.
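The arithmetic behind the tiered pattern can be sketched as follows. It assumes that a document failing validation is re-run on Vision and that both passes are billed, which is an interpretation of the pattern, not a stated billing rule. `missing_fields` and `expected_credits_per_page` are hypothetical helpers, and the credit values are the ones quoted in this guide.

```python
def missing_fields(record: dict, required: list[str]) -> list[str]:
    """Return required field names that are absent or empty in an extracted record."""
    return [name for name in required if record.get(name) in (None, "")]


def expected_credits_per_page(failure_rate: float,
                              standard: float = 1.0,
                              vision: float = 2.0) -> float:
    """Average cost when every page runs on Standard and failures re-run on Vision.

    Assumes both the Standard pass and the Vision re-run are billed.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    return standard + failure_rate * vision
```

With a 10% failure rate the average comes to about 1.2 credits per page, compared with 2 credits per page if everything ran on Vision. A record such as `{"vendor": "Globex", "total": ""}` checked against `["vendor", "total", "date"]` reports `["total", "date"]` as missing, which would send the document to review.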

4. Use parse to enable content-based routing

Parse becomes a clear win when the next step is a route or classify node deciding what to do with the document. Examples:
  • Drop documents that don’t contain a vendor name on page 1.
  • Split mailroom batches: route invoices to the AP pipe, contracts to legal, everything else to a generic intake.
  • Skip extract entirely for documents that look like cover pages or duplicates.
Without parse, route and classify can only branch on metadata (file name, sender, size). Parse gives them the document’s actual content.
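The mailroom example might look like the sketch below if the routing logic were written out by hand. The destination names and keyword rules are made up for illustration. A real route node is configured in the product, and its filter syntax is not shown here.

```python
import re


def route(page1_text: str) -> str:
    """Pick a destination from the parsed text of page 1 (hypothetical rules)."""
    text = page1_text.lower()
    if not text.strip():
        return "drop"      # nothing usable was parsed
    if re.search(r"\binvoice\b", text):
        return "ap"        # accounts payable pipe
    if re.search(r"\b(agreement|contract)\b", text):
        return "legal"
    return "intake"        # generic fallback
```

The order of the checks matters: a contract that mentions an invoice would go to AP under these rules, so real filters usually need tighter patterns than single keywords.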

5. Keep an eye on whitespace and layout artifacts downstream

OCR preserves visual layout, including page headers, footers, and watermarks. If a downstream node uses that text for matching, those artifacts can throw off:
  • Regex matches in route filters (a header copyright line catching every page).
  • Embedding similarity in AI classify (boilerplate dominating the signal).
  • Field descriptions that try to “find a value near the heading X”.
When you see noisy results, inspect the parsed text in the run detail and tighten downstream rules accordingly. There is no built-in header/footer stripper; rely on specific filters or anchor your descriptions to less-repetitive cues.
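The regex case is easy to reproduce on a small sample. In this sketch the page text is invented, and Python's `re` module stands in for whatever pattern syntax the route filter accepts, which is an assumption. A loose pattern matches the repeated footer on every page, while a pattern anchored to a label line matches only where the value actually appears.

```python
import re

# Invented sample: the first line of each page is a repeated header.
pages = [
    "Acme Corp (c) 2024 Confidential\nVendor: Globex Ltd\nTotal: 120.00",
    "Acme Corp (c) 2024 Confidential\nLine items continued",
]

# Loose: hits boilerplate that appears on every page.
LOOSE = re.compile(r"confidential", re.IGNORECASE)

# Anchored: requires the label at the start of a line.
ANCHORED = re.compile(r"^Vendor:\s*(\S.*)$", re.MULTILINE)

loose_hits = [n for n, text in enumerate(pages, start=1) if LOOSE.search(text)]
anchored_hits = [n for n, text in enumerate(pages, start=1) if ANCHORED.search(text)]
vendor = ANCHORED.search(pages[0]).group(1)
```

Here `loose_hits` is `[1, 2]` and `anchored_hits` is `[1]`, with `vendor` equal to `"Globex Ltd"`. The same idea applies to field descriptions: anchoring to a label that occurs once is more reliable than anchoring to text that repeats on every page.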

Common pitfalls

Adding parse upstream of extract runs OCR twice and bills both. Only chain them when you need parse’s text output for a separate purpose (routing, archiving, multiple schemas off one OCR pass). For a single-shot extract, drop parse.
Vision is for handwriting, photos, and ambiguous layouts. Born-digital PDFs with a real text layer parse correctly on Standard at half the cost. Run a small batch on Standard before committing to Vision globally.
KeyValuePair expects label/value structure (forms, intake sheets). Letters, contracts, and reports have no consistent labels and produce a sparse or empty result. Use Standard or Vision and let extract pull the fields you need.
Switching from Standard to Vision improves OCR accuracy but cannot fix a schema with vague descriptions or missing enums. If extract results are wrong on clean documents, the schema is the bottleneck, not parse. See Extract best practices for schema design.

Related pages

  • Parse action: configuration reference for the parse node.
  • Extract best practices: how to design schemas that produce reliable results.
  • Route action: use parsed text to filter or branch.
  • Credits: per-engine credit costs.