OCR Agent (Computer Vision)

Extracts text from an image (scanned document, screenshot, or photographed sign) using Tesseract.js, returning what it recognizes as plain, copy-pasteable strings.

Pack G: Media + Visual • ID DOC-056 • Spec-only
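
As a rough illustration of the image-to-text step described above, the sketch below runs OCR through an injected worker and tidies the result. It assumes the worker exposes a `recognize()` method that resolves to `{ data: { text } }`, which is how recent Tesseract.js releases behave; `OcrWorker`, `normalizeOcrText`, `extractText`, and `stubWorker` are hypothetical names introduced here, not part of this agent's spec. The stub stands in for the real engine, in line with demo mode.

```typescript
// Minimal subset of a Tesseract.js worker that this sketch relies on.
// Assumption: recognize() resolves to { data: { text } }, as in recent Tesseract.js releases.
interface OcrWorker {
  recognize(image: string): Promise<{ data: { text: string } }>;
}

// Hypothetical helper: collapse runs of spaces/tabs, trim each line, drop empty lines.
function normalizeOcrText(raw: string): string {
  return raw
    .split(/\r?\n/)
    .map((line) => line.replace(/[ \t]+/g, " ").trim())
    .filter((line) => line.length > 0)
    .join("\n");
}

// Hypothetical helper: run OCR through whichever worker is supplied.
async function extractText(worker: OcrWorker, image: string): Promise<string> {
  const result = await worker.recognize(image);
  return normalizeOcrText(result.data.text);
}

// Stub worker for demo mode: returns canned text and never reads the image.
const stubWorker: OcrWorker = {
  recognize: async () => ({ data: { text: "  TOTAL   12.50 \n\n Thank you \n" } }),
};

// "receipt.png" is a placeholder path; the stub ignores it.
extractText(stubWorker, "receipt.png").then((text) => console.log(text));
```

To try this against the real engine, a worker created with Tesseract.js's `createWorker("eng")` could be passed in place of `stubWorker`, with `worker.terminate()` called once recognition is done. Injecting the worker keeps the sketch runnable without the library installed.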

Required permissions

None

Permissions are shown for transparency. In demo mode these agents are stubbed, and no real privileged actions run unless you wire them into your local runner.

Outputs

  • receipts.json
  • artifacts/

Receipt shape

This agent is designed to emit a run receipt with steps + artifacts. See Runs & Receipts for examples.
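
For orientation only, here is one way a receipt with steps and artifacts might be modeled. The field names (`agent`, `docId`, `steps`, `artifacts`), the step names, and the `buildReceipt` helper are all assumptions made for this sketch; the actual schema is whatever Runs & Receipts documents.

```typescript
// Hypothetical receipt shape: field names are illustrative, not the documented schema.
interface ReceiptStep {
  name: string;
  status: "ok" | "failed" | "skipped";
}

interface RunReceipt {
  agent: string;
  docId: string;
  steps: ReceiptStep[];
  artifacts: string[]; // paths assumed to live under artifacts/
}

// Hypothetical helper: assemble a receipt for one OCR run.
function buildReceipt(steps: ReceiptStep[], artifacts: string[]): RunReceipt {
  return { agent: "ocr-agent", docId: "DOC-056", steps, artifacts };
}

const receipt = buildReceipt(
  [
    { name: "load-image", status: "ok" },
    { name: "recognize", status: "ok" },
  ],
  ["artifacts/ocr-text.txt"],
);

// Serialized form, e.g. what could be written to receipts.json.
const receiptJson = JSON.stringify(receipt, null, 2);
console.log(receiptJson);
```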