
from opendatalab
Run and validate OmniDocBench document-parsing evaluations with Docker or conda workflows, then parse the results.
- Run `python pdf_validation.py --config ...` inside the recommended Docker image `ghcr.io/zeng-weijun/omnidocbench-eval:repro-ubuntu2204`.
- Parse and summarize `*_metric_result.json`, `*_run_summary.json`, `*_stage_execution.json`, and `*_runtime_environment.json`.
- Troubleshooting: Docker access, CDM OOMs, ImageMagick policy issues, GT capitalization, and prediction folder nesting.

## Recommended prompts / usage
- "How do I run OmniDocBench end2end evaluation on my GT JSON and markdown predictions?"
- "Help me parse an OmniDocBench result directory and extract overall/text/formula/table scores."
- "I get CDM OOMs. How should I set cdm_workers on a 4-CPU/8 GB node?"

## Installation / scripts
The skill bundles two scripts: `scripts/generate_end2end_config.py` for config generation and `scripts/parse_results.py` for result parsing. Follow the Docker workflow in the skill to avoid local dependency issues.

This skill has not been reviewed by our automated audit pipeline yet.
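
As a rough illustration of the result-parsing step, the sketch below walks a result directory, groups files by the four suffixes listed above, and prints each file's top-level JSON keys. It is not the bundled `scripts/parse_results.py`: the names `collect_result_files` and `top_level_keys` are hypothetical, and nothing is assumed about the schema inside the files.

```python
import json
from pathlib import Path

# Suffixes taken from the skill description; the JSON schema inside each
# file is deliberately not assumed here.
RESULT_SUFFIXES = (
    "_metric_result.json",
    "_run_summary.json",
    "_stage_execution.json",
    "_runtime_environment.json",
)


def collect_result_files(result_dir):
    """Group JSON files under result_dir (recursively) by known suffix."""
    found = {suffix: [] for suffix in RESULT_SUFFIXES}
    for path in sorted(Path(result_dir).rglob("*.json")):
        for suffix in RESULT_SUFFIXES:
            if path.name.endswith(suffix):
                found[suffix].append(path)
                break
    return found


def top_level_keys(path):
    """Return the sorted top-level keys of a JSON object file, or [] otherwise."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    return sorted(data) if isinstance(data, dict) else []


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python inspect_results.py path/to/result_dir
    result_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    for suffix, paths in collect_result_files(result_dir).items():
        for path in paths:
            print(f"{path}: {top_level_keys(path)}")
```

Listing the keys first is a cautious way to see what a given run actually produced before extracting overall, text, formula, or table scores from it.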