Invoice Entity Bounding Box Mapping (Duplicate Handling)

Name: Invoice Entity Bounding Box Mapping (Duplicate Handling)
Rating: 74 (1 review)
Author: ecnu-icalk

Trust Score 74/100

Patches OCR invoice entity-to-bounding-box mapping so that duplicate values do not share a box, amounts sections are searched in reverse order, and assigned coordinates stay unique.

triggers: duplicate bounding box, invoice ocr, amounts_and_tax reverse, entity mapping, bounding box uniqueness

What it does

Adds concrete logic to OCR invoice entity mapping workflows to handle duplicate entity values safely: when the same entity value appears multiple times, the algorithm assigns distinct bounding boxes (no reuse), uses memoization to track taken boxes, and falls back to the next-best match when overlaps occur. For amounts_and_tax sections it reverses the search order (bottom-up) to better match invoice layouts. Multi-token entities get sequence-aware matching and overlap checks so tokens don't claim the same coordinates.
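
The duplicate-handling and reverse-search rules described above can be sketched roughly as follows. This is a minimal illustration, not the skill's own code (the skill bundles none): it assumes OCR output is a list of dicts in top-to-bottom reading order with hypothetical `text` and `box` keys, uses exact string matching in place of whatever scoring a real pipeline applies, and the function name `assign_boxes` is invented for the example.

```python
from typing import Dict, List, Optional, Set, Tuple

# Assumed box layout: (left, top, right, bottom) in pixels.
Box = Tuple[int, int, int, int]


def assign_boxes(
    entities: List[Tuple[str, str]],
    ocr_rows: List[Dict],
    section: str,
) -> Dict[str, Optional[Box]]:
    """Give each (name, value) entity its own OCR row; never reuse a row."""
    order = list(range(len(ocr_rows)))
    if section == "amounts_and_tax":
        order.reverse()  # search bottom-up for amounts sections

    taken: Set[int] = set()  # indices of rows that are already claimed
    result: Dict[str, Optional[Box]] = {}
    for name, value in entities:
        # First unclaimed row with matching text. Skipping claimed rows is
        # the "fall back to the next match" step for duplicate values.
        match = next(
            (i for i in order if i not in taken and ocr_rows[i]["text"] == value),
            None,
        )
        if match is not None:
            taken.add(match)
        result[name] = ocr_rows[match]["box"] if match is not None else None
    return result


# Hypothetical OCR rows: the value "100.00" appears twice.
rows = [
    {"text": "100.00", "box": (200, 10, 260, 30)},  # subtotal line
    {"text": "0.00", "box": (200, 40, 260, 60)},    # tax line
    {"text": "100.00", "box": (200, 70, 260, 90)},  # total line
]
entities = [("total", "100.00"), ("subtotal", "100.00"), ("tax", "0.00")]
mapping = assign_boxes(entities, rows, "amounts_and_tax")
```

With the reversed order, `total` claims the bottom-most `100.00` and `subtotal` falls back to the earlier one, so the two entities never share coordinates. An entity with no unclaimed match maps to `None` rather than reusing a box.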

When to use it

Use when extracting structured fields from scanned invoices or receipts where the same textual value can appear multiple times (e.g., repeated amounts, line-item names). It's useful during OCR post-processing to increase mapping accuracy and avoid misattributing coordinates.

What's included

Scripts: none detected (has_scripts=false).
References: none bundled (has_references=false).
Instructions: clear operational rules covering duplicate handling via memoization/dynamic programming, reversing the dataframe for amounts sections, multi-token sequence matching, coordinate uniqueness constraints, and test/validation guidance.
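
As a rough illustration of the multi-token sequence matching and coordinate uniqueness rules listed above, the sketch below matches an entity's tokens against a consecutive run of unclaimed OCR rows and then checks that the merged boxes do not overlap. The row format (`text` and `box` keys), the `(left, top, right, bottom)` box layout, and all function names are assumptions made for the example; the skill itself ships no code.

```python
from typing import Dict, List, Optional, Set, Tuple

# Assumed box layout: (left, top, right, bottom) in pixels.
Box = Tuple[int, int, int, int]


def boxes_overlap(a: Box, b: Box) -> bool:
    """True when two axis-aligned boxes share any interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def match_token_run(
    tokens: List[str], ocr_rows: List[Dict], taken: Set[int]
) -> Optional[List[int]]:
    """Find the first run of consecutive, unclaimed rows equal to tokens."""
    n = len(tokens)
    if n == 0:
        return None
    for start in range(len(ocr_rows) - n + 1):
        idxs = list(range(start, start + n))
        if any(i in taken for i in idxs):
            continue  # part of this run already belongs to another entity
        if all(ocr_rows[i]["text"] == t for i, t in zip(idxs, tokens)):
            taken.update(idxs)  # claim every token of the run at once
            return idxs
    return None


def union_box(boxes: List[Box]) -> Box:
    """Smallest box that encloses all the given boxes."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def has_overlap(boxes: List[Box]) -> bool:
    """Validation helper: does any pair of assigned boxes overlap?"""
    return any(
        boxes_overlap(a, b) for i, a in enumerate(boxes) for b in boxes[i + 1:]
    )


# Hypothetical rows: the two-token name "ACME Corp" appears twice.
rows = [
    {"text": "ACME", "box": (10, 10, 50, 30)},
    {"text": "Corp", "box": (55, 10, 95, 30)},
    {"text": "Invoice", "box": (10, 40, 80, 60)},
    {"text": "ACME", "box": (10, 100, 50, 120)},
    {"text": "Corp", "box": (55, 100, 95, 120)},
]
taken: Set[int] = set()
seller = match_token_run(["ACME", "Corp"], rows, taken)
buyer = match_token_run(["ACME", "Corp"], rows, taken)
seller_box = union_box([rows[i]["box"] for i in seller])
buyer_box = union_box([rows[i]["box"] for i in buyer])
```

Because the first match claims rows 0 and 1, the second lookup for the same tokens lands on rows 3 and 4. A check such as `has_overlap([seller_box, buyer_box])` could serve as a simple validation step after mapping.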

Compatible agents

Relevant to Python-capable coding assistants (Codex, Copilot, GPT-style code assistants) and OCR pipelines that run post-processing scripts. Recommended for teams working with Tesseract/ocr-dataframes or CV-assisted extraction pipelines.

Audit Summary

A prompt-only skill that instructs an LLM to modify OCR invoice entity-to-bounding-box mapping code for duplicate handling. No scripts bundled — purely a structured prompt template with operational rules for dynamic programming, reverse dataframe search, and coordinate uniqueness. Well-defined constraints but no runnable code, examples, or error handling guidance.

Watch Out

No runnable scripts — entirely a prompt template
Requires existing OCR/invoice processing codebase to modify
Narrow use case limited to invoice bounding-box duplicate scenarios

Notes

Prompt-only skill from the AutoSkill research project (ecnu-icalk/autoskill). Clean from a security perspective as there are no scripts or executable code. Limited practical value as a standalone skill since it only provides instructions for modifying code that must already exist elsewhere.

Information

Repository: autoskill
Stars: 458

Trust Score

Overall: 74
Security: 100
Code Quality: 55
Architecture: 40
Usefulness: 28
