
Nu Plugin Topology
by danielbodnar
High-performance content topology and deduplication engine for Nushell and CLI.
What it does
Nu Plugin Topology provides advanced data profiling, SimHash fingerprinting, and stratified sampling to organize large collections of text-based data. It is primarily used for deduplicating bookmarks, GitHub stars, and files by identifying content similarity regardless of word order.
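To illustrate why SimHash is insensitive to word order, here is a minimal sketch of a 64-bit SimHash in Rust. It is not the plugin's actual implementation: the whitespace tokenizer, the lowercasing step, the use of the standard library's `DefaultHasher`, and the function names `hash_token`, `simhash`, and `hamming_distance` are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash one token to 64 bits.
/// Assumption: DefaultHasher stands in for whatever hash the plugin uses.
fn hash_token(token: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    token.hash(&mut hasher);
    hasher.finish()
}

/// Compute a 64-bit SimHash over lowercased, whitespace-separated tokens.
fn simhash(text: &str) -> u64 {
    // One signed counter per bit position.
    let mut weights = [0i32; 64];
    for token in text.split_whitespace() {
        let h = hash_token(&token.to_lowercase());
        for bit in 0..64 {
            // Each token votes +1 or -1 on every bit of the fingerprint.
            if (h >> bit) & 1 == 1 {
                weights[bit] += 1;
            } else {
                weights[bit] -= 1;
            }
        }
    }
    // A bit is set in the fingerprint when its votes sum to a positive value.
    let mut fingerprint = 0u64;
    for bit in 0..64 {
        if weights[bit] > 0 {
            fingerprint |= 1u64 << bit;
        }
    }
    fingerprint
}

/// Number of differing bits between two fingerprints.
fn hamming_distance(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}
```

Because the per-bit votes are summed, reordering the tokens cannot change the result, so `simhash("rust plugin for nushell")` equals `simhash("nushell for plugin rust")`. Near-duplicates are typically found by comparing fingerprints with `hamming_distance` against a small threshold; the threshold value is a tuning choice and is not specified here. Note that `DefaultHasher` output is not guaranteed to be stable across Rust releases, so a real tool that stores fingerprints would likely pin a specific hash function.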
Tools
- Fingerprint: Computes 64-bit SimHash fingerprints for JSON records to detect duplicates.
- Sample: Extracts representative subsets of data using random, stratified, systematic, or reservoir sampling.
- Analyze: Generates field-level statistics including cardinality and type distribution.
- Similarity: Measures string distance using Levenshtein, Jaro-Winkler, or Cosine metrics.
- Normalize URL: Cleans URLs by stripping tracking parameters and fragments so that equivalent links deduplicate cleanly.
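The URL normalization step listed above can be sketched as follows. This is a hedged illustration using only the Rust standard library, not the plugin's own code: the function names `is_tracking_param` and `normalize_url` are hypothetical, and the set of tracking parameters (`utm_*`, `fbclid`, `gclid`) is an assumed example list.

```rust
/// Assumed example list of tracking parameters; the plugin's real list may differ.
fn is_tracking_param(key: &str) -> bool {
    key.starts_with("utm_") || matches!(key, "fbclid" | "gclid")
}

/// Strip the fragment and any tracking query parameters from a URL string.
fn normalize_url(url: &str) -> String {
    // Everything after the first '#' is the fragment; drop it.
    let without_fragment = url.split('#').next().unwrap_or(url);

    // Separate the base from the query string, if there is one.
    let (base, query) = match without_fragment.split_once('?') {
        Some((base, query)) => (base, query),
        None => return without_fragment.to_string(),
    };

    // Keep only the key=value pairs whose key is not a tracking parameter.
    let kept: Vec<&str> = query
        .split('&')
        .filter(|pair| !pair.is_empty())
        .filter(|pair| {
            let key = pair.split('=').next().unwrap_or("");
            !is_tracking_param(key)
        })
        .collect();

    if kept.is_empty() {
        base.to_string()
    } else {
        format!("{}?{}", base, kept.join("&"))
    }
}
```

With this sketch, `normalize_url("https://example.com/a?utm_source=x&id=7#top")` yields `https://example.com/a?id=7`, so two bookmarks that differ only in campaign tags or anchors collapse to the same key. It deliberately skips concerns a production normalizer would handle, such as host case folding, percent-encoding, and parameter ordering.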
Installation
Build the binary, add it to your PATH, then register it in your MCP client configuration:
{
  "mcpServers": {
    "topology": {
      "command": "topology",
      "args": []
    }
  }
}
Supported hosts
- Nushell
- CLI
Quick install
cargo build --release --features plugin,cli
Information
- Pricing: free
- Published: 5/10/2026