What This Is
PatMatch is a specialized parts matching system built in 2019 for Kaman Industrial. It helps sales reps match customer part descriptions ("SKF sealed bearing 6203") to catalog items (SKF 6203-2RS1).
The current system is a C# Excel add-in with 50+ hand-written regex rules. It matches about 85% of queries correctly. The remaining 15% require manual database lookups, and that is expensive.
Context: This is an internal modernization project, not a customer-facing product.
Data: 50K+ parts catalog, 18K brand relationships, years of user corrections stored.
Users: Kaman Industrial sales reps processing RFQs (requests for quote).
The Problem: Brittle Pattern Matching
The current regex-based system fails on variations that humans handle easily. Here are real examples from the code:
What this costs:
• Sales reps search the database manually for failed lookups
• ~50 minutes per day per sales rep wasted on edge cases
• $47,600/year in lost productivity (measured 2019-2026)
Why It Breaks
The 2019 system uses hand-written regex patterns stored in a database. Each brand has custom rules:
The problem: Every new part format requires a new regex rule. After 6 years in production, we have 50+ brand-specific patterns, and they still miss edge cases:
- Typos break exact string matching
- Word order must be exact ("DODGE bearing" ≠ "bearing DODGE")
- Decimal handling is fragile ("33.5" sometimes becomes "335" instead of "3350")
- No fuzzy matching or confidence scores
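The failure modes above can be reproduced with a minimal sketch. The regex below is a hypothetical stand-in for one brand rule, not a rule taken from the production database, and the fuzzy scorer uses Python's standard `difflib` purely for illustration (the proposal itself targets embeddings, not `difflib`).

```python
import re
from difflib import SequenceMatcher

# Hypothetical strict rule: the brand must come first, followed by "bearing".
STRICT_RULE = re.compile(r"^DODGE\s+bearing\b", re.IGNORECASE)


def strict_match(query: str) -> bool:
    """Mimic an order-sensitive, typo-intolerant regex rule."""
    return STRICT_RULE.search(query) is not None


def token_similarity(query: str, target: str) -> float:
    """Order-insensitive similarity in [0, 1].

    Tokens are lowercased and sorted before a character-level comparison,
    so word order no longer matters and small typos only lower the score.
    """
    def normalize(text: str) -> str:
        return " ".join(sorted(text.lower().split()))

    return SequenceMatcher(None, normalize(query), normalize(target)).ratio()


if __name__ == "__main__":
    for q in ["DODGE bearing", "bearing DODGE", "DOGDE bearing"]:
        score = token_similarity(q, "DODGE bearing")
        print(f"{q!r}: strict={strict_match(q)} fuzzy={score:.2f}")
```

The strict rule accepts only the first query, while the fuzzy score stays high for the reordered and misspelled variants and doubles as a rough confidence value.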
The AI Solution
Use AI to understand part descriptions semantically:
• LLMs (GPT-4o-mini/Claude Haiku) extract brand + part number from unstructured text
• Vector embeddings handle typos, word order, abbreviations automatically
• Target: 95% accuracy (vs 85% today), with confidence scores
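To illustrate how vector similarity tolerates typos and word order, here is a small sketch. It does not call an embedding API: character trigram counts stand in for learned embeddings so the example runs offline, and that substitution is an assumption made for illustration only. The second catalog entry is made up.

```python
import math
from collections import Counter


def trigram_vector(text: str) -> Counter:
    """Bag of character trigrams per token (crude stand-in for an embedding)."""
    grams = Counter()
    for token in text.lower().split():
        padded = f" {token} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def best_match(query: str, catalog: dict) -> tuple:
    """Return (part_number, confidence) for the closest catalog description."""
    qv = trigram_vector(query)
    scored = [(cosine(qv, trigram_vector(desc)), part) for part, desc in catalog.items()]
    score, part = max(scored)
    return part, score


# Toy catalog: the first entry comes from the text, the second is hypothetical.
CATALOG = {
    "SKF 6203-2RS1": "SKF sealed bearing 6203",
    "ACME 1234-X": "ACME pillow block 1234",
}

if __name__ == "__main__":
    for q in ["bearing sealed SKF 6203", "SKF seeled bearing 6203"]:
        print(q, "->", best_match(q, CATALOG))
```

A production version would presumably swap `trigram_vector` for a call to the chosen embedding model and `CATALOG` for the vector database, while keeping the same "nearest vector plus score" shape.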
What changes: The matching engine (regex → AI embeddings)
What stays: Excel interface, database structure, user workflow
Key improvement: Learns from corrections automatically. When sales reps fix a wrong match, the system updates — no manual regex editing.
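One possible shape for this feedback loop is sketched below. The `CorrectionStore` class and `match_with_corrections` function are hypothetical names, and an in-memory dictionary stands in for wherever corrections are actually persisted (the current system keeps them in PartHistory).

```python
from typing import Callable, Dict, Optional, Tuple


class CorrectionStore:
    """Remembers rep-confirmed matches, keyed by a normalized query string."""

    def __init__(self) -> None:
        self._overrides: Dict[str, str] = {}

    @staticmethod
    def _key(query: str) -> str:
        # Collapse case and whitespace so trivial variations share one entry.
        return " ".join(query.lower().split())

    def record(self, query: str, correct_part: str) -> None:
        """Store the part a sales rep chose after fixing a wrong match."""
        self._overrides[self._key(query)] = correct_part

    def lookup(self, query: str) -> Optional[str]:
        return self._overrides.get(self._key(query))


def match_with_corrections(
    query: str, store: CorrectionStore, engine: Callable[[str], Optional[str]]
) -> Tuple[Optional[str], str]:
    """Prefer a stored correction; otherwise fall back to the matching engine."""
    corrected = store.lookup(query)
    if corrected is not None:
        return corrected, "user_correction"
    return engine(query), "engine"
```

Usage: after a rep fixes a match, call `store.record(query, part)`; the next identical query returns the corrected part without any rule editing. A fuller design might also feed corrections back into the embedding index, which this sketch does not attempt.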
The Business Case
Current accuracy: ~85% match rate (15% require manual work)
AI target: 95% match rate (5% manual work)
Time savings: 50 min/day → 5 min/day per sales rep
The test: Run the AI matcher on 500 gold-standard test cases (real customer queries) and measure whether it beats the current system on edge cases (typos, word order, complex descriptions).
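A sketch of how the edge-case comparison could be scored is shown below. The case format `(query, expected_part, category)` and the category labels are assumptions for illustration; the actual gold-standard layout is defined in the evaluation framework document.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Optional, Tuple

Case = Tuple[str, str, str]  # (query, expected_part, category), assumed layout


def accuracy_by_category(
    matcher: Callable[[str], Optional[str]], cases: Iterable[Case]
) -> Dict[str, float]:
    """Fraction of cases matched correctly, broken down by edge-case category."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for query, expected, category in cases:
        totals[category] += 1
        if matcher(query) == expected:
            hits[category] += 1
    return {category: hits[category] / totals[category] for category in totals}


def compare(current, candidate, cases) -> Dict[str, Tuple[float, float]]:
    """Per-category (current, candidate) accuracy pairs for a side-by-side view."""
    cases = list(cases)
    cur = accuracy_by_category(current, cases)
    cand = accuracy_by_category(candidate, cases)
    return {category: (cur[category], cand[category]) for category in cur}
```

Reporting per category rather than a single overall number keeps the focus on the typo, word-order, and complex-description cases where the regex rules are known to struggle.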
Technical Architecture
Current System (2019):
- C# Excel add-in (NetOffice)
- SQL Server with 6-level waterfall matching
- 50+ brand-specific regex patterns
- BlobHistory cache, PartHistory user corrections
- 50K+ parts catalog, 18K brand relationships
Proposed AI System (2026):
- Python + FastHTML web backend
- OpenAI embeddings + GPT-4o-mini for parsing
- Vector database (Pinecone/Weaviate) for fuzzy matching
- LangGraph agents for multi-step reasoning
- REST API + Excel add-in (keep familiar UI)
- PostgreSQL + vector store
What we preserve: All domain knowledge (measurement conversions, brand hierarchies, user corrections, interchange groups). The waterfall matching logic stays — we just make each level smarter.
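The waterfall idea can be sketched as an ordered list of matchers that is tried until one returns a result. The level names below are hypothetical placeholders; the real six levels are documented in README.md and are not reproduced here.

```python
from typing import Callable, List, Optional, Tuple

Matcher = Callable[[str], Optional[str]]


def waterfall_match(
    query: str, levels: List[Tuple[str, Matcher]]
) -> Tuple[Optional[str], Optional[str]]:
    """Try each level in order; return (part, level_name) for the first hit."""
    for name, matcher in levels:
        result = matcher(query)
        if result is not None:
            return result, name
    return None, None  # every level missed: route to manual lookup


# Hypothetical levels, for illustration only.
EXACT = {"skf 6203-2rs1": "SKF 6203-2RS1"}
LEVELS: List[Tuple[str, Matcher]] = [
    ("exact", lambda q: EXACT.get(q.lower())),
    ("contains_6203", lambda q: "SKF 6203-2RS1" if "6203" in q else None),
]

if __name__ == "__main__":
    print(waterfall_match("SKF 6203-2RS1", LEVELS))
    print(waterfall_match("sealed bearing 6203", LEVELS))
    print(waterfall_match("unknown part", LEVELS))
```

Under this structure, "making each level smarter" means replacing an individual matcher (for example, swapping a regex level for an embedding lookup) without changing the fall-through order.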
De-Risking Strategy
Phase 1: Validation Spike (2 weeks, $1,900)
- Build gold-standard test dataset (500 expert-labeled cases)
- Baseline current system accuracy on edge cases
- Test AI on same dataset
- Side-by-side comparison: precision, recall, latency
- Go/no-go decision: Does AI beat the current system? If yes → Phase 2. If no → stop.
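The Phase 1 comparison could be computed along these lines. The metric definitions (precision over returned matches, recall over all cases) and the `go_no_go` rule are assumptions chosen for illustration; the actual success criteria live in the evaluation framework document.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


def evaluate(
    matcher: Callable[[str], Optional[str]], cases: List[Tuple[str, str]]
) -> Dict[str, float]:
    """Precision, recall, and average latency over (query, expected_part) cases."""
    returned = correct = 0
    start = time.perf_counter()
    for query, expected in cases:
        predicted = matcher(query)
        if predicted is not None:
            returned += 1
            if predicted == expected:
                correct += 1
    elapsed = time.perf_counter() - start
    n = len(cases)
    return {
        "precision": correct / returned if returned else 0.0,
        "recall": correct / n if n else 0.0,
        "avg_latency_s": elapsed / n if n else 0.0,
    }


def go_no_go(current: Dict[str, float], candidate: Dict[str, float]) -> bool:
    """Assumed rule: go only if recall improves and precision does not drop."""
    return (
        candidate["recall"] > current["recall"]
        and candidate["precision"] >= current["precision"]
    )
```

A latency ceiling could be added to `go_no_go` once an acceptable response time for the Excel workflow is agreed; none is stated here, so the sketch leaves it out.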
Phase 2: Full Build (10 weeks, $14,000)
- Only execute if Phase 1 proves AI superiority
- Build production system with web UI + API
- Shadow deployment (run both systems in parallel)
- Gradual rollout with user feedback loop
- A/B testing before full cutover
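Shadow deployment can be sketched as a wrapper that always serves the current system's answer while recording where the candidate disagrees. The function name and log format are hypothetical, and a plain list stands in for whatever logging store is used.

```python
from typing import Callable, Dict, List, Optional

Matcher = Callable[[str], Optional[str]]


def shadow_match(
    query: str, current: Matcher, candidate: Matcher, disagreements: List[Dict]
) -> Optional[str]:
    """Serve the current system's result; log candidate differences on the side."""
    served = current(query)
    try:
        shadow = candidate(query)
    except Exception as exc:  # a candidate failure must never affect users
        disagreements.append(
            {"query": query, "current": served, "candidate": None, "error": repr(exc)}
        )
        return served
    if shadow != served:
        disagreements.append(
            {"query": query, "current": served, "candidate": shadow, "error": None}
        )
    return served
```

Reviewing the logged disagreements with sales reps gives labeled evidence for the gradual rollout without exposing users to candidate errors.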
Total risk exposure: $1,900 to validate. Only commit $14K after proof.
Decision Point
Approve $1,900 for 2-week validation spike
We'll test AI on real Kaman parts data and show you whether it actually solves the edge-case problem. If it does, we build it. If it doesn't, we know definitively, and we only spent $1,900 to find out.
Supporting Documentation
Analysis completed January 2026 (README plus 3 technical documents, 150KB+):
- README.md - Current system architecture, 6-level waterfall, glossary of bearing terms
- PATMATCH_MODERNIZATION.md (64KB) - AI redesign: vector search, LLM parsing, agentic matching
- PATMATCH_EVALUATION_FRAMEWORK.md (68KB) - Testing methodology, 500-case gold standard, success criteria
- PATMATCH_BEFORE_AFTER.md (22KB) - Transformation analysis, failure modes, expected improvements
Test data available: PM Samples for Scott.xlsx, QA Items.xlsx, 50K parts catalog (ItemMaster.tsv), historical queries (PartHistory/BlobHistory tables)