Same card, five different IDs. I built a resolver.
MTG pricing data is fragmented across marketplaces with no shared identity. This platform unifies it, runs ML forecasts, and surfaces trading signals with confidence scores.
mtg.dataviking.tech
The Interesting Problem
Identity resolution across marketplaces that don't agree on anything
Magic: The Gathering has 25,000+ unique cards, many with multiple printings across 100+ sets. TCGPlayer, CardMarket, and CardKingdom each assign their own IDs, use different naming conventions, and categorize conditions differently. "Lightning Bolt (Alpha, NM)" on one platform might be "Lightning Bolt - LEA (Near Mint)" on another and "Lightning Bolt [Limited Edition Alpha]" on a third.
The identity resolution problem is the foundation everything else depends on. You can't compare prices across marketplaces if you can't reliably determine you're looking at the same card. Fuzzy string matching gets you 80% of the way; the last 20% requires metadata alignment — set codes, collector numbers, foil flags, and condition grade mapping.
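As a rough illustration of the two stages described above, here is a minimal Python sketch: fuzzy name matching first, then a metadata check on set codes. The alias tables, the `Listing` type, the parsing rules, and the 0.9 threshold are all assumptions made for this example and not the platform's actual resolver, which would also need collector numbers and foil flags.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical alias tables; a real resolver would load these from reference data.
SET_ALIASES = {"alpha": "LEA", "lea": "LEA", "limited edition alpha": "LEA"}
CONDITION_ALIASES = {"nm": "NM", "near mint": "NM", "lp": "LP", "lightly played": "LP"}

BRACKETED = r"[(\[]([^)\]]*)[)\]]"  # text inside (...) or [...]


@dataclass(frozen=True)
class Listing:
    name: str
    set_code: Optional[str]
    condition: Optional[str]


def parse_listing(raw: str) -> Listing:
    """Split a raw marketplace title into a card name plus set and condition metadata."""
    tokens = []
    # Bracketed groups may hold several comma-separated fields, e.g. "(Alpha, NM)".
    for group in re.findall(BRACKETED, raw):
        tokens.extend(part.strip().lower() for part in group.split(","))
    # Remove the bracketed groups, then treat " - SUFFIX" as one more metadata token.
    bare = re.sub(BRACKETED, "", raw).strip()
    name, *suffixes = [part.strip() for part in bare.split(" - ")]
    tokens.extend(s.lower() for s in suffixes)
    set_code = next((SET_ALIASES[t] for t in tokens if t in SET_ALIASES), None)
    condition = next((CONDITION_ALIASES[t] for t in tokens if t in CONDITION_ALIASES), None)
    return Listing(name=name, set_code=set_code, condition=condition)


def same_card(a: Listing, b: Listing, threshold: float = 0.9) -> bool:
    """Fuzzy-match the names, then reject pairs whose known set codes disagree."""
    score = SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()
    if score < threshold:
        return False
    if a.set_code and b.set_code and a.set_code != b.set_code:
        return False
    return True


titles = [
    "Lightning Bolt (Alpha, NM)",
    "Lightning Bolt - LEA (Near Mint)",
    "Lightning Bolt [Limited Edition Alpha]",
]
listings = [parse_listing(t) for t in titles]
print(all(same_card(listings[0], other) for other in listings[1:]))
```

Condition is parsed but deliberately left out of `same_card`: in this sketch it describes a listing, not the card's identity.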
Once you have unified identity, the ML layer becomes tractable. Each card has a clean time series across all marketplaces, which feeds into ensemble forecasting: EWMA captures momentum, linear regression captures trends, mean reversion captures overreaction. The ensemble combines them with confidence-weighted voting, and every prediction is explainable — you can see which model drove the signal and why.
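To make the ensemble idea concrete, the sketch below implements three toy one-step forecasters and a confidence-weighted combination in plain Python. The function names, the smoothing and reversion parameters, and the caller-supplied `confidences` are assumptions for illustration; how the platform actually derives confidence is not stated here.

```python
from statistics import mean


def ewma_momentum(prices, alpha=0.3):
    """Last price plus an exponentially weighted average of recent price changes."""
    deltas = [b - a for a, b in zip(prices, prices[1:])]
    level = deltas[0]
    for d in deltas[1:]:
        level = alpha * d + (1 - alpha) * level
    return prices[-1] + level


def linear_trend(prices):
    """Least-squares line over the time index, extrapolated one step ahead."""
    n = len(prices)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(prices)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, prices)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar + slope * (n - x_bar)


def mean_reversion(prices, speed=0.5):
    """Pull the last price part of the way back toward the series mean."""
    return prices[-1] + speed * (mean(prices) - prices[-1])


MODELS = {
    "ewma": ewma_momentum,
    "regression": linear_trend,
    "mean_reversion": mean_reversion,
}


def ensemble(prices, confidences):
    """Combine the model forecasts, weighting each by its (assumed) confidence."""
    forecasts = {name: fn(prices) for name, fn in MODELS.items()}
    total = sum(confidences[name] for name in forecasts)
    combined = sum(confidences[name] * f for name, f in forecasts.items()) / total
    # Explainability: each model's weighted push relative to the last observed price.
    contributions = {
        name: confidences[name] * (f - prices[-1]) / total
        for name, f in forecasts.items()
    }
    driver = max(contributions, key=lambda name: abs(contributions[name]))
    return {
        "forecast": combined,
        "forecasts": forecasts,
        "contributions": contributions,
        "driver": driver,
    }


result = ensemble(
    [1.0, 2.0, 3.0, 4.0, 5.0],
    {"ewma": 0.5, "regression": 0.3, "mean_reversion": 0.2},
)
print(result["driver"], round(result["forecast"], 2))
```

On this steadily rising series the momentum and trend models both project 6.0 while mean reversion projects 4.0, so the weighted forecast lands near 5.6 and the `driver` field names the model with the largest weighted contribution.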
Pipeline Architecture
From raw marketplace data to actionable trading signals.
TCGPlayer ---\
CardMarket ---|--- scrapers --> raw prices
CardKingdom --/
        |
        v
[Identity Resolver]
fuzzy match + metadata alignment > canonical card ID
        |
        v
[dbt Transform Layer]
clean > dedupe > aggregate > feature engineering
        |
        v
[ML Ensemble]
EWMA + regression + mean reversion > confidence-weighted forecast
        |
        v
[Signal Engine]
momentum + forecast + spread analysis > Buy / Hold / Sell with confidence
Multi-Source Scrapers
TCGPlayer, CardMarket, CardKingdom price feeds with rate limiting and change detection
Card Resolver
Fuzzy matching + metadata alignment across marketplaces with different naming conventions
dbt + Dagster
Modular transforms with orchestrated scheduling, testing, and full data lineage
Ensemble Models
EWMA, regression, mean reversion combined with confidence-weighted voting
Supabase + Workers
Serverless API on Cloudflare Workers with PostgreSQL and real-time subscriptions
What Makes It Different
The technical decisions behind the platform.
Cross-Marketplace Identity Resolution
The same card has different IDs on TCGPlayer, CardMarket, and CardKingdom. Different names, different metadata, different condition grades. The resolver unifies them into a single canonical identity.
ML Ensemble Forecasting
EWMA, linear regression, and mean reversion models each capture different price dynamics. The ensemble combines them with confidence-weighted voting. Nothing is a black box: every prediction can be traced back to the model that drove it.
dbt + Dagster Pipeline
Modular SQL transforms in dbt, orchestrated by Dagster with full observability. Each transform is testable, documented, and versioned. Pipeline failures are isolated and recoverable.
Buy/Sell Signal Engine
Every card gets a data-driven rating with confidence scores. Signals combine momentum, forecast direction, and cross-marketplace spread to surface mispriced cards before the market corrects.
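One way such a rating could be assembled is sketched below in Python. The weights, the 2% threshold, the use of the cross-market median as a reference price, and the sign-agreement definition of confidence are all hypothetical choices for this example, not the platform's scoring rules.

```python
from statistics import median


def rate_card(prices_by_market, market, momentum, forecast_return,
              weights=(0.4, 0.4, 0.2), threshold=0.02):
    """Blend three signed signals into a Buy / Hold / Sell label.

    momentum        -- recent fractional price change (0.05 means +5%)
    forecast_return -- forecast price / current price - 1
    The spread signal is how far this market's price sits below the median.
    """
    reference = median(prices_by_market.values())
    discount = (reference - prices_by_market[market]) / reference
    components = (momentum, forecast_return, discount)
    score = sum(w * c for w, c in zip(weights, components))

    if score > threshold:
        label = "Buy"
    elif score < -threshold:
        label = "Sell"
    else:
        label = "Hold"

    # Assumed confidence measure: how strongly the three signals agree in sign.
    signs = [(c > 0) - (c < 0) for c in components]
    confidence = abs(sum(signs)) / len(signs)
    return {"label": label, "score": score, "confidence": confidence}


# Made-up prices for illustration only.
prices = {"tcgplayer": 9.0, "cardmarket": 10.0, "cardkingdom": 11.0}
print(rate_card(prices, "tcgplayer", momentum=0.05, forecast_return=0.04))
```

Here all three signals point the same way (rising price, positive forecast, listing below the median), so the sketch returns a Buy with full agreement.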
Market Momentum Tracking
Price velocity calculated over 7-day, 30-day, and 90-day windows. Spot breakout cards, declining staples, and sideways movers. Momentum is the leading indicator most MTG tools ignore.
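A minimal sketch of the windowed calculation, assuming one price per day in chronological order and treating velocity as the fractional change over each window. The 5% band used to separate breakouts, decliners, and sideways movers is an arbitrary illustrative value.

```python
def price_velocity(daily_prices, windows=(7, 30, 90)):
    """Fractional price change over each window; None when history is too short."""
    velocities = {}
    for window in windows:
        if len(daily_prices) > window:
            past = daily_prices[-1 - window]
            velocities[window] = (daily_prices[-1] - past) / past
        else:
            velocities[window] = None
    return velocities


def classify(velocity, band=0.05):
    """Label a velocity as breakout, declining, or sideways (band is an assumption)."""
    if velocity is None:
        return "unknown"
    if velocity > band:
        return "breakout"
    if velocity < -band:
        return "declining"
    return "sideways"


# Synthetic series: 100 days, rising by 1 each day from 100.
history = [100 + day for day in range(100)]
for window, velocity in price_velocity(history).items():
    print(window, classify(velocity))
```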
Buylist Comparison
Compare what stores are paying for cards across marketplaces in one view. The spread between retail and buylist prices is where real trading value hides.
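The retail-versus-buylist comparison reduces to simple arithmetic, sketched here in Python. The dictionaries of prices are invented for the example, and both function names are hypothetical.

```python
def buylist_spreads(retail, buylist):
    """Per-marketplace gap between the retail price and what the store pays."""
    spreads = {}
    for market in retail.keys() & buylist.keys():
        gap = retail[market] - buylist[market]
        spreads[market] = {"spread": gap, "spread_pct": gap / retail[market]}
    return spreads


def best_cross_market_trade(retail, buylist):
    """Cheapest place to buy versus the highest buylist offer across markets."""
    buy_at = min(retail, key=retail.get)
    sell_to = max(buylist, key=buylist.get)
    return {
        "buy_at": buy_at,
        "sell_to": sell_to,
        "margin": buylist[sell_to] - retail[buy_at],
    }


# Made-up prices for illustration only.
retail = {"tcgplayer": 10.0, "cardkingdom": 12.0}
buylist = {"tcgplayer": 6.0, "cardkingdom": 8.0}
print(buylist_spreads(retail, buylist))
print(best_cross_market_trade(retail, buylist))
```

A negative `margin`, as in this example, means no store currently pays more than the cheapest retail listing; a positive value would flag a card worth a closer look.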
Explore the data
The platform is in beta. Browse real-time pricing, forecasts, and trading signals across the MTG market.
mtg.dataviking.tech