B2B Marketplace · Data Products
Salary Transparency & Structured Job Data
Improving salary transparency by building better structured job data.
Structured Data · Taxonomy · Ground Truth · Marketplace · Data Quality
At a glance
Estimation error: −28%
Salary coverage: 94%
Apply rate uplift: +11%
Overview
Salary ranges shown to job seekers depended on the quality of employer-provided job data. My work focused on improving taxonomy, structured attributes and data quality so salary estimates became more reliable.
Problem
Job postings arrived with inconsistent or incomplete structured data. Salaries were estimated from sparse, noisy inputs: role titles that meant different things, missing seniority levels and conflicting location mappings. The resulting estimates felt wrong to users and eroded trust.
Discovery
- Mapped every field that fed into the salary model and scored its completeness
- Interviewed job seekers to understand which estimates felt credible and why
- Audited employer posting flows to find where structured data was being lost
- Traced data lineage from ingestion through to the final estimate
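To illustrate the completeness scoring mentioned above, here is a minimal sketch. It assumes postings are plain dictionaries and that a field counts as complete when it holds a non-blank value. The field names and sample postings are hypothetical, not the real model inputs.

```python
from typing import Any

# Hypothetical field names; the real salary model's inputs are not listed here.
MODEL_FIELDS = ["title", "seniority", "location", "employment_type"]


def is_filled(value: Any) -> bool:
    """Treat None and blank strings as missing."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    return True


def field_completeness(postings: list[dict[str, Any]], fields: list[str]) -> dict[str, float]:
    """Return, per field, the share of postings (0.0 to 1.0) with a usable value."""
    if not postings:
        return {f: 0.0 for f in fields}
    total = len(postings)
    return {f: sum(is_filled(p.get(f)) for p in postings) / total for f in fields}


# Illustrative postings only.
sample_postings = [
    {"title": "Software Engineer", "seniority": "senior", "location": "Berlin", "employment_type": "full_time"},
    {"title": "SWE II", "seniority": "", "location": "berlin, de", "employment_type": None},
    {"title": "Developer", "seniority": None, "location": None, "employment_type": "full_time"},
    {"title": "  ", "seniority": "junior", "location": "Munich", "employment_type": "contract"},
]

scores = field_completeness(sample_postings, MODEL_FIELDS)
for name, share in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {share:.0%}")
```

Sorting by the lowest share first puts the weakest inputs at the top, which is one way to decide where to audit posting flows first.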
Constraints
- Three legacy ingestion systems with different schemas
- No additional engineering headcount
- Changes had to ship without breaking existing estimates
Product strategy
- Treat structured data as the product surface, not just a backend concern
- Build a unified taxonomy before improving the model
- Use confidence scores to decide when to show, hide or qualify an estimate
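The show, hide or qualify rule can be pictured as a simple threshold gate. The sketch below assumes a confidence score between 0 and 1. The two cut-off values are hypothetical placeholders, not the thresholds actually used.

```python
from enum import Enum


class Display(Enum):
    SHOW = "show"        # present the estimate as-is
    QUALIFY = "qualify"  # present it with a caveat about data quality
    HIDE = "hide"        # suppress the estimate entirely


# Hypothetical thresholds, chosen only for illustration.
SHOW_THRESHOLD = 0.8
QUALIFY_THRESHOLD = 0.5


def display_decision(confidence: float) -> Display:
    """Map a confidence score in [0, 1] to a display treatment."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be between 0 and 1, got {confidence}")
    if confidence >= SHOW_THRESHOLD:
        return Display.SHOW
    if confidence >= QUALIFY_THRESHOLD:
        return Display.QUALIFY
    return Display.HIDE


for score in (0.9, 0.6, 0.2):
    print(score, display_decision(score).value)
```

Keeping the thresholds as named constants makes the product decision (how much uncertainty users should see) visible and easy to tune separately from the model.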
Solution
01. Unified job taxonomy mapping titles, seniority and location to canonical values
02. New structured data pipeline with validation at ingestion
03. Confidence scoring that surfaced source quality alongside the estimate
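As a rough illustration of the first two pieces, the sketch below maps raw values to canonical ones through alias tables and records a validation error whenever a value cannot be mapped. The alias tables, canonical values and the `NormalizedPosting` type are all hypothetical. A production taxonomy would be far larger and would likely not be a hand-written dictionary.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical alias tables mapping cleaned raw strings to canonical values.
TITLE_ALIASES = {
    "swe": "software_engineer",
    "software engineer": "software_engineer",
    "software developer": "software_engineer",
}
SENIORITY_ALIASES = {"jr": "junior", "junior": "junior", "sr": "senior", "senior": "senior"}
LOCATION_ALIASES = {"berlin": "berlin_de", "berlin, de": "berlin_de", "munich": "munich_de"}

FIELD_ALIASES = (
    ("title", TITLE_ALIASES),
    ("seniority", SENIORITY_ALIASES),
    ("location", LOCATION_ALIASES),
)


@dataclass
class NormalizedPosting:
    title: Optional[str]
    seniority: Optional[str]
    location: Optional[str]
    errors: list = field(default_factory=list)


def canonicalize(raw: Optional[str], aliases: dict) -> Optional[str]:
    """Lowercase, collapse whitespace, then look the value up in the alias table."""
    if raw is None:
        return None
    key = " ".join(raw.lower().split())
    return aliases.get(key)


def normalize_posting(raw: dict) -> NormalizedPosting:
    """Validate at ingestion: unmapped or missing values become recorded errors."""
    values = {}
    errors = []
    for name, aliases in FIELD_ALIASES:
        canonical = canonicalize(raw.get(name), aliases)
        if canonical is None:
            errors.append(f"{name}: unmapped or missing value {raw.get(name)!r}")
        values[name] = canonical
    return NormalizedPosting(errors=errors, **values)


clean = normalize_posting({"title": "  Software  Engineer", "seniority": "Sr", "location": "Berlin, DE"})
messy = normalize_posting({"title": "Ninja", "seniority": None})
print(clean)
print(messy.errors)
```

Recording errors instead of rejecting the posting keeps ingestion non-breaking, and the error count per posting is one possible input to a source-quality confidence score.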
Engineering
Worked closely with data and engineering teams to land the new schema behind a feature flag, backfill historical postings and run a shadow comparison before flipping traffic.
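A shadow comparison of this kind can be sketched as running both pipelines over the same postings and summarizing how far the new estimates drift from the legacy ones. The sketch assumes estimates are keyed by posting ID and that relative difference is the metric of interest. The 10% tolerance, the IDs and the salary figures are hypothetical.

```python
from statistics import median


def shadow_compare(legacy: dict, candidate: dict, tolerance: float = 0.10) -> dict:
    """Compare candidate estimates against legacy ones for postings present in both."""
    shared = sorted(legacy.keys() & candidate.keys())
    # Relative difference per posting; skip non-positive legacy values to avoid dividing by zero.
    rel_diffs = {
        pid: abs(candidate[pid] - legacy[pid]) / legacy[pid]
        for pid in shared
        if legacy[pid] > 0
    }
    flagged = [pid for pid, diff in rel_diffs.items() if diff > tolerance]
    return {
        "compared": len(rel_diffs),
        "median_rel_diff": median(rel_diffs.values()) if rel_diffs else 0.0,
        "flagged": flagged,
    }


# Illustrative estimates only.
legacy_estimates = {"a": 50000.0, "b": 60000.0, "c": 80000.0}
candidate_estimates = {"a": 52000.0, "b": 75000.0, "d": 45000.0}

report = shadow_compare(legacy_estimates, candidate_estimates)
print(report)
```

The flagged list gives a concrete review queue before flipping traffic: large shifts may be improvements or regressions, so each one is worth inspecting against the source data.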
Lessons learned
- Better AI starts with better ground truth, not better algorithms
- Provenance matters as much as the number itself
- Confidence intervals are a product surface, not just a metric
Reflection
"The most rewarding part was watching trust become measurable. Once users understood where a number came from, even imperfect estimates became useful."