- type: concept
- created: Tue Apr 07 2026 02:00:00 GMT+0200 (Central European Summer Time)
- updated: Tue Apr 07 2026 02:00:00 GMT+0200 (Central European Summer Time)
- sources: raw/articles/PRD
- tags: matching algorithm scoring core-mechanic
Matching Algorithm
Overview
The matching algorithm is the core intelligence of the marketplace. It evaluates every combination of available surplus items against active buyer specifications, producing a composite score from 0 to 100 that determines match quality. The algorithm uses hard disqualification gates (paper type, GSM, width) before computing soft scores (grade, geography), ensuring only physically viable matches are surfaced.
The Five Scoring Functions
1. Paper Type (Weight: 30%)
A binary gate returning 100 (match) or 0 (disqualification). Paper type must match exactly -- there is no "close" paper type. A buyer of wiki/entities/kraftliner cannot use wiki/entities/testliner. If the surplus paper type does not equal the BuyerSpec paper type, the entire match is discarded with no further computation.
Valid paper types: kraftliner, testliner, fluting, duplex, triplex, sack_kraft, white_top_testliner, coated_board, mg_kraft, greaseproof, tissue.
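A minimal sketch of the paper type gate in Python. The function name `paper_type_score` and the choice to raise `ValueError` for an unknown type are assumptions made for illustration; the source only specifies the 100/0 outcome and the list of valid types.

```python
# Valid paper types as listed above.
VALID_PAPER_TYPES = frozenset({
    "kraftliner", "testliner", "fluting", "duplex", "triplex", "sack_kraft",
    "white_top_testliner", "coated_board", "mg_kraft", "greaseproof", "tissue",
})


def paper_type_score(surplus_type: str, spec_type: str) -> int:
    """Return 100 on an exact match, 0 (disqualification) otherwise."""
    # Assumption: unknown types are treated as data errors, not as mismatches.
    for value in (surplus_type, spec_type):
        if value not in VALID_PAPER_TYPES:
            raise ValueError(f"unknown paper type: {value!r}")
    return 100 if surplus_type == spec_type else 0
```

A caller would stop processing the pair as soon as this returns 0, since no other score can rescue a paper type mismatch.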
2. GSM / Grammage (Weight: 25%)
Scored against the buyer's min/max GSM range using tolerance bands:
| Condition | Score |
|---|---|
| Within buyer's exact range (center of range) | 100 |
| Within buyer's exact range (edge of range) | 80-100 (linear) |
| Up to 5% below the minimum or above the maximum | 60 |
| More than 5% and up to 10% below the minimum or above the maximum | 30 |
| Outside all tolerances | 0 (disqualification) |
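Example: Buyer spec 120-200 GSM. Surplus at 160 GSM (the center of the range) scores 100. Surplus at 125 GSM scores 82.5 (35 of 40 GSM away from the center, so 100 - 20 x 35/40). Surplus at 205 GSM (2.5% above the maximum) scores 60. Surplus at 215 GSM (7.5% above the maximum) scores 30. Surplus at 250 GSM (25% above the maximum) scores 0.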
If the GSM score is 0, the match is discarded entirely.
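The tolerance bands can be sketched as a small Python function. Two readings are assumed here and are not confirmed by the source: the in-range score falls linearly from 100 at the midpoint to 80 at either edge, and the 5% and 10% tolerances are measured relative to the bound that was missed. The name `gsm_score` is hypothetical.

```python
def gsm_score(gsm: float, gsm_min: float, gsm_max: float) -> float:
    """Score a surplus grammage against the buyer's min/max GSM range."""
    if gsm_min <= gsm <= gsm_max:
        half_range = (gsm_max - gsm_min) / 2
        if half_range == 0:
            return 100.0  # single-value range: an exact hit is the center
        center = (gsm_min + gsm_max) / 2
        # Assumption: linear falloff from 100 (center) to 80 (edge).
        return 100.0 - 20.0 * abs(gsm - center) / half_range

    # Outside the range: measure the miss relative to the nearest bound.
    bound = gsm_min if gsm < gsm_min else gsm_max
    miss = abs(gsm - bound) / bound
    if miss <= 0.05:
        return 60.0
    if miss <= 0.10:
        return 30.0
    return 0.0  # disqualification
```

A different interpolation rule inside the range would change the in-range values but not the band scores of 60, 30, and 0.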
3. Width (Weight: 20%)
Scored against the buyer's width range with a hard ceiling at machine max width:
| Condition | Score |
|---|---|
| Exceeds machine_max_width_mm | 0 (hard disqualification) |
| Within buyer's width range | 100 |
| Below minimum by <=5% | 70 (buyer might accept narrower) |
| Below minimum by more than 5%, up to 10% | 40 |
| Above maximum by <=5% (within machine limit) | 50 (can be cut) |
| Outside all tolerances | 0 (disqualification) |
The machine max width is a physical constraint -- a roll that does not fit the machine cannot be used regardless of other specs. If width score is 0, the match is discarded.
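A hedged Python sketch of the width rules, with the machine limit checked first so it overrides every other band. The function name `width_score` and its parameter names (other than `machine_max_width_mm`) are assumptions, as is measuring each percentage relative to the bound that was missed.

```python
def width_score(
    width_mm: float,
    spec_min_mm: float,
    spec_max_mm: float,
    machine_max_width_mm: float,
) -> int:
    """Score a surplus roll width against the buyer's width range."""
    # Hard ceiling: a roll that does not fit the machine is unusable.
    if width_mm > machine_max_width_mm:
        return 0
    if spec_min_mm <= width_mm <= spec_max_mm:
        return 100
    if width_mm < spec_min_mm:
        shortfall = (spec_min_mm - width_mm) / spec_min_mm
        if shortfall <= 0.05:
            return 70  # buyer might accept a narrower roll
        if shortfall <= 0.10:
            return 40
        return 0
    # Wider than the spec but still within the machine limit: can be cut.
    excess = (width_mm - spec_max_mm) / spec_max_mm
    return 50 if excess <= 0.05 else 0
```

For a buyer spec of 2000-2500 mm on a 2600 mm machine, a 2550 mm roll falls in the "can be cut" band, while a 2650 mm roll is rejected by the machine limit even though it is only 6% over the spec maximum.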
4. Quality Grade (Weight: 15%)
Scored against the buyer's list of acceptable grades using a hierarchy: A (prime) > B (near-prime) > C (off-grade).
| Condition | Score |
|---|---|
| Grade is in buyer's acceptable list | 100 |
| Surplus grade is higher than required | 90 (better quality than needed) |
| One grade below lowest acceptable | 40 (buyer might accept at discount) |
| Two or more grades below | 0 |
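
The grade table can be sketched in Python as follows. The name `grade_score` is hypothetical, and one case is an assumption rather than something the source states: a surplus grade that is not listed but is better than the buyer's lowest acceptable grade is scored as "higher than required" (90).

```python
from typing import Iterable

# Hierarchy from the text: A (prime) > B (near-prime) > C (off-grade).
GRADE_RANK = {"A": 0, "B": 1, "C": 2}  # lower rank means better grade


def grade_score(surplus_grade: str, acceptable_grades: Iterable[str]) -> int:
    """Score a surplus grade against the buyer's acceptable grades."""
    acceptable = set(acceptable_grades)
    if not acceptable:
        raise ValueError("buyer must list at least one acceptable grade")
    if surplus_grade in acceptable:
        return 100
    lowest_acceptable = max(GRADE_RANK[g] for g in acceptable)
    rank = GRADE_RANK[surplus_grade]
    if rank < lowest_acceptable:
        return 90  # better quality than needed
    # One step below the lowest acceptable grade may sell at a discount.
    return 40 if rank - lowest_acceptable == 1 else 0
```

Unlike paper type, GSM, and width, a grade score of 0 only lowers the composite; it is not one of the hard disqualification gates.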
5. Geography (Weight: 10%)
Scored on proximity between surplus origin and buyer location:
| Condition | Score |
|---|---|
| Same country | 100 (ideal for truck shipping, especially intra-EU) |
| Same region (e.g., both EU) | 70 |
| Adjacent regions (e.g., EU to Middle East) | 40 |
| Distant regions | 20 |
Adjacent region pairs: EU-Middle East, EU-Africa, North America-South America, Asia-Middle East, Asia-Oceania, Middle East-Africa.
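A short Python sketch of the proximity lookup. The region labels are taken from the pair list above; the function name `geo_score`, its parameters, and the use of country codes are assumptions. How a country is mapped to its region is not described in the source, so the caller supplies both.

```python
# Adjacent region pairs from the text; frozenset makes each pair symmetric.
ADJACENT_REGIONS = {
    frozenset({"EU", "Middle East"}),
    frozenset({"EU", "Africa"}),
    frozenset({"North America", "South America"}),
    frozenset({"Asia", "Middle East"}),
    frozenset({"Asia", "Oceania"}),
    frozenset({"Middle East", "Africa"}),
}


def geo_score(
    origin_country: str,
    origin_region: str,
    buyer_country: str,
    buyer_region: str,
) -> int:
    """Score proximity between the surplus origin and the buyer."""
    if origin_country == buyer_country:
        return 100
    if origin_region == buyer_region:
        return 70
    if frozenset({origin_region, buyer_region}) in ADJACENT_REGIONS:
        return 40
    return 20  # distant regions still score; geography never disqualifies
```

Storing each pair as a `frozenset` means EU to Middle East and Middle East to EU hit the same entry, so the table does not need to list both directions.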
Composite Score Calculation
The composite score is computed as:
overall = (paper_type_score * 0.30) + (gsm_score * 0.25) + (width_score * 0.20) + (grade_score * 0.15) + (geo_score * 0.10)
Four checks are applied sequentially before computing the full composite, a visibility check followed by the three hard disqualification gates:
- Visibility check (see wiki/concepts/geographic-visibility-system) -- checked first to avoid wasted computation
- Paper type must be non-zero
- GSM must be non-zero
- Width must be non-zero
If any gate fails, the function returns None (no match).
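The gate order and the weighted sum can be sketched together in Python. The function name `composite_score` and the boolean `visible` argument are assumptions for illustration; the weights and the `None` return come from the text above.

```python
from typing import Optional

WEIGHTS = {
    "paper_type": 0.30,
    "gsm": 0.25,
    "width": 0.20,
    "grade": 0.15,
    "geo": 0.10,
}


def composite_score(
    visible: bool,
    paper_type_score: float,
    gsm_score: float,
    width_score: float,
    grade_score: float,
    geo_score: float,
) -> Optional[float]:
    """Return the 0-100 composite, or None if any gate fails."""
    # Gates run in the documented order: visibility, paper type, GSM, width.
    if not visible:
        return None
    if paper_type_score == 0 or gsm_score == 0 or width_score == 0:
        return None
    return (
        paper_type_score * WEIGHTS["paper_type"]
        + gsm_score * WEIGHTS["gsm"]
        + width_score * WEIGHTS["width"]
        + grade_score * WEIGHTS["grade"]
        + geo_score * WEIGHTS["geo"]
    )
```

Because paper type is a binary gate, every surviving match receives the full 30 points from it; the remaining 70 points are what differentiate matches. In a real pipeline each gate would be evaluated lazily so that a failed visibility check skips the scoring functions entirely; this sketch takes precomputed scores only to keep the example short.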
Match Classification
| Type | Score Range | Description |
|---|---|---|
| Exact | >= 90 | Near-perfect fit for buyer's specs |
| Close | >= 70 and < 90 | Good fit, minor deviations |
| Partial | >= 50 and < 70 | Usable but notable compromises |
| Below threshold | < 50 | Not surfaced to buyer |
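
A minimal Python sketch of the classification. The lowercase labels and the name `classify_match` are assumptions. Fractional scores are assigned to the band whose lower bound they have reached, so 89.5 is treated as close rather than exact.

```python
from typing import Optional


def classify_match(score: float) -> Optional[str]:
    """Map a composite score to a match type, or None if below threshold."""
    if score >= 90:
        return "exact"
    if score >= 70:
        return "close"
    if score >= 50:
        return "partial"
    return None  # below threshold: not surfaced to the buyer
```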
Match Execution Flow
- New surplus item ingested (or buyer spec updated)
- Fetch all active BuyerSpecs with matching paper_type (this pre-filter eliminates ~80% of candidate specs)
- For each BuyerSpec: check visibility, calculate composite score
- Scores >= 50 create MatchResult records
- Scores >= 80 become candidates for exclusivity
- Sort matches by overall_score descending
- Queue newsletter generation for all matched buyers
- Log match statistics
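The flow above can be sketched as an in-memory Python loop. Everything here is illustrative: the dictionary shapes, the `MatchResult` fields, and the injected `score_fn` (assumed to run the visibility check and return the composite score or `None`) are hypothetical, and newsletter queueing and logging are left out.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

MATCH_THRESHOLD = 50        # scores >= 50 create MatchResult records
EXCLUSIVITY_THRESHOLD = 80  # scores >= 80 become exclusivity candidates


@dataclass
class MatchResult:
    spec_id: int
    overall_score: float
    exclusivity_candidate: bool


def match_surplus_item(
    item: dict,
    specs: Iterable[dict],
    score_fn: Callable[[dict, dict], Optional[float]],
) -> List[MatchResult]:
    """Score one surplus item against buyer specs, best match first."""
    results = []
    for spec in specs:
        # Pre-filter: only specs with the same paper type can match.
        if spec["paper_type"] != item["paper_type"]:
            continue
        score = score_fn(item, spec)  # None means a gate failed
        if score is None or score < MATCH_THRESHOLD:
            continue
        results.append(
            MatchResult(spec["id"], score, score >= EXCLUSIVITY_THRESHOLD)
        )
    results.sort(key=lambda r: r.overall_score, reverse=True)
    return results
```

In a database-backed system the paper type pre-filter would normally be pushed into the query rather than applied in the loop, so that non-matching specs are never loaded.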
Price Check
After scoring, the algorithm checks if the surplus price (adjusted for the buyer's geographic region) falls within the buyer's max_price_per_mt. This is an informational flag (price_within_budget), not a hard disqualifier -- a buyer might pay more for a perfect spec match.
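A small Python sketch of the flag. The source does not say how the price is adjusted for the buyer's region, so the multiplicative `regional_multiplier` used here is purely an assumption, as are the function and parameter names other than `max_price_per_mt`.

```python
def price_within_budget(
    base_price_per_mt: float,
    regional_multiplier: float,
    max_price_per_mt: float,
) -> bool:
    """Return the informational price flag for a scored match."""
    # Assumption: the regional adjustment is a simple multiplier.
    adjusted_price = base_price_per_mt * regional_multiplier
    return adjusted_price <= max_price_per_mt
```

The result is stored alongside the match rather than used to filter it, so a match with `price_within_budget` set to `False` is still surfaced to the buyer.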
Performance Targets
| Scenario | Scale | Target |
|---|---|---|
| Single item vs all specs | 1 x 500 specs | < 1 second |
| Full batch ingestion (50 items) | 50 x 500 specs | < 10 seconds |
| Full re-match (all surplus vs all specs) | 5000 x 500 specs | < 60 seconds |
Optimization strategies: pre-filter by paper_type (index scan), pre-filter by visibility region, batch scoring in Django queryset operations, cache static region lookups, use Celery for async matching on large batches.
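To put the targets in perspective, the short Python calculation below converts each scenario into a required scoring throughput, with and without the paper type pre-filter. The ~80% elimination figure is taken from the execution flow; treating it as a uniform 20% keep rate across all scenarios is an assumption, and the function name is hypothetical.

```python
# (items, specs, target seconds) for each scenario in the table above.
SCENARIOS = {
    "single item": (1, 500, 1),
    "batch of 50": (50, 500, 10),
    "full re-match": (5000, 500, 60),
}
PREFILTER_KEEP = 0.20  # assumption: the pre-filter keeps ~20% of specs


def required_throughput(
    items: int, specs: int, seconds: float, keep: float = 1.0
) -> float:
    """Pairs that must be scored per second to meet the target."""
    return items * specs * keep / seconds


for name, (items, specs, seconds) in SCENARIOS.items():
    raw = required_throughput(items, specs, seconds)
    filtered = required_throughput(items, specs, seconds, PREFILTER_KEEP)
    print(f"{name}: {raw:,.0f} pairs/s raw, {filtered:,.0f} pairs/s filtered")
```

The full re-match is the demanding case: 2.5 million pairs in 60 seconds is roughly 41,700 pairs per second without pre-filtering, which is why the pre-filters and batch scoring matter most there.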
Sources
- raw/articles/PRD -- sections 7.1-7.6, 6.8
Related
- wiki/concepts/spec-based-matching -- the philosophy behind specification-based matching
- wiki/concepts/48-hour-exclusivity-window -- triggered by scores >= 80
- wiki/concepts/geographic-visibility-system -- visibility pre-filter
- wiki/concepts/quality-grades -- grade hierarchy used in scoring
- wiki/concepts/newsletter-generation -- consumes match results
- wiki/concepts/container-assembly -- uses match results for bin-packing