type: concept
created: Tue Apr 07 2026 02:00:00 GMT+0200 (Central European Summer Time)
updated: Tue Apr 07 2026 02:00:00 GMT+0200 (Central European Summer Time)
sources: raw/articles/PRD
tags: matching algorithm scoring core-mechanic

Matching Algorithm

abstract

The matching algorithm scores surplus-buyer pairs across five weighted dimensions (paper type 30%, GSM 25%, width 20%, grade 15%, geography 10%), applies a minimum threshold of 50, and classifies results as exact, close, or partial.

Overview

The matching algorithm is the core intelligence of the marketplace. It evaluates every combination of available surplus items against active buyer specifications, producing a composite score from 0 to 100 that determines match quality. The algorithm uses hard disqualification gates (paper type, GSM, width) before computing soft scores (grade, geography), ensuring only physically viable matches are surfaced.

The Five Scoring Functions

1. Paper Type (Weight: 30%)

A binary gate returning 100 (match) or 0 (disqualification). Paper type must match exactly -- there is no "close" paper type. A buyer of wiki/entities/kraftliner cannot use wiki/entities/testliner. If the surplus paper type does not equal the BuyerSpec paper type, the entire match is discarded with no further computation.

Valid paper types: kraftliner, testliner, fluting, duplex, triplex, sack_kraft, white_top_testliner, coated_board, mg_kraft, greaseproof, tissue.
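A minimal sketch of the paper type gate, assuming paper types are passed as the plain strings listed above. The function name `paper_type_score` and the choice to raise on an unknown type are illustrative assumptions, not taken from the PRD.

```python
# Valid paper types as listed in this section.
VALID_PAPER_TYPES = frozenset({
    "kraftliner", "testliner", "fluting", "duplex", "triplex", "sack_kraft",
    "white_top_testliner", "coated_board", "mg_kraft", "greaseproof", "tissue",
})


def paper_type_score(surplus_type: str, spec_type: str) -> int:
    """Return 100 on an exact match, 0 (disqualification) otherwise."""
    # Assumption: unknown types are treated as a data error rather than a non-match.
    for value in (surplus_type, spec_type):
        if value not in VALID_PAPER_TYPES:
            raise ValueError(f"unknown paper type: {value!r}")
    return 100 if surplus_type == spec_type else 0
```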

2. GSM / Grammage (Weight: 25%)

Scored against the buyer's min/max GSM range using tolerance bands:

Condition	Score
Within buyer's exact range (center of range)	100
Within buyer's exact range (edge of range)	80-100 (linear)
Up to 5% beyond either end of the stated range	60
More than 5% and up to 10% beyond either end of the stated range	30
Outside all tolerances	0 (disqualification)

Example: Buyer spec 120-200 GSM. Surplus at 160 GSM scores 100. Surplus at 125 GSM scores ~83 (5 GSM inside the edge of a range whose center is 40 GSM from each edge, so 80 + 20 x 5/40 = 82.5). Surplus at 205 GSM scores 60. Surplus at 215 GSM scores 30. Surplus at 250 GSM scores 0.

If the GSM score is 0, the match is discarded entirely.
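The tolerance bands can be sketched as a small function. This is one possible reading rather than the PRD's implementation: it assumes the in-range score falls linearly from 100 at the center of the range to 80 at either edge, and that the 5% and 10% tolerances are measured relative to the nearest bound. The name `gsm_score` and its parameters are hypothetical.

```python
def gsm_score(gsm: float, gsm_min: float, gsm_max: float) -> float:
    """Score a surplus GSM against a buyer's min/max range (0 means disqualified)."""
    if gsm_min <= gsm <= gsm_max:
        half_width = (gsm_max - gsm_min) / 2
        if half_width == 0:
            return 100.0  # degenerate range: a single exact value
        center = (gsm_min + gsm_max) / 2
        # Assumption: linear from 100 at the center down to 80 at the edge.
        return 100.0 - 20.0 * abs(gsm - center) / half_width

    # Assumption: overshoot is a fraction of the bound that was missed.
    if gsm < gsm_min:
        overshoot = (gsm_min - gsm) / gsm_min
    else:
        overshoot = (gsm - gsm_max) / gsm_max

    if overshoot <= 0.05:
        return 60.0
    if overshoot <= 0.10:
        return 30.0
    return 0.0
```

Under this reading, a 120-200 GSM spec gives 100 at 160 GSM, 82.5 at 125 GSM, 60 at 205 GSM, 30 at 215 GSM, and 0 at 250 GSM.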

3. Width (Weight: 20%)

Scored against the buyer's width range with a hard ceiling at machine max width:

Condition	Score
Exceeds machine_max_width_mm	0 (hard disqualification)
Within buyer's width range	100
Below minimum by <=5%	70 (buyer might accept narrower)
Below minimum by more than 5% and up to 10%	40
Above maximum by <=5% (within machine limit)	50 (can be cut)
Outside all tolerances	0 (disqualification)

The machine max width is a physical constraint -- a roll that does not fit the machine cannot be used regardless of other specs. If width score is 0, the match is discarded.
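A hedged sketch of the width rules follows. It assumes all widths are in millimetres and that the percentage tolerances are measured relative to the bound that was missed; the function name and parameter names other than `machine_max_width_mm` are illustrative.

```python
def width_score(width_mm: float, min_mm: float, max_mm: float,
                machine_max_width_mm: float) -> int:
    """Score a surplus roll width against a buyer's range (0 means disqualified)."""
    # Hard ceiling: a roll wider than the machine cannot be used at all.
    if width_mm > machine_max_width_mm:
        return 0
    if min_mm <= width_mm <= max_mm:
        return 100
    if width_mm < min_mm:
        shortfall = (min_mm - width_mm) / min_mm
        if shortfall <= 0.05:
            return 70  # buyer might accept narrower
        if shortfall <= 0.10:
            return 40
        return 0
    # Wider than the buyer's range but still within the machine limit.
    excess = (width_mm - max_mm) / max_mm
    return 50 if excess <= 0.05 else 0  # up to 5% over can be cut down
```

Checking the machine limit first mirrors the table order: the physical constraint overrides every other band.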

4. Quality Grade (Weight: 15%)

Scored against the buyer's list of acceptable grades using a hierarchy: A (prime) > B (near-prime) > C (off-grade).

Condition	Score
Grade is in buyer's acceptable list	100
Surplus grade is higher than required	90 (better quality than needed)
One grade below lowest acceptable	40 (buyer might accept at discount)
Two or more grades below	0
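The grade table can be sketched as below. The ranking dictionary and function name are assumptions for illustration. One ambiguity is resolved by assumption: "higher than required" is read as "ranked above the buyer's lowest acceptable grade".

```python
# Hierarchy from this section: A (prime) > B (near-prime) > C (off-grade).
GRADE_RANK = {"A": 3, "B": 2, "C": 1}


def grade_score(surplus_grade: str, acceptable_grades: list[str]) -> int:
    """Score a surplus grade against the buyer's list of acceptable grades."""
    if not acceptable_grades:
        raise ValueError("buyer must list at least one acceptable grade")
    if surplus_grade in acceptable_grades:
        return 100
    lowest_acceptable = min(GRADE_RANK[g] for g in acceptable_grades)
    gap = lowest_acceptable - GRADE_RANK[surplus_grade]
    if gap < 0:
        return 90  # better quality than needed
    if gap == 1:
        return 40  # one grade below; buyer might accept at a discount
    return 0       # two or more grades below
```

Unlike paper type, GSM, and width, a grade score of 0 is not listed as a disqualification gate, so it only lowers the composite.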

5. Geography (Weight: 10%)

Scored on proximity between surplus origin and buyer location:

Condition	Score
Same country	100 (ideal for truck shipping, especially intra-EU)
Same region (e.g., both EU)	70
Adjacent regions (e.g., EU to Middle East)	40
Distant regions	20

Adjacent region pairs: EU-Middle East, EU-Africa, North America-South America, Asia-Middle East, Asia-Oceania, Middle East-Africa.
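A minimal sketch of the geography score, assuming each party is described by a country code and a region name held as plain strings. The pair set is copied from the list above; the function signature is hypothetical.

```python
# Adjacent region pairs from this section, stored unordered so lookup is symmetric.
ADJACENT_REGIONS = {
    frozenset(pair)
    for pair in [
        ("EU", "Middle East"),
        ("EU", "Africa"),
        ("North America", "South America"),
        ("Asia", "Middle East"),
        ("Asia", "Oceania"),
        ("Middle East", "Africa"),
    ]
}


def geo_score(origin_country: str, origin_region: str,
              buyer_country: str, buyer_region: str) -> int:
    """Score proximity between surplus origin and buyer location."""
    if origin_country == buyer_country:
        return 100
    if origin_region == buyer_region:
        return 70
    if frozenset((origin_region, buyer_region)) in ADJACENT_REGIONS:
        return 40
    return 20  # distant regions still score; geography never disqualifies
```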

Composite Score Calculation

The composite score is computed as:

overall = (paper_type_score * 0.30) + (gsm_score * 0.25) + (width_score * 0.20) + (grade_score * 0.15) + (geo_score * 0.10)

Four checks are applied sequentially before computing the full composite: a visibility check, followed by the three hard disqualification gates.

1. Visibility check (see wiki/concepts/geographic-visibility-system) -- checked first to avoid wasted computation
2. Paper type score must be non-zero
3. GSM score must be non-zero
4. Width score must be non-zero

If any gate fails, the function returns None (no match).
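The gating and weighting can be sketched as follows. It assumes the five sub-scores have already been computed and are passed in a dictionary; a real pipeline would likely compute them lazily so that a failed gate skips the remaining work. The function name and dictionary keys are illustrative.

```python
from typing import Optional

# Weights in percent, matching the composite formula in this section.
WEIGHTS = {"paper_type": 30, "gsm": 25, "width": 20, "grade": 15, "geo": 10}

# The three hard gates, in the order they are checked.
GATES = ("paper_type", "gsm", "width")


def composite_score(visible: bool, scores: dict) -> Optional[float]:
    """Return the 0-100 composite score, or None if any check fails."""
    if not visible:
        return None  # visibility is checked first
    for gate in GATES:
        if scores[gate] == 0:
            return None  # hard disqualification
    # Integer percent weights divided once at the end avoid float drift.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100
```

For example, sub-scores of 100 (paper type), 60 (GSM), 70 (width), 90 (grade), and 70 (geography) give 30 + 15 + 14 + 13.5 + 7 = 79.5.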

Match Classification

Type	Score Range	Description
Exact	>= 90	Near-perfect fit for buyer's specs
Close	>= 70 and < 90	Good fit, minor deviations
Partial	>= 50 and < 70	Usable but notable compromises
Below threshold	< 50	Not surfaced to buyer
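A short sketch of the classification step, with hypothetical label strings. Because composite scores can be fractional, it treats each band as half-open, so a score such as 89.5 is classified as close.

```python
from typing import Optional


def classify_match(overall_score: float) -> Optional[str]:
    """Map a composite score to a match type, or None if below the threshold."""
    if overall_score >= 90:
        return "exact"
    if overall_score >= 70:
        return "close"
    if overall_score >= 50:
        return "partial"
    return None  # below threshold: not surfaced to the buyer
```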

Match Execution Flow

1. New surplus item ingested (or buyer spec updated)
2. Fetch all active BuyerSpecs with matching paper_type (pre-filter eliminates ~80%)
3. For each BuyerSpec: check visibility, calculate composite score
4. Scores >= 50 create MatchResult records
5. Scores >= 80 become candidates for exclusivity
6. Sort matches by overall_score descending
7. Queue newsletter generation for all matched buyers
8. Log match statistics
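The core of this flow can be sketched in plain Python. It is an illustration under several assumptions: specs and surplus items are dictionaries, the scorer is supplied by the caller as `score_pair` (returning None when visibility or a gate fails), and the `MatchResult` fields shown are hypothetical. Newsletter queuing and logging are left out.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

MIN_SCORE = 50          # threshold for creating a MatchResult
EXCLUSIVITY_SCORE = 80  # threshold for exclusivity candidates


@dataclass
class MatchResult:
    spec_id: int
    overall_score: float
    exclusivity_candidate: bool


def match_surplus(
    surplus: dict,
    specs: Iterable[dict],
    score_pair: Callable[[dict, dict], Optional[float]],
) -> list[MatchResult]:
    """Score one surplus item against buyer specs and return sorted matches."""
    results = []
    for spec in specs:
        # Pre-filter: only specs with the same paper type are scored.
        if spec["paper_type"] != surplus["paper_type"]:
            continue
        score = score_pair(surplus, spec)  # None means not visible or gated out
        if score is None or score < MIN_SCORE:
            continue
        results.append(MatchResult(spec["id"], score, score >= EXCLUSIVITY_SCORE))
    # Best matches first.
    results.sort(key=lambda r: r.overall_score, reverse=True)
    return results
```

In a database-backed system the paper type pre-filter would normally be pushed into the query rather than done in the loop.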

Price Check

After scoring, the algorithm checks whether the surplus price (adjusted for the buyer's geographic region) is at or below the buyer's max_price_per_mt. The result is stored as an informational flag (price_within_budget), not a hard disqualifier -- a buyer might pay more for a perfect spec match.
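A minimal sketch of the flag, assuming the regional adjustment is a simple multiplier on the price per metric ton. The source does not say how the adjustment is computed, so `region_multiplier` is purely hypothetical.

```python
def price_within_budget(surplus_price_per_mt: float,
                        region_multiplier: float,
                        max_price_per_mt: float) -> bool:
    """Informational flag only: it never removes a match."""
    # Assumption: regional adjustment is modelled as a single multiplier.
    adjusted_price = surplus_price_per_mt * region_multiplier
    return adjusted_price <= max_price_per_mt
```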

Performance Targets

Scenario	Scale	Target
Single item vs all specs	1 x 500 specs	< 1 second
Full batch ingestion (50 items)	50 x 500 specs	< 10 seconds
Full re-match (all surplus vs all specs)	5000 x 500 specs	< 60 seconds

Optimization strategies: pre-filter by paper_type (index scan), pre-filter by visibility region, batch scoring in Django queryset operations, cache static region lookups, use Celery for async matching on large batches.

Sources

raw/articles/PRD -- sections 7.1-7.6, 6.8

Related

wiki/concepts/spec-based-matching -- the philosophy behind specification-based matching
wiki/concepts/48-hour-exclusivity-window -- triggered by scores >= 80
wiki/concepts/geographic-visibility-system -- visibility pre-filter
wiki/concepts/quality-grades -- grade hierarchy used in scoring
wiki/concepts/newsletter-generation -- consumes match results
wiki/concepts/container-assembly -- uses match results for bin-packing
