Epic B2B-060: Product Catalog Intelligence

Goal: Enable mills to onboard their product catalog effortlessly via PDF datasheet upload, Excel import, or manual entry — with AI-powered extraction, product matching, and admin review.

Why: Mills won't type 30 fields into a form. They already have datasheets. We already have 8,919 processed documents and 4,246 catalog products in the Extractor pipeline. Connect the dots.

Dependencies:


Ticket Dependency Graph

B2B-060 (Sync Bridge)
    ↓
B2B-061 (Datasheet Upload Model + API)
    ↓
B2B-062 (AI Extraction Integration)
    ↓
B2B-063 (Product Matching Engine)
    ↓
B2B-064 (Admin Review Dashboard) ←── B2B-063
    ↓
B2B-065 (Mill Upload UI)          ←── B2B-062
    ↓
B2B-066 (Clone Product)           ←── (independent, B2B-051 only)
    ↓
B2B-067 (Excel Template Import)   ←── B2B-061
    ↓
B2B-068 (Empty State + Onboarding Flow) ←── B2B-065, B2B-067

B2B-060: Extractor → Marketplace Sync Bridge

Priority: P0 Type: Backend Depends on: B2B-051 (Product CRUD) ✅

What

Management command that imports catalog_products and mills from the Extractor's SQLite DB into the Marketplace's PostgreSQL Product and Mill models. Intended as a bulk import that is safe to re-run: rows are mapped by fingerprint/extractor_id, so a repeat run skips records that already exist.
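
A minimal sketch of the idempotent import idea, using only the standard library. It is not the actual management command: the `sync_products` helper, the in-memory dict standing in for the Marketplace tables, and the three-column table are assumptions for illustration. Only the `catalog_products` table name and the `name`, `gsm`, and `fingerprint` columns come from this document.

```python
import sqlite3


def sync_products(source: sqlite3.Connection, existing: dict, dry_run: bool = False) -> dict:
    """Import rows whose fingerprint is not yet known and return counts.

    `existing` is a hypothetical stand-in for the Marketplace Product table,
    keyed by extractor fingerprint.
    """
    imported = skipped = 0
    rows = source.execute("SELECT name, gsm, fingerprint FROM catalog_products")
    for name, gsm, fingerprint in rows:
        if fingerprint in existing:
            skipped += 1  # already imported on an earlier run
            continue
        imported += 1
        if not dry_run:
            existing[fingerprint] = {"name": name, "gsm": gsm}
    return {"imported": imported, "skipped": skipped}


# Demo with an in-memory database instead of paper_data.db.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE catalog_products (name TEXT, gsm INTEGER, fingerprint TEXT)")
source.executemany(
    "INSERT INTO catalog_products VALUES (?, ?, ?)",
    [("Kraftliner 120", 120, "fp-1"), ("Testliner 140", 140, "fp-2")],
)

marketplace = {}
dry = sync_products(source, marketplace, dry_run=True)  # counts only, writes nothing
first = sync_products(source, marketplace)              # real import
second = sync_products(source, marketplace)             # re-run skips everything
```

The dry run reports what would be imported without touching `marketplace`, and the second real run imports nothing because every fingerprint is already present.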

Acceptance Criteria

API Contract

N/A — management command only.

Playwright Test Expectations

N/A — backend only. Verify via:

python manage.py sync_extractor --source /path/to/paper_data.db --dry-run
# Outputs: "Would import X products, create Y mills"
python manage.py sync_extractor --source /path/to/paper_data.db
# Outputs: "Imported X products..."
python manage.py sync_extractor --source /path/to/paper_data.db
# Re-run outputs: "Imported 0 new, skipped X duplicates"

Files to Touch


B2B-061: Datasheet Upload Model + API

Priority: P0 Type: Backend Depends on: B2B-060

What

New DatasheetUpload model and API endpoints for mills (and admins) to upload PDF datasheets. Tracks upload → processing → review → accepted/rejected lifecycle.
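
The lifecycle above could be enforced with a small transition table. This is a sketch, not the model code: the `Status` enum, the `ALLOWED` map, and the `transition` helper are hypothetical names, and the status strings are inferred from the lifecycle sentence and the API examples in this ticket.

```python
from enum import Enum


class Status(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    REVIEW = "review"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


# Assumed transitions: upload -> processing -> review -> accepted/rejected.
ALLOWED = {
    Status.PENDING: {Status.PROCESSING},
    Status.PROCESSING: {Status.REVIEW},
    Status.REVIEW: {Status.ACCEPTED, Status.REJECTED},
    Status.ACCEPTED: set(),
    Status.REJECTED: set(),
}


def transition(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


status = transition(Status.PENDING, Status.PROCESSING)
```

Keeping the allowed moves in one table means the review endpoint can reject an "accept" on an upload that is still processing without scattering status checks across views.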

Acceptance Criteria

API Contract

Upload:

POST /api/datasheets/upload/
Content-Type: multipart/form-data
Body: file=<PDF>, mill=<uuid> (optional, admin only)
Response 201: {
  "id": "uuid",
  "status": "pending",
  "original_filename": "Navigator_Premium_Specs.pdf",
  "created_at": "2026-03-13T16:00:00Z"
}

List:

GET /api/datasheets/?status=review&mill=<uuid>&page=1
Response 200: { "count": 42, "results": [...] }

Review:

PATCH /api/datasheets/{id}/review/
Body: {"action": "accept", "admin_notes": "Looks good"}
Response 200: {"id": "...", "status": "accepted", ...}

Playwright Test Expectations

test('admin can see datasheet upload list', async ({ page }) => {
  // Login as admin
  await page.goto('/manage/datasheets');
  await expect(page.locator('h1')).toContainText('Datasheet');
  // Table or empty state should be visible
  await expect(page.locator('[data-testid="datasheet-list"], [data-testid="empty-state"]')).toBeVisible();
});

test('admin can filter datasheets by status', async ({ page }) => {
  await page.goto('/manage/datasheets');
  await page.locator('[data-testid="status-filter"]').click();
  await page.locator('mat-option:has-text("Review")').click();
  // URL should update with ?status=review
  await expect(page).toHaveURL(/status=review/);
});

Files to Touch


B2B-062: AI Extraction Integration (Extractor Pipeline)

Priority: P0 Type: Backend Depends on: B2B-061

What

When a DatasheetUpload is created (status=pending), a Celery task sends the PDF to the Extractor pipeline (localhost:8925) for processing. When extraction completes, the extracted specs are stored back on the DatasheetUpload record.

Acceptance Criteria

API Contract

Internal Celery task. Extractor API:

POST http://localhost:8925/upload
Content-Type: multipart/form-data
Body: file=<PDF>
Response: {"job_id": "abc123", "status": "queued"}

GET http://localhost:8925/status/abc123
Response: {"job_id": "abc123", "status": "done", "products": [...]}
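
A sketch of how the task might poll the status endpoint until the job finishes. The HTTP call is injected as `fetch_status` so the loop can be exercised without a running Extractor. The function name, the retry limits, and the "failed" status are assumptions; this ticket only shows "queued" and "done".

```python
import time
from typing import Callable


def wait_for_extraction(
    job_id: str,
    fetch_status: Callable[[str], dict],
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Poll until the job reports "done" and return its products."""
    for _ in range(max_attempts):
        payload = fetch_status(job_id)
        if payload["status"] == "done":
            return payload.get("products", [])
        if payload["status"] == "failed":  # assumed failure status, not in the ticket
            raise RuntimeError(f"extraction failed for job {job_id}")
        sleep(interval)
    raise TimeoutError(f"job {job_id} not finished after {max_attempts} attempts")


# Demo with canned responses in place of GET /status/<job_id>.
responses = iter([
    {"job_id": "abc123", "status": "queued"},
    {"job_id": "abc123", "status": "done", "products": [{"name": "Kraftliner 120"}]},
])
products = wait_for_extraction(
    "abc123",
    fetch_status=lambda job_id: next(responses),
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
```

In a Celery task the same loop could be replaced by task retries, which avoids holding a worker while the Extractor is busy.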

Playwright Test Expectations

test('uploaded datasheet shows processing status', async ({ page }) => {
  // After upload, navigate to datasheet detail
  await page.goto('/manage/datasheets/{id}');
  // Should show processing indicator or extracted results
  await expect(page.locator('[data-testid="extraction-status"]')).toBeVisible();
});

Files to Touch


B2B-063: Product Matching Engine

Priority: P0 Type: Backend Depends on: B2B-062, B2B-060

What

After extraction, each extracted product is matched against existing Products in the Marketplace DB. Uses a scoring algorithm: exact match (same mill + paper_type + GSM + width) → close match (same type + GSM, different mill/width) → no match (new product). Results stored in DatasheetUpload.matched_products.
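
The three tiers described above can be sketched as a plain function over dictionaries. The `classify_match` name, the dictionary keys, and the "close" label are assumptions (the output example in this ticket shows only "exact" and "new"). Confidence scoring is left out because the ticket does not specify how it is computed.

```python
def classify_match(extracted: dict, candidates: list[dict]) -> dict:
    """Classify one extracted product against existing products.

    exact: same mill + paper_type + gsm + width
    close: same paper_type + gsm, different mill or width
    new:   nothing comparable found
    """
    close = None
    for product in candidates:
        same_type_and_gsm = (
            product["paper_type"] == extracted["paper_type"]
            and product["gsm"] == extracted["gsm"]
        )
        if not same_type_and_gsm:
            continue
        if (
            product["mill"] == extracted["mill"]
            and product["width_mm"] == extracted["width_mm"]
        ):
            return {"match_type": "exact", "matched_product_id": product["id"]}
        if close is None:
            close = product  # remember the first close candidate
    if close is not None:
        return {"match_type": "close", "matched_product_id": close["id"]}
    return {"match_type": "new", "matched_product_id": None}


candidates = [
    {"id": "p1", "mill": "mill-a", "paper_type": "kraftliner", "gsm": 120, "width_mm": 2500},
    {"id": "p2", "mill": "mill-b", "paper_type": "testliner", "gsm": 140, "width_mm": 2200},
]
exact = classify_match(
    {"mill": "mill-a", "paper_type": "kraftliner", "gsm": 120, "width_mm": 2500}, candidates
)
close = classify_match(
    {"mill": "mill-c", "paper_type": "testliner", "gsm": 140, "width_mm": 1800}, candidates
)
new = classify_match(
    {"mill": "mill-a", "paper_type": "greaseproof", "gsm": 45, "width_mm": 1000}, candidates
)
```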

Acceptance Criteria

API Contract

Internal service. Output stored on DatasheetUpload:

{
  "matched_products": [
    {
      "extracted": {"name": "Kraftliner 120", "gsm": 120, "paper_type": "kraftliner", ...},
      "match_type": "exact",
      "confidence": 0.97,
      "matched_product_id": "uuid-of-existing-product",
      "matched_product_name": "Kraftliner Brown 120"
    },
    {
      "extracted": {"name": "Special Anti-Grease 45", "gsm": 45, ...},
      "match_type": "new",
      "confidence": 0.15,
      "matched_product_id": null,
      "matched_product_name": null
    }
  ]
}

Playwright Test Expectations

test('datasheet detail shows matched products with confidence', async ({ page }) => {
  await page.goto('/manage/datasheets/{id}');
  // Should show product match cards
  await expect(page.locator('[data-testid="match-card"]').first()).toBeVisible();
  // Each card shows confidence badge
  await expect(page.locator('[data-testid="confidence-badge"]').first()).toBeVisible();
});

Files to Touch


B2B-064: Admin Review Dashboard

Priority: P0 Type: Frontend Depends on: B2B-063

What

Admin-only screen showing all datasheet uploads as an inbox. For each datasheet: original PDF link, extracted products, AI classification, match results with confidence scores. Admin can accept (creates/links products), edit (modify before accepting), or reject.

Acceptance Criteria

Playwright Test Expectations

test('admin sees datasheet inbox', async ({ page }) => {
  await loginAsAdmin(page);
  await page.goto('/manage/datasheets');
  await expect(page.locator('h1')).toContainText('Datasheet');
  await expect(page.locator('table, [data-testid="datasheet-list"]').first()).toBeVisible();
});

test('admin can open datasheet detail', async ({ page }) => {
  await loginAsAdmin(page);
  await page.goto('/manage/datasheets');
  await page.locator('tr').nth(1).click(); // first data row, assuming index 0 is the header row
  await expect(page.locator('[data-testid="extracted-products"]')).toBeVisible();
  await expect(page.locator('[data-testid="match-card"]').first()).toBeVisible();
});

test('admin can accept a high-confidence match', async ({ page }) => {
  await loginAsAdmin(page);
  await page.goto('/manage/datasheets/{id}');
  const card = page.locator('[data-testid="match-card"]').first();
  await card.locator('button:has-text("Accept")').click();
  await expect(card.locator('[data-testid="status-badge"]')).toContainText('Accepted');
});

test('admin can bulk accept all high-confidence matches', async ({ page }) => {
  await loginAsAdmin(page);
  await page.goto('/manage/datasheets/{id}');
  await page.locator('button:has-text("Accept All")').click();
  await expect(page.locator('[data-testid="accepted-count"]')).toBeVisible();
});

test('admin can reject a datasheet with notes', async ({ page }) => {
  await loginAsAdmin(page);
  await page.goto('/manage/datasheets/{id}');
  await page.locator('button:has-text("Reject")').click();
  await page.locator('[data-testid="reject-notes"]').fill('Wrong mill');
  await page.locator('button:has-text("Confirm Reject")').click();
  await expect(page.locator('[data-testid="status-badge"]')).toContainText('Rejected');
});

test('mill user cannot access datasheets admin', async ({ page }) => {
  await loginAsMill(page);
  await page.goto('/manage/datasheets');
  // Should redirect or show 403
  await expect(page).not.toHaveURL('/manage/datasheets');
});

Files to Touch


B2B-065: Mill Datasheet Upload UI

Priority: P1 Type: Frontend Depends on: B2B-062

What

Mill-facing upload page. Mill user drags/drops a PDF datasheet, sees processing status, and gets confirmation once extraction is complete and the datasheet has been sent for admin review.

Acceptance Criteria

Playwright Test Expectations

test('mill user can upload a PDF datasheet', async ({ page }) => {
  await loginAsMill(page);
  await page.goto('/manage/datasheets/upload');
  await expect(page.locator('[data-testid="upload-zone"]')).toBeVisible();
  await expect(page.locator('text=PDF')).toBeVisible();
});

test('upload zone rejects non-PDF files', async ({ page }) => {
  await loginAsMill(page);
  await page.goto('/manage/datasheets/upload');
  // Try uploading a .txt file
  const fileInput = page.locator('input[type="file"]');
  await fileInput.setInputFiles({ name: 'test.txt', mimeType: 'text/plain', buffer: Buffer.from('hello') });
  await expect(page.locator('[data-testid="error-message"]')).toContainText('PDF');
});

test('mill user sees their upload history', async ({ page }) => {
  await loginAsMill(page);
  await page.goto('/manage/datasheets');
  // Should only see own mill's datasheets
  await expect(page.locator('table, [data-testid="datasheet-list"]').first()).toBeVisible();
});

Files to Touch


B2B-066: Clone Product

Priority: P1 Type: Full-stack Depends on: B2B-051 only

What

"Clone" button on product detail/list that duplicates a product with all its specs, opens the edit form pre-filled. Mill user changes GSM (or other fields) and saves as new product. Killer UX for mills that produce the same paper in 6 different weights.

Acceptance Criteria

API Contract

POST /api/products/{id}/clone/
Response 201: { "id": "new-uuid", "name": "Kraftliner 120 (Copy)", ... }
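
A sketch of what the clone step might do before the edit form opens. It works on a plain dictionary rather than the Product model, and `clone_product` is a hypothetical helper. The " (Copy)" suffix follows the response example above.

```python
import copy
import uuid


def clone_product(product: dict) -> dict:
    """Return an independent copy with a fresh id and a "(Copy)" name suffix."""
    clone = copy.deepcopy(product)  # deep copy so nested lists are not shared
    clone["id"] = str(uuid.uuid4())
    clone["name"] = f'{product["name"]} (Copy)'
    return clone


original = {
    "id": "11111111-1111-1111-1111-111111111111",
    "name": "Kraftliner 120",
    "gsm": 120,
    "certifications": ["FSC"],
}
clone = clone_product(original)
clone["gsm"] = 150
clone["certifications"].append("PEFC")  # must not leak into the original
```

The deep copy matters for list fields such as certifications: with a shallow copy, editing the clone would silently change the original product as well.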

Playwright Test Expectations

test('clone button creates a copy and opens edit form', async ({ page }) => {
  await loginAsMill(page);
  await page.goto('/manage/products/{id}');
  await page.locator('button:has-text("Clone")').click();
  // Should navigate to edit form of new product
  await expect(page).toHaveURL(/\/manage\/products\/.*\/edit/);
  // Name should contain "(Copy)"
  const nameInput = page.locator('input[formControlName="name"]');
  await expect(nameInput).toHaveValue(/\(Copy\)/);
});

test('cloned product is independent from original', async ({ page }) => {
  // Clone, change GSM, save
  await loginAsMill(page);
  await page.goto('/manage/products/{id}');
  await page.locator('button:has-text("Clone")').click();
  await page.locator('input[formControlName="gsm"]').fill('150');
  await page.locator('input[formControlName="name"]').fill('Kraftliner 150');
  await page.locator('button[type="submit"]').click();
  await expect(page.locator('.notification')).toContainText('created');
});

Files to Touch


B2B-067: Excel Template Import

Priority: P1 Type: Full-stack Depends on: B2B-061

What

Downloadable .xlsx template with sample paper product data + dropdown validations. Upload endpoint parses the Excel, validates rows, shows preview with errors highlighted, and imports valid products on confirm.
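
The validate-then-preview step could look roughly like this once the spreadsheet has been parsed into dictionaries (parsing the .xlsx itself is left out). The `validate_rows` name, the required-field list, and the error messages are assumptions, since the acceptance criteria are not listed in this ticket.

```python
REQUIRED_FIELDS = ("name", "paper_type", "gsm")  # assumed; not defined in this ticket


def validate_rows(rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split parsed rows into importable rows and per-row error reports."""
    valid, errors = [], []
    # Row numbers start at 2 on the assumption that row 1 holds the headers.
    for number, row in enumerate(rows, start=2):
        problems = [
            f"{field} is required"
            for field in REQUIRED_FIELDS
            if row.get(field) in (None, "")
        ]
        gsm = row.get("gsm")
        if gsm not in (None, ""):
            try:
                if float(gsm) <= 0:
                    problems.append("gsm must be positive")
            except (TypeError, ValueError):
                problems.append("gsm must be a number")
        if problems:
            errors.append({"row": number, "problems": problems})
        else:
            valid.append(row)
    return valid, errors


rows = [
    {"name": "Kraftliner 120", "paper_type": "kraftliner", "gsm": 120},
    {"name": "", "paper_type": "testliner", "gsm": "abc"},
]
valid, errors = validate_rows(rows)
```

Returning both lists lets the preview show the valid count and highlight the failing rows, while the confirm step imports only `valid`.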

Acceptance Criteria

Playwright Test Expectations

test('download template button returns xlsx file', async ({ page }) => {
  await loginAsMill(page);
  await page.goto('/manage/products');
  const [download] = await Promise.all([
    page.waitForEvent('download'),
    page.locator('button:has-text("Template")').click(),
  ]);
  expect(download.suggestedFilename()).toMatch(/\.xlsx$/);
});

test('excel upload shows preview with validation', async ({ page }) => {
  await loginAsMill(page);
  await page.goto('/manage/products/import');
  // Upload a valid xlsx
  await page.locator('input[type="file"]').setInputFiles('test-fixtures/sample-products.xlsx');
  await expect(page.locator('[data-testid="preview-table"]')).toBeVisible();
  await expect(page.locator('[data-testid="valid-count"]')).toContainText(/\d+/);
});

test('confirm import creates products', async ({ page }) => {
  await loginAsMill(page);
  await page.goto('/manage/products/import');
  await page.locator('input[type="file"]').setInputFiles('test-fixtures/sample-products.xlsx');
  await page.locator('button:has-text("Import")').click();
  await expect(page.locator('.notification')).toContainText('imported');
});

Files to Touch


B2B-068: Empty State + Onboarding Flow

Priority: P2 Type: Frontend Depends on: B2B-065, B2B-067

What

When a mill user logs in and has zero products, show a guided onboarding screen instead of an empty table. Three clear paths: upload PDF datasheet, import from Excel, or add manually.

Acceptance Criteria

Playwright Test Expectations

test('new mill user sees onboarding empty state', async ({ page }) => {
  await loginAsNewMill(page); // mill with 0 products
  await page.goto('/manage/products');
  await expect(page.locator('[data-testid="empty-state"]')).toBeVisible();
  await expect(page.locator('text=Upload Datasheet')).toBeVisible();
  await expect(page.locator('text=Import from Excel')).toBeVisible();
  await expect(page.locator('text=Add Manually')).toBeVisible();
});

test('empty state disappears when products exist', async ({ page }) => {
  await loginAsMill(page); // mill WITH products
  await page.goto('/manage/products');
  await expect(page.locator('[data-testid="empty-state"]')).not.toBeVisible();
  await expect(page.locator('table')).toBeVisible();
});

test('dashboard shows catalog empty warning for new mill', async ({ page }) => {
  await loginAsNewMill(page);
  await page.goto('/manage');
  await expect(page.locator('[data-testid="catalog-empty-card"]')).toBeVisible();
});

Files to Touch


Summary: Build Order

Sprint    Tickets                    What                                         Estimated
Sprint 1  B2B-060, B2B-066           Sync bridge + Clone product                  1 day
Sprint 2  B2B-061, B2B-062           Upload model + AI extraction                 2 days
Sprint 3  B2B-063, B2B-064, B2B-065  Matching + Admin dashboard + Mill upload UI  3 days
Sprint 4  B2B-067, B2B-068           Excel import + Onboarding flow               2 days

Total: ~8 days of dev work (backend and frontend work can run in parallel in sprints 2-4).


Tech Notes

Extractor Pipeline Reference

Field Mapping: Extractor → Marketplace

Extractor (catalog_products)   Marketplace (Product)
name                           name
category                       paper_type (needs mapping)
gsm                            gsm
coating                        coating
color                          color
width_mm                       width_mm
height_mm                      length_mm
presentation                   form
certifications (comma-sep)     certifications (JSON array)
fiber_source                   fiber_type
quality_grade                  quality
product_code                   product_code
brand                          brand
fingerprint                    extractor_fingerprint (new)
mill_id + mill_name            mill (FK)
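
A sketch of the field mapping as code, working on plain dictionaries. `FIELD_MAP` and `to_marketplace` are hypothetical names. The category and mill columns are left out here because they need a lookup rather than a rename.

```python
# Straight renames taken from the table above (Extractor column -> Product field).
FIELD_MAP = {
    "name": "name",
    "gsm": "gsm",
    "coating": "coating",
    "color": "color",
    "width_mm": "width_mm",
    "height_mm": "length_mm",
    "presentation": "form",
    "fiber_source": "fiber_type",
    "quality_grade": "quality",
    "product_code": "product_code",
    "brand": "brand",
    "fingerprint": "extractor_fingerprint",
}


def to_marketplace(row: dict) -> dict:
    """Rename Extractor columns and split certifications into a list."""
    product = {target: row.get(source) for source, target in FIELD_MAP.items()}
    raw = row.get("certifications") or ""
    # Comma-separated string becomes a list that serializes as a JSON array.
    product["certifications"] = [part.strip() for part in raw.split(",") if part.strip()]
    return product


product = to_marketplace({
    "name": "Kraftliner 120",
    "gsm": 120,
    "height_mm": 1000,
    "presentation": "reel",
    "certifications": "FSC, PEFC,",
    "fingerprint": "fp-1",
})
```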

Category Mapping: Extractor → Marketplace paper_type

Extractor category           Marketplace paper_type
Kraftliner                   kraftliner
Testliner                    testliner
Fluting Medium               fluting
Coated Paper C2S             coated
Uncoated Woodfree            writing
Newsprint                    newsprint
Folding Boxboard (FBB)       board
Solid Bleached Board (SBS)   board
Kraft Paper                  kraft
Thermal Paper                thermal
NCR / Carbonless Paper       ncr
Greaseproof Paper            greaseproof
(others)                     other
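
The category table could be expressed as a lookup with "other" as the fallback. The names `CATEGORY_TO_PAPER_TYPE` and `map_category` are hypothetical, and matching on the exact, whitespace-trimmed category string is an assumption about how the Extractor values arrive.

```python
CATEGORY_TO_PAPER_TYPE = {
    "Kraftliner": "kraftliner",
    "Testliner": "testliner",
    "Fluting Medium": "fluting",
    "Coated Paper C2S": "coated",
    "Uncoated Woodfree": "writing",
    "Newsprint": "newsprint",
    "Folding Boxboard (FBB)": "board",
    "Solid Bleached Board (SBS)": "board",
    "Kraft Paper": "kraft",
    "Thermal Paper": "thermal",
    "NCR / Carbonless Paper": "ncr",
    "Greaseproof Paper": "greaseproof",
}


def map_category(category) -> str:
    """Map an Extractor category to a Marketplace paper_type, defaulting to "other"."""
    return CATEGORY_TO_PAPER_TYPE.get((category or "").strip(), "other")


fluting = map_category("Fluting Medium")
fallback = map_category("Tissue")
```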