type: entity
created: Tue Apr 07 2026 02:00:00 GMT+0200 (Central European Summer Time)
updated: Tue Apr 07 2026 02:00:00 GMT+0200 (Central European Summer Time)
sources: raw/notes/projectbrief, raw/notes/techContext
tags: scraper crawlee fastapi saas data-collection sibling-project

B2BPaper Scraper

abstract

The B2BPaper Scraper is a sibling project to the main marketplace, built with Crawlee and FastAPI. It collects data from paper industry sources and was partway through a pivot to a standalone SaaS product when development was paused.

Overview

The B2BPaper Scraper is a separate project that collects data from paper industry sources to feed the marketplace. It lives in its own repository at /home/claude/customers/b2bpaper-scraper/ and is distinct from the main wiki/entities/b2bpaper marketplace application.

Tech Stack

Scraping framework: Crawlee (Node.js-based web scraping library)
API layer: FastAPI (Python)
Purpose: Automated data collection from paper industry websites, price indices, supplier catalogs, and other sources

Build Status

The scraper project has its own story-based build tracker:

24 of 56 stories complete at last count (about 43%)
Stopped at SaaS-060 -- development was paused during the SaaS pivot
The most recent work involved buyer RFQ creation (target price, deadline, mill notifications)
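
The RFQ fields named above (target price, deadline, mill notifications) could be modeled roughly as follows. This is a minimal sketch, not the project's actual schema: the class name `BuyerRFQ`, every field name, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal


@dataclass
class BuyerRFQ:
    """Hypothetical shape of a buyer RFQ; all field names are assumptions."""
    product: str
    target_price: Decimal          # buyer's target price per unit
    deadline: date                 # last day mills may respond
    notify_mill_ids: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject RFQs that could never be matched against a quote.
        if self.target_price <= 0:
            raise ValueError("target_price must be positive")

    def is_open(self, today: date) -> bool:
        """An RFQ accepts quotes up to and including its deadline."""
        return today <= self.deadline

    def notifications(self) -> list[str]:
        """One notification message per mill, with duplicate mills dropped."""
        unique_ids = dict.fromkeys(self.notify_mill_ids)  # preserves order
        return [
            f"mill {mill_id}: new RFQ for {self.product}, "
            f"target {self.target_price}, due {self.deadline.isoformat()}"
            for mill_id in unique_ids
        ]


# Made-up sample data, for illustration only.
rfq = BuyerRFQ(
    product="kraft liner 125 gsm",
    target_price=Decimal("640.00"),
    deadline=date(2026, 5, 1),
    notify_mill_ids=["mill-a", "mill-b", "mill-a"],
)
```

Using `Decimal` rather than `float` for the target price avoids rounding surprises when prices are later compared or fed into fee calculations.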

SaaS Pivot

The scraper project was pivoting from a data collection tool for the marketplace into a standalone SaaS product. The pivot was still in progress when development was paused. Key SaaS-related work included:

Mill tier system (B2B-401)
Mill trust score engine (B2B-400)
Fee calculation engine (B2B-300)
Document generation (B2B-311)
Buyer RFQ creation (SaaS-060)

Relationship to Main Marketplace

The scraper serves as the Data Bootstrap Layer in the wiki/concepts/four-layer-architecture. While the main marketplace handles entity management, matching, and output, the scraper populates the initial data -- product catalogs, mill information, pricing data -- that the marketplace then operates on.
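
As a rough illustration of the bootstrap role described above, scraped records could be grouped by kind before the marketplace loads them. This is a sketch under stated assumptions: `ScrapedRecord`, `build_bootstrap`, the three kind labels, and the example URLs are hypothetical and do not come from the scraper's codebase.

```python
from dataclasses import dataclass


@dataclass
class ScrapedRecord:
    """One scraped item; `kind` and `payload` are hypothetical names."""
    kind: str        # assumed labels: "product", "mill", or "price"
    source_url: str  # page the record was scraped from
    payload: dict


def build_bootstrap(records: list[ScrapedRecord]) -> dict[str, list[dict]]:
    """Group scraped records by kind so each set can be loaded separately."""
    bootstrap: dict[str, list[dict]] = {"product": [], "mill": [], "price": []}
    for record in records:
        if record.kind not in bootstrap:
            continue  # skip kinds this sketch does not treat as bootstrap data
        # Keep the source URL with each row so its origin stays traceable.
        bootstrap[record.kind].append(
            {**record.payload, "source_url": record.source_url}
        )
    return bootstrap


# Made-up sample records, for illustration only.
records = [
    ScrapedRecord("mill", "https://example.com/mills/1", {"name": "Example Mill"}),
    ScrapedRecord("product", "https://example.com/p/1", {"name": "Example Board"}),
    ScrapedRecord("news", "https://example.com/n/1", {"title": "ignored"}),
]
bootstrap = build_bootstrap(records)
```

Here `bootstrap["mill"]` and `bootstrap["product"]` each hold one row, `bootstrap["price"]` stays empty, and the `news` record is dropped.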

Monitoring

The main marketplace project includes monitoring dashboards for the scraper:

docs/scrape-monitor.html -- scraper status dashboard
docs/explore.html -- scraped data browser

Sources

raw/notes/projectbrief -- mentions scraper as part of the tech ecosystem
raw/notes/techContext -- references data collection pipeline

Related

wiki/entities/b2bpaper -- the main marketplace this scraper feeds
wiki/entities/deploystaff -- built and maintained by Rafael's team
wiki/concepts/four-layer-architecture -- scraper serves as the Data Bootstrap Layer
wiki/concepts/six-core-entities -- scraper populates Products and Mills entities
