- type: entity
- created: Tue Apr 07 2026 02:00:00 GMT+0200 (Central European Summer Time)
- updated: Tue Apr 07 2026 02:00:00 GMT+0200 (Central European Summer Time)
- sources: raw/notes/projectbrief, raw/notes/techContext
- tags: scraper crawlee fastapi saas data-collection sibling-project
B2BPaper Scraper
Overview
The B2BPaper Scraper is a separate project that collects data from paper industry sources to feed the marketplace. It lives in its own repository at /home/claude/customers/b2bpaper-scraper/ and is distinct from the main wiki/entities/b2bpaper marketplace application.
Tech Stack
- Scraping framework: Crawlee (web scraping and browser automation library for Node.js)
- API layer: FastAPI (Python)
- Purpose: Automated data collection from paper industry websites, price indices, supplier catalogs, and other sources
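Because the crawler runs on Node.js and the API layer on Python, scraped records have to cross a language boundary somewhere. The sketch below shows one way the Python side could read them, assuming the crawler exports newline-delimited JSON. The export format, the field names (`source_url`, `mill`, `product`, `price_eur_per_tonne`), and the function name `parse_scraped_lines` are hypothetical and not taken from the project.

```python
import json
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class ScrapedRecord:
    # Hypothetical fields; the real scraper schema is not documented here.
    source_url: str
    mill: str
    product: str
    price_eur_per_tonne: float


def parse_scraped_lines(lines: Iterable[str]) -> Tuple[List[ScrapedRecord], List[str]]:
    """Parse JSON Lines into records; collect a message for each rejected line."""
    records: List[ScrapedRecord] = []
    errors: List[str] = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            raw = json.loads(line)
            records.append(
                ScrapedRecord(
                    source_url=str(raw["source_url"]),
                    mill=str(raw["mill"]),
                    product=str(raw["product"]),
                    price_eur_per_tonne=float(raw["price_eur_per_tonne"]),
                )
            )
        except (json.JSONDecodeError, KeyError, TypeError, ValueError) as exc:
            # Malformed JSON, missing keys, and non-numeric prices all land here.
            errors.append(f"line {number}: {exc!r}")
    return records, errors
```

Collecting errors instead of raising lets one bad row be reported without discarding the rest of a crawl, which seems a reasonable default for scraped data of uneven quality.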
Build Status
The scraper project has its own story-based build tracker:
- 24 out of 56 stories complete at last count
- Stopped at SaaS-060 -- the project was paused during a SaaS pivot phase
- The most recent work involved buyer RFQ creation (target price, deadline, mill notifications)
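The RFQ fields named above (target price, deadline, mill notifications) can be pictured as a small data structure. This is a sketch only: the class name `BuyerRFQ`, its field names, the per-tonne unit, and the validation rules are assumptions for illustration, not the project's actual model.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import List


@dataclass
class BuyerRFQ:
    # Hypothetical model: names, units, and rules are illustrative assumptions.
    buyer_id: str
    product: str
    target_price: Decimal  # assumed to be a price per tonne
    deadline: date         # last day on which mills may respond
    notified_mills: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.target_price <= 0:
            raise ValueError("target_price must be positive")

    def is_open(self, today: date) -> bool:
        """An RFQ stays open through its deadline day."""
        return today <= self.deadline

    def notify(self, mill_id: str) -> bool:
        """Record a mill notification once; return False for a repeat."""
        if mill_id in self.notified_mills:
            return False
        self.notified_mills.append(mill_id)
        return True
```

`Decimal` is used for the price so that monetary values are not subject to binary floating-point rounding.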
SaaS Pivot
The scraper project was pivoting from a data collection tool for the marketplace into a standalone SaaS product. The pivot was still in progress when development was paused. Key SaaS-related work included:
- Mill tier system (B2B-401)
- Mill trust score engine (B2B-400)
- Fee calculation engine (B2B-300)
- Document generation (B2B-311)
- Buyer RFQ creation (SaaS-060)
Relationship to Main Marketplace
The scraper serves as the Data Bootstrap Layer in the wiki/concepts/four-layer-architecture. While the main marketplace handles entity management, matching, and output, the scraper populates the initial data -- product catalogs, mill information, pricing data -- that the marketplace then operates on.
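As a rough illustration of the bootstrap role, the sketch below turns flat scraped rows into de-duplicated Mills and Products collections that a marketplace could then load. The row keys, the `bootstrap_entities` name, and the choice to key mills by a normalized name are assumptions; the source does not describe the actual loading mechanism.

```python
from typing import Dict, Iterable, List, Tuple


def bootstrap_entities(
    rows: Iterable[Dict[str, str]],
) -> Tuple[Dict[str, Dict[str, str]], List[Dict[str, str]]]:
    """Split flat scraped rows into mills (keyed by id) and product entries."""
    mills: Dict[str, Dict[str, str]] = {}
    products: List[Dict[str, str]] = []
    for row in rows:
        # Hypothetical id scheme: lower-cased name, whitespace runs become hyphens.
        mill_id = "-".join(row["mill"].lower().split())
        # The first spelling seen for a mill is kept as its display name.
        mills.setdefault(mill_id, {"id": mill_id, "name": row["mill"].strip()})
        products.append({"mill_id": mill_id, "name": row["product"].strip()})
    return mills, products
```

Keying mills by a normalized name means the same mill scraped from two sources with different capitalization or spacing collapses into one entity, while each product row keeps a reference back to its mill.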
Monitoring
The main marketplace project includes monitoring dashboards for the scraper:
- docs/scrape-monitor.html -- scraper status dashboard
- docs/explore.html -- scraped data browser
Sources
- raw/notes/projectbrief -- mentions scraper as part of the tech ecosystem
- raw/notes/techContext -- references data collection pipeline
Related
- wiki/entities/b2bpaper -- the main marketplace this scraper feeds
- wiki/entities/deploystaff -- built and maintained by Rafael's team
- wiki/concepts/four-layer-architecture -- scraper feeds the Data Bootstrap Layer
- wiki/concepts/six-core-entities -- scraper populates Products and Mills entities