System Design - Data Engineering
#01 Data Engineering

Build Amazon's Product Search Ranking (Premium)

Design a search ranking system that incorporates real-time inventory, pricing, and sales velocity signals into results, personalizes rankings per user, and re-indexes a product catalog of 500 million items.

#02 Data Engineering

Build Amazon's Review Aggregation Pipeline (Premium)

Design a review processing pipeline that deduplicates submissions, filters fraudulent reviews using behavioral signals, computes rolling star averages, and surfaces the most helpful reviews per product.

#03 Data Engineering

Build Gmail's Spam Detection Pipeline (Premium)

Design a spam classification system that processes 300 billion emails per day, classifies each one in under 100ms before delivery, and continuously retrains models as spammers adapt their tactics.

#04 Data Engineering

Build Google Photos Duplicate Detection Pipeline (Premium)

Design a pipeline that detects near-duplicate photos across 28 billion uploads per day using perceptual similarity rather than byte-level matching, and does so without storing a full copy of every image.

#05 Scalability

Build Instagram Explore Real-Time Recommendation Engine (Premium)

Design a content discovery system that generates a personalized Explore grid for 2 billion users, refreshes recommendations as users scroll, and surfaces trending content within minutes of it going viral.

#06 Data Engineering

Build Instagram Stories Expiry and Archival Pipeline (Free)

Design a system that automatically expires 500 million Stories after 24 hours, moves them to cold archival storage, and lets users retrieve archived Stories on demand without impacting live traffic.

#07 Data Engineering

Build Netflix's Recommendation Engine (Premium)

Design a personalization engine that generates a unique homepage for 280 million users, blends collaborative filtering with content-based signals, and updates recommendations within hours of new viewing behavior.

#08 Scalability

Build a QR Code Generation and Analytics Service (Free)

Design a system that generates dynamic QR codes at scale, tracks every scan with device and location metadata, and lets users update the destination URL without regenerating the QR code.

#09 Observability

Build a Real-Time Log Aggregation Pipeline (Premium)

Design a log ingestion and querying system that handles 1 million events per second, supports full-text search with sub-second latency, and retains 90 days of logs at a sustainable storage cost.

#10 Scalability

Build Twitter's Trending Topics Pipeline (Premium)

Design a system that detects emerging viral signals from a stream of 500,000 tweets per second, computes trending topics per region, and refreshes trends every 60 seconds without reprocessing the full history.
