System Design Interview Questions

Learn how to approach system design interviews end-to-end. Covers scalability, load balancing, databases, caching, message queues, and how to structure your answer under time pressure.

System design interviews test your ability to think through the architecture of large-scale distributed systems — the kind that serve millions of users. Unlike coding rounds, there is no single correct answer. Interviewers are evaluating your structured thinking, your awareness of trade-offs, and whether you can drive a conversation toward a coherent design.

The most common mistake candidates make is jumping straight into components before understanding requirements. Before drawing a single box, clarify scope, estimate scale, and define constraints. Everything flows from that.

Key Concepts

Requirements & Scope Clarification

Always start by asking questions. What are the functional requirements? What is the scale (requests per second, data volume, user count)? What are the non-functional requirements — availability, consistency, latency targets? Spending 3-5 minutes here prevents 20 minutes of designing the wrong system.

Capacity Estimation

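Back-of-envelope calculations signal that you think about scale concretely. Estimate QPS (queries per second), storage requirements, and bandwidth. Rule of thumb: 1M DAU × 10 requests per user per day is 10M requests/day, or roughly 100 QPS on average (10M ÷ 86,400 seconds ≈ 116); size for peak traffic, not just the average. For storage, assume roughly 300 bytes per tweet-sized object (280 bytes of text plus metadata): 500M tweets/day × 300 bytes ≈ 150 GB/day of new data written.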
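The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the 300-byte object size is an assumption (280 bytes of text plus about 20 bytes of metadata) chosen to match the 150 GB/day figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(daily_active_users, requests_per_user_per_day):
    """Average queries per second, spread evenly over one day."""
    return daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY


def daily_storage_gb(objects_per_day, bytes_per_object):
    """New data written per day, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return objects_per_day * bytes_per_object / 1e9


qps = average_qps(1_000_000, 10)              # 1M DAU, 10 requests each
storage = daily_storage_gb(500_000_000, 300)  # 500M objects, ~300 bytes each (assumed)

print(f"average QPS: {qps:.0f}")           # prints "average QPS: 116"
print(f"daily writes: {storage:.0f} GB")   # prints "daily writes: 150 GB"
```

In an interview, rounding 116 down to "about 100 QPS" is fine; the point is the order of magnitude, not the exact figure.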

Load Balancing & Horizontal Scaling

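A single server has a ceiling. Load balancers distribute traffic across a fleet of stateless application servers. Horizontal scaling (more machines) usually beats vertical scaling (a bigger machine) for availability and cost: losing one of many servers is survivable, and capacity can be added in small increments. Consistent hashing minimises cache misses and data movement when nodes are added or removed, because only the keys owned by the affected node are remapped.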
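To make the consistent hashing claim concrete, here is a minimal hash ring with virtual nodes. It is a sketch under stated assumptions, not a production implementation: the class name HashRing, the node names, the choice of SHA-256, and the figure of 100 virtual nodes per server are all illustrative.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes      # virtual points per physical node
        self._positions = []      # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node):
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            if pos in self._owner:   # skip the (very unlikely) collision
                continue
            self._owner[pos] = node
            bisect.insort(self._positions, pos)

    def remove(self, node):
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            if self._owner.get(pos) == node:
                del self._owner[pos]
                self._positions.remove(pos)

    def lookup(self, key):
        if not self._positions:
            raise LookupError("ring is empty")
        # First position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._positions, self._hash(key))
        return self._owner[self._positions[idx % len(self._positions)]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"user:{i}" for i in range(10_000)]

before = {k: ring.lookup(k) for k in keys}
ring.add("cache-d")
after = {k: ring.lookup(k) for k in keys}

moved = sum(before[k] != after[k] for k in keys)
print(f"{moved / len(keys):.0%} of keys moved after adding a fourth node")
```

With four nodes, roughly a quarter of the keys should move, and every moved key lands on the new node. A naive hash(key) % N scheme would instead remap most keys whenever N changes.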

Databases: SQL vs NoSQL

SQL databases (Postgres, MySQL) give you ACID transactions and relational joins, which makes them a good fit for complex queries and financial data. NoSQL databases (Cassandra, DynamoDB, MongoDB) typically trade some consistency guarantees and join support for horizontal write scalability and flexible schemas. The choice depends on your access patterns, not a blanket preference.
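As a small illustration of what ACID atomicity buys you, the sketch below uses Python's built-in sqlite3 module as a stand-in for Postgres or MySQL. The accounts table, the transfer function, and the balances are hypothetical; the point is that a failed transfer leaves no partial update behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts atomically; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:  # CHECK constraint rejected an overdraft
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


rejected = transfer(conn, "alice", "bob", 500)   # alice cannot cover this
after_rejected = balances(conn)                  # credit to bob was rolled back
accepted = transfer(conn, "alice", "bob", 30)

print(rejected, after_rejected)    # prints "False {'alice': 100, 'bob': 50}"
print(accepted, balances(conn))    # prints "True {'alice': 70, 'bob': 80}"
```

The credit to bob is deliberately applied first so that the rollback is visible: without the transaction, the failed overdraft would leave bob 500 richer.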

Caching

A cache sits in front of your database and absorbs read traffic. Redis and Memcached are the standard choices. Cache-aside (read from cache, miss → read DB, populate cache) is the most common pattern. Know how cached data is kept fresh: TTL expiry, explicit invalidation on write, and the write-through and write-back policies. Cache stampede, where many requests miss on the same hot key at once and all fall through to the database, is a failure mode worth mentioning.
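The cache-aside read path and TTL expiry can be sketched in a few lines. Everything here is an in-memory stand-in: TTLCache plays the role of Redis or Memcached, FakeDatabase plays the role of the real database, and the names get_user and update_user are hypothetical.

```python
import time


class TTLCache:
    """Tiny in-memory cache where every entry expires after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:  # expired: treat as a miss
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self.clock() + self.ttl)

    def delete(self, key):
        self._entries.pop(key, None)


class FakeDatabase:
    """Stand-in for the real database; counts reads so misses are visible."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.rows.get(key)

    def write(self, key, value):
        self.rows[key] = value


def get_user(cache, db, key):
    """Cache-aside read: try the cache, fall back to the DB, then populate."""
    value = cache.get(key)
    if value is not None:
        return value
    value = db.read(key)
    if value is not None:
        cache.set(key, value)
    return value


def update_user(cache, db, key, value):
    """Write to the DB, then invalidate so the next read repopulates."""
    db.write(key, value)
    cache.delete(key)


db = FakeDatabase({"user:1": "Ada"})
cache = TTLCache(ttl_seconds=60)

get_user(cache, db, "user:1")   # miss: reads the database
get_user(cache, db, "user:1")   # hit: served from the cache
print(db.reads)                 # prints 1
```

Note that this sketch does nothing about cache stampede: if many callers miss on the same key at once, each of them reads the database.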

Message Queues & Async Processing

Decoupling producers from consumers with a queue (Kafka, SQS, RabbitMQ) improves resilience and lets you absorb traffic spikes. Use queues for work that doesn't need a synchronous response: sending emails, processing uploads, fan-out notifications. Kafka's log-based model also enables event sourcing and replay.
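The producer/consumer split can be sketched with Python's standard queue and threading modules. This is an in-process stand-in for a real broker such as Kafka, SQS, or RabbitMQ, not an example of any of their client APIs; handle_signup, email_worker, and the addresses are hypothetical.

```python
import queue
import threading


def handle_signup(email, jobs):
    """Request handler: enqueue the slow work and return immediately."""
    jobs.put({"type": "welcome_email", "to": email})
    return "202 Accepted"


def email_worker(jobs, sent):
    """Consumer: drain jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        try:
            if job is None:         # sentinel: shut down cleanly
                return
            sent.append(job["to"])  # stand-in for actually sending an email
        finally:
            jobs.task_done()


jobs = queue.Queue(maxsize=1000)  # bounded, so producers block when it is full
sent = []

consumer = threading.Thread(target=email_worker, args=(jobs, sent), daemon=True)
consumer.start()

responses = [
    handle_signup(addr, jobs)
    for addr in ["a@example.com", "b@example.com", "c@example.com"]
]

jobs.put(None)   # ask the worker to stop after the queued jobs
jobs.join()      # wait until every queued item has been processed
consumer.join()

print(responses[0])  # prints "202 Accepted"
print(sent)          # prints "['a@example.com', 'b@example.com', 'c@example.com']"
```

The handler returns before any email is sent, which is the decoupling the paragraph describes. A real broker adds what this sketch lacks: durability across restarts, retries, and multiple consumers on separate machines.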

Sample Interview Questions

How would you design a URL shortener like bit.ly?

What is the CAP theorem and what does it mean in practice?

How does a CDN improve performance and what are its limits?
