Top System Design Interview Questions
System Design is the process of defining the architecture, components, modules, interfaces, and data flow of a system to meet specific requirements. It is a foundational discipline in software engineering that focuses on how different parts of a system interact and work together to achieve scalability, reliability, efficiency, and maintainability.
At its core, system design answers this question:
“How do we build a system that solves a problem effectively at scale?”
It goes beyond writing code: it involves planning the structure of a system before implementation. This includes deciding how data flows, how services communicate, how users interact with the system, and how the system handles growth, failures, and performance demands.
System design is commonly used when building large-scale applications such as social media platforms, e-commerce systems, banking software, and cloud-based services.
Architecture defines the high-level structure of the system. It determines:
- Whether the system is monolithic or microservices-based
- How components interact
- Deployment strategies

Common architectural patterns include:

- Monolithic architecture: all components are tightly integrated into a single codebase
- Microservices architecture: the system is divided into small, independent services
- Client-server architecture: separation between the frontend (client) and backend (server)
Scalability refers to a system’s ability to handle increased load without performance degradation.
There are two types:

- Vertical scaling (scaling up): adding more power (CPU, RAM) to a single machine
- Horizontal scaling (scaling out): adding more machines to distribute the load
A well-designed system anticipates growth and can scale seamlessly.
Reliability ensures that a system functions correctly and consistently over time, even when components fail.
Techniques to improve reliability:

- Redundancy (backup systems)
- Failover mechanisms
- Health checks and monitoring
Availability measures how often a system is operational and accessible.
High-availability systems aim for minimal downtime, often expressed as:

- 99.9% uptime ("three nines", roughly 8.8 hours of downtime per year)
- 99.99% uptime ("four nines", roughly 53 minutes of downtime per year)

This is achieved using:

- Load balancing
- Replication
- Distributed systems
Performance is about how fast a system responds to requests.
Factors affecting performance:

- Latency (time taken to respond)
- Throughput (number of requests handled per second)

Optimization techniques include:

- Caching
- Database indexing
- Efficient algorithms
Data storage is a critical aspect of system design.
Types of databases:

- Relational databases (SQL): structured data (e.g., MySQL, PostgreSQL)
- NoSQL databases: flexible schema (e.g., MongoDB, Cassandra)

Important considerations:

- Data consistency
- Data partitioning (sharding)
- Replication
Load balancers distribute incoming traffic across multiple servers to prevent overload.
Benefits:

- Improves performance
- Increases availability
- Prevents single points of failure
Caching stores frequently accessed data in fast storage (like memory) to reduce latency.
Examples:

- Browser cache
- CDN (Content Delivery Network)
- In-memory caches like Redis
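The idea behind an in-memory cache can be shown with a minimal sketch. This is a toy model, not how Redis itself works: a plain dictionary with per-entry expiry (the `TTLCache` name and its methods are illustrative, not a real library API).

```python
import time

class TTLCache:
    """Toy in-memory cache with per-entry expiry (a sketch, not production code)."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None  # cache miss
        value, expires_at = entry
        if time.time() > expires_at:
            del self.store[key]  # expired: evict lazily on read
            return None
        return value

    def set(self, key, value):
        self.store[key] = (value, time.time() + self.ttl)

cache = TTLCache(ttl_seconds=30)
cache.set("user:42", {"name": "Alice"})
print(cache.get("user:42"))  # {'name': 'Alice'}
```

Real caches add eviction policies (LRU, LFU) and bounded memory on top of this basic get/set-with-expiry model.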
Security ensures that the system protects data and prevents unauthorized access.
Key areas:

- Authentication (who are you?)
- Authorization (what can you do?)
- Data encryption
- Secure APIs
Monitoring tracks system performance and health, while logging records events for debugging.
Tools help detect:

- Failures
- Bottlenecks
- Security issues
Designing a system typically follows a structured approach:

1. Gather requirements
   - Functional requirements (what the system should do)
   - Non-functional requirements (performance, scalability, etc.)
2. Create a high-level design
   - Identify components (frontend, backend, database)
   - Define interactions
   - Create architecture diagrams
3. Work out the detailed design
   - Define data models
   - Specify APIs
   - Design internal logic
4. Identify bottlenecks
   - Analyze potential weak points
   - Plan optimizations
5. Iterate
   - Improve the design based on feedback
   - Adapt to new requirements
Let’s take a simple example to understand system design in practice: a URL shortener.

Requirements:

- Convert long URLs into short ones
- Redirect users to the original URLs
- Handle millions of requests

Basic flow:

- User sends a URL → server generates a short code
- Store the mapping in a database
- When a user accesses the short URL → fetch the original URL → redirect

Challenges:

- Generating unique short codes
- Handling high traffic
- Ensuring fast redirection

Solutions:

- Use hashing or Base62 encoding
- Cache frequently accessed URLs
- Use distributed databases
System design is critical because:

- Scalability planning: prevents systems from breaking under growth
- Cost efficiency: optimizes infrastructure usage
- Maintainability: makes systems easier to update and debug
- Reliability: ensures consistent performance
- User experience: faster, more reliable systems lead to better UX
Some widely used design patterns include:
- Microservices pattern
- Event-driven architecture
- CQRS (Command Query Responsibility Segregation)
- API gateway pattern
- Circuit breaker pattern
These patterns help solve recurring design problems efficiently.
System designers use various tools:
- Cloud platforms: AWS, Azure, Google Cloud
- Containerization: Docker, Kubernetes
- Databases: MySQL, MongoDB
- Messaging systems: Kafka, RabbitMQ
These tools help build scalable and distributed systems.
Designing systems is complex due to:
- Trade-offs (e.g., consistency vs. availability)
- Changing requirements
- Handling large-scale data
- Ensuring security
Engineers must balance these factors carefully.
System design is a crucial skill for building modern software systems. It involves making strategic decisions about architecture, scalability, performance, and reliability. A well-designed system not only meets current requirements but is also prepared for future growth and challenges.
Whether you're designing a small application or a large-scale distributed system, understanding system design principles helps you create efficient, robust, and scalable solutions.
What is System Design?
Answer:
System Design is the process of defining the architecture, components, modules, interfaces, and data flow of a system to meet specific requirements.
It involves:

- Breaking down a problem into smaller parts
- Deciding how components interact
- Ensuring scalability, reliability, and performance

Example: designing a URL shortener like Bitly.
What are the main components of a typical system?
Answer:
A typical system includes:

- Client → browser / mobile app
- Server → handles business logic
- Database → stores data
- Cache → speeds up responses
- Load balancer → distributes traffic

Think of it as: User → API → Server → DB → Response
What is scalability, and what are its types?
Answer:
Scalability is the system’s ability to handle increasing traffic.

Vertical scaling:
- Increase the power of one machine (CPU/RAM)
- Limited and expensive

Horizontal scaling:
- Add more machines
- Preferred in modern systems
What is a load balancer, and why is it used?
Answer:
A load balancer distributes incoming requests across multiple servers.

- Prevents overload on one server
- Improves availability
- Ensures high performance

Common algorithms:

- Round Robin
- Least Connections
- IP Hash
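Two of these strategies can be sketched in a few lines. This is an illustrative toy, not a real load balancer; the class and function names are made up for the example.

```python
import itertools

class RoundRobinBalancer:
    """Round Robin: hand each incoming request to the next server in turn."""

    def __init__(self, servers):
        self.cycle = itertools.cycle(servers)  # endless rotation over the pool

    def next_server(self):
        return next(self.cycle)

def least_connections(active_connections):
    """Least Connections: pick the server currently handling the fewest requests."""
    return min(active_connections, key=active_connections.get)

lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
print([lb.next_server() for _ in range(4)])  # ['app-1', 'app-2', 'app-3', 'app-1']
print(least_connections({"app-1": 12, "app-2": 3, "app-3": 7}))  # app-2
```

IP Hash, by contrast, hashes the client's IP so the same client consistently lands on the same server (useful for sticky sessions).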
What is the difference between SQL and NoSQL databases?
Answer:

| Feature | SQL | NoSQL |
|---|---|---|
| Structure | Structured | Flexible |
| Schema | Fixed | Dynamic |
| Scaling | Vertical | Horizontal |
| Examples | MySQL, PostgreSQL | MongoDB, Cassandra |

When to use:

- SQL → transactions (banking)
- NoSQL → large-scale apps (social media)
What is caching?
Answer:
Caching stores frequently accessed data in memory for faster access.

- Reduces DB load
- Improves response time

Common tools:

- Redis
- Memcached

Example: storing user profile data temporarily.
What is a CDN?
Answer:
A CDN (Content Delivery Network) delivers content from servers closer to users.

- Faster loading
- Reduced latency

Example: images and videos served from a nearby server.
What is the CAP theorem?
Answer:
The CAP theorem states that a distributed system can only guarantee 2 of these 3 properties at once:

- Consistency → same data everywhere
- Availability → always responds
- Partition tolerance → works despite network failures

Trade-offs:

- CP (banking systems)
- AP (social media)
What is a message queue?
Answer:
A message queue enables asynchronous communication between services.

- Decouples services
- Improves scalability

Popular tools:

- Kafka
- RabbitMQ

Example: order placed → queue → payment service processes it later.
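The order-placed example can be sketched with Python's standard-library `queue` as a stand-in for a broker like Kafka or RabbitMQ. The producer and consumer never call each other directly; they only share the queue. Names like `payment_worker` are invented for the illustration.

```python
import queue
import threading

orders = queue.Queue()  # stand-in for the message broker
processed = []

def payment_worker():
    """Consumer: processes 'order placed' events asynchronously."""
    while True:
        order_id = orders.get()
        if order_id is None:        # sentinel value: shut down the worker
            break
        processed.append(order_id)  # stand-in for charging the payment

worker = threading.Thread(target=payment_worker)
worker.start()

for order_id in (101, 102, 103):    # producer: publish events
    orders.put(order_id)

orders.put(None)  # signal shutdown
worker.join()     # wait for the consumer to drain the queue
print(processed)  # [101, 102, 103]
```

If the payment service is slow or briefly down, orders simply wait in the queue instead of failing, which is the decoupling benefit listed above.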
What is database indexing?
Answer:
Indexing improves query speed by creating a lookup structure.

- Without an index → scan the entire table
- With an index → direct lookup

It works like a book index.
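A toy model of what an index buys you, assuming rows stored as dictionaries (real databases use B-tree or hash indexes, not a Python dict, but the access pattern is the same):

```python
def find_by_email_scan(rows, email):
    """Without an index: linear scan over every row (O(n))."""
    for row in rows:
        if row["email"] == email:
            return row
    return None

def build_email_index(rows):
    """With an index: build a lookup structure once, then answer queries in O(1)."""
    return {row["email"]: row for row in rows}

rows = [{"id": i, "email": f"user{i}@example.com"} for i in range(100_000)]
index = build_email_index(rows)

assert find_by_email_scan(rows, "user99999@example.com")["id"] == 99999
assert index["user99999@example.com"]["id"] == 99999  # same answer, direct lookup
```

The trade-off, as in real databases, is that the index costs extra memory and must be updated on every write.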
What is the difference between a monolith and microservices?
Answer:

| Monolith | Microservices |
|---|---|
| Single codebase | Multiple services |
| Easy to start | Scalable |
| Hard to scale | Complex |

Freshers should say: start with a monolith → move to microservices when scaling is needed.
What is an API?
Answer:
An API (Application Programming Interface) allows systems to communicate.

Common styles:

- REST
- GraphQL

Example: Frontend → API → Backend → DB
What is rate limiting?
Answer:
Rate limiting restricts the number of requests per user.

- Prevents abuse
- Avoids server overload

Example: 100 requests/min per user.
What is database sharding?
Answer:
Sharding splits a database into smaller parts distributed across servers.

- Handles large data volumes
- Improves performance
What is database replication?
Answer:
Replication copies data across multiple servers.

Types:

- Master-Slave
- Multi-Master

Benefits:

- High availability
- Backup
How would you design a URL shortener?
Answer:
Requirements:

- Input: long URL
- Output: short URL

Components:

- API server
- Database
- Hashing function

Flow:

- User enters a URL
- Generate a unique short ID
- Store the mapping in the DB
- Redirect when the short URL is accessed
How would you design a photo-sharing app?
Answer:
Features:

- Upload photos
- View feed
- Like/comment

Components:

- App server
- Media storage (images/videos)
- Database (users/posts)
- CDN (fast delivery)
What is the difference between latency and throughput?
Answer:

- Latency → time to process one request
- Throughput → requests handled per second

A good system has low latency and high throughput.
What is fault tolerance?
Answer:
A fault-tolerant system continues working even if components fail.

Techniques:

- Replication
- Backup servers
- Failover systems
How do you approach a system design interview question?
Answer (very important):
Steps:

1. Clarify requirements
2. Define the scale
3. Sketch the high-level design
4. Design the database
5. Identify bottlenecks
6. Suggest improvements

Tips:

- Focus on clarity over complexity
- Use simple diagrams (mentally)
- Communicate your thinking
- Don’t jump to microservices immediately
- Always discuss trade-offs
Case study: design a URL shortener (detailed)

Functional requirements:

- Shorten long URLs
- Redirect short URL → original URL
- Custom aliases (optional)
- Expiration support

Non-functional requirements:

- High availability
- Low latency (redirects must be fast)
- Scalable (millions of URLs/day)

Components:

- API service
- URL encoding service
- Database
- Cache (Redis)
- Load balancer
Short-code generation:

- Use Base62 encoding
- Auto-increment ID → encode to a short string
- Example: ID 125 → "cb"
- Alternative: distributed ID generators (Snowflake)
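A minimal Base62 encoder is a few lines. The alphabet ordering here (lowercase, then uppercase, then digits) is an assumption chosen so that ID 125 encodes to "cb", matching the example above; other orderings are equally valid as long as they are used consistently.

```python
import string

# Assumed alphabet order: lowercase, uppercase, digits (62 characters total).
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits

def encode_base62(n):
    """Encode a non-negative auto-increment ID as a Base62 short code."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)     # peel off one base-62 digit at a time
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most significant digit first

print(encode_base62(125))  # cb
```

Encoding a sequential ID guarantees uniqueness without collision checks, which is why this is often preferred over hashing the URL.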
Database schema:

Table: urls
- id (PK)
- short_code
- long_url
- created_at
- expiry
Redirect flow:

1. User hits the short URL
2. Check the cache
3. If miss → DB lookup
4. Redirect (HTTP 301/302)

Shorten flow:

1. Generate an ID
2. Encode it
3. Store the mapping
4. Return the short URL
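The redirect lookup is the classic cache-aside pattern, sketched below with plain dictionaries standing in for Redis and the database (`resolve_short_url` is an illustrative name, not a real API):

```python
def resolve_short_url(short_code, cache, db):
    """Cache-aside lookup for a redirect: try the cache first, fall back to the DB."""
    long_url = cache.get(short_code)
    if long_url is None:               # cache miss
        long_url = db.get(short_code)  # dict as a stand-in for a real DB query
        if long_url is None:
            return 404, None           # unknown short code
        cache[short_code] = long_url   # populate the cache for next time
    return 301, long_url               # 301 permanent (use 302 if links can expire)

db = {"cb": "https://example.com/some/very/long/path"}
cache = {}
print(resolve_short_url("cb", cache, db))    # (301, 'https://example.com/some/very/long/path')
print(resolve_short_url("nope", cache, db))  # (404, None)
```

Using 302 instead of 301 keeps browsers from caching the redirect forever, which matters if short links can expire or be edited.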
Scaling:

- Use a caching layer (Redis) for hot URLs
- DB sharding by ID
- CDN for global redirects

Bottlenecks:

- DB is read-heavy → solved by the cache
- Hot URLs → cache + CDN
Case study: design a chat system

Functional requirements:

- 1-1 messaging
- Group messaging
- Online/offline delivery
- Message history

Non-functional requirements:

- Low latency
- High reliability
- Eventual consistency acceptable

Architecture:

- Client → WebSocket → chat servers
- Message queue (Kafka)
- Storage (NoSQL like Cassandra)
Connections:

- Persistent WebSocket connections
- Each user is connected to a chat server

Send-message flow:

1. Client → server
2. Server → Kafka
3. Consumer → store in DB
4. Push to the recipient

Storage and offline delivery:

- Store messages; deliver them when the user reconnects
- Use Cassandra/DynamoDB
- Partition by user_id
Scaling:

- Stateless chat servers
- Horizontal scaling
- Partition queues by user

Challenges:

- Message ordering
- Duplicate delivery → deduplicate using message IDs
- Fan-out for groups
Case study: design a rate limiter

Requirements:

- Limit requests per user/IP
- Example: 100 requests/minute

Token bucket algorithm:

- Tokens are added at a fixed rate
- Each request consumes a token
- Why it works well: allows burst traffic; smooth rate limiting

A sliding-window approach (alternative):

- More accurate
- More memory usage

Redis implementation:

- Use atomic operations
- Key: user_id
- Store count + timestamp

Scaling:

- Distributed rate limiter using a Redis cluster
- Local cache for performance

Challenges:

- Clock synchronization
- Distributed consistency
Case study: design a news feed

Requirements:

- Post content
- Follow users
- View feed

Pull model (fetch on read):

- Fetch posts at request time
- Pros: less storage
- Cons: slow reads

Push model (fan-out on write):

- Push posts to followers' feeds
- Pros: fast reads
- Cons: expensive writes

Hybrid approach:

- Push for normal users
- Pull for celebrities

Data model:

- Posts table
- Feed table per user

Scaling:

- Shard by user ID
- Cache feeds

Challenges:

- Celebrity problem (millions of followers)
- Ranking algorithm
Case study: design an API gateway

Responsibilities:

- Routing
- Authentication
- Rate limiting
- Logging

The gateway sits between the client and backend services:

- JWT validation
- Centralized control
- Routing to healthy instances

Popular gateways:

- Kong
- NGINX
- AWS API Gateway

Scaling:

- Stateless design
- Horizontal scaling
Case study: design a file storage service

Requirements:

- Upload/download files
- Share files
- Versioning

Components:

- Metadata service
- Blob storage (S3/HDFS)
- CDN

Upload flow:

- Client uploads the file to blob storage
- Metadata is stored separately
- Chunk large files and store the chunks independently

Scaling:

- Distributed storage
- Replication for durability

Challenges:

- Consistency
- Large file uploads
- Deduplication
Case study: design a ride-hailing service

Requirements:

- Book a ride
- Match drivers
- Real-time location

Components:

- Location service (GPS tracking)
- Matching service
- Pricing service

Driver matching:

- Find the nearest driver (geospatial queries)
- Use a QuadTree / GeoHash

Real-time updates:

- WebSockets / MQTT

Scaling:

- Partition by geography
- Cache driver locations

Challenges:

- Surge pricing
- Real-time updates
- High availability
Case study: design a distributed cache

Requirements:

- Fast access
- Fault tolerant

Techniques:

- Consistent hashing
- Replication

Tools:

- Redis Cluster
- Memcached

Challenges:

- Cache invalidation
- Data consistency
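Consistent hashing, the key technique here, can be sketched as follows. This is a simplified model (MD5 for hashing, 100 virtual nodes per server are arbitrary choices for the illustration; `ConsistentHashRing` is not a real library class):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Consistent hashing sketch: each key maps to the first node clockwise on a
    hash ring, so adding or removing a node only remaps a fraction of the keys."""

    def __init__(self, nodes, vnodes=100):
        self.ring = []  # sorted list of (position, node)
        for node in nodes:
            for i in range(vnodes):  # virtual nodes smooth out the distribution
                self.ring.append((self._hash(f"{node}#{i}"), node))
        self.ring.sort()
        self.positions = [pos for pos, _ in self.ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, key):
        # First ring position at or after the key's hash, wrapping around.
        idx = bisect.bisect(self.positions, self._hash(key)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
node = ring.get_node("user:42")  # deterministic: the same key always maps to the same node
```

Compared with `hash(key) % num_servers`, which remaps almost every key when a server joins or leaves, the ring keeps most keys in place, which is why Redis Cluster and similar systems use hash-slot or ring-based placement.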
Interview tips:

- Drive the discussion
- Ask clarifying questions
- Justify trade-offs
- Think about scale (millions of users)
- Discuss bottlenecks and solutions

A good answer structure:

1. Requirements
2. High-level design
3. Deep dive (data, APIs)
4. Scaling
5. Trade-offs
6. Bottlenecks