System Design

Top Interview Questions

About System Design

System Design is the process of defining the architecture, components, modules, interfaces, and data flow of a system to meet specific requirements. It is a foundational discipline in software engineering that focuses on how different parts of a system interact and work together to achieve scalability, reliability, efficiency, and maintainability.


What is System Design?

At its core, system design answers this question:
“How do we build a system that solves a problem effectively at scale?”

It goes beyond writing code—it involves planning the structure of a system before implementation. This includes deciding how data flows, how services communicate, how users interact with the system, and how the system handles growth, failures, and performance demands.

System design is commonly used when building large-scale applications such as social media platforms, e-commerce systems, banking software, and cloud-based services.


Key Components of System Design

1. Architecture Design

Architecture defines the high-level structure of the system. It determines:

  • Whether the system is monolithic or microservices-based

  • How components interact

  • Deployment strategies

Common architectural patterns include:

  • Monolithic Architecture: All components are tightly integrated into a single codebase

  • Microservices Architecture: System is divided into small, independent services

  • Client-Server Architecture: Separation between frontend (client) and backend (server)


2. Scalability

Scalability refers to a system’s ability to handle increased load without performance degradation.

There are two types:

  • Vertical Scaling (Scaling Up): Adding more power (CPU, RAM) to a single machine

  • Horizontal Scaling (Scaling Out): Adding more machines to distribute the load

A well-designed system anticipates growth, so capacity can be added without redesigning the architecture.


3. Reliability

Reliability ensures that a system functions correctly and consistently over time, even when components fail.

Techniques to improve reliability:

  • Redundancy (backup systems)

  • Failover mechanisms

  • Health checks and monitoring


4. Availability

Availability is the fraction of time a system is operational and accessible.

High availability systems aim for minimal downtime, often expressed as:

  • 99.9% uptime (three nines): about 8.8 hours of downtime per year

  • 99.99% uptime (four nines): about 53 minutes of downtime per year

This is achieved using:

  • Load balancing

  • Replication

  • Distributed systems
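
The uptime percentages above translate directly into a downtime budget. A minimal sketch of that arithmetic, assuming a 365-day year (the function name is illustrative, not from any library):

```python
def downtime_minutes_per_year(availability_percent):
    """Return the downtime budget, in minutes, for a 365-day year."""
    minutes_per_year = 365 * 24 * 60  # 525,600 minutes
    return minutes_per_year * (1 - availability_percent / 100)


for target in (99.9, 99.99):
    budget = downtime_minutes_per_year(target)
    print(f"{target}% uptime allows about {budget:.1f} minutes of downtime per year")
```

Each extra nine cuts the budget by a factor of ten, which is why the step from three to four nines usually requires redundancy rather than just better hardware.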


5. Performance

Performance is about how fast a system responds to requests.

Factors affecting performance:

  • Latency (time taken to respond)

  • Throughput (number of requests handled per second)

Optimization techniques include:

  • Caching

  • Database indexing

  • Efficient algorithms


6. Database Design

Data storage is a critical aspect of system design.

Types of databases:

  • Relational Databases (SQL): Structured data (e.g., MySQL, PostgreSQL)

  • NoSQL Databases: Flexible schema (e.g., MongoDB, Cassandra)

Important considerations:

  • Data consistency

  • Data partitioning (sharding)

  • Replication


7. Load Balancing

Load balancers distribute incoming traffic across multiple servers to prevent overload.

Benefits:

  • Improves performance

  • Increases availability

  • Stops any one application server from being a single point of failure (the load balancer itself then needs redundancy)


8. Caching

Caching stores frequently accessed data in fast storage (like memory) to reduce latency.

Examples:

  • Browser cache

  • CDN (Content Delivery Network)

  • In-memory caches like Redis


9. Security

Security ensures that the system protects data and prevents unauthorized access.

Key areas:

  • Authentication (who are you?)

  • Authorization (what can you do?)

  • Data encryption

  • Secure APIs


10. Monitoring and Logging

Monitoring tracks system performance and health, while logging records events for debugging.

Tools help detect:

  • Failures

  • Bottlenecks

  • Security issues


System Design Process

Designing a system typically follows a structured approach:

Step 1: Understand Requirements

  • Functional requirements (what the system should do)

  • Non-functional requirements (performance, scalability, etc.)

Step 2: High-Level Design (HLD)

  • Identify components (frontend, backend, database)

  • Define interactions

  • Create architecture diagrams

Step 3: Low-Level Design (LLD)

  • Define data models

  • Specify APIs

  • Design internal logic

Step 4: Identify Bottlenecks

  • Analyze potential weak points

  • Plan optimizations

Step 5: Review and Iterate

  • Improve design based on feedback

  • Adapt to new requirements


Example: Designing a URL Shortener

Let’s take a simple example to understand system design in practice.

Requirements:

  • Convert long URLs into short ones

  • Redirect users to original URLs

  • Handle millions of requests

High-Level Design:

  • User sends a URL → server generates a short code

  • Store mapping in a database

  • When user accesses short URL → fetch original URL → redirect

Challenges:

  • Generating unique short codes

  • Handling high traffic

  • Ensuring fast redirection

Solutions:

  • Use hashing or base62 encoding

  • Cache frequently accessed URLs

  • Use distributed databases


Importance of System Design

System design is critical because:

  1. Scalability Planning
    Prevents systems from breaking under growth

  2. Cost Efficiency
    Optimizes infrastructure usage

  3. Maintainability
    Makes systems easier to update and debug

  4. Reliability
    Ensures consistent performance

  5. User Experience
    Faster, more reliable systems lead to better UX


Common System Design Patterns

Some widely used design patterns include:

  • Microservices Pattern

  • Event-Driven Architecture

  • CQRS (Command Query Responsibility Segregation)

  • API Gateway Pattern

  • Circuit Breaker Pattern

These patterns help solve recurring design problems efficiently.


Tools and Technologies in System Design

System designers use various tools:

  • Cloud Platforms: AWS, Azure, Google Cloud

  • Containerization: Docker, Kubernetes

  • Databases: MySQL, MongoDB

  • Messaging Systems: Kafka, RabbitMQ

These tools help build scalable and distributed systems.


Challenges in System Design

Designing systems is complex due to:

  • Trade-offs (e.g., consistency vs availability)

  • Changing requirements

  • Handling large-scale data

  • Ensuring security

Engineers must balance these factors carefully.


Conclusion

System design is a crucial skill for building modern software systems. It involves making strategic decisions about architecture, scalability, performance, and reliability. A well-designed system not only meets current requirements but is also prepared for future growth and challenges.

Whether you're designing a small application or a large-scale distributed system, understanding system design principles helps you create efficient, robust, and scalable solutions.

Fresher Interview Questions

 

🧠 1. What is System Design?

Answer:
System Design is the process of defining the architecture, components, modules, interfaces, and data flow of a system to meet specific requirements.

It involves:

  • Breaking down a problem into smaller parts

  • Deciding how components interact

  • Ensuring scalability, reliability, and performance

👉 Example: Designing a URL shortener like Bitly


🧱 2. What are the key components of a system?

Answer:
A typical system includes:

  • Client → Browser / mobile app

  • Server → Handles business logic

  • Database → Stores data

  • Cache → Speeds up responses

  • Load Balancer → Distributes traffic

👉 Think of it like:
User → API → Server → DB → Response


⚖️ 3. What is Scalability?

Answer:
Scalability is the system’s ability to handle increasing traffic.

Types:

  • Vertical Scaling

    • Increase power of one machine (CPU/RAM)

    • Limited and expensive

  • Horizontal Scaling

    • Add more machines

    • Preferred in modern systems


⚑ 4. What is Load Balancing?

Answer:
A Load Balancer distributes incoming requests across multiple servers.

Why needed?

  • Prevents overload on one server

  • Improves availability

  • Ensures high performance

Algorithms:

  • Round Robin

  • Least Connections

  • IP Hash
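
The three algorithms above can be sketched in a few lines each. This is a simplified illustration, not how any particular load balancer implements them; the server names and the connection counts passed in are hypothetical.

```python
import hashlib
import itertools

servers = ["app-1", "app-2", "app-3"]  # hypothetical backend names

# Round Robin: hand out servers in a fixed rotation.
_rotation = itertools.cycle(servers)


def round_robin():
    return next(_rotation)


# Least Connections: pick the server with the fewest open connections.
def least_connections(open_connections):
    return min(open_connections, key=open_connections.get)


# IP Hash: the same client IP always maps to the same server,
# as long as the server list does not change.
def ip_hash(client_ip):
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

Round Robin ignores how busy each server is, Least Connections needs live connection counts, and IP Hash trades even distribution for session stickiness.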


🗄️ 5. SQL vs NoSQL?

Answer:

Feature   | SQL                | NoSQL
Structure | Structured         | Flexible
Schema    | Fixed              | Dynamic
Scaling   | Typically vertical | Typically horizontal
Examples  | MySQL, PostgreSQL  | MongoDB, Cassandra

👉 Use:

  • SQL → Transactions (banking)

  • NoSQL → Large-scale apps (social media)


⚑ 6. What is Caching?

Answer:
Caching stores frequently accessed data in memory for faster access.

Benefits:

  • Reduces DB load

  • Improves response time

Tools:

  • Redis

  • Memcached

👉 Example: Storing user profile data temporarily
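
The user profile example can be sketched as a cache-aside lookup: check the cache first and fall back to the database only on a miss. This is a minimal in-process sketch; `TTLCache`, `load_profile_from_db`, and the 60-second expiry are assumptions for illustration, and a real deployment would typically use Redis or Memcached instead of a dictionary.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry time)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # entry is stale, drop it
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)


stats = {"db_reads": 0}


def load_profile_from_db(user_id):
    # Hypothetical stand-in for a slow database query.
    stats["db_reads"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}


cache = TTLCache(ttl_seconds=60)


def get_profile(user_id):
    profile = cache.get(user_id)
    if profile is None:  # cache miss: read from the DB, then remember it
        profile = load_profile_from_db(user_id)
        cache.set(user_id, profile)
    return profile
```

Calling `get_profile(1)` twice within the expiry window reads the database only once.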


7. What is a CDN?

Answer:
CDN (Content Delivery Network) delivers content from servers closer to users.

Benefits:

  • Faster loading

  • Reduced latency

👉 Example: Images/videos served from a nearby server


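🔄 8. What is the Consistency vs Availability trade-off? (CAP Theorem)

Answer:
The CAP theorem states that a distributed system cannot guarantee all three of the following at once. Because network partitions cannot be ruled out in practice, the real choice during a partition is between consistency and availability: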

  • Consistency → Same data everywhere

  • Availability → Always responds

  • Partition Tolerance → Works despite network failures

👉 Trade-offs:

  • CP (Banking systems)

  • AP (Social media)


📬 9. What is a Message Queue?

Answer:
A Message Queue enables asynchronous communication between services.

Benefits:

  • Decouples services

  • Improves scalability

Examples:

  • Kafka

  • RabbitMQ

👉 Example: Order placed → queue → payment service processes later
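
The order example can be sketched with Python's standard `queue` and `threading` modules standing in for a real broker such as Kafka or RabbitMQ. The order IDs and the `payment_worker` name are hypothetical.

```python
import queue
import threading

orders = queue.Queue()
processed = []


def payment_worker():
    """Consume orders until the None sentinel arrives."""
    while True:
        order_id = orders.get()
        if order_id is None:  # sentinel: no more work
            orders.task_done()
            break
        processed.append(f"charged order {order_id}")
        orders.task_done()


worker = threading.Thread(target=payment_worker)
worker.start()

# The producer only enqueues and moves on; it never waits for payment.
for order_id in (101, 102, 103):
    orders.put(order_id)

orders.put(None)  # tell the worker to stop
orders.join()     # wait until every queued item has been handled
worker.join()
print(processed)
```

The producer and the consumer share only the queue, which is the decoupling the answer describes: either side can be slowed down, restarted, or scaled without changing the other.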


10. What is Database Indexing?

Answer:
Indexing improves query speed by creating a lookup structure.

Example:

Without index → scan entire table
With index → direct lookup

👉 Like a book index 📖
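
The difference between a full scan and an index lookup can be observed with Python's built-in `sqlite3` module. This is a small sketch: the `users` table, its columns, and the index name are made up for illustration, and the exact wording of the query plan varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def query_plan(sql):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the plan text


lookup = "SELECT name FROM users WHERE email = 'user42@example.com'"

plan_without_index = query_plan(lookup)  # reports a scan of the whole table
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_with_index = query_plan(lookup)     # reports a search using the new index

print(plan_without_index)
print(plan_with_index)
```

The index speeds up reads on `email` but has to be updated on every insert and update, so indexes are usually added only for columns that are actually queried.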


🧩 11. Monolith vs Microservices?

Answer:

Monolith        | Microservices
Single codebase | Multiple services
Easy to start   | Scalable
Hard to scale   | Complex

👉 Freshers should say:

  • Start monolith → move to microservices when scaling needed


🌍 12. What is an API?

Answer:
API (Application Programming Interface) allows systems to communicate.

Types:

  • REST

  • GraphQL

👉 Example:
Frontend → API → Backend → DB


🔄 13. What is Rate Limiting?

Answer:
Rate limiting caps the number of requests a client (a user or an IP address) can make within a given time window.

Why?

  • Prevent abuse

  • Avoid server overload

👉 Example:
100 requests/min per user


📊 14. What is Sharding?

Answer:
Sharding splits a database into smaller parts (shards), each stored on a different server.

Benefits:

  • Handles large data

  • Improves performance
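
One common way to decide which shard holds a record is to hash its key. A minimal sketch, assuming four shards with hypothetical names and a `user:<id>` key format:

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]  # hypothetical shard names


def shard_for(key):
    """Map a key to a shard by hashing it and taking the remainder."""
    digest = hashlib.sha256(key.encode()).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


print(shard_for("user:42"))
print(shard_for("user:43"))
```

The same key always lands on the same shard, so reads know where to look. The drawback of plain modulo hashing is that changing the number of shards remaps most keys, which is one reason larger systems move to consistent hashing or range-based partitioning.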


15. What is Replication?

Answer:
Replication copies data across multiple servers.

Types:

  • Master-Slave

  • Multi-Master

👉 Benefits:

  • High availability

  • Backup


🧪 16. How would you design a URL Shortener? (Basic)

Answer:

Requirements:

  • Input long URL

  • Output short URL

Components:

  • API Server

  • Database

  • Hashing function

Flow:

  1. User enters URL

  2. Generate unique short ID

  3. Store mapping in DB

  4. Redirect when accessed


📸 17. How would you design Instagram (basic)?

Answer:

Features:

  • Upload photos

  • View feed

  • Like/comment

Components:

  • App server

  • Media storage (images/videos)

  • Database (users/posts)

  • CDN (fast delivery)


⏱️ 18. What is Latency vs Throughput?

Answer:

  • Latency → Time to process one request

  • Throughput → Requests per second

👉 Good system = low latency + high throughput


19. What is Fault Tolerance?

Answer:
Fault tolerance is the ability of a system to keep working even when some of its components fail.

Achieved by:

  • Replication

  • Backup servers

  • Failover systems


🧠 20. How do you approach system design in an interview?

Answer (very important):

Step-by-step:

  1. Clarify requirements

  2. Define scale

  3. High-level design

  4. Database design

  5. Identify bottlenecks

  6. Suggest improvements


🎯 Pro Tips for Freshers

  • Focus on clarity over complexity

  • Use simple diagrams (mentally)

  • Communicate your thinking

  • Don’t jump to microservices immediately

  • Always discuss trade-offs

Experienced Interview Questions

 

1. Design a URL Shortener (like Bitly)

Requirements

Functional

  • Shorten long URLs

  • Redirect short URL → original URL

  • Custom aliases (optional)

  • Expiration support

Non-functional

  • High availability

  • Low latency (redirects must be fast)

  • Scalable (millions of URLs/day)


High-Level Design

Components

  • API service

  • URL encoding service

  • Database

  • Cache (Redis)

  • Load balancer


Key Design Decisions

1. ID Generation

  • Use Base62 encoding

  • Auto-increment ID → encode to short string
    Example (alphabet a-z, A-Z, 0-9): 125 = 2 × 62 + 1, so the digits 2 and 1 map to "c" and "b":

    ID: 125 → "cb"

Alternative:

  • Distributed ID generators (Snowflake)
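
A minimal sketch of the Base62 step, assuming the alphabet order a-z, A-Z, 0-9 (the order under which 125 encodes to "cb"; other orderings are equally valid but give different strings):

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
BASE = len(ALPHABET)  # 62


def encode(number):
    """Convert a non-negative integer ID into a Base62 short code."""
    if number < 0:
        raise ValueError("ID must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))  # most significant digit first


def decode(code):
    """Convert a Base62 short code back into its integer ID."""
    number = 0
    for char in code:
        number = number * BASE + ALPHABET.index(char)
    return number


print(encode(125))   # cb
print(decode("cb"))  # 125
```

Seven Base62 characters cover 62^7 (about 3.5 trillion) IDs. Sequential IDs produce guessable codes, so some designs shuffle the alphabet or use a non-sequential ID generator.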


2. Database Schema

Table: urls
- id (PK)
- short_code
- long_url
- created_at
- expiry

3. Read Flow

  1. User hits short URL

  2. Check cache

  3. If miss → DB lookup

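  4. Redirect (HTTP 301 if the mapping is permanent and may be cached by browsers; HTTP 302 if it is temporary or every hit should reach the server)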
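
The read flow can be sketched as follows. The two dictionaries are hypothetical stand-ins for the database table and the Redis cache, and the choice of 302, 404 for unknown codes, and 410 for expired codes is an assumption for illustration, not something the design above prescribes.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical in-memory stand-ins for the urls table and the cache.
db = {}     # short_code -> (long_url, expiry or None)
cache = {}  # short_code -> (long_url, expiry or None)


def resolve(short_code, now):
    """Return (HTTP status, redirect target or None) for a short code."""
    record = cache.get(short_code)       # step 2: check the cache
    if record is None:
        record = db.get(short_code)      # step 3: on a miss, look in the DB
        if record is None:
            return 404, None             # unknown short code
        cache[short_code] = record       # remember it for the next request
    long_url, expiry = record
    if expiry is not None and now >= expiry:
        cache.pop(short_code, None)      # do not keep serving expired links
        return 410, None
    return 302, long_url                 # step 4: redirect
```

A usage example under the same assumptions: after `db["cb"] = ("https://example.com/long", None)`, calling `resolve("cb", datetime.now(timezone.utc))` returns `(302, "https://example.com/long")` and leaves the entry in the cache.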


4. Write Flow

  1. Generate ID

  2. Encode

  3. Store mapping

  4. Return short URL


Scaling

  • Use caching layer (Redis) for hot URLs

  • DB sharding by id

  • CDN for global redirects


Bottlenecks

  • DB read-heavy → solved by cache

  • Hot URLs → cache + CDN


2. Design a Chat System (like WhatsApp)

Requirements

Functional

  • 1-1 messaging

  • Group messaging

  • Online/offline delivery

  • Message history

Non-functional

  • Low latency

  • High reliability

  • Eventual consistency acceptable


High-Level Architecture

  • Client → WebSocket → Chat servers

  • Message queue (Kafka)

  • Storage (NoSQL like Cassandra)


Core Components

1. Connection Layer

  • Persistent WebSocket connections

  • Each user connected to a chat server


2. Message Flow

Send Message

  1. Client → server

  2. Server → Kafka

  3. Consumer → store in DB

  4. Push to recipient


3. Offline Handling

  • Store messages

  • Deliver when user reconnects


Data Storage

  • Use Cassandra/DynamoDB

  • Partition by user_id


Scaling

  • Chat servers that hold only live connections and no durable state, with the user-to-server mapping kept in a shared store

  • Horizontal scaling

  • Partition queues by user


Challenges

  • Message ordering

  • Duplicate delivery → use message IDs

  • Fan-out for groups
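
Two of these challenges, duplicate delivery and ordering, can be handled on the receiving side. A minimal sketch, assuming every message carries a unique ID and a per-conversation sequence number (the sequence number is an assumption added for this example):

```python
class Inbox:
    """Receiver-side buffer that drops duplicates and restores order."""

    def __init__(self):
        self.seen_ids = set()
        self.messages = []  # (sequence, text) pairs in arrival order

    def receive(self, message_id, sequence, text):
        """Store a message once; return False if it was already delivered."""
        if message_id in self.seen_ids:
            return False  # retry or redelivery: ignore it
        self.seen_ids.add(message_id)
        self.messages.append((sequence, text))
        return True

    def ordered(self):
        """Return message texts sorted by sequence number."""
        return [text for _, text in sorted(self.messages)]


inbox = Inbox()
inbox.receive("m2", 2, "world")  # arrives first, out of order
inbox.receive("m1", 1, "hello")
inbox.receive("m2", 2, "world")  # duplicate, ignored
print(inbox.ordered())           # ['hello', 'world']
```

With deduplication in place, the sender and the queue can retry freely (at-least-once delivery) without the user seeing the same message twice.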


3. Design a Rate Limiter

Requirements

  • Limit requests per user/IP

  • Example: 100 requests/minute


Approaches

1. Token Bucket (Recommended)

  • Tokens added at fixed rate

  • Request consumes token

Why good?

  • Allows burst traffic

  • Smooth rate limiting
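
A minimal single-process sketch of a token bucket. The `clock` parameter is an addition for this example so the behaviour can be tested without sleeping; a distributed version would keep the token count and timestamp in a shared store such as Redis.

```python
import time


class TokenBucket:
    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity                  # maximum burst size
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)             # start with a full bucket
        self.last_refill = clock()

    def allow(self):
        """Consume one token if available; return whether the request passes."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

For the 100 requests/minute example, `TokenBucket(capacity=100, refill_per_second=100 / 60)` allows a burst of up to 100 requests and then roughly 1.67 requests per second.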


2. Sliding Window

  • More accurate

  • More memory usage
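
A sketch of the sliding window log variant, which stores one timestamp per accepted request; that is where the extra memory goes. Timestamps are passed in explicitly here (an assumption for testability); a real limiter would read them from a clock.

```python
from collections import deque


class SlidingWindowLimiter:
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.timestamps = deque()  # accepted request times, oldest first

    def allow(self, now):
        """Accept the request if fewer than `limit` fall inside the window."""
        # Drop timestamps that have slid out of the window.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

Unlike a fixed window counter, this never lets a client send a full quota just before and just after a window boundary, at the cost of O(limit) memory per client.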


Implementation (Redis)

  • Use atomic operations

  • Key: user_id

  • Store count + timestamp


Scaling

  • Distributed rate limiter using Redis cluster

  • Local cache for performance


Edge Cases

  • Clock synchronization

  • Distributed consistency


4. Design a News Feed System (like Instagram)

Requirements

  • Post content

  • Follow users

  • View feed


Feed Models

1. Pull Model

  • Fetch posts at request time

Pros

  • Less storage

Cons

  • Slow reads


2. Push Model (Fan-out on write)

  • Push posts to followers

Pros

  • Fast reads

Cons

  • Expensive writes


Hybrid Approach (Best)

  • Push for normal users

  • Pull for celebrities
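
A small in-memory sketch of the hybrid model. The follower threshold of 3 is deliberately tiny and purely illustrative, and the dictionaries stand in for the posts and per-user feed tables.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff, far larger in practice

followers = defaultdict(set)         # author -> users who follow them
following = defaultdict(set)         # user -> authors they follow
posts_by_author = defaultdict(list)  # author -> [(timestamp, post_id)]
feeds = defaultdict(list)            # user -> precomputed [(timestamp, post_id)]


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def is_celebrity(author):
    return len(followers[author]) >= CELEBRITY_THRESHOLD


def publish(author, post_id, timestamp):
    posts_by_author[author].append((timestamp, post_id))
    if not is_celebrity(author):
        # Push model: copy the post into every follower's feed at write time.
        for user in followers[author]:
            feeds[user].append((timestamp, post_id))


def read_feed(user, limit=10):
    items = set(feeds[user])  # posts already pushed to this user
    for author in following[user]:
        if is_celebrity(author):
            # Pull model: fetch celebrity posts at read time.
            items.update(posts_by_author[author])
    newest_first = sorted(items, reverse=True)
    return [post_id for _, post_id in newest_first[:limit]]
```

A celebrity post costs one write regardless of follower count, and the read path merges it in; using a set avoids duplicates if an author crosses the threshold after earlier posts were pushed.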


Storage

  • Posts table

  • Feed table per user


Scaling

  • Shard by user ID

  • Cache feeds


Challenges

  • Celebrity problem (millions of followers)

  • Ranking algorithm


5. Design an API Gateway

Responsibilities

  • Routing

  • Authentication

  • Rate limiting

  • Logging


Architecture

  • Gateway sits between client and services


Features

1. Authentication

  • JWT validation

2. Rate Limiting

  • Centralized control

3. Load Balancing

  • Route to healthy instances


Tools

  • Kong

  • NGINX

  • AWS API Gateway


Scaling

  • Stateless

  • Horizontal scaling


6. Design a File Storage System (like Google Drive)

Requirements

  • Upload/download files

  • Share files

  • Versioning


High-Level Design

  • Metadata service

  • Blob storage (S3/HDFS)

  • CDN


Upload Flow

  1. Client uploads to storage

  2. Metadata stored separately


Storage Strategy

  • Chunk large files

  • Store chunks independently
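
A minimal sketch of chunked storage. Each chunk is keyed by its SHA-256 hash, so identical chunks are stored once, which also illustrates one approach to deduplication. The 4-byte chunk size is only to keep the example readable (real systems use chunks measured in megabytes), and the dictionary stands in for blob storage.

```python
import hashlib

CHUNK_SIZE = 4    # bytes; tiny on purpose for the example
chunk_store = {}  # sha256 hex digest -> chunk bytes (stand-in for blob storage)


def upload(data):
    """Split data into chunks, store each once, return the ordered manifest."""
    manifest = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(digest, chunk)  # skip chunks already stored
        manifest.append(digest)
    return manifest


def download(manifest):
    """Reassemble a file from its manifest of chunk hashes."""
    return b"".join(chunk_store[digest] for digest in manifest)


manifest = upload(b"abcdabcdxy")
print(len(manifest), "chunks referenced,", len(chunk_store), "chunks stored")
```

The manifest is what the metadata service would keep per file version; a new version that changes one chunk only needs to upload that chunk.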


Scaling

  • Distributed storage

  • Replication for durability


Challenges

  • Consistency

  • Large file uploads

  • Deduplication


7. Design a Ride Sharing System (like Uber)

Requirements

  • Book ride

  • Match drivers

  • Real-time location


Core Components

  • Location service (GPS tracking)

  • Matching service

  • Pricing service


Matching Algorithm

  • Nearest driver (geospatial queries)

  • Use QuadTree / GeoHash
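
The idea behind GeoHash and QuadTree lookups is to search only nearby cells instead of every driver. The sketch below uses a plain fixed-size grid as a simplified stand-in for either structure; the cell size, function names, and coordinates are all hypothetical.

```python
import math
from collections import defaultdict

CELL_DEGREES = 0.01  # about 1.1 km of latitude per cell

drivers_by_cell = defaultdict(dict)  # cell -> {driver_id: (lat, lon)}
driver_cell = {}                     # driver_id -> current cell


def cell_of(lat, lon):
    return (math.floor(lat / CELL_DEGREES), math.floor(lon / CELL_DEGREES))


def update_location(driver_id, lat, lon):
    old_cell = driver_cell.get(driver_id)
    if old_cell is not None:
        drivers_by_cell[old_cell].pop(driver_id, None)  # leave the old cell
    cell = cell_of(lat, lon)
    drivers_by_cell[cell][driver_id] = (lat, lon)
    driver_cell[driver_id] = cell


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def nearest_nearby_driver(lat, lon):
    """Closest driver in the rider's cell or its eight neighbours, else None."""
    cx, cy = cell_of(lat, lon)
    candidates = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            cell = drivers_by_cell.get((cx + dx, cy + dy), {})
            for driver_id, (dlat, dlon) in cell.items():
                candidates.append((haversine_km(lat, lon, dlat, dlon), driver_id))
    return min(candidates)[1] if candidates else None
```

If no driver is found in the 3×3 neighbourhood, a real matcher would widen the search radius step by step.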


Real-Time Updates

  • WebSockets / MQTT


Scaling

  • Partition by geography

  • Cache driver locations


Challenges

  • Surge pricing

  • Real-time updates

  • High availability


8. Design a Distributed Cache

Requirements

  • Fast access

  • Fault tolerant


Design

  • Consistent hashing

  • Replication
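
A minimal sketch of a consistent hash ring with virtual nodes. The node names and the count of 100 virtual nodes per server are assumptions for the example; replication is left out to keep it short.

```python
import bisect
import hashlib


class HashRing:
    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash position, node name)
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key):
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add(self, node):
        # Each node gets many positions on the ring to even out the load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(pos, name) for pos, name in self._ring if name != node]

    def node_for(self, key):
        """Return the first node clockwise from the key's position."""
        if not self._ring:
            raise LookupError("ring is empty")
        index = bisect.bisect(self._ring, (self._hash(key),))
        return self._ring[index % len(self._ring)][1]  # wrap around the ring


ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.node_for("user:42"))
```

When a node is removed, only the keys it owned move to other nodes; every other key keeps its placement, which is what keeps the cache hit rate stable during scaling or failures.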


Tools

  • Redis Cluster

  • Memcached


Challenges

  • Cache invalidation

  • Data consistency


What Interviewers Expect at 4+ Years

You should:

  • Drive the discussion

  • Ask clarifying questions

  • Justify trade-offs

  • Think about scale (millions of users)

  • Discuss bottlenecks and solutions


Pro Tips

Always structure answers like:

  1. Requirements

  2. High-level design

  3. Deep dive (data, APIs)

  4. Scaling

  5. Trade-offs

  6. Bottlenecks