Steps to design a system

Step 1: Requirement clarifications

Example of designing a Twitter-like service:

  • Will users of our service be able to post tweets and follow other people?
  • Should we also design how the user's timeline is created and displayed?
  • Will users be able to search tweets?

Step 2: System interface definition
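
For the Twitter-like example, a minimal sketch of such an interface might look like the following. The class name, method names, parameters, and in-memory storage are all hypothetical and chosen only to mirror the questions from Step 1; they are not a prescribed API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class InMemoryService:
    """Hypothetical toy service; a real system would expose these over a network API."""
    tweets: Dict[int, List[str]] = field(default_factory=dict)      # user_id -> tweets
    following: Dict[int, Set[int]] = field(default_factory=dict)    # user_id -> followees

    def post_tweet(self, user_id: int, content: str) -> None:
        """Store a new tweet for the given user."""
        self.tweets.setdefault(user_id, []).append(content)

    def follow(self, follower_id: int, followee_id: int) -> None:
        """Record that follower_id follows followee_id."""
        self.following.setdefault(follower_id, set()).add(followee_id)

    def generate_timeline(self, user_id: int) -> List[str]:
        """Return the tweets of everyone user_id follows, grouped by followee."""
        timeline: List[str] = []
        for followee_id in sorted(self.following.get(user_id, set())):
            timeline.extend(self.tweets.get(followee_id, []))
        return timeline

    def search(self, query: str) -> List[str]:
        """Return all tweets containing the query (case-insensitive substring match)."""
        needle = query.lower()
        return [t for posts in self.tweets.values() for t in posts if needle in t.lower()]
```

Writing the calls down this early makes it easier to check that the interface covers every requirement agreed on in Step 1.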

Step 3: Back-of-the-envelope estimation

Estimate the scale of the system we're going to design. This will also help later when we focus on scaling, partitioning, load balancing, and caching.

- How much storage do we need?
- What network bandwidth usage are we expecting?
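
Both questions reduce to simple arithmetic once a traffic figure is assumed. The sketch below shows one way to do the calculation; the function names are hypothetical, and the inputs in the usage note (100 million tweets per day, 300 bytes per tweet) are made-up assumptions, not measurements.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_storage_bytes(tweets_per_day: int, bytes_per_tweet: int) -> int:
    """Raw bytes written per day, ignoring replication and metadata overhead."""
    return tweets_per_day * bytes_per_tweet


def total_storage_bytes(tweets_per_day: int, bytes_per_tweet: int, years: int) -> int:
    """Raw bytes accumulated over the given number of 365-day years."""
    return daily_storage_bytes(tweets_per_day, bytes_per_tweet) * 365 * years


def ingress_bytes_per_second(tweets_per_day: int, bytes_per_tweet: int) -> float:
    """Average incoming write bandwidth, assuming traffic is spread evenly over the day."""
    return daily_storage_bytes(tweets_per_day, bytes_per_tweet) / SECONDS_PER_DAY
```

With the assumed inputs, `daily_storage_bytes(100_000_000, 300)` gives 30,000,000,000 bytes, or 30 GB per day. Real traffic is rarely uniform, so peak bandwidth will be higher than the average this returns.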

https://www.freecodecamp.org/news/must-know-numbers-for-every-computer-engineer/
https://gist.github.com/jboner/28418322

Step 4: Defining data model

Define how data will flow among the different components of the system. Identify the system's entities and how they will interact with each other.

User: UserID, Name, Email, DoB, CreationDate, LastLogin, etc.
Tweet: TweetID, Content, TweetLocation, NumberOfLikes, TimeStamp, etc.
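
The two entities above could be written down as simple record types, as in the sketch below. The field types, the defaults, and the `user_id` field on `Tweet` (linking a tweet to its author) are assumptions added for illustration.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass
class User:
    user_id: int
    name: str
    email: str
    dob: date
    creation_date: datetime                # when the account was created
    last_login: Optional[datetime] = None  # None until the first login


@dataclass
class Tweet:
    tweet_id: int
    user_id: int                           # assumption: reference to the authoring User
    content: str
    timestamp: datetime
    tweet_location: Optional[str] = None   # assumption: free-form location string
    number_of_likes: int = 0
```

Spelling out the fields this way also gives a rough per-record size, which feeds back into the storage estimate from Step 3.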

Which database should we use? What kind of block storage should we use to store photos and videos?

Step 5: High-level design

Draw a block diagram with 5-6 boxes representing the core components of our system.

On the backend, we need an efficient database that can store all the tweets and support a huge number of reads. We will also need a distributed file storage system for storing videos and files.

Step 6: Detailed design

Present different approaches with their pros and cons. Explain why we prefer one approach over another, and the tradeoffs between the options.

  • Since we will be storing a massive amount of data, how should we partition our data to distribute it to multiple databases?
  • How do we handle hot users who tweet a lot or follow lots of people?
  • How much and at which layer should we introduce cache to speed things up?
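
As one concrete option for the partitioning question, the sketch below assigns each user to a database shard by hashing the user ID. This is only one of several possible schemes; the function name `shard_for` and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


def shard_for(user_id: int, num_shards: int) -> int:
    """Map a user ID to a shard index in [0, num_shards) using a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # hashlib gives the same result on every machine and run, unlike the
    # built-in hash(), which is salted per process for strings.
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

The tradeoffs are the kind worth discussing at this step: all of a user's data lands on one shard, so a hot user can overload it, and changing `num_shards` remaps most users, which is the problem consistent hashing is meant to reduce.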

Step 7: Identifying and resolving bottlenecks

  • Is there any single point of failure in our system? What are we doing to mitigate it?
  • Do we have enough replicas of the data so that if we lose a few servers we can still serve our users?
  • How are we monitoring the performance of our service? Do we get alerts whenever critical components fail or their performance degrades?
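
The first two questions can be checked mechanically against a replica placement map. The sketch below is a simplified model: the function names, the shard and server labels, and the assumption that a shard stays available while at least one of its replicas is alive are all hypothetical.

```python
from typing import Dict, Set


def unavailable_shards(replicas: Dict[str, Set[str]], failed: Set[str]) -> Set[str]:
    """Return the shards whose replicas all sit on failed servers."""
    return {shard for shard, servers in replicas.items() if not (servers - failed)}


def single_points_of_failure(replicas: Dict[str, Set[str]]) -> Set[str]:
    """Return the servers whose loss alone would make some shard unavailable."""
    servers = set().union(*replicas.values())
    return {s for s in servers if unavailable_shards(replicas, {s})}
```

For example, with a placement of `{"shard-a": {"s1", "s2"}, "shard-b": {"s3"}}`, server `s3` is a single point of failure because `shard-b` has no other replica.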

Resources

https://gist.github.com/vasanthk/485d1c25737e8e72759f
