Back-of-the-envelope Estimation System Design

Back-of-the-envelope estimation is a technique used to quickly approximate values and make rough calculations using simple arithmetic and basic assumptions.


Estimation Techniques

1) Rule of Thumb →

General principles applied to make reasonable estimates, e.g. one user generates about 1 MB of data per day on social media.

2) Approximations →

Rounding awkward values to powers of 10 or 2 to simplify the arithmetic and reach an estimate quickly, e.g. 1 day = 86,400 seconds ≈ 10^5 seconds.
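To see how much accuracy the rounding costs, the following minimal sketch compares the exact number of seconds in a day with the 10^5 approximation. The function name `relative_error` is illustrative, not part of any library.

```python
# Compare the exact seconds-per-day figure with the power-of-10 approximation.
SECONDS_PER_DAY_EXACT = 24 * 60 * 60   # 86,400
SECONDS_PER_DAY_APPROX = 10 ** 5       # rounded for mental arithmetic


def relative_error(exact, approx):
    """Return the relative error of approx with respect to exact."""
    return abs(approx - exact) / exact


error = relative_error(SECONDS_PER_DAY_EXACT, SECONDS_PER_DAY_APPROX)
print(f"Rounding 86,400 to 10^5 changes the value by {error:.1%}")
```

An error of this size is usually acceptable for an estimate whose inputs are themselves rough guesses.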

3) Breakdown and aggregation →

Breaking a big problem down into smaller components, estimating each one individually, and then combining the results. e.g. Social media data = User data + Multimedia data + Metadata.

4) Sanity check →

Finally, check that the estimates do not deviate far from reality. For example, the numbers should be in the same range as real-world data for comparable systems.


Types of Estimations

1) Load Estimations

Designing a social media platform where users create posts.

  • Daily Active Users (DAU) → 100 Million
  • Avg. posts → 10 per user per day
  • Total posts → 100 M * 10 = 1B posts/day

Hence request rate = 1B / 10^5 seconds = 10,000 req/sec.
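The same load calculation can be sketched in a few lines. The function name `requests_per_second` and its parameters are illustrative; the default of 10^5 seconds per day is the rounding used in this example, and the exact value (86,400) can be passed in for comparison.

```python
def requests_per_second(daily_active_users, actions_per_user_per_day,
                        seconds_per_day=10 ** 5):
    """Average request rate, assuming traffic is spread evenly over the day."""
    total_per_day = daily_active_users * actions_per_user_per_day
    return total_per_day / seconds_per_day


# Inputs from the example: 100 million DAU, 10 posts per user per day.
rps = requests_per_second(100_000_000, 10)
rps_exact = requests_per_second(100_000_000, 10, seconds_per_day=86_400)

print(f"approximate: {rps:,.0f} req/sec")
print(f"with 86,400 s/day: {rps_exact:,.0f} req/sec")
```

This is an average rate; peak traffic is typically higher, so a real design would apply a peak factor on top of it.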

2) Storage Estimations

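Twitter storage:

  • DAU → 500 M
  • Tweets → 3 per user per day (avg), so 500 M * 3 = 1500 M tweets/day
  • 1 tweet text ~ 250 B
  • 1 photo ~ 200 KB [10% of tweets contain a photo] → 20 KB per tweet on average
  • 1 video ~ 300 MB [5% of tweets contain a video] → 15 MB per tweet on average

Total storage/day ~ 1500 M * (250 B + 20 KB + 15 MB) ~ 375 GB + 30 TB + 22.5 PB ~ 22.5 PB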
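As a rough check, the storage figures can be computed directly from the stated inputs (500 M DAU, 3 tweets per user per day, 250 B of text, a 200 KB photo in 10% of tweets, a 300 MB video in 5% of tweets). This is only a sketch: decimal units (1 KB = 1000 B) are an assumption, and the helper `average_bytes` is a made-up name.

```python
# Decimal units are assumed here (1 KB = 1000 B).
KB, MB, GB, TB, PB = 10**3, 10**6, 10**9, 10**12, 10**15


def average_bytes(item_size, percent_of_tweets):
    """Average bytes per tweet contributed by an item present in some tweets."""
    return item_size * percent_of_tweets // 100


tweets_per_day = 500_000_000 * 3                      # DAU * tweets per user

text_total = tweets_per_day * 250                     # every tweet has text
photo_total = tweets_per_day * average_bytes(200 * KB, 10)
video_total = tweets_per_day * average_bytes(300 * MB, 5)
total = text_total + photo_total + video_total

print(f"text:  {text_total / GB:.0f} GB/day")
print(f"photo: {photo_total / TB:.0f} TB/day")
print(f"video: {video_total / PB:.1f} PB/day")
print(f"total: {total / PB:.2f} PB/day")
```

Breaking the total into components makes it obvious that video dominates, so the video size and video share are the assumptions most worth double-checking.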

3) Bandwidth requirements

  • Estimate the daily amount of incoming data to the service.
  • Estimate the daily amount of outgoing data from the service.
  • Estimate the bandwidth in Gbps (gigabits per second) by converting the daily incoming and outgoing data to bits (1 byte = 8 bits) and dividing by the number of seconds in a day.
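The three steps above can be sketched as follows. The daily volumes (10 TB in, 100 TB out) are hypothetical numbers chosen only for illustration, and `bandwidth_gbps` is an illustrative helper name. The day is rounded to 10^5 seconds.

```python
TB = 10 ** 12  # decimal terabyte, assumed for simplicity


def bandwidth_gbps(bytes_per_day, seconds_per_day=10 ** 5):
    """Convert a daily data volume in bytes to an average rate in gigabits/sec."""
    bits_per_day = bytes_per_day * 8          # 1 byte = 8 bits
    bits_per_second = bits_per_day / seconds_per_day
    return bits_per_second / 10 ** 9          # bits/sec -> gigabits/sec


# Hypothetical daily volumes, not taken from any real service.
incoming_gbps = bandwidth_gbps(10 * TB)
outgoing_gbps = bandwidth_gbps(100 * TB)

print(f"incoming: {incoming_gbps:.1f} Gbps")
print(f"outgoing: {outgoing_gbps:.1f} Gbps")
```

Outgoing traffic is often much larger than incoming traffic for read-heavy services, which is why the two are estimated separately.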

4) Latency Estimation

For example, an API call consists of REST call 1, REST call 2, and REST call 3, taking 50 ms, 100 ms, and 150 ms respectively.

Total latency:

  • Sequential → 50 ms + 100 ms + 150 ms = 300 ms
  • Parallel → max(50, 100, 150) = 150 ms
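A minimal sketch of the two cases, using the latencies from the example. The function names are illustrative, and the parallel case assumes the calls are fully independent and that fan-out overhead is negligible.

```python
def sequential_latency_ms(latencies_ms):
    """Calls run one after another, so their latencies add up."""
    return sum(latencies_ms)


def parallel_latency_ms(latencies_ms):
    """Calls run at the same time, so the slowest call sets the total."""
    return max(latencies_ms)


calls_ms = [50, 100, 150]  # REST call 1, 2 and 3

print(f"sequential: {sequential_latency_ms(calls_ms)} ms")
print(f"parallel:   {parallel_latency_ms(calls_ms)} ms")
```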

5) Resource Estimation

  • 1 request ~ 10 ms of CPU time
  • Total requests ~ 10,000 req/sec
  • Total CPU time ~ 10,000 * 10 = 100,000 ms of CPU time per second
  • 1 CPU core provides 1000 ms of CPU time per second
  • Total CPU cores = 100,000 / 1000 = 100
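The core count can be sketched the same way. The `target_utilization` parameter is an added assumption, not part of the example above: with the default of 1.0 it reproduces the figure of 100 cores, and a lower value leaves headroom for traffic spikes.

```python
import math


def cores_needed(requests_per_second, cpu_ms_per_request,
                 target_utilization=1.0):
    """Estimate CPU cores; one core supplies 1000 ms of CPU time per second."""
    cpu_ms_per_second = requests_per_second * cpu_ms_per_request
    cores_at_full_load = cpu_ms_per_second / 1000
    return math.ceil(cores_at_full_load / target_utilization)


cores = cores_needed(10_000, 10)                              # example inputs
cores_with_headroom = cores_needed(10_000, 10, target_utilization=0.7)

print(f"cores at 100% utilization: {cores}")
print(f"cores at 70% utilization:  {cores_with_headroom}")
```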