Scaling data structures

Time and space complexity of common data structures

Probabilistic data structures

Problem 1

Finding top-K frequent elements in a set, where the number of different values is very large.

The solution might be to use a hash table to count frequencies and a min-heap of elements ordered by frequencies of the elements.

Al alternative data structure is called count-min sketch: A two demensional matrix

Problem 2

Given a large number of elements, give a function that gives a subset of elements from the set, and the probability of the elements returned should be proportional to the total number of elements in the set

  • Give the top 10 types of phones sold.
  • Give the top 10 mobile phones in our network that make the most number of calls

Reservoir sampling

Last updated on