Scaling data structures
Time and space complexity of common data structures
Probabilistic data structures
Problem 1
Finding top-K frequent elements in a set, where the number of different values is very large.
The solution might be to use a hash table to count frequencies and a min-heap of elements ordered by frequencies of the elements.
Al alternative data structure is called count-min sketch
: A two demensional matrix
Problem 2
Given a large number of elements, give a function that gives a subset of elements from the set, and the probability of the elements returned should be proportional to the total number of elements in the set
- Give the top 10 types of phones sold.
- Give the top 10 mobile phones in our network that make the most number of calls
Reservoir sampling