
How To Pick A Random Choice Using A Custom Probability Distribution

I have a list of US names and their respective frequencies from the US census website. I would like to generate a random name from this list, with each name chosen according to its given probability. The data is here

Solution 1:

There is an O(1)-time method: see this detailed description of Vose's "alias" method. Unfortunately, it suffers from a high initialization cost. For comparative timings of simpler methods, see Eli Bendersky's blog post. More timings can be found in this issue from the Python issue tracker.
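For readers who want to see the shape of the alias method, here is a minimal sketch in Python. It is not taken from the linked description. The function names `build_alias_table` and `sample_index` are made up for illustration, and the weights in the usage lines are toy values, not census figures. Building the two tables takes one pass over all the weights, and each draw afterwards costs one random index plus one comparison.

```python
import random


def build_alias_table(weights):
    """Build the probability and alias tables for Vose's alias method.

    `weights` are non-negative numbers; they need not sum to 1.
    """
    n = len(weights)
    total = sum(weights)
    # Scale so that the average entry is exactly 1.
    scaled = [w * n / total for w in weights]
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]  # chance of keeping s when column s is drawn
        alias[s] = l         # otherwise the draw is redirected to l
        # l has donated (1 - scaled[s]) of its mass to fill column s.
        scaled[l] = (scaled[l] + scaled[s]) - 1.0
        if scaled[l] < 1.0:
            small.append(l)
        else:
            large.append(l)
    # Whatever is left is 1 up to rounding error.
    for i in large + small:
        prob[i] = 1.0
        alias[i] = i
    return prob, alias


def sample_index(prob, alias, rng=random):
    """Draw one index in O(1): pick a column, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


# Toy data for illustration only (not real census frequencies).
names = ["Smith", "Johnson", "Williams"]
weights = [1, 1, 2]
prob, alias = build_alias_table(weights)
name = names[sample_index(prob, alias)]
```

The setup loop is the initialization cost mentioned above: it has to run once before the first draw, and again whenever the weights change.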

Solution 2:

These days it's practical to enumerate the entire US population (~317 million) if you really need O(1) lookup. Just pick a random number up to 317 million and look up the name stored at that index. (317000000 * 4 bytes = 1.268 GB)
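A scaled-down sketch of this idea, assuming the data is available as a list of names and a parallel list of integer counts. The names `build_lookup` and `pick_name` are hypothetical, and the counts are toy values rather than census data. Each table entry stores the index of a name, so a name with count c occupies c slots.

```python
import random
from array import array


def build_lookup(counts):
    """Return a flat table holding each name's index once per person."""
    table = array("I")  # unsigned int, typically 4 bytes per entry
    for index, count in enumerate(counts):
        table.extend(array("I", [index]) * count)
    return table


def pick_name(names, table, rng=random):
    """O(1) draw: one random position, one table lookup."""
    return names[table[rng.randrange(len(table))]]


# Toy data for illustration only.
names = ["Smith", "Johnson", "Williams"]
counts = [2, 1, 3]
table = build_lookup(counts)
name = pick_name(names, table)
```

The memory used grows with the total count, not with the number of distinct names, which is where the estimate of roughly 1.268 GB for 317 million entries comes from.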

I think there are lots of O(log n) ways. Is there a particular reason you need O(1)? (They will use a lot less memory.)
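One common O(log n) approach, sketched here as an assumption about what is meant, is a binary search over cumulative weights. The helper name `make_sampler` is made up for illustration and the weights are toy values. It stores one number per distinct name instead of one entry per person.

```python
import bisect
import itertools
import random


def make_sampler(names, weights):
    """Return a function that draws a name in O(log n) per call."""
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]

    def pick(rng=random):
        x = rng.random() * total
        # First position whose cumulative weight is strictly greater than x.
        i = bisect.bisect_right(cumulative, x)
        # Guard against x rounding up to total in floating point.
        return names[min(i, len(names) - 1)]

    return pick


# Toy data for illustration only.
pick = make_sampler(["Smith", "Johnson", "Williams"], [2, 1, 3])
name = pick()
```

In Python 3.6 and later, `random.choices(names, weights=weights, k=1)` from the standard library does the same kind of weighted draw without any extra code.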
