The main difference is:
- The random.choices function draws random elements from the sequence with replacement: the same element can be drawn multiple times, so the result may contain duplicates.
- The random.sample function draws random elements from the sequence without replacement: once an element is picked, it is out of the draw and cannot be picked again. The original sequence itself is left untouched.
Imagine a lottery game. In the first case, the ball is put back into the lot after each draw, while in the second case the ball is permanently removed from the set.
Note: "without duplicates" does not mean the same value cannot appear several times in the resulting sample. If several balls in the lot hold the same value and these balls are drawn, each occurrence is reflected in the result. But the same ball, once drawn, can never be drawn again.
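The note above can be checked with a small sketch. The bag below is a made-up example: four distinct balls, but only two distinct values.

```python
import random

# Hypothetical bag: four balls, two carrying "a" and two carrying "b"
bag = ["a", "a", "b", "b"]

# Draw every ball exactly once (without replacement)
drawn = random.sample(bag, k=len(bag))

# The order is random, but each ball is drawn once,
# so the values "a" and "b" each show up twice
print(drawn)
print(sorted(drawn) == sorted(bag))  # True
```

Repeated values in the result come from distinct balls holding the same value, never from one ball being drawn twice.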
Examples: choices vs. sample
import random

pool = "abcd"
print("".join(random.choices(pool, k=5)))
In the example above, we draw 5 random elements from the pool. Once drawn, each value is put back in the pool, so it may be picked again:
addaa
Note: since the draw is done with replacement, you can extract more elements than the population contains. Hence k=5 while the sequence only contains 4 elements.
import random

population = "abcd"
print("".join(random.sample(population, k=4)))
In the example above, we ask random.sample to draw 4 elements from the population without replacement. This means that once an element is picked, it cannot be drawn again:
abdc
Note: since there is no replacement, k cannot be greater than the length of your sequence. Should you try, a ValueError: Sample larger than population or is negative is raised.
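As a minimal sketch of that failure mode, the snippet below asks for one more element than the population holds and catches the resulting exception. The variable names are illustrative only.

```python
import random

population = "abcd"

try:
    # k=5 exceeds len(population) == 4, so sample() cannot succeed
    random.sample(population, k=5)
except ValueError as error:
    message = str(error)
    print(f"ValueError: {message}")

# The same request is fine with choices(), which draws with replacement
print(len(random.choices(population, k=5)))  # 5
```

Catching the error this way is only worthwhile when k comes from user input; otherwise, checking k against len(population) beforehand is usually clearer.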
Use-case Example: Alphanumeric Generation
To generate a sequence of 32 random alphanumeric values:
import random
import string

population = string.ascii_letters + string.digits
result = "".join(random.choices(population, k=32))
print(result)
coqHR7HrsCsKcvGvmlClJI1OnWZjvwH9
Notes:
- It is always a very bad idea to use Python's random module to generate passwords or other sensitive secrets, since it is a pseudo-random generator: its output is deterministic and not cryptographically secure.
- Worse than that, never write your own random function, as it is prone to vulnerabilities. Rather, use a public and scientifically proven method (this is the beauty of maths: being able to generate indecipherable secrets with a generation method known by all).
- Even worse: never base the robustness of your encryption protocol on the secrecy of the generation method.
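For anything sensitive, Python's standard library ships the secrets module, which draws from the operating system's cryptographically secure source. Here is a minimal sketch of the alphanumeric example rebuilt on top of it, assuming a 32-character token is what you need. The helper name generate_token is made up for illustration.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def generate_token(length: int = 32) -> str:
    """Hypothetical helper: build a token one character at a time."""
    # secrets.choice() draws with replacement, like random.choices(),
    # but relies on the OS entropy source instead of a seeded generator
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


token = generate_token()
print(token)
```

The output differs on every run and cannot be reproduced with a seed, which is exactly the property you want for a secret.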
At least those are the (rare) few takeaways I still remember from my Master of Science in Computer Science specialized in Cybersecurity.
And you, what is your score on Root Me? 🏴☠️