What is the difference between Python's random.choices and random.sample functions?

The main difference is:

  • The random.choices function draws random elements from the sequence, possibly including duplicates, since the same element in the sequence can be drawn multiple times (with replacement).

  • The random.sample function draws random elements from the sequence without duplicates, since an element that has been picked is no longer available for the following draws (without replacement).

Imagine a lottery game. In the first case, each ball is put back into the lot after it is drawn, while in the second case the ball is permanently removed from the lot.

Note: without duplicates does not mean that the same value cannot appear several times in the resulting sample. If several balls in the lot hold the same value and these balls are drawn, each occurrence is reflected in the result. But the same ball, once drawn, cannot be drawn again.
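
A minimal sketch of this point, assuming a hypothetical lot in which two distinct balls carry the same value "a" (the names lot and drawn are only illustrative):

```python
import random

# Hypothetical lot: two distinct balls carry the value "a", one carries "b".
lot = ["a", "a", "b"]

# Draw all three balls without replacement.
drawn = random.sample(lot, k=3)

# "a" shows up twice because two different balls hold it; the order is random.
print(drawn)
print(drawn.count("a"))  # 2
```

Each ball is drawn at most once, yet the value "a" appears twice in the result.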

Examples: choices vs. sample

import random

pool = "abcd"
print("".join(random.choices(pool, k=5)))

In the above example, we draw 5 random elements from the pool. Once drawn, a value is put back in the pool, so it can be picked again:

addaa

Note: since the draw is with replacement, you can extract more elements than the population contains. Hence k=5 works although the sequence only contains 4 elements.

import random

population = "abcd"
print("".join(random.sample(population, k=4)))

In the example above, we ask random.sample to draw 4 elements from the population without replacement. This means that once an element is picked, it cannot be picked again:

abdc
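
The removal is only conceptual. As a small sketch (the variable names are illustrative), random.sample returns a new list and leaves the original population untouched:

```python
import random

population = "abcd"
drawn = random.sample(population, k=4)

# The draw is a new list holding each element exactly once, in random order.
print(drawn)

# The original population has not been modified.
print(population)  # abcd
```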

Note: since the draw is without replacement, k cannot be greater than the length of your sequence. Should you try, a ValueError is raised: Sample larger than population or is negative.
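
A short sketch of this failure case, assuming we simply want to catch the error and print it (the names message and error are illustrative):

```python
import random

population = "abcd"
message = None

try:
    # Ask for 5 elements while the population only holds 4.
    random.sample(population, k=5)
except ValueError as error:
    message = str(error)
    print(f"ValueError: {message}")
```

The same call with random.choices would succeed, since each drawn element is put back before the next draw.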

Use-case Example: Alphanumeric Generation

To generate a string of 32 random alphanumeric characters:

import random
import string

population = string.ascii_letters + string.digits
result = "".join(random.choices(population, k=32))
print(result)
coqHR7HrsCsKcvGvmlClJI1OnWZjvwH9

Notes:

  • It is always a very bad idea to use Python’s random module to generate passwords or other sensitive secrets: it relies on a deterministic pseudo-random generator that is not designed for cryptographic use. The standard library's secrets module exists for that purpose.
  • Worse than that, never write your own random function, as it is prone to vulnerabilities. Rather use a public and scientifically proven method (this is the beauty of maths: being able to generate indecipherable secrets with a generation method known by all).
  • Even worse: never base the robustness of your encryption protocol on the secrecy of the generation method.
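
For sensitive values, here is a minimal sketch of the same 32-character generation built on the standard library's secrets module instead of random. The names alphabet and token are illustrative, and the length and character set should be adapted to your own requirements:

```python
import secrets
import string

alphabet = string.ascii_letters + string.digits

# secrets.choice relies on the operating system's secure randomness source.
token = "".join(secrets.choice(alphabet) for _ in range(32))
print(token)
```

As with random.choices, each character is drawn with replacement, so a character can appear several times in the token.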

At least those are the (rare) few takeaways I still remember from my Master of Science in Computer Science specialized in Cybersecurity.

And you, what is your score on Root Me? 🏴‍☠️
