Python keyword-only parameters

Keyword-only parameters mirror Python positional-only parameters: every parameter placed to the right of the bare * in a function signature can only be passed by keyword.

def f(a, *, b, c):
    print(a, b, c)

In the above excerpt, a can be passed either positionally or by keyword. However, b and c have no other option than being passed as keyword arguments:

python> f(1, b=2, c=3)
1 2 3
python> f(a=1, b=2, c=3)
1 2 3

Should you try anything else, the call fails:

python> f(1, 2, 3)
TypeError: f() takes 1 positional argument but 3 were given

Notes:

In a call, Python does not allow positional arguments after keyword arguments; this is a rule of the call syntax itself, independent of the * marker in the signature.
*args collects any extra positional arguments into a tuple.
**kwargs collects any extra keyword arguments into a dict.
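
To make the notes above concrete, here is a minimal sketch. The describe function and its sep parameter are hypothetical names chosen for illustration. Because sep comes after *args, it is keyword-only, exactly like b and c after the bare *.

```python
def describe(*args, sep=", ", **kwargs):
    """Join positional values and keyword pairs (hypothetical helper)."""
    # args is a tuple holding every extra positional argument.
    positional = sep.join(str(value) for value in args)
    # kwargs is a dict holding every extra keyword argument.
    keywords = sep.join(f"{key}={value}" for key, value in kwargs.items())
    return positional, keywords


# sep can only be set by keyword: a fourth positional value would land in args.
print(describe(1, 2, 3, unit="cm"))         # ('1, 2, 3', 'unit=cm')
print(describe(1, 2, sep=" | ", a=1, b=2))  # ('1 | 2', 'a=1 | b=2')
```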

Python positional-only parameters

Positional-only parameters were introduced in Python 3.8. They let you specify which parameters must be passed positionally, i.e. which parameters cannot be passed by keyword. The syntax uses a / marker in the signature: every parameter placed to its left becomes positional-only.

def f(a, b, /, c): ...

The above function only accepts calls of the following forms:

python> f(1, 2, 3)
python> f(1, 2, c=3)

Note: c can be passed either positionally or by keyword.

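By contrast, the following call raises TypeError: f() got some positional-only arguments passed as keyword arguments: 'b', because a and b, being on the left of /, cannot be passed by keyword:

python> f(1, b=2, c=3)

The following calls are rejected even earlier, with SyntaxError: positional argument follows keyword argument, since a positional argument can never follow a keyword argument in a call:

python> f(1, b=2, 3)
python> f(a=1, b=2, 3)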
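
Both markers can be combined in one signature. The sketch below uses a hypothetical function g in which a is positional-only, b can go either way, and c is keyword-only. The rejected calls are wrapped in lambdas so that the TypeError can be caught and collected.

```python
def g(a, /, b, *, c):
    return a, b, c


# Accepted: a by position, b either way, c by keyword.
accepted = [g(1, 2, c=3), g(1, b=2, c=3)]

# Rejected at call time with a TypeError.
rejected = []
for bad_call in (lambda: g(a=1, b=2, c=3), lambda: g(1, 2, 3)):
    try:
        bad_call()
    except TypeError as error:
        rejected.append(str(error))

print(accepted)       # [(1, 2, 3), (1, 2, 3)]
print(len(rejected))  # 2
```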

Use cases

The / positional-only parameter syntax is helpful in the following use cases:

It precludes keyword arguments from being used when the parameter's name is unnecessary, confusing or misleading, e.g. len(obj=my_list). As stated in the documentation, the obj keyword argument would here impair readability.
It allows the name of the parameter to be changed in the future without breaking client code, since no caller can depend on that name. The name also stays available as a key in **kwargs:

def transform(name, /, **kwargs):
    print(name.upper())

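In the above snippet, name always refers to the positional argument, while a name=... keyword passed by the caller lands in kwargs and can be read with kwargs.get("name"); the two do not collide.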

Last but not least, it can also be used to prevent pre-set parameters from being overwritten in functions that still need to accept arbitrary keyword arguments (via kwargs):

def initialize(name="unknown", /, **kwargs):
    print(name, kwargs)

The name parameter is protected and cannot be overwritten by keyword; a name=... keyword argument is captured by kwargs instead:

python> initialize(name="olivier")
unknown {'name': 'olivier'}

Notes:

In a call, Python does not allow positional arguments after keyword arguments; this is a rule of the call syntax itself, independent of the / and * markers in the signature.
*args collects any extra positional arguments into a tuple.
**kwargs collects any extra keyword arguments into a dict.

Python Pickle Serialization

pickle allows you to serialize and de-serialize Python objects to save them into a file for future use. You can then read this file and extract the stored Python objects, de-serializing them so they can be integrated back into the code’s logic.

You just need two basic functions: pickle.dump(object, file) and pickle.load(file).

Below is a round-trip example:

import pickle

FILENAME = "tmp.pickle"
original_list = ["banana", "apple", "pear"]

# The with statement closes the file automatically when the block exits.
with open(FILENAME, "wb") as file:
    pickle.dump(original_list, file)

with open(FILENAME, "rb") as file:
    retrieved_list = pickle.load(file)

print(retrieved_list)  # ['banana', 'apple', 'pear']
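
When no file is needed, the same round trip can be done in memory with pickle.dumps() and pickle.loads(), which work on bytes. This is a minimal sketch; the inventory dictionary is made-up sample data.

```python
import pickle

# Made-up sample data mixing several built-in container types.
inventory = {
    "fruits": ["banana", "apple", "pear"],
    "stock": {"banana": 12, "apple": 0},
    "prices": (0.25, 0.40),
}

payload = pickle.dumps(inventory)  # serialize to a bytes object
restored = pickle.loads(payload)   # rebuild an equal, independent object

print(type(payload).__name__)  # bytes
print(restored == inventory)   # True
print(restored is inventory)   # False
```

As the pickle documentation warns, only unpickle data you trust: loading a crafted payload can execute arbitrary code.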

Python Itertools Cycle

The itertools.cycle() function is a nice way to iterate through an iterable (e.g. a list) indefinitely. When all the elements are exhausted, they are read again from the beginning.

from itertools import cycle

duty_persons = ["Olivier", "David", "Goran",]
duty_persons_cycle = cycle(duty_persons)

for _ in range(6):
    print(next(duty_persons_cycle))

The above snippet prints the following output:

Olivier
David
Goran
Olivier
David
Goran

As the above example suggests, cycle is quite helpful in a couple of situations:

establishing a rolling list of duty persons, rotating on a regular basis;
cycling between hosts when scraping the web, to get around anti-bot policies;
any other use case you might think of…
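
The duty-roster use case can be sketched by pairing cycle() with zip(). The weekdays list is an assumption added for illustration. Since zip() stops at its shortest input, the infinite cycle is consumed only as far as needed.

```python
from itertools import cycle

duty_persons = ["Olivier", "David", "Goran"]
weekdays = ["Mon", "Tue", "Wed", "Thu", "Fri"]  # hypothetical schedule

# zip() stops when weekdays is exhausted, so the infinite cycle is safe here.
roster = list(zip(weekdays, cycle(duty_persons)))

for day, person in roster:
    print(day, person)
```

Running it assigns Olivier to Mon and Thu, David to Tue and Fri, and Goran to Wed.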

Python uuid

uuid is a Python module providing immutable UUID (Universally Unique IDentifier) objects and functions to generate them, e.g. uuid4().

The generated UUIDs are unique for all practical purposes: uuid1() builds them from the current time and the host's network address, while uuid4() builds them from random numbers.

python> from uuid import uuid4
python> uuid_object = uuid4()
python> uuid_object
UUID('ce8d1fee-bc31-406b-8690-94c01caabcb6')

python> str(uuid_object)
'ce8d1fee-bc31-406b-8690-94c01caabcb6'

These can be used to generate random strings that serve as unique identifiers within a given namespace, e.g. to generate temporary filenames on the fly under a specific folder:

for _ in range(10):
    file = open(f"/tmp/{uuid4()}.txt", "a")
    file.write("hello world!")
    file.close()

Note: although the above snippet works like a charm, it is better to use open() as a context manager (with statement) to make sure close() is always called, should an error occur during the write operation.
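
A sketch of that context-manager variant follows. To keep it self-contained and avoid littering /tmp, it writes into a throwaway folder created with tempfile.TemporaryDirectory(), which is an assumption of this example rather than part of the original snippet.

```python
import tempfile
from pathlib import Path
from uuid import uuid4

# The temporary folder and its content are deleted when the outer block exits.
with tempfile.TemporaryDirectory() as folder:
    for _ in range(10):
        path = Path(folder) / f"{uuid4()}.txt"
        # The file is closed automatically, even if write() raises.
        with open(path, "a") as file:
            file.write("hello world!")
    created = sorted(entry.name for entry in Path(folder).iterdir())

print(len(created))
```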

Hazardous: uuid1() may compromise privacy since it embeds the network address of the computer in the generated ID, from which it can be read back. To prevent this, prefer uuid4(), which is random, or uuid5(), which derives the ID from a namespace and a name.

Note: to know more about URIs, URLs and URNs, see https://olivierbenard.fr/change-the-url-of-a-git-repository/.
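
The two alternatives behave quite differently, as this small sketch suggests: uuid5() is deterministic for a given namespace and name, while uuid4() yields a fresh random value on each call. The domain name passed to uuid5() is just an example input.

```python
from uuid import NAMESPACE_DNS, uuid4, uuid5

# Same namespace and name always give the same UUID.
first = uuid5(NAMESPACE_DNS, "olivierbenard.fr")
second = uuid5(NAMESPACE_DNS, "olivierbenard.fr")
print(first == second)  # True

# Two random UUIDs are different for all practical purposes.
random_a, random_b = uuid4(), uuid4()
print(random_a == random_b)

# The version field tells which function produced the UUID.
print(first.version, random_a.version)  # 5 4
```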

Python extend vs. append

append() adds the given object at the end of the list. The length is incremented by exactly 1.
extend() extends the list by appending the elements of the given iterable one by one. The length is incremented by the number of elements.

In more detail

The append() method is straightforward:

python> a = [1,2,3]
python> a.append(4)
python> a
[1, 2, 3, 4]

However, what happens if you need to add multiple elements at the same time? You could call the append() method several times, but would it work to pass the list of new elements directly as a parameter? Let's give it a try:

python> a = [1,2,3]
python> a.append([4,5])
python> a
[1, 2, 3, [4, 5]] # vs. expected [1, 2, 3, 4, 5]

The extend() method is intended to solve this problem:

python> b = [1,2,3]
python> b.extend([4,5])
python> b
[1, 2, 3, 4, 5]

Note: the += operator is equivalent to an extend() call.

python> a = [1,2,3]
python> a += [4,5]
python> a
[1, 2, 3, 4, 5]
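
One nuance is worth a short sketch: += mutates the list in place, like extend(), whereas the plain + operator builds a new list. The alias variables below are only there to make the difference visible.

```python
a = [1, 2, 3]
alias_a = a          # second name for the same list object
a += [4, 5]          # in place, like a.extend([4, 5])
print(alias_a)       # [1, 2, 3, 4, 5]

b = [1, 2, 3]
alias_b = b
b = b + [4, 5]       # builds a new list and rebinds b
print(alias_b)       # [1, 2, 3]
print(b)             # [1, 2, 3, 4, 5]

# extend() accepts any iterable, not only lists.
a.extend(range(6, 8))
print(a)             # [1, 2, 3, 4, 5, 6, 7]
```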

Python Walrus Operator

The walrus operator := is a nice way to avoid repetitions of function calls and statements. It simplifies your code. You can compare the two following code snippets:

grades = [1, 12, 14, 17, 5, 8, 20]

stats = {
    'nb_grades': len(grades),
    'sum': sum(grades),
    'mean': sum(grades)/len(grades)
}

grades = [1, 12, 14, 17, 5, 8, 20]

stats = {
    'nb_grades': (nb_grades := len(grades)),
    'sum': (sum_grades := sum(grades)),
    'mean': sum_grades/nb_grades
}

Note: The parentheses are mandatory here for the assignment expression to be valid.

Same goes for function calls:

foo = "hello world!"
if (n := len(foo)) > 4:
    print(f"foo is of length {n}")

foo is of length 12

In the above snippet, the len() function has only been called once instead of twice. More generally, you can assign values to variables on the fly without having to call the same function more than once.

Important: The Python walrus operator := (officially known as the assignment expression operator) was introduced in Python 3.8. This means that code using it will no longer run on older Python versions.

Note: The affectionate "walrus operator" nickname is due to its resemblance to the eyes and tusks of a walrus.

More examples

python> [name for _name in ["OLIVIER", "OLIVIA", "OLIVER"] if "vi" in (name := _name.lower())]
['olivier', 'olivia']

In this example, we are iterating through the list, storing each item of the list into the temporary _name variable. Then, we apply the lower() string method on the _name object, turning the upper case string into lower case. Next, we store the lower case value into the name variable using the walrus operator. Finally, we filter the values using the predicate to only keep the names containing “vi”.

The alternative without the walrus operator would have been:

python> [name.lower() for name in ["OLIVIER", "OLIVIA", "OLIVER"] if "vi" in name.lower()]
['olivier', 'olivia']

As you can see, the above code snippet is less "optimized" as you call the lower() method twice on the same object.

You can also use the walrus operator without performing any filtering. The following code works like a charm:

python> [name for _name in ["OLIVIER", "OLIVIA", "OLIVER"] if (name := _name.lower())]
['olivier', 'olivia', 'oliver']

However, this is highly counter-intuitive and invites errors. The presence of if suggests that a conditional check and filtering are in place, which is not the case here (the condition would only drop empty strings). The codebase does not benefit from such a design since the following snippet, on top of being clearer, gives the same outcome:

python> [name.lower() for name in ["OLIVIER", "OLIVIA", "OLIVER"]]
['olivier', 'olivia', 'oliver']

Note: Developers tend to be very smart people. We sometimes like to show off our smarts by demonstrating our mental juggling abilities. Resist the temptation to write complex code. Programming is a social activity (e.g. all the collaborative open source projects). Be professional and keep the nerdy brilliant workarounds for your personal projects. Follow the KISS principle: Keep It Simple, Stupid. Clarity is all that matters. You want to maximize the codebase discoverability.

Last but not least, you can also use the walrus operator inside while conditional statements:

while (answer := int(input())) != 42:
    print("This is not the Answer to the Ultimate Question of Life, the Universe, and Everything.")

> python script.py
7
This is not the Answer to the Ultimate Question of Life, the Universe, and Everything.
3
This is not the Answer to the Ultimate Question of Life, the Universe, and Everything.
42
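
The same while pattern is handy when reading a stream until it is exhausted. In the sketch below, an in-memory io.StringIO stands in for a log file (the log lines are made up), and a second walrus keeps the regex match around so that it is computed only once.

```python
import io
import re

# Made-up log content; io.StringIO behaves like an open text file.
log = io.StringIO("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")

errors = []
# readline() returns "" at end of file, which is falsy and ends the loop.
while (line := log.readline()):
    # The match object is reused in the body instead of matching twice.
    if (match := re.match(r"ERROR (.+)", line)):
        errors.append(match.group(1))

print(errors)  # ['disk full', 'timeout']
```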

Note: In the above examples we have used list comprehension. This article (in progress) explains this design in more detail.

One more thing

You cannot do a plain assignment with the walrus operator. At least, it is not that easy:

python> a := 42
  File "<stdin>", line 1
    a := 42
      ^^
SyntaxError: invalid syntax

For the above code snippet to work, you need to enclose the assignment expression in parentheses:

python> a = 42
python> (a := 18)
18
python> print(a)
18

Who told you that Software Developers were not sentimental? ❤️

Do not return null values

It is always a bad idea to write a method returning a null value (None in Python) because it requires the caller to remember to check for it:

it foists problems upon the calling methods, postponing conditional checks and creating work for later that one might forget to implement.
it invites errors; all it takes is one missing None check to have your application spinning out of control.
if you are still tempted to return None from a method, consider raising an exception or returning a special-case object instead.

Note: this works well for methods returning iterable types, e.g. list, set, str, as you can just return an empty one. Returning an empty custom object such as a class instance is hairier. In such edge cases only, you can return None.

from config.constants import REGISTERED_ITEMS

def retrieve_registered_item_information(item):
    if item is not None:
        item_id = item.get_id()
        if item_id is not None:
            return REGISTERED_ITEMS.get_item(item_id).get_info()

As a demonstration of our second point, did you notice that there is no None check in the last line? What if the item is not found among the REGISTERED_ITEMS and get_item() returns None? Calling get_info() on None will raise an AttributeError.
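
One way to follow the "raise an exception instead" advice is sketched below. Everything in it is hypothetical: REGISTRY is a plain dict standing in for REGISTERED_ITEMS, and ItemNotRegisteredError and retrieve_info are illustrative names, not part of the original codebase.

```python
class ItemNotRegisteredError(LookupError):
    """Raised when an item id is unknown (hypothetical exception)."""


# Hypothetical stand-in for REGISTERED_ITEMS: item id -> item information.
REGISTRY = {"42": "some info"}


def retrieve_info(item_id: str) -> str:
    """Return the item information or raise; never return None."""
    try:
        return REGISTRY[item_id]
    except KeyError:
        raise ItemNotRegisteredError(f"item {item_id!r} is not registered") from None


print(retrieve_info("42"))  # some info
try:
    retrieve_info("7")
except ItemNotRegisteredError as error:
    print(error)  # item '7' is not registered
```

The caller can no longer forget the check silently: either it gets a usable value or the failure surfaces at the exact point where the lookup went wrong.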

Example

You have the following structured JSON object from which you want to extract the id:

{
    "id": "42",
    "name": "some_name",
    "data": [...]
}

def get_object_id(object: dict) -> str | None:
    candidate_id = None
    try:
        candidate_id = object["id"]
    except KeyError as message:
        logger.error(f"Error retrieving the object id: {message}")
    return candidate_id

The above method is not ideal:

The return type mixes str and None. You want your method to be type-consistent instead.
The | operator in type annotations is only accepted from Python 3.10 onwards; older versions reject it.

Instead, always favour the following accessor method as a nice remedy:

def get_object_id(object: dict) -> str:
    candidate_id = ""
    try:
        candidate_id = object["id"]
    except KeyError as message:
        logger.error(f"Error retrieving the object id: {message}")
    return candidate_id

There are multiple reasons and benefits for that:

You remove the return type ambiguity: whatever happens, you always return a string value.
It removes the type annotation error you might otherwise get on older Python versions.

Caution: last but not least, note that a Python function implicitly returns None whenever it finishes without returning a value, whether you add a bare return statement or not. The three following code snippets are equivalent:
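
As a more compact variant of the same idea, dict.get() with a default avoids the try/except altogether. This is only a sketch: the parameter is renamed obj to avoid shadowing the built-in object, and the logger is created with the standard logging module since the original snippet does not show where it comes from.

```python
import logging

logger = logging.getLogger(__name__)


def get_object_id(obj: dict) -> str:
    # dict.get() returns the default "" instead of raising KeyError.
    candidate_id = obj.get("id", "")
    if not candidate_id:
        logger.error("Error retrieving the object id: 'id' is missing or empty")
    return candidate_id


print(repr(get_object_id({"id": "42", "name": "some_name"})))  # '42'
print(repr(get_object_id({"name": "no_id"})))                  # ''
```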

def foo():
    pass

def faa():
    return

def fii():
    return None

You can try it yourself:

python> result = foo()
python> print(result)
None

The advantage of explicitly using a return is that it exits the function immediately, much like break exits a loop:

def fuu():
    return 42
    a = 5
    return a

python> fuu()
42

Notes:

We have used a logger object to handle logs for us. More on python logging in this article (in progress).
We have prefixed the names of our accessor methods using get. More on how to find meaningful names for your variables in this article (in progress).

As a conclusion, you do not want to rock the boat. Be careful when returning a null value and always favour other options. Your code will be way cleaner and you will minimize the chance of getting an error 🛶

What is the difference between the random.choices and random.sample Python functions?

The main difference is:

The random.choices function draws random elements from the sequence, possibly including duplicates, since the same element in the sequence can be drawn multiple times (with replacement).
The random.sample function draws random elements from the sequence without duplicates, since once an element is picked it is removed from the sequence to sample (without replacement).

Imagine a lottery game. In the first case, we put the ball back into the drawing lot, while in the second case the ball is definitively removed from the set.

Note: without duplicates does not mean the same value cannot be seen several times in the resulting sample. If several balls hold the same value in the lot and these balls are drawn, the repetition is reflected in the result. But the same ball, once drawn, cannot be drawn again.

Examples: choices vs. sample

import random

pool = "abcd"
print("".join(random.choices(pool, k=5)))

In the above example we draw 5 random elements from the pool. Once drawn, a value is put back into the pool so it can be picked again:

addaa

Note: since elements are drawn with replacement, you can extract more elements than the population originally contains. Hence k=5 while the sequence only contains 4 elements.

import random

population = "abcd"
print("".join(random.sample(population, k=4)))

In the above example, we ask random.sample to draw 4 elements from the population without replacement. This means that once an element is picked, it is removed from the population:

abdc

Note: since elements are drawn without replacement, k cannot be greater than the length of your sequence. Should you try, you will get a ValueError: Sample larger than population or is negative.
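
Since the draws are random, the exact output changes on every run, but the properties discussed above can be checked directly. This is a minimal sketch using the same "abcd" population.

```python
import random

population = "abcd"

with_replacement = random.choices(population, k=10)
without_replacement = random.sample(population, k=4)

# Drawing all 4 elements without replacement yields a permutation.
print(sorted(without_replacement) == sorted(population))  # True

# 10 draws out of 4 distinct values necessarily contain repeats.
print(len(set(with_replacement)) < len(with_replacement))  # True

# Asking sample() for more elements than available raises ValueError.
try:
    random.sample(population, k=5)
    oversampling_failed = False
except ValueError:
    oversampling_failed = True
print(oversampling_failed)  # True
```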

Use-case Example: Alphanumeric Generation

To generate a sequence of 32 random alphanumeric characters:

import random
import string

population = string.ascii_letters + string.digits
result = "".join(random.choices(population, k=32))
print(result)

coqHR7HrsCsKcvGvmlClJI1OnWZjvwH9

Notes:

It is always a very bad idea to use Python's random module to generate passwords or other sensitive secrets since it is only pseudo-random, hence predictable; the standard library provides the secrets module for that purpose.
Worse than that, never write your own random function as it is prone to vulnerabilities. Rather use a public and scientifically proven method (this is the beauty of maths: being capable of generating indecipherable secrets, with the generating method known by all).
Even worse: never base the robustness of your encryption protocol on the secrecy of the generation method.

At least those are the (rare) few takeaways I still remember from my Master of Science in Computer Science specialized in Cybersecurity.
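
For secrets, a variant of the alphanumeric generation relying on the standard secrets module might look like the sketch below. The length of 32 mirrors the earlier example, and the alphabet and token names are illustrative.

```python
import secrets
import string

alphabet = string.ascii_letters + string.digits

# secrets.choice() draws from the operating system's secure randomness source.
token = "".join(secrets.choice(alphabet) for _ in range(32))
print(token)
```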

And you, what is your score on Root Me? 🏴‍☠️

Why use snake case

Snake case is a style of writing in which each space is replaced by an underscore and letters are written in lowercase:

this_is_what_the_snake_case_style_looks_like

Since the snake_case format is mandatory for some objects, it is easier to stick to it and generalise its usage throughout.

It is important that you use snake case because your Python code might simply not work otherwise:

from helpers.math-module import incr

def test_incr() -> None:
    result = incr(42)
    print(result == 43)

if __name__ == "__main__":
    test_incr()

> python main.py
File "path/to/snake_case_project/main.py", line 1
from helpers.math-module import incr
                 ^
SyntaxError: invalid syntax

Instead, switch to the following layout and syntax:

snake_case_project/
├── helpers
│   ├── __init__.py
│   └── math_module.py
└── main.py

from helpers.math_module import incr

def test_incr() -> None:
    result = incr(42)
    print(result == 43)

if __name__ == "__main__":
    test_incr()

> python main.py
True
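
The reason behind the SyntaxError can be checked directly: a module name used in an import statement must be a valid Python identifier, and a hyphen is read as the minus operator. A quick sketch with str.isidentifier():

```python
# Map each candidate module name to whether it is a valid identifier.
results = {
    name: name.isidentifier()
    for name in ("math-module", "math_module", "MathModule")
}

for name, is_valid in results.items():
    print(f"{name}: {is_valid}")
```

Note that MathModule is also a valid identifier; snake case is a convention on top of the syntax rule, while the hyphenated name is rejected by the language itself.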

Admit that for a language like Python, the snake_case is rather well adapted! 🐍