MyPy missing imports

When running mypy on your codebase, you might sometimes encounter an error similar to this one:

error: Library stubs not installed for "requests"

You can have a look at the official documentation on how to solve missing imports but the quickest way to solve it is to run the following:

mypy --install-types

You might also stumble across the similar untyped import issue:

module is installed, but missing library stubs or py.typed marker [import-untyped]

In that case, you can just create a mypy.ini file populated with the two lines below. printf is used here because echo does not interpret \n the same way in every shell:

printf '[mypy]\nignore_missing_imports = True\n' > mypy.ini
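If you would rather not depend on shell quoting at all, the same file can be produced from Python. The snippet below is only a sketch: the helper name write_mypy_ini is made up for illustration, and it writes into a temporary directory so that no real project file is touched.

```python
import configparser
import tempfile
from pathlib import Path


def write_mypy_ini(directory: Path) -> Path:
    """Write a minimal mypy.ini into `directory` and return its path.

    Hypothetical helper, not part of mypy.
    """
    config = configparser.ConfigParser()
    config["mypy"] = {"ignore_missing_imports": "True"}
    ini_path = directory / "mypy.ini"
    with ini_path.open("w", encoding="utf-8") as handle:
        config.write(handle)
    return ini_path


# Demo in a throwaway directory so no real project file is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    ini_text = write_mypy_ini(Path(tmp)).read_text(encoding="utf-8")

print(ini_text)
```

In a real project you would pass the repository root instead of the temporary directory.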

poetry local config file

Summary

When working with poetry, you can configure poetry with some specifics using the command line interface e.g.:

poetry config virtualenvs.create true

The recommendation here is to always hard-code those configurations in a local file. This can be done by adding the --local flag to the poetry config command:

poetry config --local virtualenvs.create true

The above command will create a poetry.toml file at the root of your project, with the following information:

[virtualenvs]
create = true

This ensures that your configuration is always hard-coded and reproducible.

List the current poetry configuration

This can be done via a simple command:

poetry config --list

Use-case: system-git-client true

I rarely have to define poetry configurations. However, it helps me fix the following problem, which I encounter from time to time:

(1) I have a CI/CD pipeline running on Gitlab;

(2) This pipeline has 3 different stages: quality-checks, build and deploy;

(3) Each of these stages might require python dependencies. These dependencies are managed by poetry. You need the Gitlab CI/CD runner to install them within the virtual environment where the stages are gonna execute their scripts. This means, you need each job to run poetry install. E.g.:

variables:
  GIT_SUBMODULE_STRATEGY: recursive

stages:
  - test
  - build
  - deploy

quality-checks:
  image: "<your-custom-docker-path>/ci-cd-python-test-harness:latest"
  stage: test
  script:
    - <setup-gitlab-ssh-access>
    - <add-safe-git-directories>
    - poetry install
    - make checks

...

Note: here, the last line make checks triggers a Makefile target, running black, mypy, pylint and pytest on the codebase. You can have a look at what this make target looks like in the snippet immediately below:

black:
    poetry run black .

mypy:
    poetry run mypy <your-src-folder>

pylint:
    poetry run pylint <your-src-folder>

test:
    PYTHONWARNINGS=ignore poetry run pytest -vvvs <your-test-folder>

checks: black mypy pylint test

Thus, my CI/CD runs poetry install first, so that the CI/CD runner is able to test my code inside its virtual environment.

(4) Part of the poetry install will install all the dependencies my project contains, including git submodules. This is exactly where our issue lies!

$ poetry install
Creating virtualenv <your-repository>_-py3.10 in /home/gitlab-runner/.cache/pypoetry/virtualenvs
Installing dependencies from lock file
No git repository was found at ../../<your-submodule-repository>.git
Cleaning up project directory and file based variables
00:01
ERROR: Job failed: exit code 1

In order for your project to be able to use git submodules and the CI/CD to run successfully, you need to run the following command:

poetry config --local experimental.system-git-client true

This creates a poetry.toml file with the following lines:

[experimental]
system-git-client = true

This trick should fix the No git repository was found error occurring in your CI/CD pipeline.

Use Python Fixtures in Classes

TL;DR: use scope, @pytest.mark.usefixtures and request.cls to define your fixture as an attribute of the class.

With pytest you can use fixtures to have a nice delimitation of responsibilities within your test modules, sticking to the Arrange-Act-Assert pattern:

import pytest

@pytest.fixture()
def get_some_data():
    yield "get some data"

def test_reading_data(get_some_data):
    assert get_some_data == "get some data"

The above code works, but what if you want to organize your test functions within classes? Naively, you would assume the following to be a fair implementation:

import unittest
import pytest

@pytest.fixture()
def get_some_data():
    yield "get some data"

class TestDummy(unittest.TestCase):

    def test_dummy(self, get_some_data):
        assert get_some_data() == "get some data"

Running poetry run pytest -vvvs tests/path/to/test_module.py will return the following error in the traceback:

E       TypeError: TestDummy.test_dummy() missing 1 required positional argument: 'get_some_data'

Test methods of a unittest.TestCase subclass cannot receive fixtures as arguments, and fixtures cannot be called directly. To use a fixture within such a class, change the above snippet as follows:

import unittest
import pytest

@pytest.fixture()
def get_some_data():
    yield "get some data"

class TestDummy(unittest.TestCase):

    @pytest.fixture(autouse=True)
    def _get_some_data(self, get_some_data):
        self.get_some_data = get_some_data

    def test_dummy(self):
        assert self.get_some_data == "get some data"

Note that _get_some_data is called once per test by default, which is inconvenient if the fixture has to perform a request over the network, e.g. requests.get("https://www.google.com"). You can change this behaviour by adapting the scope:

import unittest
import pytest

@pytest.fixture(scope="module")
def get_some_data():
    yield "get some data"

@pytest.fixture(scope="class")
def define_get_data_attribute(request, get_some_data):
    request.cls._get_some_data = get_some_data

@pytest.mark.usefixtures("define_get_data_attribute")
class TestDummy(unittest.TestCase):

    def test_dummy(self):
        assert self._get_some_data == "get some data"

Note that the request object gives access to the requesting test context, such as the cls attribute. More in the pytest documentation.

mypy disable error

You can disable mypy error codes using the special comment # type: ignore[code, ...]. E.g.:

def f(): # type: ignore[call-arg]
    pass

To disable multiple codes in one line, simply separate the codes with a comma, e.g.:

def f(): # type: ignore[call-arg, arg-type]
    pass
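To make this more concrete, here is a small sketch of a call that mypy would typically flag with the arg-type code, silenced on that line only. The function double is a hypothetical example, not taken from a real codebase.

```python
def double(value: int) -> int:
    """Return twice the given value."""
    return value * 2


# mypy would report [arg-type] here: a str is passed where an int is expected.
# The inline comment silences that code on this line only.
result = double("ab")  # type: ignore[arg-type]

# The call still runs, since "ab" * 2 is valid Python at runtime.
print(result)
```

The comment only hides the report: the mismatch between the annotation and the call is still there, which is why fixing the code is usually the better option.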

You can also edit the mypy.ini configuration file directly to relax checks for the whole codebase. E.g.:

zsh> cat >> mypy.ini <<EOL
heredoc> [mypy]
heredoc> ignore_missing_imports = True
heredoc> EOL

More in this page: mypy.readthedocs.io/en/stable/config_file.html.

Note: it is never good practice to deactivate the error messages mypy raises. Always try to fix the underlying issue. E.g. an arg-type error on a call probably means that the caller and the function signature genuinely disagree, and that one of them should be fixed.

Finally, mypy checks usually go hand in hand with black, pylint and unit test checks. You can combine the suppressions for several checkers in the same inline comment:

@decorator() # type: ignore[call-arg, arg-type] # pylint: disable=no-value-for-parameter
def f(): ...

More on how to disable pylint error checks: olivierbenard.fr/disable-pylint-error-checks.

Note: to know more about mypy error codes, you can visit mypy.readthedocs.io/en/stable/error_codes.html.

Pytest against a wide range of data with Python hypothesis

The Hypothesis library for Python allows you to run your tests against a wide range of data matching a set of specifications, called strategies. In other words, your test function is provided with data matching those specifications and runs your Code Under Test (CUT) against it.

It is a nice way to automatically discover edge cases in your code without you even having to think about it.

Let’s go through an example. Let’s say you want to test the following function:

def divide_list_elements(my_list, denominator):
    return [item/denominator for item in my_list]

python> divide_list_elements([2, 4, 6], 2)
[1.0, 2.0, 3.0]

If you are like me, you would have then implemented your test strategy manually, grouped under a class because it is neat:

import unittest

class TestDivideListElements(unittest.TestCase):

    def test_divide_list_elements_one_element(self):
        result = divide_list_elements([42], 2)
        assert result == [21.0]

    def test_divide_list_elements_no_element(self):
        result = divide_list_elements([], 4)
        assert result == []

zsh> poetry run pytest tests/test_hypothesis.py::TestDivideListElements
collected 2 items

tests/test_hypothesis.py::TestDivideListElements::test_divide_list_elements_no_element PASSED
tests/test_hypothesis.py::TestDivideListElements::test_divide_list_elements_one_element PASSED

======================= 2 passed in 0.13s =======================

Well, all good right? We could have stopped there.

Now, let’s say that instead of manually defining your inputs, you let the hypothesis library manage this for you:

from hypothesis import given
from hypothesis import strategies as st

@given(st.lists(st.integers()), st.integers())
def test_divide_list_elements(input_list, input_denominator):
    result = divide_list_elements(input_list, input_denominator)
    expected = list(map(lambda x: x/input_denominator, input_list))
    assert result == expected

Running the test leaves you with an unexpected outcome:

zsh> poetry run pytest tests/test_hypothesis.py
>   return [item/denominator for item in my_list]
E   ZeroDivisionError: division by zero
E   Falsifying example: test_divide_list_elements(
E       input_list=[0],
E       input_denominator=0,
E   )

tests/test_hypothesis.py:17: ZeroDivisionError

You have obviously forgotten to check for division by 0…

Here is what is so beautiful about hypothesis: it can discover edge cases you have forgotten about.
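To give an intuition of how such a discovery can happen, here is a hand-rolled sketch that throws random inputs at the function until one crashes. It is not how Hypothesis is implemented (Hypothesis is far smarter about generating and shrinking examples). The helper find_falsifying_example, the value ranges and the fixed seed are all assumptions made for the illustration.

```python
import random


def divide_list_elements(my_list, denominator):
    return [item / denominator for item in my_list]


def find_falsifying_example(trials=1000, seed=0):
    """Try random inputs and return the first one that makes the function crash.

    Hypothetical helper: a naive stand-in for what a property-based
    testing library automates.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    for _ in range(trials):
        my_list = [rng.randint(-5, 5) for _ in range(rng.randint(0, 3))]
        denominator = rng.randint(-5, 5)
        try:
            divide_list_elements(my_list, denominator)
        except Exception as error:  # any crash counts as a counter-example
            return my_list, denominator, error
    return None


found = find_falsifying_example()
print(found)
```

Note that an empty list never triggers the crash, even with a denominator of 0, because no division takes place. This is the kind of subtlety that is easy to overlook when writing test inputs by hand.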

Let’s (1) rewrite our function:

def divide_list_elements(my_list: list, denominator: int) -> list:
    assert denominator != 0
    return [item/denominator for item in my_list]

(2) change the tests and (3) add the faulty test case to our test suite:

import pytest
from hypothesis import given, example
from hypothesis import strategies as st


@given(st.lists(st.integers()), st.integers())
@example(input_list=[42], input_denominator=0)
def test_divide_list_elements(input_list, input_denominator):
    if input_denominator == 0:
        # Only the failing call belongs inside the block: statements placed
        # after it would never run, since the exception exits the block.
        with pytest.raises(AssertionError):
            divide_list_elements(input_list, input_denominator)
    else:
        result = divide_list_elements(input_list, input_denominator)
        expected = list(map(lambda x: x/input_denominator, input_list))
        assert result == expected

(4) run the tests again:

zsh> poetry run pytest -s tests/test_hypothesis.py::test_divide_list_elements
collected 1 item

tests/test_hypothesis.py::test_divide_list_elements PASSED

========================= 1 passed in 0.28s =====================

Notes:

  • The assert denominator != 0 statement ensures our function is given correct preconditions (see The Pragmatic Programmer on design by contract and crashing early: “Dead Programs Tell No Lies: A dead program does a lot less damage than a crippled one.”)

  • The @example(input_list=[42], input_denominator=0) statement is using the example decorator, which ensures a specific example is always tested. Here we want to make sure this edge case we missed is always checked.

  • The with pytest.raises(AssertionError) statement ensures that the block of code that follows raises an AssertionError exception. If no exception is raised, the test fails.

To learn more about parametrization: Factorize your pytest functions using the parameterized fixture.

Factorize your pytest functions using the parameterized fixture.

The parameterized decorator is a convenient way to factorize your python test functions, avoid duplicates in your test code and help you stick to the DRY (Don’t Repeat Yourself) principle.

Note: you can use it after having installed the package via pip install parameterized.

Let’s demonstrate this with a quick and easy example. Let’s assume you have a function that returns the sum of the elements within a list:

def sum_list_elements(l):
    return sum(l)

You want to test the behavior of your function using pytest. In your Test Strategy, you want to test this function for different kinds of inputs. A test suite could look like:

def test_sum_list_no_elements():
    result = sum_list_elements([])
    assert result == 0

def test_sum_list_one_element():
    result = sum_list_elements([-2])
    assert result == -2

def test_sum_list_cancelling_elements():
    result = sum_list_elements([-3, 1, 2])
    assert result == 0

def test_sum_list_elements():
    result = sum_list_elements([1, 2, 3])
    assert result == 6

However, this means having a lot of redundant code. You can refactor the suite thanks to the parameterized decorator:

from parameterized import parameterized

@parameterized.expand([
    ([], 0),
    ([-2], -2),
    ([-3, 1, 2], 0),
    ([1, 2, 3], 6)
])
def test_sum_list_elements_suit(inputs, expected):
    result = sum_list_elements(inputs)
    assert result == expected

Here is the result of the tests:

zsh> poetry run pytest tests/test_parametrized.py
collected 4 items

tests/test_parametrized.py::test_sum_list_elements_suit_0 PASSED
tests/test_parametrized.py::test_sum_list_elements_suit_1 PASSED
tests/test_parametrized.py::test_sum_list_elements_suit_2 PASSED
tests/test_parametrized.py::test_sum_list_elements_suit_3 PASSED

======================= 4 passed in 0.01s =======================
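For comparison, a similar table-driven layout can be sketched with the standard library only, using unittest's subTest. This is a different mechanism from the parameterized package: here all cases live inside a single test method. The names TestSumListElements and CASES are made up for the illustration.

```python
import unittest


def sum_list_elements(elements):
    return sum(elements)


class TestSumListElements(unittest.TestCase):
    # Each entry is (inputs, expected), mirroring the table used above.
    CASES = [
        ([], 0),
        ([-2], -2),
        ([-3, 1, 2], 0),
        ([1, 2, 3], 6),
    ]

    def test_sum_list_elements(self):
        for inputs, expected in self.CASES:
            # A failing subTest is reported on its own and does not stop the loop.
            with self.subTest(inputs=inputs):
                self.assertEqual(sum_list_elements(inputs), expected)


# Run the suite programmatically so the snippet is self-contained.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSumListElements)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

The trade-off is that the test runner counts one test instead of four, whereas the decorator generates one named test per case.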

To learn more about parametrization: Pytest Against a Wide Range of Data with Python hypothesis and Automatically Discover Edge Cases.

Disable pylint error checks

TL;DR: use # pylint: disable=error-type inline comments to disable error types or edit the .pylintrc file generated via pylint --generate-rcfile > .pylintrc.

If you are using pylint to run checks on the quality of your Python code, you might want to ignore some of the checks the tool is running on your codebase for you.

You can silence errors with inline comments (e.g. if you still want this check to be performed on your overall codebase but not for this particular snippet):

def f():
    pass

class NotAuthorized(Exception):
    def __init__(self, message=""):
        self.message = message
        super().__init__(self.message)

Running pylint on the above code gives you the following output:

1:0: C0116: Missing function or method docstring (missing-function-docstring)
1:0: C0103: Function name "f" doesn't conform to snake_case naming style (invalid-name)
4:0: C0115: Missing class docstring (missing-class-docstring)

In contrast, the following snippet is rated 10/10 by pylint:

def f(): # pylint: disable=invalid-name, missing-function-docstring
    pass

class NotAuthorized(Exception): # pylint: disable=missing-class-docstring
    def __init__(self, message=""):
        self.message = message
        super().__init__(self.message)

Your code has been rated at 10.00/10 (previous run: 5.00/10, +5.00)

Note: you can disable multiple pylint errors with one single inline comment, using a comma as separator.

If you want to disable a specific error check for the whole codebase, you can create a .pylintrc at the root of your code:

zsh> poetry run pylint --generate-rcfile > .pylintrc

Then, navigate to the [MESSAGES CONTROL] section, editing the following lines with the error types you want to append:

disable=raw-checker-failed,
        bad-inline-option,
        locally-disabled,
        file-ignored,

Notes:

  • It is never good practice to deactivate the error messages pylint raises. Always try to fix the underlying issue. For instance, too-many-arguments on a method probably means that you are missing an intermediary method and should refactor it.

  • pylint checks usually go hand in hand with black, mypy and unit tests. You can group them under one target via a Makefile:

black:
    poetry run black --exclude=<excluded-folder> .

pylint:
    poetry run pylint .

mypy:
    poetry run mypy

test:
    poetry run pytest -vvs tests/

checks: black pylint mypy test

  • To keep my code DRY, I avoid repeating information both in the code and in function/module docstrings. As explained in The Pragmatic Programmer by David Thomas & Andrew Hunt and Clean Code by Robert Martin, such duplication makes the code less maintainable and increases the risk of docstrings drifting out of sync with the code (because they were not updated). The code should be self-explanatory, well structured and stick to good naming conventions. The docstrings are only there to explain the why and not the how. This is why I often decide to silence missing-function-docstring and missing-module-docstring, since I would otherwise be forced to add dummy docstrings.

Python lists with trailing comma

In Python, you might have stumbled across lists ending with a trailing comma. Surprisingly, Python allows it and treats it as valid syntax:

python> ["banana", "apple", "pear",]
["banana", "apple", "pear"]

There are multiple advantages to adopting this convention. Ending your Python list with a trailing comma makes the list easier to edit (reducing the clutter in the git diff output) and makes future changes (e.g. adding an item to the list) less error-prone.

Reducing git diff clutter

Especially when your list spans multiple lines, a trailing comma makes the list easier to edit and reduces the clutter in the diff that git presents to you.

Changing the following list:

names = [
    "Charles de Gaulle",
    "Antoine de Saint-Exupéry",
]

to:

names = [
    "Charles de Gaulle",
    "Antoine de Saint-Exupéry",
    "Bernard Clavel",
]

only involves a one-line change:

names = [
    "Charles de Gaulle",
    "Antoine de Saint-Exupéry",
+   "Bernard Clavel",
]

versus a confusing three-line diff otherwise:

names = [
    "Charles de Gaulle",
-   "Antoine de Saint-Exupéry"
+   "Antoine de Saint-Exupéry",
+   "Bernard Clavel"
]

No more breaking changes

Another advantage of having trailing commas in your Python lists is that it makes changes less error-prone, by removing the risk of missing a comma when adding a new item to the list:

names = [
    "Charles de Gaulle",
    "Antoine de Saint-Exupéry"
    "Bernard Clavel"
]

Note: the above list is syntactically valid but will not return the expected outcome. Instead, it will trigger an implicit string literal concatenation.

['Charles de Gaulle', 'Antoine de Saint-ExupéryBernard Clavel']
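The difference is easy to check by comparing the lengths of the two lists. A minimal sketch, where the variable names with_commas and missing_comma are made up for the illustration:

```python
with_commas = [
    "Charles de Gaulle",
    "Antoine de Saint-Exupéry",
    "Bernard Clavel",
]

# Missing comma after the second item: the two adjacent string literals
# are merged into a single element.
missing_comma = [
    "Charles de Gaulle",
    "Antoine de Saint-Exupéry"
    "Bernard Clavel"
]

print(len(with_commas))    # 3
print(len(missing_comma))  # 2
```

No error or warning is raised by Python itself, which is what makes this mistake easy to miss in a review.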

Multiline Python fstring statement

In Python, you can write a string on multiple lines to increase codebase readability:

python> message = (
    "This is one line, "
    "this one continues.\n"
    "This one is new."
)
python> message
'This is one line, this one continues.\nThis one is new.'
python> print(message)
This is one line, this one continues.
This one is new.

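This is purely visual: Python concatenates adjacent string literals, and the parentheses only let the expression span several lines. No tuple is created, since there is no comma between the pieces.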
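A quick way to check what the parentheses actually produce is to look at the type of the result. In the sketch below, the variable pieces is an addition made for the comparison: it shows what happens once commas separate the parts.

```python
message = (
    "This is one line, "
    "this one continues.\n"
    "This one is new."
)

# Adding commas between the parts changes the meaning entirely:
# the parentheses now enclose a tuple of three strings.
pieces = (
    "This is one line, ",
    "this one continues.\n",
    "This one is new.",
)

print(type(message).__name__)      # str
print(type(pieces).__name__)       # tuple
print("".join(pieces) == message)  # True
```

So a stray comma at the end of one of the lines silently turns the message into a tuple, which is worth keeping in mind when editing such blocks.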

This becomes particularly handy if you are running code quality tools (black, mypy and pylint usually come together). If so, you might have stumbled on line-too-long error messages.

One more example

def greet(name: str) -> None:
    message = (
        f"Hello {name}, this line "
        f"and this one "
        f"will be displayed on the same line.\n"
        f"but not this one"
    )
    print(message)

python> greet("Olivier")
Hello Olivier, this line and this one will be displayed on the same line.
but not this one

Python keyword-only parameters

Similar to Python Positional-Only Parameters but the other way around: parameters placed to the right of the bare * in the signature become keyword-only parameters.

def f(a, *, b, c):
    print(a, b, c)

In the above excerpt, a can be given either as a positional or as a keyword argument. However, b and c can only be passed as keyword arguments:

python> f(1, b=2, c=3)
1 2 3
python> f(a=1, b=2, c=3)
1 2 3

Should you try something else, it will most likely fail:

python> f(1, 2, 3)
TypeError: f() takes 1 positional argument but 3 were given

Notes:

  • In a function call, Python does not allow positional arguments after keyword arguments: f(a=1, 2) is a SyntaxError.
  • *args collects the extra positional arguments into a tuple.
  • **kwargs collects the extra keyword arguments into a dictionary.
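The notes above can be put together in one small sketch. The function describe is a hypothetical example that simply returns how each argument was bound:

```python
def describe(a, *args, b, c=0, **kwargs):
    """Return how each argument was bound, to make the binding rules visible."""
    return {"a": a, "args": args, "b": b, "c": c, "kwargs": kwargs}


# 1 binds to a, (2, 3) is collected by *args, b must be named,
# c keeps its default and extra lands in **kwargs.
bound = describe(1, 2, 3, b=4, extra="x")
print(bound)

# b is keyword-only: extra positional values go to *args, never to b,
# so omitting the keyword raises a TypeError.
try:
    describe(1, 2, 3)
except TypeError as error:
    print(error)
```

Parameters declared after *args are keyword-only in the same way as parameters declared after a bare *.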