What is the Python __init__.py file for?

The Python __init__.py file serves two main functions:

  1. It marks a directory as a Python package, so that other Python files can reuse the resources nested inside it (e.g. the incr function defined in helpers/file1.py):

    from helpers.file1 import incr
    
    result = incr(42)
    assert result == 43
    

    A side effect is that, with some not-recommended workarounds, developers do not have to care about the function's location in your package hierarchy:

        helpers/
        ├── __init__.py
        ├── file1.py
        ├── file2.py
        ├── ...
        └── fileN.py
    

    For that, simply fill the __init__.py file with the following content:

    from .file1 import *
    from .file2 import *
    ...
    from .fileN import *
    

    Therefore, even though it is always good practice to explicitly mention the source, they can simply use:

    from helpers import incr
    
    result = incr(42)
    assert result == 43
    
  2. It is used to define variables or to initialise objects such as logging at the package level and at import time (making them accessible at a global package level):

    from helpers import MY_VAR
    
    print(MY_VAR)
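
Both roles can be tried out without touching a real project. The following is a minimal sketch, assuming a throwaway package called helpers_demo (a made-up name) built in a temporary directory; its __init__.py re-exports incr with a relative import and defines a package-level variable:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package in a temporary directory.
# "helpers_demo" is a hypothetical name used only for this illustration.
root = Path(tempfile.mkdtemp())
package_dir = root / "helpers_demo"
package_dir.mkdir()

# The module holding the actual function.
(package_dir / "file1.py").write_text("def incr(n):\n    return n + 1\n")

# __init__.py re-exports incr (relative import) and defines a package-level variable.
(package_dir / "__init__.py").write_text(
    "from .file1 import incr\n"
    "MY_VAR = 'defined at import time'\n"
)

# Make the temporary directory importable, then import the package.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
helpers_demo = importlib.import_module("helpers_demo")

print(helpers_demo.incr(42))  # 43
print(helpers_demo.MY_VAR)    # defined at import time
```

An explicit `from .file1 import incr` is used here instead of `import *` so that it stays obvious where each name comes from.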
    

Still unclear? Here is a simple example.

First, let’s set the context

You have the following project structure:

playground_packages
├── helpers/
│   └── utils.py
└── main.py

The utils.py file contains:

def incr(n:list[float]) -> list[float]:
    return [x+1 for x in n]

if __name__ == "__main__":
    pass

Note: you could also have used map with a lambda instead. However, this is a nice opportunity to show a list comprehension. The alternative version would have looked like:

list(map(lambda x: x+1, n))
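
As a quick check that the two versions behave the same, here is a small sketch; the function names incr_comprehension and incr_map are made up for the comparison:

```python
def incr_comprehension(n: list[float]) -> list[float]:
    # List comprehension version, as in utils.py.
    return [x + 1 for x in n]


def incr_map(n: list[float]) -> list[float]:
    # Equivalent version built with map and a lambda.
    return list(map(lambda x: x + 1, n))


values = [1, 2, 3, 4, 5]
print(incr_comprehension(values))  # [2, 3, 4, 5, 6]
print(incr_map(values))            # [2, 3, 4, 5, 6]
```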

The main.py file looks like this:

from helpers.utils import incr

def main() -> None:
    result = incr([1,2,3,4,5])
    print(result)

if __name__ == "__main__":
    main()

Notes:

  • Why we haven’t used import helpers.utils or import * is explained here (to do).
  • The if __name__ == "__main__" conditional statement is explained here (to do).

__init__.py to label a folder as a Python package

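Back to our example: if you run the code with the current structure on a Python version older than 3.3, or through a tool that does not support implicit namespace packages, you will get the following error:

> python main.py
Traceback (most recent call last):
File "path/to/playground_package/main.py", line 1, in <module>
    from helpers.utils import incr
ModuleNotFoundError: No module named 'helpers'

This is because the helpers directory is not recognised as a package. A regular package is a folder that contains an __init__.py file. Note that since Python 3.3, a folder without __init__.py can still be imported as an implicit namespace package (PEP 420), so on recent versions this import may succeed. Adding __init__.py remains the explicit and most widely supported way to mark a folder as a package.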

Simply change the structure to the following:

playground_packages
├── helpers/
│   ├── __init__.py
│   └── utils.py
└── main.py

Now, if you try again, it will succeed:

> python main.py
[2, 3, 4, 5, 6]

The main take-away is:

If you want to split up your code into different folders and files (to make it more readable and easier to debug), you should create an __init__.py file in each folder, so that Python treats each folder as a regular package and you can refer to it in your code using import.
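
To see what the __init__.py file changes on a recent Python version, here is a minimal sketch. It assumes two hypothetical folders, pkg_with_init and pkg_without_init, created in a temporary directory, and inspects how Python imports each of them:

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# Two hypothetical folders: one with an __init__.py file, one without.
(root / "pkg_with_init").mkdir()
(root / "pkg_with_init" / "__init__.py").write_text("")
(root / "pkg_without_init").mkdir()
(root / "pkg_without_init" / "utils.py").write_text(
    "def incr(n):\n    return [x + 1 for x in n]\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

regular = importlib.import_module("pkg_with_init")
namespace = importlib.import_module("pkg_without_init")

# A regular package is backed by its __init__.py file.
print(Path(regular.__file__).name)           # __init__.py

# A namespace package has no file backing the package itself.
print(getattr(namespace, "__file__", None))  # None

# On Python 3.3+, modules inside the folder without __init__.py are still importable.
utils = importlib.import_module("pkg_without_init.utils")
print(utils.incr([1, 2, 3]))                 # [2, 3, 4]
```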

__init__.py to define global variables

In our previous example, the __init__.py file is empty. We can edit it, adding the following line:

MY_LIST = [2,4,6,8,10]

This variable can then be imported and used in main.py:

from helpers import MY_LIST
from helpers.utils import incr

def main() -> None:
    result = incr(MY_LIST)
    print(result)

if __name__ == "__main__":
    main()

> python main.py
[3, 5, 7, 9, 11]

Note: it is better to define variables in a config.py or constants.py file rather than in an __init__.py file. However, __init__.py comes in handy when you need to instantiate objects such as logging or dynaconf. More on that will follow in another article.
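
As a rough idea of what such an initialisation could look like, here is a sketch of possible content for helpers/__init__.py. The logger name and the log message are assumptions made for illustration, not a prescribed setup:

```python
import logging

# Hypothetical content for helpers/__init__.py.
MY_LIST = [2, 4, 6, 8, 10]

# Create the package logger once, at import time.
logger = logging.getLogger("helpers")

# A NullHandler keeps the package silent unless the application configures logging.
logger.addHandler(logging.NullHandler())

logger.debug("helpers package initialised with %d values", len(MY_LIST))
```

Any module of the package can then fetch the same logger with `logging.getLogger("helpers")`.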

You are now ready to fit your code together like Russian dolls 🪆

Use Poetry as Python Package Manager

Installing poetry is super easy. On macOS, simply run:

brew install poetry

Now, let’s have a look at how to use it.

Poetry Cheat Sheet

This section gathers the Poetry commands you will need most often. You can save the link and refer back to it later.

poetry init
poetry install
poetry update
poetry add <your-python-package>
poetry run

Getting started with an example

  1. Create your Python project and move into its directory:

    mkdir playground-poetry && cd playground-poetry
    
  2. Initialise Poetry. You will be prompted to fill in the following configuration fields:

    > poetry init
    Package name [playground-poetry]:
    Version [0.1.0]:
    Description []:
    Author [None, n to skip]:
    License []:
    Compatible Python versions [^3.10]:
    Would you like to define your main dependencies interactively? (yes/no) [yes]
    Would you like to define your development dependencies interactively? (yes/no) [yes]
    Do you confirm generation? (yes/no) [yes]
    
  3. This will generate the pyproject.toml configuration file:

    [tool.poetry]
    name = "playground-poetry"
    version = "0.1.0"
    description = "A primer on poetry."
    authors = ["John Doe <john.doe@gmail.com>"]
    
    [tool.poetry.dependencies]
    python = "^3.10"
    
    [tool.poetry.dev-dependencies]
    
    [build-system]
    requires = ["poetry-core>=1.0.0"]
    build-backend = "poetry.core.masonry.api"
    

After generation, your project’s architecture should look like the following:

playground-poetry
└── pyproject.toml

Add a package

To add a package, use the poetry add command, e.g.:

poetry add black

Here, we install the Python code formatter black.

Note: for more general codebase formatting, I recommend super-linter.

You will see that our pyproject.toml configuration file has been updated and now contains a reference to the black package:

[tool.poetry]
name = "playground-poetry"
version = "0.1.0"
description = "A primer on poetry."
authors = ["John Doe <john.doe@gmail.com>"]

[tool.poetry.dependencies]
python = "^3.10"
black = "^22.10.0"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

Note: wondering what the weird ^ sign stands for? You will soon find an article about it.
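
In the meantime, here is a simplified sketch of what the caret means in Poetry's version constraints: updates are allowed as long as the left-most non-zero part of the version does not change. The helper name caret_bounds is made up for this illustration, and it only handles full MAJOR.MINOR.PATCH versions:

```python
def caret_bounds(version: str) -> tuple[str, str]:
    """Return (inclusive lower bound, exclusive upper bound) for ^version.

    Simplified sketch: only handles full MAJOR.MINOR.PATCH integer versions.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if major > 0:
        upper = (major + 1, 0, 0)
    elif minor > 0:
        upper = (0, minor + 1, 0)
    else:
        upper = (0, 0, patch + 1)
    return version, ".".join(str(part) for part in upper)


print(caret_bounds("22.10.0"))  # ('22.10.0', '23.0.0')
print(caret_bounds("0.2.3"))    # ('0.2.3', '0.3.0')
```

So black = "^22.10.0" reads as "at least 22.10.0, but below 23.0.0".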

You can check that black is indeed installed and accessible within Poetry's virtual environment via:

> poetry run black --version
black, 22.10.0 (compiled: yes)
Python (CPython) 3.10.4

The pyproject.toml file is not the only thing that has changed. If you look at our project’s architecture, you will see that it now contains an additional poetry.lock file:

playground-poetry
├── poetry.lock
└── pyproject.toml

Note: poetry.lock records the exact resolved versions of your dependencies, much like Terraform records its state. If you are new to Poetry, this might be too much detail right now. If you want to know more about poetry.lock, an article will follow soon!

Run python code on a poetry environment

Imagine that you now have some Python code that requires a couple of dependencies to run (e.g. black or pandas). For example:

"""
A simple module containing maths methods.
"""


def add(number_1: float, number_2: float) -> float:
    """
    Add the numbers.
    Args:
        number_1 (float): the first number.
        number_2 (float): the second number.
    Returns:
        float: the sum of both numbers.

    >>> add(-2, 1)
    -1
    >>> add(42, 0)
    42
    """
    return number_1 + number_2


def main() -> None:
    """
    Main function.
    """

    res = add(4, 7)
    print(res)


if __name__ == "__main__":
    main()

with the following project architecture:

playground-poetry
├── playground_poetry
│   ├── __init__.py
│   └── main.py
├── poetry.lock
└── pyproject.toml

Note: fair enough, our project does not really need those dependencies at this point, but let’s say that this is just an extract and that other parts of the code do actually use them.

You have those dependencies installed in your Poetry environment (you can see them in the dependencies section of pyproject.toml).

You then need to execute your python code within the umbrella of this poetry virtual environment.

This is done using poetry run python <your-python-file>.

In our example:

> poetry run python playground_poetry/main.py
11
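
The docstring of add also contains two doctest examples (the lines starting with >>>). As a minimal sketch, assuming a shortened copy of the function, they can be checked with the standard doctest module:

```python
import doctest


def add(number_1: float, number_2: float) -> float:
    """
    Add the numbers.

    >>> add(-2, 1)
    -1
    >>> add(42, 0)
    42
    """
    return number_1 + number_2


# Collect the examples from add's docstring and run them.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(add, name="add", module=False, globs={"add": add}):
    runner.run(test)

results = runner.summarize(verbose=False)
print(results.failed, results.attempted)  # 0 2
```

In the project itself, the same check can be run from the command line with poetry run python -m doctest playground_poetry/main.py, which prints nothing when all examples pass.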

Note: we recommend a similar architecture for your projects, with the package folder named in snake_case, as it makes developing a Python package easier.

your-project-name
└── your_project_name
    ├── __init__.py
    └── main.py

Get started on a cloned poetry project

Now let’s say you inherit an existing Poetry project that already has pyproject.toml and poetry.lock files.

The first time, you need to create the virtual environment and install the dependencies, based on the pyproject.toml file:

poetry install

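If a poetry.lock file exists, this installs the exact versions it pins. Otherwise, Poetry resolves the dependencies from pyproject.toml and creates the poetry.lock file.

You can also update the dependencies to the latest versions allowed by pyproject.toml, which rewrites the poetry.lock file: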

poetry update

Note: for more information, see https://python-poetry.org/docs/cli/.

Go the extra mile using poetry run and a Makefile

Let’s improve our example project.

Have you noticed that, to run our main.py file, you need to explicitly state the whole path?

poetry run python playground_poetry/main.py

You can make things better by editing the pyproject.toml file as follows:

[tool.poetry]
name = "playground-poetry"
version = "0.1.0"
description = "A primer on poetry."
authors = ["John Doe <john.doe@gmail.com>"]
packages = [{include="playground_poetry"}]

[tool.poetry.dependencies]
python = "^3.10"
black = "^22.10.0"
pandas = "^1.5.1"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

[tool.poetry.scripts]
main = "playground_poetry.main:main"

and from now on, do the same thing as before with a shorter command:

> poetry run main
11

Note: this works thanks to three things: the packages line, the __init__.py file inside playground_poetry that makes main.py importable, and the [tool.poetry.scripts] section.
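
The value "playground_poetry.main:main" follows a module:function pattern. As a rough sketch of how such a string can be resolved (this is not Poetry's actual implementation, and the helper name resolve_entry_point is made up), using a standard-library target so that it runs anywhere:

```python
import importlib
from typing import Any, Callable


def resolve_entry_point(spec: str) -> Callable[..., Any]:
    """Turn a 'module:function' string into the function it names."""
    module_name, _, function_name = spec.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# "math:sqrt" stands in for "playground_poetry.main:main" in this sketch.
sqrt = resolve_entry_point("math:sqrt")
print(sqrt(16.0))  # 4.0
```

The import step is why the package needs to be importable: the part before the colon is imported like any other module, and the part after it is looked up inside that module.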

But that’s not all: we can do even better. Let’s make this command even shorter, saving it under a Makefile command.

In our example project, let’s add a Makefile with the following lines:

# The recipe line below must start with a tab, not spaces.
.PHONY: main
main:
	poetry run main

The final structure should look like the following:

playground-poetry
├── playground_poetry
│   ├── __init__.py
│   └── main.py
├── Makefile
├── poetry.lock
└── pyproject.toml

Finally, you can run the main function using:

> make main
poetry run main
11

And you thought our main job was to “write” code? Less is more! 😇

What you have learned

  • You can create a poetry environment from scratch to manage your python dependencies.
  • You can reuse an existing one.
  • You can scale and automate using Makefile commands.
  • You got a short primer on software development standardization and best practices.