
Where Are Your Installed Python Packages?

Introduction

I am writing this article because I have recently come across several frequently asked questions in the Python community.

  • Why does it say “executable file not found” when I have installed pip?
  • Why do I get a ModuleNotFound error when importing a module?
  • Why can I run my code in PyCharm but not in the command prompt?

How Python Finds Packages

Nowadays, many people have more than one Python installation on their computers, often along with several virtual environments, and it is easy to lose track of where packages are installed. Let’s first look at how Python finds packages. The answer is simple, but many people are unaware of the underlying principle. Suppose the path to your Python interpreter is $path_prefix/bin/python. When you start an interactive session or run a script with this interpreter, it searches for packages in the following locations by default:

  1. $path_prefix/lib/pythonX.Y (Standard library path)
  2. $path_prefix/lib/pythonX.Y/site-packages (Third-party library path, where X.Y corresponds to the major and minor version numbers of Python, such as 3.7, 2.6)
  3. Current working directory (result of the pwd command) in an interactive session, or the directory containing the script when a script is run

If you are using the default Python installation on Linux, $path_prefix is usually /usr. If you have compiled Python with default options, $path_prefix is /usr/local. From the second point above, you can see that the path for third-party libraries differs between Python versions. If you upgrade Python from 3.6 to 3.7, for example, the previously installed third-party libraries will no longer be found. You can copy the entire folder over, and for pure-Python packages this works in most cases, but packages that contain compiled extension modules generally need to be reinstalled for the new version.
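
The locations listed above can be queried from the interpreter itself rather than worked out by hand. The following is a minimal sketch using the standard sysconfig module; the function name default_search_locations is made up for this illustration, and some distributions customize these paths (for example by using dist-packages), so the output may not match the generic layout exactly.

```python
import sys
import sysconfig


def default_search_locations():
    """Return the prefix and the default library paths of the running interpreter."""
    paths = sysconfig.get_paths()  # install paths for the default scheme
    return {
        "prefix": sys.prefix,              # the $path_prefix of this article
        "stdlib": paths["stdlib"],         # standard library directory
        "site-packages": paths["purelib"], # third-party (pure Python) packages
    }


for name, location in default_search_locations().items():
    print(f"{name}: {location}")
```

Running this with each interpreter on your machine shows how the same three locations move with the interpreter.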

Several Useful Attributes

  • sys.executable: The path of the current Python interpreter being used.
  • sys.path: A list of paths to search for packages.
  • sys.prefix: The current $path_prefix being used.

Example:

>>> import sys
>>> sys.executable
'/home/frostming/.pyenv/versions/3.7.2/bin/python'
>>> sys.path
['', '/home/frostming/.pyenv/versions/3.7.2/lib/python37.zip', '/home/frostming/.pyenv/versions/3.7.2/lib/python3.7', '/home/frostming/.pyenv/versions/3.7.2/lib/python3.7/lib-dynload', '/home/frostming/.local/lib/python3.7/site-packages', '/mnt/d/Workspace/pipenv', '/home/frostming/.pyenv/versions/3.7.2/lib/python3.7/site-packages']
>>> sys.prefix
'/home/frostming/.pyenv/versions/3.7.2'
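
To answer "which copy of a package is this interpreter actually using?", you can ask the import system directly. This is a small sketch built on importlib.util.find_spec from the standard library; the helper name where_is is hypothetical, and it is meant for top-level module names only.

```python
import importlib.util


def where_is(module_name):
    """Return the file a top-level module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None  # nothing on sys.path provides this module
    # origin is a file path for ordinary modules, "built-in" for built-in
    # modules, and None for namespace packages
    return spec.origin


print(where_is("json"))
print(where_is("a_module_that_does_not_exist"))
```

If the printed path is not under the site-packages directory you expected, you are most likely running a different interpreter than the one you installed the package with.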

Adding Search Paths with Environment Variables

If the path of your package is not listed in the search path list above, you can add the path to the PYTHONPATH environment variable, with multiple paths separated by : (or ; on Windows).

However, be careful not to add package paths that belong to a different Python version to PYTHONPATH, for example PYTHONPATH=/home/frostming/.local/lib/python2.7/site-packages. Paths in PYTHONPATH take precedence over the default search paths, so this can cause compatibility problems when you run Python 3. In fact, it is best to avoid putting any site-packages path in PYTHONPATH.

By the way, PATH is used to search for executable programs. If you run the command my_cmd in the terminal, the system will scan the paths in PATH one by one to check if my_cmd exists in any of those paths. So, if you receive a prompt saying that the program cannot be found or the command is not recognized, you should check if the path has been added to PATH.
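
Both variables can be inspected from Python. The sketch below starts a child interpreter with PYTHONPATH set and returns the sys.path it ends up with, and uses shutil.which to perform the same lookup the shell does on PATH. The helper names child_sys_path and find_command are made up for this example.

```python
import json
import os
import shutil
import subprocess
import sys


def child_sys_path(pythonpath):
    """Run a child interpreter with PYTHONPATH set and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=pythonpath)
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def find_command(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


print(child_sys_path(os.getcwd())[:3])  # the PYTHONPATH entry appears near the front
print(find_command("pip"))              # None means pip is not on PATH
```

A None result from find_command corresponds to the "command not found" message in the terminal.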

How to Install Python Packages

To install Python packages, the most common method is to use pip. Even if you are using pipenv or poetry, the underlying tool is still pip, so the instructions apply to all. If you don’t have pip installed, please refer to the pip installation guide (https://pip.pypa.io/en/stable/installing/). If you have pip installed but cannot use the pip command, please refer to the previous section.

There are two ways to run pip:

  • pip ...
  • python -m pip ...

The two methods are similar. The difference is that the first uses the Python interpreter named in the shebang line of the pip script. In general, if your pip path is $path_prefix/bin/pip, the corresponding Python path is $path_prefix/bin/python. On a Unix system, you can see the interpreter path in the first line of the output of cat $(which pip). The second method names the Python interpreter explicitly. This rule applies to all executable Python programs. The process is illustrated in the following diagram.

So, when you install packages with pip without any custom configuration, they are installed under $path_prefix/lib/pythonX.Y/site-packages ($path_prefix is obtained as described in the previous paragraph), and executable programs are installed under $path_prefix/bin. If you need to run my_cmd directly from the command line, remember to add $path_prefix/bin to PATH.
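
The shebang check can also be done from Python. The following sketch reads the first line of a script and reports the interpreter it names; shebang_of is a hypothetical helper, and the check assumes a Unix-style text launcher (on Windows, pip is typically a binary .exe wrapper, so the function returns None there).

```python
import shutil


def shebang_of(script_path):
    """Return the interpreter named on a script's shebang line, or None if there is none."""
    with open(script_path, "rb") as f:
        first_line = f.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None
    return first_line[2:].strip()


pip_path = shutil.which("pip")  # None if pip is not on PATH
if pip_path:
    print(pip_path, "->", shebang_of(pip_path))
else:
    print("pip is not on PATH; try: python -m pip")
```

If the interpreter printed here differs from the one you get by typing python, then pip install and python are working with different site-packages directories, which is a common cause of ModuleNotFoundError.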

Options in pip to change the installation location

  • --prefix PATH: replace $path_prefix with the given value
  • --root ROOT_PATH: prepend ROOT_PATH to $path_prefix. For example, --root /home/frostming will change $path_prefix from /usr to /home/frostming/usr
  • --target TARGET: directly specify the installation location to TARGET

Virtual Environment

A virtual environment is used to isolate the dependencies of different projects, allowing them to be installed in separate paths to prevent dependency conflicts. Once you understand how Python installs packages, it becomes easier to understand the principles behind virtual environments (using virtualenv or the venv module). In fact, running virtualenv myenv will create a copy of the Python interpreter in myenv/bin, as well as the directories myenv/lib and myenv/lib/pythonX.Y/site-packages (the venv module uses a different method, but the result is similar). After running source myenv/bin/activate, the myenv/bin directory is added to the beginning of the PATH, ensuring that this copied Python interpreter is prioritized in the search. As a result, when installing packages, the $path_prefix will be myenv, achieving the isolation of installation paths.
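
The change of $path_prefix can be observed directly. This sketch creates a throwaway environment with the standard venv module (not virtualenv) and asks the environment's own interpreter for sys.prefix and sys.base_prefix. The function name prefix_inside_venv is made up for the example, and pip is deliberately not installed into the environment to keep creation fast.

```python
import os
import subprocess
import sys
import tempfile
import venv


def prefix_inside_venv(env_dir):
    """Create a venv at env_dir and return (sys.prefix, sys.base_prefix) seen by its interpreter."""
    # with_pip=False skips ensurepip; symlinks are used where the platform supports them
    venv.EnvBuilder(with_pip=False, symlinks=(os.name != "nt")).create(env_dir)
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    python = os.path.join(env_dir, bin_dir, exe)
    result = subprocess.run(
        [python, "-c", "import sys; print(sys.prefix); print(sys.base_prefix)"],
        capture_output=True,
        text=True,
        check=True,
    )
    prefix, base_prefix = result.stdout.splitlines()[:2]
    return prefix, base_prefix


with tempfile.TemporaryDirectory() as tmp:
    prefix, base_prefix = prefix_inside_venv(os.path.join(tmp, "myenv"))
    print("prefix in the environment:", prefix)
    print("prefix of the base install:", base_prefix)
```

Note that the script never runs activate: calling myenv/bin/python by its path is enough, because the prefix follows from the interpreter location. Activation only changes PATH so that this interpreter is found first.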

Impact of Script Running Method on Search Paths

From the above, we can see that Python finds a package through sys.path, and sys.path in turn depends on the path of sys.executable. After writing a program, we always need to run it. However, different ways of running it can change sys.path and lead to different behavior. Let’s discuss this issue below.

Assuming your package structure is as follows

.
├── main.py
└── my_package
    ├── __init__.py
    ├── a.py
    └── b.py

The content of main.py:

import my_package.b

The content of the b.py file is very simple:

import sys
print("I'm b")
print(sys.path)

Now execute in the same directory as main.py.

$ python main.py
I'm b
['/home/frostming/test_path', ...]
$ python my_package/b.py
I'm b
['/home/frostming/test_path/my_package', ...]

Running python xxx.py is called direct execution; the value of __name__ in that file is set to __main__. This is how the “Run File” function in an IDE works. In this case, the first value of sys.path is the directory containing the script, so it changes with the script path. Remember that all of these tests are executed from the directory /home/frostming/test_path.

Okay, so if we need to import a.py in b.py, where a.py contains a simple line print("I'm a"), how should we write the script in b.py?

  1. Easy: import a. Let’s run the test above again.
$ python main.py
ModuleNotFoundError: No module named 'a'
$ python my_package/b.py
I'm a
I'm b
['/home/frostming/test_path/my_package', ...]

The first test failed. If you have already read the previous content, this error is expected - sys.path does not have the directory /home/frostming/test_path/my_package where a.py is located, so it can’t find a.

  2. Change it to from my_package import a. We won’t run the test again, because the same analysis predicts that the first command will run without any problems and the second will fail with an error saying my_package cannot be found. Note that since b is in the package my_package, we can use a relative import here: writing from . import a is equivalent to from my_package import a.

So, is there a way to run both of these without any errors? Yes. We need to understand that in a project, there are limited entry points, and in practice, it is unlikely to have executable code both at the top level and in subdirectories. We should put the main running logic in a file called main.py (it doesn’t have to be this exact name, for example, in a Django project, it is called manage.py). If we really need to run the code from a script in a subdirectory, we should use python -m <module_name>, and the import statement in b.py should be from my_package import a. Let’s see how it runs:

$ python main.py  # same effect as python -m main
I'm a
I'm b
['/home/frostming/test_path', ...]
$ python -m my_package.b
I'm a
I'm b
['/home/frostming/test_path', ...]

You can see that the contents of sys.path are the same in these two runs. The first value is the current directory where the program is running. This running method is called module mode. The argument following python -m is the module name separated by dots (.), not the path name. Because of this consistency, you can use the same import definition in all parts of your project, regardless of which script you are in. That’s why the Django official documentation recommends using import names like myapp.models.users.
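
The comparison above can be reproduced without touching your own project. The following sketch rebuilds the example layout in a temporary directory and runs it three ways; the helper names build_project and run are made up, and b.py prints only sys.path[0] instead of the full list to keep the output short.

```python
import os
import subprocess
import sys
import tempfile


def build_project(root):
    """Write main.py and my_package/{__init__,a,b}.py under root."""
    os.makedirs(os.path.join(root, "my_package"))
    files = {
        "main.py": "import my_package.b\n",
        "my_package/__init__.py": "",
        "my_package/a.py": 'print("I\'m a")\n',
        "my_package/b.py": (
            "import sys\n"
            "from my_package import a\n"
            'print("I\'m b")\n'
            "print(sys.path[0])\n"
        ),
    }
    for rel_path, content in files.items():
        with open(os.path.join(root, *rel_path.split("/")), "w") as f:
            f.write(content)


def run(root, *args):
    """Run the current interpreter with args from inside root; return (exit code, stdout lines)."""
    result = subprocess.run(
        [sys.executable, *args], cwd=root, capture_output=True, text=True
    )
    return result.returncode, result.stdout.splitlines()


with tempfile.TemporaryDirectory() as root:
    build_project(root)
    print(run(root, "main.py"))                           # direct run of the entry point
    print(run(root, "-m", "my_package.b"))                # module mode
    print(run(root, os.path.join("my_package", "b.py")))  # direct run inside the package fails
```

The first two runs report the project root as sys.path[0], while the third exits with a non-zero code because my_package is no longer importable from the script's own directory.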

In addition, when running in module mode, each parent package of the module passed as an argument is imported first. This means that you can use relative imports in the module (which is not possible when running a file directly), and the value of __name__ in the passed module is set to __main__, so you can still use the if __name__ == "__main__": condition. If the name passed to python -m <module_name> is a package, the __main__.py script in the package directory is executed (if it exists), and the value of __name__ in that script is __main__.
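
These three behaviors (relative imports, __name__, and __main__.py) can be checked with a small experiment. The sketch below builds a hypothetical package in a temporary directory whose b.py and __main__.py both use a relative import; the helper names build_package and run_python and the VALUE constant are made up for the illustration.

```python
import os
import subprocess
import sys
import tempfile


def build_package(root):
    """Create my_package with a relative import in both b.py and __main__.py."""
    pkg = os.path.join(root, "my_package")
    os.makedirs(pkg)
    body = "from . import a\nprint(__name__, a.VALUE)\n"
    files = {
        "__init__.py": "",
        "a.py": "VALUE = 42\n",
        "b.py": body,
        "__main__.py": body,
    }
    for name, content in files.items():
        with open(os.path.join(pkg, name), "w") as f:
            f.write(content)


def run_python(root, *args):
    """Run the current interpreter with args from inside root."""
    return subprocess.run(
        [sys.executable, *args], cwd=root, capture_output=True, text=True
    )


with tempfile.TemporaryDirectory() as root:
    build_package(root)
    print(run_python(root, "-m", "my_package.b").stdout.strip())  # module inside the package
    print(run_python(root, "-m", "my_package").stdout.strip())    # runs my_package/__main__.py
    direct = run_python(root, os.path.join("my_package", "b.py"))
    print(direct.returncode, direct.stderr.strip().splitlines()[-1])  # relative import fails
```

Both module-mode runs print __main__ as the module name, while the direct run stops with an ImportError because the script has no parent package.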

Summary

As you can see, the most important thing about package path searching is the path prefix $path_prefix, and this value is derived from the path of the Python interpreter being used. So to find the path of a package, all you need to know is the path of the interpreter. If packages are being looked up in the wrong place, select the Python interpreter you want by setting PATH correctly.

Reference

  • https://frostming.com/2019/03-13/where-do-your-packages-go/