os.walk has a major gotcha

Python's os.walk is a very useful function for recursively iterating over a directory tree. However, it has one big gotcha that bit us recently: by default it silently ignores errors, so walking a directory that doesn't exist yields nothing instead of raising:

>>> import os
>>> list(os.walk('/nonexistent/dir'))
[]

However, os.walk does support an optional onerror parameter: a callable that receives the OSError instance whenever listing a directory fails. Re-raising from that callable turns the silent failure into an exception:

>>> def strict_errors(ex: OSError):
...   raise ex
...
>>> list(os.walk('/nonexistent/dir', onerror=strict_errors))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/3.7.4/lib/python3.7/os.py", line 352, in walk
  File "<stdin>", line 2, in strict_errors
  File "/3.7.4/lib/python3.7/os.py", line 349, in walk
    scandir_it = scandir(top)
FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/dir'

Note that this callback is also invoked for errors deep in the walk, not just for the top-level directory. These can happen, for example, if you modify the list of subdirectories in place during a top-down walk, as described in the docs, and erroneously add a directory that doesn't exist.
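To make the deep-error case concrete, here is a small sketch that injects a bogus subdirectory name during a top-down walk. The helper walk_with_bogus_subdir and the directory names "real" and "missing" are made up for illustration; only os.walk, its onerror parameter, and tempfile come from the standard library.

```python
import os
import tempfile


def strict_errors(ex: OSError):
    # Re-raise instead of letting os.walk swallow the error.
    raise ex


def walk_with_bogus_subdir(root: str, strict: bool) -> list:
    """Walk root top-down, erroneously adding a subdirectory that doesn't exist."""
    visited = []
    onerror = strict_errors if strict else None
    for dirpath, dirnames, filenames in os.walk(root, onerror=onerror):
        visited.append(dirpath)
        if dirpath == root:
            # In-place edits to dirnames steer a top-down walk.
            # "missing" is a hypothetical name that is not on disk.
            dirnames.append("missing")
    return visited


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "real"))

    # Default behavior: the bogus entry is skipped without any signal.
    print(walk_with_bogus_subdir(root, strict=False))

    # Strict behavior: the same mistake surfaces as an exception.
    try:
        walk_with_bogus_subdir(root, strict=True)
    except FileNotFoundError as ex:
        print("raised for:", ex.filename)
```

Without onerror the mistake is invisible: the walk just returns fewer directories than the caller expected.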

In our codebase we plan to ban direct calls to os.walk in favor of a wrapper function, safe_walk, that enforces the strict error handling above. We'll use a custom Flake8 plugin to enforce this. Fortunately, our build system makes it easy to run Flake8 and many other Python linters and formatters. Send us an email if you'd like to learn more about this!
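For reference, a wrapper along these lines could be as small as the sketch below. This is an illustration rather than the final implementation: the exact signature of safe_walk and the helper name _raise are assumptions, and the only behavior it relies on is the documented onerror parameter of os.walk.

```python
import os
from typing import Iterator, List, Tuple


def _raise(ex: OSError) -> None:
    # Hypothetical helper: propagate the error instead of ignoring it.
    raise ex


def safe_walk(
    top: str, topdown: bool = True, followlinks: bool = False
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Like os.walk, but raise on any OSError instead of skipping it.

    onerror is deliberately not exposed, so callers cannot opt back
    into the silent behavior.
    """
    return os.walk(top, topdown=topdown, onerror=_raise, followlinks=followlinks)


try:
    list(safe_walk("/nonexistent/dir"))
except FileNotFoundError as ex:
    print("safe_walk raised:", ex)
```

Like os.walk, this returns a lazy generator, so the exception is raised when the caller iterates over it, not at the call to safe_walk.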
