Python 3 assumes a default encoding of UTF-8 in a variety of useful scenarios. For example, s: str = buf.decode() will decode the byte buffer as UTF-8, and buf = s.encode() likewise encodes the string as UTF-8. Python 3 also defaults to encoding filenames as UTF-8 on most platforms, and Python 3 source code itself is encoded as UTF-8 by default. This is in contrast to Python 2, which either defaulted to ASCII or required an explicit encoding.
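
As a small illustration of the encode/decode defaults described above, the sketch below round-trips a string through bytes without naming an encoding. The sample string and variable names are made up for the example.

```python
text = "naïve café ✓"        # sample string with non-ASCII characters

buf = text.encode()           # no argument: encodes as UTF-8
assert buf == text.encode("utf-8")

roundtrip = buf.decode()      # no argument: decodes as UTF-8
assert roundtrip == text

# UTF-8 is variable-width, so the non-ASCII characters take more than one byte each.
assert len(buf) > len(text)
```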
However, one noteworthy and gotcha-worthy exception is that the open() call, when used in text mode, uses a platform-dependent encoding: the locale's preferred encoding, which on Windows is often a legacy code page such as cp1252 rather than UTF-8. So it's best to always specify an explicit encoding: open('path/to/file', 'r', encoding='utf-8').
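
The following is a minimal sketch of that advice, assuming a throwaway file in a temporary directory (the file name is hypothetical). It also prints the locale's preferred encoding, which is what text-mode open() typically falls back to when no encoding is given; the value will differ between machines.

```python
import locale
import os
import tempfile

text = "café ✓"

# The encoding text-mode open() typically uses when none is specified.
# This varies by platform and locale, so no particular value is assumed here.
default_encoding = locale.getpreferredencoding(False)
print("open() default here:", default_encoding)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")  # hypothetical file name

    # Write and read with an explicit encoding so the result
    # does not depend on the machine's locale settings.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    with open(path, "r", encoding="utf-8") as f:
        read_back = f.read()

    # Binary mode bypasses text decoding, letting us inspect the bytes on disk.
    with open(path, "rb") as f:
        raw = f.read()

assert read_back == text
assert raw == text.encode("utf-8")
```

Passing the encoding on both the write and the read keeps the two sides in agreement even if the file later moves to a machine with a different locale.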