UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe1 in position 6: invalid continuation byte
This started when I upgraded from Linux Mint 19 to 20. The full error:
Traceback (most recent call last):
  File "/home/notification/views.py", line 206, in get
    .select_related("history__definition")
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/models/query.py", line 653, in first
    for obj in (self if self.ordered else self.order_by('pk'))[:1]:
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/models/query.py", line 274, in __iter__
    self._fetch_all()
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/models/query.py", line 1242, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/models/query.py", line 55, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/models/sql/compiler.py", line 1133, in execute_sql
    cursor.execute(sql, params)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/debug_toolbar/panels/sql/tracking.py", line 192, in execute
    return self._record(self.cursor.execute, sql, params)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/debug_toolbar/panels/sql/tracking.py", line 126, in _record
    return method(sql, params)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/backends/utils.py", line 99, in execute
    return super().execute(sql, params)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/sentry_sdk/integrations/django/__init__.py", line 469, in execute
    return real_execute(self, sql, params)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/backends/mysql/base.py", line 73, in execute
    return self.cursor.execute(query, args)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/MySQLdb/cursors.py", line 206, in execute
    res = self._query(query)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/MySQLdb/cursors.py", line 321, in _query
    self._post_get_result()
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/MySQLdb/cursors.py", line 355, in _post_get_result
    self._rows = self._fetch_row(0)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/MySQLdb/cursors.py", line 328, in _fetch_row
    return self._result.fetch_row(size, self._fetch_type)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe3 in position 11: invalid continuation byte
The database is MySQL and it's configured with utf8mb4:
mysql> SHOW VARIABLES LIKE 'char%';
Variable_name            |Value                     |
-------------------------|--------------------------|
character_set_client     |utf8mb4                   |
character_set_connection |utf8mb4                   |
character_set_database   |utf8mb4                   |
character_set_filesystem |binary                    |
character_set_results    |                          |
character_set_server     |latin1                    |
character_set_system     |utf8                      |
character_sets_dir       |c:\mariadb\share\charsets\|
The row that throws this error is the one below, and looking at the hex, it's OK:
description                             |hex(description)                                                                          |
----------------------------------------|------------------------------------------------------------------------------------------|
Necessária para as partidas na 'batalha'|4E6563657373C3A17269612070617261206173207061727469646173206E612027626174616C6861206A6F7927|
á = C3 A1
Something is producing the byte 0xe1, which is how cp1252 encodes á. I dug deep with the debugger, and the conversion appears to be happening inside the MySQLdb library.
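To illustrate the suspicion, here is a minimal sketch that is not taken from the project code (the helper name utf8_decode_error is made up for the example). It assumes the driver receives the text in a single-byte encoding such as cp1252 and then tries to decode it as UTF-8; under that assumption the error from the title is reproduced exactly:

```python
text = "Necessária"

# First bytes of the hex dump shown above: valid UTF-8, with á stored as C3 A1.
stored = bytes.fromhex("4E6563657373C3A1726961")
assert stored.decode("utf-8") == text

# The same text in cp1252 (latin1 gives the same bytes here): á is the single byte E1.
legacy = text.encode("cp1252")


def utf8_decode_error(data):
    """Return the UnicodeDecodeError raised when decoding data as UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc
    return None


err = utf8_decode_error(legacy)
print(err)
# 'utf-8' codec can't decode byte 0xe1 in position 6: invalid continuation byte
```

So the stored data decodes cleanly, and the failure only appears once the bytes have been converted to a single-byte charset somewhere between the server and the Python decode step.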
More about the environment:
pip3 list | grep -i mysql
mysql-connector-python    8.0.20
mysql-connector-python-rf 2.2.2
mysqlclient               2.0.1
PyMySQL                   0.9.3
The character_set_% settings that you have seem strange:
| character_set_client     | latin7  | names     |
| character_set_connection | latin7  | names     |
| character_set_database   | utf8mb4 | ?         |
| character_set_filesystem | binary  | hands-off |
| character_set_results    | latin7  | names     |
| character_set_server     | utf8mb4 | ?         |
| character_set_system     | utf8    | hands-off |
I have labeled them in 3 groups:
- "hands-off": filesystem and system should not be modified from the default, else internal things are likely to break.
- "names": SET NAMES latin7 is, for example, how you specify that the clients are using latin7 encoding. The general move is away from the old default latin1 toward the future standard of utf8mb4. (I used latin7 just to make it stand out.)
- "?": It is unclear what impact these two have. I recommend leaving them alone at the values from installation, which is probably utf8mb4 (for both) in recent versions of MySQL/MariaDB.
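As a rough sketch of how the "names" group could be checked, the snippet below compares the three variables that SET NAMES assigns together against the charset the client is expected to use. The function name names_group_mismatches and the dictionary are illustrative only; the values are copied from the SHOW VARIABLES output in the question.

```python
# The three variables that SET NAMES assigns together (the "names" group).
NAMES_GROUP = (
    "character_set_client",
    "character_set_connection",
    "character_set_results",
)


def names_group_mismatches(variables, expected="utf8mb4"):
    """Return {name: value} for each "names" variable that differs from expected."""
    return {
        name: variables.get(name, "")
        for name in NAMES_GROUP
        if variables.get(name, "") != expected
    }


# Values copied from the question's SHOW VARIABLES output.
question_vars = {
    "character_set_client": "utf8mb4",
    "character_set_connection": "utf8mb4",
    "character_set_database": "utf8mb4",
    "character_set_filesystem": "binary",
    "character_set_results": "",
    "character_set_server": "latin1",
    "character_set_system": "utf8",
}

print(names_group_mismatches(question_vars))
# {'character_set_results': ''}
```

In the question's output only character_set_results stands out, because it is blank. That listing may come from a different client than the Django process, so it is worth running the same check on the connection the application actually uses. With Django's MySQL backend, the connection charset is normally requested through the database OPTIONS entry, for example 'charset': 'utf8mb4'.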
In 5.7.6, the GLOBAL character_set_database and collation_database system variables were deprecated; the SESSION versions became read-only (deprecation).
From the 8.0.1 changelog:
Important Change: The default character set has changed from latin1 to utf8mb4. These system variables are affected:
- The default value of the character_set_server and character_set_database system variables has changed from latin1 to utf8mb4.
- The default value of the collation_server and collation_database system variables has changed from latin1_swedish_ci to utf8mb4_0900_ai_ci.
As a result, the default character set and collation for new objects differ from previously unless an explicit character set and collation are specified. This includes databases and objects within them, such as tables, views, and stored programs. One way to preserve the previous defaults is to start the server with these lines in the my.cnf file:
[mysqld]
character_set_server=latin1
collation_server=latin1_swedish_ci