[PyQt] Inconsistent pylupdate5 behaviour on UTF8 data
phil at riverbankcomputing.com
Sun Feb 16 13:01:29 GMT 2020
On 12/02/2020 15:27, Giuseppe Corbelli wrote:
> Hi all
> I found a puzzling pylupdate5 behaviour inconsistency between Linux
> and Windows versions.
> Scenario: I am extracting translatable strings from python modules.
> The files are saved as UTF8, I run pylupdate and get different
> representations in the XML output.
> pylupdate5 v5.14.1 as Debian package on Linux and fresh pip install in
> a venv on Windows 10.
> As you can find in the attached test data:
> - on windows the 'ç' character (U+00E7 ç c3 a7 LATIN SMALL LETTER C
> WITH CEDILLA) is converted to <source>this needs UTF8 encoding:
> - on linux the same 'ç' correctly converts to <source>this needs UTF8
> encoding: ç°§</source>
> So it seems that on Windows each byte of the UTF-8 string is replaced
> with its Unicode code point as an XML numeric character reference,
> while on Linux the same applies (correctly) to the character itself
> (formed by two bytes in UTF-8).
> Am I doing something wrong?
I can't reproduce this - I get identical results on Windows, Linux and
If you want to try and debug your own installation then look at
evilBytes() in qpy\pylupdate\metatranslator.cpp
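
For anyone comparing outputs, the two behaviours described in the quoted report can be reproduced outside pylupdate5 with a short Python sketch. This is not the logic of evilBytes(). The helper name numeric_refs is hypothetical, and the sketch only assumes that one platform treats each UTF-8 byte as a separate Latin-1 character while the other decodes the bytes as UTF-8 first.

```python
def numeric_refs(text):
    """Replace each non-ASCII character with an XML numeric character reference.

    Hypothetical helper for illustration only; not part of pylupdate5.
    """
    return "".join(
        ch if ord(ch) < 128 else "&#x{:x};".format(ord(ch)) for ch in text
    )


# The UTF-8 encoding of U+00E7 is the two bytes c3 a7.
raw = "ç".encode("utf-8")

# Decode as UTF-8 first: one character, one reference.
per_char = numeric_refs(raw.decode("utf-8"))

# Treat each byte as its own Latin-1 character: two references.
per_byte = numeric_refs(raw.decode("latin-1"))

print(per_char)  # &#xe7;
print(per_byte)  # &#xc3;&#xa7;
```

If the .ts file from one machine contains the two-reference form for a single accented character, the source bytes were most likely interpreted one byte at a time rather than as UTF-8.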