[PyQt] Inconsistent pylupdate5 behaviour on UTF8 data
Giuseppe Corbelli
corbelligiuseppe at mesdan.it
Wed Feb 12 15:27:09 GMT 2020
Hi all
I found a puzzling inconsistency in pylupdate5 behaviour between the
Linux and Windows versions.
Scenario: I am extracting translatable strings from Python modules. The
files are saved as UTF-8. I run pylupdate5 and get different
representations in the XML output.
I am using pylupdate5 v5.14.1, installed as a Debian package on Linux
and via a fresh pip install in a venv on Windows 10.
As you can see in the attached test data, for the string
"this needs UTF8 encoding: ç°§":
- on Windows the 'ç' character (U+00E7 LATIN SMALL LETTER C WITH
CEDILLA, UTF-8 bytes c3 a7) is written to the <source> element as two
numeric character references, one for each byte (c3 and a7)
- on Linux the same 'ç' is correctly written to the <source> element as
a single numeric character reference for U+00E7
So it seems that on Windows each byte of the UTF-8 string is replaced
with its value in XML numeric character reference format, while on
Linux the same is applied (correctly) to the character itself (which is
encoded as two bytes in UTF-8).
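To make the difference concrete, here is a minimal Python sketch of the
two behaviours. It is not pylupdate5's actual code: the function names
are made up for illustration, and the hexadecimal "&#x..;" reference
syntax is an assumption about the output format.

```python
def refs_per_character(text):
    """One numeric character reference per non-ASCII code point.

    Hypothetical helper; mimics what the Linux output appears to do.
    """
    return "".join(
        ch if ord(ch) < 0x7F else "&#x%x;" % ord(ch)
        for ch in text
    )


def refs_per_byte(text, encoding="utf-8"):
    """One numeric character reference per non-ASCII encoded byte.

    Hypothetical helper; mimics what the Windows output appears to do.
    """
    return "".join(
        chr(b) if b < 0x7F else "&#x%x;" % b
        for b in text.encode(encoding)
    )


print(refs_per_character("ç"))  # &#xe7;
print(refs_per_byte("ç"))       # &#xc3;&#xa7;
```

The per-byte variant yields two references for 'ç' because its UTF-8
encoding is the two bytes c3 a7, which an XML parser then reads back as
two separate characters.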
Am I doing something wrong?
Thanks
--
Giuseppe Corbelli
-------------- next part --------------
A non-text attachment was scrubbed...
Name: it_IT.ts.linux
Type: text/xml
Size: 504 bytes
Desc: not available
URL: <https://www.riverbankcomputing.com/pipermail/pyqt/attachments/20200212/ccc67519/attachment.xml>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: it_IT.ts.win32
Type: text/xml
Size: 522 bytes
Desc: not available
URL: <https://www.riverbankcomputing.com/pipermail/pyqt/attachments/20200212/ccc67519/attachment-0001.xml>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: module.py
Type: text/x-python
Size: 107 bytes
Desc: not available
URL: <https://www.riverbankcomputing.com/pipermail/pyqt/attachments/20200212/ccc67519/attachment.py>
-------------- next part --------------
CODECFORSRC = UTF-8
TRANSLATIONS = it_IT.ts
SOURCES = module.py