[PyQt] Inconsistent pylupdate5 behaviour on UTF8 data
Giuseppe Corbelli
corbelligiuseppe at mesdan.it
Wed Feb 19 09:06:50 GMT 2020
On 2/18/20 5:58 PM, Phil Thompson wrote:
> What if you use trUtf8() instead of tr()?
I tried every combination I could think of on Windows 10, with PyQt
5.14.1 from pip and Linguist 5.13.2, and could NOT find one that
works. The test results are below; rather lengthy and boring, I fear.
If a gist is preferable:
https://gist.github.com/cowo78/26057f575ddfa3ee20a0b636acd894ff
Section A - using trUtf8() in code
===============================================================================
Using trUtf8() I ALWAYS get a 'Non-ASCII character detected in trUtf8
string' warning.
Case 1 - NOT working
-------------------------------------------------------------------------------
trUtf8()
# CODECFORSRC = UTF-8
# CODECFORTR = UTF-8
Message created:
<message encoding="UTF-8">
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation type="unfinished"></translation>
</message>
Repeated pylupdate5 runs are OK, the same message is consistently generated.
Processed by linguist 5.13.2:
<message>
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation>UTF8</translation>
</message>
Reprocessed by pylupdate5:
<message>
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation type="obsolete">UTF8</translation>
</message>
<message encoding="UTF-8">
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation type="unfinished"></translation>
</message>
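A minimal sketch of how the symptom above could be detected automatically: it looks for source strings that appear both as "obsolete" and as "unfinished" in the same .ts data, which is what happens when a finished translation is dropped and the same string is re-added. The helper name lost_translations and the surrounding <context> wrapper are assumptions made for illustration; the fragment is the one pasted above.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the "Reprocessed by pylupdate5" output above,
# wrapped in a <context> element so that it parses as one XML document.
TS_FRAGMENT = """<context>
<message>
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation type="obsolete">UTF8</translation>
</message>
<message encoding="UTF-8">
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation type="unfinished"></translation>
</message>
</context>"""


def lost_translations(ts_xml):
    """Return source strings present both as obsolete and as unfinished."""
    root = ET.fromstring(ts_xml)
    states = {}
    for message in root.iter("message"):
        source = message.findtext("source", default="")
        translation = message.find("translation")
        if translation is None:
            continue
        # A <translation> without a type attribute is a finished one.
        state = translation.get("type", "finished")
        states.setdefault(source, set()).add(state)
    return sorted(
        source
        for source, seen in states.items()
        if {"obsolete", "unfinished"} <= seen
    )


print(lost_translations(TS_FRAGMENT))
```

Running it on a real .ts file would only need the file contents passed in place of TS_FRAGMENT; an empty list means no translation was orphaned this way.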
Case 2 - NOT working
-------------------------------------------------------------------------------
trUtf8()
CODECFORSRC = UTF-8
# CODECFORTR = UTF-8
Message created on the FIRST run and on subsequent ODD runs:
<message encoding="UTF-8">
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: 簧</source>
<translation type="unfinished"></translation>
</message>
Message created on the SECOND run and on subsequent EVEN runs:
<message encoding="UTF-8">
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation type="unfinished"></translation>
</message>
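The two alternating outputs are related by a plain encoding mix-up: the three characters ç°§ are exactly what you get when the UTF-8 bytes of 簧 (U+7C27) are read as Latin-1, and vice versa. The sketch below only demonstrates that relationship; it makes no claim about what pylupdate5 does internally, and the function names are made up for illustration.

```python
def misread_utf8_as_latin1(text):
    """Encode text as UTF-8, then wrongly decode the bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def reread_latin1_as_utf8(text):
    """Encode text as Latin-1, then decode the same bytes as UTF-8."""
    return text.encode("latin-1").decode("utf-8")


single_char = "\u7c27"               # the character seen on ODD runs
three_chars = "\u00e7\u00b0\u00a7"   # the characters seen on EVEN runs

# U+7C27 is the byte sequence E7 B0 A7 in UTF-8.
assert single_char.encode("utf-8") == b"\xe7\xb0\xa7"

# Each interpretation turns one output into the other.
assert misread_utf8_as_latin1(single_char) == three_chars
assert reread_latin1_as_utf8(three_chars) == single_char

print("the two outputs are the same three bytes under different codecs")
```

So each run appears to flip the interpretation of the same three bytes, which would explain the odd/even pattern.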
Case 3 - NOT working
-------------------------------------------------------------------------------
trUtf8()
# CODECFORSRC = UTF-8
CODECFORTR = UTF-8
Message created:
<message encoding="UTF-8">
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation type="unfinished"></translation>
</message>
Repeated pylupdate5 runs are OK, the same message is consistently generated.
Processed by linguist 5.13.2:
<message>
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation>utf8</translation>
</message>
Reprocessed by pylupdate5:
<message>
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation type="obsolete">utf8</translation>
</message>
<message encoding="UTF-8">
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation type="unfinished"></translation>
</message>
Case 4 - NOT working
-------------------------------------------------------------------------------
trUtf8()
CODECFORSRC = UTF-8
CODECFORTR = UTF-8
Message created:
<message encoding="UTF-8">
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation type="unfinished"></translation>
</message>
Repeated pylupdate5 runs are OK, the same message is consistently generated.
Processed by linguist 5.13.2:
<message>
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation>utf8</translation>
</message>
Reprocessed by pylupdate5:
<message>
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation type="obsolete">utf8</translation>
</message>
<message encoding="UTF-8">
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation type="unfinished"></translation>
</message>
Section B - using tr() in code
===============================================================================
Case 1 - NOT working
-------------------------------------------------------------------------------
tr()
# CODECFORSRC = UTF-8
# CODECFORTR = UTF-8
Message created:
<message>
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding:
ç°§</source>
<translation type="unfinished"></translation>
</message>
Repeated runs OK.
Linguist shows WRONG characters as the source is incorrectly formatted.
Case 2 - NOT working
-------------------------------------------------------------------------------
tr()
CODECFORSRC = UTF-8
# CODECFORTR = UTF-8
Message created on the FIRST run and on subsequent ODD runs:
<message>
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation type="unfinished"></translation>
</message>
Message created on the SECOND run and on subsequent EVEN runs:
<message>
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding:
ç°§</source>
<translation type="unfinished"></translation>
</message>
Case 3 - NOT working
-------------------------------------------------------------------------------
tr()
# CODECFORSRC = UTF-8
CODECFORTR = UTF-8
Message created:
<message>
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding:
ç°§</source>
<translation type="unfinished"></translation>
</message>
Linguist shows WRONG characters as the source is incorrectly formatted.
Case 4 - NOT working
-------------------------------------------------------------------------------
tr()
CODECFORSRC = UTF-8
CODECFORTR = UTF-8
Message created:
<message encoding="UTF-8">
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation type="unfinished"></translation>
</message>
Repeated pylupdate5 runs are OK, the same message is consistently generated.
Processed by linguist 5.13.2:
<message>
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation>utf8</translation>
</message>
Reprocessed by pylupdate5:
<message>
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation>utf8</translation>
</message>
Reprocessed by pylupdate5 on subsequent runs:
<message>
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding: ç°§</source>
<translation type="obsolete">utf8</translation>
</message>
<message>
<location filename="../translations_for_testsuite.py" line="6"/>
<source>this needs UTF8 encoding:
ç°§</source>
<translation type="unfinished"></translation>
</message>
Those who survived until here must be brave.
--
Giuseppe Corbelli