[PyQt] Inconsistent pylupdate5 behaviour on UTF8 data
Phil Thompson
phil at riverbankcomputing.com
Mon Mar 2 16:19:04 GMT 2020
Giuseppe,
The fundamental problem is that in moving the code from Qt4 to Qt5 I did
the least work I could get away with rather than doing the job properly.
Can you confirm that you have a workaround for this?
I'd rather not try to fix it at this stage (given that Qt6 is coming sooner
rather than later). For PyQt6 I'd rather replace the whole lot with a pure
Python implementation that compiles and inspects the Python byte code.
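
As a rough illustration of the byte code idea only (this is not the PyQt6 implementation; the helper name extract_tr_strings and the TR_NAMES set are assumptions made for this example), the sketch below compiles a source string and records any string constant that is loaded immediately after a name or attribute called tr or trUtf8. The opcode names differ between Python versions, so it checks several.

```python
import dis
import types

# Hypothetical: method names treated as translation calls in this sketch.
TR_NAMES = {"tr", "trUtf8"}
# Opcodes that load a name or attribute; which one appears depends on the
# Python version (LOAD_METHOD on older releases, LOAD_ATTR on newer ones).
NAME_LOADS = {"LOAD_ATTR", "LOAD_METHOD", "LOAD_NAME", "LOAD_GLOBAL"}


def extract_tr_strings(source, filename="<string>"):
    """Return string literals passed directly as the first argument to tr()."""
    found = []

    def walk(code):
        previous = None
        for instr in dis.get_instructions(code):
            if instr.opname == "PUSH_NULL":
                # Some versions emit this between the name load and the argument.
                continue
            if (previous is not None
                    and previous.opname in NAME_LOADS
                    and previous.argval in TR_NAMES
                    and instr.opname == "LOAD_CONST"
                    and isinstance(instr.argval, str)):
                found.append(instr.argval)
            previous = instr
        # Function and class bodies are stored as nested code objects.
        for const in code.co_consts:
            if isinstance(const, types.CodeType):
                walk(const)

    walk(compile(source, filename, "exec"))
    return found
```

Because the compiler has already decoded the source file, the strings come back as ordinary Python str objects, so no separate source codec setting is involved. A real tool would also need contexts, line numbers, and non-literal arguments, none of which this sketch handles.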
Thanks,
Phil
On 19/02/2020 09:06, Giuseppe Corbelli wrote:
> On 2/18/20 5:58 PM, Phil Thompson wrote:
> >> What if you use trUtf8() instead of tr()?
>
> I explored all the combinations I could think of on Windows 10 with
> PyQt 5.14.1 from pip and Linguist 5.13.2, and I could NOT find any
> working combination. The test results are attached below; rather
> lengthy and boring, I fear.
>
> If a gist is preferable:
> https://gist.github.com/cowo78/26057f575ddfa3ee20a0b636acd894ff
>
>
> Section A - using trUtf8() in code
> ===============================================================================
> Using trUtf8 I ALWAYS get a 'Non-ASCII character detected in trUtf8
> string' warning
>
> Case 1 - NOT working
> -------------------------------------------------------------------------------
> trUtf8()
> # CODECFORSRC = UTF-8
> # CODECFORTR = UTF-8
>
> Message created:
> <message encoding="UTF-8">
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation type="unfinished"></translation>
> </message>
>
> Repeated pylupdate5 runs are OK, the same message is consistently
> generated.
>
> Processed by linguist 5.13.2:
> <message>
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation>UTF8</translation>
> </message>
>
> Reprocessed by pylupdate5
> <message>
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation type="obsolete">UTF8</translation>
> </message>
> <message encoding="UTF-8">
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation type="unfinished"></translation>
> </message>
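
The failure pattern in this case is that an already translated entry is marked obsolete while a fresh unfinished entry with the same source text is added. A minimal sketch for spotting that pattern in a .ts file follows; the function name find_reobsoleted_sources and the SAMPLE document are made up for illustration and are not taken from the test suite.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample, reduced to the elements the check looks at.
SAMPLE = """<TS version="2.1"><context><name>Demo</name>
<message><source>hello</source><translation type="obsolete">ciao</translation></message>
<message encoding="UTF-8"><source>hello</source><translation type="unfinished"></translation></message>
<message><source>bye</source><translation>addio</translation></message>
</context></TS>"""


def find_reobsoleted_sources(ts_xml):
    """Return source strings that occur both as obsolete and as unfinished."""
    states = defaultdict(set)
    root = ET.fromstring(ts_xml)
    for message in root.iter("message"):
        source = message.findtext("source", default="")
        translation = message.find("translation")
        if translation is None:
            continue
        # A translation without a type attribute is a finished one.
        states[source].add(translation.get("type", "finished"))
    return sorted(
        source for source, kinds in states.items()
        if {"obsolete", "unfinished"} <= kinds
    )


print(find_reobsoleted_sources(SAMPLE))
```

Running a check like this after each pylupdate5 pass would flag a finished translation that has been lost.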
>
>
> Case 2 - NOT working
> -------------------------------------------------------------------------------
> trUtf8()
> CODECFORSRC = UTF-8
> # CODECFORTR = UTF-8
>
> Message created the FIRST time and subsequent ODD runs
> <message encoding="UTF-8">
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: 簧</source>
> <translation type="unfinished"></translation>
> </message>
>
> Message created the SECOND time and subsequent EVEN runs
> <message encoding="UTF-8">
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation type="unfinished"></translation>
> </message>
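
The two outputs above differ in a recognisable way: the second is what the UTF-8 bytes of the first look like when decoded with a single-byte codec. The sketch below reproduces this with Latin-1, which is an assumption; the thread does not say which codec pylupdate5 actually applies (cp1252 gives the same result for these three bytes).

```python
original = "\u7c27"  # the CJK character from the test string

# Encode as UTF-8 (bytes E7 B0 A7), then wrongly decode them as Latin-1.
garbled = original.encode("utf-8").decode("latin-1")
print(garbled == "\u00e7\u00b0\u00a7")  # True: c-cedilla, degree sign, section sign

# Undoing the wrong step recovers the original character.
recovered = garbled.encode("latin-1").decode("utf-8")
print(recovered == original)  # True
```

Under that assumption, the alternation between odd and even runs suggests that the existing .ts file is read with one codec and written with another, although the thread does not establish the cause.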
>
>
> Case 3 - NOT working
> -------------------------------------------------------------------------------
> trUtf8()
> # CODECFORSRC = UTF-8
> CODECFORTR = UTF-8
>
> Message created:
> <message encoding="UTF-8">
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation type="unfinished"></translation>
> </message>
>
> Repeated pylupdate5 runs are OK, the same message is consistently
> generated.
>
> Processed by linguist 5.13.2:
> <message>
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation>utf8</translation>
> </message>
>
> Reprocessed by pylupdate5
> <message>
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation type="obsolete">utf8</translation>
> </message>
> <message encoding="UTF-8">
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation type="unfinished"></translation>
> </message>
>
>
> Case 4 - NOT working
> -------------------------------------------------------------------------------
> trUtf8()
> CODECFORSRC = UTF-8
> CODECFORTR = UTF-8
>
> Message created:
> <message encoding="UTF-8">
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation type="unfinished"></translation>
> </message>
>
> Repeated pylupdate5 runs are OK, the same message is consistently
> generated.
>
> Processed by linguist 5.13.2:
> <message>
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation>utf8</translation>
> </message>
>
> Reprocessed by pylupdate5:
> <message>
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation type="obsolete">utf8</translation>
> </message>
> <message encoding="UTF-8">
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation type="unfinished"></translation>
> </message>
>
>
> Section B - using tr() in code
> ===============================================================================
> Case 1 - NOT working
> -------------------------------------------------------------------------------
> tr()
> # CODECFORSRC = UTF-8
> # CODECFORTR = UTF-8
>
> Message created:
> <message>
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding:
> ç°§</source>
> <translation type="unfinished"></translation>
> </message>
>
> Repeated runs OK.
>
> Linguist shows WRONG characters as the source is incorrectly formatted.
>
>
> Case 2 - NOT working
> -------------------------------------------------------------------------------
> tr()
> CODECFORSRC = UTF-8
> # CODECFORTR = UTF-8
>
> Message created the FIRST time and subsequent ODD runs
> <message>
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation type="unfinished"></translation>
> </message>
>
> Message created the SECOND time and subsequent EVEN runs
> <message>
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding:
> ç°§</source>
> <translation type="unfinished"></translation>
> </message>
>
>
> Case 3 - NOT working
> -------------------------------------------------------------------------------
> tr()
> # CODECFORSRC = UTF-8
> CODECFORTR = UTF-8
>
> Message created:
> <message>
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding:
> ç°§</source>
> <translation type="unfinished"></translation>
> </message>
>
> Linguist shows WRONG characters as the source is incorrectly formatted.
>
>
> Case 4 - NOT working
> -------------------------------------------------------------------------------
> tr()
> CODECFORSRC = UTF-8
> CODECFORTR = UTF-8
>
> Message created:
> <message encoding="UTF-8">
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation type="unfinished"></translation>
> </message>
>
> Repeated pylupdate5 runs are OK, the same message is consistently
> generated.
>
> Processed by linguist 5.13.2:
> <message>
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation>utf8</translation>
> </message>
>
> Reprocessed by pylupdate5:
> <message>
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation>utf8</translation>
> </message>
>
> Reprocessed by pylupdate5 on subsequent runs:
> <message>
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding: ç°§</source>
> <translation type="obsolete">utf8</translation>
> </message>
> <message>
> <location filename="../translations_for_testsuite.py" line="6"/>
> <source>this needs UTF8 encoding:
> ç°§</source>
> <translation type="unfinished"></translation>
> </message>
>
> Those who survived until here must be brave.