[PyQt] PyQt cannot transform QString into str when reading emoji symbol from QClipboard

Pavel Roskin proski at gnu.org
Thu Jan 22 20:57:46 GMT 2015


This would decode surrogates!

import array
string = QApplication.clipboard().text()
# string = '\U0001f637'
# string = '\ufeff\ud83d\ude87'
try:
    # sane case - valid unicode
    string.encode('utf-8')
except UnicodeEncodeError:
    # insane case - need to decode surrogates
    string = array.array('H', map(ord, string)).tobytes().decode('utf-16')
print(string)

The string is split into characters, each character is converted to its
integer code point, the integers are packed as 16-bit unsigned values,
and the resulting bytes are decoded as UTF-16. A real character above
0xffff does not fit into 16 bits and would raise OverflowError in that
expression, which is why the conversion is only a fallback for strings
that cannot be encoded as UTF-8.
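
A possible variant, not part of the workaround above: assuming Python 3.4
or later, where the 'surrogatepass' error handler is accepted by the
UTF-16 codecs, the same repair can be sketched without the array module.
The function name join_surrogates is made up for this illustration, and
the clipboard call is left out so that the snippet runs without PyQt.

```python
def join_surrogates(text):
    """Return text with UTF-16 surrogate pairs merged into real characters.

    Hypothetical helper for illustration only. Assumes Python 3.4+, where
    'surrogatepass' works with the utf-16 codecs.
    """
    try:
        # Sane case: no surrogates, nothing to repair.
        text.encode('utf-8')
        return text
    except UnicodeEncodeError:
        # Write surrogate code points out as raw 16-bit units, then let the
        # UTF-16 decoder combine each high/low pair into one character.
        # Characters above 0xffff are encoded as pairs as well, so they do
        # not overflow. An unpaired surrogate still raises
        # UnicodeDecodeError.
        raw = text.encode('utf-16-le', 'surrogatepass')
        return raw.decode('utf-16-le')


if __name__ == '__main__':
    repaired = join_surrogates('\ud83d\ude87')
    print(repaired == '\U0001f687')
```

Because the byte order is named explicitly, no BOM is added or consumed,
so a leading '\ufeff' in the input stays in the result instead of being
stripped as it is by decode('utf-16').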

Of course it's a workaround. QApplication.clipboard().text() should
not return surrogates.

-- 
Regards,
Pavel Roskin

