[PyQt] PyQt cannot transform QString into str when reading emoji symbol from QClipboard
Ilya Kulakov
kulakov.ilya at gmail.com
Fri Jan 23 07:40:52 GMT 2015
This workaround does not work on Python 3.4.2, PyQt 5.4:
UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 0-1: unexpected end of data
> On 23 Jan 2015, at 2:57, Pavel Roskin <proski at gnu.org> wrote:
>
> This would decode surrogates!
>
> import array
> string = QApplication.clipboard().text()
> # string = '\U0001f637'
> # string = '\ufeff\ud83d\ude87'
> try:
>     # sane case - valid unicode
>     string.encode('utf-8')
> except UnicodeEncodeError:
>     # insane case - need to decode surrogates
>     string = array.array('H', map(ord, list(string))).tobytes().decode('utf-16')
> print(string)
>
> The string is split into characters, converted to integers, packed as
> 16-bit unsigned int, converted to bytes and decoded as UTF-16. Real
> characters over 0xffff would raise OverflowError in that expression.
> That's why it's a fallback if UTF-8 encoding doesn't work.
>
> Of course it's a workaround. QApplication.clipboard().text() should
> not return surrogates.
>
> --
> Regards,
> Pavel Roskin
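
For comparison, here is a minimal sketch of the same surrogate-joining idea using only the str codecs, without the array module. The helper name merge_surrogates is hypothetical (it is not part of PyQt), and the sketch assumes Python 3.4 or later, where the 'surrogatepass' error handler works with the UTF-16 codecs. It is not a confirmed fix for the error reported above. A lone high surrogate is one input that would produce an "unexpected end of data" message when decoded strictly, so this version passes unpaired surrogates through unchanged instead of raising.

```python
def merge_surrogates(text):
    """Combine UTF-16 surrogate pairs in a str into single code points.

    Hypothetical helper for illustration; not part of PyQt.
    """
    # 'surrogatepass' lets the codec write surrogate code points as raw
    # 16-bit units instead of raising UnicodeEncodeError. Characters above
    # 0xffff are encoded as a proper pair, so there is no OverflowError.
    raw = text.encode('utf-16-le', 'surrogatepass')
    # Decoding joins valid high/low pairs; 'surrogatepass' keeps an
    # unpaired surrogate as-is instead of raising UnicodeDecodeError.
    return raw.decode('utf-16-le', 'surrogatepass')


paired = merge_surrogates('\ud83d\ude37')   # two surrogates, as from the clipboard
astral = merge_surrogates('\U0001f637')     # already a single code point
lone = merge_surrogates('\ud83d')           # unpaired high surrogate
```

Because the byte order is given explicitly, a leading '\ufeff' is kept as an ordinary character rather than consumed as a byte order mark. A string that still contains an unpaired surrogate after this step will still fail string.encode('utf-8'), so callers may want to check for that separately.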