[PyQt] PyQt cannot transform QString into str when reading emoji symbol from QClipboard
Phil Thompson
phil at riverbankcomputing.com
Fri Jan 23 08:45:53 GMT 2015
On 22/01/2015 12:13 pm, Ilya Kulakov wrote:
> I'm testing the following symbol: 😷
>
> I wrote a simple Objective-C application to check how the native frameworks
> would encode this into UTF-8. Here is the code:
>
> NSString *str = [[NSPasteboard generalPasteboard]
> stringForType:@"public.utf8-plain-text"];
> const char *cstr = str.UTF8String;
> size_t i = 0;
> while (cstr[i] != 0)
> {
> NSLog(@"0x%x", cstr[i]);
> ++i;
> }
>
> Then I wrote a simple Qt app to ensure that the returned QString has the
> same bytes:
>
> QClipboard *clipboard = QApplication::clipboard();
> QString originalText = clipboard->text();
> QByteArray bytes = originalText.toUtf8();
> for (size_t i = 0; i < bytes.count(); ++i)
> qDebug("0x%x", bytes.at(i));
>
> In both apps the output is:
>
> 0xfffffff0
> 0xffffff9f
> 0xffffff98
> 0xffffffb7
>
> However, when I extract the text using PyQt (Python 3):
>
> QApplication.clipboard().text()
>
> the returned str cannot be encoded to UTF-8 due to the surrogate
> '\ud83d' at position 0.
> However, as you can see above, there is no such symbol in the UTF-8 bytes.
>
> That raises 2 questions:
> 1. How was this symbol introduced?
> 2. How should this be handled in an application?
>
> The original bug report we received was from a Windows user, but we
> were not able to reproduce it there. However, it's pretty easy to
> reproduce on Mac OS X.
>
> Best Regards,
> Ilya Kulakov
Should be fixed in tonight's PyQt4 and PyQt5 snapshots.
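
For anyone stuck on an older build, here is a minimal sketch of one possible
workaround. It is not taken from the PyQt sources. It assumes the str returned
by text() holds both UTF-16 halves of the character as separate code points
('\ud83d' followed by '\ude37'); the helper name repair_surrogates is made up
for illustration. Round-tripping through UTF-16 with the "surrogatepass" error
handler lets Python merge the pair back into U+1F637, which then encodes to
the same four UTF-8 bytes listed above.

```python
# Hypothetical helper, not part of PyQt: merge UTF-16 surrogate halves
# that ended up as separate code points in a Python 3 str.
def repair_surrogates(text):
    # "surrogatepass" allows surrogate code points to be written out as
    # UTF-16 code units; decoding then combines valid high/low pairs.
    return text.encode("utf-16", "surrogatepass").decode("utf-16")


# Assumed shape of the broken value: both halves of U+1F637.
broken = "\ud83d\ude37"
fixed = repair_surrogates(broken)

print(hex(ord(fixed)))        # 0x1f637
print(fixed.encode("utf-8"))  # b'\xf0\x9f\x98\xb7'
```

If the string contains only a lone surrogate, with no matching second half,
the decode step raises UnicodeDecodeError, so this is only a stopgap.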
Thanks,
Phil