[PyQt] QString API v2 concern...
Phil Thompson
phil at riverbankcomputing.com
Sat May 11 17:12:59 BST 2013
On Thu, 9 May 2013 11:59:34 -0700, Matt Newell <newellm at blur.com> wrote:
> On Monday, May 06, 2013 07:49:25 AM Phil Thompson wrote:
>> The first PyQt5 snapshots are now available. You will need the current
>> SIP snapshot. PyQt5 can be installed alongside PyQt4.
>>
>> I welcome any suggestions for additional changes - as PyQt5 is not
>> intended to be compatible with PyQt4 it is an opportunity to fix and
>> improve things.
>>
>> Current changes from PyQt4:
>>
>> - Versions of Python earlier than v2.6 are not supported.
>>
>> - PyQt4 supported a number of different API versions (QString, QVariant
>> etc.). PyQt5 only implements v2 of those APIs for all versions of
>> Python.
>>
>
> I haven't looked into this in depth, but I am a bit worried about the
> possible performance impact of QString always being converted to a
> Python str/unicode (not to mention the added porting work when going
> C++ <-> Python).
>
> The vast majority of the PyQt code that we use loads data from libraries
> that deal with Qt types, and either directly loads that data into
> widgets, or does some processing then loads the data into widgets. I
> suspect that this kind of usage is very common.
>
> As an example, a user of QtSql with the qsqlpsql driver that loads data
> and displays it in a list view is going to see the following data
> transformations/copies:
>
> PyQt4 with v1 QString api:
>
> libpq data comes from socket
> -> QString (probable utf8->utf16)
> -> PyQt wrapper of QString (actual data not copied or converted)
> -> QString (pointer dereference to get Qt type)
>
> PyQt5, PyQt4 with v2 QString api:
>
> libpq data comes from socket
> -> QString (probable utf8->utf16)
> -> unicode (deep copy of data)
> -> QString (deep copy of data)
>
> So instead of one conversion we now have one conversion and two deep
> copies. Another very probable side effect is that in many cases the
> original QString and/or the unicode object will be held in memory,
> resulting in two or possibly even three copies of the data. Even if all
> but the last stage is freed, there will still be two or three copies in
> memory during processing, depending on how the code is written, which
> can reduce performance quite a bit, depending on data size, because of
> CPU cache flushing.
>
> So far this is completely theoretical, and I'm sure that in a large
> portion of applications it will have no noticeable effect. However, I
> don't like the idea that things may get permanently less efficient for
> apps that do process and display larger data sets.
>
> The one thing that stands out to me as possibly being a saving grace is
> the fact that (at least in my understanding) both Qt and Python use
> UTF-16 as their internal string format, which means fast copies instead
> of slower conversions, and that it may be possible with some future
> Qt/Python changes to actually allow QString -> unicode -> QString
> without any data copies.
>
> At some point I will try to do some benchmarks and look into the actual
> code to see if there is an elegant solution to this potential problem.

The v2 API was first considered for PyQt3. It was rejected because of
performance concerns, but those concerns were never validated. PyQt4 under
Python 3 has always defaulted to the v2 API - a period of 4 years - and
I've never seen any complaints about it.

So, yes, you need to show there is a real problem.
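
A rough starting point for such a measurement might look like the sketch
below. It is only a stand-in: it uses the standard library alone and never
touches PyQt, so a bytes buffer of UTF-16 data plays the role of the
QString, and the decode/encode pair plays the role of the two deep copies
in the v2 path described above. The names round_trip_utf16, pass_through
and time_call are made up for illustration, and a real benchmark would
have to run the actual QtSql-to-widget path under both APIs.

```python
import timeit


def round_trip_utf16(raw_utf16):
    """Stand-in for QString -> Python str -> QString (two deep copies)."""
    text = raw_utf16.decode("utf-16-le")  # copy 1: UTF-16 buffer -> str
    return text.encode("utf-16-le")       # copy 2: str -> UTF-16 buffer


def pass_through(raw_utf16):
    """Stand-in for the v1 path: the same buffer is handed on uncopied."""
    return raw_utf16


def time_call(func, payload, repeats=200):
    """Best-of-3 wall time, in seconds, for `repeats` calls of func."""
    return min(timeit.repeat(lambda: func(payload), number=repeats, repeat=3))


# Hypothetical payload: a few tens of kilobytes of UTF-16 text.
sample = ("row of text from the database " * 1000).encode("utf-16-le")

if __name__ == "__main__":
    print("pass through: %.6f s" % time_call(pass_through, sample))
    print("round trip:   %.6f s" % time_call(round_trip_utf16, sample))
```

Varying the size of the payload would show whether the extra copies only
start to matter once the data no longer fits in the CPU cache.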
Phil