[PyQt] General questions on parsing large QStrings

Mathias.Born at gmx.de Mathias.Born at gmx.de
Sun May 12 22:41:58 BST 2013


On 12.05.2013, 22:19:55 David Cortesi wrote:
> For an app to be built with PyQt5/Qt5, I will have a
> QPlainTextEdit in which the document may be quite
> sizable, 500K characters or more.

> I will want at times to inspect the document character
> by character, or possibly apply Python relib REs to it.

> I am somewhat at sea regarding the relationship between
> a const QString such as returned by QPlainTextEdit.toPlainText()
> and a Python3 unicode string, and -- just in general -- about
> the best way to do intensive examination of big strings.
> Is there a copy involved in, e.g.

>     docstring = unicode( myEditor.toPlainText() )

> I note that the PyQt4 QString reference omits the 
> QString.begin() or .constBegin() etc methods that return an
> "STL-style iterator" so that's out. Is there some internal magic
> to integrate the QString type into Python's "for" mechanism
> so that "for c in myEditor.toPlainText()" might be more
> efficient than making a Python3 string and iterating on it?

> Also in regard to making intensive loops faster, 
> how well do PyQtx calls integrate with Cython or PyPy?

> Thanks for any insights,

> Dave Cortesi

Hi,

This is how you can find out things like this yourself:

"QPlainTextEdit.toPlainText()" is wrapped in
<python dir>\sip\PyQt5\QtWidgets\qplaintextedit.sip as

QString toPlainText() const;

QString is not wrapped as a regular class, but defined as a mapped
type in <python dir>\sip\PyQt5\QtCore\qstring.sip.
The part that is of interest to you is the conversion C++ -> Python:

%ConvertFromTypeCode
    return qpycore_PyObject_FromQString(*sipCpp);
%End

This function can be found in the source code of PyQt5:

// Convert a QString to a Python Unicode object.
PyObject *qpycore_PyObject_FromQString(const QString &qstr)
{
    PyObject *obj;

#if defined(PYQT_PEP_393)
    obj = PyUnicode_FromKindAndData(PyUnicode_2BYTE_KIND, qstr.constData(),
            qstr.length());
#elif defined(Py_UNICODE_WIDE)
    QVector<uint> ucs4 = qstr.toUcs4();

    if ((obj = PyUnicode_FromUnicode(NULL, ucs4.size())) == NULL)
        return NULL;

    memcpy(PyUnicode_AS_UNICODE(obj), ucs4.constData(),
            ucs4.size() * sizeof (Py_UNICODE));
#else
    if ((obj = PyUnicode_FromUnicode(NULL, qstr.length())) == NULL)
        return NULL;

    memcpy(PyUnicode_AS_UNICODE(obj), qstr.utf16(),
            qstr.length() * sizeof (Py_UNICODE));
#endif

    return obj;
}

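As you can see, every branch copies the string data into a newly
allocated Python object, so each call to toPlainText() duplicates the
whole document. With a document of your size, this will use a lot of
memory.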
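
Since each conversion makes a copy, one way to limit the cost is to call
toPlainText() once, keep the resulting Python string, and run both the
regular expressions and the character loop on that one object. What
follows is only a minimal sketch: the name scan_document and the pattern
are made up for illustration, and a plain str stands in for the editor's
text so that it runs without Qt.

```python
import re
import sys

# Hypothetical pattern, for illustration only: runs of two or more spaces.
MULTISPACE = re.compile(r" {2,}")


def scan_document(text):
    """Examine a document that is already a Python str.

    `text` stands for the result of a single toPlainText() call.
    """
    # Regular expressions operate directly on the str.
    spans = [m.span() for m in MULTISPACE.finditer(text)]
    # Character-by-character inspection of the same str object.
    non_ascii = sum(1 for ch in text if ord(ch) > 127)
    return spans, non_ascii


# Stand-in for: text = my_editor.toPlainText()  (one conversion, one copy)
text = "alpha  beta\n" * 1000 + "caf\u00e9\n"
spans, non_ascii = scan_document(text)

# sys.getsizeof reports the size in bytes of the Python-side copy.
print(len(spans), non_ascii, sys.getsizeof(text))
```

Calling toPlainText() inside the loop instead would repeat the copy on
every iteration.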

Best Regards,
Mathias Born



