[PyQt] python bindings to qwebkit - who's responsible for doing this work (anyone?)

Luke Kenneth Casson Leighton lkcl at lkcl.net
Tue Oct 14 22:22:05 BST 2008


hiya folks,

last month or so i added glib bindings to webkit, in order to make
them available via pygtk's codegen.py as python bindings, for
pywebkitgtk.  to make that as clear as mud: around webkit's c++ DOM
bindings i added glib bindings so that i could add python bindings.
wait - don't laugh - it does actually make sense - there _is_ a good
reason for doing it that way :)

the background to this story of insanity (300 auto-generated glib
objects and 1,570 auto-generated functions and the work's only 70%
complete) is that i decided to port pyjamas, itself a port of GWT, to
pure python.  in looking for decent technology to utilise, i stumbled
onto webkit.

now, of course, because pyjamas ( http://pyjd.org and http://pyjs.org
) absolutely rocks, and is utterly cool, i'd like to see pyjamas
ported to different DOM model layers, and so, a couple of days ago, i
had a quick hack at doing pyjamas-khtml:
http://github.com/lkcl/pyjamas-desktop/tree/master/pyjamas-khtml which
unfortunately isn't going so well, because there's a fundamental flaw
in the design of the python bindings to khtml.DOM:
https://bugs.kde.org/show_bug.cgi?id=172740

it's for _this_ reason that i'm looking to contact the people who are
doing the work on QWebKit, to find out what the plans are for doing
python bindings, and to advise you - whoever you might be - to watch
out for two things:
1) not to make the same design mistake
2) to consider ways in which drastic amounts of development time can be saved

... allow me to explain :)  issue 1, first.

there's something very, very important that gobject can do, and i am
unsure that it is appropriate to use e.g. QObject to do the same
thing, or even if Qt4 is capable of doing what gobject can provide
(which it could very well do, but i don't know how), and it's this:
gobject can do "base classing".  object inheritance trees, and,
absolutely absolutely critically, run-time typecasting.

even more importantly than that, python-gobject can "pick up" on the
gobject "type" and will return the correct python object.  the key is
here, in create_gdom_object(), line 284, of GDOMBinding.cpp:
http://github.com/lkcl/webkit/tree/16401.master/WebCore/bindings/gdom/GDOMBinding.cpp#L294
the critical line is this:
   gpointer res = g_object_new(dob->getGobjType(), NULL);
that getGobjType() function is a pure virtual function, in the glib
bindings c++ object (which represents the webkit DOM object).  the
auto-generator (Gobject.pm) creates each and every one of the 300
classes in which getGobjType() is implemented, and so they return
GDOM_GOBJECT_TYPE_NODE, GDOM_GOBJECT_TYPE_ELEMENT,
GDOM_GOBJECT_TYPE_HTML_BODY_ELEMENT etc. etc. etc.

from there, python-pygobject is clever enough to implement the
equivalent of c++ dynamic typecasting.  so _even_ when you do
pywebkitgtk.document.getElementById("body_id"), you don't get back a
GDOM_GOBJECT_TYPE_NODE - i.e. a pywebkitgtk.DOMNode python object,
with the underlying c++ object typecast to the WebCore::Node* base
class.  instead, a GDOM_GOBJECT_TYPE_HTML_BODY_ELEMENT is returned,
and thus python-pygobject creates a pywebkitgtk.HTMLBodyElement.
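
to make the idea concrete, here's a rough pure-python sketch of that
dispatch.  it is NOT the real pygobject or webkit-glib code: the
Native* classes, type_tag(), WRAPPERS, wrap() and get_element_by_id()
are all made-up names for illustration, with type_tag() standing in for
getGobjType() and a plain dict standing in for the document.

```python
# Hypothetical stand-ins for the C++ side: each native class reports its own
# type tag, playing the role that getGobjType() plays in the glib bindings.
class NativeNode:
    def type_tag(self):
        return "NODE"


class NativeElement(NativeNode):
    def type_tag(self):
        return "ELEMENT"


class NativeHTMLBodyElement(NativeElement):
    def type_tag(self):
        return "HTML_BODY_ELEMENT"


# Python-side wrapper classes, mirroring the same inheritance tree.
class DOMNode:
    def __init__(self, native):
        self._native = native


class Element(DOMNode):
    pass


class HTMLBodyElement(Element):
    pass


# Registry from type tag to wrapper class (an assumption for this sketch;
# in the real bindings the GType system and pygobject do this job).
WRAPPERS = {
    "NODE": DOMNode,
    "ELEMENT": Element,
    "HTML_BODY_ELEMENT": HTMLBodyElement,
}


def wrap(native):
    """Return the most-derived wrapper for a native object."""
    return WRAPPERS[native.type_tag()](native)


def get_element_by_id(document, element_id):
    # The DOM call is declared to return a plain node, but the wrapper
    # class is chosen at run time from the object's own type tag.
    return wrap(document[element_id])
```

the point is that the caller never asks for an HTMLBodyElement: wrap()
picks it because the object itself says what it is, so no manual
"typecasting" functions are needed on the python side.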

like i mentioned in my previous post, regarding khtml.DOM, i cannot
express strongly enough how absolutely vital it is that this is done
correctly.  as a "workaround", khtml.DOM offers "typecasting"
functions - in python - and the use of these functions is a complete
nightmare (for reasons explained in the kde-devel post).

without the combination of a HashMap and the support and use of the
equivalent of c++ runtime typecasting, developers who use
python-qwebkit for any serious DOM manipulation work are... screwed.
utterly.

issue 2, next.

many people assume that glib equals gtk, and that gtk equals glib.
and that glib equals gobject equals gtk.  i certainly made the mistake
of thinking that, in order to do gobject bindings, i would need the
gtk libraries.

you don't.

glib and gobject have nothing to do with gtk (but - Gtk is _entirely_
dependent on gobject).

so, when i said that i had done glib bindings around webkit's DOM
model - i really _meant_ that i had done glib bindings around webkit -
NOT i repeat NOT gtk bindings [around webkit's DOM model].

the importance of this - particularly in saving you (whoever you
might be) vast amounts of time and effort - cannot be overestimated.

if you utilise the glib bindings to webkit to provide python-qwebkit
with bindings to webkit's DOM model, my guess is that it would take
about... oooo.... eight hours, absolute maximum.

if you endeavour to do a separate and distinct set of bindings to the
webkit DOM model - either direct in python or onto QObject (based on
GObject.pm auto-generator or the JSBindings auto-generator) i estimate
that the average developer will take approximately.... one month to
complete the work (300 objects), excluding Canvas DOM bindings which
would be about another 7 to 10 days (a further 100 to 150 objects).

if you endeavour to work off the back of the python-pykde work, using SIP:
  http://websvn.kde.org/trunk/KDE/kdebindings/python/pykde4/sip/khtml/
you might be able to save some time (note the similarity between the
sip files, there, and the .IDL files in webkit!) but... much of your
effort would be spent manually writing the SIP files - unless you
wrote an auto-converter, or modified Sip4 to take webkit IDL files as
input - a much more productive route that would be less fraught with
errors and maintenance headaches (as webkit evolves)...

... but you'd still have the same issue to deal with that python-khtml
faces, which has been solved and proven to work in the webkit-glib DOM
bindings, already.

so, my _recommendation_ would be to not turn noses up at gobject.h and
glib.h just because they've got the letter "g" in front of them -
you're _exceedingly_ unlikely to want to wrap each and every single
object in a DOM model with a full-blown QWidget - that's wildly
inappropriate.  you're not going to need to pull in gtk2 libraries.
vast amounts of time and effort are saved by pulling in _just_
libglib.

what else....

oh yes: there are three strategic functions (that really need to be
reduced to one - maybe two) in the interface between gtk and the
webkit-glib DOM bindings.  it revolves around the allocation of Page*
and Frame* objects.  with a little bit of tweaking, resulting in maybe
one or two _simple_ functions totalling about... ooo... twenty or so
lines of code per webkit "port" (e.g. QWebKit, e.g. WebkitGTK) i
reckon that the webkit-glib DOM bindings could become _completely_
independent.

the key is to make a webkit-glib function that takes a Frame* and
creates a gobject "GDOMFrame", and another that takes a Page* and
creates a gobject "GDOMPage".  astute readers will have noticed that
there _is_ no function in any of the webkit "ports" (e.g. QWebKit)
that creates "WebCore::Page" c++ objects - because it's a base class.

in the QWebKit port, that means that in WebKit/qt/Api/qwebpage.cpp,
QWebPagePrivate's constructor would call this webkit-glib function to
create a gobject "GDOMPage"; in the WebkitGtk port, in
WebKit/gtk/webkit/webkitwebview.cpp, webkit_web_view_init would do
likewise.

that's the... general idea, anyway.  it gets a bit like that jackie
chan film where he jumps up a 12 ft wall by bouncing off alternate
walls in a corner - bootstrapping your way up, but that's just
unavoidable given the way that stuff gets passed around from view, to
page, to frame, to view, to document, to view, to page and back again
:)  all you're doing is throwing in a hook into the middle and saying
"scuse me, i'm a GDOMPage, i'm responsible for the Page* object now:
hand it over, and if you want access to the Page*, call my private
function" - likewise for the GDOMFrame object.

it's the GDOMFrame object that's the "kingpin".

once the Frame* object is inside a GDOMFrame, the python bindings -
and any other bindings (e.g. webkit-glib-mm c++), and any other
developers who want to use the gobject bindings directly - do not have
to know ANYTHING and i really really mean ANYTHING about the "port"
(webkit terminology - meaning e.g. QWebKit or WebkitGTK) that they are
operating under.
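
as a toy illustration of that separation (python, not the real c++,
and every name in it is invented for the example): each "port" supplies
one small hook that hands its frame over to a GDOMFrame, and the
application code only ever touches the GDOMFrame.  the dicts stand in
for Frame* objects; qt_port_create_frame(), gtk_port_create_frame() and
describe() are hypothetical.

```python
class GDOMFrame:
    """Hypothetical port-neutral handle that takes charge of the native frame."""

    def __init__(self, native_frame):
        # Private: callers are not meant to reach the port's own object.
        self._native_frame = native_frame

    def document_title(self):
        return self._native_frame["title"]


# One small hook per port; everything after this point is shared.
def qt_port_create_frame(title):
    native = {"port": "qt", "title": title}  # stands in for a Frame* from QWebKit
    return GDOMFrame(native)


def gtk_port_create_frame(title):
    native = {"port": "gtk", "title": title}  # stands in for a Frame* from WebkitGTK
    return GDOMFrame(native)


def describe(frame):
    """Application code: uses only GDOMFrame, never anything port-specific."""
    return "title=" + frame.document_title()
```

describe() gives the same answer whichever hook built the frame, which
is the whole idea: the port-specific part shrinks to the hook.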

amazingly, then, you'd be able to write applications that would
operate on the webkit DOM model _not caring_ if the DOM model display
mechanism was... a qt screen, a gtk screen, a wxWidgets screen or
even.... if someone had added webkit into the curses-based lynx for
goodness sake! :)

... more about _that_ in another post :)

anyway.  some of what i describe above, for this 2nd issue, is mentioned here:
  https://bugs.webkit.org/show_bug.cgi?id=20632

apologies if this is a bit long - it's also quite involved, so it's unavoidable.

webkit rocks.

l.

