[PyKDE] How do you get HTML source from konqueror/KHTMLPart?
Marcos Dione
mdione at grulic.org.ar
Wed Dec 20 23:24:58 GMT 2006
On Wed, Dec 20, 2006 at 10:59:06AM -0800, yichun wei wrote:
> I am trying to grab some html pages via KHTMLPart.openURL and scrape
> the content I get. However I am not able to read out the HTML document
> sources I have in KHTMLPart.
just call:

    domDocu = part.document()
    html = domDocu.toString().string()

html is then a QString.
> kdelibs has KHTML::documentSource in khtml that can return the source of the
> pages since 2005, however I only found .document() in pyKDE.
yes; either it disappeared from the sources or sip didn't pick it up
or something.
> toHTML() seemed to return nothing (None or ""), while toString() gave
> me an exception and my script crashed:
yes, under certain circumstances that happens. I think it's because
the KHTMLPart has no parentWidget or no parent or both. if you set up the
whole apparatus for showing the part, everything works just fine.
> I find
> some discussion which point me to use KIO.get, but it returns a
> TransferJob and I have no idea how to get a QString from a
> TransferJob...
the KIO jobs[1] emit a data() signal each time a chunk arrives. just use
a KIO::get job and connect that signal to a slot that accumulates the
data. there's another signal, result(), emitted when the job finishes.
you could also use NetAccess[2].
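
a rough sketch of the accumulating slot, in case it helps. the collector class below is plain Python, and its name (SourceCollector) and method names are made up for illustration. the PyKDE wiring is only shown in comments because it needs a running KApplication; the URL there is a placeholder, and how you turn the QByteArray into a Python string may depend on your PyQt version, so treat that part as an assumption:

```python
class SourceCollector(object):
    """Hypothetical helper: gathers the chunks a transfer job delivers."""

    def __init__(self):
        self.chunks = []       # raw chunks, in arrival order
        self.finished = False  # set once the job reports its result
        self.source = None     # full page source, available after on_result()

    def on_data(self, job, data):
        # Slot for the job's data() signal: called once per chunk.
        # Empty chunks carry no content, so skip them.
        if len(data) == 0:
            return
        self.chunks.append(bytes(data))

    def on_result(self, job):
        # Slot for the job's result() signal: join everything received.
        self.source = b"".join(self.chunks)
        self.finished = True


# Wiring under PyKDE 3 / PyQt 3 (assumed, not executed here):
#   job = KIO.get(KURL("http://www.kde.org/"), False, False)
#   QObject.connect(job, SIGNAL("data(KIO::Job*, const QByteArray&)"),
#                   collector.on_data)
#   QObject.connect(job, SIGNAL("result(KIO::Job*)"), collector.on_result)

# Simulated run: feed the chunks by hand instead of from a real job.
collector = SourceCollector()
for chunk in (b"<html><body>", b"hello", b"</body></html>"):
    collector.on_data(None, chunk)
collector.on_result(None)
```

after on_result() runs, collector.source holds the whole page as one byte string. a real slot should probably also check job.error() in on_result() before trusting the data.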
--
[1] http://developer.kde.org/documentation/library/3.5-api/kdelibs-apidocs/kio/kio/html/index.html
[2] http://developer.kde.org/documentation/library/3.5-api/kdelibs-apidocs/kio/kio/html/classKIO_1_1NetAccess.html
--
(Not so) Random fortune:
[11:50] <xanthus> m4rgin4l: yes, but it's a civilized country, however chaotic it may be
-- xanthus, talking about Argentina.