Class hierarchy: object -> _BaseParser -> _FeedParser -> HTMLParser
The HTML parser. This parser allows reading HTML into a normal XML tree. By default, it can read broken (not well-formed) HTML, depending on the capabilities of libxml2. Set the ``recover`` option to False to switch this off.
Available boolean keyword arguments:

* recover - try hard to parse through broken HTML (default: True)
* no_network - prevent network access (default: True)
* remove_blank_text - discard empty text nodes
* remove_comments - discard comments
* remove_pis - discard processing instructions
* compact - save memory for short text content (default: True)
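The keyword arguments above can be combined when constructing the parser. The following is a minimal sketch, assuming a current lxml installation; the sample markup and variable names are illustrative only and do not come from the library documentation.

```python
from lxml import etree

# Illustrative input: the <p> element is never closed and a comment is embedded.
broken_html = "<html><body><p>Hello<!-- note --><br>world</body></html>"

# recover=True is the default; it is spelled out here for clarity.
parser = etree.HTMLParser(recover=True, remove_comments=True)

# Parse the string with the configured parser and get the root element.
root = etree.fromstring(broken_html, parser)

# Look up the recovered paragraph and collect any comment nodes that survived.
paragraph = root.find(".//p")
comments = list(root.iter(etree.Comment))
```

With ``remove_comments`` enabled, the comment should not appear in the resulting tree, while the unclosed paragraph is still recovered as an element.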
You can pass a parser target as the ``target`` keyword argument.
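A parser target is an object that receives parse events instead of having a tree built. The sketch below assumes the target interface used by current lxml releases (``start``, ``end``, ``data``, ``close``); the ``TagCollector`` class is a hypothetical example, not part of lxml.

```python
from lxml import etree


class TagCollector:
    """Hypothetical parser target that records every start tag it sees."""

    def __init__(self):
        self.tags = []

    def start(self, tag, attrib):
        # Called for each opening tag.
        self.tags.append(tag)

    def end(self, tag):
        # Called for each closing tag; nothing to do here.
        pass

    def data(self, data):
        # Called for text content; ignored in this sketch.
        pass

    def close(self):
        # Called when parsing finishes; the return value becomes the parse result.
        tags, self.tags = self.tags, []
        return tags


parser = etree.HTMLParser(target=TagCollector())

# With a target, the parse call returns whatever close() returned, not a tree.
tags = etree.fromstring("<html><body><p>Hi <b>there</b></p></body></html>", parser)
```

Because the target resets its list in ``close()``, the same parser can be reused for further documents.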
Note that you should avoid sharing parsers between threads for performance reasons.
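One way to follow this advice is to keep a separate parser per thread. The sketch below uses ``threading.local`` for that purpose; the helper names ``get_parser`` and ``parse_title`` are hypothetical, and this is only one possible arrangement, not a pattern prescribed by lxml.

```python
import threading

from lxml import etree

# Thread-local storage so that each thread lazily creates its own parser.
_local = threading.local()


def get_parser():
    """Return the HTMLParser that belongs to the calling thread."""
    parser = getattr(_local, "parser", None)
    if parser is None:
        parser = _local.parser = etree.HTMLParser()
    return parser


def parse_title(html, results, index):
    """Parse one document and store its <title> text in the results list."""
    root = etree.fromstring(html, get_parser())
    results[index] = root.findtext(".//title")


docs = [
    "<html><head><title>A</title></head><body></body></html>",
    "<html><head><title>B</title></head><body></body></html>",
]
results = [None] * len(docs)

threads = [
    threading.Thread(target=parse_title, args=(doc, results, i))
    for i, doc in enumerate(docs)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```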
Generated by Epydoc 3.0beta1 on Sun Sep 16 00:12:46 2007 (http://epydoc.sourceforge.net)