Class HTMLParser
object --+
         |
         _BaseParser --+
                       |
                       _FeedParser --+
                                     |
                                     HTMLParser
The HTML parser. This parser allows reading HTML into a normal XML
tree. By default, it can read broken (non well-formed) HTML, depending
on the capabilities of libxml2. Use the 'recover' option to switch this
off.
Available boolean keyword arguments:
* recover - try hard to parse through broken HTML (default: True)
* no_network - prevent network access for related files (default: True)
* remove_blank_text - discard empty text nodes
* remove_comments - discard comments
* remove_pis - discard processing instructions
* compact - save memory for short text content (default: True)
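A minimal sketch of how some of the boolean options might be combined, assuming lxml is installed and importable as `lxml.etree`. The sample markup and variable names are illustrative only; `recover=True` is already the documented default and is spelled out here for clarity.

```python
from lxml import etree

# Illustrative input: two unclosed <p> tags and a comment.
broken_html = "<html><body><!-- note --><p>First<p>Second</body></html>"

# recover=True is the documented default; remove_comments drops comment nodes.
parser = etree.HTMLParser(recover=True, remove_comments=True)

# Parse the string with the configured parser and get the root element.
root = etree.fromstring(broken_html, parser)

paragraphs = [p.text for p in root.iter("p")]
comments = list(root.iter(etree.Comment))

print(root.tag, paragraphs, len(comments))
```

How much broken markup can be repaired depends on the underlying libxml2 version, so results for more heavily damaged input may differ.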
Other keyword arguments:
* encoding - override the document encoding
* target - a parser target object that will receive the parse events
* schema - an XMLSchema to validate against
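The 'target' argument can be illustrated with a small event collector. This is a sketch under the assumption that the target follows the usual lxml parser target interface (`start`, `end`, `data`, `close`); the `LinkCollector` class and the sample markup are hypothetical and not part of lxml.

```python
from lxml import etree


class LinkCollector:
    """Hypothetical parser target that records href values of <a> tags."""

    def __init__(self):
        self.links = []

    def start(self, tag, attrib):
        # Called for every opening tag with its attributes.
        if tag == "a" and "href" in attrib:
            self.links.append(attrib["href"])

    def end(self, tag):
        pass  # closing tags are not needed here

    def data(self, data):
        pass  # text content is ignored

    def close(self):
        # The return value of close() becomes the result of the parse call.
        links, self.links = self.links, []
        return links


parser = etree.HTMLParser(target=LinkCollector())
links = etree.fromstring('<p><a href="/a">A</a> <a href="/b">B</a></p>', parser)
print(links)
```

With a target in place, no element tree is built; the parse call returns whatever the target's `close()` method returns.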
Note that you should avoid sharing parsers between threads for
performance reasons.
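One way to follow this advice is to keep a separate parser per thread. The sketch below uses `threading.local` from the standard library; the helper names `get_parser` and `extract_title` are illustrative and not part of lxml.

```python
import threading

from lxml import etree

# Each thread stores its own parser instance on this object.
_thread_state = threading.local()


def get_parser():
    """Return the HTMLParser owned by the calling thread, creating it once."""
    parser = getattr(_thread_state, "parser", None)
    if parser is None:
        parser = etree.HTMLParser()
        _thread_state.parser = parser
    return parser


def extract_title(html):
    """Parse html with the thread's own parser and return the <title> text."""
    root = etree.fromstring(html, get_parser())
    return root.findtext(".//title")


results = {}


def worker(name):
    html = "<html><head><title>%s</title></head></html>" % name
    results[name] = extract_title(html)


threads = [threading.Thread(target=worker, args=("page-%d" % i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```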
__init__(...)
    x.__init__(...) initializes x; see x.__class__.__doc__ for signature
__new__(...)
    Returns a new object with type S, a subtype of T
Inherited from _FeedParser: close, feed
Inherited from _BaseParser: copy, makeelement, setElementClassLookup, set_element_class_lookup
Inherited from object: __delattr__, __getattribute__, __hash__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__
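The inherited feed and close methods allow input to be supplied incrementally. A minimal sketch, assuming lxml is installed; the chunk boundaries below are arbitrary and chosen only for illustration.

```python
from lxml import etree

parser = etree.HTMLParser()

# Illustrative chunks, e.g. as they might arrive from a network stream.
chunks = [
    "<html><body><h1>Title</h",
    "1><p>Body text</p></bo",
    "dy></html>",
]

for chunk in chunks:
    parser.feed(chunk)  # hand the next piece of input to the parser

# close() finishes parsing and returns the root element of the tree.
root = parser.close()
heading = root.findtext(".//h1")
print(heading)
```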
__init__(...)
(Constructor)
x.__init__(...) initializes x; see x.__class__.__doc__ for signature
- Overrides: object.__init__

__new__(...)
- Returns: a new object with type S, a subtype of T
- Overrides: object.__new__