Package lxml :: Module etree :: Class HTMLParser

Class HTMLParser

    object --+        
             |        
??._BaseParser --+    
                 |    
       _FeedParser --+
                     |
                    HTMLParser
Known Subclasses:

HTMLParser(self, encoding=None, remove_blank_text=False, remove_comments=False, remove_pis=False, strip_cdata=True, no_network=True, target=None, schema: XMLSchema=None, recover=True, compact=True, collect_ids=True)

The HTML parser.

This parser allows reading HTML into a normal XML tree. By default, it can read broken (non well-formed) HTML, depending on the capabilities of libxml2. Use the 'recover' option to switch this off.
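As a minimal sketch of the default behaviour described above, the following parses a fragment with unclosed tags using a default HTMLParser. The input string and variable names are illustrative assumptions; the exact tree produced for more badly broken input depends on the libxml2 version in use.

```python
from lxml import etree

# Illustrative input: no <html>/<body> wrapper, unclosed <p>, bare <br>.
broken_html = "<p>Hello<br>world"

# recover=True is the default, so the parser tries to build a tree anyway.
parser = etree.HTMLParser()
root = etree.fromstring(broken_html, parser)

# The result is a normal element tree rooted at <html>.
paragraph = root.find(".//p")
line_break = root.find(".//br")
print(etree.tostring(root, encoding="unicode"))
```

Passing recover=False asks the parser to stop trying to repair input; which inputs are then rejected is decided by libxml2, so it is best checked against the installed version.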

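Available boolean keyword arguments:

- recover - try hard to parse through broken HTML (default: True)
- no_network - prevent network access for related files (default: True)
- remove_blank_text - discard empty text nodes that are ignorable, i.e. not actual text content (default: False)
- remove_comments - discard comments (default: False)
- remove_pis - discard processing instructions (default: False)
- strip_cdata - replace CDATA sections by normal text content (default: True)
- compact - save memory for short text content (default: True)
- collect_ids - use a hash table of XML IDs for fast access (default: True)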
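The boolean keyword arguments are passed to the constructor. The sketch below compares a default parser with one created with remove_comments=True; the sample markup and the helper count_comments are assumptions made for illustration.

```python
from lxml import etree

# Illustrative document containing one comment.
html = "<html><body><!-- draft note --><p>Text</p></body></html>"


def count_comments(root):
    """Count comment nodes anywhere below (and including) root."""
    return sum(1 for _ in root.iter(etree.Comment))


# Default parser: comments are kept in the tree.
default_root = etree.fromstring(html, etree.HTMLParser())

# remove_comments=True: comments are discarded while parsing.
clean_root = etree.fromstring(html, etree.HTMLParser(remove_comments=True))

default_count = count_comments(default_root)
clean_count = count_comments(clean_root)
print(default_count, clean_count)
```

The other boolean flags in the signature are set the same way, for example etree.HTMLParser(remove_pis=True).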

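Other keyword arguments:

- encoding - override the document encoding
- target - a parser target object that will receive the parse events
- schema - an XMLSchema to validate against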
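The non-boolean arguments encoding and target can be sketched as follows. The TagCollector class and the sample inputs are hypothetical names introduced for this example; the target follows the start/end/data/comment/close method convention used by lxml parser targets.

```python
from lxml import etree

# encoding: override the document encoding for byte input.
latin1_parser = etree.HTMLParser(encoding="iso-8859-1")
latin1_root = etree.fromstring(b"<p>caf\xe9</p>", latin1_parser)
cafe_text = latin1_root.findtext(".//p")


class TagCollector:
    """Hypothetical parser target that records start tag names."""

    def __init__(self):
        self.tags = []

    def start(self, tag, attrib):
        self.tags.append(tag)

    def end(self, tag):
        pass

    def data(self, data):
        pass

    def comment(self, text):
        pass

    def close(self):
        # Whatever close() returns becomes the result of the parse call.
        tags, self.tags = self.tags, []
        return tags


# target: the parser sends events to the target instead of building a tree.
target_parser = etree.HTMLParser(target=TagCollector())
seen_tags = etree.fromstring("<p>Hello <b>world</b></p>", target_parser)
print(cafe_text, seen_tags)
```

With a target set, the parse call returns the value of the target's close() method rather than an element tree.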

Note that you should avoid sharing parsers between threads for performance reasons.
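One way to follow the note above is to give each thread its own parser instance. This is a sketch under the assumption that threading.local storage suits the application; get_parser, parse_title and worker are hypothetical helper names.

```python
import threading

from lxml import etree

_local = threading.local()


def get_parser():
    """Return a parser owned by the calling thread, creating it on first use."""
    parser = getattr(_local, "parser", None)
    if parser is None:
        parser = _local.parser = etree.HTMLParser()
    return parser


def parse_title(html):
    root = etree.fromstring(html, get_parser())
    return root.findtext(".//title")


results = {}


def worker(name, html):
    # Each thread writes to its own key, so no extra locking is needed here.
    results[name] = parse_title(html)


threads = [
    threading.Thread(target=worker, args=(f"t{i}", f"<title>page {i}</title>"))
    for i in range(3)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(results)
```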

Instance Methods

__init__(self, encoding=None, remove_blank_text=False, remove_comments=False, remove_pis=False, strip_cdata=True, no_network=True, target=None, schema: XMLSchema=None, recover=True, compact=True, collect_ids=True)
    x.__init__(...) initializes x; see help(type(x)) for signature
__new__(T, S, ...)
    Returns a new object with type S, a subtype of T

Inherited from _FeedParser: close, feed
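The inherited feed and close methods allow incremental parsing. A minimal sketch, assuming the data arrives as byte chunks (the chunk boundaries here are arbitrary and chosen only for illustration):

```python
from lxml import etree

parser = etree.HTMLParser()

# Illustrative chunks, as they might arrive from a socket or file.
chunks = [b"<html><body><p>chun", b"ked</p></body></html>"]
for chunk in chunks:
    parser.feed(chunk)

# close() finishes parsing and returns the root element.
root = parser.close()
paragraph_text = root.findtext(".//p")
print(paragraph_text)
```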

Inherited from unreachable._BaseParser: copy, makeelement, setElementClassLookup, set_element_class_lookup

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Properties

Inherited from _FeedParser: feed_error_log

Inherited from unreachable._BaseParser: error_log, resolvers, target, version

Inherited from object: __class__

Method Details

__init__(self, encoding=None, remove_blank_text=False, remove_comments=False, remove_pis=False, strip_cdata=True, no_network=True, target=None, schema: XMLSchema=None, recover=True, compact=True, collect_ids=True)
(Constructor)

 
x.__init__(...) initializes x; see help(type(x)) for signature
Overrides: object.__init__

__new__(T, S, ...)

 
Returns: a new object with type S, a subtype of T
Overrides: object.__new__