lxml.etree.HTMLParser

Type HTMLParser

 object --+    
          |    
_BaseParser --+
              |
             HTMLParser

The HTML parser. This parser allows reading HTML into a normal XML tree. By default, it can read broken (not well-formed) HTML, depending on the capabilities of libxml2. Set the 'recover' option to False to switch this off.

Available boolean keyword arguments:

* recover - try hard to parse through broken HTML (default: True)
* no_network - prevent network access (default: True)
* remove_blank_text - discard empty text nodes
* remove_comments - discard comments
* remove_pis - discard processing instructions
* compact - save memory for short text content (default: True)
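As a minimal sketch of how these keyword arguments might be used, the snippet below parses a deliberately broken fragment with recovery enabled and comments discarded. The sample markup and variable names are illustrative assumptions, and the exact tree that libxml2 builds from broken input may vary between versions.

```python
from lxml import etree

# Deliberately broken HTML (illustrative input): <p> and <b> are never closed.
broken_html = "<html><body><p>Hello <b>world<!-- note --></body></html>"

# recover=True is the documented default and is spelled out here for clarity;
# remove_comments=True asks the parser to discard comment nodes.
parser = etree.HTMLParser(recover=True, remove_comments=True)

# fromstring() returns the root element of the parsed tree.
root = etree.fromstring(broken_html, parser)

# Serialize the repaired tree back to HTML text for inspection.
print(etree.tostring(root, method="html", encoding="unicode"))
```

The parser closes the dangling elements itself, so the result can be navigated with the usual element API (find, xpath, iteration) like any other lxml tree.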

Note that you should avoid sharing parsers between threads for performance reasons.
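One way to follow this advice is to give each thread its own parser instance. The sketch below uses threading.local for that; the helper names get_parser and parse_title are hypothetical and not part of lxml.

```python
import threading

from lxml import etree

# Thread-local storage: each thread sees its own 'parser' attribute.
_local = threading.local()


def get_parser():
    """Return an HTMLParser private to the calling thread (hypothetical helper)."""
    parser = getattr(_local, "parser", None)
    if parser is None:
        parser = etree.HTMLParser()
        _local.parser = parser
    return parser


def parse_title(html, results, index):
    """Parse one document and store its <title> text at results[index]."""
    root = etree.fromstring(html, get_parser())
    results[index] = root.findtext(".//title")


docs = ["<html><head><title>Doc %d</title></head></html>" % i for i in range(4)]
results = [None] * len(docs)

threads = [
    threading.Thread(target=parse_title, args=(doc, results, i))
    for i, doc in enumerate(docs)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

Creating the parser lazily per thread keeps the setup cost to one parser per worker instead of one per document.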

Method Summary
	`__init__(...)` x.__init__(...) initializes x; see x.__class__.__doc__ for signature
	`__new__(T, S, ...)` T.__new__(S, ...) -> a new object with type S, a subtype of T
Inherited from object
	`__delattr__(...)` x.__delattr__('name') <==> del x.name
	`__getattribute__(...)` x.__getattribute__('name') <==> x.name
	`__hash__(x)` x.__hash__() <==> hash(x)
	`__reduce__(...)` helper for pickle
	`__reduce_ex__(...)` helper for pickle
	`__repr__(x)` x.__repr__() <==> repr(x)
	`__setattr__(...)` x.__setattr__('name', value) <==> x.name = value
	`__str__(x)` x.__str__() <==> str(x)

Class Variable Summary
`PyCObject`	`__pyx_vtable__` = `<PyCObject object at 0x401cb9c8>`

Method Details

__init__(...)
(Constructor)

x.__init__(...) initializes x; see x.__class__.__doc__ for signature

Overrides: lxml.etree._BaseParser.__init__

__new__(T, S, ...)

T.__new__(S, ...) -> a new object with type S, a subtype of T

Returns:

a new object with type S, a subtype of T

Overrides: lxml.etree._BaseParser.__new__

Class Variable Details

__pyx_vtable__

Type: PyCObject
Value: <PyCObject object at 0x401cb9c8>

Generated by Epydoc 2.1 on Sat Aug 18 12:44:27 2007

http://epydoc.sf.net
