Module etree

The lxml.etree module implements the extended ElementTree API for XML.

Version: 2.1.5

Classes

AncestorsIterator
AncestorsIterator(self, node, tag=None) Iterates over the ancestors of an element (from parent to parent).

AttributeBasedElementClassLookup
AttributeBasedElementClassLookup(self, attribute_name, class_mapping, fallback=None) Checks an attribute of an Element and looks up the value in a class dictionary.

C14NError
Error during C14N serialisation.

CDATA
CDATA(data)

CommentBase
All custom Comment classes must inherit from this one.

CustomElementClassLookup
CustomElementClassLookup(self, fallback=None) Element class lookup based on a subclass method.

DTD
DTD(self, file=None, external_id=None) A DTD validator.

DTDError
Base class for DTD errors.

DTDParseError
Error while parsing a DTD.

DTDValidateError
Error while validating an XML document with a DTD.

DocInfo
Document information provided by parser and DTD.

DocumentInvalid
Validation error.

ETCompatXMLParser
ETCompatXMLParser(self, attribute_defaults=False, dtd_validation=False, load_dtd=False, no_network=True, ns_clean=False, recover=False, remove_blank_text=False, compact=True, resolve_entities=True, remove_comments=True, remove_pis=True, target=None, encoding=None, schema=None) An XML parser with an ElementTree compatible default setup.

ETXPath
ETXPath(self, path, extensions=None, regexp=True) Special XPath class that supports the ElementTree {uri} notation for namespaces.

ElementBase
All custom Element classes must inherit from this one.

ElementChildIterator
ElementChildIterator(self, node, tag=None, reversed=False) Iterates over the children of an element.

ElementClassLookup
ElementClassLookup(self) Superclass of Element class lookups.

ElementDefaultClassLookup
ElementDefaultClassLookup(self, element=None, comment=None, pi=None, entity=None) Element class lookup scheme that always returns the default Element class.

ElementDepthFirstIterator
ElementDepthFirstIterator(self, node, tag=None, inclusive=True) Iterates over an element and its sub-elements in document order (depth first pre-order).

ElementNamespaceClassLookup
ElementNamespaceClassLookup(self, fallback=None)

ElementTextIterator
ElementTextIterator(self, element, tag=None, with_tail=True) Iterates over the text content of a subtree.

EntityBase
All custom Entity classes must inherit from this one.

Error

ErrorDomains
Libxml2 error domains

ErrorLevels
Libxml2 error levels

ErrorTypes
Libxml2 error types

FallbackElementClassLookup
FallbackElementClassLookup(self, fallback=None)

HTMLParser
HTMLParser(self, recover=True, no_network=True, remove_blank_text=False, compact=True, remove_comments=False, remove_pis=False, target=None, encoding=None, schema=None) The HTML parser.

LxmlError
Main exception base class for lxml.

LxmlRegistryError
Base class of lxml registry errors.

LxmlSyntaxError
Base class for all syntax errors.

NamespaceRegistryError
Error registering a namespace extension.

PIBase
All custom Processing Instruction classes must inherit from this one.

ParseError
Syntax error while parsing an XML document.

ParserBasedElementClassLookup
ParserBasedElementClassLookup(self, fallback=None) Element class lookup based on the XML parser.

ParserError
Internal lxml parser error.

PyErrorLog
PyErrorLog(self, logger_name=None) A global error log that connects to the Python stdlib logging package.

PythonElementClassLookup
PythonElementClassLookup(self, fallback=None) Element class lookup based on a subclass method.

QName
QName(text_or_uri, tag=None)

RelaxNG
RelaxNG(self, etree=None, file=None) Turn a document into a Relax NG validator.

RelaxNGError
Base class for RelaxNG errors.

RelaxNGErrorTypes
Libxml2 RelaxNG error types

RelaxNGParseError
Error while parsing an XML document as RelaxNG.

RelaxNGValidateError
Error while validating an XML document with a RelaxNG schema.

Resolver
This is the base class of all resolvers.

Schematron
Schematron(self, etree=None, file=None) A Schematron validator.

SchematronError
Base class of all Schematron errors.

SchematronParseError
Error while parsing an XML document as Schematron schema.

SchematronValidateError
Error while validating an XML document with a Schematron schema.

SiblingsIterator
SiblingsIterator(self, node, tag=None, preceding=False) Iterates over the siblings of an element.

TreeBuilder
TreeBuilder(self, element_factory=None, parser=None) Parser target that builds a tree.

XInclude
XInclude(self) XInclude processor.

XIncludeError
Error during XInclude processing.

XMLParser
XMLParser(self, attribute_defaults=False, dtd_validation=False, load_dtd=False, no_network=True, ns_clean=False, recover=False, remove_blank_text=False, compact=True, resolve_entities=True, remove_comments=False, remove_pis=False, target=None, encoding=None, schema=None) The XML parser.

XMLSchema
XMLSchema(self, etree=None, file=None) Turn a document into an XML Schema validator.

XMLSchemaError
Base class of all XML Schema errors

XMLSchemaParseError
Error while parsing an XML document as XML Schema.

XMLSchemaValidateError
Error while validating an XML document with an XML Schema.

XMLSyntaxError
Syntax error while parsing an XML document.

XPath
XPath(self, path, namespaces=None, extensions=None, regexp=True, smart_strings=True) A compiled XPath expression that can be called on Elements and ElementTrees.

XPathDocumentEvaluator
XPathDocumentEvaluator(self, etree, namespaces=None, extensions=None, regexp=True, smart_strings=True) Create an XPath evaluator for an ElementTree.

XPathElementEvaluator
XPathElementEvaluator(self, element, namespaces=None, extensions=None, regexp=True, smart_strings=True) Create an XPath evaluator for an element.

XPathError
Base class of all XPath errors.

XPathEvalError
Error during XPath evaluation.

XPathFunctionError
Internal error looking up an XPath extension function.

XPathResultError
Error handling an XPath result.

XPathSyntaxError

XSLT
XSLT(self, xslt_input, extensions=None, regexp=True, access_control=None)

XSLTAccessControl
XSLTAccessControl(self, read_file=True, write_file=True, create_dir=True, read_network=True, write_network=True)

XSLTApplyError
Error running an XSL transformation.

XSLTError
Base class of all XSLT errors.

XSLTExtension
Base class of an XSLT extension element.

XSLTExtensionError
Error registering an XSLT extension.

XSLTParseError
Error parsing a stylesheet document.

XSLTSaveError
Error serialising an XSLT result.

_AppendOnlyElementProxy
A read-only element that allows adding children and changing the text content.

_Attrib
A dict-like proxy for the Element.attrib property.

_AttribIterator
Attribute iterator - for internal use only!

_BaseContext

_BaseErrorLog

_BaseParser

_ClassNamespaceRegistry
Dictionary-like registry for namespace implementation classes

_Comment

_Document
Internal base class to reference a libxml document.

_DomainErrorLog

_Element
Element class.

_ElementIterator

_ElementStringResult

_ElementTagMatcher

_ElementTree

_ElementUnicodeResult

_Entity

_ErrorLog

_ExceptionContext

_ExsltRegExp

_FeedParser

_FileReaderContext

_FilelikeWriter

_FunctionNamespaceRegistry

_IDDict
IDDict(self, etree) A dictionary-like proxy class that maps ID attributes to elements.

_InputDocument

_IterparseContext

_ListErrorLog
Immutable base version of a list based error log.

_LogEntry

_NamespaceRegistry
Dictionary-like namespace registry

_ParserContext

_ParserDictionaryContext

_ParserSchemaValidationContext

_ProcessingInstruction

_PythonSaxParserTarget

_ReadOnlyElementProxy
The main read-only Element proxy class (for internal use only!).

_ResolverContext

_ResolverRegistry

_RotatingErrorLog

_SaxParserContext
This class maps SAX2 events to method calls.

_SaxParserTarget

_TargetParserContext
This class maps SAX2 events to the ET parser target interface.

_TargetParserResult

_TempStore

_Validator
Base class for XML validators.

_XPathContext

_XPathEvaluatorBase

_XPathFunctionNamespaceRegistry

_XSLTContext

_XSLTProcessingInstruction

_XSLTResolverContext

_XSLTResultTree

__ContentOnlyElement

iterparse
iterparse(self, source, events=("end",), tag=None, attribute_defaults=False, dtd_validation=False, load_dtd=False, no_network=True, remove_blank_text=False, remove_comments=False, remove_pis=False, encoding=None, html=False, schema=None)

iterwalk
iterwalk(self, element_or_tree, events=("end",), tag=None)

Functions

Comment(text=None)
Comment element factory.

Element(_tag, attrib=None, nsmap=None, **_extra)
Element factory.

ElementTree(element=None, file=None, parser=None)
ElementTree wrapper class.

Entity(name)
Entity factory.

Extension(module, function_mapping=None, ns=None)
Build a dictionary of extension functions from the functions defined in a module or the methods of an object.

FunctionNamespace(ns_uri)
Retrieve the function namespace object associated with the given URI.

HTML(text, parser=None, base_url=None)
Parses an HTML document from a string constant.

PI(target, text=None)
ProcessingInstruction element factory.

ProcessingInstruction(target, text=None)
ProcessingInstruction element factory.

SubElement(_parent, _tag, attrib=None, nsmap=None, **_extra)
Subelement factory.

XML(text, parser=None, base_url=None)
Parses an XML document from a string constant.

XMLDTDID(text)
Parse the text and return a tuple (root node, ID dictionary).

XMLID(text)
Parse the text and return a tuple (root node, ID dictionary).

XPathEvaluator(etree_or_element, namespaces=None, extensions=None, regexp=True, smart_strings=True)
Creates an XPath evaluator for an ElementTree or an Element.

cleanup_namespaces(tree_or_element)
Remove all namespace declarations from a subtree that are not used by any of the elements in that tree.

clear_error_log()
Clear the global error log.

dump(elem, pretty_print=True, with_tail=True)
Writes an element tree or element structure to sys.stdout.

fromstring(text, parser=None, base_url=None)
Parses an XML document from a string.

fromstringlist(strings, parser=None)
Parses an XML document from a sequence of strings.

get_default_parser()

iselement(element)
Checks if an object appears to be a valid element object.

parse(source, parser=None, base_url=None)
Return an ElementTree object loaded with source elements.

parseid(source, parser=None)
Parses the source into a tuple containing an ElementTree object and an ID dictionary.

set_default_parser(parser=None)
Set a default parser for the current thread.

set_element_class_lookup(lookup=None)
Set the global default element class lookup method.

tostring(element_or_tree, encoding=None, method="xml", xml_declaration=None, pretty_print=False, with_tail=True)
Serialize an element to an encoded string representation of its XML tree.

tostringlist(element_or_tree, *args, **kwargs)
Serialize an element to an encoded string representation of its XML tree, stored in a list of partial strings.

tounicode(element_or_tree, method="xml", pretty_print=False, with_tail=True)
Serialize an element to the Python unicode representation of its XML tree.

use_global_python_log(log)
Replace the global error log by an etree.PyErrorLog that uses the standard Python logging package.

Variables

DEBUG = 1

LIBXML_COMPILED_VERSION = (2, 6, 32)

LIBXML_VERSION = (2, 6, 32)

LIBXSLT_COMPILED_VERSION = (1, 1, 24)

LIBXSLT_VERSION = (1, 1, 24)

LXML_VERSION = (2, 1, 5, 0)

__pyx_capi__ = {'appendChild': <PyCObject object at 0x908e128>...

Function Details

Comment(text=None)

Comment element factory. This factory function creates a special element that will be serialized as an XML comment.
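As a minimal sketch of the factory described above (assuming a recent lxml on Python 3; the element name and comment text are arbitrary), a comment can be appended to a tree like any other element:

```python
from lxml import etree

root = etree.Element("root")
# Comment() returns a special element that can be appended like any other node.
root.append(etree.Comment("generated file"))

serialized = etree.tostring(root)
print(serialized)
```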

Element(_tag, attrib=None, nsmap=None, **_extra)

Element factory. This function returns an object implementing the Element interface.

Entity(name)

Entity factory. This factory function creates a special element that will be serialized as an XML entity reference or character reference. Note, however, that entities will not be automatically declared in the document. A document that uses entity references requires a DTD to define the entities.
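The following sketch (assuming a recent lxml on Python 3) uses a numeric character reference, which needs no declaration. A named entity such as "nbsp" could be created the same way, but, as noted above, the document would then need a DTD that defines it.

```python
from lxml import etree

root = etree.Element("root")
# "#234" produces a character reference, so no DTD is required.
root.append(etree.Entity("#234"))

serialized = etree.tostring(root)
print(serialized)
```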

Extension(module, function_mapping=None, ns=None)

Build a dictionary of extension functions from the functions defined in a module or the methods of an object.

As the second argument, you can pass an additional mapping of attribute names to XPath function names, or a list of function names that should be taken.

The ns keyword argument accepts a namespace URI for the XPath functions.

FunctionNamespace(ns_uri)

Retrieve the function namespace object associated with the given URI.

Creates a new one if it does not yet exist. A function namespace can only be used to register extension functions.
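One way to register an extension function could look like the sketch below. The namespace URI, the prefix "f" and the function name greet are hypothetical and chosen only for illustration; the code assumes a recent lxml on Python 3.

```python
from lxml import etree

NS_URI = "http://example.com/demo-functions"  # hypothetical namespace URI

def greet(context, name):
    # lxml passes the evaluation context as the first argument.
    return "Hello %s" % name

# Retrieve (or create) the function namespace and register the function in it.
functions = etree.FunctionNamespace(NS_URI)
functions["greet"] = greet

root = etree.XML("<root/>")
# The prefix "f" is bound to the namespace URI for this one evaluation.
result = root.xpath("f:greet('world')", namespaces={"f": NS_URI})
print(result)
```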

HTML(text, parser=None, base_url=None)

Parses an HTML document from a string constant. This function can be used to embed "HTML literals" in Python code.

To override the parser with a different HTMLParser you can pass it to the parser keyword argument.

The base_url keyword argument sets the original base URL of the document, which supports relative paths when looking up external entities (DTD, XInclude, ...).
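A small sketch of an "HTML literal", assuming a recent lxml on Python 3. The HTML parser is lenient, so the fragment is wrapped in html and body elements and the unclosed br is accepted.

```python
from lxml import etree

# A fragment is enough; the parser supplies the surrounding document structure.
root = etree.HTML("<p>Hello<br>world</p>")
paragraph = root.find("body/p")

print(root.tag)
print(paragraph.text, paragraph[0].tag, paragraph[0].tail)
```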

PI(target, text=None)

ProcessingInstruction element factory. This factory function creates a special element that will be serialized as an XML processing instruction.

ProcessingInstruction(target, text=None)

ProcessingInstruction element factory. This factory function creates a special element that will be serialized as an XML processing instruction.

SubElement(_parent, _tag, attrib=None, nsmap=None, **_extra)

Subelement factory. This function creates an element instance, and appends it to an existing element.
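The Element and SubElement factories are typically used together to build a tree. The tag names, attributes and text in this sketch are arbitrary, and a recent lxml on Python 3 is assumed.

```python
from lxml import etree

# Attributes can be passed as a dict or as keyword arguments.
root = etree.Element("root", attrib={"version": "1"})
child = etree.SubElement(root, "child", name="first")
child.text = "hello"

serialized = etree.tostring(root)
print(serialized)
```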

XML(text, parser=None, base_url=None)

Parses an XML document from a string constant. This function can be used to embed "XML literals" in Python code, as in:

>>> root = etree.XML("<root><test/></root>")

To override the parser with a different XMLParser you can pass it to the parser keyword argument.

The base_url keyword argument sets the original base URL of the document, which supports relative paths when looking up external entities (DTD, XInclude, ...).
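Overriding the parser might look like the following sketch, which assumes a recent lxml on Python 3 and uses the remove_blank_text option from the XMLParser signature to drop whitespace-only text between elements.

```python
from lxml import etree

# A parser configured to discard ignorable whitespace between elements.
parser = etree.XMLParser(remove_blank_text=True)
root = etree.XML("<root>  <test/>  </root>", parser)

serialized = etree.tostring(root)
print(serialized)
```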

XMLDTDID(text)

Parse the text and return a tuple (root node, ID dictionary). The root node is the same as returned by the XML() function. The dictionary contains string-element pairs. The dictionary keys are the values of ID attributes as defined by the DTD. The elements referenced by the ID are stored as dictionary values.

Note that you must not modify the XML tree if you use the ID dictionary. The results are undefined.

XMLID(text)

Parse the text and return a tuple (root node, ID dictionary). The root node is the same as returned by the XML() function. The dictionary contains string-element pairs. The dictionary keys are the values of 'id' attributes. The elements referenced by the ID are stored as dictionary values.
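A short sketch of the returned tuple, assuming a recent lxml on Python 3; the element names and id values are made up for illustration.

```python
from lxml import etree

root, ids = etree.XMLID('<root><a id="first"/><b id="second"/></root>')

# The dictionary maps each 'id' attribute value to the element carrying it.
print(root.tag, sorted(ids))
print(ids["second"].tag)
```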

XPathEvaluator(etree_or_element, namespaces=None, extensions=None, regexp=True, smart_strings=True)

Creates an XPath evaluator for an ElementTree or an Element.

The resulting object can be called with an XPath expression as argument and XPath variables provided as keyword arguments.

Additional namespace declarations can be passed with the 'namespaces' keyword argument. EXSLT regular expression support can be disabled with the 'regexp' boolean keyword (defaults to True). Smart strings will be returned for string results unless you pass smart_strings=False.
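As a sketch (assuming a recent lxml on Python 3), the evaluator is created once and then called with an expression; the XPath variable $wanted is supplied as a keyword argument. The document content and the variable name are illustrative only.

```python
from lxml import etree

root = etree.XML("<root><item n='1'/><item n='2'/></root>")
evaluate = etree.XPathEvaluator(root)

# XPath variables are passed as keyword arguments of the call.
items = evaluate("//item[@n = $wanted]", wanted="2")
print(len(items), items[0].get("n"))
```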

clear_error_log()

Clear the global error log. Note that this log is already bounded to a fixed size.

dump(elem, pretty_print=True, with_tail=True)

Writes an element tree or element structure to sys.stdout. This function should be used for debugging only.

fromstring(text, parser=None, base_url=None)

Parses an XML document from a string.

To override the default parser with a different parser you can pass it to the parser keyword argument.

The base_url keyword argument sets the original base URL of the document, which supports relative paths when looking up external entities (DTD, XInclude, ...).

fromstringlist(strings, parser=None)

Parses an XML document from a sequence of strings.

To override the default parser with a different parser you can pass it to the parser keyword argument.
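A minimal sketch, assuming a recent lxml on Python 3: the document is split into arbitrary chunks that are parsed as if they were concatenated.

```python
from lxml import etree

# The chunks do not need to be well-formed on their own.
chunks = ["<root>", "<a/>", "<b/></root>"]
root = etree.fromstringlist(chunks)

print(root.tag, [child.tag for child in root])
```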

parse(source, parser=None, base_url=None)

Return an ElementTree object loaded with source elements. If no parser is provided as the second argument, the default parser is used.

The base_url keyword allows setting a URL for the document when parsing from a file-like object. This is needed when looking up external entities (DTD, XInclude, ...) with relative paths.
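Parsing from a file-like object could be sketched as follows, assuming a recent lxml on Python 3. The URL is a made-up placeholder that only gives the in-memory document a location for resolving relative references.

```python
from io import BytesIO

from lxml import etree

source = BytesIO(b"<root><a/></root>")
# base_url is hypothetical; nothing is fetched from it for this document.
tree = etree.parse(source, base_url="http://example.com/doc.xml")

root = tree.getroot()
print(root.tag, len(root))
```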

parseid(source, parser=None)

Parses the source into a tuple containing an ElementTree object and an ID dictionary. If no parser is provided as the second argument, the default parser is used.

Note that you must not modify the XML tree if you use the ID dictionary. The results are undefined.

set_default_parser(parser=None)

Set a default parser for the current thread. This parser is used globally whenever no parser is supplied to the various parse functions of the lxml API. If this function is called without a parser (or if it is None), the default parser is reset to the original configuration.

Note that the pre-installed default parser is not thread-safe. Avoid the default parser in multi-threaded environments. You can create a separate parser for each thread explicitly or use a parser pool.
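The sketch below (assuming a recent lxml on Python 3) installs a custom default parser for the current thread and restores the original configuration afterwards by calling the function without an argument.

```python
from lxml import etree

etree.set_default_parser(etree.XMLParser(remove_blank_text=True))
try:
    # No parser is passed here, so the thread's default parser is used.
    root = etree.fromstring("<root> <a/> </root>")
finally:
    # Calling without a parser resets the default to its original setup.
    etree.set_default_parser()

serialized = etree.tostring(root)
print(serialized)
```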

tostring(element_or_tree, encoding=None, method="xml", xml_declaration=None, pretty_print=False, with_tail=True)

Serialize an element to an encoded string representation of its XML tree.

Defaults to ASCII encoding without XML declaration. This behaviour can be configured with the keyword arguments 'encoding' (string) and 'xml_declaration' (bool). Note that changing the encoding to a non UTF-8 compatible encoding will enable a declaration by default.

You can also serialise to a Unicode string without declaration by passing the unicode function as encoding.

The keyword argument 'pretty_print' (bool) enables formatted XML.

The keyword argument 'method' selects the output method: 'xml', 'html' or plain 'text'.

You can prevent the tail text of the element from being serialised by passing the boolean with_tail option. This has no impact on the tail text of children, which will always be serialised.

tostringlist(element_or_tree, *args, **kwargs)

Serialize an element to an encoded string representation of its XML tree, stored in a list of partial strings.

This is purely for ElementTree 1.3 compatibility. The result is a single string wrapped in a list.
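A brief sketch of the compatibility behaviour, assuming a recent lxml on Python 3: joining the list yields the same bytes as tostring().

```python
from lxml import etree

root = etree.XML("<root><a/></root>")
parts = etree.tostringlist(root)

# The list holds the complete serialisation as a single item.
print(len(parts), b"".join(parts))
```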

tounicode(element_or_tree, method="xml", pretty_print=False, with_tail=True)

Serialize an element to the Python unicode representation of its XML tree.

Note that the result does not carry an XML encoding declaration and is therefore not necessarily suited for serialization to byte streams without further treatment.

The boolean keyword argument 'pretty_print' enables formatted XML.

The keyword argument 'method' selects the output method: 'xml', 'html' or plain 'text'.

You can prevent the tail text of the element from being serialised by passing the boolean with_tail option. This has no impact on the tail text of children, which will always be serialised.

Deprecated: use tostring(el, encoding=unicode) instead.
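The replacement call above names the Python 2 unicode builtin. As an assumption for Python 3 with a recent lxml, the string "unicode" can be passed as the encoding instead, which returns a str without an XML declaration:

```python
from lxml import etree

root = etree.XML("<root><a/></root>")

# Assumption: Python 3, where encoding="unicode" takes the place of the
# Python 2 unicode builtin mentioned in the deprecation note.
text = etree.tostring(root, encoding="unicode")
print(type(text).__name__, text)
```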

use_global_python_log(log)

Replace the global error log by an etree.PyErrorLog that uses the standard Python logging package.

Note that this disables access to the global error log from exceptions. Parsers, XSLT etc. will continue to provide their normal local error log.

Variables Details

__pyx_capi__

Value:

{'appendChild': <PyCObject object at 0x908e128>,
 'attributeValue': <PyCObject object at 0x9085fb0>,
 'attributeValueFromNsName': <PyCObject object at 0x9085fc8>,
 'callLookupFallback': <PyCObject object at 0x9085ec0>,
 'collectAttributes': <PyCObject object at 0x908e038>,
 'deepcopyNodeToDocument': <PyCObject object at 0x9085de8>,
 'delAttribute': <PyCObject object at 0x908e068>,
 'delAttributeFromNsName': <PyCObject object at 0x908e080>,
...