lxml.etree module

The lxml.etree module implements the extended ElementTree API for XML.

exception lxml.etree.C14NError

Bases: LxmlError

Error during C14N serialisation.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.DTDError

Bases: LxmlError

Base class for DTD errors.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.DTDParseError

Bases: DTDError

Error while parsing a DTD.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.DTDValidateError

Bases: DTDError

Error while validating an XML document with a DTD.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.DocumentInvalid

Bases: LxmlError

Validation error.

Raised by all document validators when their assertValid(tree) method fails.
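As a minimal sketch of the behaviour described above, the following validates a deliberately invalid document against a small inline DTD and catches DocumentInvalid. The DTD text and the element names are made up for illustration.

```python
from io import StringIO
from lxml import etree

# Hypothetical DTD: <root> must contain exactly one <child>.
dtd = etree.DTD(StringIO("<!ELEMENT root (child)>\n<!ELEMENT child EMPTY>"))
invalid_doc = etree.fromstring("<root/>")

caught = None
try:
    dtd.assertValid(invalid_doc)
except etree.DocumentInvalid as exc:
    # The same except clause should work for any lxml validator.
    caught = exc
```

Because DocumentInvalid derives from LxmlError, a broader `except etree.LxmlError` clause would also catch it.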

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.Error

Bases: Exception

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.LxmlError

Bases: Error

Main exception base class for lxml. All other exceptions inherit from this one.
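Since the lxml exceptions derive from LxmlError, a single except clause can serve as a catch-all. The sketch below assumes a small helper, describe_failure, which is not part of lxml and exists only for this example.

```python
from lxml import etree

def describe_failure(xml_text):
    """Return the name of the lxml exception raised while parsing, or None."""
    try:
        etree.fromstring(xml_text)
    except etree.LxmlError as exc:
        return type(exc).__name__
    return None

failure = describe_failure("<a><b></a>")   # mismatched tags
success = describe_failure("<a><b/></a>")  # well-formed input
```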

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.LxmlRegistryError

Bases: LxmlError

Base class of lxml registry errors.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.LxmlSyntaxError

Bases: LxmlError, SyntaxError

Base class for all syntax errors.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
end_lineno

exception end lineno

end_offset

exception end offset

filename

exception filename

lineno

exception lineno

msg

exception msg

offset

exception offset

print_file_and_line

exception print_file_and_line

text

exception text

exception lxml.etree.NamespaceRegistryError

Bases: LxmlRegistryError

Error registering a namespace extension.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.ParseError(message, code, line, column, filename=None)

Bases: LxmlSyntaxError

Syntax error while parsing an XML document.

For compatibility with ElementTree 1.3 and later.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
end_lineno

exception end lineno

end_offset

exception end offset

filename

exception filename

lineno

exception lineno

msg

exception msg

offset

exception offset

property position
print_file_and_line

exception print_file_and_line

text

exception text

exception lxml.etree.ParserError

Bases: LxmlError

Internal lxml parser error.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.RelaxNGError

Bases: LxmlError

Base class for RelaxNG errors.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.RelaxNGParseError

Bases: RelaxNGError

Error while parsing an XML document as RelaxNG.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.RelaxNGValidateError

Bases: RelaxNGError

Error while validating an XML document with a RelaxNG schema.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.SchematronError

Bases: LxmlError

Base class of all Schematron errors.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.SchematronParseError

Bases: SchematronError

Error while parsing an XML document as Schematron schema.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.SchematronValidateError

Bases: SchematronError

Error while validating an XML document with a Schematron schema.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.SerialisationError

Bases: LxmlError

A libxml2 error that occurred during serialisation.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.XIncludeError

Bases: LxmlError

Error during XInclude processing.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.XMLSchemaError

Bases: LxmlError

Base class of all XML Schema errors.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.XMLSchemaParseError

Bases: XMLSchemaError

Error while parsing an XML document as XML Schema.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.XMLSchemaValidateError

Bases: XMLSchemaError

Error while validating an XML document with an XML Schema.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.XMLSyntaxAssertionError(message)

Bases: XMLSyntaxError, AssertionError

An XMLSyntaxError that additionally inherits from AssertionError for ElementTree / backwards compatibility reasons.

This class may get replaced by a plain XMLSyntaxError in a future version.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
end_lineno

exception end lineno

end_offset

exception end offset

filename

exception filename

lineno

exception lineno

msg

exception msg

offset

exception offset

property position
print_file_and_line

exception print_file_and_line

text

exception text

exception lxml.etree.XMLSyntaxError(message, code, line, column, filename=None)

Bases: ParseError

Syntax error while parsing an XML document.
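The attributes listed below can be used to report where parsing failed. This is a small sketch with made-up input; the exact message text comes from libxml2 and may differ between versions, so it is not reproduced here.

```python
from lxml import etree

broken = "<root>\n  <child>\n</root>"  # <child> is never closed

error = None
try:
    etree.fromstring(broken)
except etree.XMLSyntaxError as exc:
    error = exc

# position is a (line, column) pair; lineno and msg come from SyntaxError.
line, column = error.position
summary = f"line {error.lineno}: {error.msg}"
```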

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
end_lineno

exception end lineno

end_offset

exception end offset

filename

exception filename

lineno

exception lineno

msg

exception msg

offset

exception offset

property position
print_file_and_line

exception print_file_and_line

text

exception text

exception lxml.etree.XPathError

Bases: LxmlError

Base class of all XPath errors.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.XPathEvalError

Bases: XPathError

Error during XPath evaluation.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.XPathFunctionError

Bases: XPathEvalError

Internal error looking up an XPath extension function.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.XPathResultError

Bases: XPathEvalError

Error handling an XPath result.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.XPathSyntaxError

Bases: LxmlSyntaxError, XPathError

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
end_lineno

exception end lineno

end_offset

exception end offset

filename

exception filename

lineno

exception lineno

msg

exception msg

offset

exception offset

print_file_and_line

exception print_file_and_line

text

exception text

exception lxml.etree.XSLTApplyError

Bases: XSLTError

Error running an XSL transformation.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.XSLTError

Bases: LxmlError

Base class of all XSLT errors.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.XSLTExtensionError

Bases: XSLTError

Error registering an XSLT extension.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.XSLTParseError

Bases: XSLTError

Error parsing a stylesheet document.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree.XSLTSaveError

Bases: XSLTError, SerialisationError

Error serialising an XSLT result.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception lxml.etree._TargetParserResult(result)

Bases: Exception

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
class lxml.etree.AncestorsIterator(self, node, tag=None)

Bases: _ElementMatchIterator

Iterates over the ancestors of an element (from parent to parent).
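A short sketch of the iteration order, using a made-up three-level document. The iterator starts at the parent of the given node and walks towards the root.

```python
from lxml import etree

root = etree.fromstring("<a><b><c/></b></a>")
leaf = root[0][0]  # the <c> element

# Walks from the parent of <c> up to the root element.
ancestor_tags = [el.tag for el in etree.AncestorsIterator(leaf)]

# The element method iterancestors() visits the same elements.
same_tags = [el.tag for el in leaf.iterancestors()]
```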

class lxml.etree.AttributeBasedElementClassLookup(self, attribute_name, class_mapping, fallback=None)

Bases: FallbackElementClassLookup

Checks an attribute of an Element and looks up the value in a class dictionary.

Arguments:
  • attribute name - ‘{ns}name’ style string

  • class mapping - Python dict mapping attribute values to Element classes

  • fallback - optional fallback lookup mechanism

A None key in the class mapping will be checked if the attribute is missing.
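A minimal sketch of an attribute-based lookup follows. The class name HoneyElement, the attribute name kind and its values are assumptions chosen for this example.

```python
from lxml import etree

class HoneyElement(etree.ElementBase):
    """Hypothetical custom class for elements marked kind="honey"."""

# Map values of the "kind" attribute to Element classes.
lookup = etree.AttributeBasedElementClassLookup("kind", {"honey": HoneyElement})

parser = etree.XMLParser()
parser.set_element_class_lookup(lookup)

root = etree.fromstring(
    '<root><item kind="honey"/><item kind="other"/></root>', parser
)
honey_item, other_item = root[0], root[1]
```

Elements whose attribute value is not in the mapping are handed to the fallback, which here is the default lookup.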

set_fallback(self, lookup)

Sets the fallback scheme for this lookup method.

fallback
class lxml.etree.C14NWriterTarget

Bases: object

Canonicalization writer target for the XMLParser.

Serialises parse events to XML C14N 2.0.

Configuration options:

  • with_comments: set to true to include comments

  • strip_text: set to true to strip whitespace before and after text content

  • rewrite_prefixes: set to true to replace namespace prefixes by “n{number}”

  • qname_aware_tags: a set of qname aware tag names in which prefixes should be replaced in text content

  • qname_aware_attrs: a set of qname aware attribute names in which prefixes should be replaced in text content

  • exclude_attrs: a set of attribute names that should not be serialised

  • exclude_tags: a set of tag names that should not be serialised
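The sketch below shows one way the target might be wired into a parser. The constructor signature is not shown in this section; passing a write callable as the first argument is an assumption based on the equivalent class in the standard library's xml.etree.ElementTree, and the input document is made up.

```python
from lxml import etree

chunks = []

# Assumption: the first argument is a callable that receives text chunks.
target = etree.C14NWriterTarget(chunks.append, strip_text=True)
parser = etree.XMLParser(target=target)

# Parse events are forwarded to the target instead of building a tree.
etree.fromstring("<root>  <a b='1'/>  </root>", parser)
canonical = "".join(chunks)
```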

_iter_namespaces(ns_stack)
close()
comment(text)
data(data)
end(tag)
pi(target, data)
start(tag, attrs)
start_ns(prefix, uri)
class lxml.etree.CDATA(data)

Bases: object

CDATA factory. This factory creates an opaque data object that can be used to set Element text. The usual way to use it is:

>>> from lxml.etree import CDATA, Element, tostring
>>> el = Element('content')
>>> el.text = CDATA('a string')

>>> print(el.text)
a string
>>> print(tostring(el, encoding="unicode"))
<content><![CDATA[a string]]></content>
class lxml.etree.CommentBase

Bases: _Comment

All custom Comment classes must inherit from this one.

To create an XML Comment instance, use the Comment() factory.

Subclasses must not override __init__ or __new__ as it is absolutely undefined when these objects will be created or destroyed. All persistent state of Comments must be stored in the underlying XML. If you really need to initialize the object after creation, you can implement an _init(self) method that will be called after object creation.

_init(self)

Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.

addnext(self, element)

Adds the element as a following sibling directly after this element.

This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.

addprevious(self, element)

Adds the element as a preceding sibling directly before this element.

This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.

append(self, value)
clear(self, keep_tail=False)

Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.

Pass keep_tail=True to leave the tail text untouched.

cssselect(expr, *, translator='xml')

Run the CSS expression on this element and its children, returning a list of the results.

Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.

extend(self, elements)

Extends the current children by the elements in the iterable.

find(self, path, namespaces=None)

Finds the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findall(self, path, namespaces=None)

Finds all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findtext(self, path, default=None, namespaces=None)

Finds text for the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

get(self, key, default=None)
getchildren(self)

Returns all direct children. The elements are returned in document order.

Deprecated:

Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use list(element) or simply iterate over elements.

getiterator(self, tag=None, *tags)

Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags, see iter.

Deprecated:

Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the element.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.

getnext(self)

Returns the following sibling of this element or None.

getparent(self)

Returns the parent of this element or None for the root element.

getprevious(self)

Returns the preceding sibling of this element or None.

getroottree(self)

Return an ElementTree for the root node of the document that contains this element.

This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.

index(self, child, start=None, stop=None)

Find the position of the child within the parent.

This method is not part of the original ElementTree API.

insert(self, index, value)
items(self)
iter(self, tag=None, *tags)

Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".

You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.

Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.

iterancestors(self, tag=None, *tags)

Iterate over the ancestors of this element (from parent to parent).

Can be restricted to find only elements with specific tags, see iter.

iterchildren(self, tag=None, *tags, reversed=False)

Iterate over the children of this element.

As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.

iterdescendants(self, tag=None, *tags)

Iterate over the descendants of this element in document order.

As opposed to el.iter(), this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.

iterfind(self, path, namespaces=None)

Iterates over all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

itersiblings(self, tag=None, *tags, preceding=False)

Iterate over the following or preceding siblings of this element.

The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.

Can be restricted to find only elements with specific tags, see iter.

itertext(self, tag=None, *tags, with_tail=True)

Iterates over the text content of a subtree.

You can pass tag names to restrict text content to specific elements, see iter.

You can set the with_tail keyword argument to False to skip over tail text.

keys(self)
makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with the same document.

remove(self, element)

Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.

replace(self, old_element, new_element)

Replaces a subelement with the element passed as second argument.

set(self, key, value)
values(self)
xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)

Evaluate an XPath expression using the element as context node.

attrib
base

The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.

Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.

Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).

nsmap

Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.

Note that changing the returned dict has no effect on the Element.

prefix

Namespace prefix or None.

sourceline

Original line number as found by the parser or None if unknown.

tag
tail

Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.

text
class lxml.etree.CustomElementClassLookup(self, fallback=None)

Bases: FallbackElementClassLookup

Element class lookup based on a subclass method.

You can inherit from this class and override the method:

lookup(self, type, doc, namespace, name)

to look up the element class for a node. Arguments of the method:

  • type: one of ‘element’, ‘comment’, ‘PI’, ‘entity’

  • doc: document that the node is in

  • namespace: namespace URI of the node (or None for comments/PIs/entities)

  • name: name of the element/entity, None for comments, target for PIs

If you return None from this method, the fallback will be called.
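A minimal sketch of a custom lookup follows. SpecialElement, SpecialLookup and the tag name special are hypothetical names used only for this example.

```python
from lxml import etree

class SpecialElement(etree.ElementBase):
    """Hypothetical class used for <special> elements."""

class SpecialLookup(etree.CustomElementClassLookup):
    def lookup(self, node_type, document, namespace, name):
        if node_type == "element" and name == "special":
            return SpecialElement
        return None  # defer to the fallback lookup

parser = etree.XMLParser()
parser.set_element_class_lookup(SpecialLookup())

root = etree.fromstring("<r><special/><plain/></r>", parser)
special, plain = root[0], root[1]
```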

lookup(self, type, doc, namespace, name)
set_fallback(self, lookup)

Sets the fallback scheme for this lookup method.

fallback
class lxml.etree.DTD(self, file=None, external_id=None)

Bases: _Validator

A DTD validator.

Can load from filesystem directly given a filename or file-like object. Alternatively, pass the keyword parameter external_id to load from a catalog.

_append_log_message(domain, type, level, line, message, filename)
_clear_error_log()
assertValid(self, etree)

Raises DocumentInvalid if the document does not comply with the schema.

assert_(self, etree)

Raises AssertionError if the document does not comply with the schema.

elements()
entities()
iterelements()
iterentities()
validate(self, etree)

Validate the document using this schema.

Returns true if document is valid, false if not.

error_log

The log of validation errors and warnings.
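As a sketch of non-raising validation, the example below calls validate() on one valid and one invalid document and then reads the error log. The DTD and the documents are made up for illustration.

```python
from io import StringIO
from lxml import etree

# Hypothetical DTD: <root> must contain exactly one <child>.
dtd = etree.DTD(StringIO("<!ELEMENT root (child)>\n<!ELEMENT child EMPTY>"))

is_valid = dtd.validate(etree.fromstring("<root><child/></root>"))
is_invalid_ok = dtd.validate(etree.fromstring("<root/>"))

# After a failed run, the log holds one entry per reported problem.
messages = [entry.message for entry in dtd.error_log]
```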

external_id
name
system_url
class lxml.etree.DocInfo

Bases: object

Document information provided by parser and DTD.

clear()

Removes DOCTYPE and internal subset from the document.

URL

The source URL of the document (or None if unknown).

doctype

Returns a DOCTYPE declaration string for the document.

encoding

Returns the encoding name as declared by the document.

externalDTD

Returns a DTD validator based on the external subset of the document.

internalDTD

Returns a DTD validator based on the internal subset of the document.

public_id

Public ID of the DOCTYPE.

Mutable. May be set to a valid string or None. If a DTD does not exist, setting this variable (even to None) will create one.

root_name

Returns the name of the root node as defined by the DOCTYPE.

standalone

Returns the standalone flag as declared by the document. The possible values are True (standalone='yes'), False (standalone='no' or flag not provided in the declaration), and None (unknown or no declaration found). Note that a normal truth test on this value will always tell if the standalone flag was set to 'yes' or not.

system_url

System ID of the DOCTYPE.

Mutable. May be set to a valid string or None. If a DTD does not exist, setting this variable (even to None) will create one.

xml_version

Returns the XML version as declared by the document.
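The properties above can be read from the docinfo attribute of a parsed tree. The sketch below uses a made-up document whose DOCTYPE names a DTD file, root.dtd, that is never loaded.

```python
from io import BytesIO
from lxml import etree

source = (
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
    b'<!DOCTYPE root SYSTEM "root.dtd">\n'
    b"<root/>"
)
tree = etree.parse(BytesIO(source))
info = tree.docinfo  # a DocInfo instance

details = {
    "version": info.xml_version,
    "encoding": info.encoding,
    "standalone": info.standalone,
    "root_name": info.root_name,
    "system_url": info.system_url,
    "public_id": info.public_id,
}
```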

class lxml.etree.ETCompatXMLParser(self, encoding=None, attribute_defaults=False, dtd_validation=False, load_dtd=False, no_network=True, ns_clean=False, recover=False, schema=None, huge_tree=False, remove_blank_text=False, resolve_entities=True, remove_comments=True, remove_pis=True, strip_cdata=True, target=None, compact=True)

Bases: XMLParser

An XML parser with an ElementTree compatible default setup.

See the XMLParser class for details.

This parser has remove_comments and remove_pis enabled by default and thus ignores comments and processing instructions.
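To illustrate the difference, this sketch parses the same made-up document with the ElementTree compatible parser and with the default parser, then compares the number of children under the root.

```python
from lxml import etree

xml = "<r><!--note--><?pi data?><a/></r>"

compat_root = etree.fromstring(xml, etree.ETCompatXMLParser())
default_root = etree.fromstring(xml)

compat_children = len(compat_root)    # only <a> survives
default_children = len(default_root)  # comment, PI and <a>
```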

close(self)

Terminates feeding data to this parser. This tells the parser to process any remaining data in the feed buffer, and then returns the root Element of the tree that was parsed.

This method must be called after passing the last chunk of data into the feed() method. It should only be called when using the feed parser interface, all other usage is undefined.

copy(self)

Create a new parser with the same configuration.

feed(self, data)

Feeds data to the parser. The argument should be an 8-bit string buffer containing encoded data, although Unicode is supported as long as both string types are not mixed.

This is the main entry point to the consumer interface of a parser. The parser will parse as much of the XML stream as it can on each call. To finish parsing or to reset the parser, call the close() method. Both methods may raise ParseError if errors occur in the input data. If an error is raised, there is no longer a need to call close().

The feed parser interface is independent of the normal parser usage. You can use the same parser as a feed parser and in the parse() function concurrently.
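A minimal sketch of the feed interface follows, with the input split into arbitrary byte chunks as it might arrive from a network stream. The chunk boundaries are made up.

```python
from lxml import etree

parser = etree.ETCompatXMLParser()

# Chunks may split the document at any point, even inside a tag.
for chunk in (b"<root><a>te", b"xt</a><b", b"/></root>"):
    parser.feed(chunk)

root = parser.close()  # returns the root Element of the parsed tree
child_tags = [child.tag for child in root]
```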

makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with this parser.

set_element_class_lookup(self, lookup=None)

Set a lookup scheme for element classes generated from this parser.

Reset it by passing None or nothing.

error_log

The error log of the last parser run.

feed_error_log

The error log of the last (or current) run of the feed parser.

Note that this is local to the feed parser and thus is different from what the error_log property returns.

resolvers

The custom resolver registry of this parser.

target
version

The version of the underlying XML parser.

class lxml.etree.ETXPath(self, path, extensions=None, regexp=True, smart_strings=True)

Bases: XPath

Special XPath class that supports the ElementTree {uri} notation for namespaces.

Note that this class does not accept the namespace keyword argument. All namespaces must be passed as part of the path string. Smart strings will be returned for string results unless you pass smart_strings=False.

error_log
path

The literal XPath expression.

class lxml.etree.ElementBase(*children, attrib=None, nsmap=None, **_extra)

Bases: _Element

The public Element class. All custom Element classes must inherit from this one. To create an Element, use the Element() factory.

BIG FAT WARNING: Subclasses must not override __init__ or __new__ as it is absolutely undefined when these objects will be created or destroyed. All persistent state of Elements must be stored in the underlying XML. If you really need to initialize the object after creation, you can implement an _init(self) method that will be called directly after object creation.

Subclasses of this class can be instantiated to create a new Element. By default, the tag name will be the class name and the namespace will be empty. You can modify this with the following class attributes:

  • TAG - the tag name, possibly containing a namespace in Clark notation

  • NAMESPACE - the default namespace URI, unless provided as part of the TAG attribute.

  • HTML - flag if the class is an HTML tag, as opposed to an XML tag. This only applies to un-namespaced tags and defaults to false (i.e. XML).

  • PARSER - the parser that provides the configuration for the newly created document. Providing an HTML parser here will default to creating an HTML element.

In user code, the latter three are commonly inherited in class hierarchies that implement a common namespace.
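The class attributes can be sketched as follows. Item, widget and the namespace urn:shop are hypothetical names chosen for this example.

```python
from lxml import etree

class Item(etree.ElementBase):
    TAG = "item"
    NAMESPACE = "urn:shop"

class widget(etree.ElementBase):
    """No TAG given, so the tag name defaults to the class name."""

# Positional arguments become children, keyword arguments become attributes.
order = Item(Item(), widget(), price="3")
child_tags = [child.tag for child in order]
```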

_init(self)

Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.

addnext(self, element)

Adds the element as a following sibling directly after this element.

This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.

addprevious(self, element)

Adds the element as a preceding sibling directly before this element.

This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.
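A small sketch of the root-level use case: a comment is placed before the root element and a processing instruction after it. The comment text and PI target are made up.

```python
from lxml import etree

root = etree.fromstring("<root/>")
root.addprevious(etree.Comment("generated file"))
root.addnext(etree.ProcessingInstruction("done"))

# Serialise the whole document so that root-level siblings are included.
serialised = etree.tostring(root.getroottree(), encoding="unicode")
```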

append(self, element)

Adds a subelement to the end of this element.

clear(self, keep_tail=False)

Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.

Pass keep_tail=True to leave the tail text untouched.
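The effect of keep_tail can be sketched on a made-up element that has text, an attribute, a child and tail text.

```python
from lxml import etree

root = etree.fromstring("<r><a x='1'>text<b/></a>tail</r>")
a = root[0]

a.clear(keep_tail=True)

state = (a.text, len(a), dict(a.attrib), a.tail)
```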

cssselect(expr, *, translator='xml')

Run the CSS expression on this element and its children, returning a list of the results.

Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.

extend(self, elements)

Extends the current children by the elements in the iterable.

find(self, path, namespaces=None)

Finds the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findall(self, path, namespaces=None)

Finds all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findtext(self, path, default=None, namespaces=None)

Finds text for the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
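The three find methods can be compared side by side. This sketch uses a made-up namespace urn:x with the prefix x.

```python
from lxml import etree

root = etree.fromstring('<r xmlns:x="urn:x"><x:a>one</x:a><x:a>two</x:a></r>')
ns = {"x": "urn:x"}  # prefix-to-namespace mapping

first = root.find("x:a", namespaces=ns)
every = root.findall("x:a", namespaces=ns)
first_text = root.findtext("x:a", namespaces=ns)
fallback = root.findtext("x:b", default="n/a", namespaces=ns)
```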

get(self, key, default=None)

Gets an element attribute.

getchildren(self)

Returns all direct children. The elements are returned in document order.

Deprecated:

Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use list(element) or simply iterate over elements.

getiterator(self, tag=None, *tags)

Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags, see iter.

Deprecated:

Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the element.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.

getnext(self)

Returns the following sibling of this element or None.

getparent(self)

Returns the parent of this element or None for the root element.

getprevious(self)

Returns the preceding sibling of this element or None.
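A brief sketch of sibling and parent navigation on a made-up document:

```python
from lxml import etree

root = etree.fromstring("<r><a/><b/><c/></r>")
b = root[1]

previous_tag = b.getprevious().tag
next_tag = b.getnext().tag
parent = b.getparent()
root_parent = root.getparent()     # the root element has no parent
after_last = root[2].getnext()     # the last child has no following sibling
```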

getroottree(self)

Return an ElementTree for the root node of the document that contains this element.

This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.

index(self, child, start=None, stop=None)

Find the position of the child within the parent.

This method is not part of the original ElementTree API.

insert(self, index, element)

Inserts a subelement at the given position in this element.

items(self)

Gets element attributes, as a sequence. The attributes are returned in an arbitrary order.

iter(self, tag=None, *tags)

Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".

You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.

Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.
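The tag filters described above can be sketched on a small made-up document that mixes namespaced and un-namespaced elements with a comment.

```python
from lxml import etree

root = etree.fromstring(
    '<r xmlns:x="urn:x"><a/><x:a/><!--note--><b><a/></b></r>'
)

no_namespace = [el.tag for el in root.iter("a")]      # same as "{}a"
any_namespace = [el.tag for el in root.iter("{*}a")]  # wildcard namespace
comments = [c.text for c in root.iter(etree.Comment)]
a_or_b = [el.tag for el in root.iter("a", "b")]       # several tags at once
```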

iterancestors(self, tag=None, *tags)

Iterate over the ancestors of this element (from parent to parent).

Can be restricted to find only elements with specific tags, see iter.

iterchildren(self, tag=None, *tags, reversed=False)

Iterate over the children of this element.

As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.

iterdescendants(self, tag=None, *tags)

Iterate over the descendants of this element in document order.

As opposed to el.iter(), this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.
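To contrast the child, descendant and subtree iterators, this sketch collects tag names from a made-up document:

```python
from lxml import etree

root = etree.fromstring("<r><a><a1/></a><b/><c/></r>")

children_reversed = [el.tag for el in root.iterchildren(reversed=True)]
descendants = [el.tag for el in root.iterdescendants()]
subtree = [el.tag for el in root.iter()]  # includes the element itself
```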

iterfind(self, path, namespaces=None)

Iterates over all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

itersiblings(self, tag=None, *tags, preceding=False)

Iterate over the following or preceding siblings of this element.

The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.

Can be restricted to find only elements with specific tags, see iter.
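The two directions can be sketched as follows, starting from the third child of a made-up document:

```python
from lxml import etree

root = etree.fromstring("<r><a/><b/><c/><d/></r>")
c = root[2]

following = [el.tag for el in c.itersiblings()]
preceding = [el.tag for el in c.itersiblings(preceding=True)]
only_a = [el.tag for el in c.itersiblings("a", preceding=True)]
```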

itertext(self, tag=None, *tags, with_tail=True)

Iterates over the text content of a subtree.

You can pass tag names to restrict text content to specific elements, see iter.

You can set the with_tail keyword argument to False to skip over tail text.
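A short sketch of the with_tail switch on made-up mixed content:

```python
from lxml import etree

root = etree.fromstring("<p>Hello <b>bold</b> world<i>it</i>!</p>")

full_text = "".join(root.itertext())
without_tails = "".join(root.itertext(with_tail=False))
```

With with_tail=False, only the text directly inside each element is yielded, so " world" and "!" are skipped.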

keys(self)

Gets a list of attribute names. The names are returned in an arbitrary order (just like for an ordinary Python dictionary).

makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with the same document.

remove(self, element)

Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.
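The identity-based comparison can be illustrated with two children that look the same. This is a sketch; the variable names are arbitrary.

```python
from lxml import etree

root = etree.fromstring("<r><x/><x/></r>")
first, second = root            # two children with identical tags
lookalike = etree.Element("x")  # same tag, but not a child of root

root.remove(second)             # removes exactly this object
remaining = list(root)

try:
    root.remove(lookalike)      # not a child: nothing to remove
    rejected = False
except ValueError:
    rejected = True
```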

replace(self, old_element, new_element)

Replaces a subelement with the element passed as second argument.

set(self, key, value)

Sets an element attribute. In HTML documents (not XML or XHTML), the value None is allowed and creates an attribute without value (just the attribute name).
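A small sketch of the HTML-only case, assuming the element comes from a document parsed with the HTML parser (here via `etree.HTML`). The markup is invented for the example.

```python
from lxml import etree

html_root = etree.HTML("<form><input name='q'></form>")
field = html_root.find(".//input")

# In an HTML document, None creates a bare attribute name.
field.set("disabled", None)

serialized = etree.tostring(field, method="html")
```

The serialised tag carries `disabled` without a value. In an XML document the value has to be a string.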

values(self)

Gets element attribute values as a sequence of strings. The attributes are returned in an arbitrary order.

xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)

Evaluate an XPath expression using the element as context node.
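The following sketch shows the `namespaces` mapping and an XPath variable passed as a keyword argument. The namespace URI, the prefix `x` and the variable name `wanted` are assumptions made for this example.

```python
from lxml import etree

NS = "http://example.com/ns"  # hypothetical namespace URI
root = etree.fromstring(
    f'<x:list xmlns:x="{NS}">'
    '<x:item id="a">first</x:item>'
    '<x:item id="b">second</x:item>'
    "</x:list>"
)
ns = {"x": NS}

# Relative path, evaluated with root as the context node.
items = root.xpath("x:item", namespaces=ns)

# $wanted is bound through the keyword argument of the same name.
wanted = root.xpath("x:item[@id = $wanted]/text()", namespaces=ns, wanted="b")

# Non-node results are returned as plain Python values.
count = root.xpath("count(x:item)", namespaces=ns)
```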

attrib

Element attribute dictionary. Where possible, use get(), set(), keys(), values() and items() to access element attributes.

base

The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.

Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.

Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).

nsmap

Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations inherited from its ancestors.

Note that changing the returned dict has no effect on the Element.
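A brief sketch of both points, inherited declarations and the read-only nature of the mapping (URIs are placeholders):

```python
from lxml import etree

root = etree.fromstring(
    '<root xmlns:a="http://example.com/a">'
    '<child xmlns:b="http://example.com/b"/>'
    "</root>"
)
child = root[0]

root_map = root.nsmap
child_map = child.nsmap   # includes the declaration made on <root>

# Editing the returned dict does not touch the element.
child_map["c"] = "http://example.com/c"
still_unknown = "c" not in child.nsmap
```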

prefix

Namespace prefix or None.

sourceline

Original line number as found by the parser or None if unknown.

tag

Element tag

tail

Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.

text

Text before the first subelement. This is either a string or the value None, if there was no text.

class lxml.etree.ElementChildIterator(self, node, tag=None, reversed=False)

Bases: _ElementMatchIterator

Iterates over the children of an element.

class lxml.etree.ElementClassLookup(self)

Bases: object

Superclass of Element class lookups.

class lxml.etree.ElementDefaultClassLookup(self, element=None, comment=None, pi=None, entity=None)

Bases: ElementClassLookup

Element class lookup scheme that always returns the default Element class.

The keyword arguments element, comment, pi and entity accept the respective Element classes.
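A minimal sketch of installing a custom default element class on a parser. The class name `MyElement` and its helper method are hypothetical.

```python
from lxml import etree


class MyElement(etree.ElementBase):
    """Hypothetical element class with one convenience method."""

    def child_tags(self):
        return [child.tag for child in self]


lookup = etree.ElementDefaultClassLookup(element=MyElement)

parser = etree.XMLParser()
parser.set_element_class_lookup(lookup)

# Every element created by this parser now uses MyElement.
root = etree.fromstring("<r><a/><b/></r>", parser)
tags = root.child_tags()
```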

comment_class
element_class
entity_class
pi_class

class lxml.etree.ElementDepthFirstIterator(self, node, tag=None, inclusive=True)

Bases: object

Iterates over an element and its sub-elements in document order (depth first pre-order).

Note that this also includes comments, entities and processing instructions. To filter them out, check if the tag property of the returned element is a string (i.e. not None and not a factory function), or pass the Element factory for the tag argument to receive only Elements.

If the optional tag argument is not None, the iterator returns only the elements that match the respective name and namespace.

The optional boolean argument ‘inclusive’ defaults to True and can be set to False to exclude the start element itself.

Note that the behaviour of this iterator is completely undefined if the tree it traverses is modified during iteration.
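The filtering options mentioned above can be sketched like this. The sample document is invented; in everyday code `element.iter()` provides the same traversal.

```python
from lxml import etree

root = etree.fromstring("<r><!-- c --><a><?pi x?><b/></a></r>")

# Everything in the subtree: elements, the comment and the PI.
all_nodes = list(etree.ElementDepthFirstIterator(root))

# Keep real elements by checking that .tag is a string.
only_elements = [n.tag for n in all_nodes if isinstance(n.tag, str)]

# Same result by passing the Element factory as the tag argument.
via_factory = [n.tag for n in etree.ElementDepthFirstIterator(root, etree.Element)]

# inclusive=False leaves out the start element itself.
without_start = [
    n.tag
    for n in etree.ElementDepthFirstIterator(root, inclusive=False)
    if isinstance(n.tag, str)
]
```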

class lxml.etree.ElementNamespaceClassLookup(self, fallback=None)

Bases: FallbackElementClassLookup

Element class lookup scheme that searches the Element class in the Namespace registry.

Usage:

>>> from lxml.etree import ElementBase, ElementNamespaceClassLookup
>>> lookup = ElementNamespaceClassLookup()
>>> ns_elements = lookup.get_namespace("http://schema.org/Movie")
>>> @ns_elements
... class movie(ElementBase):
...     "Element implementation for 'movie' tag (using class name) in schema namespace."
>>> @ns_elements("movie")
... class MovieElement(ElementBase):
...     "Element implementation for 'movie' tag (explicit tag name) in schema namespace."

get_namespace(self, ns_uri)

Retrieve the namespace object associated with the given URI. Pass None for the empty namespace.

Creates a new namespace object if it does not yet exist.

set_fallback(self, lookup)

Sets the fallback scheme for this lookup method.
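For the registered classes to take effect, the lookup has to be attached to a parser. The sketch below shows one way to wire this up; the namespace URI, the class name and the `title` property are assumptions made for the example.

```python
from lxml import etree

MOVIE_NS = "http://example.com/movies"  # hypothetical namespace URI

lookup = etree.ElementNamespaceClassLookup()
movies = lookup.get_namespace(MOVIE_NS)


@movies("movie")
class MovieElement(etree.ElementBase):
    """Hypothetical class for <movie> elements in MOVIE_NS."""

    @property
    def title(self):
        return self.get("title")


parser = etree.XMLParser()
parser.set_element_class_lookup(lookup)

root = etree.fromstring(
    f'<m:movie xmlns:m="{MOVIE_NS}" title="Brazil"><m:cast/></m:movie>',
    parser,
)
title = root.title
```

Tags without a registered class, such as `cast` here, are handed to the fallback lookup, which yields the default element class when no fallback was set.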

fallback

class lxml.etree.ElementTextIterator(self, element, tag=None, with_tail=True)

Bases: object

Iterates over the text content of a subtree.

You can pass the tag keyword argument to restrict text content to a specific tag name.

You can set the with_tail keyword argument to False to skip over tail text (e.g. if you know that it’s only whitespace from pretty-printing).

class lxml.etree.EntityBase

Bases: _Entity

All custom Entity classes must inherit from this one.

To create an XML Entity instance, use the Entity() factory.

Subclasses must not override __init__ or __new__ as it is absolutely undefined when these objects will be created or destroyed. All persistent state of Entities must be stored in the underlying XML. If you really need to initialize the object after creation, you can implement an _init(self) method that will be called after object creation.

_init(self)

Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.

addnext(self, element)

Adds the element as a following sibling directly after this element.

This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.

addprevious(self, element)

Adds the element as a preceding sibling directly before this element.

This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.

append(self, value)
clear(self, keep_tail=False)

Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.

Pass keep_tail=True to leave the tail text untouched.
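A short sketch of the difference between the two modes, using an illustrative fragment:

```python
from lxml import etree

root = etree.fromstring('<r><a k="v">text<b/></a>tail</r>')
a = root[0]

# keep_tail=True empties the element but leaves the tail alone.
a.clear(keep_tail=True)
state_after_first = (len(a), dict(a.attrib), a.text, a.tail)

# The default also drops the tail text.
a.clear()
tail_after_second = a.tail
```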

cssselect(expr, *, translator='xml')

Run the CSS expression on this element and its children, returning a list of the results.

Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.

extend(self, elements)

Extends the current children by the elements in the iterable.

find(self, path, namespaces=None)

Finds the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findall(self, path, namespaces=None)

Finds all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findtext(self, path, default=None, namespaces=None)

Finds text for the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
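The three find methods share the same path syntax and `namespaces` mapping. Below is a sketch with a made-up namespace and document:

```python
from lxml import etree

NS = {"m": "http://example.com/movies"}  # hypothetical prefix mapping
root = etree.fromstring(
    '<m:list xmlns:m="http://example.com/movies">'
    "<m:movie><m:title>Brazil</m:title></m:movie>"
    "<m:movie><m:title>Alien</m:title></m:movie>"
    "</m:list>"
)

# First match only, or None if nothing matches.
first_movie = root.find("m:movie", namespaces=NS)

# All matches, as a list.
titles = [t.text for t in root.findall("m:movie/m:title", namespaces=NS)]

# Text of the first match, with a default for missing elements.
first_title = root.findtext("m:movie/m:title", namespaces=NS)
missing = root.findtext("m:movie/m:year", default="n/a", namespaces=NS)
```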

get(self, key, default=None)
getchildren(self)

Returns all direct children. The elements are returned in document order.

Deprecated:

Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use list(element) or simply iterate over elements.

getiterator(self, tag=None, *tags)

Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags, see iter.

Deprecated:

Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the element.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.

getnext(self)

Returns the following sibling of this element or None.

getparent(self)

Returns the parent of this element or None for the root element.

getprevious(self)

Returns the preceding sibling of this element or None.

getroottree(self)

Return an ElementTree for the root node of the document that contains this element.

This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.

index(self, child, start=None, stop=None)

Find the position of the child within the parent.

This method is not part of the original ElementTree API.
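The navigation methods above can be tried together in a small sketch (the tree is illustrative only):

```python
from lxml import etree

root = etree.fromstring("<a><b/><c><d/></c><e/></a>")
c = root[1]
d = c[0]

# Sibling and parent navigation.
neighbours = (c.getprevious().tag, c.getnext().tag)
parent_of_d = d.getparent()

# Walking getparent() to the top reaches the same root element
# that getroottree() wraps.
node = d
while node.getparent() is not None:
    node = node.getparent()
same_root = node is d.getroottree().getroot()

# Position of a child within its parent.
position = root.index(c)
```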

insert(self, index, value)
items(self)
iter(self, tag=None, *tags)

Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".

You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.

Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.

iterancestors(self, tag=None, *tags)

Iterate over the ancestors of this element, starting at its parent and moving up towards the root.

Can be restricted to find only elements with specific tags, see iter.

iterchildren(self, tag=None, *tags, reversed=False)

Iterate over the children of this element.

As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.

iterdescendants(self, tag=None, *tags)

Iterate over the descendants of this element in document order.

As opposed to el.iter(), this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.

iterfind(self, path, namespaces=None)

Iterates over all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

itersiblings(self, tag=None, *tags, preceding=False)

Iterate over the following or preceding siblings of this element.

The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.

Can be restricted to find only elements with specific tags, see iter.

itertext(self, tag=None, *tags, with_tail=True)

Iterates over the text content of a subtree.

You can pass tag names to restrict text content to specific elements, see iter.

You can set the with_tail keyword argument to False to skip over tail text.

keys(self)
makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with the same document.

remove(self, element)

Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.

replace(self, old_element, new_element)

Replaces a subelement with the element passed as second argument.

set(self, key, value)
values(self)
xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)

Evaluate an XPath expression using the element as context node.

attrib
base

The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.

Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.

Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).

name
nsmap

Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations inherited from its ancestors.

Note that changing the returned dict has no effect on the Element.

prefix

Namespace prefix or None.

sourceline

Original line number as found by the parser or None if unknown.

tag
tail

Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.

text

class lxml.etree.ErrorDomains

Bases: object

Libxml2 error domains

_getName(default=None, /)

Return the value for key if key is in the dictionary, else default.

BUFFER = 29
C14N = 21
CATALOG = 20
CHECK = 24
DATATYPE = 15
DTD = 4
FTP = 9
HTML = 5
HTTP = 10
I18N = 27
IO = 8
MEMORY = 6
MODULE = 26
NAMESPACE = 3
NONE = 0
OUTPUT = 7
PARSER = 1
REGEXP = 14
RELAXNGP = 18
RELAXNGV = 19
SCHEMASP = 16
SCHEMASV = 17
SCHEMATRONV = 28
TREE = 2
URI = 30
VALID = 23
WRITER = 25
XINCLUDE = 11
XPATH = 12
XPOINTER = 13
XSLT = 22
_names = {0: 'NONE', 1: 'PARSER', 2: 'TREE', 3: 'NAMESPACE', 4: 'DTD', 5: 'HTML', 6: 'MEMORY', 7: 'OUTPUT', 8: 'IO', 9: 'FTP', 10: 'HTTP', 11: 'XINCLUDE', 12: 'XPATH', 13: 'XPOINTER', 14: 'REGEXP', 15: 'DATATYPE', 16: 'SCHEMASP', 17: 'SCHEMASV', 18: 'RELAXNGP', 19: 'RELAXNGV', 20: 'CATALOG', 21: 'C14N', 22: 'XSLT', 23: 'VALID', 24: 'CHECK', 25: 'WRITER', 26: 'MODULE', 27: 'I18N', 28: 'SCHEMATRONV', 29: 'BUFFER', 30: 'URI'}
class lxml.etree.ErrorLevels

Bases: object

Libxml2 error levels

_getName(default=None, /)

Return the value for key if key is in the dictionary, else default.

ERROR = 2
FATAL = 3
NONE = 0
WARNING = 1
_names = {0: 'NONE', 1: 'WARNING', 2: 'ERROR', 3: 'FATAL'}
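These constants are mainly useful for interpreting error log entries. The sketch below assumes that parsing a deliberately broken document raises XMLSyntaxError and that the entries of its error_log expose `domain`, `type`, `level` and the matching `*_name` attributes. The input string is made up.

```python
from lxml import etree

try:
    etree.fromstring("<a><b></a>")   # <b> is never closed
except etree.XMLSyntaxError as exc:
    entries = list(exc.error_log)
else:
    entries = []

# Pick out the entries reporting mismatched start and end tags.
mismatches = [
    entry for entry in entries
    if entry.type == etree.ErrorTypes.ERR_TAG_NAME_MISMATCH
]

# Each entry also carries the symbolic names of its numeric codes.
summary = [
    (entry.domain_name, entry.type_name, entry.level_name, entry.line)
    for entry in mismatches
]
```

Comparing against the numeric constants avoids depending on the wording of libxml2 messages, which can change between versions.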
class lxml.etree.ErrorTypes

Bases: object

Libxml2 error types

_getName(default=None, /)

Return the value for key if key is in the dictionary, else default.

BUF_OVERFLOW = 7000
C14N_CREATE_CTXT = 1950
C14N_CREATE_STACK = 1952
C14N_INVALID_NODE = 1953
C14N_RELATIVE_NAMESPACE = 1955
C14N_REQUIRES_UTF8 = 1951
C14N_UNKNOW_NODE = 1954
CATALOG_ENTRY_BROKEN = 1651
CATALOG_MISSING_ATTR = 1650
CATALOG_NOT_CATALOG = 1653
CATALOG_PREFER_VALUE = 1652
CATALOG_RECURSION = 1654
CHECK_ENTITY_TYPE = 5012
CHECK_FOUND_ATTRIBUTE = 5001
CHECK_FOUND_CDATA = 5003
CHECK_FOUND_COMMENT = 5007
CHECK_FOUND_DOCTYPE = 5008
CHECK_FOUND_ELEMENT = 5000
CHECK_FOUND_ENTITY = 5005
CHECK_FOUND_ENTITYREF = 5004
CHECK_FOUND_FRAGMENT = 5009
CHECK_FOUND_NOTATION = 5010
CHECK_FOUND_PI = 5006
CHECK_FOUND_TEXT = 5002
CHECK_NAME_NOT_NULL = 5037
CHECK_NOT_ATTR = 5023
CHECK_NOT_ATTR_DECL = 5024
CHECK_NOT_DTD = 5022
CHECK_NOT_ELEM_DECL = 5025
CHECK_NOT_ENTITY_DECL = 5026
CHECK_NOT_NCNAME = 5034
CHECK_NOT_NS_DECL = 5027
CHECK_NOT_UTF8 = 5032
CHECK_NO_DICT = 5033
CHECK_NO_DOC = 5014
CHECK_NO_ELEM = 5016
CHECK_NO_HREF = 5028
CHECK_NO_NAME = 5015
CHECK_NO_NEXT = 5020
CHECK_NO_PARENT = 5013
CHECK_NO_PREV = 5018
CHECK_NS_ANCESTOR = 5031
CHECK_NS_SCOPE = 5030
CHECK_OUTSIDE_DICT = 5035
CHECK_UNKNOWN_NODE = 5011
CHECK_WRONG_DOC = 5017
CHECK_WRONG_NAME = 5036
CHECK_WRONG_NEXT = 5021
CHECK_WRONG_PARENT = 5029
CHECK_WRONG_PREV = 5019
DTD_ATTRIBUTE_DEFAULT = 500
DTD_ATTRIBUTE_REDEFINED = 501
DTD_ATTRIBUTE_VALUE = 502
DTD_CONTENT_ERROR = 503
DTD_CONTENT_MODEL = 504
DTD_CONTENT_NOT_DETERMINIST = 505
DTD_DIFFERENT_PREFIX = 506
DTD_DUP_TOKEN = 541
DTD_ELEM_DEFAULT_NAMESPACE = 507
DTD_ELEM_NAMESPACE = 508
DTD_ELEM_REDEFINED = 509
DTD_EMPTY_NOTATION = 510
DTD_ENTITY_TYPE = 511
DTD_ID_FIXED = 512
DTD_ID_REDEFINED = 513
DTD_ID_SUBSET = 514
DTD_INVALID_CHILD = 515
DTD_INVALID_DEFAULT = 516
DTD_LOAD_ERROR = 517
DTD_MISSING_ATTRIBUTE = 518
DTD_MIXED_CORRUPT = 519
DTD_MULTIPLE_ID = 520
DTD_NOTATION_REDEFINED = 526
DTD_NOTATION_VALUE = 527
DTD_NOT_EMPTY = 528
DTD_NOT_PCDATA = 529
DTD_NOT_STANDALONE = 530
DTD_NO_DOC = 521
DTD_NO_DTD = 522
DTD_NO_ELEM_NAME = 523
DTD_NO_PREFIX = 524
DTD_NO_ROOT = 525
DTD_ROOT_NAME = 531
DTD_STANDALONE_DEFAULTED = 538
DTD_STANDALONE_WHITE_SPACE = 532
DTD_UNKNOWN_ATTRIBUTE = 533
DTD_UNKNOWN_ELEM = 534
DTD_UNKNOWN_ENTITY = 535
DTD_UNKNOWN_ID = 536
DTD_UNKNOWN_NOTATION = 537
DTD_XMLID_TYPE = 540
DTD_XMLID_VALUE = 539
ERR_ATTLIST_NOT_FINISHED = 51
ERR_ATTLIST_NOT_STARTED = 50
ERR_ATTRIBUTE_NOT_FINISHED = 40
ERR_ATTRIBUTE_NOT_STARTED = 39
ERR_ATTRIBUTE_REDEFINED = 42
ERR_ATTRIBUTE_WITHOUT_VALUE = 41
ERR_CDATA_NOT_FINISHED = 63
ERR_CHARREF_AT_EOF = 10
ERR_CHARREF_IN_DTD = 13
ERR_CHARREF_IN_EPILOG = 12
ERR_CHARREF_IN_PROLOG = 11
ERR_COMMENT_ABRUPTLY_ENDED = 112
ERR_COMMENT_NOT_FINISHED = 45
ERR_CONDSEC_INVALID = 83
ERR_CONDSEC_INVALID_KEYWORD = 95
ERR_CONDSEC_NOT_FINISHED = 59
ERR_CONDSEC_NOT_STARTED = 58
ERR_DOCTYPE_NOT_FINISHED = 61
ERR_DOCUMENT_EMPTY = 4
ERR_DOCUMENT_END = 5
ERR_DOCUMENT_START = 3
ERR_ELEMCONTENT_NOT_FINISHED = 55
ERR_ELEMCONTENT_NOT_STARTED = 54
ERR_ENCODING_NAME = 79
ERR_ENTITYREF_AT_EOF = 14
ERR_ENTITYREF_IN_DTD = 17
ERR_ENTITYREF_IN_EPILOG = 16
ERR_ENTITYREF_IN_PROLOG = 15
ERR_ENTITYREF_NO_NAME = 22
ERR_ENTITYREF_SEMICOL_MISSING = 23
ERR_ENTITY_BOUNDARY = 90
ERR_ENTITY_CHAR_ERROR = 87
ERR_ENTITY_IS_EXTERNAL = 29
ERR_ENTITY_IS_PARAMETER = 30
ERR_ENTITY_LOOP = 89
ERR_ENTITY_NOT_FINISHED = 37
ERR_ENTITY_NOT_STARTED = 36
ERR_ENTITY_PE_INTERNAL = 88
ERR_ENTITY_PROCESSING = 104
ERR_EQUAL_REQUIRED = 75
ERR_EXTRA_CONTENT = 86
ERR_EXT_ENTITY_STANDALONE = 82
ERR_EXT_SUBSET_NOT_FINISHED = 60
ERR_GT_REQUIRED = 73
ERR_HYPHEN_IN_COMMENT = 80
ERR_INTERNAL_ERROR = 1
ERR_INVALID_CHAR = 9
ERR_INVALID_CHARREF = 8
ERR_INVALID_DEC_CHARREF = 7
ERR_INVALID_ENCODING = 81
ERR_INVALID_HEX_CHARREF = 6
ERR_INVALID_URI = 91
ERR_LITERAL_NOT_FINISHED = 44
ERR_LITERAL_NOT_STARTED = 43
ERR_LTSLASH_REQUIRED = 74
ERR_LT_IN_ATTRIBUTE = 38
ERR_LT_REQUIRED = 72
ERR_MISPLACED_CDATA_END = 62
ERR_MISSING_ENCODING = 101
ERR_MIXED_NOT_FINISHED = 53
ERR_MIXED_NOT_STARTED = 52
ERR_NAME_REQUIRED = 68
ERR_NAME_TOO_LONG = 110
ERR_NMTOKEN_REQUIRED = 67
ERR_NOTATION_NOT_FINISHED = 49
ERR_NOTATION_NOT_STARTED = 48
ERR_NOTATION_PROCESSING = 105
ERR_NOT_STANDALONE = 103
ERR_NOT_WELL_BALANCED = 85
ERR_NO_DTD = 94
ERR_NO_MEMORY = 2
ERR_NS_DECL_ERROR = 35
ERR_OK = 0
ERR_PCDATA_REQUIRED = 69
ERR_PEREF_AT_EOF = 18
ERR_PEREF_IN_EPILOG = 20
ERR_PEREF_IN_INT_SUBSET = 21
ERR_PEREF_IN_PROLOG = 19
ERR_PEREF_NO_NAME = 24
ERR_PEREF_SEMICOL_MISSING = 25
ERR_PI_NOT_FINISHED = 47
ERR_PI_NOT_STARTED = 46
ERR_PUBID_REQUIRED = 71
ERR_RESERVED_XML_NAME = 64
ERR_SEPARATOR_REQUIRED = 66
ERR_SPACE_REQUIRED = 65
ERR_STANDALONE_VALUE = 78
ERR_STRING_NOT_CLOSED = 34
ERR_STRING_NOT_STARTED = 33
ERR_TAG_NAME_MISMATCH = 76
ERR_TAG_NOT_FINISHED = 77
ERR_UNDECLARED_ENTITY = 26
ERR_UNKNOWN_ENCODING = 31
ERR_UNKNOWN_VERSION = 108
ERR_UNPARSED_ENTITY = 28
ERR_UNSUPPORTED_ENCODING = 32
ERR_URI_FRAGMENT = 92
ERR_URI_REQUIRED = 70
ERR_USER_STOP = 111
ERR_VALUE_REQUIRED = 84
ERR_VERSION_MISMATCH = 109
ERR_VERSION_MISSING = 96
ERR_XMLDECL_NOT_FINISHED = 57
ERR_XMLDECL_NOT_STARTED = 56
FTP_ACCNT = 2002
FTP_EPSV_ANSWER = 2001
FTP_PASV_ANSWER = 2000
FTP_URL_SYNTAX = 2003
HTML_STRUCURE_ERROR = 800
HTML_UNKNOWN_TAG = 801
HTTP_UNKNOWN_HOST = 2022
HTTP_URL_SYNTAX = 2020
HTTP_USE_IP = 2021
I18N_CONV_FAILED = 6003
I18N_EXCESS_HANDLER = 6002
I18N_NO_HANDLER = 6001
I18N_NO_NAME = 6000
I18N_NO_OUTPUT = 6004
IO_BUFFER_FULL = 1548
IO_EACCES = 1501
IO_EADDRINUSE = 1554
IO_EAFNOSUPPORT = 1556
IO_EAGAIN = 1502
IO_EALREADY = 1555
IO_EBADF = 1503
IO_EBADMSG = 1504
IO_EBUSY = 1505
IO_ECANCELED = 1506
IO_ECHILD = 1507
IO_ECONNREFUSED = 1552
IO_EDEADLK = 1508
IO_EDOM = 1509
IO_EEXIST = 1510
IO_EFAULT = 1511
IO_EFBIG = 1512
IO_EINPROGRESS = 1513
IO_EINTR = 1514
IO_EINVAL = 1515
IO_EIO = 1516
IO_EISCONN = 1551
IO_EISDIR = 1517
IO_EMFILE = 1518
IO_EMSGSIZE = 1520
IO_ENAMETOOLONG = 1521
IO_ENCODER = 1544
IO_ENETUNREACH = 1553
IO_ENFILE = 1522
IO_ENODEV = 1523
IO_ENOENT = 1524
IO_ENOEXEC = 1525
IO_ENOLCK = 1526
IO_ENOMEM = 1527
IO_ENOSPC = 1528
IO_ENOSYS = 1529
IO_ENOTDIR = 1530
IO_ENOTEMPTY = 1531
IO_ENOTSOCK = 1550
IO_ENOTSUP = 1532
IO_ENOTTY = 1533
IO_ENXIO = 1534
IO_EPERM = 1535
IO_EPIPE = 1536
IO_ERANGE = 1537
IO_EROFS = 1538
IO_ESPIPE = 1539
IO_ESRCH = 1540
IO_ETIMEDOUT = 1541
IO_EXDEV = 1542
IO_FLUSH = 1545
IO_LOAD_ERROR = 1549
IO_NETWORK_ATTEMPT = 1543
IO_NO_INPUT = 1547
IO_UNKNOWN = 1500
IO_WRITE = 1546
MODULE_CLOSE = 4901
MODULE_OPEN = 4900
NS_ERR_ATTRIBUTE_REDEFINED = 203
NS_ERR_COLON = 205
NS_ERR_EMPTY = 204
NS_ERR_QNAME = 202
NS_ERR_UNDEFINED_NAMESPACE = 201
NS_ERR_XML_NAMESPACE = 200
REGEXP_COMPILE_ERROR = 1450
RNGP_ANYNAME_ATTR_ANCESTOR = 1000
RNGP_ATTRIBUTE_CHILDREN = 1002
RNGP_ATTRIBUTE_CONTENT = 1003
RNGP_ATTRIBUTE_EMPTY = 1004
RNGP_ATTRIBUTE_NOOP = 1005
RNGP_ATTR_CONFLICT = 1001
RNGP_CHOICE_CONTENT = 1006
RNGP_CHOICE_EMPTY = 1007
RNGP_CREATE_FAILURE = 1008
RNGP_DATA_CONTENT = 1009
RNGP_DEFINE_CREATE_FAILED = 1011
RNGP_DEFINE_EMPTY = 1012
RNGP_DEFINE_MISSING = 1013
RNGP_DEFINE_NAME_MISSING = 1014
RNGP_DEF_CHOICE_AND_INTERLEAVE = 1010
RNGP_ELEMENT_CONTENT = 1018
RNGP_ELEMENT_EMPTY = 1017
RNGP_ELEMENT_NAME = 1019
RNGP_ELEMENT_NO_CONTENT = 1020
RNGP_ELEM_CONTENT_EMPTY = 1015
RNGP_ELEM_CONTENT_ERROR = 1016
RNGP_ELEM_TEXT_CONFLICT = 1021
RNGP_EMPTY = 1022
RNGP_EMPTY_CONSTRUCT = 1023
RNGP_EMPTY_CONTENT = 1024
RNGP_EMPTY_NOT_EMPTY = 1025
RNGP_ERROR_TYPE_LIB = 1026
RNGP_EXCEPT_EMPTY = 1027
RNGP_EXCEPT_MISSING = 1028
RNGP_EXCEPT_MULTIPLE = 1029
RNGP_EXCEPT_NO_CONTENT = 1030
RNGP_EXTERNALREF_EMTPY = 1031
RNGP_EXTERNALREF_RECURSE = 1033
RNGP_EXTERNAL_REF_FAILURE = 1032
RNGP_FORBIDDEN_ATTRIBUTE = 1034
RNGP_FOREIGN_ELEMENT = 1035
RNGP_GRAMMAR_CONTENT = 1036
RNGP_GRAMMAR_EMPTY = 1037
RNGP_GRAMMAR_MISSING = 1038
RNGP_GRAMMAR_NO_START = 1039
RNGP_GROUP_ATTR_CONFLICT = 1040
RNGP_HREF_ERROR = 1041
RNGP_INCLUDE_EMPTY = 1042
RNGP_INCLUDE_FAILURE = 1043
RNGP_INCLUDE_RECURSE = 1044
RNGP_INTERLEAVE_ADD = 1045
RNGP_INTERLEAVE_CREATE_FAILED = 1046
RNGP_INTERLEAVE_EMPTY = 1047
RNGP_INTERLEAVE_NO_CONTENT = 1048
RNGP_INVALID_DEFINE_NAME = 1049
RNGP_INVALID_URI = 1050
RNGP_INVALID_VALUE = 1051
RNGP_MISSING_HREF = 1052
RNGP_NAME_MISSING = 1053
RNGP_NEED_COMBINE = 1054
RNGP_NOTALLOWED_NOT_EMPTY = 1055
RNGP_NSNAME_ATTR_ANCESTOR = 1056
RNGP_NSNAME_NO_NS = 1057
RNGP_PARAM_FORBIDDEN = 1058
RNGP_PARAM_NAME_MISSING = 1059
RNGP_PARENTREF_CREATE_FAILED = 1060
RNGP_PARENTREF_NAME_INVALID = 1061
RNGP_PARENTREF_NOT_EMPTY = 1064
RNGP_PARENTREF_NO_NAME = 1062
RNGP_PARENTREF_NO_PARENT = 1063
RNGP_PARSE_ERROR = 1065
RNGP_PAT_ANYNAME_EXCEPT_ANYNAME = 1066
RNGP_PAT_ATTR_ATTR = 1067
RNGP_PAT_ATTR_ELEM = 1068
RNGP_PAT_DATA_EXCEPT_ATTR = 1069
RNGP_PAT_DATA_EXCEPT_ELEM = 1070
RNGP_PAT_DATA_EXCEPT_EMPTY = 1071
RNGP_PAT_DATA_EXCEPT_GROUP = 1072
RNGP_PAT_DATA_EXCEPT_INTERLEAVE = 1073
RNGP_PAT_DATA_EXCEPT_LIST = 1074
RNGP_PAT_DATA_EXCEPT_ONEMORE = 1075
RNGP_PAT_DATA_EXCEPT_REF = 1076
RNGP_PAT_DATA_EXCEPT_TEXT = 1077
RNGP_PAT_LIST_ATTR = 1078
RNGP_PAT_LIST_ELEM = 1079
RNGP_PAT_LIST_INTERLEAVE = 1080
RNGP_PAT_LIST_LIST = 1081
RNGP_PAT_LIST_REF = 1082
RNGP_PAT_LIST_TEXT = 1083
RNGP_PAT_NSNAME_EXCEPT_ANYNAME = 1084
RNGP_PAT_NSNAME_EXCEPT_NSNAME = 1085
RNGP_PAT_ONEMORE_GROUP_ATTR = 1086
RNGP_PAT_ONEMORE_INTERLEAVE_ATTR = 1087
RNGP_PAT_START_ATTR = 1088
RNGP_PAT_START_DATA = 1089
RNGP_PAT_START_EMPTY = 1090
RNGP_PAT_START_GROUP = 1091
RNGP_PAT_START_INTERLEAVE = 1092
RNGP_PAT_START_LIST = 1093
RNGP_PAT_START_ONEMORE = 1094
RNGP_PAT_START_TEXT = 1095
RNGP_PAT_START_VALUE = 1096
RNGP_PREFIX_UNDEFINED = 1097
RNGP_REF_CREATE_FAILED = 1098
RNGP_REF_CYCLE = 1099
RNGP_REF_NAME_INVALID = 1100
RNGP_REF_NOT_EMPTY = 1103
RNGP_REF_NO_DEF = 1101
RNGP_REF_NO_NAME = 1102
RNGP_START_CHOICE_AND_INTERLEAVE = 1104
RNGP_START_CONTENT = 1105
RNGP_START_EMPTY = 1106
RNGP_START_MISSING = 1107
RNGP_TEXT_EXPECTED = 1108
RNGP_TEXT_HAS_CHILD = 1109
RNGP_TYPE_MISSING = 1110
RNGP_TYPE_NOT_FOUND = 1111
RNGP_TYPE_VALUE = 1112
RNGP_UNKNOWN_ATTRIBUTE = 1113
RNGP_UNKNOWN_COMBINE = 1114
RNGP_UNKNOWN_CONSTRUCT = 1115
RNGP_UNKNOWN_TYPE_LIB = 1116
RNGP_URI_FRAGMENT = 1117
RNGP_URI_NOT_ABSOLUTE = 1118
RNGP_VALUE_EMPTY = 1119
RNGP_VALUE_NO_CONTENT = 1120
RNGP_XMLNS_NAME = 1121
RNGP_XML_NS = 1122
SAVE_CHAR_INVALID = 1401
SAVE_NOT_UTF8 = 1400
SAVE_NO_DOCTYPE = 1402
SAVE_UNKNOWN_ENCODING = 1403
SCHEMAP_AG_PROPS_CORRECT = 3087
SCHEMAP_ATTRFORMDEFAULT_VALUE = 1701
SCHEMAP_ATTRGRP_NONAME_NOREF = 1702
SCHEMAP_ATTR_NONAME_NOREF = 1703
SCHEMAP_AU_PROPS_CORRECT = 3089
SCHEMAP_AU_PROPS_CORRECT_2 = 3078
SCHEMAP_A_PROPS_CORRECT_2 = 3079
SCHEMAP_A_PROPS_CORRECT_3 = 3090
SCHEMAP_COMPLEXTYPE_NONAME_NOREF = 1704
SCHEMAP_COS_ALL_LIMITED = 3091
SCHEMAP_COS_CT_EXTENDS_1_1 = 3063
SCHEMAP_COS_CT_EXTENDS_1_2 = 3088
SCHEMAP_COS_CT_EXTENDS_1_3 = 1800
SCHEMAP_COS_ST_DERIVED_OK_2_1 = 3031
SCHEMAP_COS_ST_DERIVED_OK_2_2 = 3032
SCHEMAP_COS_ST_RESTRICTS_1_1 = 3011
SCHEMAP_COS_ST_RESTRICTS_1_2 = 3012
SCHEMAP_COS_ST_RESTRICTS_1_3_1 = 3013
SCHEMAP_COS_ST_RESTRICTS_1_3_2 = 3014
SCHEMAP_COS_ST_RESTRICTS_2_1 = 3015
SCHEMAP_COS_ST_RESTRICTS_2_3_1_1 = 3016
SCHEMAP_COS_ST_RESTRICTS_2_3_1_2 = 3017
SCHEMAP_COS_ST_RESTRICTS_2_3_2_1 = 3018
SCHEMAP_COS_ST_RESTRICTS_2_3_2_2 = 3019
SCHEMAP_COS_ST_RESTRICTS_2_3_2_3 = 3020
SCHEMAP_COS_ST_RESTRICTS_2_3_2_4 = 3021
SCHEMAP_COS_ST_RESTRICTS_2_3_2_5 = 3022
SCHEMAP_COS_ST_RESTRICTS_3_1 = 3023
SCHEMAP_COS_ST_RESTRICTS_3_3_1 = 3024
SCHEMAP_COS_ST_RESTRICTS_3_3_1_2 = 3025
SCHEMAP_COS_ST_RESTRICTS_3_3_2_1 = 3027
SCHEMAP_COS_ST_RESTRICTS_3_3_2_2 = 3026
SCHEMAP_COS_ST_RESTRICTS_3_3_2_3 = 3028
SCHEMAP_COS_ST_RESTRICTS_3_3_2_4 = 3029
SCHEMAP_COS_ST_RESTRICTS_3_3_2_5 = 3030
SCHEMAP_COS_VALID_DEFAULT_1 = 3058
SCHEMAP_COS_VALID_DEFAULT_2_1 = 3059
SCHEMAP_COS_VALID_DEFAULT_2_2_1 = 3060
SCHEMAP_COS_VALID_DEFAULT_2_2_2 = 3061
SCHEMAP_CT_PROPS_CORRECT_1 = 1782
SCHEMAP_CT_PROPS_CORRECT_2 = 1783
SCHEMAP_CT_PROPS_CORRECT_3 = 1784
SCHEMAP_CT_PROPS_CORRECT_4 = 1785
SCHEMAP_CT_PROPS_CORRECT_5 = 1786
SCHEMAP_CVC_SIMPLE_TYPE = 3062
SCHEMAP_C_PROPS_CORRECT = 3080
SCHEMAP_DEF_AND_PREFIX = 1768
SCHEMAP_DERIVATION_OK_RESTRICTION_1 = 1787
SCHEMAP_DERIVATION_OK_RESTRICTION_2_1_1 = 1788
SCHEMAP_DERIVATION_OK_RESTRICTION_2_1_2 = 1789
SCHEMAP_DERIVATION_OK_RESTRICTION_2_1_3 = 3077
SCHEMAP_DERIVATION_OK_RESTRICTION_2_2 = 1790
SCHEMAP_DERIVATION_OK_RESTRICTION_3 = 1791
SCHEMAP_DERIVATION_OK_RESTRICTION_4_1 = 1797
SCHEMAP_DERIVATION_OK_RESTRICTION_4_2 = 1798
SCHEMAP_DERIVATION_OK_RESTRICTION_4_3 = 1799
SCHEMAP_ELEMFORMDEFAULT_VALUE = 1705
SCHEMAP_ELEM_DEFAULT_FIXED = 1755
SCHEMAP_ELEM_NONAME_NOREF = 1706
SCHEMAP_EXTENSION_NO_BASE = 1707
SCHEMAP_E_PROPS_CORRECT_2 = 3045
SCHEMAP_E_PROPS_CORRECT_3 = 3046
SCHEMAP_E_PROPS_CORRECT_4 = 3047
SCHEMAP_E_PROPS_CORRECT_5 = 3048
SCHEMAP_E_PROPS_CORRECT_6 = 3049
SCHEMAP_FACET_NO_VALUE = 1708
SCHEMAP_FAILED_BUILD_IMPORT = 1709
SCHEMAP_FAILED_LOAD = 1757
SCHEMAP_FAILED_PARSE = 1766
SCHEMAP_GROUP_NONAME_NOREF = 1710
SCHEMAP_IMPORT_NAMESPACE_NOT_URI = 1711
SCHEMAP_IMPORT_REDEFINE_NSNAME = 1712
SCHEMAP_IMPORT_SCHEMA_NOT_URI = 1713
SCHEMAP_INCLUDE_SCHEMA_NOT_URI = 1770
SCHEMAP_INCLUDE_SCHEMA_NO_URI = 1771
SCHEMAP_INTERNAL = 3069
SCHEMAP_INTERSECTION_NOT_EXPRESSIBLE = 1793
SCHEMAP_INVALID_ATTR_COMBINATION = 1777
SCHEMAP_INVALID_ATTR_INLINE_COMBINATION = 1778
SCHEMAP_INVALID_ATTR_NAME = 1780
SCHEMAP_INVALID_ATTR_USE = 1774
SCHEMAP_INVALID_BOOLEAN = 1714
SCHEMAP_INVALID_ENUM = 1715
SCHEMAP_INVALID_FACET = 1716
SCHEMAP_INVALID_FACET_VALUE = 1717
SCHEMAP_INVALID_MAXOCCURS = 1718
SCHEMAP_INVALID_MINOCCURS = 1719
SCHEMAP_INVALID_REF_AND_SUBTYPE = 1720
SCHEMAP_INVALID_WHITE_SPACE = 1721
SCHEMAP_MG_PROPS_CORRECT_1 = 3074
SCHEMAP_MG_PROPS_CORRECT_2 = 3075
SCHEMAP_MISSING_SIMPLETYPE_CHILD = 1779
SCHEMAP_NOATTR_NOREF = 1722
SCHEMAP_NOROOT = 1759
SCHEMAP_NOTATION_NO_NAME = 1723
SCHEMAP_NOTHING_TO_PARSE = 1758
SCHEMAP_NOTYPE_NOREF = 1724
SCHEMAP_NOT_DETERMINISTIC = 3070
SCHEMAP_NOT_SCHEMA = 1772
SCHEMAP_NO_XMLNS = 3056
SCHEMAP_NO_XSI = 3057
SCHEMAP_PREFIX_UNDEFINED = 1700
SCHEMAP_P_PROPS_CORRECT_1 = 3042
SCHEMAP_P_PROPS_CORRECT_2_1 = 3043
SCHEMAP_P_PROPS_CORRECT_2_2 = 3044
SCHEMAP_RECURSIVE = 1775
SCHEMAP_REDEFINED_ATTR = 1764
SCHEMAP_REDEFINED_ATTRGROUP = 1763
SCHEMAP_REDEFINED_ELEMENT = 1762
SCHEMAP_REDEFINED_GROUP = 1760
SCHEMAP_REDEFINED_NOTATION = 1765
SCHEMAP_REDEFINED_TYPE = 1761
SCHEMAP_REF_AND_CONTENT = 1781
SCHEMAP_REF_AND_SUBTYPE = 1725
SCHEMAP_REGEXP_INVALID = 1756
SCHEMAP_RESTRICTION_NONAME_NOREF = 1726
SCHEMAP_S4S_ATTR_INVALID_VALUE = 3037
SCHEMAP_S4S_ATTR_MISSING = 3036
SCHEMAP_S4S_ATTR_NOT_ALLOWED = 3035
SCHEMAP_S4S_ELEM_MISSING = 3034
SCHEMAP_S4S_ELEM_NOT_ALLOWED = 3033
SCHEMAP_SIMPLETYPE_NONAME = 1727
SCHEMAP_SRC_ATTRIBUTE_1 = 3051
SCHEMAP_SRC_ATTRIBUTE_2 = 3052
SCHEMAP_SRC_ATTRIBUTE_3_1 = 3053
SCHEMAP_SRC_ATTRIBUTE_3_2 = 3054
SCHEMAP_SRC_ATTRIBUTE_4 = 3055
SCHEMAP_SRC_ATTRIBUTE_GROUP_1 = 3071
SCHEMAP_SRC_ATTRIBUTE_GROUP_2 = 3072
SCHEMAP_SRC_ATTRIBUTE_GROUP_3 = 3073
SCHEMAP_SRC_CT_1 = 3076
SCHEMAP_SRC_ELEMENT_1 = 3038
SCHEMAP_SRC_ELEMENT_2_1 = 3039
SCHEMAP_SRC_ELEMENT_2_2 = 3040
SCHEMAP_SRC_ELEMENT_3 = 3041
SCHEMAP_SRC_IMPORT = 3082
SCHEMAP_SRC_IMPORT_1_1 = 3064
SCHEMAP_SRC_IMPORT_1_2 = 3065
SCHEMAP_SRC_IMPORT_2 = 3066
SCHEMAP_SRC_IMPORT_2_1 = 3067
SCHEMAP_SRC_IMPORT_2_2 = 3068
SCHEMAP_SRC_IMPORT_3_1 = 1795
SCHEMAP_SRC_IMPORT_3_2 = 1796
SCHEMAP_SRC_INCLUDE = 3050
SCHEMAP_SRC_LIST_ITEMTYPE_OR_SIMPLETYPE = 3006
SCHEMAP_SRC_REDEFINE = 3081
SCHEMAP_SRC_RESOLVE = 3004
SCHEMAP_SRC_RESTRICTION_BASE_OR_SIMPLETYPE = 3005
SCHEMAP_SRC_SIMPLE_TYPE_1 = 3000
SCHEMAP_SRC_SIMPLE_TYPE_2 = 3001
SCHEMAP_SRC_SIMPLE_TYPE_3 = 3002
SCHEMAP_SRC_SIMPLE_TYPE_4 = 3003
SCHEMAP_SRC_UNION_MEMBERTYPES_OR_SIMPLETYPES = 3007
SCHEMAP_ST_PROPS_CORRECT_1 = 3008
SCHEMAP_ST_PROPS_CORRECT_2 = 3009
SCHEMAP_ST_PROPS_CORRECT_3 = 3010
SCHEMAP_SUPERNUMEROUS_LIST_ITEM_TYPE = 1776
SCHEMAP_TYPE_AND_SUBTYPE = 1728
SCHEMAP_UNION_NOT_EXPRESSIBLE = 1794
SCHEMAP_UNKNOWN_ALL_CHILD = 1729
SCHEMAP_UNKNOWN_ANYATTRIBUTE_CHILD = 1730
SCHEMAP_UNKNOWN_ATTRGRP_CHILD = 1732
SCHEMAP_UNKNOWN_ATTRIBUTE_GROUP = 1733
SCHEMAP_UNKNOWN_ATTR_CHILD = 1731
SCHEMAP_UNKNOWN_BASE_TYPE = 1734
SCHEMAP_UNKNOWN_CHOICE_CHILD = 1735
SCHEMAP_UNKNOWN_COMPLEXCONTENT_CHILD = 1736
SCHEMAP_UNKNOWN_COMPLEXTYPE_CHILD = 1737
SCHEMAP_UNKNOWN_ELEM_CHILD = 1738
SCHEMAP_UNKNOWN_EXTENSION_CHILD = 1739
SCHEMAP_UNKNOWN_FACET_CHILD = 1740
SCHEMAP_UNKNOWN_FACET_TYPE = 1741
SCHEMAP_UNKNOWN_GROUP_CHILD = 1742
SCHEMAP_UNKNOWN_IMPORT_CHILD = 1743
SCHEMAP_UNKNOWN_INCLUDE_CHILD = 1769
SCHEMAP_UNKNOWN_LIST_CHILD = 1744
SCHEMAP_UNKNOWN_MEMBER_TYPE = 1773
SCHEMAP_UNKNOWN_NOTATION_CHILD = 1745
SCHEMAP_UNKNOWN_PREFIX = 1767
SCHEMAP_UNKNOWN_PROCESSCONTENT_CHILD = 1746
SCHEMAP_UNKNOWN_REF = 1747
SCHEMAP_UNKNOWN_RESTRICTION_CHILD = 1748
SCHEMAP_UNKNOWN_SCHEMAS_CHILD = 1749
SCHEMAP_UNKNOWN_SEQUENCE_CHILD = 1750
SCHEMAP_UNKNOWN_SIMPLECONTENT_CHILD = 1751
SCHEMAP_UNKNOWN_SIMPLETYPE_CHILD = 1752
SCHEMAP_UNKNOWN_TYPE = 1753
SCHEMAP_UNKNOWN_UNION_CHILD = 1754
SCHEMAP_WARN_ATTR_POINTLESS_PROH = 3086
SCHEMAP_WARN_ATTR_REDECL_PROH = 3085
SCHEMAP_WARN_SKIP_SCHEMA = 3083
SCHEMAP_WARN_UNLOCATED_SCHEMA = 3084
SCHEMAP_WILDCARD_INVALID_NS_MEMBER = 1792
SCHEMATRONV_ASSERT = 4000
SCHEMATRONV_REPORT = 4001
SCHEMAV_ATTRINVALID = 1821
SCHEMAV_ATTRUNKNOWN = 1820
SCHEMAV_CONSTRUCT = 1817
SCHEMAV_CVC_ATTRIBUTE_1 = 1861
SCHEMAV_CVC_ATTRIBUTE_2 = 1862
SCHEMAV_CVC_ATTRIBUTE_3 = 1863
SCHEMAV_CVC_ATTRIBUTE_4 = 1864
SCHEMAV_CVC_AU = 1874
SCHEMAV_CVC_COMPLEX_TYPE_1 = 1873
SCHEMAV_CVC_COMPLEX_TYPE_2_1 = 1841
SCHEMAV_CVC_COMPLEX_TYPE_2_2 = 1842
SCHEMAV_CVC_COMPLEX_TYPE_2_3 = 1843
SCHEMAV_CVC_COMPLEX_TYPE_2_4 = 1844
SCHEMAV_CVC_COMPLEX_TYPE_3_1 = 1865
SCHEMAV_CVC_COMPLEX_TYPE_3_2_1 = 1866
SCHEMAV_CVC_COMPLEX_TYPE_3_2_2 = 1867
SCHEMAV_CVC_COMPLEX_TYPE_4 = 1868
SCHEMAV_CVC_COMPLEX_TYPE_5_1 = 1869
SCHEMAV_CVC_COMPLEX_TYPE_5_2 = 1870
SCHEMAV_CVC_DATATYPE_VALID_1_2_1 = 1824
SCHEMAV_CVC_DATATYPE_VALID_1_2_2 = 1825
SCHEMAV_CVC_DATATYPE_VALID_1_2_3 = 1826
SCHEMAV_CVC_ELT_1 = 1845
SCHEMAV_CVC_ELT_2 = 1846
SCHEMAV_CVC_ELT_3_1 = 1847
SCHEMAV_CVC_ELT_3_2_1 = 1848
SCHEMAV_CVC_ELT_3_2_2 = 1849
SCHEMAV_CVC_ELT_4_1 = 1850
SCHEMAV_CVC_ELT_4_2 = 1851
SCHEMAV_CVC_ELT_4_3 = 1852
SCHEMAV_CVC_ELT_5_1_1 = 1853
SCHEMAV_CVC_ELT_5_1_2 = 1854
SCHEMAV_CVC_ELT_5_2_1 = 1855
SCHEMAV_CVC_ELT_5_2_2_1 = 1856
SCHEMAV_CVC_ELT_5_2_2_2_1 = 1857
SCHEMAV_CVC_ELT_5_2_2_2_2 = 1858
SCHEMAV_CVC_ELT_6 = 1859
SCHEMAV_CVC_ELT_7 = 1860
SCHEMAV_CVC_ENUMERATION_VALID = 1840
SCHEMAV_CVC_FACET_VALID = 1829
SCHEMAV_CVC_FRACTIONDIGITS_VALID = 1838
SCHEMAV_CVC_IDC = 1877
SCHEMAV_CVC_LENGTH_VALID = 1830
SCHEMAV_CVC_MAXEXCLUSIVE_VALID = 1836
SCHEMAV_CVC_MAXINCLUSIVE_VALID = 1834
SCHEMAV_CVC_MAXLENGTH_VALID = 1832
SCHEMAV_CVC_MINEXCLUSIVE_VALID = 1835
SCHEMAV_CVC_MININCLUSIVE_VALID = 1833
SCHEMAV_CVC_MINLENGTH_VALID = 1831
SCHEMAV_CVC_PATTERN_VALID = 1839
SCHEMAV_CVC_TOTALDIGITS_VALID = 1837
SCHEMAV_CVC_TYPE_1 = 1875
SCHEMAV_CVC_TYPE_2 = 1876
SCHEMAV_CVC_TYPE_3_1_1 = 1827
SCHEMAV_CVC_TYPE_3_1_2 = 1828
SCHEMAV_CVC_WILDCARD = 1878
SCHEMAV_DOCUMENT_ELEMENT_MISSING = 1872
SCHEMAV_ELEMCONT = 1810
SCHEMAV_ELEMENT_CONTENT = 1871
SCHEMAV_EXTRACONTENT = 1813
SCHEMAV_FACET = 1823
SCHEMAV_HAVEDEFAULT = 1811
SCHEMAV_INTERNAL = 1818
SCHEMAV_INVALIDATTR = 1814
SCHEMAV_INVALIDELEM = 1815
SCHEMAV_ISABSTRACT = 1808
SCHEMAV_MISC = 1879
SCHEMAV_MISSING = 1804
SCHEMAV_NOROLLBACK = 1807
SCHEMAV_NOROOT = 1801
SCHEMAV_NOTDETERMINIST = 1816
SCHEMAV_NOTEMPTY = 1809
SCHEMAV_NOTNILLABLE = 1812
SCHEMAV_NOTSIMPLE = 1819
SCHEMAV_NOTTOPLEVEL = 1803
SCHEMAV_NOTYPE = 1806
SCHEMAV_UNDECLAREDELEM = 1802
SCHEMAV_VALUE = 1822
SCHEMAV_WRONGELEM = 1805
TREE_INVALID_DEC = 1301
TREE_INVALID_HEX = 1300
TREE_NOT_UTF8 = 1303
TREE_UNTERMINATED_ENTITY = 1302
WAR_CATALOG_PI = 93
WAR_ENTITY_REDEFINED = 107
WAR_LANG_VALUE = 98
WAR_NS_COLUMN = 106
WAR_NS_URI = 99
WAR_NS_URI_RELATIVE = 100
WAR_SPACE_VALUE = 102
WAR_UNDECLARED_ENTITY = 27
WAR_UNKNOWN_VERSION = 97
XINCLUDE_BUILD_FAILED = 1609
XINCLUDE_DEPRECATED_NS = 1617
XINCLUDE_ENTITY_DEF_MISMATCH = 1602
XINCLUDE_FALLBACKS_IN_INCLUDE = 1615
XINCLUDE_FALLBACK_NOT_IN_INCLUDE = 1616
XINCLUDE_FRAGMENT_ID = 1618
XINCLUDE_HREF_URI = 1605
XINCLUDE_INCLUDE_IN_INCLUDE = 1614
XINCLUDE_INVALID_CHAR = 1608
XINCLUDE_MULTIPLE_ROOT = 1611
XINCLUDE_NO_FALLBACK = 1604
XINCLUDE_NO_HREF = 1603
XINCLUDE_PARSE_VALUE = 1601
XINCLUDE_RECURSION = 1600
XINCLUDE_TEXT_DOCUMENT = 1607
XINCLUDE_TEXT_FRAGMENT = 1606
XINCLUDE_UNKNOWN_ENCODING = 1610
XINCLUDE_XPTR_FAILED = 1612
XINCLUDE_XPTR_RESULT = 1613
XPATH_ENCODING_ERROR = 1220
XPATH_EXPRESSION_OK = 1200
XPATH_EXPR_ERROR = 1207
XPATH_INVALID_ARITY = 1212
XPATH_INVALID_CHAR_ERROR = 1221
XPATH_INVALID_CTXT_POSITION = 1214
XPATH_INVALID_CTXT_SIZE = 1213
XPATH_INVALID_OPERAND = 1210
XPATH_INVALID_PREDICATE_ERROR = 1206
XPATH_INVALID_TYPE = 1211
XPATH_MEMORY_ERROR = 1215
XPATH_NUMBER_ERROR = 1201
XPATH_START_LITERAL_ERROR = 1203
XPATH_UNCLOSED_ERROR = 1208
XPATH_UNDEF_PREFIX_ERROR = 1219
XPATH_UNDEF_VARIABLE_ERROR = 1205
XPATH_UNFINISHED_LITERAL_ERROR = 1202
XPATH_UNKNOWN_FUNC_ERROR = 1209
XPATH_VARIABLE_REF_ERROR = 1204
XPTR_CHILDSEQ_START = 1901
XPTR_EVAL_FAILED = 1902
XPTR_EXTRA_OBJECTS = 1903
XPTR_RESOURCE_ERROR = 1217
XPTR_SUB_RESOURCE_ERROR = 1218
XPTR_SYNTAX_ERROR = 1216
XPTR_UNKNOWN_SCHEME = 1900
_names = {0: 'ERR_OK', 1: 'ERR_INTERNAL_ERROR', 2: 'ERR_NO_MEMORY', 3: 'ERR_DOCUMENT_START', 4: 'ERR_DOCUMENT_EMPTY', 5: 'ERR_DOCUMENT_END', 6: 'ERR_INVALID_HEX_CHARREF', 7: 'ERR_INVALID_DEC_CHARREF', 8: 'ERR_INVALID_CHARREF', 9: 'ERR_INVALID_CHAR', 10: 'ERR_CHARREF_AT_EOF', 11: 'ERR_CHARREF_IN_PROLOG', 12: 'ERR_CHARREF_IN_EPILOG', 13: 'ERR_CHARREF_IN_DTD', 14: 'ERR_ENTITYREF_AT_EOF', 15: 'ERR_ENTITYREF_IN_PROLOG', 16: 'ERR_ENTITYREF_IN_EPILOG', 17: 'ERR_ENTITYREF_IN_DTD', 18: 'ERR_PEREF_AT_EOF', 19: 'ERR_PEREF_IN_PROLOG', 20: 'ERR_PEREF_IN_EPILOG', 21: 'ERR_PEREF_IN_INT_SUBSET', 22: 'ERR_ENTITYREF_NO_NAME', 23: 'ERR_ENTITYREF_SEMICOL_MISSING', 24: 'ERR_PEREF_NO_NAME', 25: 'ERR_PEREF_SEMICOL_MISSING', 26: 'ERR_UNDECLARED_ENTITY', 27: 'WAR_UNDECLARED_ENTITY', 28: 'ERR_UNPARSED_ENTITY', 29: 'ERR_ENTITY_IS_EXTERNAL', 30: 'ERR_ENTITY_IS_PARAMETER', 31: 'ERR_UNKNOWN_ENCODING', 32: 'ERR_UNSUPPORTED_ENCODING', 33: 'ERR_STRING_NOT_STARTED', 34: 'ERR_STRING_NOT_CLOSED', 35: 'ERR_NS_DECL_ERROR', 36: 'ERR_ENTITY_NOT_STARTED', 37: 'ERR_ENTITY_NOT_FINISHED', 38: 'ERR_LT_IN_ATTRIBUTE', 39: 'ERR_ATTRIBUTE_NOT_STARTED', 40: 'ERR_ATTRIBUTE_NOT_FINISHED', 41: 'ERR_ATTRIBUTE_WITHOUT_VALUE', 42: 'ERR_ATTRIBUTE_REDEFINED', 43: 'ERR_LITERAL_NOT_STARTED', 44: 'ERR_LITERAL_NOT_FINISHED', 45: 'ERR_COMMENT_NOT_FINISHED', 46: 'ERR_PI_NOT_STARTED', 47: 'ERR_PI_NOT_FINISHED', 48: 'ERR_NOTATION_NOT_STARTED', 49: 'ERR_NOTATION_NOT_FINISHED', 50: 'ERR_ATTLIST_NOT_STARTED', 51: 'ERR_ATTLIST_NOT_FINISHED', 52: 'ERR_MIXED_NOT_STARTED', 53: 'ERR_MIXED_NOT_FINISHED', 54: 'ERR_ELEMCONTENT_NOT_STARTED', 55: 'ERR_ELEMCONTENT_NOT_FINISHED', 56: 'ERR_XMLDECL_NOT_STARTED', 57: 'ERR_XMLDECL_NOT_FINISHED', 58: 'ERR_CONDSEC_NOT_STARTED', 59: 'ERR_CONDSEC_NOT_FINISHED', 60: 'ERR_EXT_SUBSET_NOT_FINISHED', 61: 'ERR_DOCTYPE_NOT_FINISHED', 62: 'ERR_MISPLACED_CDATA_END', 63: 'ERR_CDATA_NOT_FINISHED', 64: 'ERR_RESERVED_XML_NAME', 65: 'ERR_SPACE_REQUIRED', 66: 'ERR_SEPARATOR_REQUIRED', 67: 'ERR_NMTOKEN_REQUIRED', 68: 
'ERR_NAME_REQUIRED', 69: 'ERR_PCDATA_REQUIRED', 70: 'ERR_URI_REQUIRED', 71: 'ERR_PUBID_REQUIRED', 72: 'ERR_LT_REQUIRED', 73: 'ERR_GT_REQUIRED', 74: 'ERR_LTSLASH_REQUIRED', 75: 'ERR_EQUAL_REQUIRED', 76: 'ERR_TAG_NAME_MISMATCH', 77: 'ERR_TAG_NOT_FINISHED', 78: 'ERR_STANDALONE_VALUE', 79: 'ERR_ENCODING_NAME', 80: 'ERR_HYPHEN_IN_COMMENT', 81: 'ERR_INVALID_ENCODING', 82: 'ERR_EXT_ENTITY_STANDALONE', 83: 'ERR_CONDSEC_INVALID', 84: 'ERR_VALUE_REQUIRED', 85: 'ERR_NOT_WELL_BALANCED', 86: 'ERR_EXTRA_CONTENT', 87: 'ERR_ENTITY_CHAR_ERROR', 88: 'ERR_ENTITY_PE_INTERNAL', 89: 'ERR_ENTITY_LOOP', 90: 'ERR_ENTITY_BOUNDARY', 91: 'ERR_INVALID_URI', 92: 'ERR_URI_FRAGMENT', 93: 'WAR_CATALOG_PI', 94: 'ERR_NO_DTD', 95: 'ERR_CONDSEC_INVALID_KEYWORD', 96: 'ERR_VERSION_MISSING', 97: 'WAR_UNKNOWN_VERSION', 98: 'WAR_LANG_VALUE', 99: 'WAR_NS_URI', 100: 'WAR_NS_URI_RELATIVE', 101: 'ERR_MISSING_ENCODING', 102: 'WAR_SPACE_VALUE', 103: 'ERR_NOT_STANDALONE', 104: 'ERR_ENTITY_PROCESSING', 105: 'ERR_NOTATION_PROCESSING', 106: 'WAR_NS_COLUMN', 107: 'WAR_ENTITY_REDEFINED', 108: 'ERR_UNKNOWN_VERSION', 109: 'ERR_VERSION_MISMATCH', 110: 'ERR_NAME_TOO_LONG', 111: 'ERR_USER_STOP', 112: 'ERR_COMMENT_ABRUPTLY_ENDED', 200: 'NS_ERR_XML_NAMESPACE', 201: 'NS_ERR_UNDEFINED_NAMESPACE', 202: 'NS_ERR_QNAME', 203: 'NS_ERR_ATTRIBUTE_REDEFINED', 204: 'NS_ERR_EMPTY', 205: 'NS_ERR_COLON', 500: 'DTD_ATTRIBUTE_DEFAULT', 501: 'DTD_ATTRIBUTE_REDEFINED', 502: 'DTD_ATTRIBUTE_VALUE', 503: 'DTD_CONTENT_ERROR', 504: 'DTD_CONTENT_MODEL', 505: 'DTD_CONTENT_NOT_DETERMINIST', 506: 'DTD_DIFFERENT_PREFIX', 507: 'DTD_ELEM_DEFAULT_NAMESPACE', 508: 'DTD_ELEM_NAMESPACE', 509: 'DTD_ELEM_REDEFINED', 510: 'DTD_EMPTY_NOTATION', 511: 'DTD_ENTITY_TYPE', 512: 'DTD_ID_FIXED', 513: 'DTD_ID_REDEFINED', 514: 'DTD_ID_SUBSET', 515: 'DTD_INVALID_CHILD', 516: 'DTD_INVALID_DEFAULT', 517: 'DTD_LOAD_ERROR', 518: 'DTD_MISSING_ATTRIBUTE', 519: 'DTD_MIXED_CORRUPT', 520: 'DTD_MULTIPLE_ID', 521: 'DTD_NO_DOC', 522: 'DTD_NO_DTD', 523: 'DTD_NO_ELEM_NAME', 524: 
'DTD_NO_PREFIX', 525: 'DTD_NO_ROOT', 526: 'DTD_NOTATION_REDEFINED', 527: 'DTD_NOTATION_VALUE', 528: 'DTD_NOT_EMPTY', 529: 'DTD_NOT_PCDATA', 530: 'DTD_NOT_STANDALONE', 531: 'DTD_ROOT_NAME', 532: 'DTD_STANDALONE_WHITE_SPACE', 533: 'DTD_UNKNOWN_ATTRIBUTE', 534: 'DTD_UNKNOWN_ELEM', 535: 'DTD_UNKNOWN_ENTITY', 536: 'DTD_UNKNOWN_ID', 537: 'DTD_UNKNOWN_NOTATION', 538: 'DTD_STANDALONE_DEFAULTED', 539: 'DTD_XMLID_VALUE', 540: 'DTD_XMLID_TYPE', 541: 'DTD_DUP_TOKEN', 800: 'HTML_STRUCURE_ERROR', 801: 'HTML_UNKNOWN_TAG', 1000: 'RNGP_ANYNAME_ATTR_ANCESTOR', 1001: 'RNGP_ATTR_CONFLICT', 1002: 'RNGP_ATTRIBUTE_CHILDREN', 1003: 'RNGP_ATTRIBUTE_CONTENT', 1004: 'RNGP_ATTRIBUTE_EMPTY', 1005: 'RNGP_ATTRIBUTE_NOOP', 1006: 'RNGP_CHOICE_CONTENT', 1007: 'RNGP_CHOICE_EMPTY', 1008: 'RNGP_CREATE_FAILURE', 1009: 'RNGP_DATA_CONTENT', 1010: 'RNGP_DEF_CHOICE_AND_INTERLEAVE', 1011: 'RNGP_DEFINE_CREATE_FAILED', 1012: 'RNGP_DEFINE_EMPTY', 1013: 'RNGP_DEFINE_MISSING', 1014: 'RNGP_DEFINE_NAME_MISSING', 1015: 'RNGP_ELEM_CONTENT_EMPTY', 1016: 'RNGP_ELEM_CONTENT_ERROR', 1017: 'RNGP_ELEMENT_EMPTY', 1018: 'RNGP_ELEMENT_CONTENT', 1019: 'RNGP_ELEMENT_NAME', 1020: 'RNGP_ELEMENT_NO_CONTENT', 1021: 'RNGP_ELEM_TEXT_CONFLICT', 1022: 'RNGP_EMPTY', 1023: 'RNGP_EMPTY_CONSTRUCT', 1024: 'RNGP_EMPTY_CONTENT', 1025: 'RNGP_EMPTY_NOT_EMPTY', 1026: 'RNGP_ERROR_TYPE_LIB', 1027: 'RNGP_EXCEPT_EMPTY', 1028: 'RNGP_EXCEPT_MISSING', 1029: 'RNGP_EXCEPT_MULTIPLE', 1030: 'RNGP_EXCEPT_NO_CONTENT', 1031: 'RNGP_EXTERNALREF_EMTPY', 1032: 'RNGP_EXTERNAL_REF_FAILURE', 1033: 'RNGP_EXTERNALREF_RECURSE', 1034: 'RNGP_FORBIDDEN_ATTRIBUTE', 1035: 'RNGP_FOREIGN_ELEMENT', 1036: 'RNGP_GRAMMAR_CONTENT', 1037: 'RNGP_GRAMMAR_EMPTY', 1038: 'RNGP_GRAMMAR_MISSING', 1039: 'RNGP_GRAMMAR_NO_START', 1040: 'RNGP_GROUP_ATTR_CONFLICT', 1041: 'RNGP_HREF_ERROR', 1042: 'RNGP_INCLUDE_EMPTY', 1043: 'RNGP_INCLUDE_FAILURE', 1044: 'RNGP_INCLUDE_RECURSE', 1045: 'RNGP_INTERLEAVE_ADD', 1046: 'RNGP_INTERLEAVE_CREATE_FAILED', 1047: 'RNGP_INTERLEAVE_EMPTY', 1048: 
'RNGP_INTERLEAVE_NO_CONTENT', 1049: 'RNGP_INVALID_DEFINE_NAME', 1050: 'RNGP_INVALID_URI', 1051: 'RNGP_INVALID_VALUE', 1052: 'RNGP_MISSING_HREF', 1053: 'RNGP_NAME_MISSING', 1054: 'RNGP_NEED_COMBINE', 1055: 'RNGP_NOTALLOWED_NOT_EMPTY', 1056: 'RNGP_NSNAME_ATTR_ANCESTOR', 1057: 'RNGP_NSNAME_NO_NS', 1058: 'RNGP_PARAM_FORBIDDEN', 1059: 'RNGP_PARAM_NAME_MISSING', 1060: 'RNGP_PARENTREF_CREATE_FAILED', 1061: 'RNGP_PARENTREF_NAME_INVALID', 1062: 'RNGP_PARENTREF_NO_NAME', 1063: 'RNGP_PARENTREF_NO_PARENT', 1064: 'RNGP_PARENTREF_NOT_EMPTY', 1065: 'RNGP_PARSE_ERROR', 1066: 'RNGP_PAT_ANYNAME_EXCEPT_ANYNAME', 1067: 'RNGP_PAT_ATTR_ATTR', 1068: 'RNGP_PAT_ATTR_ELEM', 1069: 'RNGP_PAT_DATA_EXCEPT_ATTR', 1070: 'RNGP_PAT_DATA_EXCEPT_ELEM', 1071: 'RNGP_PAT_DATA_EXCEPT_EMPTY', 1072: 'RNGP_PAT_DATA_EXCEPT_GROUP', 1073: 'RNGP_PAT_DATA_EXCEPT_INTERLEAVE', 1074: 'RNGP_PAT_DATA_EXCEPT_LIST', 1075: 'RNGP_PAT_DATA_EXCEPT_ONEMORE', 1076: 'RNGP_PAT_DATA_EXCEPT_REF', 1077: 'RNGP_PAT_DATA_EXCEPT_TEXT', 1078: 'RNGP_PAT_LIST_ATTR', 1079: 'RNGP_PAT_LIST_ELEM', 1080: 'RNGP_PAT_LIST_INTERLEAVE', 1081: 'RNGP_PAT_LIST_LIST', 1082: 'RNGP_PAT_LIST_REF', 1083: 'RNGP_PAT_LIST_TEXT', 1084: 'RNGP_PAT_NSNAME_EXCEPT_ANYNAME', 1085: 'RNGP_PAT_NSNAME_EXCEPT_NSNAME', 1086: 'RNGP_PAT_ONEMORE_GROUP_ATTR', 1087: 'RNGP_PAT_ONEMORE_INTERLEAVE_ATTR', 1088: 'RNGP_PAT_START_ATTR', 1089: 'RNGP_PAT_START_DATA', 1090: 'RNGP_PAT_START_EMPTY', 1091: 'RNGP_PAT_START_GROUP', 1092: 'RNGP_PAT_START_INTERLEAVE', 1093: 'RNGP_PAT_START_LIST', 1094: 'RNGP_PAT_START_ONEMORE', 1095: 'RNGP_PAT_START_TEXT', 1096: 'RNGP_PAT_START_VALUE', 1097: 'RNGP_PREFIX_UNDEFINED', 1098: 'RNGP_REF_CREATE_FAILED', 1099: 'RNGP_REF_CYCLE', 1100: 'RNGP_REF_NAME_INVALID', 1101: 'RNGP_REF_NO_DEF', 1102: 'RNGP_REF_NO_NAME', 1103: 'RNGP_REF_NOT_EMPTY', 1104: 'RNGP_START_CHOICE_AND_INTERLEAVE', 1105: 'RNGP_START_CONTENT', 1106: 'RNGP_START_EMPTY', 1107: 'RNGP_START_MISSING', 1108: 'RNGP_TEXT_EXPECTED', 1109: 'RNGP_TEXT_HAS_CHILD', 1110: 'RNGP_TYPE_MISSING', 1111: 
'RNGP_TYPE_NOT_FOUND', 1112: 'RNGP_TYPE_VALUE', 1113: 'RNGP_UNKNOWN_ATTRIBUTE', 1114: 'RNGP_UNKNOWN_COMBINE', 1115: 'RNGP_UNKNOWN_CONSTRUCT', 1116: 'RNGP_UNKNOWN_TYPE_LIB', 1117: 'RNGP_URI_FRAGMENT', 1118: 'RNGP_URI_NOT_ABSOLUTE', 1119: 'RNGP_VALUE_EMPTY', 1120: 'RNGP_VALUE_NO_CONTENT', 1121: 'RNGP_XMLNS_NAME', 1122: 'RNGP_XML_NS', 1200: 'XPATH_EXPRESSION_OK', 1201: 'XPATH_NUMBER_ERROR', 1202: 'XPATH_UNFINISHED_LITERAL_ERROR', 1203: 'XPATH_START_LITERAL_ERROR', 1204: 'XPATH_VARIABLE_REF_ERROR', 1205: 'XPATH_UNDEF_VARIABLE_ERROR', 1206: 'XPATH_INVALID_PREDICATE_ERROR', 1207: 'XPATH_EXPR_ERROR', 1208: 'XPATH_UNCLOSED_ERROR', 1209: 'XPATH_UNKNOWN_FUNC_ERROR', 1210: 'XPATH_INVALID_OPERAND', 1211: 'XPATH_INVALID_TYPE', 1212: 'XPATH_INVALID_ARITY', 1213: 'XPATH_INVALID_CTXT_SIZE', 1214: 'XPATH_INVALID_CTXT_POSITION', 1215: 'XPATH_MEMORY_ERROR', 1216: 'XPTR_SYNTAX_ERROR', 1217: 'XPTR_RESOURCE_ERROR', 1218: 'XPTR_SUB_RESOURCE_ERROR', 1219: 'XPATH_UNDEF_PREFIX_ERROR', 1220: 'XPATH_ENCODING_ERROR', 1221: 'XPATH_INVALID_CHAR_ERROR', 1300: 'TREE_INVALID_HEX', 1301: 'TREE_INVALID_DEC', 1302: 'TREE_UNTERMINATED_ENTITY', 1303: 'TREE_NOT_UTF8', 1400: 'SAVE_NOT_UTF8', 1401: 'SAVE_CHAR_INVALID', 1402: 'SAVE_NO_DOCTYPE', 1403: 'SAVE_UNKNOWN_ENCODING', 1450: 'REGEXP_COMPILE_ERROR', 1500: 'IO_UNKNOWN', 1501: 'IO_EACCES', 1502: 'IO_EAGAIN', 1503: 'IO_EBADF', 1504: 'IO_EBADMSG', 1505: 'IO_EBUSY', 1506: 'IO_ECANCELED', 1507: 'IO_ECHILD', 1508: 'IO_EDEADLK', 1509: 'IO_EDOM', 1510: 'IO_EEXIST', 1511: 'IO_EFAULT', 1512: 'IO_EFBIG', 1513: 'IO_EINPROGRESS', 1514: 'IO_EINTR', 1515: 'IO_EINVAL', 1516: 'IO_EIO', 1517: 'IO_EISDIR', 1518: 'IO_EMFILE', 1519: 'IO_EMLINK', 1520: 'IO_EMSGSIZE', 1521: 'IO_ENAMETOOLONG', 1522: 'IO_ENFILE', 1523: 'IO_ENODEV', 1524: 'IO_ENOENT', 1525: 'IO_ENOEXEC', 1526: 'IO_ENOLCK', 1527: 'IO_ENOMEM', 1528: 'IO_ENOSPC', 1529: 'IO_ENOSYS', 1530: 'IO_ENOTDIR', 1531: 'IO_ENOTEMPTY', 1532: 'IO_ENOTSUP', 1533: 'IO_ENOTTY', 1534: 'IO_ENXIO', 1535: 'IO_EPERM', 1536: 'IO_EPIPE', 
1537: 'IO_ERANGE', 1538: 'IO_EROFS', 1539: 'IO_ESPIPE', 1540: 'IO_ESRCH', 1541: 'IO_ETIMEDOUT', 1542: 'IO_EXDEV', 1543: 'IO_NETWORK_ATTEMPT', 1544: 'IO_ENCODER', 1545: 'IO_FLUSH', 1546: 'IO_WRITE', 1547: 'IO_NO_INPUT', 1548: 'IO_BUFFER_FULL', 1549: 'IO_LOAD_ERROR', 1550: 'IO_ENOTSOCK', 1551: 'IO_EISCONN', 1552: 'IO_ECONNREFUSED', 1553: 'IO_ENETUNREACH', 1554: 'IO_EADDRINUSE', 1555: 'IO_EALREADY', 1556: 'IO_EAFNOSUPPORT', 1600: 'XINCLUDE_RECURSION', 1601: 'XINCLUDE_PARSE_VALUE', 1602: 'XINCLUDE_ENTITY_DEF_MISMATCH', 1603: 'XINCLUDE_NO_HREF', 1604: 'XINCLUDE_NO_FALLBACK', 1605: 'XINCLUDE_HREF_URI', 1606: 'XINCLUDE_TEXT_FRAGMENT', 1607: 'XINCLUDE_TEXT_DOCUMENT', 1608: 'XINCLUDE_INVALID_CHAR', 1609: 'XINCLUDE_BUILD_FAILED', 1610: 'XINCLUDE_UNKNOWN_ENCODING', 1611: 'XINCLUDE_MULTIPLE_ROOT', 1612: 'XINCLUDE_XPTR_FAILED', 1613: 'XINCLUDE_XPTR_RESULT', 1614: 'XINCLUDE_INCLUDE_IN_INCLUDE', 1615: 'XINCLUDE_FALLBACKS_IN_INCLUDE', 1616: 'XINCLUDE_FALLBACK_NOT_IN_INCLUDE', 1617: 'XINCLUDE_DEPRECATED_NS', 1618: 'XINCLUDE_FRAGMENT_ID', 1650: 'CATALOG_MISSING_ATTR', 1651: 'CATALOG_ENTRY_BROKEN', 1652: 'CATALOG_PREFER_VALUE', 1653: 'CATALOG_NOT_CATALOG', 1654: 'CATALOG_RECURSION', 1700: 'SCHEMAP_PREFIX_UNDEFINED', 1701: 'SCHEMAP_ATTRFORMDEFAULT_VALUE', 1702: 'SCHEMAP_ATTRGRP_NONAME_NOREF', 1703: 'SCHEMAP_ATTR_NONAME_NOREF', 1704: 'SCHEMAP_COMPLEXTYPE_NONAME_NOREF', 1705: 'SCHEMAP_ELEMFORMDEFAULT_VALUE', 1706: 'SCHEMAP_ELEM_NONAME_NOREF', 1707: 'SCHEMAP_EXTENSION_NO_BASE', 1708: 'SCHEMAP_FACET_NO_VALUE', 1709: 'SCHEMAP_FAILED_BUILD_IMPORT', 1710: 'SCHEMAP_GROUP_NONAME_NOREF', 1711: 'SCHEMAP_IMPORT_NAMESPACE_NOT_URI', 1712: 'SCHEMAP_IMPORT_REDEFINE_NSNAME', 1713: 'SCHEMAP_IMPORT_SCHEMA_NOT_URI', 1714: 'SCHEMAP_INVALID_BOOLEAN', 1715: 'SCHEMAP_INVALID_ENUM', 1716: 'SCHEMAP_INVALID_FACET', 1717: 'SCHEMAP_INVALID_FACET_VALUE', 1718: 'SCHEMAP_INVALID_MAXOCCURS', 1719: 'SCHEMAP_INVALID_MINOCCURS', 1720: 'SCHEMAP_INVALID_REF_AND_SUBTYPE', 1721: 'SCHEMAP_INVALID_WHITE_SPACE', 1722: 
'SCHEMAP_NOATTR_NOREF', 1723: 'SCHEMAP_NOTATION_NO_NAME', 1724: 'SCHEMAP_NOTYPE_NOREF', 1725: 'SCHEMAP_REF_AND_SUBTYPE', 1726: 'SCHEMAP_RESTRICTION_NONAME_NOREF', 1727: 'SCHEMAP_SIMPLETYPE_NONAME', 1728: 'SCHEMAP_TYPE_AND_SUBTYPE', 1729: 'SCHEMAP_UNKNOWN_ALL_CHILD', 1730: 'SCHEMAP_UNKNOWN_ANYATTRIBUTE_CHILD', 1731: 'SCHEMAP_UNKNOWN_ATTR_CHILD', 1732: 'SCHEMAP_UNKNOWN_ATTRGRP_CHILD', 1733: 'SCHEMAP_UNKNOWN_ATTRIBUTE_GROUP', 1734: 'SCHEMAP_UNKNOWN_BASE_TYPE', 1735: 'SCHEMAP_UNKNOWN_CHOICE_CHILD', 1736: 'SCHEMAP_UNKNOWN_COMPLEXCONTENT_CHILD', 1737: 'SCHEMAP_UNKNOWN_COMPLEXTYPE_CHILD', 1738: 'SCHEMAP_UNKNOWN_ELEM_CHILD', 1739: 'SCHEMAP_UNKNOWN_EXTENSION_CHILD', 1740: 'SCHEMAP_UNKNOWN_FACET_CHILD', 1741: 'SCHEMAP_UNKNOWN_FACET_TYPE', 1742: 'SCHEMAP_UNKNOWN_GROUP_CHILD', 1743: 'SCHEMAP_UNKNOWN_IMPORT_CHILD', 1744: 'SCHEMAP_UNKNOWN_LIST_CHILD', 1745: 'SCHEMAP_UNKNOWN_NOTATION_CHILD', 1746: 'SCHEMAP_UNKNOWN_PROCESSCONTENT_CHILD', 1747: 'SCHEMAP_UNKNOWN_REF', 1748: 'SCHEMAP_UNKNOWN_RESTRICTION_CHILD', 1749: 'SCHEMAP_UNKNOWN_SCHEMAS_CHILD', 1750: 'SCHEMAP_UNKNOWN_SEQUENCE_CHILD', 1751: 'SCHEMAP_UNKNOWN_SIMPLECONTENT_CHILD', 1752: 'SCHEMAP_UNKNOWN_SIMPLETYPE_CHILD', 1753: 'SCHEMAP_UNKNOWN_TYPE', 1754: 'SCHEMAP_UNKNOWN_UNION_CHILD', 1755: 'SCHEMAP_ELEM_DEFAULT_FIXED', 1756: 'SCHEMAP_REGEXP_INVALID', 1757: 'SCHEMAP_FAILED_LOAD', 1758: 'SCHEMAP_NOTHING_TO_PARSE', 1759: 'SCHEMAP_NOROOT', 1760: 'SCHEMAP_REDEFINED_GROUP', 1761: 'SCHEMAP_REDEFINED_TYPE', 1762: 'SCHEMAP_REDEFINED_ELEMENT', 1763: 'SCHEMAP_REDEFINED_ATTRGROUP', 1764: 'SCHEMAP_REDEFINED_ATTR', 1765: 'SCHEMAP_REDEFINED_NOTATION', 1766: 'SCHEMAP_FAILED_PARSE', 1767: 'SCHEMAP_UNKNOWN_PREFIX', 1768: 'SCHEMAP_DEF_AND_PREFIX', 1769: 'SCHEMAP_UNKNOWN_INCLUDE_CHILD', 1770: 'SCHEMAP_INCLUDE_SCHEMA_NOT_URI', 1771: 'SCHEMAP_INCLUDE_SCHEMA_NO_URI', 1772: 'SCHEMAP_NOT_SCHEMA', 1773: 'SCHEMAP_UNKNOWN_MEMBER_TYPE', 1774: 'SCHEMAP_INVALID_ATTR_USE', 1775: 'SCHEMAP_RECURSIVE', 1776: 'SCHEMAP_SUPERNUMEROUS_LIST_ITEM_TYPE', 1777: 
'SCHEMAP_INVALID_ATTR_COMBINATION', 1778: 'SCHEMAP_INVALID_ATTR_INLINE_COMBINATION', 1779: 'SCHEMAP_MISSING_SIMPLETYPE_CHILD', 1780: 'SCHEMAP_INVALID_ATTR_NAME', 1781: 'SCHEMAP_REF_AND_CONTENT', 1782: 'SCHEMAP_CT_PROPS_CORRECT_1', 1783: 'SCHEMAP_CT_PROPS_CORRECT_2', 1784: 'SCHEMAP_CT_PROPS_CORRECT_3', 1785: 'SCHEMAP_CT_PROPS_CORRECT_4', 1786: 'SCHEMAP_CT_PROPS_CORRECT_5', 1787: 'SCHEMAP_DERIVATION_OK_RESTRICTION_1', 1788: 'SCHEMAP_DERIVATION_OK_RESTRICTION_2_1_1', 1789: 'SCHEMAP_DERIVATION_OK_RESTRICTION_2_1_2', 1790: 'SCHEMAP_DERIVATION_OK_RESTRICTION_2_2', 1791: 'SCHEMAP_DERIVATION_OK_RESTRICTION_3', 1792: 'SCHEMAP_WILDCARD_INVALID_NS_MEMBER', 1793: 'SCHEMAP_INTERSECTION_NOT_EXPRESSIBLE', 1794: 'SCHEMAP_UNION_NOT_EXPRESSIBLE', 1795: 'SCHEMAP_SRC_IMPORT_3_1', 1796: 'SCHEMAP_SRC_IMPORT_3_2', 1797: 'SCHEMAP_DERIVATION_OK_RESTRICTION_4_1', 1798: 'SCHEMAP_DERIVATION_OK_RESTRICTION_4_2', 1799: 'SCHEMAP_DERIVATION_OK_RESTRICTION_4_3', 1800: 'SCHEMAP_COS_CT_EXTENDS_1_3', 1801: 'SCHEMAV_NOROOT', 1802: 'SCHEMAV_UNDECLAREDELEM', 1803: 'SCHEMAV_NOTTOPLEVEL', 1804: 'SCHEMAV_MISSING', 1805: 'SCHEMAV_WRONGELEM', 1806: 'SCHEMAV_NOTYPE', 1807: 'SCHEMAV_NOROLLBACK', 1808: 'SCHEMAV_ISABSTRACT', 1809: 'SCHEMAV_NOTEMPTY', 1810: 'SCHEMAV_ELEMCONT', 1811: 'SCHEMAV_HAVEDEFAULT', 1812: 'SCHEMAV_NOTNILLABLE', 1813: 'SCHEMAV_EXTRACONTENT', 1814: 'SCHEMAV_INVALIDATTR', 1815: 'SCHEMAV_INVALIDELEM', 1816: 'SCHEMAV_NOTDETERMINIST', 1817: 'SCHEMAV_CONSTRUCT', 1818: 'SCHEMAV_INTERNAL', 1819: 'SCHEMAV_NOTSIMPLE', 1820: 'SCHEMAV_ATTRUNKNOWN', 1821: 'SCHEMAV_ATTRINVALID', 1822: 'SCHEMAV_VALUE', 1823: 'SCHEMAV_FACET', 1824: 'SCHEMAV_CVC_DATATYPE_VALID_1_2_1', 1825: 'SCHEMAV_CVC_DATATYPE_VALID_1_2_2', 1826: 'SCHEMAV_CVC_DATATYPE_VALID_1_2_3', 1827: 'SCHEMAV_CVC_TYPE_3_1_1', 1828: 'SCHEMAV_CVC_TYPE_3_1_2', 1829: 'SCHEMAV_CVC_FACET_VALID', 1830: 'SCHEMAV_CVC_LENGTH_VALID', 1831: 'SCHEMAV_CVC_MINLENGTH_VALID', 1832: 'SCHEMAV_CVC_MAXLENGTH_VALID', 1833: 'SCHEMAV_CVC_MININCLUSIVE_VALID', 1834: 
'SCHEMAV_CVC_MAXINCLUSIVE_VALID', 1835: 'SCHEMAV_CVC_MINEXCLUSIVE_VALID', 1836: 'SCHEMAV_CVC_MAXEXCLUSIVE_VALID', 1837: 'SCHEMAV_CVC_TOTALDIGITS_VALID', 1838: 'SCHEMAV_CVC_FRACTIONDIGITS_VALID', 1839: 'SCHEMAV_CVC_PATTERN_VALID', 1840: 'SCHEMAV_CVC_ENUMERATION_VALID', 1841: 'SCHEMAV_CVC_COMPLEX_TYPE_2_1', 1842: 'SCHEMAV_CVC_COMPLEX_TYPE_2_2', 1843: 'SCHEMAV_CVC_COMPLEX_TYPE_2_3', 1844: 'SCHEMAV_CVC_COMPLEX_TYPE_2_4', 1845: 'SCHEMAV_CVC_ELT_1', 1846: 'SCHEMAV_CVC_ELT_2', 1847: 'SCHEMAV_CVC_ELT_3_1', 1848: 'SCHEMAV_CVC_ELT_3_2_1', 1849: 'SCHEMAV_CVC_ELT_3_2_2', 1850: 'SCHEMAV_CVC_ELT_4_1', 1851: 'SCHEMAV_CVC_ELT_4_2', 1852: 'SCHEMAV_CVC_ELT_4_3', 1853: 'SCHEMAV_CVC_ELT_5_1_1', 1854: 'SCHEMAV_CVC_ELT_5_1_2', 1855: 'SCHEMAV_CVC_ELT_5_2_1', 1856: 'SCHEMAV_CVC_ELT_5_2_2_1', 1857: 'SCHEMAV_CVC_ELT_5_2_2_2_1', 1858: 'SCHEMAV_CVC_ELT_5_2_2_2_2', 1859: 'SCHEMAV_CVC_ELT_6', 1860: 'SCHEMAV_CVC_ELT_7', 1861: 'SCHEMAV_CVC_ATTRIBUTE_1', 1862: 'SCHEMAV_CVC_ATTRIBUTE_2', 1863: 'SCHEMAV_CVC_ATTRIBUTE_3', 1864: 'SCHEMAV_CVC_ATTRIBUTE_4', 1865: 'SCHEMAV_CVC_COMPLEX_TYPE_3_1', 1866: 'SCHEMAV_CVC_COMPLEX_TYPE_3_2_1', 1867: 'SCHEMAV_CVC_COMPLEX_TYPE_3_2_2', 1868: 'SCHEMAV_CVC_COMPLEX_TYPE_4', 1869: 'SCHEMAV_CVC_COMPLEX_TYPE_5_1', 1870: 'SCHEMAV_CVC_COMPLEX_TYPE_5_2', 1871: 'SCHEMAV_ELEMENT_CONTENT', 1872: 'SCHEMAV_DOCUMENT_ELEMENT_MISSING', 1873: 'SCHEMAV_CVC_COMPLEX_TYPE_1', 1874: 'SCHEMAV_CVC_AU', 1875: 'SCHEMAV_CVC_TYPE_1', 1876: 'SCHEMAV_CVC_TYPE_2', 1877: 'SCHEMAV_CVC_IDC', 1878: 'SCHEMAV_CVC_WILDCARD', 1879: 'SCHEMAV_MISC', 1900: 'XPTR_UNKNOWN_SCHEME', 1901: 'XPTR_CHILDSEQ_START', 1902: 'XPTR_EVAL_FAILED', 1903: 'XPTR_EXTRA_OBJECTS', 1950: 'C14N_CREATE_CTXT', 1951: 'C14N_REQUIRES_UTF8', 1952: 'C14N_CREATE_STACK', 1953: 'C14N_INVALID_NODE', 1954: 'C14N_UNKNOW_NODE', 1955: 'C14N_RELATIVE_NAMESPACE', 2000: 'FTP_PASV_ANSWER', 2001: 'FTP_EPSV_ANSWER', 2002: 'FTP_ACCNT', 2003: 'FTP_URL_SYNTAX', 2020: 'HTTP_URL_SYNTAX', 2021: 'HTTP_USE_IP', 2022: 'HTTP_UNKNOWN_HOST', 3000: 
'SCHEMAP_SRC_SIMPLE_TYPE_1', 3001: 'SCHEMAP_SRC_SIMPLE_TYPE_2', 3002: 'SCHEMAP_SRC_SIMPLE_TYPE_3', 3003: 'SCHEMAP_SRC_SIMPLE_TYPE_4', 3004: 'SCHEMAP_SRC_RESOLVE', 3005: 'SCHEMAP_SRC_RESTRICTION_BASE_OR_SIMPLETYPE', 3006: 'SCHEMAP_SRC_LIST_ITEMTYPE_OR_SIMPLETYPE', 3007: 'SCHEMAP_SRC_UNION_MEMBERTYPES_OR_SIMPLETYPES', 3008: 'SCHEMAP_ST_PROPS_CORRECT_1', 3009: 'SCHEMAP_ST_PROPS_CORRECT_2', 3010: 'SCHEMAP_ST_PROPS_CORRECT_3', 3011: 'SCHEMAP_COS_ST_RESTRICTS_1_1', 3012: 'SCHEMAP_COS_ST_RESTRICTS_1_2', 3013: 'SCHEMAP_COS_ST_RESTRICTS_1_3_1', 3014: 'SCHEMAP_COS_ST_RESTRICTS_1_3_2', 3015: 'SCHEMAP_COS_ST_RESTRICTS_2_1', 3016: 'SCHEMAP_COS_ST_RESTRICTS_2_3_1_1', 3017: 'SCHEMAP_COS_ST_RESTRICTS_2_3_1_2', 3018: 'SCHEMAP_COS_ST_RESTRICTS_2_3_2_1', 3019: 'SCHEMAP_COS_ST_RESTRICTS_2_3_2_2', 3020: 'SCHEMAP_COS_ST_RESTRICTS_2_3_2_3', 3021: 'SCHEMAP_COS_ST_RESTRICTS_2_3_2_4', 3022: 'SCHEMAP_COS_ST_RESTRICTS_2_3_2_5', 3023: 'SCHEMAP_COS_ST_RESTRICTS_3_1', 3024: 'SCHEMAP_COS_ST_RESTRICTS_3_3_1', 3025: 'SCHEMAP_COS_ST_RESTRICTS_3_3_1_2', 3026: 'SCHEMAP_COS_ST_RESTRICTS_3_3_2_2', 3027: 'SCHEMAP_COS_ST_RESTRICTS_3_3_2_1', 3028: 'SCHEMAP_COS_ST_RESTRICTS_3_3_2_3', 3029: 'SCHEMAP_COS_ST_RESTRICTS_3_3_2_4', 3030: 'SCHEMAP_COS_ST_RESTRICTS_3_3_2_5', 3031: 'SCHEMAP_COS_ST_DERIVED_OK_2_1', 3032: 'SCHEMAP_COS_ST_DERIVED_OK_2_2', 3033: 'SCHEMAP_S4S_ELEM_NOT_ALLOWED', 3034: 'SCHEMAP_S4S_ELEM_MISSING', 3035: 'SCHEMAP_S4S_ATTR_NOT_ALLOWED', 3036: 'SCHEMAP_S4S_ATTR_MISSING', 3037: 'SCHEMAP_S4S_ATTR_INVALID_VALUE', 3038: 'SCHEMAP_SRC_ELEMENT_1', 3039: 'SCHEMAP_SRC_ELEMENT_2_1', 3040: 'SCHEMAP_SRC_ELEMENT_2_2', 3041: 'SCHEMAP_SRC_ELEMENT_3', 3042: 'SCHEMAP_P_PROPS_CORRECT_1', 3043: 'SCHEMAP_P_PROPS_CORRECT_2_1', 3044: 'SCHEMAP_P_PROPS_CORRECT_2_2', 3045: 'SCHEMAP_E_PROPS_CORRECT_2', 3046: 'SCHEMAP_E_PROPS_CORRECT_3', 3047: 'SCHEMAP_E_PROPS_CORRECT_4', 3048: 'SCHEMAP_E_PROPS_CORRECT_5', 3049: 'SCHEMAP_E_PROPS_CORRECT_6', 3050: 'SCHEMAP_SRC_INCLUDE', 3051: 'SCHEMAP_SRC_ATTRIBUTE_1', 3052: 
'SCHEMAP_SRC_ATTRIBUTE_2', 3053: 'SCHEMAP_SRC_ATTRIBUTE_3_1', 3054: 'SCHEMAP_SRC_ATTRIBUTE_3_2', 3055: 'SCHEMAP_SRC_ATTRIBUTE_4', 3056: 'SCHEMAP_NO_XMLNS', 3057: 'SCHEMAP_NO_XSI', 3058: 'SCHEMAP_COS_VALID_DEFAULT_1', 3059: 'SCHEMAP_COS_VALID_DEFAULT_2_1', 3060: 'SCHEMAP_COS_VALID_DEFAULT_2_2_1', 3061: 'SCHEMAP_COS_VALID_DEFAULT_2_2_2', 3062: 'SCHEMAP_CVC_SIMPLE_TYPE', 3063: 'SCHEMAP_COS_CT_EXTENDS_1_1', 3064: 'SCHEMAP_SRC_IMPORT_1_1', 3065: 'SCHEMAP_SRC_IMPORT_1_2', 3066: 'SCHEMAP_SRC_IMPORT_2', 3067: 'SCHEMAP_SRC_IMPORT_2_1', 3068: 'SCHEMAP_SRC_IMPORT_2_2', 3069: 'SCHEMAP_INTERNAL', 3070: 'SCHEMAP_NOT_DETERMINISTIC', 3071: 'SCHEMAP_SRC_ATTRIBUTE_GROUP_1', 3072: 'SCHEMAP_SRC_ATTRIBUTE_GROUP_2', 3073: 'SCHEMAP_SRC_ATTRIBUTE_GROUP_3', 3074: 'SCHEMAP_MG_PROPS_CORRECT_1', 3075: 'SCHEMAP_MG_PROPS_CORRECT_2', 3076: 'SCHEMAP_SRC_CT_1', 3077: 'SCHEMAP_DERIVATION_OK_RESTRICTION_2_1_3', 3078: 'SCHEMAP_AU_PROPS_CORRECT_2', 3079: 'SCHEMAP_A_PROPS_CORRECT_2', 3080: 'SCHEMAP_C_PROPS_CORRECT', 3081: 'SCHEMAP_SRC_REDEFINE', 3082: 'SCHEMAP_SRC_IMPORT', 3083: 'SCHEMAP_WARN_SKIP_SCHEMA', 3084: 'SCHEMAP_WARN_UNLOCATED_SCHEMA', 3085: 'SCHEMAP_WARN_ATTR_REDECL_PROH', 3086: 'SCHEMAP_WARN_ATTR_POINTLESS_PROH', 3087: 'SCHEMAP_AG_PROPS_CORRECT', 3088: 'SCHEMAP_COS_CT_EXTENDS_1_2', 3089: 'SCHEMAP_AU_PROPS_CORRECT', 3090: 'SCHEMAP_A_PROPS_CORRECT_3', 3091: 'SCHEMAP_COS_ALL_LIMITED', 4000: 'SCHEMATRONV_ASSERT', 4001: 'SCHEMATRONV_REPORT', 4900: 'MODULE_OPEN', 4901: 'MODULE_CLOSE', 5000: 'CHECK_FOUND_ELEMENT', 5001: 'CHECK_FOUND_ATTRIBUTE', 5002: 'CHECK_FOUND_TEXT', 5003: 'CHECK_FOUND_CDATA', 5004: 'CHECK_FOUND_ENTITYREF', 5005: 'CHECK_FOUND_ENTITY', 5006: 'CHECK_FOUND_PI', 5007: 'CHECK_FOUND_COMMENT', 5008: 'CHECK_FOUND_DOCTYPE', 5009: 'CHECK_FOUND_FRAGMENT', 5010: 'CHECK_FOUND_NOTATION', 5011: 'CHECK_UNKNOWN_NODE', 5012: 'CHECK_ENTITY_TYPE', 5013: 'CHECK_NO_PARENT', 5014: 'CHECK_NO_DOC', 5015: 'CHECK_NO_NAME', 5016: 'CHECK_NO_ELEM', 5017: 'CHECK_WRONG_DOC', 5018: 'CHECK_NO_PREV', 5019: 
'CHECK_WRONG_PREV', 5020: 'CHECK_NO_NEXT', 5021: 'CHECK_WRONG_NEXT', 5022: 'CHECK_NOT_DTD', 5023: 'CHECK_NOT_ATTR', 5024: 'CHECK_NOT_ATTR_DECL', 5025: 'CHECK_NOT_ELEM_DECL', 5026: 'CHECK_NOT_ENTITY_DECL', 5027: 'CHECK_NOT_NS_DECL', 5028: 'CHECK_NO_HREF', 5029: 'CHECK_WRONG_PARENT', 5030: 'CHECK_NS_SCOPE', 5031: 'CHECK_NS_ANCESTOR', 5032: 'CHECK_NOT_UTF8', 5033: 'CHECK_NO_DICT', 5034: 'CHECK_NOT_NCNAME', 5035: 'CHECK_OUTSIDE_DICT', 5036: 'CHECK_WRONG_NAME', 5037: 'CHECK_NAME_NOT_NULL', 6000: 'I18N_NO_NAME', 6001: 'I18N_NO_HANDLER', 6002: 'I18N_EXCESS_HANDLER', 6003: 'I18N_CONV_FAILED', 6004: 'I18N_NO_OUTPUT', 7000: 'BUF_OVERFLOW'}
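The constants and the _names reverse mapping above can be used to turn the numeric codes found in an error log into readable names. The following is a minimal sketch, assuming the constants belong to the lxml.etree.ErrorTypes class (the class header is not shown in this part of the listing); error_name and parse_errors are hypothetical helper names, and the input document is made up for illustration.

```python
from lxml import etree


def error_name(code):
    """Translate a numeric libxml2 error code into its symbolic name."""
    # _names is the reverse mapping listed above; ErrorTypes is assumed
    # to be the class that owns it.
    return etree.ErrorTypes._names.get(code, "unknown")


def parse_errors(xml_text):
    """Return (line, code, name) for each error logged while parsing."""
    parser = etree.XMLParser()
    try:
        etree.fromstring(xml_text, parser)
    except etree.XMLSyntaxError:
        pass  # the details are kept in parser.error_log
    return [(e.line, e.type, error_name(e.type)) for e in parser.error_log]


# Illustrative input with a mismatched closing tag.
for line, code, name in parse_errors("<root>\n</b>"):
    print(line, code, name)
```

Log entries also expose a type_name attribute, so the explicit lookup is mainly useful when only the numeric code is available.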
class lxml.etree.FallbackElementClassLookup(self, fallback=None)

Bases: ElementClassLookup

Superclass of Element class lookups with additional fallback.

set_fallback(self, lookup)

Sets the fallback scheme for this lookup method.
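As an illustration of the fallback mechanism, the sketch below uses CustomElementClassLookup, one of the subclasses of FallbackElementClassLookup. The class names TitleElement and TitleLookup and the sample document are hypothetical; the lookup returns None for everything it does not handle, which hands the decision to the fallback.

```python
from lxml import etree


class TitleElement(etree.ElementBase):
    """Hypothetical custom class used only for <title> elements."""


class TitleLookup(etree.CustomElementClassLookup):
    def lookup(self, node_type, document, namespace, name):
        if node_type == "element" and name == "title":
            return TitleElement
        return None  # not handled here: defer to the fallback


lookup = TitleLookup()
# Anything the custom lookup declines is resolved by the fallback.
lookup.set_fallback(etree.ElementDefaultClassLookup())

parser = etree.XMLParser()
parser.set_element_class_lookup(lookup)
root = etree.fromstring("<doc><title>Hi</title><p/></doc>", parser)

print(type(root[0]).__name__, type(root[1]).__name__)
```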

fallback
class lxml.etree.HTMLParser(self, encoding=None, remove_blank_text=False, remove_comments=False, remove_pis=False, no_network=True, target=None, schema: XMLSchema = None, recover=True, compact=True, collect_ids=True, huge_tree=False)

Bases: _FeedParser

The HTML parser.

This parser allows reading HTML into a normal XML tree. By default, it can read broken (non well-formed) HTML, depending on the capabilities of libxml2. Use the ‘recover’ option to switch this off.

Available boolean keyword arguments:

  • recover - try hard to parse through broken HTML (default: True)

  • no_network - prevent network access for related files (default: True)

  • remove_blank_text - discard empty text nodes that are ignorable (i.e. not actual text content)

  • remove_comments - discard comments

  • remove_pis - discard processing instructions

  • compact - save memory for short text content (default: True)

  • default_doctype - add a default doctype even if it is not found in the HTML (default: True)

  • collect_ids - use a hash table of XML IDs for fast access (default: True)

  • huge_tree - disable security restrictions and support very deep trees and very long text content (only affects libxml2 2.7+)

Other keyword arguments:

  • encoding - override the document encoding (note: libiconv encoding name)

  • target - a parser target object that will receive the parse events

  • schema - an XMLSchema to validate against

Note that you should avoid sharing parsers between threads for performance reasons.
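A minimal sketch of the options above in use, assuming a small made-up HTML fragment with unclosed tags. The default recover=True lets the parser build a tree anyway, and remove_comments drops the comment node.

```python
from lxml import etree

# Configure the parser once; the options are those listed above.
parser = etree.HTMLParser(remove_comments=True, remove_blank_text=True)

# Illustrative broken input: unclosed <p> tags, no </html>.
broken = "<html><body><!-- note --><p>First<p>Second</body>"
root = etree.fromstring(broken, parser)

paragraphs = [p.text for p in root.iter("p")]
comments = list(root.iter(etree.Comment))
print(root.tag, paragraphs, len(comments))
```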

close(self)

Terminates feeding data to this parser. This tells the parser to process any remaining data in the feed buffer, and then returns the root Element of the tree that was parsed.

This method must be called after passing the last chunk of data into the feed() method. It should only be called when using the feed parser interface; all other usage is undefined.

copy(self)

Create a new parser with the same configuration.

feed(self, data)

Feeds data to the parser. The argument should be an 8-bit string buffer containing encoded data, although Unicode is supported as long as both string types are not mixed.

This is the main entry point to the consumer interface of a parser. The parser will parse as much of the XML stream as it can on each call. To finish parsing or to reset the parser, call the close() method. Both methods may raise ParseError if errors occur in the input data. If an error is raised, there is no longer a need to call close().

The feed parser interface is independent of the normal parser usage. You can use the same parser as a feed parser and in the parse() function concurrently.
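The feed interface described above can be sketched as follows. The chunk boundaries are arbitrary and chosen only to show that data may be split anywhere, including in the middle of a tag or a text node.

```python
from lxml import etree

parser = etree.HTMLParser()

# Illustrative byte chunks, as they might arrive from a socket or file.
chunks = [
    b"<html><body><h1>Tit",
    b"le</h1><p>Body te",
    b"xt</p></body></html>",
]
for chunk in chunks:
    parser.feed(chunk)

# close() finishes parsing and returns the root element.
root = parser.close()
heading = root.findtext(".//h1")
print(root.tag, heading)
```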

makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with this parser.

set_element_class_lookup(self, lookup=None)

Set a lookup scheme for element classes generated from this parser.

Reset it by passing None or nothing.
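A short sketch of setting and resetting a lookup, assuming a hypothetical ElementBase subclass named LinkAware. ElementDefaultClassLookup is used here because it is the simplest lookup scheme; any other ElementClassLookup could be passed instead.

```python
from lxml import etree


class LinkAware(etree.ElementBase):
    """Hypothetical element class with a small convenience property."""

    @property
    def hrefs(self):
        return [a.get("href") for a in self.iter("a") if a.get("href")]


parser = etree.HTMLParser()
parser.set_element_class_lookup(
    etree.ElementDefaultClassLookup(element=LinkAware))
root = etree.fromstring('<p><a href="/x">x</a> <a>no link</a></p>', parser)
print(root.hrefs)

# Calling it without arguments restores the default lookup.
parser.set_element_class_lookup()
plain = etree.fromstring("<p>plain</p>", parser)
print(isinstance(plain, LinkAware))
```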

error_log

The error log of the last parser run.

feed_error_log

The error log of the last (or current) run of the feed parser.

Note that this is local to the feed parser and thus is different from what the error_log property returns.

resolvers

The custom resolver registry of this parser.

target
version

The version of the underlying XML parser.

class lxml.etree.HTMLPullParser(self, events=None, *, tag=None, base_url=None, **kwargs)

Bases: HTMLParser

HTML parser that collects parse events in an iterator.

The collected events are the same as for iterparse(), but the parser itself is non-blocking in the sense that it receives data chunks incrementally through its .feed() method, instead of reading them directly from a file(-like) object all by itself.

By default, it collects Element end events. To change that, pass any subset of the available events into the events argument: 'start', 'end', 'start-ns', 'end-ns', 'comment', 'pi'.

To support loading external dependencies relative to the input source, you can pass the base_url.
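The following sketch shows one way to combine feed() and read_events() for incremental processing. The input chunks are invented for illustration, and the tag filter restricts the collected events to p elements. Events are drained once more after close() in case the parser held back data until the end of input.

```python
from lxml import etree

parser = etree.HTMLPullParser(events=("start", "end"), tag="p")
seen = []


def drain(pull_parser):
    """Collect (event, tag) pairs that are ready so far."""
    for event, element in pull_parser.read_events():
        seen.append((event, element.tag))


for chunk in (b"<html><body><p>one</p>", b"<p>two</p></body></html>"):
    parser.feed(chunk)
    drain(parser)

root = parser.close()
drain(parser)  # pick up anything emitted while finishing the parse
print(seen)
```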

close(self)

Terminates feeding data to this parser. This tells the parser to process any remaining data in the feed buffer, and then returns the root Element of the tree that was parsed.

This method must be called after passing the last chunk of data into the feed() method. It should only be called when using the feed parser interface; all other usage is undefined.

copy(self)

Create a new parser with the same configuration.

feed(self, data)

Feeds data to the parser. The argument should be an 8-bit string buffer containing encoded data, although Unicode is supported as long as both string types are not mixed.

This is the main entry point to the consumer interface of a parser. The parser will parse as much of the XML stream as it can on each call. To finish parsing or to reset the parser, call the close() method. Both methods may raise ParseError if errors occur in the input data. If an error is raised, there is no longer a need to call close().

The feed parser interface is independent of the normal parser usage. You can use the same parser as a feed parser and in the parse() function concurrently.

makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with this parser.

read_events()
set_element_class_lookup(self, lookup=None)

Set a lookup scheme for element classes generated from this parser.

Reset it by passing None or nothing.

error_log

The error log of the last parser run.

feed_error_log

The error log of the last (or current) run of the feed parser.

Note that this is local to the feed parser and thus is different from what the error_log property returns.

resolvers

The custom resolver registry of this parser.

target
version

The version of the underlying XML parser.

class lxml.etree.PIBase

Bases: _ProcessingInstruction

All custom Processing Instruction classes must inherit from this one.

To create an XML ProcessingInstruction instance, use the PI() factory.

Subclasses must not override __init__ or __new__ as it is absolutely undefined when these objects will be created or destroyed. All persistent state of PIs must be stored in the underlying XML. If you really need to initialize the object after creation, you can implement an _init(self) method that will be called after object creation.
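A minimal sketch of a custom processing instruction class, assuming a hypothetical NotePI class and a made-up document. In line with the rule above, the class keeps no state of its own and derives everything from the underlying XML. It is registered through the pi argument of ElementDefaultClassLookup.

```python
from lxml import etree


class NotePI(etree.PIBase):
    """Hypothetical PI class; all data is read from the XML itself."""

    @property
    def words(self):
        return (self.text or "").split()


parser = etree.XMLParser()
parser.set_element_class_lookup(etree.ElementDefaultClassLookup(pi=NotePI))

root = etree.fromstring("<doc><?note remember the milk?></doc>", parser)
pi = root[0]
print(type(pi).__name__, pi.target, pi.words)
```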

_init(self)

Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.

addnext(self, element)

Adds the element as a following sibling directly after this element.

This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.

addprevious(self, element)

Adds the element as a preceding sibling directly before this element.

This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.
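As a minimal sketch of the root-level use described above, the following places a processing instruction before a root element and a comment after it. The document content and the stylesheet name are made up for illustration.

```python
from lxml import etree

root = etree.XML("<root/>")
pi = etree.ProcessingInstruction(
    "xml-stylesheet", 'type="text/xsl" href="style.xsl"')  # hypothetical href
comment = etree.Comment("generated file")

root.addprevious(pi)   # sibling directly before the root element
root.addnext(comment)  # sibling directly after the root element

# Serialise the whole document, not just the root, to include the siblings.
serialized = etree.tostring(root.getroottree())
```

Serialising the ElementTree rather than the root element is what makes the root-level siblings appear in the output.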

append(self, value)
clear(self, keep_tail=False)

Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.

Pass keep_tail=True to leave the tail text untouched.
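A short sketch of the difference between the two modes, using an illustrative document:

```python
from lxml import etree

root = etree.XML('<r><a x="1">text<b/></a>tail</r>')
a = root[0]

a.clear(keep_tail=True)       # children, attributes and text are removed
tail_after_keep = a.tail      # the tail text is left in place

a.clear()                     # the default also resets the tail
tail_after_default = a.tail
```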

cssselect(expr, *, translator='xml')

Run the CSS expression on this element and its children, returning a list of the results.

Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.

extend(self, elements)

Extends the current children by the elements in the iterable.

find(self, path, namespaces=None)

Finds the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findall(self, path, namespaces=None)

Finds all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findtext(self, path, default=None, namespaces=None)

Finds text for the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
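The following sketch shows how the namespaces mapping is used by find(), findall() and findtext(). The prefix "x" and the namespace URI are arbitrary choices for this example; the prefix in the path does not have to match the one used in the document.

```python
from lxml import etree

root = etree.XML('<r xmlns:n="urn:example"><n:a>1</n:a><n:a>2</n:a></r>')
ns = {"x": "urn:example"}  # prefix used in the path expressions below

first = root.find("x:a", namespaces=ns)
all_a = root.findall("x:a", namespaces=ns)
missing = root.findtext("x:missing", default="n/a", namespaces=ns)
```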

get(self, key, default=None)

Try to parse pseudo-attributes from the text content of the processing instruction, search for one with the given key as name and return its associated value.

Note that this is only a convenience method for the most common case that all text content is structured in attribute-like name-value pairs with properly quoted values. It is not guaranteed to work for all possible text content.
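For illustration, a processing instruction whose text is structured as quoted name-value pairs can be queried like this (the target and values are example data):

```python
from lxml import etree

pi = etree.ProcessingInstruction(
    "xml-stylesheet", 'type="text/xsl" href="style.xsl"')

href = pi.get("href")           # value of the pseudo-attribute
media = pi.get("media", "all")  # default is returned when the key is absent
```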

getchildren(self)

Returns all direct children. The elements are returned in document order.

Deprecated:

Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use list(element) or simply iterate over elements.

getiterator(self, tag=None, *tags)

Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags, see iter.

Deprecated:

Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the element.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.

getnext(self)

Returns the following sibling of this element or None.

getparent(self)

Returns the parent of this element or None for the root element.

getprevious(self)

Returns the preceding sibling of this element or None.

getroottree(self)

Return an ElementTree for the root node of the document that contains this element.

This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.

index(self, child, start=None, stop=None)

Find the position of the child within the parent.

This method is not part of the original ElementTree API.

insert(self, index, value)
items(self)
iter(self, tag=None, *tags)

Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".

You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.

Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.
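The tag matching rules above can be sketched as follows. The document and the namespace URI are invented for this example.

```python
from lxml import etree

root = etree.XML(
    '<a xmlns:n="urn:example"><b/><n:b/><c/><!-- note --></a>')

plain_b = [el.tag for el in root.iter("b")]            # no namespace only
any_b = [el.tag for el in root.iter("{*}b")]           # any or no namespace
in_ns = [el.tag for el in root.iter("{urn:example}*")] # any name in one namespace
b_or_c = [el.tag for el in root.iter("b", "c")]        # several tags at once
comments = [el.text for el in root.iter(etree.Comment)]  # by node type
```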

iterancestors(self, tag=None, *tags)

Iterate over the ancestors of this element (from parent to parent).

Can be restricted to find only elements with specific tags, see iter.

iterchildren(self, tag=None, *tags, reversed=False)

Iterate over the children of this element.

As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.

iterdescendants(self, tag=None, *tags)

Iterate over the descendants of this element in document order.

As opposed to el.iter(), this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.

iterfind(self, path, namespaces=None)

Iterates over all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

itersiblings(self, tag=None, *tags, preceding=False)

Iterate over the following or preceding siblings of this element.

The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.

Can be restricted to find only elements with specific tags, see iter.
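A small sketch of both directions, using an illustrative document:

```python
from lxml import etree

root = etree.XML("<r><a/><b/><c/><d/></r>")
b, c = root[1], root[2]

following = [el.tag for el in b.itersiblings()]
# preceding=True walks backwards, starting right before the element
preceding = [el.tag for el in c.itersiblings(preceding=True)]
```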

itertext(self, tag=None, *tags, with_tail=True)

Iterates over the text content of a subtree.

You can pass tag names to restrict text content to specific elements, see iter.

You can set the with_tail keyword argument to False to skip over tail text.
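As an illustration of the with_tail flag, consider mixed content where the text after an inline element is stored as that element's tail:

```python
from lxml import etree

root = etree.XML("<p>Hello <b>big</b> world</p>")

with_tails = list(root.itertext())
# " world" is the tail of <b>, so it is skipped here
without_tails = list(root.itertext(with_tail=False))
```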

keys(self)
makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with the same document.

remove(self, element)

Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.

replace(self, old_element, new_element)

Replaces a subelement with the element passed as second argument.

set(self, key, value)
values(self)
xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)

Evaluate an xpath expression using the element as context node.

attrib

Returns a dict containing all pseudo-attributes that can be parsed from the text content of this processing instruction. Note that modifying the dict currently has no effect on the XML node, although this is not guaranteed to stay this way.

base

The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.

Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.

Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).

nsmap

Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.

Note that changing the returned dict has no effect on the Element.

prefix

Namespace prefix or None.

sourceline

Original line number as found by the parser or None if unknown.

tag
tail

Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.

target
text
class lxml.etree.ParserBasedElementClassLookup(self, fallback=None)

Bases: FallbackElementClassLookup

Element class lookup based on the XML parser.

set_fallback(self, lookup)

Sets the fallback scheme for this lookup method.

fallback
class lxml.etree.PyErrorLog(self, logger_name=None, logger=None)

Bases: _BaseErrorLog

A global error log that connects to the Python stdlib logging package.

The constructor accepts an optional logger name or a readily instantiated logger instance.

If you want to change the mapping between libxml2’s ErrorLevels and Python logging levels, you can modify the level_map dictionary from a subclass.

The default mapping is:

ErrorLevels.WARNING = logging.WARNING
ErrorLevels.ERROR   = logging.ERROR
ErrorLevels.FATAL   = logging.CRITICAL

You can also override the method receive() that takes a LogEntry object and calls self.log(log_entry, format_string, arg1, arg2, ...) with appropriate data.

copy()

Dummy method that returns an empty error log.

log(self, log_entry, message, *args)

Called by the .receive() method to log a _LogEntry instance to the Python logging system. This handles the error level mapping.

In the default implementation, the message argument receives a complete log line, and there are no further args. To change the message format, it is best to override the .receive() method instead of this one.

receive(self, log_entry)

Receive a _LogEntry instance from the logging system. Calls the .log() method with appropriate parameters:

self.log(log_entry, repr(log_entry))

You can override this method to provide your own log output format.
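The sketch below overrides receive() to change the message format. To stay free of global side effects it feeds the log manually from the error_log of a caught exception; the logger name, the handler class and the format string are assumptions made for this example.

```python
import logging
from lxml import etree

class ShortLog(etree.PyErrorLog):
    def receive(self, log_entry):
        # Custom format: line number and message only.
        self.log(log_entry, "line %d: %s", log_entry.line, log_entry.message)

records = []

class ListHandler(logging.Handler):
    def emit(self, record):
        records.append(record.getMessage())

logger = logging.getLogger("lxml.example")  # hypothetical logger name
logger.addHandler(ListHandler())
logger.setLevel(logging.DEBUG)
logger.propagate = False

log = ShortLog(logger=logger)
try:
    etree.fromstring("<a><b></a>")  # mismatched tags
except etree.XMLSyntaxError as exc:
    for entry in exc.error_log:
        log.receive(entry)
```

In an application, such a log would more typically be installed once with etree.use_global_python_log(), which changes module-wide state.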

last_error
level_map
class lxml.etree.PythonElementClassLookup(self, fallback=None)

Bases: FallbackElementClassLookup

Element class lookup based on a subclass method.

This class lookup scheme allows access to the entire XML tree in read-only mode. To use it, re-implement the lookup(self, doc, root) method in a subclass:

from lxml import etree, pyclasslookup

class MyElementClass(etree.ElementBase):
    honkey = True

class MyLookup(pyclasslookup.PythonElementClassLookup):
    def lookup(self, doc, root):
        if root.tag == "sometag":
            return MyElementClass
        else:
            for child in root:
                if child.tag == "someothertag":
                    return MyElementClass
        # delegate to default
        return None

If you return None from this method, the fallback will be called.

The first argument is the opaque document instance that contains the Element. The second argument is a lightweight Element proxy implementation that is only valid during the lookup. Do not try to keep a reference to it. Once the lookup is done, the proxy will be invalid.

Also, you cannot wrap such a read-only Element in an ElementTree, and you must take care not to keep a reference to them outside of the lookup() method.

Note that the API of the Element objects is not complete. It is purely read-only and does not support all features of the normal lxml.etree API (such as XPath, extended slicing or some iteration methods).

See https://lxml.de/element_classes.html
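A lookup only takes effect once it is registered on a parser. The following sketch shows one way to do that; the Heading class, the tag names and the level property are hypothetical and exist only for this example.

```python
from lxml import etree

class Heading(etree.ElementBase):
    @property
    def level(self):
        # "h1" -> 1, "h2" -> 2
        return int(self.tag[1:])

class HeadingLookup(etree.PythonElementClassLookup):
    def lookup(self, doc, root):
        if root.tag in ("h1", "h2"):
            return Heading
        return None  # delegate to the fallback

parser = etree.XMLParser()
parser.set_element_class_lookup(HeadingLookup())

root = etree.fromstring("<doc><h1/><p/></doc>", parser)
heading, paragraph = root[0], root[1]
```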

lookup(self, doc, element)

Override this method to implement your own lookup scheme.

set_fallback(self, lookup)

Sets the fallback scheme for this lookup method.

fallback
class lxml.etree.QName(text_or_uri_or_element, tag=None)

Bases: object

QName wrapper for qualified XML names.

Pass a tag name by itself or a namespace URI and a tag name to create a qualified name. Alternatively, pass an Element to extract its tag name. None as first argument is ignored in order to allow for generic 2-argument usage.

The text property holds the qualified name in {namespace}tagname notation. The namespace and localname properties hold the respective parts of the tag name.

You can pass QName objects wherever a tag name is expected. Also, setting Element text from a QName will resolve the namespace prefix on assignment and set a qualified text value. This is helpful in XML languages like SOAP or XML-Schema that use prefixed tag names in their text content.
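A brief sketch of the construction forms and properties described above, with an example namespace URI:

```python
from lxml import etree

qname = etree.QName("http://example.com/ns", "item")
element = etree.Element(qname)      # a QName is accepted as a tag name

from_element = etree.QName(element) # extract the name from an Element
plain = etree.QName("plain")        # no namespace
```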

localname
namespace
text
class lxml.etree.RelaxNG(self, etree=None, file=None)

Bases: _Validator

Turn a document into a Relax NG validator.

Either pass a schema as Element or ElementTree, or pass a file or filename through the file keyword argument.
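For example, a schema passed as an Element might be used like this. The schema and documents are minimal made-up samples.

```python
from lxml import etree

schema_root = etree.XML('''
<element name="a" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="b"><text/></element>
  </zeroOrMore>
</element>
''')
relaxng = etree.RelaxNG(schema_root)

valid = relaxng.validate(etree.XML("<a><b/></a>"))
invalid = relaxng.validate(etree.XML("<a><c/></a>"))

try:
    relaxng.assertValid(etree.XML("<a><c/></a>"))
    raised = False
except etree.DocumentInvalid:
    raised = True
```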

_append_log_message(domain, type, level, line, message, filename)
_clear_error_log()
assertValid(self, etree)

Raises DocumentInvalid if the document does not comply with the schema.

assert_(self, etree)

Raises AssertionError if the document does not comply with the schema.

classmethod from_rnc_string(src, base_url=None)

Parse a RelaxNG schema in compact syntax from a text string.

Requires the rnc2rng package to be installed.

Passing the source URL or file path of the source as ‘base_url’ will enable resolving resource references relative to the source.

validate(self, etree)

Validate the document using this schema.

Returns True if the document is valid, False if not.

error_log

The log of validation errors and warnings.

class lxml.etree.RelaxNGErrorTypes

Bases: object

Libxml2 RelaxNG error types

_getName(default=None, /)

Return the value for key if key is in the dictionary, else default.

RELAXNG_ERR_ATTREXTRANS = 20
RELAXNG_ERR_ATTRNAME = 14
RELAXNG_ERR_ATTRNONS = 16
RELAXNG_ERR_ATTRVALID = 24
RELAXNG_ERR_ATTRWRONGNS = 18
RELAXNG_ERR_CONTENTVALID = 25
RELAXNG_ERR_DATAELEM = 28
RELAXNG_ERR_DATATYPE = 31
RELAXNG_ERR_DUPID = 4
RELAXNG_ERR_ELEMEXTRANS = 19
RELAXNG_ERR_ELEMNAME = 13
RELAXNG_ERR_ELEMNONS = 15
RELAXNG_ERR_ELEMNOTEMPTY = 21
RELAXNG_ERR_ELEMWRONG = 38
RELAXNG_ERR_ELEMWRONGNS = 17
RELAXNG_ERR_EXTRACONTENT = 26
RELAXNG_ERR_EXTRADATA = 35
RELAXNG_ERR_INTEREXTRA = 12
RELAXNG_ERR_INTERNAL = 37
RELAXNG_ERR_INTERNODATA = 10
RELAXNG_ERR_INTERSEQ = 11
RELAXNG_ERR_INVALIDATTR = 27
RELAXNG_ERR_LACKDATA = 36
RELAXNG_ERR_LIST = 33
RELAXNG_ERR_LISTELEM = 30
RELAXNG_ERR_LISTEMPTY = 9
RELAXNG_ERR_LISTEXTRA = 8
RELAXNG_ERR_MEMORY = 1
RELAXNG_ERR_NODEFINE = 7
RELAXNG_ERR_NOELEM = 22
RELAXNG_ERR_NOGRAMMAR = 34
RELAXNG_ERR_NOSTATE = 6
RELAXNG_ERR_NOTELEM = 23
RELAXNG_ERR_TEXTWRONG = 39
RELAXNG_ERR_TYPE = 2
RELAXNG_ERR_TYPECMP = 5
RELAXNG_ERR_TYPEVAL = 3
RELAXNG_ERR_VALELEM = 29
RELAXNG_ERR_VALUE = 32
RELAXNG_OK = 0
_names = {0: 'RELAXNG_OK', 1: 'RELAXNG_ERR_MEMORY', 2: 'RELAXNG_ERR_TYPE', 3: 'RELAXNG_ERR_TYPEVAL', 4: 'RELAXNG_ERR_DUPID', 5: 'RELAXNG_ERR_TYPECMP', 6: 'RELAXNG_ERR_NOSTATE', 7: 'RELAXNG_ERR_NODEFINE', 8: 'RELAXNG_ERR_LISTEXTRA', 9: 'RELAXNG_ERR_LISTEMPTY', 10: 'RELAXNG_ERR_INTERNODATA', 11: 'RELAXNG_ERR_INTERSEQ', 12: 'RELAXNG_ERR_INTEREXTRA', 13: 'RELAXNG_ERR_ELEMNAME', 14: 'RELAXNG_ERR_ATTRNAME', 15: 'RELAXNG_ERR_ELEMNONS', 16: 'RELAXNG_ERR_ATTRNONS', 17: 'RELAXNG_ERR_ELEMWRONGNS', 18: 'RELAXNG_ERR_ATTRWRONGNS', 19: 'RELAXNG_ERR_ELEMEXTRANS', 20: 'RELAXNG_ERR_ATTREXTRANS', 21: 'RELAXNG_ERR_ELEMNOTEMPTY', 22: 'RELAXNG_ERR_NOELEM', 23: 'RELAXNG_ERR_NOTELEM', 24: 'RELAXNG_ERR_ATTRVALID', 25: 'RELAXNG_ERR_CONTENTVALID', 26: 'RELAXNG_ERR_EXTRACONTENT', 27: 'RELAXNG_ERR_INVALIDATTR', 28: 'RELAXNG_ERR_DATAELEM', 29: 'RELAXNG_ERR_VALELEM', 30: 'RELAXNG_ERR_LISTELEM', 31: 'RELAXNG_ERR_DATATYPE', 32: 'RELAXNG_ERR_VALUE', 33: 'RELAXNG_ERR_LIST', 34: 'RELAXNG_ERR_NOGRAMMAR', 35: 'RELAXNG_ERR_EXTRADATA', 36: 'RELAXNG_ERR_LACKDATA', 37: 'RELAXNG_ERR_INTERNAL', 38: 'RELAXNG_ERR_ELEMWRONG', 39: 'RELAXNG_ERR_TEXTWRONG'}
class lxml.etree.Resolver

Bases: object

This is the base class of all resolvers.

resolve(self, system_url, public_id, context)

Override this method to resolve an external source by system_url and public_id. The third argument is an opaque context object.

Return the result of one of the resolve_*() methods.

resolve_empty(self, context)

Return an empty input document.

Pass context as parameter.

resolve_file(self, f, context, base_url=None, close=True)

Return an open file-like object as input document.

Pass open file and context as parameters. You can pass the base URL or filename of the file through the base_url keyword argument. If the close flag is True (the default), the file will be closed after reading.

Note that using .resolve_filename() is more efficient, especially in threaded environments.

resolve_filename(self, filename, context)

Return the name of a parsable file as input document.

Pass filename and context as parameters. You can also pass a URL with an HTTP, FTP or file target.

resolve_string(self, string, context, base_url=None)

Return a parsable string as input document.

Pass data string and context as parameters. You can pass the source URL or filename through the base_url keyword argument.
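As a sketch of how these methods fit together, the resolver below answers every external lookup with an in-memory DTD via resolve_string(). The class name, entity name and DTD file name are invented for this example.

```python
from lxml import etree

class EntityResolver(etree.Resolver):
    def resolve(self, system_url, public_id, context):
        # Serve a tiny DTD from memory instead of fetching system_url.
        dtd = '<!ENTITY greeting "hello from %s">' % system_url
        return self.resolve_string(dtd, context)

parser = etree.XMLParser(load_dtd=True)
parser.resolvers.add(EntityResolver())

xml = '<!DOCTYPE doc SYSTEM "missing.dtd"><doc>&greeting;</doc>'
root = etree.fromstring(xml, parser)
text = root.text
```

Resolvers are registered per parser through its resolvers property, so other parsers are unaffected.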

class lxml.etree.Schematron(self, etree=None, file=None)

Bases: _Validator

A Schematron validator.

Pass a root Element or an ElementTree to turn it into a validator. Alternatively, pass a filename as keyword argument ‘file’ to parse from the file system.

Schematron is a less well known, but very powerful schema language. The main idea is to use the capabilities of XPath to put restrictions on the structure and the content of XML documents. Here is a simple example:

>>> schematron = Schematron(XML('''
... <schema xmlns="http://www.ascc.net/xml/schematron" >
...   <pattern name="id is the only permitted attribute name">
...     <rule context="*">
...       <report test="@*[not(name()='id')]">Attribute
...         <name path="@*[not(name()='id')]"/> is forbidden<name/>
...       </report>
...     </rule>
...   </pattern>
... </schema>
... '''))

>>> xml = XML('''
... <AAA name="aaa">
...   <BBB id="bbb"/>
...   <CCC color="ccc"/>
... </AAA>
... ''')

>>> schematron.validate(xml)
False

>>> xml = XML('''
... <AAA id="aaa">
...   <BBB id="bbb"/>
...   <CCC/>
... </AAA>
... ''')

>>> schematron.validate(xml)
True

Schematron was added to libxml2 in version 2.6.21. Before version 2.6.32, however, Schematron lacked support for error reporting other than to stderr. Version 2.6.32 or later is therefore required to retrieve validation warnings and errors in lxml.

_append_log_message(domain, type, level, line, message, filename)
_clear_error_log()
assertValid(self, etree)

Raises DocumentInvalid if the document does not comply with the schema.

assert_(self, etree)

Raises AssertionError if the document does not comply with the schema.

validate(self, etree)

Validate the document using this schema.

Returns True if the document is valid, False if not.

error_log

The log of validation errors and warnings.

class lxml.etree.SiblingsIterator(self, node, tag=None, preceding=False)

Bases: _ElementMatchIterator

Iterates over the siblings of an element.

You can pass the boolean keyword preceding to specify the direction.

class lxml.etree.TreeBuilder

Bases: _SaxParserTarget

TreeBuilder(self, element_factory=None, parser=None, comment_factory=None, pi_factory=None, insert_comments=True, insert_pis=True)

Parser target that builds a tree from parse event callbacks.

The factory arguments can be used to influence the creation of elements, comments and processing instructions.

By default, comments and processing instructions are inserted into the tree, but they can be ignored by passing the respective flags.

The final tree is returned by the close() method.
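A TreeBuilder can also be driven by hand, which makes the callback sequence visible. This is only a sketch with made-up tag names; in normal use a parser issues these calls.

```python
from lxml import etree

builder = etree.TreeBuilder()
builder.start("root", {})
builder.start("child", {"id": "1"})
builder.data("hello")
builder.end("child")
builder.end("root")

root = builder.close()  # returns the top-level element
```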

close(self)

Flushes the builder buffers, and returns the toplevel document element. Raises XMLSyntaxError on inconsistencies.

comment(self, comment)

Creates a comment using the factory, appends it (unless disabled) and returns it.

data(self, data)

Adds text to the current element. The value should be either an 8-bit string containing ASCII text, or a Unicode string.

end(self, tag)

Closes the current element.

pi(self, target, data=None)

Creates a processing instruction using the factory, appends it (unless disabled) and returns it.

start(self, tag, attrs, nsmap=None)

Opens a new element.

class lxml.etree.XInclude(self)

Bases: object

XInclude processor.

Create an instance and call it on an Element to run XInclude processing.

error_log
class lxml.etree.XMLParser(self, encoding=None, attribute_defaults=False, dtd_validation=False, load_dtd=False, no_network=True, ns_clean=False, recover=False, schema: XMLSchema = None, huge_tree=False, remove_blank_text=False, resolve_entities='internal', remove_comments=False, remove_pis=False, strip_cdata=True, collect_ids=True, target=None, compact=True)

Bases: _FeedParser

The XML parser.

Parsers can be supplied as an additional argument to various parse functions of the lxml API. A default parser is always available and can be replaced by a call to the global function ‘set_default_parser’. New parsers can be created at any time without a major run-time overhead.

The keyword arguments in the constructor are mainly based on the libxml2 parser configuration. A DTD will also be loaded if DTD validation or attribute default values are requested (unless you additionally provide an XMLSchema from which the default attributes can be read).

Available boolean keyword arguments:

  • attribute_defaults - inject default attributes from DTD or XMLSchema

  • dtd_validation - validate against a DTD referenced by the document

  • load_dtd - use DTD for parsing

  • no_network - prevent network access for related files (default: True)

  • ns_clean - clean up redundant namespace declarations

  • recover - try hard to parse through broken XML

  • remove_blank_text - discard blank text nodes that appear ignorable

  • remove_comments - discard comments

  • remove_pis - discard processing instructions

  • strip_cdata - replace CDATA sections by normal text content (default: True)

  • compact - save memory for short text content (default: True)

  • collect_ids - use a hash table of XML IDs for fast access (default: True, always True with DTD validation)

  • huge_tree - disable security restrictions and support very deep trees and very long text content (only affects libxml2 2.7+)

Other keyword arguments:

  • resolve_entities - replace entities by their text value: False for keeping the entity references, True for resolving them, and ‘internal’ for resolving internal definitions only (no external file/URL access). The default used to be True and was changed to ‘internal’ in lxml 5.0.

  • encoding - override the document encoding (note: libiconv encoding name)

  • target - a parser target object that will receive the parse events

  • schema - an XMLSchema to validate against

Note that you should avoid sharing parsers between threads. While this is not harmful, it is more efficient to use separate parsers. This does not apply to the default parser.
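As an illustration of the keyword arguments above, the sketch below configures a parser that drops comments and ignorable whitespace and compares it with default parsing. The sample document is invented for this example.

```python
from lxml import etree

source = "<root>\n  <!-- note -->\n  <a>text</a>\n</root>"

parser = etree.XMLParser(remove_blank_text=True, remove_comments=True)
stripped = etree.fromstring(source, parser)

# Default parsing keeps the comment and the whitespace-only text.
default = etree.fromstring(source)
```

Passing the parser explicitly to fromstring() leaves the default parser untouched for other callers.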

close(self)

Terminates feeding data to this parser. This tells the parser to process any remaining data in the feed buffer, and then returns the root Element of the tree that was parsed.

This method must be called after passing the last chunk of data into the feed() method. It should only be called when using the feed parser interface; all other usage is undefined.

copy(self)

Create a new parser with the same configuration.

feed(self, data)

Feeds data to the parser. The argument should be an 8-bit string buffer containing encoded data, although Unicode is supported as long as both string types are not mixed.

This is the main entry point to the consumer interface of a parser. The parser will parse as much of the XML stream as it can on each call. To finish parsing or to reset the parser, call the close() method. Both methods may raise ParseError if errors occur in the input data. If an error is raised, there is no longer a need to call close().

The feed parser interface is independent of the normal parser usage. You can use the same parser as a feed parser and in the parse() function concurrently.
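A minimal sketch of the feed interface, with the input split at arbitrary chunk boundaries chosen for this example:

```python
from lxml import etree

parser = etree.XMLParser()
for chunk in (b"<root><a>", b"text</a>", b"</root>"):
    parser.feed(chunk)   # chunks may split the document anywhere

root = parser.close()    # finishes parsing and returns the root Element
```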

makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with this parser.

set_element_class_lookup(self, lookup=None)

Set a lookup scheme for element classes generated from this parser.

Reset it by passing None or nothing.

error_log

The error log of the last parser run.

feed_error_log

The error log of the last (or current) run of the feed parser.

Note that this is local to the feed parser and thus is different from what the error_log property returns.

resolvers

The custom resolver registry of this parser.

target
version

The version of the underlying XML parser.

class lxml.etree.XMLPullParser(self, events=None, *, tag=None, **kwargs)

Bases: XMLParser

XML parser that collects parse events in an iterator.

The collected events are the same as for iterparse(), but the parser itself is non-blocking in the sense that it receives data chunks incrementally through its .feed() method, instead of reading them directly from a file(-like) object all by itself.

By default, it collects Element end events. To change that, pass any subset of the available events into the events argument: 'start', 'end', 'start-ns', 'end-ns', 'comment', 'pi'.

To support loading external dependencies relative to the input source, you can pass the base_url.
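The following sketch collects start and end events while data is fed in chunks. Which events become available after a particular feed() call depends on how much the underlying parser has processed, so the example only relies on the combined sequence.

```python
from lxml import etree

parser = etree.XMLPullParser(events=("start", "end"))
seen = []

for chunk in (b"<root><a>", b"text</a></root>"):
    parser.feed(chunk)
    # Drain whatever events are available so far.
    seen.extend((event, el.tag) for event, el in parser.read_events())

root = parser.close()
seen.extend((event, el.tag) for event, el in parser.read_events())
```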

close(self)

Terminates feeding data to this parser. This tells the parser to process any remaining data in the feed buffer, and then returns the root Element of the tree that was parsed.

This method must be called after passing the last chunk of data into the feed() method. It should only be called when using the feed parser interface; all other usage is undefined.

copy(self)

Create a new parser with the same configuration.

feed(self, data)

Feeds data to the parser. The argument should be an 8-bit string buffer containing encoded data, although Unicode is supported as long as both string types are not mixed.

This is the main entry point to the consumer interface of a parser. The parser will parse as much of the XML stream as it can on each call. To finish parsing or to reset the parser, call the close() method. Both methods may raise ParseError if errors occur in the input data. If an error is raised, there is no longer a need to call close().

The feed parser interface is independent of the normal parser usage. You can use the same parser as a feed parser and in the parse() function concurrently.

makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with this parser.

read_events()
set_element_class_lookup(self, lookup=None)

Set a lookup scheme for element classes generated from this parser.

Reset it by passing None or nothing.

error_log

The error log of the last parser run.

feed_error_log

The error log of the last (or current) run of the feed parser.

Note that this is local to the feed parser and thus is different from what the error_log property returns.

resolvers

The custom resolver registry of this parser.

target
version

The version of the underlying XML parser.

class lxml.etree.XMLSchema(self, etree=None, file=None)

Bases: _Validator

Turn a document into an XML Schema validator.

Either pass a schema as Element or ElementTree, or pass a file or filename through the file keyword argument.

Passing the attribute_defaults boolean option will make the schema insert default/fixed attributes into validated documents.
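For example, with a small made-up schema passed as an Element:

```python
from lxml import etree

schema_root = etree.XML('''
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="a" type="AType"/>
  <xsd:complexType name="AType">
    <xsd:sequence>
      <xsd:element name="b" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
''')
schema = etree.XMLSchema(schema_root)

valid = schema.validate(etree.XML("<a><b/></a>"))
invalid = schema.validate(etree.XML("<a><c/></a>"))
error_count = len(schema.error_log)  # errors of the last validation run
```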

_append_log_message(domain, type, level, line, message, filename)
_clear_error_log()
assertValid(self, etree)

Raises DocumentInvalid if the document does not comply with the schema.

assert_(self, etree)

Raises AssertionError if the document does not comply with the schema.

validate(self, etree)

Validate the document using this schema.

Returns True if the document is valid, False if not.

error_log

The log of validation errors and warnings.

lxml.etree.XMLTreeBuilder

alias of ETCompatXMLParser

class lxml.etree.XPath(self, path, namespaces=None, extensions=None, regexp=True, smart_strings=True)

Bases: _XPathEvaluatorBase

A compiled XPath expression that can be called on Elements and ElementTrees.

Besides the XPath expression, you can pass prefix-namespace mappings and extension functions to the constructor through the keyword arguments namespaces and extensions. EXSLT regular expression support can be disabled with the ‘regexp’ boolean keyword (defaults to True). Smart strings will be returned for string results unless you pass smart_strings=False.
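A sketch of a pre-compiled expression with a namespace mapping and an XPath variable. The variable name "wanted" and the namespace URI are arbitrary; variables are supplied as keyword arguments when the compiled expression is called.

```python
from lxml import etree

root = etree.XML(
    '<r xmlns:n="urn:example"><n:item id="1"/><n:item id="2"/></r>')
ns = {"n": "urn:example"}

find_item = etree.XPath("//n:item[@id = $wanted]", namespaces=ns)
count_items = etree.XPath("count(//n:item)", namespaces=ns)

matches = find_item(root, wanted="2")
total = count_items(root)
```

Compiling once and calling the object repeatedly avoids re-parsing the expression for every evaluation.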

error_log
path

The literal XPath expression.

class lxml.etree.XPathDocumentEvaluator(self, etree, namespaces=None, extensions=None, regexp=True, smart_strings=True)

Bases: XPathElementEvaluator

Create an XPath evaluator for an ElementTree.

Additional namespace declarations can be passed with the ‘namespaces’ keyword argument. EXSLT regular expression support can be disabled with the ‘regexp’ boolean keyword (defaults to True). Smart strings will be returned for string results unless you pass smart_strings=False.

register_namespace(prefix, uri)

Register a namespace with the XPath context.

register_namespaces(namespaces)

Register a prefix -> uri dict.

error_log
class lxml.etree.XPathElementEvaluator(self, element, namespaces=None, extensions=None, regexp=True, smart_strings=True)

Bases: _XPathEvaluatorBase

Create an XPath evaluator for an element.

Absolute XPath expressions (starting with ‘/’) will be evaluated against the ElementTree as returned by getroottree().

Additional namespace declarations can be passed with the ‘namespaces’ keyword argument. EXSLT regular expression support can be disabled with the ‘regexp’ boolean keyword (defaults to True). Smart strings will be returned for string results unless you pass smart_strings=False.
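As a sketch, an evaluator bound to one element can run several expressions, including an absolute one, and accept extra prefixes later. The prefixes and the namespace URI are example values.

```python
from lxml import etree

root = etree.XML(
    '<r xmlns:n="urn:example"><n:item id="1"/><n:item id="2"/></r>')

evaluate = etree.XPathElementEvaluator(root, namespaces={"n": "urn:example"})
ids = evaluate("n:item/@id")                    # relative to the element

evaluate.register_namespace("m", "urn:example") # add a prefix afterwards
count = evaluate("count(/r/m:item)")            # absolute path, from the tree root

second = evaluate("n:item[@id = $wanted]", wanted="2")
```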

register_namespace(prefix, uri)

Register a namespace with the XPath context.

register_namespaces(namespaces)

Register a prefix -> uri dict.

error_log
class lxml.etree.XSLT(self, xslt_input, extensions=None, regexp=True, access_control=None)

Bases: object

Turn an XSL document into an XSLT object.

Calling this object on a tree or Element will execute the XSLT:

transform = etree.XSLT(xsl_tree)
result = transform(xml_tree)

Keyword arguments of the constructor:

  • extensions: a dict mapping (namespace, name) pairs to extension functions or extension elements

  • regexp: enable exslt regular expression support in XPath (default: True)

  • access_control: access restrictions for network or file system (see XSLTAccessControl)

Keyword arguments of the XSLT call:

  • profile_run: enable XSLT profiling and make the profile available as XML document in result.xslt_profile (default: False)

Other keyword arguments of the call are passed to the stylesheet as parameters.
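The sketch below passes a stylesheet parameter in both supported ways: as a raw XPath expression string and as a plain string wrapped with XSLT.strparam(). The stylesheet and the parameter name "greeting" are made up for this example.

```python
from lxml import etree

stylesheet = etree.XML('''
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="greeting" select="'Hi'"/>
  <xsl:template match="/">
    <out><xsl:value-of select="concat($greeting, ', ', /name)"/></out>
  </xsl:template>
</xsl:stylesheet>
''')
transform = etree.XSLT(stylesheet)
doc = etree.XML("<name>World</name>")

# Parameters are XPath expressions, so a literal string needs its own quotes...
quoted = transform(doc, greeting="'Hey'")
# ...or can be wrapped with strparam(), which takes care of the quoting.
wrapped = transform(doc, greeting=etree.XSLT.strparam("Hello"))

quoted_text = quoted.getroot().text
wrapped_text = wrapped.getroot().text
```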

static set_global_max_depth(max_depth)

The maximum traversal depth that the stylesheet engine will allow. This does not only count the template recursion depth but also takes the number of variables/parameters into account. The required setting for a run depends on both the stylesheet and the input data.

Example:

XSLT.set_global_max_depth(5000)

Note that this is currently a global, module-wide setting because libxslt does not support it at a per-stylesheet level.

static strparam(strval)

Mark an XSLT string parameter that requires quote escaping before passing it into the transformation. Use it like this:

result = transform(doc, some_strval = XSLT.strparam(
    '''it's "Monty Python's" ...'''))

Escaped string parameters can be reused without restriction.

tostring(self, result_tree)

Save result doc to string based on stylesheet output method.

Deprecated:

Use str(result_tree) instead.

error_log

The log of errors and warnings of an XSLT execution.

class lxml.etree.XSLTAccessControl(self, read_file=True, write_file=True, create_dir=True, read_network=True, write_network=True)

Bases: object

Access control for XSLT: reading/writing files, directories and network I/O. Access to a type of resource is granted or denied by passing any of the following boolean keyword arguments. All of them default to True to allow access.

  • read_file

  • write_file

  • create_dir

  • read_network

  • write_network

For convenience, there is also a class member DENY_ALL that provides an XSLTAccessControl instance that is readily configured to deny everything, and a DENY_WRITE member that denies all write access but allows read access.

See XSLT.

DENY_ALL = XSLTAccessControl(create_dir=False, read_file=False, read_network=False, write_file=False, write_network=False)
DENY_WRITE = XSLTAccessControl(create_dir=False, read_file=True, read_network=True, write_file=False, write_network=False)
options

The access control configuration as a map of options.
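
A small sketch of how an access control object might be built and handed to a stylesheet. The stylesheet is a made-up placeholder, and the option names shown as dictionary keys are assumed to mirror the keyword arguments listed above.

```python
from lxml import etree

# Deny network access, keep the file-system defaults (allowed).
no_network = etree.XSLTAccessControl(read_network=False, write_network=False)
options = no_network.options

# Hypothetical stylesheet; any stylesheet can take an access_control.
stylesheet = etree.XML('''\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/"><out/></xsl:template>
</xsl:stylesheet>''')
transform = etree.XSLT(stylesheet, access_control=no_network)
result = transform(etree.XML("<doc/>"))

# The preconfigured instance that denies everything.
deny_all_options = etree.XSLTAccessControl.DENY_ALL.options
```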

class lxml.etree.XSLTExtension

Bases: object

Base class of an XSLT extension element.

apply_templates(self, context, node, output_parent=None, elements_only=False, remove_blank_text=False)

Call this method to retrieve the result of applying templates to an element.

The return value is a list of elements or text strings that were generated by the XSLT processor. If you pass elements_only=True, strings will be discarded from the result list. The option remove_blank_text=True will only discard strings that consist entirely of whitespace (e.g. formatting). These options do not apply to Elements, only to bare string results.

If you pass an Element as output_parent parameter, the result will instead be appended to the element (including attributes etc.) and the return value will be None. This is a safe way to generate content into the output document directly, without having to take care of special values like text or attributes. Note that the string discarding options will be ignored in this case.

execute(self, context, self_node, input_node, output_parent)

Execute this extension element.

Subclasses must override this method. They may append elements to the output_parent element here, or set its text content. To this end, the input_node provides read-only access to the current node in the input document, and the self_node points to the extension element in the stylesheet.

Note that the output_parent parameter may be None if there is no parent element in the current context (e.g. no content was added to the output tree yet).

process_children(self, context, output_parent=None, elements_only=False, remove_blank_text=False)

Call this method to process the XSLT content of the extension element itself.

The return value is a list of elements or text strings that were generated by the XSLT processor. If you pass elements_only=True, strings will be discarded from the result list. The option remove_blank_text=True will only discard strings that consist entirely of whitespace (e.g. formatting). These options do not apply to Elements, only to bare string results.

If you pass an Element as output_parent parameter, the result will instead be appended to the element (including attributes etc.) and the return value will be None. This is a safe way to generate content into the output document directly, without having to take care of special values like text or attributes. Note that the string discarding options will be ignored in this case.
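
A minimal sketch of an extension element, assuming a made-up namespace "testns", element name "ext" and class name Stamp. It only sets the text of the output parent; real extensions would typically also use apply_templates() or process_children().

```python
from lxml import etree

class Stamp(etree.XSLTExtension):
    """Hypothetical extension element that writes a fixed text."""

    def execute(self, context, self_node, input_node, output_parent):
        # output_parent is the <foo> element being built in the result.
        output_parent.text = "I did it!"

stylesheet = etree.XML('''\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="testns"
    extension-element-prefixes="my">
  <xsl:template match="/">
    <foo><my:ext/></foo>
  </xsl:template>
</xsl:stylesheet>''')

# Extensions are registered under (namespace, element name) keys.
transform = etree.XSLT(stylesheet, extensions={("testns", "ext"): Stamp()})
result = transform(etree.XML("<dummy/>"))
stamped = result.getroot().text
```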

class lxml.etree._Attrib

Bases: object

A dict-like proxy for the Element.attrib property.

clear()
get(key, default=None)
has_key(key)
items()
iteritems()
iterkeys()
itervalues()
keys()
pop(key, *default)
update(sequence_or_dict)
values()
class lxml.etree._BaseErrorLog

Bases: object

copy()
receive(entry)
last_error
class lxml.etree._Comment

Bases: __ContentOnlyElement

_init(self)

Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.

addnext(self, element)

Adds the element as a following sibling directly after this element.

This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.

addprevious(self, element)

Adds the element as a preceding sibling directly before this element.

This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.

append(self, value)
clear(self, keep_tail=False)

Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.

Pass keep_tail=True to leave the tail text untouched.

cssselect(expr, *, translator='xml')

Run the CSS expression on this element and its children, returning a list of the results.

Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.

extend(self, elements)

Extends the current children by the elements in the iterable.

find(self, path, namespaces=None)

Finds the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findall(self, path, namespaces=None)

Finds all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findtext(self, path, default=None, namespaces=None)

Finds text for the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

get(self, key, default=None)
getchildren(self)

Returns all direct children. The elements are returned in document order.

Deprecated:

Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use list(element) or simply iterate over elements.

getiterator(self, tag=None, *tags)

Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags, see iter.

Deprecated:

Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the element.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.

getnext(self)

Returns the following sibling of this element or None.

getparent(self)

Returns the parent of this element or None for the root element.

getprevious(self)

Returns the preceding sibling of this element or None.

getroottree(self)

Return an ElementTree for the root node of the document that contains this element.

This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.

index(self, child, start=None, stop=None)

Find the position of the child within the parent.

This method is not part of the original ElementTree API.

insert(self, index, value)
items(self)
iter(self, tag=None, *tags)

Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".

You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.

Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.

iterancestors(self, tag=None, *tags)

Iterate over the ancestors of this element (from parent to parent).

Can be restricted to find only elements with specific tags, see iter.

iterchildren(self, tag=None, *tags, reversed=False)

Iterate over the children of this element.

As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.

iterdescendants(self, tag=None, *tags)

Iterate over the descendants of this element in document order.

As opposed to el.iter(), this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.

iterfind(self, path, namespaces=None)

Iterates over all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

itersiblings(self, tag=None, *tags, preceding=False)

Iterate over the following or preceding siblings of this element.

The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.

Can be restricted to find only elements with specific tags, see iter.

itertext(self, tag=None, *tags, with_tail=True)

Iterates over the text content of a subtree.

You can pass tag names to restrict text content to specific elements, see iter.

You can set the with_tail keyword argument to False to skip over tail text.

keys(self)
makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with the same document.

remove(self, element)

Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.

replace(self, old_element, new_element)

Replaces a subelement with the element passed as second argument.

set(self, key, value)
values(self)
xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)

Evaluate an xpath expression using the element as context node.

attrib
base

The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.

Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.

Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).

nsmap

Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.

Note that changing the returned dict has no effect on the Element.

prefix

Namespace prefix or None.

sourceline

Original line number as found by the parser or None if unknown.

tag
tail

Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.

text
class lxml.etree._Document

Bases: object

Internal base class to reference a libxml document.

When instances of this class are garbage collected, the libxml document is cleaned up.

class lxml.etree._DomainErrorLog

Bases: _ErrorLog

clear()
copy()

Creates a shallow copy of this error log and the list of entries.

filter_domains(domains)

Filter the errors by the given domains and return a new error log containing the matches.

filter_from_errors(self)

Convenience method to get all error messages or worse.

filter_from_fatals(self)

Convenience method to get all fatal error messages.

filter_from_level(self, level)

Return a log with all messages of the requested level or worse.

filter_from_warnings(self)

Convenience method to get all warnings or worse.

filter_levels(self, levels)

Filter the errors by the given error levels and return a new error log containing the matches.

filter_types(self, types)

Filter the errors by the given types and return a new error log containing the matches.

receive(entry)
last_error
class lxml.etree._Element

Bases: object

Element class.

References a document object and a libxml node.

By pointing to a Document instance, a reference is kept to _Document as long as there is some pointer to a node in it.

_init(self)

Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.

addnext(self, element)

Adds the element as a following sibling directly after this element.

This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.

addprevious(self, element)

Adds the element as a preceding sibling directly before this element.

This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.
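
The root-level use of addprevious() and addnext() might look like the sketch below. The comment text and the processing instruction target are made up for illustration.

```python
from lxml import etree

root = etree.XML("<root/>")

comment = etree.Comment("generated file")
comment.tail = "dropped"            # tail text is discarded at root level
root.addprevious(comment)
root.addnext(etree.ProcessingInstruction("done", "yes"))

# Serialise the whole document so that the root's siblings are included.
serialized = etree.tostring(root.getroottree())
```

The siblings of the root element can afterwards be reached with root.getprevious() and root.getnext().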

append(self, element)

Adds a subelement to the end of this element.

clear(self, keep_tail=False)

Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.

Pass keep_tail=True to leave the tail text untouched.
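
A short sketch of the difference between the two modes, using a made-up document:

```python
from lxml import etree

root = etree.XML('<r><a x="1">text<b/></a>tail</r>')
a = root[0]

# Children, attributes and text go away, the tail text stays.
a.clear(keep_tail=True)
state_kept = (len(a), a.text, dict(a.attrib), a.tail)

# A plain clear() also resets the tail.
a.clear()
tail_after = a.tail
```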

cssselect(expr, *, translator='xml')

Run the CSS expression on this element and its children, returning a list of the results.

Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.

extend(self, elements)

Extends the current children by the elements in the iterable.

find(self, path, namespaces=None)

Finds the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findall(self, path, namespaces=None)

Finds all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findtext(self, path, default=None, namespaces=None)

Finds text for the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
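
The three find methods can be sketched together as follows. The document, the namespace URI urn:x and the prefix p are made up; note that the prefix used in the path only has to match the mapping, not the prefix in the document.

```python
from lxml import etree

root = etree.XML(
    '<a xmlns:x="urn:x"><x:b>one</x:b><x:b>two</x:b><c>three</c></a>')
ns = {"p": "urn:x"}

first = root.find("p:b", namespaces=ns)       # first match or None
every = root.findall("p:b", namespaces=ns)    # list of all matches
text = root.findtext("c")                     # text of the first match
fallback = root.findtext("missing", default="n/a")
```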

get(self, key, default=None)

Gets an element attribute.

getchildren(self)

Returns all direct children. The elements are returned in document order.

Deprecated:

Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use list(element) or simply iterate over elements.

getiterator(self, tag=None, *tags)

Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags, see iter.

Deprecated:

Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the element.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.

getnext(self)

Returns the following sibling of this element or None.

getparent(self)

Returns the parent of this element or None for the root element.

getprevious(self)

Returns the preceding sibling of this element or None.
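
A brief sketch of the sibling and parent accessors on a made-up two-child document:

```python
from lxml import etree

root = etree.XML("<r><a/><b/></r>")
a, b = root                      # iterating an element yields its children

next_of_a = a.getnext()          # the <b> element
prev_of_b = b.getprevious()      # the <a> element
parent_of_a = a.getparent()      # the <r> element
parent_of_root = root.getparent()    # None for the root element
last_sibling = b.getnext()           # None, there is no following sibling
```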

getroottree(self)

Return an ElementTree for the root node of the document that contains this element.

This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.

index(self, child, start=None, stop=None)

Find the position of the child within the parent.

This method is not part of the original ElementTree API.

insert(self, index, element)

Inserts a subelement at the given position in this element.

items(self)

Gets element attributes, as a sequence. The attributes are returned in an arbitrary order.

iter(self, tag=None, *tags)

Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".

You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.

Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.
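
The tag matching rules above can be tried out with a sketch like this one. The document and the namespace urn:n are made up for the example.

```python
from lxml import etree

root = etree.XML(
    '<root xmlns:n="urn:n"><a/><n:a/><b><a/></b><!-- note --></root>')

# A plain name only matches elements without a namespace.
plain = [el.tag for el in root.iter("a")]

# "{*}a" matches the local name in any namespace, or in none.
any_ns = [el.tag for el in root.iter("{*}a")]

# Several tags are combined, results stay in document order.
either = [el.tag for el in root.iter("a", "b")]

# Factory functions select by node type instead of by name.
comments = list(root.iter(etree.Comment))
```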

iterancestors(self, tag=None, *tags)

Iterate over the ancestors of this element (from parent to parent).

Can be restricted to find only elements with specific tags, see iter.

iterchildren(self, tag=None, *tags, reversed=False)

Iterate over the children of this element.

As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.

iterdescendants(self, tag=None, *tags)

Iterate over the descendants of this element in document order.

As opposed to el.iter(), this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.

iterfind(self, path, namespaces=None)

Iterates over all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

itersiblings(self, tag=None, *tags, preceding=False)

Iterate over the following or preceding siblings of this element.

The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.

Can be restricted to find only elements with specific tags, see iter.
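
A compact sketch comparing the axis iterators on a small made-up tree:

```python
from lxml import etree

root = etree.XML("<r><a/><b><c/></b><d/></r>")
b = root[1]
c = b[0]
d = root[2]

ancestors = [el.tag for el in c.iterancestors()]            # parent first
children_rev = [el.tag for el in root.iterchildren(reversed=True)]
descendants = [el.tag for el in root.iterdescendants()]     # without <r>
following = [el.tag for el in b.itersiblings()]
preceding = [el.tag for el in d.itersiblings(preceding=True)]
```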

itertext(self, tag=None, *tags, with_tail=True)

Iterates over the text content of a subtree.

You can pass tag names to restrict text content to specific elements, see iter.

You can set the with_tail keyword argument to False to skip over tail text.
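
A sketch of itertext() on a made-up paragraph with mixed content, showing what with_tail=False leaves out:

```python
from lxml import etree

root = etree.XML("<p>Hello <b>big</b> world<i>!</i> bye</p>")

# All text and tail text in the subtree, in document order.
full = "".join(root.itertext())

# Only the .text of each element; " world" and " bye" are tail text.
no_tails = list(root.itertext(with_tail=False))
```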

keys(self)

Gets a list of attribute names. The names are returned in an arbitrary order (just like for an ordinary Python dictionary).

makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with the same document.

remove(self, element)

Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.

replace(self, old_element, new_element)

Replaces a subelement with the element passed as second argument.
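
The child-list methods insert(), index(), replace() and remove() can be sketched together like this. Element names are made up; the last step shows that remove() looks at identity, so an equal-looking element that is not a child is rejected.

```python
from lxml import etree

root = etree.XML("<r><a/><c/></r>")

root.insert(1, etree.Element("b"))           # children: a, b, c
position = root.index(root[2])               # position of <c>
root.replace(root[0], etree.Element("x"))    # children: x, b, c

# An element with the same tag that is not a child cannot be removed.
lookalike = etree.Element("b")
try:
    root.remove(lookalike)
    lookalike_removed = True
except ValueError:
    lookalike_removed = False

root.remove(root[1])                         # children: x, c
tags = [el.tag for el in root]
```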

set(self, key, value)

Sets an element attribute. In HTML documents (not XML or XHTML), the value None is allowed and creates an attribute without value (just the attribute name).

values(self)

Gets element attribute values as a sequence of strings. The attributes are returned in an arbitrary order.
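
A short sketch of the attribute accessors, using a made-up element. Results are sorted here so the example does not depend on the order in which attributes are returned.

```python
from lxml import etree

el = etree.Element("img", src="a.png")
el.set("alt", "logo")

alt = el.get("alt")
missing = el.get("title")                # None when the attribute is absent
with_default = el.get("title", "none")
names = sorted(el.keys())
pairs = sorted(el.items())
values = sorted(el.values())
```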

xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)

Evaluate an xpath expression using the element as context node.
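
A sketch of an element-level XPath call with a prefix mapping and an XPath variable. The document, the namespace urn:x, the prefix p and the variable name wanted are made up for the example.

```python
from lxml import etree

root = etree.XML(
    '<r xmlns:x="urn:x"><x:item id="1">a</x:item><x:item id="2">b</x:item></r>')
ns = {"p": "urn:x"}

# Relative paths start at the element the method is called on.
texts = root.xpath("p:item/text()", namespaces=ns)

# Extra keyword arguments become XPath variables ($wanted).
picked = root.xpath("p:item[@id = $wanted]", namespaces=ns, wanted="2")
```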

attrib

Element attribute dictionary. Where possible, use get(), set(), keys(), values() and items() to access element attributes.

base

The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.

Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.

Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).

nsmap

Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.

Note that changing the returned dict has no effect on the Element.
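
A sketch of how nsmap accumulates declarations from the parents, with made-up namespace URIs. The default namespace is stored under the key None.

```python
from lxml import etree

root = etree.XML('<a xmlns="urn:d" xmlns:p="urn:p"><b xmlns:q="urn:q"/></a>')
child = root[0]

root_map = root.nsmap
child_map = child.nsmap       # own declarations plus inherited ones

# Modifying the returned dict does not touch the element.
root_map_copy = root.nsmap
root_map_copy["z"] = "urn:z"
still_unknown = "z" not in root.nsmap
```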

prefix

Namespace prefix or None.

sourceline

Original line number as found by the parser or None if unknown.

tag

Element tag

tail

Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.

text

Text before the first subelement. This is either a string or the value None, if there was no text.
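
How text and tail split up mixed content can be seen in a sketch like this, using a made-up paragraph:

```python
from lxml import etree

root = etree.XML("<p>Hello <b>big</b> world</p>")
b = root[0]

root_text = root.text     # text before the first child
b_text = b.text           # text inside <b>
b_tail = b.tail           # text after </b>, up to the next tag
root_tail = root.tail     # nothing follows the root element
```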

class lxml.etree._ElementIterator

Bases: _ElementTagMatcher

Dead but public. :)

class lxml.etree._ElementMatchIterator

Bases: object

class lxml.etree._ElementTagMatcher

Bases: object

Dead but public. :)

class lxml.etree._ElementTree

Bases: object

_setroot(self, root)

Relocate the ElementTree to a new root node.

find(self, path, namespaces=None)

Finds the first toplevel element with given tag. Same as tree.getroot().find(path).

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findall(self, path, namespaces=None)

Finds all elements matching the ElementPath expression. Same as getroot().findall(path).

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findtext(self, path, default=None, namespaces=None)

Finds the text for the first element matching the ElementPath expression. Same as getroot().findtext(path).

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

getelementpath(self, element)

Returns a structural, absolute ElementPath expression to find the element. This path can be used in the .find() method to look up the element, provided that the elements along the path and their list of immediate children were not modified in between.

ElementPath has the advantage over an XPath expression (as returned by the .getpath() method) that it does not require additional prefix declarations. It is always self-contained.

getiterator(self, *tags, tag=None)

Returns a sequence or iterator of all elements in document order (depth first pre-order), starting with the root element.

Can be restricted to find only elements with specific tags, see _Element.iter.

Deprecated:

Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the tree.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.

getpath(self, element)

Returns a structural, absolute XPath expression to find the element.

For namespaced elements, the expression uses prefixes from the document, which therefore need to be provided in order to make any use of the expression in XPath.

Also see the method getelementpath(self, element), which returns a self-contained ElementPath expression.
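
A sketch contrasting the two path methods on a made-up namespaced document. The XPath expression from getpath() needs the document's prefix mapping to be usable, while the ElementPath expression from getelementpath() can be passed to find() as it is.

```python
from lxml import etree

root = etree.XML('<a xmlns:x="urn:x"><x:b/><x:b><c/></x:b></a>')
tree = root.getroottree()
target = root[1][0]           # the <c> element

# XPath: prefixes from the document must be supplied again.
xpath_expr = tree.getpath(target)
via_xpath = tree.xpath(xpath_expr, namespaces={"x": "urn:x"})

# ElementPath: self-contained, no prefix mapping required.
element_path = tree.getelementpath(target)
via_find = tree.find(element_path)
```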

getroot(self)

Gets the root element for this tree.

iter(self, tag=None, *tags)

Creates an iterator for the root element. The iterator loops over all elements in this tree, in document order. Note that siblings of the root element (comments or processing instructions) are not returned by the iterator.

Can be restricted to find only elements with specific tags, see _Element.iter.

iterfind(self, path, namespaces=None)

Iterates over all elements matching the ElementPath expression. Same as getroot().iterfind(path).

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

parse(self, source, parser=None, base_url=None)

Updates self with the content of source and returns its root.

relaxng(self, relaxng)

Validate this document using another document.

The relaxng argument is a tree that should contain a Relax NG schema.

Returns True or False, depending on whether validation succeeded.

Note: if you are going to apply the same Relax NG schema against multiple documents, it is more efficient to use the RelaxNG class directly.
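
A minimal sketch of one-off Relax NG validation. The schema and both documents are made up; the schema accepts an a element containing any number of b elements.

```python
from lxml import etree

schema = etree.XML('''\
<element name="a" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="b"><text/></element>
  </zeroOrMore>
</element>''')

good = etree.ElementTree(etree.XML("<a><b></b></a>"))
bad = etree.ElementTree(etree.XML("<a><c></c></a>"))

good_ok = good.relaxng(schema)
bad_ok = bad.relaxng(schema)
```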

write(file, *, encoding=None, method='xml', pretty_print=False, xml_declaration=None, with_tail=True, standalone=None, doctype=None, compression=0, exclusive=False, inclusive_ns_prefixes=None, with_comments=True, strip_text=False, docstring=None)

Write the tree to a filename, file or file-like object.

Defaults to ASCII encoding and writing a declaration as needed.

The keyword argument ‘method’ selects the output method: ‘xml’, ‘html’, ‘text’, ‘c14n’ or ‘c14n2’. Default is ‘xml’.

With method="c14n" (C14N version 1), the options exclusive, with_comments and inclusive_ns_prefixes request exclusive C14N, include comments, and list the inclusive prefixes respectively.

With method="c14n2" (C14N version 2), the with_comments and strip_text options control the output of comments and text space according to C14N 2.0.

Passing a boolean value to the standalone option will output an XML declaration with the corresponding standalone flag.

The doctype option allows passing in a plain string that will be serialised before the XML tree. Note that passing in non well-formed content here will make the XML output non well-formed. Also, an existing doctype in the document tree will not be removed when serialising an ElementTree instance.

The compression option enables GZip compression level 1-9.

The inclusive_ns_prefixes option should be a list of namespace prefix strings (e.g. ['xs', 'xsi']) that will be promoted to the top-level element during exclusive C14N serialisation. This parameter is ignored if exclusive=False.

If exclusive=True and no list is provided, a namespace will only be rendered if it is used by the immediate parent or one of its attributes and its prefix and values have not already been rendered by an ancestor of the namespace node’s parent element.
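
A sketch of writing a tree to an in-memory file object, once as regular XML with a declaration and once in canonical form. The document is made up for the example.

```python
import io
from lxml import etree

tree = etree.ElementTree(etree.XML('<r b="2" a="1"><a>1</a><e/></r>'))

# Regular XML output with an explicit encoding and declaration.
xml_buf = io.BytesIO()
tree.write(xml_buf, encoding="UTF-8", xml_declaration=True)
xml_bytes = xml_buf.getvalue()

# Canonical XML sorts attributes and writes empty elements
# with explicit start and end tags.
c14n_buf = io.BytesIO()
tree.write(c14n_buf, method="c14n")
c14n_bytes = c14n_buf.getvalue()
```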

write_c14n(file, *, exclusive=False, with_comments=True, compression=0, inclusive_ns_prefixes=None)

C14N write of document. Always writes UTF-8.

The compression option enables GZip compression level 1-9.

The inclusive_ns_prefixes option should be a list of namespace prefix strings (e.g. ['xs', 'xsi']) that will be promoted to the top-level element during exclusive C14N serialisation. This parameter is ignored if exclusive=False.

If exclusive=True and no list is provided, a namespace will only be rendered if it is used by the immediate parent or one of its attributes and its prefix and values have not already been rendered by an ancestor of the namespace node’s parent element.

NOTE: This method is deprecated as of lxml 4.4 and will be removed in a future release. Use .write(f, method="c14n") instead.

xinclude(self)

Process the XInclude nodes in this document and include the referenced XML fragments.

There is support for loading files through the file system, HTTP and FTP.

Note that XInclude does not support custom resolvers in Python space due to restrictions of libxml2 <= 2.6.29.

xmlschema(self, xmlschema)

Validate this document using another document.

The xmlschema argument is a tree that should contain an XML Schema.

Returns True or False, depending on whether validation succeeded.

Note: If you are going to apply the same XML Schema against multiple documents, it is more efficient to use the XMLSchema class directly.
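
A minimal sketch of one-off XML Schema validation. The schema and both documents are made up; the schema requires an a element with a single b child.

```python
from lxml import etree

schema = etree.XML('''\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="a" type="AType"/>
  <xsd:complexType name="AType">
    <xsd:sequence>
      <xsd:element name="b" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>''')

good = etree.ElementTree(etree.XML("<a><b></b></a>"))
bad = etree.ElementTree(etree.XML("<a><c></c></a>"))

good_ok = good.xmlschema(schema)
bad_ok = bad.xmlschema(schema)
```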

xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)

XPath evaluate in context of document.

namespaces is an optional dictionary with prefix to namespace URI mappings, used by XPath. extensions defines additional extension functions.

Returns a list (nodeset), or bool, float or string.

In case of a list result, return Element for element nodes, string for text and attribute values.

Note: if you are going to apply multiple XPath expressions against the same document, it is more efficient to use XPathEvaluator directly.
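
The different result types can be sketched as follows, using a made-up document. With the default smart_strings=True, string results also remember the element they came from.

```python
from lxml import etree

tree = etree.ElementTree(etree.XML('<r><i n="1">a</i><i n="2">b</i></r>'))

elements = tree.xpath("//i")                  # list of Elements
attr_values = tree.xpath("//i/@n")            # list of strings
count = tree.xpath("count(//i)")              # float
first_text = tree.xpath("string(//i[1])")     # string
has_missing = tree.xpath("boolean(//missing)")    # bool

# Smart strings know their parent element.
text_node = tree.xpath("//i/text()")[0]
origin = text_node.getparent()
```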

xslt(self, _xslt, extensions=None, access_control=None, **_kw)

Transform this document using another document.

xslt is a tree that should be XSLT. Keyword parameters are XSLT transformation parameters.

Returns the transformed tree.

Note: if you are going to apply the same XSLT stylesheet against multiple documents, it is more efficient to use the XSLT class directly.
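
A minimal sketch of a one-off transformation through the tree method. The stylesheet and the parameter name who are made up; the value is wrapped with XSLT.strparam() because keyword parameters are otherwise evaluated as XPath expressions.

```python
from lxml import etree

stylesheet = etree.XML('''\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="who"/>
  <xsl:template match="/">
    <greeting><xsl:value-of select="concat('Hello, ', $who)"/></greeting>
  </xsl:template>
</xsl:stylesheet>''')

tree = etree.ElementTree(etree.XML("<doc/>"))
result = tree.xslt(stylesheet, who=etree.XSLT.strparam("world"))
greeting = result.getroot().text
```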

docinfo

Information about the document provided by parser and DTD.

parser

The parser that was used to parse the document in this ElementTree.

class lxml.etree._ElementUnicodeResult

Bases: str

capitalize()

Return a capitalized version of the string.

More specifically, make the first character have upper case and the rest lower case.

casefold()

Return a version of the string suitable for caseless comparisons.

center(width, fillchar=' ', /)

Return a centered string of length width.

Padding is done using the specified fill character (default is a space).

count(sub[, start[, end]]) -> int

Return the number of non-overlapping occurrences of substring sub in string S[start:end]. Optional arguments start and end are interpreted as in slice notation.

encode(encoding='utf-8', errors='strict')

Encode the string using the codec registered for encoding.

encoding

The encoding in which to encode the string.

errors

The error handling scheme to use for encoding errors. The default is ‘strict’ meaning that encoding errors raise a UnicodeEncodeError. Other possible values are ‘ignore’, ‘replace’ and ‘xmlcharrefreplace’ as well as any other name registered with codecs.register_error that can handle UnicodeEncodeErrors.

endswith(suffix[, start[, end]]) -> bool

Return True if S ends with the specified suffix, False otherwise. With optional start, test S beginning at that position. With optional end, stop comparing S at that position. suffix can also be a tuple of strings to try.

expandtabs(tabsize=8)

Return a copy where all tab characters are expanded using spaces.

If tabsize is not given, a tab size of 8 characters is assumed.

find(sub[, start[, end]]) -> int

Return the lowest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.

Return -1 on failure.

format(*args, **kwargs) -> str

Return a formatted version of S, using substitutions from args and kwargs. The substitutions are identified by braces (‘{’ and ‘}’).

format_map(mapping) -> str

Return a formatted version of S, using substitutions from mapping. The substitutions are identified by braces (‘{’ and ‘}’).

getparent()
index(sub[, start[, end]]) -> int

Return the lowest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.

Raises ValueError when the substring is not found.

isalnum()

Return True if the string is an alpha-numeric string, False otherwise.

A string is alpha-numeric if all characters in the string are alpha-numeric and there is at least one character in the string.

isalpha()

Return True if the string is an alphabetic string, False otherwise.

A string is alphabetic if all characters in the string are alphabetic and there is at least one character in the string.

isascii()

Return True if all characters in the string are ASCII, False otherwise.

ASCII characters have code points in the range U+0000-U+007F. Empty string is ASCII too.

isdecimal()

Return True if the string is a decimal string, False otherwise.

A string is a decimal string if all characters in the string are decimal and there is at least one character in the string.

isdigit()

Return True if the string is a digit string, False otherwise.

A string is a digit string if all characters in the string are digits and there is at least one character in the string.

isidentifier()

Return True if the string is a valid Python identifier, False otherwise.

Call keyword.iskeyword(s) to test whether string s is a reserved identifier, such as “def” or “class”.

islower()

Return True if the string is a lowercase string, False otherwise.

A string is lowercase if all cased characters in the string are lowercase and there is at least one cased character in the string.

isnumeric()

Return True if the string is a numeric string, False otherwise.

A string is numeric if all characters in the string are numeric and there is at least one character in the string.

isprintable()

Return True if the string is printable, False otherwise.

A string is printable if all of its characters are considered printable in repr() or if it is empty.

isspace()

Return True if the string is a whitespace string, False otherwise.

A string is whitespace if all characters in the string are whitespace and there is at least one character in the string.

istitle()

Return True if the string is a title-cased string, False otherwise.

In a title-cased string, upper- and title-case characters may only follow uncased characters and lowercase characters only cased ones.

isupper()

Return True if the string is an uppercase string, False otherwise.

A string is uppercase if all cased characters in the string are uppercase and there is at least one cased character in the string.

join(iterable, /)

Concatenate any number of strings.

The string whose method is called is inserted in between each given string. The result is returned as a new string.

Example: '.'.join(['ab', 'pq', 'rs']) -> 'ab.pq.rs'

ljust(width, fillchar=' ', /)

Return a left-justified string of length width.

Padding is done using the specified fill character (default is a space).

lower()

Return a copy of the string converted to lowercase.

lstrip(chars=None, /)

Return a copy of the string with leading whitespace removed.

If chars is given and not None, remove characters in chars instead.

static maketrans()

Return a translation table usable for str.translate().

If there is only one argument, it must be a dictionary mapping Unicode ordinals (integers) or characters to Unicode ordinals, strings or None. Character keys will be then converted to ordinals. If there are two arguments, they must be strings of equal length, and in the resulting dictionary, each character in x will be mapped to the character at the same position in y. If there is a third argument, it must be a string, whose characters will be mapped to None in the result.

partition(sep, /)

Partition the string into three parts using the given separator.

This will search for the separator in the string. If the separator is found, returns a 3-tuple containing the part before the separator, the separator itself, and the part after it.

If the separator is not found, returns a 3-tuple containing the original string and two empty strings.

removeprefix(prefix, /)

Return a str with the given prefix string removed if present.

If the string starts with the prefix string, return string[len(prefix):]. Otherwise, return a copy of the original string.

removesuffix(suffix, /)

Return a str with the given suffix string removed if present.

If the string ends with the suffix string and that suffix is not empty, return string[:-len(suffix)]. Otherwise, return a copy of the original string.

replace(old, new, count=-1, /)

Return a copy with all occurrences of substring old replaced by new.

count

Maximum number of occurrences to replace. -1 (the default value) means replace all occurrences.

If the optional argument count is given, only the first count occurrences are replaced.

rfind(sub[, start[, end]]) -> int

Return the highest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.

Return -1 on failure.

rindex(sub[, start[, end]]) -> int

Return the highest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.

Raises ValueError when the substring is not found.

rjust(width, fillchar=' ', /)

Return a right-justified string of length width.

Padding is done using the specified fill character (default is a space).

rpartition(sep, /)

Partition the string into three parts using the given separator.

This will search for the separator in the string, starting at the end. If the separator is found, returns a 3-tuple containing the part before the separator, the separator itself, and the part after it.

If the separator is not found, returns a 3-tuple containing two empty strings and the original string.

rsplit(sep=None, maxsplit=-1)

Return a list of the substrings in the string, using sep as the separator string.

sep

The separator used to split the string.

When set to None (the default value), will split on any whitespace character (including \n \r \t \f and spaces) and will discard empty strings from the result.

maxsplit

Maximum number of splits. -1 (the default value) means no limit.

Splitting starts at the end of the string and works to the front.

rstrip(chars=None, /)

Return a copy of the string with trailing whitespace removed.

If chars is given and not None, remove characters in chars instead.

split(sep=None, maxsplit=-1)

Return a list of the substrings in the string, using sep as the separator string.

sep

The separator used to split the string.

When set to None (the default value), will split on any whitespace character (including \n \r \t \f and spaces) and will discard empty strings from the result.

maxsplit

Maximum number of splits (starting from the left). -1 (the default value) means no limit.

Note, str.split() is mainly useful for data that has been intentionally delimited. With natural text that includes punctuation, consider using the regular expression module.

splitlines(keepends=False)

Return a list of the lines in the string, breaking at line boundaries.

Line breaks are not included in the resulting list unless keepends is given and true.

startswith(prefix[, start[, end]]) -> bool

Return True if S starts with the specified prefix, False otherwise. With optional start, test S beginning at that position. With optional end, stop comparing S at that position. prefix can also be a tuple of strings to try.

strip(chars=None, /)

Return a copy of the string with leading and trailing whitespace removed.

If chars is given and not None, remove characters in chars instead.

swapcase()

Convert uppercase characters to lowercase and lowercase characters to uppercase.

title()

Return a version of the string where each word is titlecased.

More specifically, words start with uppercased characters and all remaining cased characters have lower case.

translate(table, /)

Replace each character in the string using the given translation table.

table

Translation table, which must be a mapping of Unicode ordinals to Unicode ordinals, strings, or None.

The table must implement lookup/indexing via __getitem__, for instance a dictionary or list. If this operation raises LookupError, the character is left untouched. Characters mapped to None are deleted.

upper()

Return a copy of the string converted to uppercase.

zfill(width, /)

Pad a numeric string with zeros on the left, to fill a field of the given width.

The string is never truncated.

attrname
is_attribute
is_tail
is_text
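The following is a minimal sketch of how these attributes might be used, assuming the class documented above is the "smart string" type that xpath() returns when smart_strings=True. The sample document and variable names are illustrative only.

```python
from lxml import etree

# Illustrative document: one element with text, an attribute and a tail.
root = etree.XML('<root><a href="x.html">text</a>tail</root>')

# XPath string results keep a link back to the element they came from.
text = root.xpath("//a/text()")[0]      # the .text of <a>
tail = root.xpath("/root/text()")[0]    # the .tail of <a>
href = root.xpath("//a/@href")[0]       # an attribute value

print(text.is_text, text.getparent().tag)
print(tail.is_tail, tail.getparent().tag)
print(href.is_attribute, href.attrname)

# The inherited str methods still work as usual.
print(text.upper())
```

For tail text, getparent() returns the element that owns the tail, not the enclosing element.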
class lxml.etree._Entity

Bases: __ContentOnlyElement

_init(self)

Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.

addnext(self, element)

Adds the element as a following sibling directly after this element.

This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.

addprevious(self, element)

Adds the element as a preceding sibling directly before this element.

This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.
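As a rough illustration of the root-level use described for addnext() and addprevious(), the sketch below attaches a processing instruction and a comment around a root element. It uses an ordinary element as the root; the stylesheet name and comment text are made up for the example.

```python
from lxml import etree

root = etree.XML("<root/>")

# Hypothetical stylesheet reference and trailing comment.
pi = etree.ProcessingInstruction(
    "xml-stylesheet", 'type="text/xsl" href="style.xsl"'
)
comment = etree.Comment(" end of document ")

root.addprevious(pi)    # goes before the root element
root.addnext(comment)   # goes after the root element

# Serialise the whole document so the root's siblings are included.
serialized = etree.tostring(root.getroottree())
print(serialized)
```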

append(self, value)
clear(self, keep_tail=False)

Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.

Pass keep_tail=True to leave the tail text untouched.

cssselect(expr, *, translator='xml')

Run the CSS expression on this element and its children, returning a list of the results.

Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.

extend(self, elements)

Extends the current children by the elements in the iterable.

find(self, path, namespaces=None)

Finds the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findall(self, path, namespaces=None)

Finds all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findtext(self, path, default=None, namespaces=None)

Finds text for the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

get(self, key, default=None)
getchildren(self)

Returns all direct children. The elements are returned in document order.

Deprecated:

Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use list(element) or simply iterate over elements.

getiterator(self, tag=None, *tags)

Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags, see iter.

Deprecated:

Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the element.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.

getnext(self)

Returns the following sibling of this element or None.

getparent(self)

Returns the parent of this element or None for the root element.

getprevious(self)

Returns the preceding sibling of this element or None.

getroottree(self)

Return an ElementTree for the root node of the document that contains this element.

This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.

index(self, child, start=None, stop=None)

Find the position of the child within the parent.

This method is not part of the original ElementTree API.

insert(self, index, value)
items(self)
iter(self, tag=None, *tags)

Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".

You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.

Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.
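The tag matching rules above can be sketched on an ordinary element tree (content-only nodes such as entities have no subtree to walk). The namespace URI and tag names below are invented for the example.

```python
from lxml import etree

root = etree.XML(
    '<root xmlns:x="urn:example:x">'
    "<a/><x:a/><b><a/></b><!-- note -->"
    "</root>"
)

# "a" means "{}a": only elements without a namespace.
plain = [el.tag for el in root.iter("a")]

# "{*}a" matches the local name in any namespace, or none.
any_ns = [el.tag for el in root.iter("{*}a")]

# "{ns}*" matches every element in one namespace.
in_ns = [el.tag for el in root.iter("{urn:example:x}*")]

# Several tags: elements matching any of them, in document order.
either = [el.tag for el in root.iter("a", "b")]

# A factory function restricts the iteration to that node type.
comments = [c.text for c in root.iter(etree.Comment)]

print(plain, any_ns, in_ns, either, comments)
```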

iterancestors(self, tag=None, *tags)

Iterate over the ancestors of this element (from parent to parent).

Can be restricted to find only elements with specific tags, see iter.

iterchildren(self, tag=None, *tags, reversed=False)

Iterate over the children of this element.

As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.

iterdescendants(self, tag=None, *tags)

Iterate over the descendants of this element in document order.

As opposed to el.iter(), this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.

iterfind(self, path, namespaces=None)

Iterates over all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

itersiblings(self, tag=None, *tags, preceding=False)

Iterate over the following or preceding siblings of this element.

The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.

Can be restricted to find only elements with specific tags, see iter.

itertext(self, tag=None, *tags, with_tail=True)

Iterates over the text content of a subtree.

You can pass tag names to restrict text content to specific elements, see iter.

You can set the with_tail keyword argument to False to skip over tail text.

keys(self)
makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with the same document.

remove(self, element)

Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.

replace(self, old_element, new_element)

Replaces a subelement with the element passed as second argument.

set(self, key, value)
values(self)
xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)

Evaluate an XPath expression using the element as context node.

attrib
base

The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.

Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.

Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).

name
nsmap

Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.

Note that changing the returned dict has no effect on the Element.

prefix

Namespace prefix or None.

sourceline

Original line number as found by the parser or None if unknown.

tag
tail

Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.

text
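A short sketch of the name and text properties, assuming the entity node is created with the etree.Entity() factory. The entity name "nbsp" is arbitrary and is not declared anywhere, so a real document would still need a DTD that defines it.

```python
from lxml import etree

root = etree.Element("root")

# Entity() creates an entity reference node; it does not declare the entity.
entity = etree.Entity("nbsp")
root.append(entity)

print(entity.name)            # the bare entity name
print(entity.text)            # the reference as it appears in markup
print(etree.tostring(root))   # serialised with the reference intact
```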
class lxml.etree._ErrorLog

Bases: _ListErrorLog

clear()
copy()

Creates a shallow copy of this error log and the list of entries.

filter_domains(domains)

Filter the errors by the given domains and return a new error log containing the matches.

filter_from_errors(self)

Convenience method to get all error messages or worse.

filter_from_fatals(self)

Convenience method to get all fatal error messages.

filter_from_level(self, level)

Return a log with all messages of the requested level or worse.

filter_from_warnings(self)

Convenience method to get all warnings or worse.

filter_levels(self, levels)

Filter the errors by the given error levels and return a new error log containing the matches.

filter_types(self, types)

Filter the errors by the given types and return a new error log containing the matches.

receive(entry)
last_error
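One way to try the filter methods is on the error log of a parser that has just failed. This is only a sketch: the malformed input is invented, and it assumes the log exposed by parser.error_log offers the same filter methods as listed here.

```python
from lxml import etree

parser = etree.XMLParser()
try:
    # Deliberately malformed: <a> is never closed.
    etree.fromstring(b"<root><a></root>", parser)
except etree.XMLSyntaxError:
    pass

log = parser.error_log

# Keep only entries at error level or worse.
errors = log.filter_from_errors()

# Keep only entries reported by the parser domain.
parser_only = log.filter_domains(etree.ErrorDomains.PARSER)

print(len(log), len(errors), len(parser_only))
print(log.last_error)
```

Each filter returns a new log, so calls can be chained without changing the original.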
class lxml.etree._FeedParser

Bases: _BaseParser

close(self)

Terminates feeding data to this parser. This tells the parser to process any remaining data in the feed buffer, and then returns the root Element of the tree that was parsed.

This method must be called after passing the last chunk of data into the feed() method. It should only be called when using the feed parser interface; all other usage is undefined.

copy(self)

Create a new parser with the same configuration.

feed(self, data)

Feeds data to the parser. The argument should be an 8-bit string buffer containing encoded data, although Unicode is supported as long as both string types are not mixed.

This is the main entry point to the consumer interface of a parser. The parser will parse as much of the XML stream as it can on each call. To finish parsing or to reset the parser, call the close() method. Both methods may raise ParseError if errors occur in the input data. If an error is raised, there is no longer a need to call close().

The feed parser interface is independent of the normal parser usage. You can use the same parser as a feed parser and in the parse() function concurrently.
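The feed()/close() cycle might look like the sketch below, which uses etree.XMLParser as the concrete parser. The chunk boundaries are arbitrary and chosen to show that a chunk may end in the middle of text content.

```python
from lxml import etree

parser = etree.XMLParser()

# Data can arrive in arbitrary pieces, e.g. from a socket.
chunks = [b"<root><a>te", b"xt</a>", b"<b/></root>"]
for chunk in chunks:
    parser.feed(chunk)

# close() finishes parsing and returns the root element.
root = parser.close()

print(root.tag, [child.tag for child in root], root[0].text)
```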

makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with this parser.

set_element_class_lookup(self, lookup=None)

Set a lookup scheme for element classes generated from this parser.

Reset it by passing None or nothing.

error_log

The error log of the last parser run.

feed_error_log

The error log of the last (or current) run of the feed parser.

Note that this is local to the feed parser and thus is different from what the error_log property returns.

resolvers

The custom resolver registry of this parser.

target
version

The version of the underlying XML parser.

class lxml.etree._IDDict

Bases: object

IDDict(self, etree)

A dictionary-like proxy class that maps ID attributes to elements.

The dictionary must be instantiated with the root element of a parsed XML document, otherwise the behaviour is undefined. Elements and XML trees that were created or modified ‘by hand’ are not supported.

copy()
get(id_name)
has_key(id_name)
items()
iteritems()
iterkeys()
itervalues()
keys()
values()
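A hedged sketch of obtaining such a mapping: it assumes etree.parseid() is used, which returns a tree together with an ID dictionary, and that the ID attributes are declared in an internal DTD subset. The element and attribute names are invented.

```python
import io
from lxml import etree

# The DTD declares which attributes have type ID.
xml = b"""<!DOCTYPE document [
  <!ELEMENT document (h1, p)>
  <!ELEMENT h1 (#PCDATA)>
  <!ATTLIST h1 myid ID #REQUIRED>
  <!ELEMENT p (#PCDATA)>
  <!ATTLIST p someid ID #REQUIRED>
]>
<document>
  <h1 myid="chapter1">Title</h1>
  <p someid="intro">Text</p>
</document>"""

tree, ids = etree.parseid(io.BytesIO(xml))

print(sorted(ids.keys()))
print("chapter1" in ids, ids["chapter1"].tag)
```

As noted above, the tree should not be modified while the ID dictionary is in use.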
class lxml.etree._ListErrorLog

Bases: _BaseErrorLog

Immutable base version of a list based error log.

copy()

Creates a shallow copy of this error log. Reuses the list of entries.

filter_domains(domains)

Filter the errors by the given domains and return a new error log containing the matches.

filter_from_errors(self)

Convenience method to get all error messages or worse.

filter_from_fatals(self)

Convenience method to get all fatal error messages.

filter_from_level(self, level)

Return a log with all messages of the requested level or worse.

filter_from_warnings(self)

Convenience method to get all warnings or worse.

filter_levels(self, levels)

Filter the errors by the given error levels and return a new error log containing the matches.

filter_types(self, types)

Filter the errors by the given types and return a new error log containing the matches.

receive(entry)
last_error
class lxml.etree._LogEntry

Bases: object

A log message entry from an error log.

Attributes:

  • message: the message text

  • domain: the domain ID (see lxml.etree.ErrorDomains)

  • type: the message type ID (see lxml.etree.ErrorTypes)

  • level: the log level ID (see lxml.etree.ErrorLevels)

  • line: the line at which the message originated (if applicable)

  • column: the character column at which the message originated (if applicable)

  • filename: the name of the file in which the message originated (if applicable)

  • path: the location in which the error was found (if available)

column
domain
domain_name

The name of the error domain. See lxml.etree.ErrorDomains

filename

The file path where the report originated, if any.

level
level_name

The name of the error level. See lxml.etree.ErrorLevels

line
message

The log message string.

path

The XPath for the node where the error was detected.

type
type_name

The name of the error type. See lxml.etree.ErrorTypes
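The entry attributes can be inspected by iterating over an error log. The sketch below takes the entries from a parser that failed on an invented, malformed input; the output format is just one possible layout.

```python
from lxml import etree

parser = etree.XMLParser()
try:
    # Deliberately malformed: <a> is never closed.
    etree.fromstring(b"<root>\n  <a>\n</root>", parser)
except etree.XMLSyntaxError:
    pass

entries = list(parser.error_log)
for entry in entries:
    # The *_name properties give readable names for the numeric IDs.
    print(
        f"{entry.filename}:{entry.line}:{entry.column} "
        f"{entry.level_name} {entry.domain_name} {entry.type_name}: "
        f"{entry.message}"
    )
```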

class lxml.etree._ProcessingInstruction

Bases: __ContentOnlyElement

_init(self)

Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.

addnext(self, element)

Adds the element as a following sibling directly after this element.

This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.

addprevious(self, element)

Adds the element as a preceding sibling directly before this element.

This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.

append(self, value)
clear(self, keep_tail=False)

Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.

Pass keep_tail=True to leave the tail text untouched.

cssselect(expr, *, translator='xml')

Run the CSS expression on this element and its children, returning a list of the results.

Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.

extend(self, elements)

Extends the current children by the elements in the iterable.

find(self, path, namespaces=None)

Finds the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findall(self, path, namespaces=None)

Finds all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findtext(self, path, default=None, namespaces=None)

Finds text for the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

get(self, key, default=None)

Try to parse pseudo-attributes from the text content of the processing instruction, search for one with the given key as name and return its associated value.

Note that this is only a convenience method for the most common case that all text content is structured in attribute-like name-value pairs with properly quoted values. It is not guaranteed to work for all possible text content.
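For instance, a stylesheet processing instruction usually carries its data as pseudo-attributes. The sketch below builds one with the etree.ProcessingInstruction() factory; the target and values are illustrative.

```python
from lxml import etree

pi = etree.ProcessingInstruction(
    "xml-stylesheet", 'type="text/xsl" href="style.xsl"'
)

print(pi.target)               # the PI target
print(pi.get("href"))          # value of a pseudo-attribute
print(pi.get("media", "all"))  # default for a missing key
print(pi.attrib)               # all parsed pseudo-attributes
```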

getchildren(self)

Returns all direct children. The elements are returned in document order.

Deprecated:

Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use list(element) or simply iterate over elements.

getiterator(self, tag=None, *tags)

Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags, see iter.

Deprecated:

Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the element.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.

getnext(self)

Returns the following sibling of this element or None.

getparent(self)

Returns the parent of this element or None for the root element.

getprevious(self)

Returns the preceding sibling of this element or None.

getroottree(self)

Return an ElementTree for the root node of the document that contains this element.

This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.

index(self, child, start=None, stop=None)

Find the position of the child within the parent.

This method is not part of the original ElementTree API.

insert(self, index, value)
items(self)
iter(self, tag=None, *tags)

Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".

You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.

Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.

iterancestors(self, tag=None, *tags)

Iterate over the ancestors of this element (from parent to parent).

Can be restricted to find only elements with specific tags, see iter.

iterchildren(self, tag=None, *tags, reversed=False)

Iterate over the children of this element.

As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.

iterdescendants(self, tag=None, *tags)

Iterate over the descendants of this element in document order.

As opposed to el.iter(), this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.

iterfind(self, path, namespaces=None)

Iterates over all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

itersiblings(self, tag=None, *tags, preceding=False)

Iterate over the following or preceding siblings of this element.

The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.

Can be restricted to find only elements with specific tags, see iter.

itertext(self, tag=None, *tags, with_tail=True)

Iterates over the text content of a subtree.

You can pass tag names to restrict text content to specific elements, see iter.

You can set the with_tail keyword argument to False to skip over tail text.

keys(self)
makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with the same document.

remove(self, element)

Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.

replace(self, old_element, new_element)

Replaces a subelement with the element passed as second argument.

set(self, key, value)
values(self)
xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)

Evaluate an XPath expression using the element as context node.

attrib

Returns a dict containing all pseudo-attributes that can be parsed from the text content of this processing instruction. Note that modifying the dict currently has no effect on the XML node, although this is not guaranteed to stay this way.

base

The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.

Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.

Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).

nsmap

Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.

Note that changing the returned dict has no effect on the Element.

prefix

Namespace prefix or None.

sourceline

Original line number as found by the parser or None if unknown.

tag
tail

Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.

target
text
class lxml.etree._RotatingErrorLog

Bases: _ErrorLog

clear()
copy()

Creates a shallow copy of this error log and the list of entries.

filter_domains(domains)

Filter the errors by the given domains and return a new error log containing the matches.

filter_from_errors(self)

Convenience method to get all error messages or worse.

filter_from_fatals(self)

Convenience method to get all fatal error messages.

filter_from_level(self, level)

Return a log with all messages of the requested level or worse.

filter_from_warnings(self)

Convenience method to get all warnings or worse.

filter_levels(self, levels)

Filter the errors by the given error levels and return a new error log containing the matches.

filter_types(self, types)

Filter the errors by the given types and return a new error log containing the matches.

receive(entry)
last_error
class lxml.etree._SaxParserTarget

Bases: object

class lxml.etree._Validator

Bases: object

Base class for XML validators.

_append_log_message(domain, type, level, line, message, filename)
_clear_error_log()
assertValid(self, etree)

Raises DocumentInvalid if the document does not comply with the schema.

assert_(self, etree)

Raises AssertionError if the document does not comply with the schema.

validate(self, etree)

Validate the document using this schema.

Returns True if the document is valid, False if not.

error_log

The log of validation errors and warnings.
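The three validation entry points can be compared using any concrete validator. The sketch below uses etree.DTD as one such subclass; the tiny DTD and documents are made up for the example.

```python
import io
from lxml import etree

# A root element must contain exactly one empty <a> child.
dtd = etree.DTD(io.StringIO("<!ELEMENT root (a)>\n<!ELEMENT a EMPTY>"))

good = etree.XML("<root><a/></root>")
bad = etree.XML("<root/>")

# validate() reports the result as a boolean.
print(dtd.validate(good))
print(dtd.validate(bad))

# After a failed run, the details are in error_log.
print(dtd.error_log.filter_from_errors())

# assertValid() raises instead of returning False.
raised = False
try:
    dtd.assertValid(bad)
except etree.DocumentInvalid as exc:
    raised = True
    print("invalid:", exc)
```

Which method fits best depends on whether the caller prefers a boolean check or exception-based control flow.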

class lxml.etree._XPathEvaluatorBase

Bases: object

error_log
class lxml.etree._XSLTProcessingInstruction

Bases: PIBase

_init(self)

Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.

addnext(self, element)

Adds the element as a following sibling directly after this element.

This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.

addprevious(self, element)

Adds the element as a preceding sibling directly before this element.

This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.

append(self, value)
clear(self, keep_tail=False)

Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.

Pass keep_tail=True to leave the tail text untouched.

cssselect(expr, *, translator='xml')

Run the CSS expression on this element and its children, returning a list of the results.

Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.

extend(self, elements)

Extends the current children by the elements in the iterable.

find(self, path, namespaces=None)

Finds the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findall(self, path, namespaces=None)

Finds all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findtext(self, path, default=None, namespaces=None)

Finds text for the first matching subelement, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

get(self, key, default=None)

Try to parse pseudo-attributes from the text content of the processing instruction, search for one with the given key as name and return its associated value.

Note that this is only a convenience method for the most common case that all text content is structured in attribute-like name-value pairs with properly quoted values. It is not guaranteed to work for all possible text content.

getchildren(self)

Returns all direct children. The elements are returned in document order.

Deprecated:

Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use list(element) or simply iterate over elements.

getiterator(self, tag=None, *tags)

Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags, see iter.

Deprecated:

Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the element.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.

getnext(self)

Returns the following sibling of this element or None.

getparent(self)

Returns the parent of this element or None for the root element.

getprevious(self)

Returns the preceding sibling of this element or None.

getroottree(self)

Return an ElementTree for the root node of the document that contains this element.

This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.

index(self, child, start=None, stop=None)

Find the position of the child within the parent.

This method is not part of the original ElementTree API.

insert(self, index, value)
items(self)
iter(self, tag=None, *tags)

Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".

You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.

Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.
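The tag matching rules above can be sketched as follows, on an ordinary Element with a made-up namespace:

```python
from lxml import etree

root = etree.XML(
    '<root xmlns:x="urn:x"><a/><x:a/><b><a/></b><!-- note --></root>'
)

# "a" means "{}a": only elements without a namespace.
plain_a = [el.tag for el in root.iter("a")]

# "{*}a" matches the local name in any namespace (or none).
any_ns_a = [el.tag for el in root.iter("{*}a")]

# Several tags: everything matching any of them, in document order.
a_or_b = [el.tag for el in root.iter("a", "b")]

# A factory function restricts the iteration to that node type.
comments = [c.text for c in root.iter(etree.Comment)]
```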

iterancestors(self, tag=None, *tags)

Iterate over the ancestors of this element (from parent to parent).

Can be restricted to find only elements with specific tags, see iter.

iterchildren(self, tag=None, *tags, reversed=False)

Iterate over the children of this element.

As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.

iterdescendants(self, tag=None, *tags)

Iterate over the descendants of this element in document order.

As opposed to el.iter(), this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.

iterfind(self, path, namespaces=None)

Iterates over all matching subelements, by tag name or path.

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

itersiblings(self, tag=None, *tags, preceding=False)

Iterate over the following or preceding siblings of this element.

The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.

Can be restricted to find only elements with specific tags, see iter.
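A small sketch of both directions on an ordinary Element:

```python
from lxml import etree

root = etree.XML("<r><a/><b/><c/><d/></r>")
c = root[2]

# Default: following siblings in document order.
following = [el.tag for el in c.itersiblings()]

# preceding=True: start right before the element and walk backwards.
preceding = [el.tag for el in c.itersiblings(preceding=True)]
```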

itertext(self, tag=None, *tags, with_tail=True)

Iterates over the text content of a subtree.

You can pass tag names to restrict text content to specific elements, see iter.

You can set the with_tail keyword argument to False to skip over tail text.
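For illustration, a minimal sketch of the effect of with_tail on an ordinary Element with mixed content:

```python
from lxml import etree

root = etree.XML("<p>Hello <b>bold</b> and <i>italic</i>!</p>")

# Text and tail text of the whole subtree, in document order.
with_tails = list(root.itertext())

# Only the .text of each element, skipping tail text.
without_tails = list(root.itertext(with_tail=False))
```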

keys(self)
makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with the same document.

parseXSL(self, parser=None)

Try to parse the stylesheet referenced by this PI and return an ElementTree for it. If the stylesheet is embedded in the same document (referenced via xml:id), find and return an ElementTree for the stylesheet Element.

The optional parser keyword argument can be passed to specify the parser used to read from external stylesheet URLs.

remove(self, element)

Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.

replace(self, old_element, new_element)

Replaces a subelement with the element passed as second argument.

set(self, key, value)

Supports setting the ‘href’ pseudo-attribute in the text of the processing instruction.

values(self)
xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)

Evaluate an xpath expression using the element as context node.

attrib

Returns a dict containing all pseudo-attributes that can be parsed from the text content of this processing instruction. Note that modifying the dict currently has no effect on the XML node, although this is not guaranteed to stay this way.

base

The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.

Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.

Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).

nsmap

Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.

Note that changing the returned dict has no effect on the Element.

prefix

Namespace prefix or None.

sourceline

Original line number as found by the parser or None if unknown.

tag
tail

Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.

target
text
class lxml.etree._XSLTResultTree

Bases: _ElementTree

The result of an XSLT evaluation.

Use str() or bytes() (or unicode() in Python 2.x) to serialise to a string, and the .write_output() method to serialise to a file.

_setroot(self, root)

Relocate the ElementTree to a new root node.

find(self, path, namespaces=None)

Finds the first toplevel element with given tag. Same as tree.getroot().find(path).

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findall(self, path, namespaces=None)

Finds all elements matching the ElementPath expression. Same as getroot().findall(path).

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

findtext(self, path, default=None, namespaces=None)

Finds the text for the first element matching the ElementPath expression. Same as getroot().findtext(path)

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

getelementpath(self, element)

Returns a structural, absolute ElementPath expression to find the element. This path can be used in the .find() method to look up the element, provided that the elements along the path and their list of immediate children were not modified in between.

ElementPath has the advantage over an XPath expression (as returned by the .getpath() method) that it does not require additional prefix declarations. It is always self-contained.

getiterator(self, *tags, tag=None)

Returns a sequence or iterator of all elements in document order (depth first pre-order), starting with the root element.

Can be restricted to find only elements with specific tags, see _Element.iter.

Deprecated:

Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the tree.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.

getpath(self, element)

Returns a structural, absolute XPath expression to find the element.

For namespaced elements, the expression uses prefixes from the document, which therefore need to be provided in order to make any use of the expression in XPath.

Also see the method getelementpath(self, element), which returns a self-contained ElementPath expression.
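The following sketch round-trips both kinds of path on a plain ElementTree. It deliberately checks only that each expression finds the original element again rather than relying on the exact path strings:

```python
from lxml import etree

root = etree.XML("<root><a><b/><b/></a></root>")
tree = etree.ElementTree(root)
target = root[0][1]  # the second <b>

xpath_expr = tree.getpath(target)           # absolute XPath expression
element_path = tree.getelementpath(target)  # ElementPath expression

# Each expression leads back to the same element.
found_by_xpath = tree.xpath(xpath_expr)[0]
found_by_elementpath = tree.find(element_path)
```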

getroot(self)

Gets the root element for this tree.

iter(self, tag=None, *tags)

Creates an iterator for the root element. The iterator loops over all elements in this tree, in document order. Note that siblings of the root element (comments or processing instructions) are not returned by the iterator.

Can be restricted to find only elements with specific tags, see _Element.iter.

iterfind(self, path, namespaces=None)

Iterates over all elements matching the ElementPath expression. Same as getroot().iterfind(path).

The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.

parse(self, source, parser=None, base_url=None)

Updates self with the content of source and returns its root.

relaxng(self, relaxng)

Validate this document using another document.

The relaxng argument is a tree that should contain a Relax NG schema.

Returns True or False, depending on whether validation succeeded.

Note: if you are going to apply the same Relax NG schema against multiple documents, it is more efficient to use the RelaxNG class directly.
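A minimal sketch of one-off Relax NG validation, using a tiny schema invented for the example:

```python
from lxml import etree

schema = etree.ElementTree(etree.XML(
    '<element name="root" xmlns="http://relaxng.org/ns/structure/1.0">'
    '<oneOrMore><element name="item"><text/></element></oneOrMore>'
    '</element>'
))

valid_doc = etree.ElementTree(etree.XML("<root><item>x</item></root>"))
invalid_doc = etree.ElementTree(etree.XML("<root><wrong/></root>"))

is_valid = valid_doc.relaxng(schema)
is_invalid = invalid_doc.relaxng(schema)
```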

write(file, *, encoding=None, method='xml', pretty_print=False, xml_declaration=None, with_tail=True, standalone=None, doctype=None, compression=0, exclusive=False, inclusive_ns_prefixes=None, with_comments=True, strip_text=False, docstring=None)

Write the tree to a filename, file or file-like object.

Defaults to ASCII encoding and writing a declaration as needed.

The keyword argument ‘method’ selects the output method: ‘xml’, ‘html’, ‘text’, ‘c14n’ or ‘c14n2’. Default is ‘xml’.

With method="c14n" (C14N version 1), the options exclusive, with_comments and inclusive_ns_prefixes request exclusive C14N, include comments, and list the inclusive prefixes respectively.

With method="c14n2" (C14N version 2), the with_comments and strip_text options control the output of comments and text space according to C14N 2.0.

Passing a boolean value to the standalone option will output an XML declaration with the corresponding standalone flag.

The doctype option allows passing in a plain string that will be serialised before the XML tree. Note that passing in non well-formed content here will make the XML output non well-formed. Also, an existing doctype in the document tree will not be removed when serialising an ElementTree instance.

The compression option enables GZip compression level 1-9.

The inclusive_ns_prefixes should be a list of namespace prefix strings (e.g. [‘xs’, ‘xsi’]) that will be promoted to the top-level element during exclusive C14N serialisation. This parameter is ignored if exclusive=False.

If exclusive=True and no list is provided, a namespace will only be rendered if it is used by the immediate parent or one of its attributes and its prefix and values have not already been rendered by an ancestor of the namespace node’s parent element.
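As a sketch of two of the output methods, the example below writes the same small tree to in-memory buffers, once as regular XML with a declaration and once in C14N form:

```python
import io
from lxml import etree

tree = etree.ElementTree(etree.XML("<root><a/>text</root>"))

# Regular XML output with an explicit encoding and declaration.
xml_buf = io.BytesIO()
tree.write(xml_buf, encoding="UTF-8", xml_declaration=True)

# Canonical XML (C14N version 1); empty elements are written as
# start/end tag pairs.
c14n_buf = io.BytesIO()
tree.write(c14n_buf, method="c14n")
```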

write_c14n(file, *, exclusive=False, with_comments=True, compression=0, inclusive_ns_prefixes=None)

C14N write of document. Always writes UTF-8.

The compression option enables GZip compression level 1-9.

The inclusive_ns_prefixes should be a list of namespace prefix strings (e.g. [‘xs’, ‘xsi’]) that will be promoted to the top-level element during exclusive C14N serialisation. This parameter is ignored if exclusive=False.

If exclusive=True and no list is provided, a namespace will only be rendered if it is used by the immediate parent or one of its attributes and its prefix and values have not already been rendered by an ancestor of the namespace node’s parent element.

NOTE: This method is deprecated as of lxml 4.4 and will be removed in a future release. Use .write(f, method="c14n") instead.

write_output(self, file, *, compression=0)

Serialise the XSLT output to a file or file-like object.

As opposed to the generic .write() method, .write_output() serialises the result as defined by the <xsl:output> tag.

xinclude(self)

Process the XInclude nodes in this document and include the referenced XML fragments.

There is support for loading files through the file system, HTTP and FTP.

Note that XInclude does not support custom resolvers in Python space due to restrictions of libxml2 <= 2.6.29.

xmlschema(self, xmlschema)

Validate this document using another document.

The xmlschema argument is a tree that should contain an XML Schema.

Returns True or False, depending on whether validation succeeded.

Note: If you are going to apply the same XML Schema against multiple documents, it is more efficient to use the XMLSchema class directly.

xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)

XPath evaluate in context of document.

namespaces is an optional dictionary with prefix to namespace URI mappings, used by XPath. extensions defines additional extension functions.

Returns a list (nodeset), or bool, float or string.

In case of a list result, return Element for element nodes, string for text and attribute values.

Note: if you are going to apply multiple XPath expressions against the same document, it is more efficient to use XPathEvaluator directly.
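The different result types can be sketched like this. The namespace URI, the "n" prefix and the variable name wanted are assumptions made for the example:

```python
from lxml import etree

tree = etree.ElementTree(etree.XML(
    '<inv xmlns:n="urn:example:inv">'
    '<n:item id="1">one</n:item><n:item id="2">two</n:item>'
    '</inv>'
))
ns = {"n": "urn:example:inv"}

# Keyword arguments become XPath variables ($wanted); result is a list.
texts = tree.xpath("//n:item[@id = $wanted]/text()", namespaces=ns, wanted="2")

# Number and boolean results come back as float and bool.
count = tree.xpath("count(//n:item)", namespaces=ns)
has_three = tree.xpath("boolean(//n:item[@id = '3'])", namespaces=ns)
```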

xslt(self, _xslt, extensions=None, access_control=None, **_kw)

Transform this document using another document.

xslt is a tree that should contain an XSLT stylesheet. Keyword parameters are XSLT transformation parameters.

Returns the transformed tree.

Note: if you are going to apply the same XSLT stylesheet against multiple documents, it is more efficient to use the XSLT class directly.
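A minimal sketch of a one-off transformation with a stylesheet parameter. The stylesheet and the parameter name greeting are invented for the example. Parameter values are evaluated as XPath expressions, so plain strings are wrapped with XSLT.strparam():

```python
from lxml import etree

stylesheet = etree.ElementTree(etree.XML("""
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:param name="greeting" select="'Hello'"/>
  <xsl:template match="/">
    <xsl:value-of select="$greeting"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="/root/name"/>
  </xsl:template>
</xsl:stylesheet>
"""))

doc = etree.ElementTree(etree.XML("<root><name>World</name></root>"))

# Use the stylesheet's default parameter value.
default_result = doc.xslt(stylesheet)

# Override the parameter with a quoted string value.
custom_result = doc.xslt(stylesheet, greeting=etree.XSLT.strparam("Hi"))
```

Calling str() on either result serialises it according to the stylesheet's xsl:output settings.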

docinfo

Information about the document provided by parser and DTD.

parser

The parser that was used to parse the document in this ElementTree.

xslt_profile

Return an ElementTree with profiling data for the stylesheet run.

class lxml.etree.htmlfile(self, output_file, encoding=None, compression=None, close=False, buffered=True)

Bases: xmlfile

A simple mechanism for incremental HTML serialisation. Works the same as xmlfile.

class lxml.etree.iterparse(self, source, events=('end',), tag=None, attribute_defaults=False, dtd_validation=False, load_dtd=False, no_network=True, remove_blank_text=False, remove_comments=False, remove_pis=False, encoding=None, html=False, recover=None, huge_tree=False, schema=None)

Bases: object

Incremental parser.

Parses XML into a tree and generates tuples (event, element) in a SAX-like fashion. event is any of ‘start’, ‘end’, ‘start-ns’, ‘end-ns’.

For ‘start’ and ‘end’, element is the Element that the parser just found opening or closing. For ‘start-ns’, it is a tuple (prefix, URI) of a new namespace declaration. For ‘end-ns’, it is simply None. Note that all start and end events are guaranteed to be properly nested.

The keyword argument events specifies a sequence of event type names that should be generated. By default, only ‘end’ events will be generated.

The additional tag argument restricts the ‘start’ and ‘end’ events to those elements that match the given tag. The tag argument can also be a sequence of tags to allow matching more than one tag. By default, events are generated for all elements. Note that the ‘start-ns’ and ‘end-ns’ events are not impacted by this restriction.

The other keyword arguments in the constructor are mainly based on the libxml2 parser configuration. A DTD will also be loaded if validation or attribute default values are requested.

Available boolean keyword arguments:
  • attribute_defaults: read default attributes from DTD

  • dtd_validation: validate (if DTD is available)

  • load_dtd: use DTD for parsing

  • no_network: prevent network access for related files

  • remove_blank_text: discard blank text nodes

  • remove_comments: discard comments

  • remove_pis: discard processing instructions

  • strip_cdata: replace CDATA sections by normal text content (default: True)

  • compact: save memory for short text content (default: True)

  • resolve_entities: replace entities by their text value (default: True)

  • huge_tree: disable security restrictions and support very deep trees and very long text content (only affects libxml2 2.7+)

  • html: parse input as HTML (default: XML)

  • recover: try hard to parse through broken input (default: True for HTML, False otherwise)

Other keyword arguments:
  • encoding: override the document encoding

  • schema: an XMLSchema to validate against
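A short sketch of incremental parsing with a tag filter, reading from an in-memory buffer. Clearing each element after use is a common way to keep memory low, not something the parser requires:

```python
import io
from lxml import etree

xml = b"<root><item>1</item><item>2</item><other/></root>"

values = []
# Only 'end' events for <item> elements are generated.
for event, elem in etree.iterparse(io.BytesIO(xml), events=("end",), tag="item"):
    values.append(elem.text)
    elem.clear()  # free the content that has already been processed
```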

makeelement(self, _tag, attrib=None, nsmap=None, **_extra)

Creates a new element associated with this parser.

set_element_class_lookup(self, lookup=None)

Set a lookup scheme for element classes generated from this parser.

Reset it by passing None or nothing.

error_log

The error log of the last (or current) parser run.

resolvers

The custom resolver registry of the last (or current) parser run.

root
version

The version of the underlying XML parser.

class lxml.etree.iterwalk(self, element_or_tree, events=('end',), tag=None)

Bases: object

A tree walker that generates events from an existing tree as if it was parsing XML data with iterparse().

Just as for iterparse(), the tag argument can be a single tag or a sequence of tags.

After receiving a ‘start’ or ‘start-ns’ event, the children and descendants of the current element can be excluded from iteration by calling the skip_subtree() method.

skip_subtree()

Prevent descending into the current subtree. Instead, the next returned event will be the ‘end’ event of the current element (if included), ignoring any children or descendants.

This has no effect right after an ‘end’ or ‘end-ns’ event.
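The effect of skip_subtree() can be sketched as follows, with element names made up for the example:

```python
from lxml import etree

root = etree.XML("<root><skip><inner/></skip><keep><child/></keep></root>")

walker = etree.iterwalk(root, events=("start",))
seen = []
for event, elem in walker:
    seen.append(elem.tag)
    if elem.tag == "skip":
        # Do not descend into <skip>; <inner> is never reported.
        walker.skip_subtree()
```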

class lxml.etree.xmlfile(self, output_file, encoding=None, compression=None, close=False, buffered=True)

Bases: object

A simple mechanism for incremental XML serialisation.

Usage example:

with xmlfile("somefile.xml", encoding='utf-8') as xf:
    xf.write_declaration(standalone=True)
    xf.write_doctype('<!DOCTYPE root SYSTEM "some.dtd">')

    # generate an element (the root element)
    with xf.element('root'):
         # write a complete Element into the open root element
         xf.write(etree.Element('test'))

         # generate and write more Elements, e.g. through iterparse
         for element in generate_some_elements():
             # serialise generated elements into the XML file
             xf.write(element)

         # or write multiple Elements or strings at once
         xf.write(etree.Element('start'), "text", etree.Element('end'))

If ‘output_file’ is a file(-like) object, passing close=True will close it when exiting the context manager. By default, it is left to the owner to do that. When a file path is used, lxml will take care of opening and closing the file itself. Also, when a compression level is set, lxml will deliberately close the file to make sure all data gets compressed and written.

Setting buffered=False will flush the output after each operation, such as opening or closing an xf.element() block or calling xf.write(). Alternatively, calling xf.flush() can be used to explicitly flush any pending output when buffering is enabled.

lxml.etree.Comment(text=None)

Comment element factory. This factory function creates a special element that will be serialized as an XML comment.

lxml.etree.Element(_tag, attrib=None, nsmap=None, **_extra)

Element factory. This function returns an object implementing the Element interface.

Also look at the _Element.makeelement() and _BaseParser.makeelement() methods, which provide a faster way to create an Element within a specific document or parser context.

lxml.etree.ElementTree(element=None, file=None, parser=None)

ElementTree wrapper class.

lxml.etree.Entity(name)

Entity factory. This factory function creates a special element that will be serialized as an XML entity reference or character reference. Note, however, that entities will not be automatically declared in the document. A document that uses entity references requires a DTD to define the entities.

lxml.etree.Extension(module, function_mapping=None, ns=None)

Build a dictionary of extension functions from the functions defined in a module or the methods of an object.

As second argument, you can pass an additional mapping of attribute names to XPath function names, or a list of function names that should be taken.

The ns keyword argument accepts a namespace URI for the XPath functions.

lxml.etree.FunctionNamespace(ns_uri)

Retrieve the function namespace object associated with the given URI.

Creates a new one if it does not yet exist. A function namespace can only be used to register extension functions.

Usage:

>>> ns_functions = FunctionNamespace("http://schema.org/Movie")
>>> @ns_functions  # uses function name
... def add2(x):
...     return x + 2
>>> @ns_functions("add3")  # uses explicit name
... def add_three(x):
...     return x + 3
lxml.etree.HTML(text, parser=None, base_url=None)

Parses an HTML document from a string constant. Returns the root node (or the result returned by a parser target). This function can be used to embed “HTML literals” in Python code.

To override the parser with a different HTMLParser you can pass it to the parser keyword argument.

The base_url keyword argument allows setting the original base URL of the document to support relative paths when looking up external entities (DTD, XInclude, …).

lxml.etree.PI(target, text=None)

ProcessingInstruction(target, text=None)

ProcessingInstruction element factory. This factory function creates a special element that will be serialized as an XML processing instruction.

lxml.etree.ProcessingInstruction(target, text=None)

ProcessingInstruction element factory. This factory function creates a special element that will be serialized as an XML processing instruction.

lxml.etree.SubElement(_parent, _tag, attrib=None, nsmap=None, **_extra)

Subelement factory. This function creates an element instance, and appends it to an existing element.

lxml.etree.XML(text, parser=None, base_url=None)

Parses an XML document or fragment from a string constant. Returns the root node (or the result returned by a parser target). This function can be used to embed “XML literals” in Python code, like in

>>> root = XML("<root><test/></root>")
>>> print(root.tag)
root

To override the parser with a different XMLParser you can pass it to the parser keyword argument.

The base_url keyword argument allows setting the original base URL of the document to support relative paths when looking up external entities (DTD, XInclude, …).

lxml.etree.XMLDTDID(text, parser=None, base_url=None)

Parse the text and return a tuple (root node, ID dictionary). The root node is the same as returned by the XML() function. The dictionary contains string-element pairs. The dictionary keys are the values of ID attributes as defined by the DTD. The elements referenced by the ID are stored as dictionary values.

Note that you must not modify the XML tree if you use the ID dictionary. The results are undefined.

lxml.etree.XMLID(text, parser=None, base_url=None)

Parse the text and return a tuple (root node, ID dictionary). The root node is the same as returned by the XML() function. The dictionary contains string-element pairs. The dictionary keys are the values of ‘id’ attributes. The elements referenced by the ID are stored as dictionary values.
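A minimal sketch of looking up elements through the ID dictionary:

```python
from lxml import etree

root, ids = etree.XMLID('<root><a id="x"/><b id="y"/></root>')

# The dictionary maps 'id' attribute values to elements.
element_y = ids["y"]
```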

lxml.etree.XPathEvaluator(etree_or_element, namespaces=None, extensions=None, regexp=True, smart_strings=True)

Creates an XPath evaluator for an ElementTree or an Element.

The resulting object can be called with an XPath expression as argument and XPath variables provided as keyword arguments.

Additional namespace declarations can be passed with the ‘namespaces’ keyword argument. EXSLT regular expression support can be disabled with the ‘regexp’ boolean keyword (defaults to True). Smart strings will be returned for string results unless you pass smart_strings=False.
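As a sketch, an evaluator can be created once and then called with several expressions. The namespace URI and the variable name val are assumptions for the example:

```python
from lxml import etree

root = etree.XML('<r xmlns:n="urn:n"><n:v>1</n:v><n:v>2</n:v></r>')

# The namespace mapping is bound once, at construction time.
evaluate = etree.XPathEvaluator(root, namespaces={"n": "urn:n"})

count = evaluate("count(//n:v)")
# XPath variables are passed as keyword arguments on each call.
picked = evaluate("//n:v[. = $val]/text()", val="2")
```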

lxml.etree.adopt_external_document(capsule, parser=None)

Unpack a libxml2 document pointer from a PyCapsule and wrap it in an lxml ElementTree object.

This allows external libraries to build XML/HTML trees using libxml2 and then pass them efficiently into lxml for further processing.

If a parser is provided, it will be used for configuring the lxml document. No parsing will be done.

The capsule must have the name "libxml2:xmlDoc" and its pointer value must reference a correct libxml2 document of type xmlDoc*. The creator of the capsule must take care to correctly clean up the document using an appropriate capsule destructor. By default, the libxml2 document will be copied to let lxml safely own the memory of the internal tree that it uses.

If the capsule context is non-NULL, it must point to a C string that can be compared using strcmp(). If the context string equals "destructor:xmlFreeDoc", the libxml2 document will not be copied but the capsule invalidated instead by clearing its destructor and name. That way, lxml takes ownership of the libxml2 document in memory without creating a copy first, and the capsule destructor will not be called. The document will then eventually be cleaned up by lxml using the libxml2 API function xmlFreeDoc() once it is no longer used.

If no copy is made, later modifications of the tree outside of lxml should not be attempted after transferring the ownership.

lxml.etree.canonicalize(xml_data=None, *, out=None, from_file=None, **options)

Convert XML to its C14N 2.0 serialised form.

If out is provided, it must be a file or file-like object that receives the serialised canonical XML output (text, not bytes) through its .write() method. To write to a file, open it in text mode with encoding “utf-8”. If out is not provided, this function returns the output as text string.

Either xml_data (an XML string, tree or Element) or from_file (a file path or file-like object) must be provided as input.

The configuration options are the same as for the C14NWriterTarget.
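A small sketch of the string-in, string-out usage, plus the strip_text option:

```python
from lxml import etree

# Attributes are sorted and empty elements expanded.
canonical = etree.canonicalize('<root b="2" a="1"><x/></root>')

# strip_text=True removes leading and trailing whitespace from text.
stripped = etree.canonicalize("<root>\n  <x> hi </x>\n</root>", strip_text=True)
```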

lxml.etree.cleanup_namespaces(tree_or_element, top_nsmap=None, keep_ns_prefixes=None)

Remove all namespace declarations from a subtree that are not used by any of the elements or attributes in that tree.

If a ‘top_nsmap’ is provided, it must be a mapping from prefixes to namespace URIs. These namespaces will be declared on the top element of the subtree before running the cleanup, which allows moving namespace declarations to the top of the tree.

If a ‘keep_ns_prefixes’ is provided, it must be a list of prefixes. These prefixes will not be removed as part of the cleanup.
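For illustration, a sketch that removes an unused declaration. Both namespace URIs are made up:

```python
from lxml import etree

root = etree.XML(
    '<root xmlns:unused="urn:unused" xmlns:used="urn:used"><used:a/></root>'
)

etree.cleanup_namespaces(root)
remaining = dict(root.nsmap)
```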

lxml.etree.clear_error_log()

Clear the global error log. Note that this log is already limited to a fixed size.

Note: since lxml 2.2, the global error log is local to a thread and this function will only clear the global error log of the current thread.

lxml.etree.dump(elem, pretty_print=True, with_tail=True)

Writes an element tree or element structure to sys.stdout. This function should be used for debugging only.

lxml.etree.fromstring(text, parser=None, base_url=None)

Parses an XML document or fragment from a string. Returns the root node (or the result returned by a parser target).

To override the default parser with a different parser you can pass it to the parser keyword argument.

The base_url keyword argument allows setting the original base URL of the document to support relative paths when looking up external entities (DTD, XInclude, …).

lxml.etree.fromstringlist(strings, parser=None)

Parses an XML document from a sequence of strings. Returns the root node (or the result returned by a parser target).

To override the default parser with a different parser you can pass it to the parser keyword argument.

lxml.etree.get_default_parser()
lxml.etree.indent(tree, space='  ', level=0)

Indent an XML document by inserting newlines and indentation space after elements.

tree is the ElementTree or Element to modify. The (root) element itself will not be changed, but the tail text of all elements in its subtree will be adapted.

space is the whitespace to insert for each indentation level, two space characters by default.

level is the initial indentation level. Setting this to a higher value than 0 can be used for indenting subtrees that are more deeply nested inside of a document.
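A minimal sketch of in-place indentation with the default two-space setting:

```python
from lxml import etree

root = etree.XML("<root><a><b/></a><c/></root>")

# Modifies the text and tail whitespace inside the tree in place.
etree.indent(root)
pretty = etree.tostring(root, encoding="unicode")
```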

lxml.etree.iselement(element)

Checks if an object appears to be a valid element object.

lxml.etree.parse(source, parser=None, base_url=None)

Return an ElementTree object loaded with source elements. If no parser is provided as second argument, the default parser is used.

The source can be any of the following:

  • a file name/path

  • a file object

  • a file-like object

  • a URL using the HTTP or FTP protocol

To parse from a string, use the fromstring() function instead.

Note that it is generally faster to parse from a file path or URL than from an open file object or file-like object. Transparent decompression from gzip compressed sources is supported (unless explicitly disabled in libxml2).

The base_url keyword allows setting a URL for the document when parsing from a file-like object. This is needed when looking up external entities (DTD, XInclude, …) with relative paths.
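A short sketch of parsing from a file-like object with an explicitly configured parser. An in-memory buffer stands in for a real file here:

```python
import io
from lxml import etree

source = io.BytesIO(b"<root>\n  <a/>\n</root>")

# A custom parser that drops ignorable whitespace between elements.
parser = etree.XMLParser(remove_blank_text=True)
tree = etree.parse(source, parser)

serialized = etree.tostring(tree)
```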

lxml.etree.parseid(source, parser=None)

Parses the source into a tuple containing an ElementTree object and an ID dictionary. If no parser is provided as second argument, the default parser is used.

Note that you must not modify the XML tree if you use the ID dictionary. The results are undefined.

lxml.etree.register_namespace(prefix, uri)

Registers a namespace prefix that newly created Elements in that namespace will use. The registry is global, and any existing mapping for either the given prefix or the namespace URI will be removed.
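As a sketch, registering a prefix changes how newly created elements in that namespace are serialised. The prefix "ex" and the URI are assumptions for the example, and the registration affects the whole process:

```python
from lxml import etree

etree.register_namespace("ex", "urn:example:reg")

# New elements in this namespace pick up the registered prefix.
element = etree.Element("{urn:example:reg}a")
serialized = etree.tostring(element)
```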

lxml.etree.set_default_parser(parser=None)

Set a default parser for the current thread. This parser is used globally whenever no parser is supplied to the various parse functions of the lxml API. If this function is called without a parser (or if it is None), the default parser is reset to the original configuration.

Note that the pre-installed default parser is not thread-safe. Avoid the default parser in multi-threaded environments. You can create a separate parser for each thread explicitly or use a parser pool.

lxml.etree.set_element_class_lookup(lookup=None)

Set the global element class lookup method.

This defines the main entry point for looking up element implementations. The standard implementation uses the ParserBasedElementClassLookup to delegate to different lookup schemes for each parser.

Warning

This should only be changed by applications, not by library packages. In most cases, parser specific lookups should be preferred, which can be configured via the set_element_class_lookup() method of the parser (XML and HTML parsers alike).

Globally replacing the element class lookup by something other than a ParserBasedElementClassLookup will prevent parser specific lookup schemes from working. Several tools rely on parser specific lookups, including lxml.html and lxml.objectify.

lxml.etree.strip_attributes(tree_or_element, *attribute_names)

Delete all attributes with the provided attribute names from an Element (or ElementTree) and its descendants.

Attribute names can contain wildcards as in _Element.iter.

Example usage:

strip_attributes(root_element,
                 'simpleattr',
                 '{http://some/ns}attrname',
                 '{http://other/ns}*')
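A small runnable variant of the call above may help; the document and attribute names are made up for the example.

```python
from lxml import etree

root = etree.fromstring(
    '<root id="1" class="x"><child id="2" keep="yes"/></root>'
)

# Removes the named attributes from root and from all descendants.
etree.strip_attributes(root, "id", "class")

result = etree.tostring(root)
print(result)
```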
lxml.etree.strip_elements(tree_or_element, *tag_names, with_tail=True)

Delete all elements with the provided tag names from a tree or subtree. This will remove the elements and their entire subtree, including all their attributes, text content and descendants. It will also remove the tail text of the element unless you explicitly set the with_tail keyword argument option to False.

Tag names can contain wildcards as in _Element.iter.

Note that this will not delete the element (or ElementTree root element) that you passed even if it matches. It will only treat its descendants. If you want to include the root element, check its tag name directly before even calling this function.

Example usage:

strip_elements(some_element,
    'simpletagname',             # non-namespaced tag
    '{http://some/ns}tagname',   # namespaced tag
    '{http://some/other/ns}*',   # any tag from a namespace
    lxml.etree.Comment           # comments
    )
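To make the effect of with_tail concrete, here is a minimal sketch on a made-up fragment, run once with the default and once with with_tail=False.

```python
from lxml import etree

xml = "<p>Hello<b>bold</b> world</p>"

# Default: the <b> element, its content and its tail text are removed.
drop_tail = etree.fromstring(xml)
etree.strip_elements(drop_tail, "b")

# with_tail=False: the element goes away, its tail text stays in the parent.
keep_tail = etree.fromstring(xml)
etree.strip_elements(keep_tail, "b", with_tail=False)

print(etree.tostring(drop_tail))
print(etree.tostring(keep_tail))
```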
lxml.etree.strip_tags(tree_or_element, *tag_names)

Delete all elements with the provided tag names from a tree or subtree. This will remove the elements and their attributes, but not their text/tail content or descendants. Instead, it will merge the text content and children of the element into its parent.

Tag names can contain wildcards as in _Element.iter.

Note that this will not delete the element (or ElementTree root element) that you passed even if it matches. It will only treat its descendants.

Example usage:

strip_tags(some_element,
    'simpletagname',             # non-namespaced tag
    '{http://some/ns}tagname',   # namespaced tag
    '{http://some/other/ns}*',   # any tag from a namespace
    Comment                      # comments (including their text!)
    )
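In contrast to strip_elements, only the tags themselves disappear. A minimal sketch on a made-up fragment:

```python
from lxml import etree

root = etree.fromstring("<p>Hello <b>bold <i>and italic</i></b> world</p>")

# Remove the <b> tag; its text, children and tail are merged into <p>.
etree.strip_tags(root, "b")

result = etree.tostring(root)
print(result)
```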
lxml.etree.tostring(element_or_tree, *, encoding=None, method='xml', xml_declaration=None, pretty_print=False, with_tail=True, standalone=None, doctype=None, exclusive=False, inclusive_ns_prefixes=None, with_comments=True, strip_text=False)

Serialize an element to an encoded string representation of its XML tree.

Defaults to ASCII encoding without XML declaration. This behaviour can be configured with the keyword arguments ‘encoding’ (string) and ‘xml_declaration’ (bool). Note that changing the encoding to a non UTF-8 compatible encoding will enable a declaration by default.

You can also serialise to a Unicode string without declaration by passing the name 'unicode' as encoding (or the str function in Py3 or unicode in Py2). This changes the return value from a byte string to an unencoded unicode string.

The keyword argument ‘pretty_print’ (bool) enables formatted XML.

The keyword argument ‘method’ selects the output method: ‘xml’, ‘html’, plain ‘text’ (text content without tags), ‘c14n’ or ‘c14n2’. Default is ‘xml’.

With method="c14n" (C14N version 1), the options exclusive, with_comments and inclusive_ns_prefixes request exclusive C14N, include comments, and list the inclusive prefixes respectively.

With method="c14n2" (C14N version 2), the with_comments and strip_text options control the output of comments and text space according to C14N 2.0.
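As a sketch of the C14N version 1 path, the example below canonicalises a made-up element and drops its comment. Canonical form sorts attributes and writes empty elements with explicit end tags.

```python
from lxml import etree

root = etree.fromstring('<a z="1" b="2"><!-- note --><b/></a>')

# C14N 1.0 without comments; no encoding may be passed with this method.
canonical = etree.tostring(root, method="c14n", with_comments=False)
print(canonical)
```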

Passing a boolean value to the standalone option will output an XML declaration with the corresponding standalone flag.

The doctype option allows passing in a plain string that will be serialised before the XML tree. Note that passing in non well-formed content here will make the XML output non well-formed. Also, an existing doctype in the document tree will not be removed when serialising an ElementTree instance.

You can prevent the tail text of the element from being serialised by passing the boolean with_tail option. This has no impact on the tail text of children, which will always be serialised.

lxml.etree.tostringlist(element_or_tree, *args, **kwargs)

Serialize an element to an encoded string representation of its XML tree, stored in a list of partial strings.

This is purely for ElementTree 1.3 compatibility. The result is a single string wrapped in a list.
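A short sketch of what that means in practice: joining the returned list gives the same bytes as tostring.

```python
from lxml import etree

root = etree.fromstring("<a><b/></a>")

parts = etree.tostringlist(root)
joined = b"".join(parts)

print(len(parts), joined)
```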

lxml.etree.tounicode(element_or_tree, *, method='xml', pretty_print=False, with_tail=True, doctype=None)

Serialize an element to the Python unicode representation of its XML tree.

Deprecated: use tostring(el, encoding='unicode') instead.

Note that the result does not carry an XML encoding declaration and is therefore not necessarily suited for serialization to byte streams without further treatment.

The boolean keyword argument ‘pretty_print’ enables formatted XML.

The keyword argument ‘method’ selects the output method: ‘xml’, ‘html’ or plain ‘text’.

You can prevent the tail text of the element from being serialised by passing the boolean with_tail option. This has no impact on the tail text of children, which will always be serialised.
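Since this function is deprecated, the sketch below uses the recommended tostring(el, encoding='unicode') replacement with the same options. The sample element is made up.

```python
from lxml import etree

root = etree.fromstring("<a><b>text</b></a>")

# Formatted XML as a str.
pretty = etree.tostring(root, encoding="unicode", pretty_print=True)

# Text content only, without tags.
plain = etree.tostring(root, encoding="unicode", method="text")

print(pretty)
print(plain)
```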

lxml.etree.use_global_python_log(log)

Replace the global error log by an etree.PyErrorLog that uses the standard Python logging package.

Note that this disables access to the global error log from exceptions. Parsers, XSLT etc. will continue to provide their normal local error log.

Note: prior to lxml 2.2, this changed the error log globally. Since lxml 2.2, the global error log is local to a thread and this function will only set the global error log of the current thread.
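A minimal sketch of installing such a log for the current thread. The logger name "lxml.errors" is an arbitrary choice for this example, and the malformed input is only there to trigger a parse error.

```python
import logging
from lxml import etree

logging.basicConfig(level=logging.WARNING)

# Route lxml's global error log to the standard logging package.
log = etree.PyErrorLog(logger_name="lxml.errors")  # hypothetical logger name
etree.use_global_python_log(log)

caught = None
try:
    etree.fromstring("<a><b></a>")  # deliberately malformed
except etree.XMLSyntaxError as exc:
    # The parser still provides its own local error information.
    caught = exc
    print("parse failed:", exc)
```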