lxml.objectify module¶

The lxml.objectify module implements a Python object API for XML. It is based on lxml.etree.

class lxml.objectify.BoolElement¶

Bases: lxml.objectify.IntElement

Boolean type based on the string values ‘true’ or ‘false’.

Note that this inherits from IntElement to mimic the behaviour of Python’s bool type.

_init(self)¶: Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.

pyval¶
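A minimal sketch of how boolean text might surface through this class, assuming the default objectify parser and type guessing; the tag names `on` and `off` are invented for illustration:

```python
from lxml import objectify

# The tag names <on> and <off> are made up for this sketch.
root = objectify.fromstring("<root><on>true</on><off>false</off></root>")

on_is_bool = isinstance(root.on, objectify.BoolElement)
on_value = root.on.pyval      # Python bool parsed from the text 'true'
off_value = root.off.pyval    # Python bool parsed from the text 'false'
on_plus_one = root.on + 1     # numeric behaviour inherited via IntElement
```

The arithmetic on the last line mirrors how Python's own bool behaves as an int subtype.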

class lxml.objectify.ElementMaker(self, namespace=None, nsmap=None, annotate=True, makeelement=None)¶

Bases: object

An ElementMaker that can be used for constructing trees.

Example:

>>> from lxml.objectify import ElementMaker
>>> M = ElementMaker(annotate=False)
>>> attributes = {'class': 'par'}
>>> html = M.html( M.body( M.p('hello', attributes, M.br, 'objectify', style="font-weight: bold") ) )

>>> from lxml.etree import tostring
>>> print(tostring(html, method='html').decode('ascii'))
<html><body><p style="font-weight: bold" class="par">hello<br>objectify</p></body></html>

To create tags that are not valid Python identifiers, call the factory directly and pass the tag name as first argument:

>>> root = M('tricky-tag', 'some text')
>>> print(root.tag)
tricky-tag
>>> print(root.text)
some text

Note that this module has a predefined ElementMaker instance called E.
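As a rough illustration of the predefined instance, the following sketch builds a small tree with E, which annotates values by default; the tag names `number` and `label` are hypothetical:

```python
from lxml import objectify

E = objectify.E  # predefined ElementMaker instance

# Tag names 'number' and 'label' are invented for this sketch.
root = E.root(
    E.number(5),
    E.label("hello", lang="en"),
)

number_value = root.number.pyval      # typed access to the annotated value
label_value = root.label.pyval
label_lang = root.label.get("lang")   # keyword arguments become XML attributes
```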

class lxml.objectify.FloatElement¶

Bases: lxml.objectify.NumberElement

_init(self)¶: Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.

class lxml.objectify.IntElement¶

Bases: lxml.objectify.NumberElement

_init(self)¶: Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.

class lxml.objectify.LongElement¶

Bases: lxml.objectify.NumberElement

_init(self)¶: Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.

class lxml.objectify.NoneElement¶

Bases: lxml.objectify.ObjectifiedDataElement

pyval¶

class lxml.objectify.NumberElement¶

Bases: lxml.objectify.ObjectifiedDataElement

_setValueParser(function)¶

Set the function that parses the Python value from a string.

Do not use this unless you know what you are doing.

pyval¶
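The number classes are meant to behave like their Python counterparts in expressions. A small sketch, assuming default type guessing and the made-up tag names `i` and `f`:

```python
from lxml import objectify

# <i> holds integer text, <f> holds float text (tag names are illustrative).
root = objectify.fromstring("<root><i>5</i><f>2.5</f></root>")

int_is_number = isinstance(root.i, objectify.NumberElement)
float_is_float = isinstance(root.f, objectify.FloatElement)

total = root.i + root.f        # arithmetic works on the parsed values
doubled = root.i * 2
f_is_smaller = root.f < root.i
```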

class lxml.objectify.ObjectPath(path)¶

Bases: object

Immutable object that represents a compiled object path.

Example for a path: ‘root.child[1].{other}child[25]’

addattr(self, root, value)¶

Append a value to the target element in a subtree.

If any of the children on the path does not exist, it is created.

hasattr(self, root)¶

setattr(self, root, value)¶

Set the value of the target element in a subtree.

If any of the children on the path does not exist, it is created.

find¶
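The ObjectPath methods above can be sketched as follows. The tree layout and the path strings `root.b.c` and `.x.y` are assumptions chosen for illustration; a leading dot makes the path relative to whatever root element is passed in:

```python
from lxml import objectify

root = objectify.fromstring("<root><b><c>1</c></b></root>")

# Absolute path: the first segment must match the root tag.
path = objectify.ObjectPath("root.b.c")
exists = path.hasattr(root)
found_value = path.find(root).pyval

# Relative path to elements that do not exist yet.
new_path = objectify.ObjectPath(".x.y")
existed_before = new_path.hasattr(root)
new_path.setattr(root, "first")    # creates <x> and <y> as needed
new_path.addattr(root, "second")   # appends another <y> next to the first
y_texts = [y.text for y in root.x.y]
```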

class lxml.objectify.ObjectifiedDataElement¶

Bases: lxml.objectify.ObjectifiedElement

This is the base class for all data type Elements. Subclasses should override the ‘pyval’ property and possibly the __str__ method.

_setText(s)¶: For use in subclasses only. Don’t use unless you know what you are doing.

pyval¶

class lxml.objectify.ObjectifiedElement¶

Bases: lxml.etree.ElementBase

Main XML Element class.

Element children are accessed as object attributes. Multiple children with the same name are available through a list index. Example:

>>> from lxml.objectify import XML
>>> root = XML("<root><c1><c2>0</c2><c2>1</c2></c1></root>")
>>> second_c2 = root.c1.c2[1]
>>> print(second_c2.text)
1

Note that you cannot (and must not) instantiate this class or its subclasses.

addattr(self, tag, value)¶

Add a child value to the element.

As opposed to append(), it sets a data value, not an element.

countchildren(self)¶: Return the number of children of this element, regardless of their name.

descendantpaths(self, prefix=None)¶: Returns a list of object path expressions for all descendants.

getchildren(self)¶: Returns a sequence of all direct children. The elements are returned in document order.

text¶
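A short sketch of the methods listed above, reusing the tree from the class example; the added tag `c3` is hypothetical:

```python
from lxml import objectify

root = objectify.fromstring("<root><c1><c2>0</c2><c2>1</c2></c1></root>")

children_before = root.countchildren()        # only <c1> so far
root.addattr("c3", 5)                         # appends <c3> holding the value 5
children_after = root.countchildren()

c3_value = root.c3.pyval
c2_values = [c2.pyval for c2 in root.c1.c2]   # iterates same-named siblings
child_tags = [child.tag for child in root.getchildren()]
paths = root.descendantpaths()                # object path strings
```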

class lxml.objectify.ObjectifyElementClassLookup(self, tree_class=None, empty_data_class=None)¶

Bases: lxml.etree.ElementClassLookup

Element class lookup method that uses the objectify classes.

class lxml.objectify.PyType(self, name, type_check, type_class, stringify=None)¶

Bases: object

User defined type.

A named type that bundles a type check function, a type class that inherits from ObjectifiedDataElement, and an optional “stringification” function. The type check must take a string as argument and raise ValueError or TypeError if it cannot handle the string value. The type check may be None, in which case the type is not considered for type guessing. For registered named types, the ‘stringify’ function (or unicode() if None) is used to convert a Python object with type name ‘name’ to the string representation stored in the XML tree.

Example:

PyType('int', int, MyIntClass).register()

Note that the order in which types are registered matters. The first matching type will be used.

register(self, before=None, after=None)¶

Register the type.

The additional keyword arguments ‘before’ and ‘after’ accept a sequence of type names that must appear before/after the new type in the type list. If any of them is not currently known, it is simply ignored. Raises ValueError if the dependencies cannot be fulfilled.

unregister(self)¶

name¶

stringify¶

type_check¶

xmlSchemaTypes¶

The list of XML Schema datatypes this Python type maps to.

Note that this must be set before registering the type!
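To make the registration flow more concrete, here is a hedged sketch of a user-defined type. The class `VersionElement`, the function `check_version` and the type name `version` are all hypothetical and not part of lxml; the sketch assumes the built-in int, float and bool checks reject a string such as '1.2.3', so that the appended check gets its turn:

```python
from lxml import objectify


class VersionElement(objectify.ObjectifiedDataElement):
    """Hypothetical data class for dotted versions such as '1.2.3'."""

    @property
    def pyval(self):
        return tuple(int(part) for part in self.text.split("."))


def check_version(text):
    # Type check: raise ValueError if the string is not 'N.N.N'.
    parts = text.split(".")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError("not a three-part version string")


version_type = objectify.PyType("version", check_version, VersionElement)
version_type.register()   # appended after the existing type checks
try:
    root = objectify.fromstring("<root><v>1.2.3</v><n>1.5</n></root>")
    version_value = root.v.pyval   # handled by VersionElement
    number_value = root.n.pyval    # still matched by the float check first
finally:
    version_type.unregister()      # leave the global type list unchanged
```

Unregistering in a finally block keeps the module-wide type list clean for other code in the same process.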

class lxml.objectify.StringElement¶

Bases: lxml.objectify.ObjectifiedDataElement

String data class.

Note that this class does not support the sequence protocol of strings: len(), iter(), str_attr[0], str_attr[0:1], etc. are not supported. Instead, use the .text attribute to get a ‘real’ string.

strlen()¶

pyval¶
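The following sketch contrasts the element with the plain string behind it, assuming default type guessing; the tag name `s` is made up:

```python
from lxml import objectify

root = objectify.fromstring("<root><s>hello</s></root>")

is_string = isinstance(root.s, objectify.StringElement)
length = root.s.strlen()        # length of the text content
first_char = root.s.text[0]     # index the real string, not the element
joined = root.s + " world"      # concatenation yields a plain string
sibling_count = len(root.s)     # counts <s> siblings, not characters
```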

lxml.objectify.DataElement(_value, attrib=None, nsmap=None, _pytype=None, _xsi=None, **_attributes)¶

Create a new element from a Python value and XML attributes taken from keyword arguments or a dictionary passed as second argument.

Automatically adds a ‘pytype’ attribute for the Python type of the value, if the type can be identified. If ‘_pytype’ or ‘_xsi’ are among the keyword arguments, they will be used instead.

If the _value argument is an ObjectifiedDataElement instance, its py:pytype, xsi:type and other attributes and nsmap are reused unless they are redefined in attrib and/or keyword arguments.
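A brief sketch of both cases, automatic annotation and an explicit ‘_pytype’ override. The attribute name `unit` is invented, and the attribute key used for lookup is the default pytype attribute name in Clark notation:

```python
from lxml import objectify

# Default name of the attribute that stores the Python type.
PYTYPE = "{http://codespeak.net/lxml/objectify/pytype}pytype"

number = objectify.DataElement(5)
number_value = number.pyval
number_pytype = number.get(PYTYPE)      # annotation added automatically

# Force a string type and attach a (hypothetical) 'unit' attribute.
forced = objectify.DataElement(5, _pytype="str", unit="kg")
forced_value = forced.pyval
forced_unit = forced.get("unit")
```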

lxml.objectify.Element(_tag, attrib=None, nsmap=None, _pytype=None, **_attributes)¶

Objectify specific version of the lxml.etree Element() factory that always creates a structural (tree) element.

NOTE: requires parser based element class lookup activated in lxml.etree!

lxml.objectify.XML(xml, parser=None, base_url=None)¶

Objectify specific version of the lxml.etree XML() literal factory that uses the objectify parser.

You can pass a different parser as second argument.

The base_url keyword argument sets the original base URL of the document, which supports relative paths when looking up external entities (DTD, XInclude, …).

lxml.objectify.__unpickleElementTree(data)¶

lxml.objectify.annotate(element_or_tree, ignore_old=True, ignore_xsi=False, empty_pytype=None, empty_type=None, annotate_xsi=0, annotate_pytype=1)¶

Recursively annotates the elements of an XML tree with ‘xsi:type’ and/or ‘py:pytype’ attributes.

If the ‘ignore_old’ keyword argument is True (the default), current ‘py:pytype’ attributes will be ignored for the type annotation. Set it to False if you want to reuse existing ‘py:pytype’ information (if and only if it is appropriate for the element text value).

If the ‘ignore_xsi’ keyword argument is False (the default), existing ‘xsi:type’ attributes will be used for the type annotation, if they fit the element text values.

Note that the mapping from Python types to XSI types is usually ambiguous. Currently, only the first XSI type name in the corresponding PyType definition will be used for annotation. Thus, you should consider naming the widest type first if you define additional types.

The default ‘py:pytype’ annotation of empty elements can be set with the empty_pytype keyword argument. Pass ‘str’, for example, to make string values the default.

The default ‘xsi:type’ annotation of empty elements can be set with the empty_type keyword argument. The default is not to annotate empty elements. Pass ‘string’, for example, to make string values the default.

The keyword arguments ‘annotate_xsi’ (default: 0) and ‘annotate_pytype’ (default: 1) control which kind(s) of annotation to use.

lxml.objectify.deannotate(element_or_tree, pytype=True, xsi=True, xsi_nil=False, cleanup_namespaces=False)¶

Recursively de-annotate the elements of an XML tree by removing ‘py:pytype’ and/or ‘xsi:type’ attributes and/or ‘xsi:nil’ attributes.

If the ‘pytype’ keyword argument is True (the default), ‘py:pytype’ attributes will be removed. If the ‘xsi’ keyword argument is True (the default), ‘xsi:type’ attributes will be removed. If the ‘xsi_nil’ keyword argument is True (default: False), ‘xsi:nil’ attributes will be removed.

Note that this does not touch the namespace declarations by default. If you want to remove unused namespace declarations from the tree, pass the option cleanup_namespaces=True.
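A round trip through annotate() and deannotate() might look like the sketch below, which assumes the default pytype attribute name and uses illustrative tag names:

```python
from lxml import etree, objectify

# Default name of the attribute that stores the Python type.
PYTYPE = "{http://codespeak.net/lxml/objectify/pytype}pytype"

root = objectify.fromstring("<root><a>5</a><b>text</b></root>")

objectify.annotate(root)                 # adds py:pytype attributes
a_pytype = root.a.get(PYTYPE)
b_pytype = root.b.get(PYTYPE)

# Remove the annotations and the now unused namespace declarations.
objectify.deannotate(root, cleanup_namespaces=True)
a_pytype_after = root.a.get(PYTYPE)
serialized = etree.tostring(root)
```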

lxml.objectify.dump(element)¶: Return a recursively generated string representation of an element. The element argument must not be None.

lxml.objectify.enable_recursive_str(on=True)¶: Enable a recursively generated tree representation for str(element), based on objectify.dump(element).
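A small sketch of both functions, with made-up tag names. The switch is module-wide, so the example turns it off again afterwards:

```python
from lxml import objectify

root = objectify.fromstring("<root><a>5</a><b>text</b></root>")

# One line per element, including the objectify class used for it.
report = objectify.dump(root)
print(report)

objectify.enable_recursive_str()          # str(element) now uses dump()
try:
    recursive_str = str(root)
finally:
    objectify.enable_recursive_str(False)  # restore the default behaviour
```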

lxml.objectify.fromstring(xml, parser=None, base_url=None)¶

Objectify specific version of the lxml.etree fromstring() function that uses the objectify parser.

You can pass a different parser as second argument.

The base_url keyword argument sets the original base URL of the document, which supports relative paths when looking up external entities (DTD, XInclude, …).

lxml.objectify.getRegisteredTypes()¶

Returns a list of the currently registered PyType objects.

To add a new type, retrieve this list and call unregister() for all entries. Then add the new type at a suitable position (possibly replacing an existing one) and call register() for all entries.

This is necessary if the new type interferes with the type check functions of existing ones (normally only int/float/bool) and must be tried before other types. To add a type that is not yet parsable by the current type check functions, you can simply register() it, which will append it to the end of the type list.
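Before reordering anything, it can help to inspect the current list. This sketch only reads the registry and does not modify it:

```python
from lxml import objectify

# Names of the registered types, in the order they are tried.
type_names = [pytype.name for pytype in objectify.getRegisteredTypes()]
print(type_names)
```

Since the first matching type wins, the position of a name in this list shows which checks run ahead of a newly registered type.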

lxml.objectify.makeparser(remove_blank_text=True, **kw)¶

Create a new XML parser for objectify trees.

You can pass all keyword arguments that are supported by etree.XMLParser(). Note that this parser defaults to removing blank text. You can disable this by passing the remove_blank_text boolean keyword option yourself.
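The effect of the blank-text default can be sketched by parsing the same input with and without it; the input document is illustrative:

```python
from lxml import etree, objectify

xml = "<root> <a>1</a> </root>"

# Default objectify parser: whitespace-only text between tags is dropped.
root_default = objectify.fromstring(xml)
default_bytes = etree.tostring(root_default)

# Custom parser that keeps the blank text.
keep_parser = objectify.makeparser(remove_blank_text=False)
root_keep = objectify.fromstring(xml, keep_parser)
keep_bytes = etree.tostring(root_keep)
```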

lxml.objectify.parse(f, parser=None, base_url=None)¶

Parse a file or file-like object with the objectify parser.

You can pass a different parser as second argument.

The base_url keyword allows setting a URL for the document when parsing from a file-like object. This is needed when looking up external entities (DTD, XInclude, …) with relative paths.
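A minimal sketch of parsing from a file-like object, using an in-memory buffer as a stand-in for a real file:

```python
import io

from lxml import objectify

# An in-memory buffer stands in for a file opened in binary mode.
source = io.BytesIO(b"<root><a>1</a><b>text</b></root>")

tree = objectify.parse(source)   # returns an ElementTree
root = tree.getroot()

a_value = root.a.pyval
b_value = root.b.pyval
```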

lxml.objectify.pyannotate(element_or_tree, ignore_old=False, ignore_xsi=False, empty_pytype=None)¶

Recursively annotates the elements of an XML tree with ‘pytype’ attributes.

If the ‘ignore_old’ keyword argument is True, current ‘pytype’ attributes will be ignored and replaced. Otherwise (the default according to the signature above, ignore_old=False), they will be checked and only replaced if they no longer fit the current text value.

Setting the keyword argument ignore_xsi to True makes the function additionally ignore existing xsi:type annotations. The default is to use them as a type hint.

The default annotation of empty elements can be set with the empty_pytype keyword argument. The default is not to annotate empty elements. Pass ‘str’, for example, to make string values the default.

lxml.objectify.pytypename(obj)¶: Find the name of the corresponding PyType for a Python object.
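For the built-in types, the lookup can be sketched like this:

```python
from lxml import objectify

# Map a few plain Python values to their PyType names.
int_name = objectify.pytypename(5)
float_name = objectify.pytypename(1.5)
bool_name = objectify.pytypename(True)
str_name = objectify.pytypename("text")
```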

lxml.objectify.set_default_parser(new_parser=None)¶

Replace the default parser used by objectify’s Element() and fromstring() functions.

The new parser must be an etree.XMLParser.

Call without arguments to reset to the original parser.

lxml.objectify.set_pytype_attribute_tag(attribute_tag=None)¶

Change name and namespace of the XML attribute that holds Python type information.

Do not use this unless you know what you are doing.

Reset by calling without argument.

Default: “{http://codespeak.net/lxml/objectify/pytype}pytype”

lxml.objectify.xsiannotate(element_or_tree, ignore_old=False, ignore_pytype=False, empty_type=None)¶

Recursively annotates the elements of an XML tree with ‘xsi:type’ attributes.

If the ‘ignore_old’ keyword argument is True, current ‘xsi:type’ attributes will be ignored and replaced. Otherwise (the default according to the signature above, ignore_old=False), they will be checked and only replaced if they no longer fit the current text value.

Note that the mapping from Python types to XSI types is usually ambiguous. Currently, only the first XSI type name in the corresponding PyType definition will be used for annotation. Thus, you should consider naming the widest type first if you define additional types.

Setting the keyword argument ignore_pytype to True makes the function additionally ignore existing pytype annotations. The default is to use them as a type hint.

The default annotation of empty elements can be set with the empty_type keyword argument. The default is not to annotate empty elements. Pass ‘string’, for example, to make string values the default.