The public C-API of lxml.etree

As of version 1.1, lxml.etree provides a public C-API. This allows external C extensions to efficiently access public functions and classes of lxml, without going through the Python API.

The API is described in the file etreepublic.pxd, which is directly c-importable by extension modules implemented in Pyrex or Cython.

Writing external modules in Cython

This is the easiest way of extending lxml at the C level. A Cython (or Pyrex) module should start like this:

# My Cython extension

# import the public functions and classes of lxml.etree
cimport etreepublic as cetree

# import the lxml.etree module in Python
cdef object etree
from lxml import etree

# initialize the access to the C-API of lxml.etree
cetree.import_lxml__etree()

From this line on, you can access all public functions of lxml.etree from the cetree namespace like this:

# build a tag name from namespace and element name
py_tag = cetree.namespacedNameFromNsName("http://some/url", "myelement")

Public lxml classes are easily subclassed. For example, to implement and set a new default element class, you can write Cython code like the following:

from etreepublic cimport ElementBase
cdef class NewElementClass(ElementBase):
     def set_value(self, myval):
         self.set("my_attribute", myval)

etree.set_element_class_lookup(
     etree.DefaultElementClassLookup(element=NewElementClass))

Writing external modules in C

If you really feel like it, you can also interface with lxml.etree straight from C code. All you have to do is include the header file for the public API, import the lxml.etree module and then call the import function:

/* My C extension */

/* common includes */
#include "Python.h"
#include "stdio.h"
#include "string.h"
#include "stdarg.h"
#include "libxml/xmlversion.h"
#include "libxml/encoding.h"
#include "libxml/hash.h"
#include "libxml/tree.h"
#include "libxml/xmlIO.h"
#include "libxml/xmlsave.h"
#include "libxml/globals.h"
#include "libxml/xmlstring.h"

/* lxml.etree specific includes */
#include "lxml-version.h"
#include "etree_defs.h"
#include "etree.h"

/* setup code */
import_lxml__etree()

Note that including etree.h does not automatically include the header files it requires. Note also that the above list of common includes may not be sufficient.

Table Of Contents

This Page