How to build lxml from source

To build lxml from source, you need libxml2 and libxslt properly installed.

Pyrex

The lxml.etree module is written in Pyrex. To build lxml from source, you therefore need a working Pyrex installation. Pyrex now supports EasyInstall, so you can install it by running the following command as super-user:

easy_install Pyrex

Note that Pyrex versions up to and including 0.9.4 have known problems when compiling lxml with gcc 4.0 or Python 2.4, so do not use them. If you want to build lxml from non-release sources, please install Pyrex version 0.9.4.1 or later.
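As a rough illustration of the version requirement, the following sketch compares a dotted version string against the 0.9.4.1 minimum named above. The function names are made up for this example, and the version string is passed in by hand; how to read it from an installed Pyrex is deliberately left out.

```python
# Minimum Pyrex version taken from the note above.
MINIMUM_PYREX = (0, 9, 4, 1)


def parse_version(text):
    """Turn a dotted version string such as '0.9.4.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def pyrex_is_recent_enough(version_text):
    """Return True if the given version is 0.9.4.1 or later."""
    # Tuples compare element by element, and a shorter tuple sorts before
    # a longer one with the same prefix, so (0, 9, 4) < (0, 9, 4, 1).
    return parse_version(version_text) >= MINIMUM_PYREX
```

The sketch only handles purely numeric versions; a string with a suffix such as "b1" would need extra parsing.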

Subversion

The lxml package is developed in a Subversion repository. You can retrieve the current developer version by calling:

svn co http://codespeak.net/svn/lxml/trunk lxml

This will create a directory lxml and download the source into it. You can also browse the repository through the web or use your favourite SVN client to access it.

The distutils approach

Usually, building lxml is done through distutils. Do a Subversion checkout (or download the source tar-ball and unpack it) and then type:

python setup.py build

If you want to test lxml from the source directory, it is better to build it in-place like this:

python setup.py build_ext -i

or, in Unix-like environments:

make

If you then add lxml's src directory to your PYTHONPATH, you can import lxml.etree and experiment with it.
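Instead of setting the PYTHONPATH environment variable, the same effect can be had from inside a Python session by extending sys.path. This is a minimal sketch; the helper name and the checkout directory argument are assumptions for illustration, not part of lxml.

```python
import os
import sys


def add_lxml_src_to_path(checkout_dir):
    """Prepend <checkout_dir>/src to sys.path and return the absolute path."""
    src_dir = os.path.join(os.path.abspath(checkout_dir), "src")
    # Avoid adding the same directory twice on repeated calls.
    if src_dir not in sys.path:
        sys.path.insert(0, src_dir)
    return src_dir
```

After calling add_lxml_src_to_path("lxml") with the directory of an in-place build, "from lxml import etree" should pick up the freshly built module.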

Running the tests and reporting errors

The source distribution (tgz) contains a test suite for lxml. You can run it from the top-level directory:

python test.py

Note that the test script only tests the in-place build (see distutils building above), as it searches the src directory. You can use the following one-step command to trigger an in-place build and test it:

make clean test
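Where make is not available, the two steps can be chained from a small Python driver. The sketch below only runs the in-place build and the test script described above. It does not perform the clean step, and the function names are hypothetical.

```python
import subprocess
import sys


def build_and_test_commands(python=sys.executable):
    """Return the in-place build and test commands as argument lists."""
    return [
        [python, "setup.py", "build_ext", "-i"],
        [python, "test.py"],
    ]


def build_and_test(source_dir="."):
    """Run the build, then the tests; return the first non-zero exit status."""
    for command in build_and_test_commands():
        status = subprocess.call(command, cwd=source_dir)
        if status != 0:
            return status
    return 0
```

Call build_and_test() from the top-level source directory, or pass the path to it explicitly.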

To run the ElementTree and cElementTree compatibility tests, make sure you have lxml on your PYTHONPATH first, then run:

python selftest.py

and:

python selftest2.py

If the tests give failures, errors, or, worse, segmentation faults, we'd really like to know. Please contact us on the mailing list, and please specify the versions of lxml, libxml2, libxslt and Python you were using, as well as your operating system type (Linux, Windows, Mac OS, ...).
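The version details for a report can be gathered with a short script. This sketch assumes that lxml.etree exposes the version tuples LXML_VERSION, LIBXML_VERSION and LIBXSLT_VERSION. If your build lacks one of them, or lxml cannot be imported at all, the script says so instead of failing.

```python
import platform


def collect_versions():
    """Collect version details that are useful in a bug report."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    try:
        from lxml import etree
    except ImportError:
        info["lxml"] = "not importable"
        return info
    # Attribute names are assumed; missing ones are reported as "unknown".
    for label, attr in (("lxml", "LXML_VERSION"),
                        ("libxml2", "LIBXML_VERSION"),
                        ("libxslt", "LIBXSLT_VERSION")):
        value = getattr(etree, attr, None)
        info[label] = ".".join(str(n) for n in value) if value else "unknown"
    return info


if __name__ == "__main__":
    for key, value in sorted(collect_versions().items()):
        print("%s: %s" % (key, value))
```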

Static linking on Windows

Most operating systems have proper package management that makes installing current versions of libxml2 and libxslt easy. The most famous exception is Microsoft Windows, which entirely lacks these capabilities. It can therefore be useful to statically link the external libraries into lxml.etree to avoid having to install them separately. David Sankel proposed the following approach.

Download lxml and all required libraries to the same directory. The iconv, libxml2, libxslt, and zlib libraries are all available from the ftp site ftp://ftp.zlatkovic.com/pub/libxml/.

Your directory should now have the following files in it (although possibly different versions):

iconv-1.9.1.win32.zip
libxml2-2.6.23.win32.zip
libxslt-1.1.15.win32.zip
lxml-1.0.0.tgz
zlib-1.2.3.win32.zip

Now extract each of those files in the same directory. This should give you something like this:

iconv-1.9.1.win32/
iconv-1.9.1.win32.zip
libxml2-2.6.23.win32/
libxml2-2.6.23.win32.zip
libxslt-1.1.15.win32/
libxslt-1.1.15.win32.zip
lxml-1.0.0/
lxml-1.0.0.tgz
zlib-1.2.3.win32/
zlib-1.2.3.win32.zip

Go to the lxml-1.0.0 directory and edit the file setup.py. There should be a section near the top that looks like this:

def setupStaticBuild():
    cflags = [
        ]
    xslt_libs = [
        ]
    result = (cflags, xslt_libs)
    # return result
    raise NotImplementedError, \
          "Static build not configured, see doc/build.txt"

Change this section to something like this, but take care to use the correct version numbers:

def setupStaticBuild():
    cflags = [
       "-I..\\libxml2-2.6.23.win32\\include",
       "-I..\\libxslt-1.1.15.win32\\include",
       "-I..\\zlib-1.2.3.win32\\include",
       "-I..\\iconv-1.9.1.win32\\include"
       ]
    xslt_libs = [
       "..\\libxml2-2.6.23.win32\\lib\\libxml2_a.lib",
       "..\\libxslt-1.1.15.win32\\lib\\libxslt_a.lib",
       "..\\libxslt-1.1.15.win32\\lib\\libexslt_a.lib",
       "..\\zlib-1.2.3.win32\\lib\\zlib.lib",
       "..\\iconv-1.9.1.win32\\lib\\iconv_a.lib"
       ]
    result = (cflags, xslt_libs)
    return result

The _a part of the library names means that we are linking statically against the named library files. If you want to use dynamic libraries, you need to link against the DLL version of the libraries.
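Because the version numbers in these paths are easy to get wrong, it can help to check that every include directory and library file exists before starting the build. The helper below is a sketch with a made-up name. It expects the same two lists that setupStaticBuild() returns and resolves them relative to base_dir, which would be the lxml source directory.

```python
import os


def missing_static_build_paths(cflags, xslt_libs, base_dir="."):
    """Return the include directories and library files that do not exist."""
    missing = []
    for flag in cflags:
        # Strip stray whitespace and the "-I" prefix to get the directory.
        include_dir = flag.strip()
        if include_dir.startswith("-I"):
            include_dir = include_dir[2:]
        if not os.path.isdir(os.path.join(base_dir, include_dir)):
            missing.append(include_dir)
    for lib in xslt_libs:
        if not os.path.isfile(os.path.join(base_dir, lib)):
            missing.append(lib)
    return missing
```

An empty list means all paths were found; anything else names the entries to correct in setup.py.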

Now you should be able to pass the --static option to setup.py and everything should work well. Try calling:

python setup.py bdist_wininst --static

This will create a Windows installer in the pkg directory.

Building Debian packages from SVN sources

Andreas Pakulat proposed the following approach.

Eventually, dpkg-buildpackage will tell you that some dependencies are missing. You can either install them manually or run apt-get build-dep lxml.

That will give you .deb packages in the parent directory which can be installed using dpkg -i.