Chapter 2. Installation


      Make things as simple as possible, but not any simpler.
      

--Albert Einstein

PyTables is built and installed with the Python Distutils, so getting it up and running is fairly simple. If you want to install the package from source, continue with the next section.

However, if you are running Windows and want to install precompiled binaries, you can jump straight to Section 2.2. Packages are also available for many Linux distributions, for instance T2 Project, Debian, or Ubuntu, as well as for other Unix systems such as FreeBSD or MacOS X (see the Darwinports or Fink repositories).

2.1. Installation from source

These instructions are for both Unix/MacOS X and Windows systems. If you are using Windows, it is assumed that you have a recent version of the MS Visual C++ compiler installed. A GCC compiler is assumed for Unix, but other compilers should work as well.

Extensions in PyTables have been developed in Pyrex (see [5]) and the C language. You can rebuild everything from scratch if you have Pyrex installed, but this is not necessary, as the Pyrex compiled source is included in the source distribution.

To compile PyTables you will need a recent version of Python, the HDF5 (C flavor) library from http://hdfgroup.org/, and the NumPy (see [8]) package. Although you won't need numarray (see [10]) or Numeric (see [9]) in order to compile PyTables, they are supported; you only need a reasonably recent version of them (>= 1.5.2 for numarray and >= 24.2 for Numeric) if you plan on using them in your applications. If you already have numarray and/or Numeric installed, the test driver module will detect them and will run the tests for numarray and/or Numeric automatically.

2.1.1. Prerequisites

First, make sure that you have at least Python 2.4, HDF5 1.6.5 and NumPy 1.0.3 or higher installed (for testing purposes, we are using HDF5 1.8.1 and NumPy 1.1 currently). If you don't, fetch and install them before proceeding.

Compile and install these packages (but see Section 2.2.1 for instructions on how to install precompiled binaries if you are not willing to compile the prerequisites on Windows systems).
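
As a quick sanity check before building, the minimum versions listed above can be compared from Python itself. The following is only a sketch: the helper names version_tuple and meets_minimum are made up for this illustration, the code is written for a current Python interpreter, and it covers only Python and NumPy (it does not attempt to check the HDF5 library).

```python
import sys


def version_tuple(text):
    """Turn a dotted version string such as '1.0.3' into a tuple of ints.

    Trailing non-numeric suffixes (for example 'rc1') are ignored.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(found, minimum):
    """Return True if the version string 'found' is at least 'minimum'."""
    return version_tuple(found) >= version_tuple(minimum)


if __name__ == "__main__":
    # Minimum versions taken from the prerequisites paragraph above.
    print("Python %s ok: %s" % (sys.version.split()[0],
                                sys.version_info[:2] >= (2, 4)))
    try:
        import numpy
    except ImportError:
        print("NumPy is not installed")
    else:
        print("NumPy %s ok: %s" % (numpy.__version__,
                                   meets_minimum(numpy.__version__, "1.0.3")))
```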

For compression (and possibly improved performance), you will need to install the Zlib library (see [12]), which is also required by HDF5. You may also optionally install the excellent LZO compression library (see [13] and Section 5.3). The high-performance bzip2 compression library can also be used with PyTables (see [14]).

Unix

setup.py will detect HDF5, LZO, or bzip2 libraries and include files under /usr or /usr/local; this will cover most manual installations as well as installations from packages. If setup.py cannot find libhdf5 (or the liblzo or libbz2 libraries that you may wish to use), or if you have several versions of a library installed and want to use a particular one, you can set the HDF5_DIR, LZO_DIR, or BZIP2_DIR environment variable to the path of that particular resource. You may also specify the locations of the resource root directories on the setup.py command line. For example:

--hdf5=/stuff/hdf5-1.8.0
--lzo=/stuff/lzo-2.02
--bzip2=/stuff/bzip2-1.0.4
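
The search order described above can be pictured with a short Python sketch. This is not the code used by setup.py: the function name candidate_roots is hypothetical, and giving the command-line option priority over the environment variable is an assumption made for the illustration.

```python
import os

# Default locations mentioned in the text above.
DEFAULT_ROOTS = ("/usr", "/usr/local")


def candidate_roots(option_value, env_name, environ=None):
    """List the root directories to search for one library.

    option_value -- value given on the command line (e.g. to --hdf5), or None
    env_name     -- name of the environment variable (e.g. 'HDF5_DIR')
    environ      -- mapping to read the variable from (defaults to os.environ)
    """
    if environ is None:
        environ = os.environ
    roots = []
    if option_value:
        # Assumption: an explicit command-line option is tried first.
        roots.append(option_value)
    env_value = environ.get(env_name)
    if env_value and env_value not in roots:
        roots.append(env_value)
    for root in DEFAULT_ROOTS:
        if root not in roots:
            roots.append(root)
    return roots


if __name__ == "__main__":
    print(candidate_roots("/stuff/hdf5-1.8.0", "HDF5_DIR"))
```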

If your HDF5 library was built as a shared library that is not in the runtime load path, you can specify the additional linker flags needed to find it on the command line as well. For example:

--lflags="-Xlinker -rpath -Xlinker /stuff/hdf5-1.8.0/lib"

You may also want to try setting the LD_LIBRARY_PATH environment variable to point to the directory where the shared libraries can be found. Check your compiler and linker documentation as well as the Python Distutils documentation for the correct syntax or environment variable names.

It is also possible to link with specific libraries by setting the LIBS environment variable:

LIBS="hdf5-1.8.0 nsl"

Finally, you can give additional flags to your compiler by passing them through the --cflags option:

--cflags="-w -O3"

In the example above, which assumes a gcc compiler, the flags suppress all warnings and select optimization level 3.
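
To keep the options discussed above together, the build command can also be assembled programmatically. The sketch below is an illustration only: build_ext_command is a hypothetical helper, it uses the build_ext --inplace command from Section 2.1.2, and appending the options after --inplace is an assumption about argument order. Building the command as a list avoids shell quoting problems with values that contain spaces.

```python
import sys


def build_ext_command(hdf5=None, lzo=None, bzip2=None, lflags=None, cflags=None):
    """Return the setup.py build command as a list of arguments.

    Only the options that were actually given are added to the command.
    """
    command = [sys.executable, "setup.py", "build_ext", "--inplace"]
    options = (
        ("--hdf5", hdf5),
        ("--lzo", lzo),
        ("--bzip2", bzip2),
        ("--lflags", lflags),
        ("--cflags", cflags),
    )
    for flag, value in options:
        if value:
            command.append("%s=%s" % (flag, value))
    return command


if __name__ == "__main__":
    # Example values copied from the text above.
    print(build_ext_command(hdf5="/stuff/hdf5-1.8.0", cflags="-w -O3"))
```

The resulting list can be handed to subprocess.call() from the main PyTables distribution directory.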

Windows

You can get ready-to-use Windows binaries and other development files for most of the following libraries from the GnuWin32 project (see [15]).

Once you have installed the prerequisites, setup.py needs to know where the necessary library stub (.lib) and header (.h) files are installed. You can set the path to the include and DLL directories of the HDF5 (mandatory) and LZO or bzip2 (optional) libraries by setting the HDF5_DIR, LZO_DIR, or BZIP2_DIR environment variables to the path of the corresponding resource. For example:

set HDF5_DIR=c:\stuff\5-165-win
set LZO_DIR=c:\stuff\lzo-1-08
set BZIP2_DIR=c:\stuff\bzip2-1-0-3

You may also specify the locations of the resource root directories on the setup.py command line. For example:

--hdf5=c:\stuff\5-165-win
--lzo=c:\stuff\lzo-1-08
--bzip2=c:\stuff\bzip2-1-0-3
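
If the build is driven from a Python script rather than from the command prompt, the same variables can be placed in a copy of the environment. This is a minimal sketch in which build_environment is a hypothetical helper name and the paths are the example paths used above.

```python
import os


def build_environment(hdf5_dir, lzo_dir=None, bzip2_dir=None, base=None):
    """Return a copy of the environment with the library locations set.

    HDF5 is mandatory; LZO and bzip2 are optional and only set when given.
    The mapping passed as 'base' (os.environ by default) is not modified.
    """
    env = dict(os.environ if base is None else base)
    env["HDF5_DIR"] = hdf5_dir
    if lzo_dir:
        env["LZO_DIR"] = lzo_dir
    if bzip2_dir:
        env["BZIP2_DIR"] = bzip2_dir
    return env


if __name__ == "__main__":
    env = build_environment(r"c:\stuff\5-165-win", lzo_dir=r"c:\stuff\lzo-1-08")
    print(env["HDF5_DIR"])
    print(env["LZO_DIR"])
```

The returned dictionary can be given to subprocess.call() through its env argument when running setup.py.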

2.1.2. PyTables package installation

Once you have installed the HDF5 library and the NumPy package, you can proceed with the PyTables package itself:

  1. Run this command from the main PyTables distribution directory, including any extra command line arguments as discussed above:

    python setup.py build_ext --inplace

    Depending on the compiler flags used when compiling your Python executable, many warnings may appear. Don't worry, almost all of them are caused by variables that are declared but never used. That's normal in Pyrex extensions.

  2. To run the test suite, execute this command:

    Unix

    In the sh shell and its variants:

    PYTHONPATH=.:$PYTHONPATH  python tables/tests/test_all.py

    or, if you prefer:

    PYTHONPATH=.:$PYTHONPATH  python -c "import tables; tables.test()"

    Both commands do the same thing.

    Windows

    Open the command prompt (cmd.exe or command.com) and type:

    set PYTHONPATH=.;%PYTHONPATH%
    python tables\tests\test_all.py

    or:

    set PYTHONPATH=.;%PYTHONPATH%
    python -c "import tables; tables.test()"

    If you would like to see verbose output from the tests simply add the -v flag or the word verbose to the command line. You can also run only the tests in a particular test module. For example, to execute just the test_types test suite, you only have to specify it:

    python tables/tests/test_types.py -v  # change to backslashes for win

    You have other options to pass to the test_all.py driver:

    python tables/tests/test_all.py --heavy  # change to backslashes for win

    The command above runs every test in the test suite. Beware, it can take a lot of time, CPU and memory resources to complete.

    python tables/tests/test_all.py --show-versions  # change to backslashes for win

    The command above shows the versions for all the packages that PyTables relies on. Please be sure to include this when reporting bugs.

    python tables/tests/test_all.py --show-memory  # only under Linux 2.6.x

    The command above prints out the evolution of the memory consumption after each test module completion. It's useful for locating memory leaks in PyTables (or packages behind it). Only valid for Linux 2.6.x kernels.

    And last, but not least, in case a test fails, please run the failing test module again and enable the verbose output:

    python tables/tests/test_<module>.py -v verbose

    and, very importantly, obtain your PyTables version information with the --show-versions flag (see above). Send both outputs to the developers so that we may continue improving PyTables.

    If you run into problems because Python cannot load the HDF5 library or other shared libraries:

    Unix

    Try setting the LD_LIBRARY_PATH or equivalent environment variable to point to the directory where the missing libraries can be found.

    Windows

    Put the DLL libraries (hdf5dll.dll and, optionally, lzo1.dll and bzip2.dll) in a directory listed in your PATH environment variable. The setup.py installation program will print out a warning to that effect if the libraries cannot be found.

  3. To install the entire PyTables Python package, change back to the root distribution directory and run the following command (make sure you have sufficient permissions to write to the directories where the PyTables files will be installed):

    python setup.py install

    Of course, you will need super-user privileges if you want to install PyTables in a system-protected area. You can, however, select a different place to install the package using the --prefix flag:

    python setup.py install --prefix="/home/myuser/mystuff"

    Keep in mind, however, that if you use the --prefix flag to install in a non-standard place, you should set up your PYTHONPATH environment variable properly, so that the Python interpreter can find your new PyTables installation.

    More installation options are available in the Distutils package. Run:

    python setup.py install --help

    for more information on that subject.
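
The bug-reporting steps from item 2 can be gathered in a small script. The sketch below assumes that it is run with a current Python interpreter from the main PyTables distribution directory after an in-place build; rerun_command, show_versions_command and collect_report are hypothetical helper names, not part of PyTables.

```python
import os
import subprocess
import sys


def rerun_command(module):
    """Command that reruns one test module with verbose output."""
    script = os.path.join("tables", "tests", "test_%s.py" % module)
    return [sys.executable, script, "-v", "verbose"]


def show_versions_command():
    """Command that prints the versions of the packages PyTables relies on."""
    script = os.path.join("tables", "tests", "test_all.py")
    return [sys.executable, script, "--show-versions"]


def collect_report(module, source_dir="."):
    """Run both commands and return their combined output as one string."""
    env = dict(os.environ)
    old_path = env.get("PYTHONPATH")
    # Same effect as PYTHONPATH=.:$PYTHONPATH in the shell examples above.
    env["PYTHONPATH"] = "." if not old_path else "." + os.pathsep + old_path
    sections = []
    for command in (show_versions_command(), rerun_command(module)):
        result = subprocess.run(command, cwd=source_dir, env=env,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT,
                                universal_newlines=True)
        sections.append("$ " + " ".join(command) + "\n" + result.stdout)
    return "\n".join(sections)
```

For a failure in the test_types module, for instance, collect_report("types") would return the text to send to the developers.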

That's it! Now you can skip to the next chapter to learn how to use PyTables.

2.2. Binary installation (Windows)

This section is intended for installing precompiled binaries on Windows platforms. You may also find it useful for instructions on how to install binary prerequisites even if you want to compile PyTables itself on Windows.

2.2.1. Windows prerequisites

First, make sure that you have Python 2.4 or higher and NumPy 1.0.3 or higher installed (the PyTables binaries have been built using NumPy 1.1). The binaries already include DLLs for HDF5 (1.6.7), zlib1 (1.2.3), szlib (2.1, uncompression support only) and bzip2 (1.0.4). The LZO DLL can't be included because of license issues (but read below for instructions on installing it if you want to).

To enable compression with the optional LZO library (see Section 5.3 for hints about how it may be used to improve performance), fetch and install LZO from [15]. Choose v1.x, because LZO v2.x is not supported in the precompiled Windows builds. Normally, you will only need to fetch and install the <package>-<version>-bin.zip file and copy the lzo1.dll file into a directory listed in the PATH environment variable (for example C:\WINDOWS\SYSTEM32) or into python_installation_path\Lib\site-packages\tables, so that it can be found by the PyTables extensions. The latter directory may not exist yet, so if you want to put the DLL there, do so after installing the PyTables package.

Please note that PyTables has internal machinery for dealing with uninstalled optional compression libraries, so you don't need to install the LZO dynamic library if you don't want to.
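
To see whether the LZO DLL is visible through PATH, the listed directories can be scanned from Python. This is a minimal sketch: find_on_path is a hypothetical helper, and it only looks at the directories in PATH, not at the site-packages\tables directory mentioned above.

```python
import os


def find_on_path(filename, path=None):
    """Return every location of 'filename' in the directories of 'path'.

    'path' is a string in the format of the PATH environment variable;
    when omitted, the current value of PATH is used.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            matches.append(candidate)
    return matches


if __name__ == "__main__":
    found = find_on_path("lzo1.dll")
    if found:
        print("lzo1.dll found at: " + ", ".join(found))
    else:
        print("lzo1.dll is not in any directory listed in PATH")
```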

2.2.2. PyTables package installation

Download the tables-<version>.win32-py<version>.exe file and execute it.

You can (and you should) test your installation by running the following commands:

>>> import tables
>>> tables.test()

in your favorite Python shell. If all the tests pass (possibly with a few warnings related to the potential unavailability of the LZO library), you already have a working, well-tested copy of PyTables installed! If any test fails, please copy the output of the error messages as well as the output of:

>>> tables.print_versions()

and mail them to the developers so that the problem can be fixed in future releases.
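
The interactive steps above can be combined into one function that also copes with a failed import. This is only a sketch: check_installation is a hypothetical name, and it relies solely on the tables.test() and tables.print_versions() calls shown above.

```python
def check_installation(run_tests=False):
    """Report whether PyTables can be imported and print its versions.

    When run_tests is true, the full test suite is run as well, which
    may take a while.
    """
    try:
        import tables
    except ImportError as error:
        return "PyTables could not be imported: %s" % error
    tables.print_versions()
    if run_tests:
        tables.test()
    return "PyTables imported successfully"


if __name__ == "__main__":
    print(check_installation())
```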

You can proceed now to the next chapter to see how to use PyTables.