Chapter 4. Library Reference

PyTables implements several classes to represent the different nodes in the object tree. They are named File, Group, Leaf, Table, Array, CArray, EArray, VLArray and UnImplemented. Another class, AttributeSet, allows the user to complement the information on these different objects. Finally, an important class called IsDescription allows a Table record description to be built by declaring a subclass of it. Many other classes are defined in PyTables, but they can be regarded as helpers whose main goal is to declare the data type properties of the different first-class objects; they are described at the end of this chapter as well.

An important function, openFile, is responsible for creating, opening or appending to files. In addition, a few utility functions are defined to guess whether a user-supplied file is a PyTables or HDF5 file. These are called isPyTablesFile() and isHDF5File(), respectively. Finally, there is a function called whichLibVersion that reports the versions of the underlying C libraries (for example, HDF5 or Zlib).

Let's start discussing the first-level variables and functions available to the user, then the different classes defined in PyTables.

4.1. tables variables and functions

4.1.1. Global variables

__version__

The PyTables version number.

hdf5Version

The underlying HDF5 library version number.

4.1.2. Global functions

copyFile(srcfilename, dstfilename, overwrite=False, **kwargs)

An easy way of copying one PyTables file to another.

This function allows you to copy an existing PyTables file named srcfilename to another file called dstfilename. The source file must exist and be readable. If the destination file already exists, it can be overwritten in place by setting the overwrite argument to a true value.

This function is a shorthand for the File.copyFile() method, which acts on an already opened file. kwargs takes keyword arguments used to customize the copying process. See the documentation of File.copyFile() (see description) for a description of those arguments.
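
A minimal usage sketch (the file names here are hypothetical):

import tables
# Copy an existing PyTables file, replacing the destination if it already exists.
tables.copyFile("measurements.h5", "measurements-backup.h5", overwrite=True)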

isHDF5File(filename)

Determine whether a file is in the HDF5 format.

When successful, it returns a true value if the file is an HDF5 file, false otherwise. If there were problems identifying the file, an HDF5ExtError is raised.

isPyTablesFile(filename)

Determine whether a file is in the PyTables format.

When successful, it returns a true value if the file is a PyTables file, false otherwise. The true value is the format version string of the file. If there were problems identifying the file, an HDF5ExtError is raised.
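
For instance, a small sketch combining both checks (the file name is hypothetical):

import tables
if tables.isHDF5File("data.h5"):
    version = tables.isPyTablesFile("data.h5")
    if version:
        print "PyTables file, format version:", version
    else:
        print "Plain HDF5 file, not written by PyTables"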

openFile(filename, mode='r', title='', trMap={}, rootUEP="/", filters=None)

Open a PyTables (or generic HDF5) file and return a File object.

filename

The name of the file (supports environment variable expansion). It is suggested that the file have one of the ".h5", ".hdf" or ".hdf5" extensions, although this is not mandatory.

mode

The mode to open the file. It can be one of the following:

'r'

read-only; no data can be modified.

'w'

write; a new file is created (an existing file with the same name would be deleted).

'a'

append; an existing file is opened for reading and writing, and if the file does not exist it is created.

'r+'

read/write; similar to 'a', but the file must already exist.

title

If filename is new, this will set a title for the root group in this file. If filename is not new, the title will be read from disk, and this will not have any effect.

trMap

A dictionary that maps names in the object tree Python namespace to different HDF5 names in the file namespace. The keys are the Python names, while the values are the HDF5 names. This is useful when you need to access HDF5 node names that are not valid Python identifiers or are reserved words in Python.

rootUEP

The root User Entry Point. This is a group in the HDF5 hierarchy which will be taken as the starting point to create the object tree. It must be specified by its HDF5 name and can be given as a path. If it does not exist, an HDF5ExtError exception is raised. Use this if you do not want to build the entire object tree, but rather only a subtree of it.

filters

An instance of the Filters class (see Section 4.17.1) that provides information about the desired I/O filters applicable to the leaves that hang directly from the root group (unless other filter properties are specified for these leaves). Moreover, if you do not specify filter properties for its child groups, they will inherit these ones. So, if you open a new file with this parameter set, all the leaves created in the file will recursively inherit these filter properties (again, unless you prevent that by specifying other filters on the child groups or leaves).

nodeCacheSize

The number of unreferenced nodes to be kept in memory. Least recently used nodes are unloaded from memory when this number of loaded nodes is reached. To load a node again, simply access it as usual. Nodes referenced by user variables are not taken into account nor unloaded.
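
A typical call might look like this (the file name and title are hypothetical):

import tables
# Create a new file; its leaves will inherit zlib compression by default.
h5file = tables.openFile("experiment.h5", mode="w", title="Experiment data",
                         filters=tables.Filters(complevel=1, complib="zlib"))
print h5file.title
h5file.close()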

whichLibVersion(name)

Get version information about a C library.

If the library indicated by name is available, this function returns a 3-tuple containing the major library version as an integer, its full version as a string, and the version date as a string. If the library is not available, None is returned.

The currently supported library names are hdf5, zlib, lzo, ucl (in process of being deprecated) and bzip2. If another name is given, a ValueError is raised.
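
A brief usage sketch:

import tables
info = tables.whichLibVersion("hdf5")
if info is not None:
    major, version, date = info
    print "HDF5 version %s (released %s)" % (version, date)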

4.2. The File class

An instance of this class is returned when a PyTables file is opened with the openFile() function. It offers methods to manipulate (create, rename, delete...) nodes and handle their attributes, as well as methods to traverse the object tree. The user entry point to the object tree attached to the HDF5 file is represented in the rootUEP attribute. Other attributes are available.

File objects support an Undo/Redo mechanism which can be enabled with the enableUndo() method. Once the Undo/Redo mechanism is enabled, explicit marks (with an optional unique name) can be set on the state of the database using the mark() method. There are two implicit marks which are always available: the initial mark (0) and the final mark (-1). Both the identifier of a mark and its name can be used in undo and redo operations.

Hierarchy manipulation operations (node creation, movement and removal) and attribute handling operations (setting and deleting) made after a mark can be undone by using the undo() method, which returns the database to the state of a past mark. If undo() is not followed by operations that modify the hierarchy or attributes, the redo() method can be used to return the database to the state of a future mark. Else, future states of the database are forgotten.

Note that data handling operations cannot be undone or redone at present. Also, hierarchy manipulation operations on nodes that do not support the Undo/Redo mechanism issue an UndoRedoWarning before changing the database.

The Undo/Redo mechanism is persistent between sessions and can only be disabled by calling the disableUndo() method.

4.2.1. File instance variables

filename

The name of the opened file.

format_version

The PyTables version number of this file.

isopen

True if the underlying file is open, false otherwise.

mode

The mode in which the file was opened.

title

The title of the root group in the file.

trMap

A dictionary that maps node names between PyTables and HDF5 domain names. Its initial values are set from the trMap parameter passed to the openFile function. You cannot change its contents after a file is opened.

rootUEP

The UEP (user entry point) group in the file (see description).

filters

Default filter properties for the root group (see 4.17.1).

root

The root of the object tree hierarchy (a Group instance).

objects

A dictionary which maps path names to objects, for every visible node in the tree (deprecated, see note below).

groups

A dictionary which maps path names to objects, for every visible group in the tree (deprecated, see note below).

leaves

A dictionary which maps path names to objects, for every visible leaf in the tree (deprecated, see note below).

Note: From PyTables 1.2 on, the dictionaries objects, groups and leaves are just proxy objects that emulate the old functionality. Internally, they use File.getNode() (see description) and File.walkNodes() (see description), which are recommended instead.

4.2.2. File methods

createGroup(where, name, title='', filters=None, createparents=False)

Create a new Group instance with name name in where location.

where

The parent group where the new group will hang from. The where parameter can be a path string (for example "/level1/group5"), or another Group instance.

name

The name of the new group.

title

A description for this group.

filters

An instance of the Filters class (see Section 4.17.1) that provides information about the desired I/O filters applicable to the leaves that hang directly from this new group (unless other filter properties are specified for these leaves). Moreover, if you do not specify filter properties for its child groups, they will inherit these ones.

createparents

Whether to create the needed groups for the parent path to exist (not done by default).
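
A brief sketch (the file and group names are hypothetical):

import tables
h5file = tables.openFile("groups.h5", mode="w")
h5file.createGroup("/", "experiments", title="All experiments")
# Missing intermediate groups are created thanks to createparents.
h5file.createGroup("/experiments/run01", "raw", createparents=True)
h5file.close()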

createTable(where, name, description, title='', filters=None, expectedrows=10000, createparents=False)

Create a new Table instance with name name in where location. See the Section 4.6 for a description of the Table class.

where

The parent group where the new table will hang from. The where parameter can be a path string (for example "/level1/leaf5"), or a Group instance.

name

The name of the new table.

description

This is an object that describes the table, that is, how many columns it has, the properties of each column (type, shape, etc.) as well as other table properties.

description can be any of the following objects:

A user-defined class

This should inherit from the IsDescription class (see 4.16.1) where table fields are specified.

A dictionary

This is useful, for example, when you do not know beforehand which structure your table will have. See Section 3.4 for an example of use.

A RecArray

This object from the numarray package is also accepted, and all the information about columns and other metadata is used as a basis to create the Table object. Moreover, if the RecArray has actual data, this is also injected into the newly created Table object.

A NestedRecArray

Finally, if you want to have nested columns in your table, you can use this object (see Appendix B), and all the information about columns and other metadata is used as a basis to create the Table object. Moreover, if the NestedRecArray has actual data, this is also injected into the newly created Table object.

title

A description for this object.

filters

An instance of the Filters class (see Section 4.17.1) that provides information about the desired I/O filters to be applied during the life of this object.

expectedrows

A user estimate of the number of rows that will be in the table. If not provided, the default value is appropriate for tables up to about 10 MB in size. If you plan to save bigger tables, you should provide a guess; this will optimize the HDF5 B-Tree creation and management process time and the memory used. See Section 5.1 for a discussion of this issue.

createparents

Whether to create the needed groups for the parent path to exist (not done by default).
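
A short sketch using a dictionary description (the file, table and column names are hypothetical):

import tables
h5file = tables.openFile("measurements.h5", mode="w")
# The description may also be given as an IsDescription subclass (see 4.16.1).
description = {"name": tables.StringCol(16), "value": tables.FloatCol()}
table = h5file.createTable("/", "readings", description,
                           title="Sample readings", expectedrows=100000)
h5file.close()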

createArray(where, name, object, title='', createparents=False)

Create a new Array instance with name name in where location. See the Section 4.10 for a description of the Array class.

object

The regular array to be saved. Currently accepted values are: NumPy, Numeric, numarray arrays (including CharArray string numarrays) or other native Python types, provided that they are regular (i.e. they are not like [[1,2],2]) and homogeneous (i.e. all the elements are of the same type). Also, objects that have some of their dimensions equal to zero are not supported (use an EArray object if you want to create an array with one of its dimensions equal to 0).

createparents

Whether to create the needed groups for the parent path to exist (not done by default).

See the createTable description for more information on the where, name and title parameters.
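
A brief sketch (the data is an arbitrary homogeneous Python list):

import tables
h5file = tables.openFile("arrays.h5", mode="w")
h5file.createArray("/", "temperatures", [18.5, 19.2, 20.1],
                   title="Sample readings")
h5file.close()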

createCArray(where, name, shape, atom, title='', filters=None, createparents=False)

Create a new CArray instance with name name in where location. See the Section 4.11 for a description of the CArray class.

shape

The shape of the objects to be saved.

atom

An Atom instance representing the shape, type and flavor of the chunk of the objects to be saved.

createparents

Whether to create the needed groups for the parent path to exist (not done by default).

See the createTable description for more information on the where, name and title parameters.

createEArray(where, name, atom, title='', filters=None, expectedrows=1000, createparents=False)

Create a new EArray instance with name name in where location. See the Section 4.12 for a description of the EArray class.

atom

An Atom instance representing the shape, type and flavor of the atomic objects to be saved. One (and only one) of the shape dimensions must be 0. The dimension being 0 means that the resulting EArray object can be extended along it. Multiple enlargeable dimensions are not supported right now. See Section 4.16.3 for the supported set of Atom class descendants.

expectedrows

In the case of enlargeable arrays, this represents a user estimate of the number of row elements that will be added to the growable dimension of the EArray object. If not provided, the default value is 1000 rows. If you plan to create either much smaller or much bigger EArrays, try providing a guess; this will optimize the HDF5 B-Tree creation and management process time and the amount of memory used.

createparents

Whether to create the needed groups for the parent path to exist (not done by default).

See createTable description for more information on the where, name, title, and filters parameters.
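
A hedged sketch (names are hypothetical; whether a plain nested list is accepted by append() depends on the array flavor):

import tables
h5file = tables.openFile("earray.h5", mode="w")
# The 0 in the atom shape marks the dimension the array can grow along.
earray = h5file.createEArray("/", "signal", tables.Float32Atom(shape=(0, 3)),
                             title="Growable signal", expectedrows=10000)
earray.append([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
h5file.close()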

createVLArray(where, name, atom=None, title='', filters=None, expectedsizeinMB=1.0, createparents=False)

Create a new VLArray instance with name name in where location. See the Section 4.13 for a description of the VLArray class.

atom

An Atom instance representing the shape, type and flavor of the atomic object to be saved. See Section 4.16.3 for the supported set of Atom class descendants.

expectedsizeinMB

A user estimate of the size (in MB) of the final VLArray object. If not provided, the default value is 1 MB. If you plan to create either much smaller or much bigger VLArrays, try providing a guess; this will optimize the HDF5 B-Tree creation and management process time and the amount of memory used.

createparents

Whether to create the needed groups for the parent path to exist (not done by default).

See createTable description for more information on the where, name, title, and filters parameters.
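
A brief creation sketch (names are hypothetical; see Section 4.13 for how rows are appended):

import tables
h5file = tables.openFile("vlarray.h5", mode="w")
# Each row of this VLArray may hold a different number of 32-bit integers.
vlarray = h5file.createVLArray("/", "ragged", tables.Int32Atom(),
                               title="Rows of varying length",
                               expectedsizeinMB=0.5)
h5file.close()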

getNode(where, name=None, classname=None)

Get the node under where with the given name.

where can be a Node instance or a path string leading to a node. If no name is specified, that node is returned.

If a name is specified, this must be a string with the name of a node under where. In this case the where argument can only lead to a Group instance (else a TypeError is raised). The node called name under the group where is returned.

In both cases, if the node to be returned does not exist, a NoSuchNodeError is raised. Please, note that hidden nodes are also considered.

If the classname argument is specified, it must be the name of a class derived from Node. If the node is found but it is not an instance of that class, a NoSuchNodeError is also raised.
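
A brief sketch, given an open File instance h5file and hypothetical node paths:

# Both calls retrieve the same node.
array1 = h5file.getNode("/group1/array1")
array1 = h5file.getNode("/group1", "array1")
# Require the node to be a Leaf instance (a NoSuchNodeError is raised otherwise).
leaf = h5file.getNode("/group1/array1", classname="Leaf")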

isVisibleNode(path)

Is the node under path visible?

If the node does not exist, a NoSuchNodeError is raised.

getNodeAttr(where, attrname, name=None)

Returns the attribute attrname under where.name location.

where, name

These arguments work as in getNode() (see description), referencing the node to be acted upon.

attrname

The name of the attribute to get.

setNodeAttr(where, attrname, attrvalue, name=None)

Sets the attribute attrname with value attrvalue under where.name location. If the node already has a large number of attributes, a PerformanceWarning will be issued.

where, name

These arguments work as in getNode() (see description), referencing the node to be acted upon.

attrname

The name of the attribute to set on disk.

attrvalue

The value of the attribute to set. Any kind of python object (like string, ints, floats, lists, tuples, dicts, small Numeric/NumPy/numarray objects...) can be stored as an attribute. However, if necessary, (c)Pickle is automatically used so as to serialize objects that you might want to save (see 4.15 for details).

delNodeAttr(where, attrname, name=None)

Delete the attribute attrname in where.name location.

where, name

These arguments work as in getNode() (see description), referencing the node to be acted upon.

attrname

The name of the attribute to delete on disk.

copyNodeAttrs(where, dstnode, name=None)

Copy the attributes from node where.name to dstnode.

where, name

These arguments work as in getNode() (see description), referencing the node to be acted upon.

dstnode

This is the destination node where the attributes will be copied. It can be either a path string or a Node object.

iterNodes(where, classname=None)

Returns an iterator yielding children nodes hanging from where. These nodes are alpha-numerically sorted by their node names.

where

This argument works as in getNode() (see description), referencing the node to be acted upon.

classname

If the name of a class derived from Node is supplied in the classname parameter, only instances of that class (or subclasses of it) will be returned.

listNodes(where, classname=None)

Returns a list with children nodes hanging from where. The list is alpha-numerically sorted by node name.

where

This argument works as in getNode() (see description), referencing the node to be acted upon.

classname

If the name of a class derived from Node is supplied in the classname parameter, only instances of that class (or subclasses of it) will be returned.

removeNode(where, name=None, recursive=False)

Remove the node called name under the where location.

where, name

These arguments work as in getNode() (see description), referencing the node to be acted upon.

recursive

If not supplied, the object will be removed only if it has no children; if it does, a NodeError will be raised. If supplied with a true value, the object and all its descendants will be completely removed.

copyNode(where, newparent=None, newname=None, name=None, overwrite=False, recursive=False, createparents=False, **kwargs)

Copy the node specified by where and name to newparent/newname.

where, name

These arguments work as in getNode() (see description), referencing the node to be acted upon.

newparent

The destination group that the node will be copied to (a path name or a Group instance). If newparent is None, the parent of the source node is selected as the new parent.

newname

The name to be assigned to the new copy in its destination (a string). If newname is None or not specified, the name of the source node is used.

overwrite

Whether the possibly existing node newparent/newname should be overwritten or not. Note that trying to copy over an existing node without overwriting it will issue a NodeError.

recursive

Specifies whether the copy should recurse into children of the copied node. This argument is ignored for leaf nodes. The default is not to recurse.

createparents

Whether to create the needed groups for the new parent path to exist (not done by default).

kwargs

Additional keyword arguments may be passed to customize the copying process. The supported arguments depend on the kind of node being copied. The following are some of them:

title

The new title for the destination. If None, the original title is used. This only applies to the topmost node for recursive copies.

filters

Specifying this parameter overrides the original filter properties in the source node. If specified, it must be an instance of the Filters class (see Section 4.17.1). The default is to copy the filter attribute from the source node.

copyuserattrs

You can prevent the user attributes from being copied by setting this parameter to False. The default is to copy them.

start, stop, step

Specify the range of rows in child leaves to be copied; the default is to copy all the rows.

stats

This argument may be used to collect statistics on the copy process. When used, it should be a dictionary with keys groups, leaves and bytes having a numeric value. Their values will be incremented to reflect the number of groups, leaves and bytes, respectively, that have been copied in the operation.
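
For instance, a sketch of a recursive copy that collects statistics (the paths are hypothetical):

stats = {'groups': 0, 'leaves': 0, 'bytes': 0}
h5file.copyNode("/raw", newparent="/backup", recursive=True,
                createparents=True, stats=stats)
print "Copied %(groups)d groups, %(leaves)d leaves, %(bytes)d bytes" % stats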

renameNode(where, newname, name=None)

Change the name of the node specified by where and name to newname.

where, name

These arguments work as in getNode() (see description), referencing the node to be acted upon.

newname

The new name to be assigned to the node (a string).

moveNode(where, newparent=None, newname=None, name=None, overwrite=False, createparents=False)

Move the node specified by where and name to newparent/newname.

where, name

These arguments work as in getNode() (see description), referencing the node to be acted upon.

newparent

The destination group the node will be moved to (a path name or a Group instance). If newparent is None, the original node parent is selected as the new parent.

newname

The new name to be assigned to the node in its destination (a string). If newname is None or not specified, the original node name is used.

The other arguments work as in Node._f_move() (see description).

walkGroups(where='/')

Iterator that returns the list of Groups (not Leaves) hanging from (and including) where. The where Group is listed first (pre-order), then each of its child Groups (following an alpha-numerical order) is also traversed, following the same procedure. If where is not supplied, the root object is used.

where

The origin group. Can be a path string or Group instance.

walkNodes(where="/", classname="")

Recursively iterate over the nodes in the File instance. It takes two parameters:

where

If supplied, the iteration starts from (and includes) this group.

classname

(String) If supplied, only instances of this class are returned.

Example of use:

# Recursively print all the nodes hanging from '/detector'
print "Nodes hanging from group '/detector':"
for node in h5file.walkNodes("/detector"):
    print node

copyChildren(srcgroup, dstgroup, overwrite=False, recursive=False, createparents=False, **kwargs)

Copy the children of a group into another group.

This method copies the nodes hanging from the source group srcgroup into the destination group dstgroup. Existing destination nodes can be replaced by asserting the overwrite argument. If the recursive argument is true, all descendant nodes of srcgroup are recursively copied. If createparents is true, the needed groups for the given destination parent group path to exist will be created.

kwargs takes keyword arguments used to customize the copying process. See the documentation of Group._f_copyChildren() (see description) for a description of those arguments.

copyFile(dstfilename, overwrite=False, **kwargs)

Copy the contents of this file to dstfilename.

dstfilename must be a path string indicating the name of the destination file. If it already exists, the copy will fail with an IOError, unless the overwrite argument is true, in which case the destination file will be overwritten in place. In this last case, the destination file should be closed beforehand, or unexpected errors will occur.

Additional keyword arguments may be passed to customize the copying process. For instance, title and filters may be changed, user attributes may be or may not be copied, data may be sub-sampled, stats may be collected, etc. Arguments unknown to nodes are simply ignored. Check the documentation for copying operations of nodes to see which options they support.

Copying a file usually has the beneficial side effect of creating a more compact and cleaner version of the original file.

flush()

Flush all the leaves in the object tree.

close()

Flush all the leaves in object tree and close the file.

isUndoEnabled()

Is the Undo/Redo mechanism enabled?

Returns True if the Undo/Redo mechanism has been enabled for this file, False otherwise. Please, note that this mechanism is persistent, so a newly opened PyTables file may already have Undo/Redo support.

enableUndo(filters=Filters(complevel=1))

Enable the Undo/Redo mechanism.

This operation prepares the database for undoing and redoing modifications in the node hierarchy. This allows mark(), undo(), redo() and other methods to be called.

The filters argument, when specified, must be an instance of class Filters (see Section 4.17.1) and is meant for setting the compression values for the action log. The default is having compression enabled, as the gains in terms of space can be considerable. You may want to disable compression if you want maximum speed for Undo/Redo operations.

Calling enableUndo() when the Undo/Redo mechanism is already enabled raises an UndoRedoError.

disableUndo()

Disable the Undo/Redo mechanism.

Disabling the Undo/Redo mechanism leaves the database in the current state and forgets past and future database states. This makes mark(), undo(), redo() and other methods fail with an UndoRedoError.

Calling disableUndo() when the Undo/Redo mechanism is already disabled raises an UndoRedoError.

mark(name=None)

Mark the state of the database.

Creates a mark for the current state of the database. A unique (and immutable) identifier for the mark is returned. An optional name (a string) can be assigned to the mark. Both the identifier of a mark and its name can be used in undo() and redo() operations. When the name has already been used for another mark, an UndoRedoError is raised.

This method can only be called when the Undo/Redo mechanism has been enabled. Otherwise, an UndoRedoError is raised.

getCurrentMark()

Get the identifier of the current mark.

Returns the identifier of the current mark. This can be used to know the state of a database after an application crash, or to get the identifier of the initial implicit mark after a call to enableUndo().

This method can only be called when the Undo/Redo mechanism has been enabled. Otherwise, an UndoRedoError is raised.

undo(mark=None)

Go to a past state of the database.

Returns the database to the state associated with the specified mark. Both the identifier of a mark and its name can be used. If the mark is omitted, the last created mark is used. If there are no past marks, or the specified mark is not older than the current one, an UndoRedoError is raised.

This method can only be called when the Undo/Redo mechanism has been enabled. Otherwise, an UndoRedoError is raised.

redo(mark=None)

Go to a future state of the database.

Returns the database to the state associated with the specified mark. Both the identifier of a mark and its name can be used. If the mark is omitted, the next created mark is used. If there are no future marks, or the specified mark is not newer than the current one, an UndoRedoError is raised.

This method can only be called when the Undo/Redo mechanism has been enabled. Otherwise, an UndoRedoError is raised.

goto(mark)

Go to a specific mark of the database.

Returns the database to the state associated with the specified mark. Both the identifier of a mark and its name can be used.

This method can only be called when the Undo/Redo mechanism has been enabled. Otherwise, an UndoRedoError is raised.
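
A short sketch of a full Undo/Redo cycle (the group name is hypothetical):

h5file.enableUndo()
h5file.createGroup("/", "tmpgroup")
h5file.mark("after-create")
h5file.removeNode("/tmpgroup")
h5file.undo("after-create")   # back to the marked state: /tmpgroup exists again
h5file.goto(-1)               # jump to the final implicit mark: /tmpgroup is gone
h5file.disableUndo()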

4.2.3. File special methods

Described below are the methods that automatically trigger actions when a File instance is accessed in a special way.

__contains__(path)

Is there a node with that path?

Returns True if the file has a node with the given path (a string), False otherwise.

__iter__()

Iterate over the children of the File instance. This method does not accept any parameters. The iteration is recursive.

Example of use:

# Recursively list all the nodes in the object tree
h5file = tables.openFile("vlarray1.h5")
print "All nodes in the object tree:"
for node in h5file:
    print node

__str__()

Prints a short description of the File object.

Example of use:

>>> f=tables.openFile("data/test.h5")
>>> print f
data/test.h5 (File) 'Table Benchmark'
Last modif.: 'Mon Sep 20 12:40:47 2004'
Object Tree:
/ (Group) 'Table Benchmark'
/tuple0 (Table(100L,)) 'This is the table title'
/group0 (Group) ''
/group0/tuple1 (Table(100L,)) 'This is the table title'
/group0/group1 (Group) ''
/group0/group1/tuple2 (Table(100L,)) 'This is the table title'
/group0/group1/group2 (Group) ''

__repr__()

Prints a detailed description of the File object.

4.3. The Node class

This is the base class for all nodes in a PyTables hierarchy. It is an abstract class, i.e. it may not be directly instantiated; however, every node in the hierarchy is an instance of this class.

A PyTables node is always hosted in a PyTables file, under a parent group, at a certain depth in the node hierarchy. A node knows its own name in the parent group and its own path name in the file. When using a translation map (see 4.2), its HDF5 name might differ from its PyTables name.

All the previous information is location-dependent, i.e. it may change when moving or renaming a node in the hierarchy. A node also has location-independent information, such as its HDF5 object identifier and its attribute set.

This class gathers the operations and attributes (both location-dependent and independent) which are common to all PyTables nodes, whatever their type is. Nonetheless, due to natural naming restrictions, the names of all of these members start with a reserved prefix (see 4.4).

Sub-classes with no children (i.e. leaf nodes) may define new methods, attributes and properties to avoid natural naming restrictions. For instance, _v_attrs may be shortened to attrs and _f_rename to rename. However, the original methods and attributes should still be available.

4.3.1. Node instance variables

Location dependent

_v_file

The hosting File instance (see 4.2).

_v_parent

The parent Group instance (see 4.4).

_v_depth

The depth of this node in the tree (a non-negative integer value).

_v_name

The name of this node in its parent group (a string).

_v_hdf5name

The name of this node in the hosting HDF5 file (a string).

_v_pathname

The path of this node in the tree (a string).

_v_rootgroup

The root group instance. This is deprecated; please use node._v_file.root.

Location independent

_v_objectID

The identifier of this node in the hosting HDF5 file.

_v_attrs

The associated AttributeSet instance (see 4.15).

Attribute shorthands

_v_title

A description of this node. A shorthand for TITLE attribute.

4.3.2. Node methods

Hierarchy manipulation

_f_close()

Close this node in the tree.

This releases all resources held by the node, so it should not be used again. On nodes with data, that data may be flushed to disk.

The closing operation is not recursive, i.e. closing a group does not close its children.

_f_isOpen()

Is this node open?

_f_remove(recursive=False)

Remove this node from the hierarchy.

If the node has children, recursive removal must be stated by giving recursive a true value; otherwise, a NodeError will be raised.

_f_rename(newname)

Rename this node in place.

Changes the name of a node to newname (a string).

_f_move(newparent=None, newname=None, overwrite=False, createparents=False)

Move or rename this node.

Moves a node into a new parent group, or changes the name of the node. newparent can be a Group object or a pathname in string form. If it is not specified or None, the current parent group is chosen as the new parent. newname must be a string with a new name. If it is not specified or None, the current name is chosen as the new name. If createparents is true, the needed groups for the given new parent group path to exist will be created.

Moving a node across databases is not allowed, nor is moving a node into itself. These result in a NodeError. However, moving a node over itself is allowed and simply does nothing. Moving over another existing node is similarly not allowed, unless the optional overwrite argument is true, in which case that node is recursively removed before moving.

Usually, only the first argument will be used, effectively moving the node to a new location without changing its name. Using only the second argument is equivalent to renaming the node in place.

_f_copy(newparent=None, newname=None, overwrite=False, recursive=False, createparents=False, **kwargs)

Copy this node and return the new node.

Creates and returns a copy of the node, maybe in a different place in the hierarchy. newparent can be a Group object or a pathname in string form. If it is not specified or None, the current parent group is chosen as the new parent. newname must be a string with a new name. If it is not specified or None, the current name is chosen as the new name. If recursive is true, all descendants are copied as well. If createparents is true, the needed groups for the given new parent group path to exist will be created.

Copying a node across databases is supported but cannot be undone. Copying a node over itself is not allowed, nor is recursively copying a node into itself. These result in a NodeError. Copying over another existing node is similarly not allowed, unless the optional overwrite argument is true, in which case that node is recursively removed before copying.

Additional keyword arguments may be passed to customize the copying process. For instance, title and filters may be changed, user attributes may be or may not be copied, data may be sub-sampled, stats may be collected, etc. See the documentation for the particular node type.

Using only the first argument is equivalent to copying the node to a new location without changing its name. Using only the second argument is equivalent to making a copy of the node in the same group.

_f_isVisible()

Is this node visible?

Attribute handling

_f_getAttr(name)

Get a PyTables attribute from this node.

If the named attribute does not exist, an AttributeError is raised.

_f_setAttr(name, value)

Set a PyTables attribute for this node.

If the node already has a large number of attributes, a PerformanceWarning is issued.

_f_delAttr(name)

Delete a PyTables attribute from this node.

If the named attribute does not exist, an AttributeError is raised.

4.4. The Group class

Instances of this class are a grouping structure containing instances of zero or more groups or leaves, together with supporting metadata.

Working with groups and leaves is similar in many ways to working with directories and files, respectively, in a Unix filesystem. As with Unix directories and files, objects in the object tree are often described by giving their full (or absolute) path names. This full path can be specified either as a string (like in '/group1/group2') or as a complete object path written in natural name schema (like in file.root.group1.group2) as discussed in the Section 1.2.

A collateral effect of the natural naming schema is that the names of Group members must be carefully chosen to avoid colliding with existing child node names. For this reason, and to avoid polluting the children namespace, it is explicitly forbidden to assign normal attributes to Group instances, and all existing members start with some reserved prefixes, like _f_ (for methods) or _v_ (for instance variables). Any attempt to set a new child node whose name starts with one of these prefixes will raise a ValueError exception.

Another effect of natural naming is that nodes with reserved Python names or other disallowed Python names (for example, $a or 44) cannot be accessed using the node.child syntax. You will be forced to use getattr(node, child) and delattr(node, child) to access them.

You can also make use of the trMap (translation map dictionary) parameter in the openFile function (see description) in order to translate HDF5 names not suited for natural naming into more convenient ones.

4.4.1. Group instance variables

These instance variables are provided in addition to those in Node (see 4.3).

_v_nchildren

The number of children hanging from this group.

_v_children

Dictionary with all nodes hanging from this group.

_v_groups

Dictionary with all groups hanging from this group.

_v_leaves

Dictionary with all leaves hanging from this group.

_v_filters

Default filter properties for child nodes —see 4.17.1. A shorthand for FILTERS attribute.

4.4.2. Group methods

This class defines the __setattr__, __getattr__ and __delattr__ methods, and they set, get and delete ordinary Python attributes as normally intended. In addition to that, __getattr__ allows getting child nodes by their name for the sake of easy interaction on the command line, as long as there is no Python attribute with the same name. Groups also allow the interactive completion (when using readline) of the names of child nodes. For instance:

nchild = group._v_nchildren  # get a Python attribute

# Add a Table child called "table" under "group".
h5file.createTable(group, 'table', myDescription)

table = group.table          # get the table child instance
group.table = 'foo'          # set a Python attribute
# (PyTables warns you here about using the name of a child node.)
foo = group.table            # get a Python attribute
del group.table              # delete a Python attribute
table = group.table          # get the table child instance again

Caveat: The following methods are documented for completeness, and they can be used without any problem. However, you should normally use the high-level counterpart methods in the File class, because they are the ones most used in documentation and examples, and are a bit more powerful than those exposed here.

These methods are provided in addition to those in Node (see 4.3).

_f_getChild(childname)

Get the child called childname of this group.

If the child exists (be it visible or not), it is returned. Else, a NoSuchNodeError is raised.

_f_copy(newparent, newname, overwrite=False, recursive=False, createparents=False, **kwargs)

Copy this node and return the new one.

This method has the behavior described in Node._f_copy() (see description). In addition, it recognizes the following keyword arguments:

title

The new title for the destination. If omitted or None, the original title is used. This only applies to the topmost node in recursive copies.

filters

Specifying this parameter overrides the original filter properties in the source node. If specified, it must be an instance of the Filters class (see Section 4.17.1). The default is to copy the filter properties from the source node.

copyuserattrs

You can prevent the user attributes from being copied by setting this parameter to False. The default is to copy them.

stats

This argument may be used to collect statistics on the copy process. When used, it should be a dictionary with keys 'groups', 'leaves' and 'bytes' having a numeric value. Their values will be incremented to reflect the number of groups, leaves and bytes, respectively, that have been copied during the operation.

_f_iterNodes(classname=None)

Returns an iterator yielding all the object nodes hanging from this instance. The nodes are alpha-numerically sorted by their node names. If a classname parameter is supplied, only instances of that class (or subclasses of it) will be returned.

_f_listNodes(classname=None)

Returns a list with all the object nodes hanging from this instance. The list is alpha-numerically sorted by node name. If a classname parameter is supplied, it will only return instances of this class (or subclasses of it).

_f_walkGroups()

Iterate over the list of Groups (not Leaves) hanging from (and including) self. This Group is listed first (pre-order), then each of its child Groups (following an alpha-numerical order) is also traversed, following the same procedure.

_f_walkNodes(classname=None, recursive=True)

Iterate over the nodes in the Group instance. It takes two parameters:

classname

(String) If supplied, only instances of this class are returned.

recursive

(Integer) If false, only children hanging directly from the group are returned. If true, a recursion over all the groups hanging from it is performed.

Example of use:

# Recursively print all the arrays hanging from '/'
print "Arrays the object tree '/':"
for array in h5file.root._f_walkNodes("Array", recursive=1):
    print array

_f_close()

Close this node in the tree.

This method has the behavior described in Node._f_close() (see description). It should be noted that this operation disables access to nodes descending from this group. Therefore, if you want to explicitly close them, you will need to walk the nodes hanging from this group before closing it.

_f_copyChildren(dstgroup, overwrite=False, recursive=False, createparents=False, **kwargs)

Copy the children of this group into another group.

Children hanging directly from this group are copied into dstgroup, which can be a Group (see 4.4) object or its pathname in string form. If createparents is true, the needed groups for the given destination group path to exist will be created.

The operation will fail with a NodeError if there is a child node in the destination group with the same name as one of the copied children from this one, unless overwrite is true; in this case, the former child node is recursively removed before copying the latter.

By default, nodes descending from children groups of this node are not copied. If the recursive argument is true, all descendant nodes of this node are recursively copied.

Additional keyword arguments may be passed to customize the copying process. For instance, title and filters may be changed, user attributes may be or may not be copied, data may be sub-sampled, stats may be collected, etc. Arguments unknown to nodes are simply ignored. Check the documentation for copying operations of nodes to see which options they support.

4.4.3. Group special methods

Described below are the methods that automatically trigger actions when a Group instance is accessed in a special way.

__setattr__(name, value)

Set a Python attribute called name with the given value.

This method stores an ordinary Python attribute in the object. It does not store new child nodes under this group; for that, use the File.create*() methods (see 4.2). Nor does it store a PyTables node attribute; for that, use File.setNodeAttr() (see description), Node._f_setAttr() (see description) or Node._v_attrs (see _v_attrs).

If there is already a child node with the same name, a NaturalNameWarning will be issued and the child node will not be accessible via natural naming or getattr(). It will still be available via File.getNode() (see description), Group._f_getChild() (see description) and the children dictionaries in the group (if visible).

__getattr__(name)

Get a Python attribute or child node called name.

If the object has a Python attribute called name, its value is returned. Else, if the node has a child node called name, it is returned. Else, an AttributeError is raised.

__delattr__(name)

Delete a Python attribute called name.

This method deletes an ordinary Python attribute from the object. It does not remove child nodes from this group; for that, use File.removeNode() (see description) or Node._f_remove() (see description). Nor does it delete a PyTables node attribute; for that, use File.delNodeAttr() (see description), Node._f_delAttr() (see description) or Node._v_attrs (see _v_attrs).

If both a Python attribute and a child node with the same name existed, the child node will become accessible again via natural naming.

__contains__(name)

Is there a child with that name?

Returns True if the group has a child node (visible or hidden) with the given name (a string), False otherwise.

__iter__()

Iterate over the children of the group instance. This method does not accept any parameters. The iteration is not recursive.

Example of use:

# Non-recursively list all the nodes hanging from '/detector'
print "Nodes in '/detector' group:"
for node in h5file.root.detector:
    print node

__str__()

Prints a short description of the Group object.

Example of use:

>>> f=tables.openFile("data/test.h5")
>>> print f.root.group0
/group0 (Group) 'First Group'
>>>

__repr__()

Prints a detailed description of the Group object.

Example of use:

>>> f=tables.openFile("data/test.h5")
>>> f.root.group0
/group0 (Group) 'First Group'
  children := ['tuple1' (Table), 'group1' (Group)]
>>>

4.5. The Leaf class

The goal of this class is to provide a place to put common functionality of all its descendants, as well as to provide a way to help classify objects in the tree. A Leaf object is an end-node, that is, a node that can hang directly from a group object, but that is not a group itself and thus cannot have descendants. Right now, the set of end-nodes is composed of Table, Array, CArray, EArray, VLArray and UnImplemented class instances. In fact, all the previous classes inherit from the Leaf class.

4.5.1. Leaf instance variables

These instance variables are provided in addition to those in Node (see 4.3).

shape

The shape of data in the leaf.

byteorder

The byte ordering of data in the leaf.

filters

Filter properties for this leaf —see 4.17.1.

name

The name of this node in its parent group (a string). An alias for Node._v_name.

hdf5name

The name of this node in the hosting HDF5 file (a string). An alias for Node._v_hdf5name.

objectID

The identifier of this node in the hosting HDF5 file. An alias for Node._v_objectID.

attrs

The associated AttributeSet instance (see 4.15). An alias for Node._v_attrs.

title

A description for this node. An alias for Node._v_title.

4.5.2. Leaf methods

flush()

Flush pending data to disk.

Saves whatever remaining buffered data to disk. It also releases I/O buffers, so if you are filling many objects (i.e. tables) in the same PyTables session, please call flush() frequently so as to help PyTables keep memory requirements low.

_f_close(flush=True)

Close this node in the tree.

This method has the behavior described in Node._f_close() (see description). Besides that, the optional argument flush tells whether to flush pending data to disk or not before closing.

close(flush=True)

Close this node in the tree.

This method is completely equivalent to _f_close().

isOpen()

Is this node open?

This method is completely equivalent to _f_isOpen().

remove()

Remove this node from the hierarchy.

This method has the behavior described in Node._f_remove() (see description). Please, note that there is no recursive flag since leaves do not have child nodes.

copy(newparent, newname, overwrite=False, createparents=False, **kwargs)

Copy this node and return the new one.

This method has the behavior described in Node._f_copy() (see description). Please, note that there is no recursive flag since leaves do not have child nodes. In addition, this method recognizes the following keyword arguments:

title

The new title for the destination. If omitted or None, the original title is used.

filters

Specifying this parameter overrides the original filter properties in the source node. If specified, it must be an instance of the Filters class (see Section 4.17.1). The default is to copy the filter properties from the source node.

copyuserattrs

You can prevent the user attributes from being copied by setting this parameter to False. The default is to copy them.

start, stop, step

Specify the range of rows in child leaves to be copied; the default is to copy all the rows.

stats

This argument may be used to collect statistics on the copy process. When used, it should be a dictionary with keys 'groups', 'leaves' and 'bytes' having a numeric value. Their values will be incremented to reflect the number of groups, leaves and bytes, respectively, that have been copied during the operation.

rename(newname)

Rename this node in place.

This method has the behavior described in Node._f_rename() (see description).

move(newparent=None, newname=None, overwrite=False, createparents=False)

Move or rename this node.

This method has the behavior described in Node._f_move() (see description).

_f_isVisible()

Is this node visible?

This method has the behavior described in Node._f_isVisible() (see description).

getAttr(name)

Get a PyTables attribute from this node.

This method has the behavior described in Node._f_getAttr() (see description).

setAttr(name, value)

Set a PyTables attribute for this node.

This method has the behavior described in Node._f_setAttr() (see description).

delAttr(name)

Delete a PyTables attribute from this node.

This method has the behavior described in Node._f_delAttr() (see description).

4.6. The Table class

Instances of this class represent table objects in the object tree. The class provides methods to read/write data from/to table objects in the file.

Data can be read from or written to tables by accessing a special object that hangs from Table. This object is an instance of the Row class (see 4.6.4). See the tutorial sections in Chapter 3 on how to use the Row interface. The columns of the tables can also be easily accessed (more specifically, they can be read but not written) by making use of the Column class, through an extension of the natural naming schema applied inside the tables. See Section 4.9 for some examples of the use of this capability.

Note that this object inherits all the public attributes and methods that Leaf already has.

Finally, during the description of the different methods, there will appear references to a particular object called NestedRecArray. This inherits from numarray.records.RecArray and is designed to keep columns that have nested datatypes. Please, see Appendix B for info on these objects.

4.6.1. Table instance variables

description

A Description (see 4.8) instance describing the structure of this table.

row

The associated Row instance (see 4.6.4).

nrows

The number of rows in this table.

rowsize

The size in bytes of each row in the table.

cols

A Cols (see Section 4.7) instance that serves as an accessor to Column (see Section 4.9) objects.

colnames

A tuple containing the (possibly nested) names of the columns in the table.

coltypes

Maps the name of a column to its datatype.

colstypes

Maps the name of a column to its data string type.

colshapes

Maps the name of a column to its shape.

colitemsizes

Maps the name of a column to the size of its base items.

coldflts

Maps the name of a column to its default value.

colindexed

A dictionary that maps the name of a column to whether that column is indexed.

indexed

Does this table have any indexed columns?

indexprops

Index properties for this table (an IndexProps instance, see 4.17.2).

flavor

The default flavor for this table. This determines the type of objects returned during input (i.e. read) operations. It can take the "numarray" (default) or "numpy" values. Its value is derived from the _v_flavor attribute of the IsDescription metaclass (see 4.16.1) or, if the table has been created directly from a numarray or NumPy object, the flavor is set to the appropriate value.

4.6.2. Table methods

getEnum(colname)

Get the enumerated type associated with the named column.

If the column named colname (a string) exists and is of an enumerated type, the corresponding Enum instance (see 4.17.4) is returned. If it is not of an enumerated type, a TypeError is raised. If the column does not exist, a KeyError is raised.

append(rows)

Append a series of rows to this Table instance. rows is an object that can hold the rows to be appended in several formats, like a NestedRecArray (see Appendix B), a RecArray, a NumPy object, a list of tuples, a list of Numeric/numarray/NumPy objects, a string, a Python buffer or None (in which case nothing is appended). Of course, this rows object has to be compliant with the underlying format of the Table instance or a ValueError will be raised.

Example of use:

from tables import *
class Particle(IsDescription):
    name        = StringCol(16, pos=1)   # 16-character String
    lati        = IntCol(pos=2)        # integer
    longi       = IntCol(pos=3)        # integer
    pressure    = Float32Col(pos=4)    # float  (single-precision)
    temperature = FloatCol(pos=5)      # double (double-precision)

fileh = openFile("test4.h5", mode = "w")
table = fileh.createTable(fileh.root, 'table', Particle, "A table")
# Append several rows in only one call
table.append([("Particle:     10", 10, 0, 10*10, 10**2),
              ("Particle:     11", 11, -1, 11*11, 11**2),
              ("Particle:     12", 12, -2, 12*12, 12**2)])
fileh.close()

col(name)

Get a column from the table.

If a column called name exists in the table, it is read and returned as a numarray object, or as a NumPy object (whatever is more appropriate depending on the flavor of the table). If it does not exist, a KeyError is raised.

Example of use:

narray = table.col('var2')

That statement is equivalent to:

narray = table.read(field='var2')

Here you can see how this method can be used as a shorthand for the read() (see description) method.

iterrows(start=None, stop=None, step=1)

Returns an iterator yielding Row (see Section 4.6.4) instances built from rows in table. If a range is supplied (i.e. some of the start, stop or step parameters are passed), only the appropriate rows are returned. Else, all the rows are returned. See also the __iter__() special method in Section 4.6.3 for a shorter way to call this iterator.

The meaning of the start, stop and step parameters is the same as in the range() Python function, except that negative values of step are not allowed. Moreover, if only start is specified, then stop will be set to start+1. If you specify neither start nor stop, then all the rows in the object are selected.

Example of use:

result = [ row['var2'] for row in table.iterrows(step=5)
           if row['var1'] <= 20 ]

Note: This iterator can be nested (see example in description).

itersequence(sequence, sort=True)

Iterate over a sequence of row coordinates.

sequence

Can be any object that supports the __getitem__ special method, like lists, tuples, Numeric/NumPy/numarray objects, etc.

sort

If true, the sequence will be sorted so that the I/O process gets better performance. If your sequence is already sorted or you do not want to sort it, set this parameter to a false value. The default is to sort the sequence.

Note: This iterator can be nested (see example in description).
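
A brief sketch, reusing the Particle table from the append() example above:

# Iterate only over the first and third rows of the table.
values = [ row['pressure'] for row in table.itersequence([0, 2]) ]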

read(start=None, stop=None, step=1, field=None, flavor=None)

Returns the actual data in the Table. If field is not supplied, it returns the data as a NestedRecArray (see Appendix B) object.

The meaning of the start, stop and step parameters is the same as in the range() Python function, except that negative values of step are not allowed. Moreover, if only start is specified, then stop will be set to start+1. If you specify neither start nor stop, then all the rows in the object are selected.

The rest of the parameters are described next:

field

If specified, only the column field is returned as an homogeneous numarray/NumPy/Numeric object, depending on the flavor. If this is not supplied, all the fields are selected and a NestedRecArray (see Appendix B) or NumPy object is returned. Nested fields can be specified in the field parameter by using a '/' character as a separator between fields (e.g. Info/value).

flavor

Passing a flavor parameter makes an additional conversion happen in the default returned object. flavor can have any of the following values: "numarray", "numpy", "python" or "numeric" (the latter only if field has been specified). If flavor is not specified, then it will take the value of self.flavor.

readCoordinates(coords, field=None, flavor=None)

Read a set of rows given their indexes into an in-memory object.

This method works much like the read() method (see description), but it uses a sequence (coords) of row indexes to select the wanted columns, instead of a column range.

It returns the selected rows in a NestedRecArray object (see Appendix B). If flavor is provided, an additional conversion to an object of this flavor is made, just as in read().
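
A short sketch, assuming a table with a 'pressure' column:

# Read the 'pressure' field of the first and third rows into memory.
values = table.readCoordinates([0, 2], field="pressure")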

modifyRows(start=None, stop=None, step=1, rows=None)

Modify a series of rows in the [start:stop:step] extended slice range. If you pass None to stop, all the rows existing in rows will be used.

rows can be either a recarray or a structure that can be converted into one and that is compliant with the table format.

Returns the number of modified rows.

It raises a ValueError in case the rows parameter could not be converted into an object compliant with the table description.

It raises an IndexError in case the modification would exceed the length of the table.

modifyColumn(start=None, stop=None, step=1, column=None, colname=None)

Modify a series of rows in the [start:stop:step] extended slice row range. If you pass None to stop, all the rows existing in column will be used.

column can be either a NestedRecArray (see Appendix B), RecArray, numarray or NumPy object, list or tuple that can be converted into a NestedRecArray compliant with the specified colname column of the table.

colname specifies the column name of the table to be modified.

Returns the number of modified rows.

It raises a ValueError in case the column parameter could not be converted into an object compliant with the column description.

It raises an IndexError in case the modification would exceed the length of the table.
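
For instance, assuming a table with a column named var1 that holds at least three rows, the next sketch overwrites the first three values of that column:

table.modifyColumn(start=0, stop=3, colname='var1', column=[4, 5, 6])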

modifyColumns(start=None, stop=None, step=1, columns=None, names=None)

Modify a series of rows in the [start:stop:step] extended slice row range. If you pass None to stop, all the rows existing in columns will be used.

columns can be either a NestedRecArray (see Appendix B), a RecArray, a NumPy object, or a list of arrays, lists or tuples (the columns) that can be converted into a NestedRecArray compliant with the subset of table columns specified by names.

names specifies the column names of the table to be modified.

Returns the number of modified rows.

It raises a ValueError in case the columns parameter could not be converted into an object compliant with the table description.

It raises an IndexError in case the modification would exceed the length of the table.

removeRows(start, stop=None)

Removes a range of rows from the table. If only start is supplied, that single row is deleted. If a range is supplied, i.e. both the start and stop parameters are passed, all the rows in the range are removed. A step parameter is not supported, and there are no plans to implement it anytime soon.

start

Sets the starting row to be removed. It accepts negative values meaning that the count starts from the end. A value of 0 means the first row.

stop

Sets the last row to be removed to stop - 1, i.e. the end point is omitted (in the Python range tradition). Like start, it accepts negative values. A special value of None (the default) means removing just the row supplied in start.
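
For example (row numbers are merely illustrative):

table.removeRows(3)         # remove row 3 only
table.removeRows(10, 20)    # remove rows 10 to 19
table.removeRows(-1)        # remove the last row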

removeIndex(index)

Remove the index associated with the specified column.

The argument colname should be the name of a column. If the column is not indexed, nothing happens. If it does not exist, a KeyError is raised.

This index can be created again by calling the createIndex() (see description) method of the appropriate Column object.

flushRowsToIndex()

Add remaining rows in buffers to non-dirty indexes. This can be useful when you have chosen non-automatic indexing for the table (see Section 4.17.2) and want to update the indexes on it.
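
A minimal sketch, assuming a table with an indexed column col1 whose indexes are not automatically updated:

row = table.row
for i in xrange(1000):
    row['col1'] = i
    row.append()
table.flush()
table.flushRowsToIndex()    # now the index for col1 is up to date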

reIndex()

Recompute all the existing indexes in the table. This can be useful when you suspect that, for any reason, the index information for the columns is no longer valid and want to rebuild the indexes.

reIndexDirty()

Recompute the existing indexes in the table, but only if they are dirty. This can be useful when you have set the reindex parameter to 0 in the IndexProps constructor (see description) for the table and want to update the indexes after an index-invalidating operation (Table.removeRows, for example).

where(condition, start=None, stop=None, step=None)

Iterate over values fulfilling a condition.

This method returns an iterator yielding Row (see 4.6.4) instances built from rows in the table that satisfy the given condition over a column. If that column is indexed, its index will be used in order to accelerate the search. Otherwise, the in-kernel iterator (which still has better performance than standard Python selections) will be chosen instead. Please check Section 5.2 for more information about the performance of the different searching modes.

Moreover, if a range is supplied (i.e. some of the start, stop or step parameters are passed), only the rows in that range that fulfill the condition are returned. The meaning of the start, stop and step parameters is the same as in the range() Python function, except that negative values of step are not allowed; and if only start is specified, then stop will be set to start+1.

You can mix this method with standard Python selections in order to have complex queries. It is strongly recommended that you pass the most restrictive condition as the parameter to this method if you want to achieve maximum performance.

Example of use:

passvalues=[]
for row in table.where(0 < table.cols.col1 < 0.3, step=5):
    if row['col2'] <= 20:
        passvalues.append(row['col3'])
print "Values that pass the cuts:", passvalues

Note that, from PyTables 1.1 on, you can nest several iterators over the same table. For example:

for p in rout.where(rout.cols.pressure < 16):
    for q in rout.where(rout.cols.pressure < 9):
        for n in rout.where(rout.cols.energy < 10):
            print "pressure, energy:", p['pressure'],n['energy']

In this example, iterators returned by where() have been used, but you may as well use any of the other reading iterators that the Table object offers. Look at examples/nested-iter.py for the full code.

whereAppend(dstTable, condition, start=None, stop=None, step=None)

Append rows fulfilling the condition to the dstTable table.

dstTable must be capable of taking the rows resulting from the query, i.e. it must have columns with the expected names and compatible types. The meaning of the other arguments is the same as in the where() method (see description).

The number of rows appended to dstTable is returned as a result.
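
A minimal sketch, assuming that dstTable is an already created table with a description compatible with table, and that col1 is one of its columns:

nappended = table.whereAppend(dstTable, table.cols.col1 < 0.3)
print "Rows appended to dstTable:", nappended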

getWhereList(condition, flavor=None)

Get the row coordinates that fulfill the condition parameter. This method will take advantage of an indexed column to speed-up the search.

flavor is the desired type of the returned list. It can take the "numarray", "numpy", "numeric" or "python" values. The default is to return an object of the same flavor as self.flavor.
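
For instance, assuming an indexed column named col1, the next sketch gets the matching row coordinates as a Python list and then reads the corresponding rows:

coords = table.getWhereList(table.cols.col1 < 0.3, flavor="python")
rows = table.readCoordinates(coords)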

4.6.3. Table special methods

The following methods automatically trigger actions when a Table instance is accessed in a special way (e.g., table["var2"] is equivalent to a call to table.__getitem__("var2")).

__iter__()

It returns the same iterator as Table.iterrows(0,0,1). However, it does not accept parameters.

Example of use:

result = [ row['var2'] for row in table if row['var1'] <= 20 ]

Which is equivalent to:

result = [ row['var2'] for row in table.iterrows()
                       if row['var1'] <= 20 ]

Note: This iterator can be nested (see example in description).

__getitem__(key)

Get a row or a range of rows from the table.

If the key argument is an integer, the corresponding table row is returned as a numarray.records.Record or as a tables.nestedrecords.NestedRecord object, whichever is more appropriate. If key is a slice, the range of rows determined by it is returned as a numarray.records.RecArray or as a tables.nestedrecords.NestedRecArray object, whichever is more appropriate.

Using a string as key to get a column is supported but deprecated. Please use the col() (see description) method.

Example of use:

record = table[4]
recarray = table[4:1000:2]

Those statements are equivalent to:

record = table.read(start=4)[0]
recarray = table.read(start=4, stop=1000, step=2)

Here you can see how indexing and slicing can be used as shorthands for the read() (see description) method.

__setitem__(key, value)

It takes different actions depending on the type of the key parameter:

key is an Integer

The corresponding table row is set to value. value must be a List or Tuple capable of being converted to the table field format.

key is a Slice

The row slice determined by key is set to value. value must be a NestedRecArray object or a RecArray object or a list of rows capable of being converted to the table field format.

Example of use:

# Modify just one existing row
table[2] = [456,'db2',1.2]
# Modify two existing rows
rows = numarray.records.array([[457,'db1',1.2], [6,'de2',1.3]],
                              formats="i4,a3,f8")
table[1:3:2] = rows

Which is equivalent to:

table.modifyRows(start=2, rows=[456,'db2',1.2])
rows = numarray.records.array([[457,'db1',1.2], [6,'de2',1.3]],
                              formats="i4,a3,f8")
table.modifyRows(start=1, step=2, rows=rows)

4.6.4. The Row class

This class is used to fetch and set values on the table fields. It works very much like a dictionary, where the keys are the field names of the associated table and the values are the values of those fields in a specific row.

This object turns out to actually be an extension type, so you won't be able to access its documentation interactively. However, you will be able to access some of its internal attributes through the use of Python properties. In addition, there are some important methods that are useful for adding and modifying values in tables.

Row attributes

nrow

Property that returns the current row number in the table. It is useful to know which row is being dealt with in the middle of a loop or iterator.

Row methods

append()

Once you have filled the proper fields for the current row, calling this method actually appends the new data to disk (actually, the data are written to an internal output buffer).

Example of use:

row = table.row
for i in xrange(nrows):
    row['col1'] = i-1
    row['col2'] = 'a'
    row['col3'] = -1.0
    row.append()
table.flush()

Please note that, after the loop in which Row.append() has been called, it is always convenient to make a call to Table.flush() in order to avoid losing the last rows, which may still remain in internal buffers.

update()

This allows you to modify values in your tables when you are in the middle of table iterators, like Table.iterrows() (see description) or Table.where() (see description). Once you have filled the proper fields for the current row, calling this method actually commits the data to disk (actually, the data are written to the output buffer).

Example of use:

for row in table.iterrows(step=10):
    row['col1'] = row.nrow
    row['col2'] = 'b'
    row['col3'] = 0.0
    row.update()

which modifies every tenth row in table. Or:

for row in table.where(table.cols.col1 > 3):
    row['col1'] = row.nrow
    row['col2'] = 'b'
    row['col3'] = 0.0
    row.update()

which updates only the rows whose value in the first column is greater than 3.

4.7. The Cols class

This class is used as an accessor to the table columns, following the natural naming convention: for each column of the associated table there is an attribute with the same name, which can be a Column instance (non-nested column) or another Cols instance (nested column).

Columns under a Cols accessor can be accessed as attributes of it. For instance, if table.cols is a Cols instance with a column named col1 under it, the latter can be accessed as table.cols.col1. If col1 is nested and contains a col2 column, this can be accessed as table.cols.col1.col2 and so on.

4.7.1. Cols instance variables

_v_colnames

A list of the names of the columns (or nested columns) hanging directly from this Cols instance. The order of the names matches the order of their respective columns in the containing table.

_v_colpathnames

A list of the complete pathnames of the columns hanging directly from this Cols instance. If the table does not contain nested columns, this is exactly the same as _v_colnames attribute.

_v_table

The parent Table instance.

_v_desc

The associated Description (see Section 4.8) instance.

4.7.2. Cols methods

_f_col(colname)

Return a handler to the colname column. If colname is a nested column, a Cols instance is returned. If colname is a non-nested column a Column object is returned instead.
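
For example, assuming a table with a non-nested column named col1 and a nested column named Info:

col1 = table.cols._f_col('col1')    # a Column instance
info = table.cols._f_col('Info')    # a Cols instance (nested column)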

__getitem__(key)

Get a row or a range of rows from the Cols accessor.

If the key argument is an integer, the corresponding Cols row is returned as a numarray.records.Record or as a tables.nestedrecords.NestedRecord object, whichever is more appropriate. If key is a slice, the range of rows determined by it is returned as a numarray.records.RecArray or as a tables.nestedrecords.NestedRecArray object, whichever is more appropriate.

Using a string as key to get a column is supported but deprecated. Please use the _f_col() (see description) method.

Example of use:

record = table.cols[4]  # equivalent to table[4]
recarray = table.cols.Info[4:1000:2]

Those statements are equivalent to:

nrecord = table.read(start=4)[0]
nrecarray = table.read(start=4, stop=1000, step=2).field('Info')

Here you can see how a mix of natural naming, indexing and slicing can be used as shorthands for the read() (see description) method.

__setitem__(key, value)

Set a row or a range of rows to the Cols accessor.

If the key argument is an integer, the corresponding Cols row is set to the value object. If key is a slice, the range of rows determined by it is set to the value object.

Example of use:

table.cols[4] = record
table.cols.Info[4:1000:2] = recarray

Those statements are equivalent to:

table.modifyRows(4, rows=record)
table.modifyColumn(4, 1000, 2, colname='Info', column=recarray)

Here you can see how a mix of natural naming, indexing and slicing can be used as shorthands for the modifyRows() and modifyColumn() (see description and description) methods.

4.8. The Description class

The instances of the Description class provide a description of the structure of a table.

An instance of this class is automatically bound to Table (see 4.6) objects when they are created. It provides a browseable representation of the structure of the table, made of non-nested (Col —see 4.16.2) and nested (Description) columns. It also contains information that will allow you to build NestedRecArray (see Appendix B) objects suited for the different columns in a table (be they nested or not).

Column descriptions (see Col class in 4.16.2) under a description can be accessed as attributes of it. For instance, if table.description is a Description instance with a column named col1 under it, the latter can be accessed as table.description.col1. If col1 is nested and contains a col2 column, this can be accessed as table.description.col1.col2.

4.8.1. Description instance variables

_v_name

The name of this description instance. If description is the root of the nested type (or the description of a flat table), its name will be the empty string ('').

_v_names

A list of the names of the columns hanging directly from this description instance. The order of the names matches the order of their respective columns in the containing description.

_v_pathnames

A list of the pathnames of the columns hanging directly from this description. If the table does not contain nested columns, this is exactly the same as _v_names attribute.

_v_nestedNames

A nested list of the names of all the columns hanging directly from this description instance. You can use this for the names argument of NestedRecArray factory functions.

_v_nestedFormats

A nested list of the numarray string formats (and shapes) of all the columns hanging directly from this description instance. You can use this for the formats argument of NestedRecArray factory functions.

_v_nestedDescr

A nested list of pairs of (name, format) tuples for all the columns under this table or nested column. You can use this for the descr argument of NestedRecArray factory functions.

_v_types

A dictionary mapping the names of non-nested columns hanging directly from this description instance to their respective numarray types.

_v_stypes

A dictionary mapping the names of non-nested columns hanging directly from this description instance to their respective string types.

_v_shapes

A dictionary mapping the names of non-nested columns hanging directly from this description instance to their respective shapes.

_v_dflts

A dictionary mapping the names of non-nested columns hanging directly from this description instance to their respective default values. Please, note that all the default values are kept internally as numarray objects.

_v_colObjects

A dictionary mapping the names of the columns hanging directly from this description instance to their respective descriptions (Col —see 4.16.2— or Description —see 4.8 — instances).

_v_itemsizes

A dictionary mapping the names of non-nested columns hanging directly from this description instance to their respective item size (in bytes).

_v_nestedlvl

The level of the description in the nested datatype.

4.8.2. Description methods

_f_walk(type='All')

Iterate over nested columns.

If type is 'All' (the default), all column description objects (Col and Description instances) are returned in top-to-bottom order (pre-order).

If type is 'Col' or 'Description', only column descriptions of that type are returned.
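
For example, the next sketch visits every non-nested column description in a table:

for coldescr in table.description._f_walk(type='Col'):
    print coldescr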

4.9. The Column class

Each instance of this class is associated with one column of a table. These instances are mainly used to fetch and set actual data from the table columns, but there are a few other associated methods to deal with indexes.

4.9.1. Column instance variables

table

The parent Table instance.

name

The name of the associated column.

pathname

The complete pathname of the associated column. This is mainly useful in nested columns; for non-nested ones this value is the same as name.

type

The data type of the column.

shape

The shape of the column.

index

The Index object (see 4.17.3) associated with this column (None if it does not exist).

dirty

Whether the index is dirty or not (property).

4.9.2. Column methods

createIndex()

Create an Index (see 4.17.3) object for this column.
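
For example, assuming a column named col1:

table.cols.col1.createIndex()    # index col1 to speed up future selections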

reIndex()

Recompute the index associated with this column. This can be useful when you suspect that, for any reason, the index information is no longer valid and want to rebuild it.

reIndexDirty()

Recompute the existing index only if it is dirty. This can be useful when you have set the reindex parameter to 0 in the IndexProps constructor (see description) for the table and want to update the column's index after an index-invalidating operation (Table.removeRows, for example).

removeIndex()

Delete the associated column's index. After doing that, you will lose the index information on disk. However, you can always re-create it using the createIndex() method (see description).

4.9.3. Column special methods

__getitem__(key)

Returns a column element or slice. It takes different actions depending on the type of the key parameter:

key is an Integer

The corresponding element in the column is returned as a scalar object or as a numarray object, depending on its shape.

key is a Slice

The row range determined by this slice is returned as a numarray object.

Example of use:

print "Column handlers:"
for name in table.colnames:
    print table.cols[name]
print
print "Some selections:"
print "Select table.cols.name[1]-->", table.cols.name[1]
print "Select table.cols.name[1:2]-->", table.cols.name[1:2]
print "Select table.cols.lati[1:3]-->", table.cols.lati[1:3]
print "Select table.cols.pressure[:]-->", table.cols.pressure[:]
print "Select table.cols['temperature'][:]-->", table.cols['temperature'][:]

and the output of this for a certain arbitrary table is:

Column handlers:
/table.cols.name (Column(1,), CharType)
/table.cols.lati (Column(2,), Int32)
/table.cols.longi (Column(1,), Int32)
/table.cols.pressure (Column(1,), Float32)
/table.cols.temperature (Column(1,), Float64)

Some selections:
Select table.cols.name[1]--> Particle:     11
Select table.cols.name[1:2]--> ['Particle:     11']
Select table.cols.lati[1:3]--> [[11 12]
 [12 13]]
Select table.cols.pressure[:]--> [  90.  110.  132.]
Select table.cols['temperature'][:]--> [ 100.  121.  144.]

See the examples/table2.py for a more complete example.

__setitem__(key, value)

It takes different actions depending on the type of the key parameter:

key is an Integer

The corresponding element in the column is set to value. value must be a scalar or numarray/NumPy object, depending on column's shape.

key is a Slice

The row slice determined by key is set to value. value must be a list of elements or a numarray/NumPy.

Example of use:

# Modify row 1
table.cols.col1[1] = -1
# Modify rows 1 and 3
table.cols.col1[1::2] = [2,3]

Which is equivalent to:

# Modify row 1
table.modifyColumns(start=1, columns=[[-1]], names=["col1"])
# Modify rows 1 and 3
columns = numarray.records.fromarrays([[2,3]], formats="i4")
table.modifyColumns(start=1, step=2, columns=columns, names=["col1"])

4.10. The Array class

Represents an array on file. It provides methods to write/read data to/from array objects in the file. This class does not allow you to enlarge the datasets on disk; see the EArray descendant in Section 4.12 if you want enlargeable dataset support and/or compression features. See also CArray in Section 4.11.

The array data types supported are the same as the set provided by the numarray package. For details of these data types see Appendix A, or the numarray reference manual ([12]).

An interesting property of the Array class is that it remembers the flavor of the object that has been saved so that if you saved, for example, a List, you will get a List during readings afterwards, or if you saved a NumPy array, you will get a NumPy object.

Note that this object inherits all the public attributes and methods that Leaf already provides.

4.10.1. Array instance variables

flavor

The object representation for this array. It can be any of "numarray", "numpy", "numeric" or "python" values.

nrows

The length of the first dimension of the array.

nrow

On iterators, this is the index of the current row.

type

The type class of the represented array.

stype

The string type of the represented array.

itemsize

The size of the base items. Especially useful for CharType objects.

4.10.2. Array methods

Note that, as this object has no internal I/O buffers, it is not necessary to use the flush() method inherited from Leaf in order to save its internal state to disk. When a writing method call returns, all the data is already on disk.

getEnum()

Get the enumerated type associated with this array.

If this array is of an enumerated type, the corresponding Enum instance (see 4.17.4) is returned. If it is not of an enumerated type, a TypeError is raised.

iterrows(start=None, stop=None, step=1)

Returns an iterator yielding numarray instances built from rows in the array. The returned rows are taken from the first dimension in the case of Array and CArray instances, and from the enlargeable dimension in the case of an EArray instance. If a range is supplied (i.e. some of the start, stop or step parameters are passed), only the appropriate rows are returned. Else, all the rows are returned. See also the __iter__() special method in Section 4.10.3 for a shorter way to call this iterator.

The meaning of the start, stop and step parameters is the same as in the range() python function, except that negative values of step are not allowed. Moreover, if only start is specified, then stop will be set to start+1. If you specify neither start nor stop, then all the rows in the object are selected.

Example of use:

result = [ row for row in arrayInstance.iterrows(step=4) ]

read(start=None, stop=None, step=1)

Read the array from disk and return it as a numarray (default) object, or as an object of the same flavor with which it was originally saved. It accepts start, stop and step parameters to select rows (along the first dimension in the case of Array and CArray instances, and along the enlargeable dimension in the case of an EArray) for reading.

The meaning of the start, stop and step parameters is the same as in the range() python function, except that negative values of step are not allowed. Moreover, if only start is specified, then stop will be set to start+1. If you specify neither start nor stop, then all the rows in the object are selected.
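
For instance, the next sketch reads one of every two rows among the first ten:

data = array.read(start=0, stop=10, step=2)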

4.10.3. Array special methods

The following methods automatically trigger actions when an Array instance is accessed in a special way (e.g., array[2:3,...,::2] is equivalent to a call to array.__getitem__((slice(2, 3, None), Ellipsis, slice(None, None, 2)))).

__iter__()

It returns the same iterator as Array.iterrows(0,0,1). However, it does not accept parameters.

Example of use:

result = [ row[2] for row in array ]

Which is equivalent to:

result = [ row[2] for row in array.iterrows(0, 0, 1) ]

__getitem__(key)

It returns a numarray (default) object (or an object of the same flavor with which it was originally saved) containing the slice of rows stated in the key parameter. The set of allowed tokens in key is the same as for extended slicing in Python (the Ellipsis token included).

Example of use:

array1 = array[4]   # array1.shape == array.shape[1:]
array2 = array[4:1000:2]  # len(array2.shape) == len(array.shape)
array3 = array[::2, 1:4, :]
array4 = array[1, ..., ::2, 1:4, 4:] # General slice selection

__setitem__(key, value)

Sets an Array element, row or extended slice. It takes different actions depending on the type of the key parameter:

key is an integer:

The corresponding row is set to value. If needed, this value is broadcast to fit the specified row.

key is a slice:

The row slice determined by it is set to value. If needed, this value is broadcast to fit in the desired range. If the slice to be updated exceeds the actual shape of the array, only the values in the existing range are updated, i.e. the index error will be silently ignored. If value is a multidimensional object, then its shape must be compatible with the slice specified in key; otherwise, a ValueError will be issued.

Example of use:

a1[0] = 333       # Assign an integer to an Integer Array row
a2[0] = "b"       # Assign a string to a string Array row
a3[1:4] = 5       # Broadcast 5 to slice 1:4
a4[1:4:2] = "xXx" # Broadcast "xXx" to slice 1:4:2
# General slice update (a5.shape = (4,3,2,8,5,10))
a5[1, ..., ::2, 1:4, 4:] = arange(1728, shape=(4,3,2,4,3,6))

4.11. The CArray class

This is a child of the Array class (see 4.10) and as such, CArray represents an array on the file. The difference is that CArray has a chunked layout and, as a consequence, it also supports compression. You can use this class to easily save or load array (or array slices) objects to or from disk, with compression support included.

4.11.1. CArray instance variables

In addition to the attributes that CArray inherits from Array, it supports some more that provide information about the filters used.

atom

An Atom (see 4.16.3) instance representing the shape, type and flavor of the atomic objects to be saved.

4.11.2. Example of use

Below is a small example of the use of the CArray class. The code is available in examples/carray1.py.

import numarray
import tables

fileName = 'carray1.h5'
shape = (200,300)
atom = tables.UInt8Atom(shape = (128,128))
filters = tables.Filters(complevel=5, complib='zlib')

h5f = tables.openFile(fileName,'w')
ca = h5f.createCArray(h5f.root, 'carray', shape, atom, filters=filters)
# Fill a hyperslab in ca. The array will be converted to UInt8 elements
ca[10:60,20:70] = numarray.ones((50,50))
h5f.close()

# Re-open and read another hyperslab
h5f = tables.openFile(fileName)
print h5f
print h5f.root.carray[8:12, 18:22]
h5f.close()

The output for the previous script is something like:

carray1.h5 (File) ''
Last modif.: 'Thu Jun 16 10:47:18 2005'
Object Tree:
/ (RootGroup) ''
/carray (CArray(200L, 300L)) ''

[[0 0 0 0]
 [0 0 0 0]
 [0 0 1 1]
 [0 0 1 1]]

4.12. The EArray class

This is a child of the Array class (see 4.10) and as such, EArray represents an array on the file. The difference is that EArray allows you to enlarge datasets along any single dimension[13] you select. Another important difference is that it also supports compression.

So, in addition to the attributes and methods that EArray inherits from Array, it supports a few more that provide a way to enlarge the arrays on disk. The new variables and methods are described below, as well as some that already exist in Array but whose meaning and/or functionality differs somewhat in the EArray context.

4.12.1. EArray instance variables

atom

An Atom (see 4.16.3) instance representing the shape, type and flavor of the atomic objects to be saved. One of the dimensions of the shape is 0, meaning that the array can be extended along it.

extdim

The enlargeable dimension, i.e. the dimension this array can be extended along.

nrows

The length of the enlargeable dimension of the array.

4.12.2. EArray methods

getEnum()

Get the enumerated type associated with this array.

If this array is of an enumerated type, the corresponding Enum instance (see 4.17.4) is returned. If it is not of an enumerated type, a TypeError is raised.

append(sequence)

Appends a sequence to the underlying dataset. Obviously, this sequence must have the same type as the EArray instance; otherwise a TypeError is issued. In the same way, the dimensions of the sequence have to conform to those of EArray, that is, all the dimensions have to be the same except, of course, that of the enlargeable dimension which can be of any length (even 0!).

Example of use (code available in examples/earray1.py):

import tables
from numarray import strings

fileh = tables.openFile("earray1.h5", mode = "w")
a = tables.StringAtom(shape=(0,), length=8)
# Use 'a' as the object type for the enlargeable array
array_c = fileh.createEArray(fileh.root, 'array_c', a, "Chars")
array_c.append(strings.array(['a'*2, 'b'*4], itemsize=8))
array_c.append(strings.array(['a'*6, 'b'*8, 'c'*10], itemsize=8))

# Read the string EArray we have created on disk
for s in array_c:
    print "array_c[%s] => '%s'" % (array_c.nrow, s)
# Close the file
fileh.close()

and the output is:

array_c[0] => 'aa'
array_c[1] => 'bbbb'
array_c[2] => 'aaaaaa'
array_c[3] => 'bbbbbbbb'
array_c[4] => 'cccccccc'

4.13. The VLArray class

Instances of this class represent array objects in the object tree with the property that their rows can have a variable number of (homogeneous) elements (called atomic objects, or just atoms). Variable length arrays (or VLAs for short), similarly to Table instances, can have only one dimension, and, as with Table, the compound elements (the atoms) of the rows of VLArrays can be fully multidimensional objects.

VLArray provides methods to read/write data from/to variable length array objects residents on disk. Also, note that this object inherits all the public attributes and methods that Leaf already has.

4.13.1. VLArray instance variables

atom

An Atom (see 4.16.3) instance representing the shape, type and flavor of the atomic objects to be saved.

nrow

On iterators, this is the index of the current row.

nrows

The total number of rows.

4.13.2. VLArray methods

getEnum()

Get the enumerated type associated with this array.

If this array is of an enumerated type, the corresponding Enum instance (see 4.17.4) is returned. If it is not of an enumerated type, a TypeError is raised.

append(sequence, *objects)

Append objects in the sequence to the array.

This method appends the objects in the sequence to a single row in this array. The type of individual objects must be compliant with the type of atoms in the array. In the case of variable length strings, the string to append is itself the sequence.

Example of use (code available in examples/vlarray1.py):

import tables
from numpy import *   # or, from numarray import *

# Create a VLArray:
fileh = tables.openFile("vlarray1.h5", mode = "w")
vlarray = fileh.createVLArray(fileh.root, 'vlarray1',
                              tables.Int32Atom(flavor="numpy"),
                              "ragged array of ints",
                              tables.Filters(complevel=1))
# Append some (variable length) rows:
vlarray.append(array([5, 6]))
vlarray.append(array([5, 6, 7]))
vlarray.append([5, 6, 9, 8])

# Now, read it through an iterator:
for x in vlarray:
    print vlarray.name+"["+str(vlarray.nrow)+"]-->", x

# Close the file
fileh.close()

The output of the previous program looks like this:

vlarray1[0]--> [5 6]
vlarray1[1]--> [5 6 7]
vlarray1[2]--> [5 6 9 8]

The objects argument is only retained for backwards compatibility; please do not use it.

iterrows(start=None, stop=None, step=1)

Returns an iterator yielding one row per iteration. If a range is supplied (i.e. some of the start, stop or step parameters are passed), only the appropriate rows are returned. Else, all the rows are returned. See also the __iter__() special method in Section 4.13.3 for a shorter way to call this iterator.

The meaning of the start, stop and step parameters is the same as in the range() python function, except that negative values of step are not allowed. Moreover, if only start is specified, then stop will be set to start+1. If you specify neither start nor stop, then all the rows in the object are selected.

Example of use:

for row in vlarray.iterrows(step=4):
    print vlarray.name+"["+str(vlarray.nrow)+"]-->", row

read(start=None, stop=None, step=1)

Returns the actual data in the VLArray. As the lengths of the different rows are variable, the returned value is a python list, with as many entries as rows selected by the range parameters.

The meaning of the start, stop and step parameters is the same as in the range() python function, except that negative values of step are not allowed. Moreover, if only start is specified, then stop will be set to start+1. If you specify neither start nor stop, then all the rows in the object are selected.

4.13.3. VLArray special methods

The following methods automatically trigger actions when a VLArray instance is accessed in a special way (e.g., vlarray[2:5] is equivalent to a call to vlarray.__getitem__(slice(2, 5, None))).

__iter__()

It returns the same iterator as VLArray.iterrows(0,0,1). However, it does not accept parameters.

Example of use:

result = [ row for row in vlarray ]

Which is equivalent to:

result = [ row for row in vlarray.iterrows() ]

__getitem__(key)

It returns the slice of rows determined by key, which can be an integer index or an extended slice. The returned value is a list of objects of type array.atom.type.

Example of use:

list1 = vlarray[4]
list2 = vlarray[4:1000:2]

__setitem__(keys, value)

Updates a vlarray row described by keys by setting it to value. Depending on the value of keys, the action taken is different:

keys is an integer:

It refers to the number of the row to be modified. The value object must be type- and shape-compatible with the object that exists in that vlarray row.

keys is a tuple:

The first element refers to the row to be modified, and the second element to the range (so, it can be an integer or a slice) of the row that will be updated. As above, the value object must be type- and shape-compatible with the object specified in that vlarray row and range.

Note: When updating VLStrings (UTF-8 encoded) or Objects atoms, there is a problem: one can only update values with exactly the same number of bytes as in the original row. With UTF-8 encoding this is problematic because, for instance, 'c' takes 1 byte, but 'ç' takes two. The same applies when using Objects atoms, because applying cPickle to a class instance (for example) is not guaranteed to return the same number of bytes for one instance as for another, even of the same class as the former. These facts effectively limit the kinds of objects that can be updated in VLArrays.

Example of use:

vlarray[0] = vlarray[0]*2+3
vlarray[99,3:] = arange(96)*2+3
# Negative values for start and stop (but not step) are supported
vlarray[99,-99:-89:2] = vlarray[5]*2+3

4.14. The UnImplemented class

Instances of this class represent an unimplemented dataset in a generic HDF5 file. When reading such a file (i.e. one that has not been created with PyTables, but with some other HDF5 library based tool), chances are that the specific combination of datatypes and/or dataspaces in some dataset might not be supported by PyTables yet. In such a case, this dataset will be mapped into the UnImplemented class and hence, the user will still be able to build the complete object tree of this generic HDF5 file, as well as to access (both for reading and writing) the attributes of this dataset and some metadata. Of course, the user won't be able to read the actual data in it.

This is an elegant way to allow users to work with generic HDF5 files even when some of their datasets are not supported by PyTables. However, if you are really interested in having access to an unimplemented dataset, please get in contact with the developer team.

This class does not have any public instance variables, except those inherited from the Leaf class (see 4.5).

4.15. The AttributeSet class

Represents the set of attributes of a node (Leaf or Group). It provides methods to create new attributes, open, rename or delete existing ones.

Like Group instances, AttributeSet instances make use of the natural naming convention, i.e. you can access the attributes on disk as if they were normal AttributeSet attributes. This offers the user a very convenient way to access (but also to set and delete) node attributes by simply using them like normal Python attributes.

Caveat emptor: All Python data types are supported. In particular, multidimensional numarray objects are saved natively as multidimensional objects in the HDF5 file. Python strings are also saved natively as HDF5 strings, and loaded back as Python strings. However, the rest of the data types, including the Python scalar ones (i.e. Int, Long and Float) and more general objects (like NumPy or Numeric), are serialized using cPickle, so you will be able to correctly retrieve them only from a Python-aware HDF5 library. So, if you want to save Python scalar values and be able to read them with generic HDF5 tools, you should make use of scalar numarray objects (for example numarray.array(1, type=numarray.Int64)). In the same way, attributes in HDF5 native files will always be mapped into numarray objects. Specifically, a multidimensional attribute will be mapped into a multidimensional numarray and a scalar will be mapped into a scalar numarray (for example, an attribute of type H5T_NATIVE_LLONG will be read and returned as a numarray.array(X, type=numarray.Int64) scalar).

One more warning: because of the various potential difficulties in restoring a Python object stored in an attribute, you may end up getting a cPickle string where a Python object is expected. If this is the case, you may wish to run cPickle.loads() on that string to get an idea of where things went wrong, as shown in this example:

>>> import tables
>>>
>>> class MyClass(object):
...   foo = 'bar'
...
>>> # An object of my custom class.
... myObject = MyClass()
>>>
>>> h5f = tables.openFile('test.h5', 'w')
>>> h5f.root._v_attrs.obj = myObject  # store the object
>>> print h5f.root._v_attrs.obj.foo  # retrieve it
bar
>>> h5f.close()
>>>
>>> # Delete class of stored object and reopen the file.
... del MyClass, myObject
>>>
>>> h5f = tables.openFile('test.h5', 'r')
>>> print h5f.root._v_attrs.obj.foo
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'str' object has no attribute 'foo'
>>> # Let us inspect the object to see what is happening.
... print repr(h5f.root._v_attrs.obj)
'ccopy_reg\n_reconstructor\np1\n(c__main__\nMyClass\np2\nc__builtin__\nobject\np3\nNtRp4\n.'
>>> # Maybe unpickling the string will yield more information:
... import cPickle
>>> cPickle.loads(h5f.root._v_attrs.obj)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'module' object has no attribute 'MyClass'
>>> # So the problem was not in the stored object,
... # but in the *environment* where it was restored.
... h5f.close()

4.15.1. AttributeSet instance variables

_v_node

The parent node instance.

_v_attrnames

List with all attribute names.

_v_attrnamessys

List with system attribute names.

_v_attrnamesuser

List with user attribute names.

4.15.2. AttributeSet methods

Note that this class defines the __setattr__, __getattr__ and __delattr__ methods, and they work as normally intended. Any scalar attribute (strings, ints or floats) is supported natively. However, (c)Pickle is automatically used to serialize other kinds of objects (like lists, tuples, dicts, small NumPy/Numeric/numarray objects, ...) that you might want to save. If an attribute is set on a target node that already has a large number of attributes, a PerformanceWarning will be issued.

With these special methods, you can access, assign or delete attributes on disk by just using the following constructs:

leaf.attrs.myattr = "str attr"  # Set a string (native support)
leaf.attrs.myattr2 = 3          # Set an integer (native support)
leaf.attrs.myattr3 = [3,(1,2)]  # A generic object (Pickled)
attrib = leaf.attrs.myattr      # Get the attribute myattr
del leaf.attrs.myattr           # Delete the attribute myattr

_f_copy(where)

Copy the user attributes (as well as certain system attributes) to where object. where has to be a Group or Leaf instance.

_f_list(attrset="user")

Return a list of attribute names of the parent node. attrset selects the attribute set to be used. A "user" value returns only the user attributes (this is the default). "sys" returns only the system attributes. Finally, "all" returns both the system and user attributes.

_f_rename(oldattrname, newattrname)

Rename an attribute.
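
A minimal sketch of these three methods; the attribute name and the destination node (otherleaf) are merely illustrative:

print leaf.attrs._f_list("all")              # every attribute of the node
leaf.attrs._f_copy(otherleaf)                # copy user attributes to another node
leaf.attrs._f_rename("myattr", "mynewattr")  # rename an existing attribute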

4.16. Declarative classes

This section describes a series of classes that are meant to declare the datatypes required by primary PyTables objects (like Table or VLArray).

4.16.1. The IsDescription class

This class is designed to be used as an easy, yet meaningful way to describe the properties of Table objects through the definition of derived classes that inherit properties from it. In order to define such a class, you must declare it as descendant of IsDescription, with as many attributes as columns you want in your table. The name of each attribute will become the name of a column, and its value will hold a description of it.

Ordinary columns can be described using instances of the Col (see Section 4.16.2) class. Nested columns can be described by using classes derived from IsDescription or instances of it. Derived classes can be declared in place (in which case the column takes the name of the class) or referenced by name, and they can have a _v_pos special attribute which sets the position of the nested column among its sibling columns.

Once you have created a description object, you can pass it to the Table constructor, where all the information it contains will be used to define the table structure. See the Section 3.4 for an example on how that works.

See below for a complete list of the special attributes that can be specified to complement the metadata of an IsDescription class.

IsDescription special attributes

_v_flavor

The flavor of the table. It can take "numarray" (default) or "numpy" values. This determines the type of objects returned during input (i.e. read) operations.

_v_indexprops

An instance of the IndexProps class (see Section 4.17.2). You can use this to alter the properties of the index creation process for a table.

_v_pos

Sets the position of a possible nested column description among its sibling columns.
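
A minimal sketch of a table description with a nested column declared in place; column names and types are merely illustrative, and h5f is assumed to be an already opened File instance:

import tables

class Particle(tables.IsDescription):
    name     = tables.StringCol(length=16, pos=1)
    pressure = tables.Float32Col(pos=2)

    class Info(tables.IsDescription):   # a nested column
        _v_pos = 3
        value  = tables.Float64Col()

table = h5f.createTable(h5f.root, 'particles', Particle, "A table")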

4.16.2. The Col class and its descendants

The Col class is used as a means to declare the different properties of a table column. In addition, a series of descendant classes is offered in order to make these column descriptions easier for the user. In general, it is recommended to use these descendant classes, as they are more meaningful when found in the middle of the code.

Col instance attributes

type

The type class of the column.

stype

The string type of the column.

recarrtype

The string type, in RecArray format, of the column.

shape

The shape of the column.

itemsize

The size of the base items. Especially useful for StringCol objects.

indexed

Whether this column is meant to be indexed or not.

_v_pos

The position of this column with regard to its column siblings.

_v_name

The name of this column.

_v_pathname

The complete pathname of the column. This is mainly useful in nested columns; for non-nested ones this value is the same as _v_name.

Col methods

None.

Col constructors

A description of the different constructors with their parameters follows:

Col(dtype="Float64", shape=1, dflt=None, pos=None, indexed=0)

Declare the properties of a Table column.

dtype

The data type for the column. All types listed in Appendix A are valid data types for columns. The type description is accepted both in string-type format and as a numarray data type.

shape

An integer or a tuple that specifies the number of dtype items for each element (or the shape, for multidimensional elements) of this column. For CharType columns, the last dimension is used as the length of the character strings. However, for this kind of object, the use of the StringCol subclass is strongly recommended.

dflt

The default value for elements of this column. If the user does not supply a value for an element while filling a table, this default value will be written to disk. If the user supplies a scalar value for a multidimensional column, this value is automatically broadcast to all the elements in the column cell. If dflt is not supplied, an appropriate zero value (or null string) will be chosen by default. Please note that all the default values are kept internally as numarray objects.

pos

By default, columns are arranged in memory following an alpha-numerical order of the column names. In some situations, however, it is convenient to impose a user-defined ordering. The pos parameter allows the user to force the desired ordering.

indexed

Whether this column should be indexed for better performance in table selections.

StringCol(length=None, dflt=None, shape=1, pos=None, indexed=0)

Declare a column to be of type CharType. The length parameter sets the length of the strings. The meaning of the other parameters is the same as in the Col class.

BoolCol(dflt=0, shape=1, pos=None, indexed=0)

Define a column to be of type Bool. The meaning of the parameters is the same as in the Col class.

IntCol(dflt=0, shape=1, itemsize=4, sign=1, pos=None, indexed=0)

Declare a column to be of type IntXX, depending on the value of the itemsize parameter, which sets the number of bytes of the integers in the column. sign determines whether the integers are signed or not. The meaning of the other parameters is the same as in the Col class.

This class has several descendants:

Int8Col(dflt=0, shape=1, pos=None, indexed=0)

Define a column of type Int8.

UInt8Col(dflt=0, shape=1, pos=None,indexed=0)

Define a column of type UInt8.

Int16Col(dflt=0, shape=1, pos=None, indexed=0)

Define a column of type Int16.

UInt16Col(dflt=0, shape=1, pos=None, indexed=0)

Define a column of type UInt16.

Int32Col(dflt=0, shape=1, pos=None, indexed=0)

Define a column of type Int32.

UInt32Col(dflt=0, shape=1, pos=None, indexed=0)

Define a column of type UInt32.

Int64Col(dflt=0, shape=1, pos=None, indexed=0)

Define a column of type Int64.

UInt64Col(dflt=0, shape=1, pos=None, indexed=0)

Define a column of type UInt64.

FloatCol(dflt=0.0, shape=1, itemsize=8, pos=None, indexed=0)

Define a column to be of type FloatXX, depending on the value of itemsize. The itemsize parameter sets the number of bytes of the floats in the column and the default is 8 bytes (double precision). The meaning of the other parameters is the same as in the Col class.

This class has two descendants:

Float32Col(dflt=0.0, shape=1, pos=None, indexed=0)

Define a column of type Float32.

Float64Col(dflt=0.0, shape=1, pos=None, indexed=0)

Define a column of type Float64.

ComplexCol(dflt=0.+0.j, shape=1, itemsize=16, pos=None)

Define a column to be of type ComplexXX, depending on the value of itemsize. The itemsize parameter sets the number of bytes of the complex values in the column and the default is 16 bytes (double precision complex). The meaning of the other parameters is the same as in the Col class.

ComplexCol columns and their descendants do not support indexing.

This class has two descendants:

Complex32Col(dflt=0.+0.j, shape=1, pos=None)

Define a column of type Complex32.

Complex64Col(dflt=0+0.j, shape=1, pos=None)

Define a column of type Complex64.

TimeCol(dflt=0, shape=1, itemsize=8, pos=None, indexed=0)

Define a column to be of type Time. Two kinds of time columns are supported depending on the value of itemsize: 4-byte signed integer and 8-byte double precision floating point columns (the default). The meaning of the other parameters is the same as in the Col class.

Time columns have a special encoding in the HDF5 file. See Appendix A for more information on those types.

This class has two descendants:

Time32Col(dflt=0, shape=1, pos=None, indexed=0)

Define a column of type Time32.

Time64Col(dflt=0.0, shape=1, pos=None, indexed=0)

Define a column of type Time64.

EnumCol(enum, dflt, dtype='UInt32', shape=1, pos=None, indexed=False)

Description of a column of an enumerated type.

Instances of this class describe a table column which stores enumerated values. Those values belong to an enumerated type, defined by the first argument (enum) in the constructor of EnumCol, which accepts the same kinds of arguments as Enum (see 4.17.4). The enumerated type is stored in the enum attribute of the column.

A default value must be specified as the second argument (dflt) in the constructor; it must be the name (a string) of one of the enumerated values in the enumerated type. Once the column is created, the corresponding concrete value is stored in its dflt attribute. If the name does not match any value in the enumerated type, a KeyError is raised.

A numarray data type might be specified in order to determine the base type used for storing the enumerated values in memory and on disk. The data type must be able to represent each and every concrete value in the enumeration. If it is not, a TypeError is raised. The default base type is unsigned 32-bit integer, which is sufficient for most cases.

The stype attribute of enumerated columns is always 'Enum', while the type attribute is the data type used for storing concrete values.

The shape, position and indexed attributes of the column are treated as with other column description objects (see 4.16.2).
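
A minimal sketch of an enumerated column inside a table description; the enumeration, class and column names are merely illustrative:

import tables

colors = tables.Enum(['red', 'green', 'blue'])

class Ball(tables.IsDescription):
    name  = tables.StringCol(length=16)
    color = tables.EnumCol(colors, 'red', dtype='UInt8')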

4.16.3. The Atom class and its descendants.

The Atom class is a descendant of the Col class (see 4.16.2) and is meant to declare the different properties of the base element (also known as atom) of CArray, EArray and VLArray objects. The Atom instances have the property that their length is always the same. However, you can grow objects along the extensible dimension in the case of EArray or put a variable number of them on a VLArray row. Moreover, the atoms are not restricted to scalar values, and they can be fully multidimensional objects.

A series of descendant classes are offered in order to make the use of these element descriptions easier. In general, it is recommended to use these descendant classes, as they are more meaningful when found in the middle of the code.

Atom instance variables

In addition to the variables that it inherits from the Col class, it has the following additional attributes:

flavor

The object representation for this atom. See the description of the Atom constructors below for the possible values it can take.

Atom methods

atomsize()

Returns the total length, in bytes, of the element base atom. If its shape has a zero element in it (for use in EArrays, for example), that element is replaced by a one in order to compute the atom size correctly.
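
A minimal sketch:

atom = tables.Int32Atom(shape=(0, 3))   # an enlargeable (EArray) atom
print atom.atomsize()                   # 12 bytes: the 0 dimension counts as 1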

Atom constructors

A description of the different constructors with their parameters follows:

Atom(dtype="Float64", shape=1, flavor="numarray")

Define properties for the base elements of CArray, EArray and VLArray objects.

dtype

The data type for the base element. See the Appendix A for a relation of data types supported. The type description is accepted both in string-type format and as a numarray data type.

shape

In an EArray context, it is a tuple specifying the shape of the object, and one (and only one) of its dimensions must be 0, meaning that the EArray object will be enlarged along this axis. In the case of a VLArray, it can be an integer with a value of 1 (one) or a tuple, which specifies whether the atom is a scalar (in the case of a 1) or has multiple dimensions (in the case of a tuple). For CharType elements, the last dimension is used as the length of the character strings. However, for this kind of object, the use of the StringAtom subclass is strongly recommended.

flavor

The object representation for this atom. It can be any of "numarray", "numpy" or "python" for the character types and "numarray", "numpy", "numeric" or "python" for the numerical types. If specified, the read atoms will be converted to that specific flavor. If not specified, the atoms will remain in their native format (i.e. numarray).

StringAtom(shape=1, length=None, flavor="numarray")

Define an atom to be of CharType type. The meaning of the shape parameter is the same as in the Atom class. length sets the length of the string atoms. flavor can be either "numarray", "numpy" or "python". Unicode strings are not supported by this type; see the VLStringAtom class if you want Unicode support (only available for VLArray objects).

BoolAtom(shape=1, flavor="numarray")

Define an atom to be of type Bool. The meaning of the parameters is the same as in the Atom class.

IntAtom(shape=1, itemsize=4, sign=1, flavor="numarray")

Define an atom to be of type IntXX, depending on the value of the itemsize parameter, which sets the number of bytes of the integers that make up the atom. sign determines whether the integers are signed or not. The meaning of the other parameters is the same as in the Atom class.

This class has several descendants:

Int8Atom(shape=1, flavor="numarray")

Define an atom of type Int8.

UInt8Atom(shape=1, flavor="numarray")

Define an atom of type UInt8.

Int16Atom(shape=1, flavor="numarray")

Define an atom of type Int16.

UInt16Atom(shape=1, flavor="numarray")

Define an atom of type UInt16.

Int32Atom(shape=1, flavor="numarray")

Define an atom of type Int32.

UInt32Atom(shape=1, flavor="numarray")

Define an atom of type UInt32.

Int64Atom(shape=1, flavor="numarray")

Define an atom of type Int64.

UInt64Atom(shape=1, flavor="numarray")

Define an atom of type UInt64.

FloatAtom(shape=1, itemsize=8, flavor="numarray")

Define an atom to be of FloatXX type, depending on the value of itemsize. The itemsize parameter sets the number of bytes of the floats in the atom and the default is 8 bytes (double precision). The meaning of the other parameters is the same as in the Atom class.

This class has two descendants:

Float32Atom(shape=1, flavor="numarray")

Define an atom of type Float32.

Float64Atom(shape=1, flavor="numarray")

Define an atom of type Float64.

ComplexAtom(shape=1, itemsize=16, flavor="numarray")

Define an atom to be of ComplexXX type, depending on the value of itemsize. The itemsize parameter sets the number of bytes of the complex values in the atom and the default is 16 bytes (double precision complex). The meaning of the other parameters is the same as in the Atom class.

This class has two descendants:

Complex32Atom(shape=1, flavor="numarray")

Define an atom of type Complex32.

Complex64Atom(shape=1, flavor="numarray")

Define an atom of type Complex64.

TimeAtom(shape=1, itemsize=8, flavor="numarray")

Define an atom to be of type Time. Two kinds of time atoms are supported depending on the value of itemsize: 4-byte signed integer and 8-byte double precision floating point atoms (the default). The meaning of the other parameters is the same as in the Atom class.

Time atoms have a special encoding in the HDF5 file. See Appendix A for more information on those types.

This class has two descendants:

Time32Atom(shape=1, flavor="numarray")

Define an atom of type Time32.

Time64Atom(shape=1, flavor="numarray")

Define an atom of type Time64.

EnumAtom(enum, dtype='UInt32', shape=1, flavor='numarray')

Description of an atom of an enumerated type.

Instances of this class describe the atom type used by an array to store enumerated values. Those values belong to an enumerated type.

The meaning of the enum and dtype arguments is the same as in EnumCol (see 4.16.2). The shape and flavor arguments have the same meaning as in other Atom classes (the flavor applies to the representation of concrete read values).

Enumerated atoms also have stype and type attributes with the same values as in EnumCol.
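
As an illustration, here is a minimal sketch (not part of the distributed examples; the file and node names are made up, and it assumes an unsigned 8-bit base type is accepted for dtype, see 4.16.2) that stores enumerated values in an enlargeable array, following the same createEArray() pattern used later in this chapter:

import numarray as na
from tables import *

# Hypothetical example: color codes stored as unsigned 8-bit integers.
colors = Enum(['red', 'green', 'blue'])
atom = EnumAtom(colors, dtype='UInt8', shape=(0,))
fileh = openFile("enums.h5", mode="w")
arr = fileh.createEArray(fileh.root, 'colors', atom, "Color codes")
# Append the concrete values of some enumerated names.
arr.append(na.array([colors.red, colors.green, colors.red], type=na.UInt8))
fileh.close()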

Now there come two special classes, ObjectAtom and VLStringAtom, that actually do not descend from Atom, but whose goal is so similar that they should be described here. The difference between them and Atom and its descendants is that these special classes do not allow multidimensional atoms, nor multiple values per row. A flavor can not be specified either, as it is immutable (see below).

Caveat emptor: You are only allowed to use these classes to create VLArray objects, not CArray and EArray objects.

ObjectAtom()

This class is meant to fit any kind of Python object in a row of a VLArray instance by using cPickle behind the scenes. Since you can not foresee how long the output of the cPickle serialization will be (i.e. the atom already has a variable length), you can only fit one object per row. However, you can still pass several parameters to the VLArray.append() method, as they will be regarded as a tuple of compound objects (the parameters), so that there is still only one object to be saved in a single row. It does not accept parameters and its flavor is automatically set to "Object", so reading a row always returns an arbitrary Python object. You can regard ObjectAtom types as an easy way to save an arbitrary number of generic Python objects in a VLArray object.

VLStringAtom()

This class describes a row of the VLArray class, rather than an atom. It differs from the StringAtom class in that you can only add one instance of it to one specific row, i.e. the VLArray.append() method only accepts one object when the base atom is of this type. Besides, it supports Unicode strings (contrary to StringAtom), because it uses the UTF-8 encoding when serializing to disk (this is why its atomsize() method always returns 1). It does not accept any parameter and, because its flavor is automatically set to "VLString", reading a row always returns a Python string. See Section D.3.5 if you are curious about how this is implemented at the low level. You can regard VLStringAtom types as an easy way to save generic variable length strings.

See examples/vlarray1.py and examples/vlarray2.py for further examples on VLArrays, including object serialization and Unicode string management.
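
For quick orientation, here is a minimal sketch (file and node names are made up; see the example scripts above for complete programs) that stores pickled objects and variable length strings in two VLArray objects:

from tables import *

fileh = openFile("vlatoms.h5", mode="w")

# One arbitrary (picklable) Python object per row.
objs = fileh.createVLArray(fileh.root, 'objects', ObjectAtom(), "Pickled rows")
objs.append([3, 2, 1])             # a list in the first row
objs.append({'a': 1, 'b': 2})      # a dictionary in the second row

# One variable length (possibly Unicode) string per row.
strs = fileh.createVLArray(fileh.root, 'strings', VLStringAtom(), "Text rows")
strs.append("a plain string")
strs.append(u"a Unicode string")

print objs.read()   # the unpickled objects, one per row
print strs.read()   # the strings, one per row
fileh.close()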

4.17. Helper classes

This section lists classes that do not fit in any other section and that mainly serve ancillary purposes.

4.17.1. The Filters class

This class is meant to serve as a container that keeps information about the filter properties associated with the enlargeable leaves (that is, Table, EArray and VLArray), as well as with CArray.

The public variables of Filters are listed below:

complevel

The compression level (0 means no compression).

complib

The compression filter used (in case of a compressed dataset).

shuffle

Whether the shuffle filter is active or not.

fletcher32

Whether the fletcher32 filter is active or not.

Filters has no public methods except the constructor itself, which is described next.

Filters(complevel=0, complib="zlib", shuffle=1, fletcher32=0)

The parameters that can be passed to the Filters class constructor are:

complevel

Specifies a compression level for data. The allowed range is 0-9. A value of 0 (the default) disables compression, so no CPU time is spent on compression unless you explicitly ask for it.

complib

Specifies the compression library to be used. Right now, "zlib" (default), "lzo", "ucl" and "bzip2" values are supported. See Section 5.3 for some advice on which library is better suited to your needs.

shuffle

Whether or not to use the shuffle filter present in the HDF5 library. This is normally used to improve the compression ratio (at the cost of consuming a little more CPU time). A value of 0 disables shuffling and 1 enables it. The default depends on whether compression is enabled: if it is, shuffling defaults to active; otherwise it is disabled.

fletcher32

Whether or not to use the fletcher32 filter in the HDF5 library. This is used to add a checksum on each data chunk. A value of 0 (the default) disables the checksum.

Of course, you can also create an instance and then assign the attributes you want to change. For example:

import numarray as na
from tables import *

fileh = openFile("test5.h5", mode = "w")
atom = Float32Atom(shape=(0,2))
filters = Filters(complevel=1, complib = "lzo")
filters.fletcher32 = 1
arr = fileh.createEArray(fileh.root, 'earray', atom, "A growable array",
                         filters = filters)
# Append several rows in only one call
arr.append(na.array([[1., 2.],
                     [2., 3.],
                     [3., 4.]], type=na.Float32))

# Print information on that enlargeable array
print "Result Array:"
print repr(arr)

fileh.close()

This enforces the use of the LZO library, a compression level of 1 and the fletcher32 checksum filter. See the output of this example:

Result Array:
/earray (EArray(3L, 2), fletcher32, shuffle, lzo(1)) 'A growable array'
  type = Float32
  shape = (3L, 2)
  itemsize = 4
  nrows = 3
  extdim = 0
  flavor = 'numarray'
  byteorder = 'little'

4.17.2. The IndexProps class

You can use this class to set the properties of the indexing process for Table columns. To use it, create an instance and assign it to the special attribute _v_indexprops in a table description class (see 4.16.1) or dictionary.

The public variables of IndexProps are listed below:

auto

Whether an existing index should be updated or not after a table append operation.

reindex

Whether the table columns are to be re-indexed after an invalidating index operation.

filters

The filter settings for the different Table indexes.

IndexProps has no public methods except the constructor itself, which is described next.

IndexProps(auto=1, reindex=1, filters=None)

The parameters that can be passed to the IndexProps class constructor are:

auto

Specifies whether an existing index should be updated or not after a table append operation. The default is to enable automatic index updates.

reindex

Specifies whether the table columns are to be re-indexed after an invalidating index operation (for example, after a Table.removeRows() call). The default is to reindex after operations that invalidate indexes.

filters

Sets the filter properties for column indexes. It has to be an instance of the Filters class (see 4.17.1). A None value means that the default Filters settings are used.
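
For orientation, here is a minimal sketch (class, column and file names are hypothetical) of attaching an IndexProps instance to a table description; it assumes the indexed argument of the Col constructors described in 4.16.2:

from tables import *

class Particle(IsDescription):
    name = StringCol(length=16)
    pressure = FloatCol(indexed=True)   # this column will carry an index
    # Hypothetical settings: no automatic index updates on append,
    # reindex after invalidating operations, lightly compressed indexes.
    _v_indexprops = IndexProps(auto=0, reindex=1,
                               filters=Filters(complevel=1, complib="zlib"))

fileh = openFile("indexed.h5", mode="w")
table = fileh.createTable(fileh.root, 'particles', Particle, "Indexed table")
fileh.close()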

4.17.3. The Index class

This class is used to keep the indexing information for table columns. It is actually a descendant of the Group class, with some added functionality.

It has no methods intended for the programmer's use, but it has some attributes that may be of interest.

Index instance variables

column

The column object this index belongs to.

type

The type class for the index.

itemsize

The size of the atomic items. Especially useful for columns of type CharType.

nelements

The total number of elements in the index.

dirty

Whether the index is dirty or not.

filters

The Filters (see Section 4.17.1) instance for this index.
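
For orientation, this small sketch (assuming a table with an indexed pressure column, like the hypothetical one in 4.17.2) shows how these attributes can be inspected through the index attribute of a Column object:

from tables import *

fileh = openFile("indexed.h5", mode="r")
table = fileh.root.particles
index = table.cols.pressure.index    # the Index instance (None if not indexed)
if index is not None:
    print "elements:", index.nelements
    print "dirty:   ", index.dirty
    print "filters: ", index.filters
fileh.close()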

4.17.4. The Enum class

Each instance of this class represents an enumerated type. The values of the type must be declared exhaustively and named with strings, and they might be given explicit concrete values, though this is not compulsory. Once the type is defined, it can not be modified.

There are three ways of defining an enumerated type. Each one of them corresponds to the type of the only argument in the constructor of Enum:

  • Sequence of names: each enumerated value is named using a string, and its order is determined by its position in the sequence; the concrete value is assigned automatically:

    >>> boolEnum = Enum(['True', 'False'])
  • Mapping of names: each enumerated value is named by a string and given an explicit concrete value. All of the concrete values must be different, or a ValueError will be raised.

    >>> priority = Enum({'red': 20, 'orange': 10, 'green': 0})
    >>> colors = Enum({'red': 1, 'blue': 1})
    Traceback (most recent call last):
      ...
    ValueError: enumerated values contain duplicate concrete values: 1
  • Enumerated type: in that case, a copy of the original enumerated type is created. Both enumerated types are considered equal.

    >>> prio2 = Enum(priority)
    >>> priority == prio2
    True

Please note that names starting with _ are not allowed, since they are reserved for internal usage:

>>> prio2 = Enum(['_xx'])
Traceback (most recent call last):
  ...
ValueError: name of enumerated value can not start with ``_``: '_xx'

The concrete value of an enumerated value is obtained by getting its name as an attribute of the Enum instance (see __getattr__()) or as an item (see __getitem__()). This allows comparisons between enumerated values and assigning them to ordinary Python variables:

>>> redv = priority.red
>>> redv == priority['red']
True
>>> redv > priority.green
True
>>> priority.red == priority.orange
False

The name of the enumerated value corresponding to a concrete value can also be obtained by using the __call__() method of the enumerated type. In this way you get the symbolic name, which you can later use with __getitem__():

>>> priority(redv)
'red'
>>> priority.red == priority[priority(priority.red)]
True

(In case you wonder, the __getitem__() method is not used for this purpose in order to avoid ambiguity when strings are used as concrete values.)

Special methods

__getitem__(name)

Get the concrete value of the enumerated value with that name.

The name of the enumerated value must be a string. If there is no value with that name in the enumeration, a KeyError is raised.

__getattr__(name)

Get the concrete value of the enumerated value with that name.

The name of the enumerated value must be a string. If there is no value with that name in the enumeration, an AttributeError is raised.

__contains__(name)

Is there an enumerated value with that name in the type?

If the enumerated type has an enumerated value with that name, True is returned. Otherwise, False is returned. The name must be a string.

This method does not check for concrete values matching a value in an enumerated type. For that, please use the __call__() method.

__call__(value, *default)

Get the name of the enumerated value with that concrete value.

If there is no value with that concrete value in the enumeration and a second argument is given as a default, that default is returned. Otherwise, a ValueError is raised.

This method can be used for checking that a concrete value belongs to the set of concrete values in an enumerated type.

__len__()

Return the number of enumerated values in the enumerated type.

__iter__()

Iterate over the enumerated values.

Enumerated values are returned as (name, value) pairs in no particular order.

__eq__(other)

Is the other enumerated type equivalent to this one?

Two enumerated types are equivalent if they have exactly the same enumerated values (i.e. with the same names and concrete values).

__repr__()

Return the canonical string representation of the enumeration. The output of this method can be evaluated to give a new enumeration object that will compare equal to this one.



[13] In the future, multiple enlargeable dimensions might be implemented as well.