This tutorial provides a relatively simple walking tour through the basic concepts of SQLAlchemy. You may wish to skip it and dive into the main manual which is more reference-oriented. The examples in this tutorial comprise a fully working interactive Python session, and are guaranteed to be functioning courtesy of doctest.
Installing SQLAlchemy from scratch is most easily achieved with setuptools; just run this from the command line:
# easy_install SQLAlchemy
This command will download the latest version of SQLAlchemy from the Python Cheese Shop and install it to your system.
Otherwise, you can install from the distribution using the setup.py script:
# python setup.py install
SQLAlchemy is designed to operate with a DBAPI implementation built for a particular database, and includes support for the most popular databases. If you have one of the supported DBAPI implementations, you can proceed to the following section. Otherwise SQLite is an easy-to-use database to get started with, which works with plain files or in-memory databases.
SQLite is included with Python 2.5 and greater.
If you are working with Python 2.3 or 2.4, pysqlite and the SQLite library can be installed separately; downloads are available from the pysqlite and SQLite project sites. Note that the SQLite library download is not required on Windows, as the Windows pysqlite binary already includes it linked in. Pysqlite and SQLite can also be installed on Linux or FreeBSD via pre-made packages or from source.
To start connecting to databases and begin issuing queries, we want to import the base of SQLAlchemy's functionality, which is provided under the module name sqlalchemy. For the purposes of this tutorial, we will import its full list of symbols into our own local namespace.
>>> from sqlalchemy import *
Note that importing using the * operator pulls all the names from sqlalchemy into the local module namespace, which in a real application can produce name conflicts. Therefore it's recommended in practice to either import the individual symbols desired (i.e. from sqlalchemy import Table, Column) or to import under a distinct namespace (i.e. import sqlalchemy as sa).
After our imports, the next thing we need is a handle to the desired database, represented by an Engine object. This object handles the business of managing connections and dealing with the specifics of a particular database. Below, we will make a SQLite connection to a file-based database called "tutorial.db".
>>> db = create_engine('sqlite:///tutorial.db')
Technically, the above statement did not make an actual connection to the SQLite database just yet. As soon as we begin working with the engine, it will start creating connections. In the case of SQLite, the tutorial.db file will actually be created at the moment it is first used, if the file does not exist already.
For full information on creating database engines, including those for SQLite and others, see Database Engines.
Now that the basics of installing SQLAlchemy and connecting to our database are established, we can start getting into actually doing something. But first, a little bit of explanation is required.
A central concept of SQLAlchemy is that it actually contains two distinct areas of functionality, one of which builds upon the other. One is a SQL Construction Language and the other is an Object Relational Mapper ("ORM" for short). The SQL construction language allows you to construct objects called ClauseElements which represent SQL expressions. These ClauseElements can then be executed against any database, where they are compiled into strings that are appropriate for the target database, and return an object called a ResultProxy, which is essentially a result set object that acts very much like a deluxe version of the DBAPI cursor object.
The Object Relational Mapper (ORM) is a set of tools, completely distinct from the SQL Construction Language, which serves the purpose of mapping Python object instances to database rows. It provides a rich selection interface with which to retrieve instances from tables, as well as a comprehensive solution for persisting changes on those instances back to the database. When working with the ORM, its underlying workings as well as its public API make extensive use of the SQL Construction Language; however, the general theory of operation is slightly different. Instead of working with database rows directly, you work with your own user-defined classes and object instances. Additionally, the method of issuing queries to the database is different, as the ORM handles the job of generating most of the SQL required, instead requiring more information about what kind of class instances you'd like to load and where you'd like to put them.
Where SA is somewhat unique, more powerful, and slightly more complicated is that the two areas of functionality can be mixed together in many ways. A key strategy for working with SA effectively is to have a solid awareness of these two distinct toolsets, and which concepts of SA belong to each - even some publications have confused the SQL Construction Language with the ORM. The key difference between the two is that when you're working with cursor-like result sets, it's the SQL Construction Language, and when you're working with collections of your own class instances, it's the Object Relational Mapper.
This tutorial will first focus on the basic configuration that is common to using both the SQL Construction Language and the ORM, which is to declare information about your database called table metadata. This will be followed by some constructed SQL examples, and then by usage of the ORM utilizing the same data we established in the SQL construction examples.
Configuring SQLAlchemy for your database consists of creating objects called Tables, each of which represents an actual table in the database. A collection of Table objects resides in a MetaData object, which is essentially a table collection. We will create a handy form of MetaData that automatically connects to our Engine (connecting a schema object to an Engine is called binding):
>>> metadata = BoundMetaData(db)
An equivalent operation is to create the BoundMetaData object directly with an Engine URL, which invokes the create_engine call for us:
>>> metadata = BoundMetaData('sqlite:///tutorial.db')
Now, when we tell "metadata" about the tables in our database, we can issue CREATE statements for those tables, as well as execute SQL statements derived from them, without needing to open or close any connections; that will all be done automatically.
Note that SQLAlchemy fully supports the usage of explicit Connection objects for all SQL operations, which may be used in conjunction with plain MetaData objects that are entirely unbound to any Engine, providing a more decoupled pattern that allows finer-grained control of connections than the "bound" approach this tutorial presents. For the purposes of this tutorial, we will stick with "bound" objects, as they allow us to focus more on SA's general concepts, leaving explicit connection management as a more advanced topic.
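As a quick preview of that more advanced pattern, here is a minimal sketch of the unbound approach. It is not part of this tutorial's doctest session; the connectable keyword argument shown is the same one used in the Connectable examples later in this document:

# a sketch of the "unbound" pattern: a plain MetaData plus an explicit Connection
meta = MetaData()
users = Table('users', meta, Column('user_id', Integer, primary_key=True))
connection = db.connect()
users.create(connectable=connection)   # issue CREATE TABLE over this specific connection
connection.close()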
With metadata as our established home for tables, let's make a Table for it:
>>> users_table = Table('users', metadata,
...     Column('user_id', Integer, primary_key=True),
...     Column('user_name', String(40)),
...     Column('password', String(10))
... )
As you might have guessed, we have just defined a table named users which has three columns: user_id (which is a primary key column), user_name, and password. Currently it is just an object that doesn't necessarily correspond to an existing table in our database. To actually create the table, we use the create() method. To make it interesting, we will have SQLAlchemy echo the SQL statements it sends to the database, by setting the echo flag on the Engine associated with our BoundMetaData:
>>> metadata.engine.echo = True
>>> users_table.create()
CREATE TABLE users (
    user_id INTEGER NOT NULL,
    user_name VARCHAR(40),
    password VARCHAR(10),
    PRIMARY KEY (user_id)
)
...
Alternatively, the users table might already exist (such as if you're running examples from this tutorial for the second time), in which case you can just skip the create() method call. You can even skip defining the individual columns in the users table and ask SQLAlchemy to load its definition from the database:
>>> users_table = Table('users', metadata, autoload=True)
>>> list(users_table.columns)[0].name
'user_id'
Loading a table's columns from the database is called reflection. Documentation on table metadata, including reflection, is available in Database Meta Data.
Inserting is achieved via the insert() method, which defines a clause object (known as a ClauseElement) representing an INSERT statement:
>>> i = users_table.insert()
>>> i
<sqlalchemy.sql._Insert object at 0x...>
>>> # the string form of the Insert object is a generic SQL representation
>>> print i
INSERT INTO users (user_id, user_name, password) VALUES (?, ?, ?)
Since we created this insert statement object from the users table, which is bound to our Engine, the statement itself is also bound to the Engine and supports executing itself. The execute() method of the clause object will compile the object into a string according to the underlying dialect of the Engine to which the statement is bound, and will then execute the resulting statement.
>>> # insert a single row
>>> i.execute(user_name='Mary', password='secure')
INSERT INTO users (user_name, password) VALUES (?, ?)
['Mary', 'secure']
COMMIT
<sqlalchemy.engine.base.ResultProxy object at 0x...>
>>> # insert multiple rows simultaneously
>>> i.execute({'user_name':'Tom'}, {'user_name':'Fred'}, {'user_name':'Harry'})
INSERT INTO users (user_name) VALUES (?)
[['Tom'], ['Fred'], ['Harry']]
COMMIT
<sqlalchemy.engine.base.ResultProxy object at 0x...>
Note that the VALUES clause of each INSERT statement was automatically adjusted to correspond to the parameters sent to the execute() method. This is because the compilation step of a ClauseElement takes into account not just the constructed SQL object and the specifics of the type of database being used, but also the execution parameters that are sent along.
When constructing clause objects, SQLAlchemy will bind all literal values into bind parameters. On the construction side, bind parameters are always treated as named parameters. At compilation time, SQLAlchemy will convert them into their proper format, based on the paramstyle of the underlying DBAPI. This works equally well for all named and positional bind parameter formats described in the DBAPI specification.
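To illustrate, here is a small sketch using the bindparam() construct; the parameter name 'name' is just illustrative, and this is not part of the doctest session (the echoed SQL from the execution is omitted):

>>> s = users_table.select(users_table.c.user_name == bindparam('name'))
>>> print s
SELECT users.user_id, users.user_name, users.password
FROM users
WHERE users.user_name = ?
>>> result = s.execute(name='Mary')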
Documentation on inserting: Inserts.
Let's check that the data we have put into the users table is actually there. The procedure is analogous to the insert example above, except you now call the select() method off the users table:
>>> s = users_table.select()
>>> print s
SELECT users.user_id, users.user_name, users.password
FROM users
>>> r = s.execute()
SELECT users.user_id, users.user_name, users.password
FROM users
[]
This time, we won't ignore the return value of execute(). It's an instance of ResultProxy, which is a result-holding object that behaves very similarly to the cursor object one deals with directly with a database API:
>>> r
<sqlalchemy.engine.base.ResultProxy object at 0x...>
>>> r.fetchone()
(1, u'Mary', u'secure')
>>> r.fetchall()
[(2, u'Tom', None), (3, u'Fred', None), (4, u'Harry', None)]
Query criteria for the select are specified using Python expressions, using the Column objects in the Table as a base. All expressions constructed from Column objects are themselves instances of ClauseElement, just like the Select, Insert, and Table objects themselves.
>>> r = users_table.select(users_table.c.user_name=='Harry').execute()
SELECT users.user_id, users.user_name, users.password
FROM users
WHERE users.user_name = ?
['Harry']
>>> row = r.fetchone()
>>> print row
(4, u'Harry', None)
Pretty much the full range of standard SQL operations is supported as constructed Python expressions, including joins, ordering, grouping, functions, correlated subqueries, unions, etc. Documentation on selecting: Simple Select.
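For instance, a short sketch combining a SQL function with grouping and ordering; the column choices are just illustrative, and the echoed SQL is omitted:

>>> s = select([users_table.c.user_name, func.count(users_table.c.user_id)],
...            group_by=[users_table.c.user_name],
...            order_by=[users_table.c.user_name])
>>> r = s.execute()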
You can see that when we print out the rows returned by an execution result, it prints the rows as tuples. These rows in fact support both the list and dictionary interfaces. The dictionary interface allows the addressing of columns by string column name, or even by the original Column object:
>>> row.keys()
['user_id', 'user_name', 'password']
>>> row['user_id'], row[1], row[users_table.c.password]
(4, u'Harry', None)
Addressing the columns in a row based on the original Column object is especially handy, as it eliminates the need to work with literal column names altogether.
Result sets also support iteration. We'll show this with a slightly different form of select that allows you to specify the specific columns to be selected:
>>> for row in select([users_table.c.user_id, users_table.c.user_name]).execute():
...     print row
SELECT users.user_id, users.user_name
FROM users
[]
(1, u'Mary')
(2, u'Tom')
(3, u'Fred')
(4, u'Harry')
Let's create a second table, email_addresses, which references the users table. To define the relationship between the two tables, we will use the ForeignKey construct. We will also issue the CREATE statement for the table:
>>> email_addresses_table = Table('email_addresses', metadata,
...     Column('address_id', Integer, primary_key=True),
...     Column('email_address', String(100), nullable=False),
...     Column('user_id', Integer, ForeignKey('users.user_id')))
>>> email_addresses_table.create()
CREATE TABLE email_addresses (
    address_id INTEGER NOT NULL,
    email_address VARCHAR(100) NOT NULL,
    user_id INTEGER,
    PRIMARY KEY (address_id),
    FOREIGN KEY(user_id) REFERENCES users (user_id)
)
...
Above, the email_addresses table is related to the users table via ForeignKey('users.user_id'). The ForeignKey constructor can take a Column object or a string representing the table and column name. When using the string argument, the referenced table must exist within the same MetaData object; that's where it looks for the other table!
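For example, these two Column declarations sketch equivalent foreign key definitions:

# referencing the target column by string, looked up in the same MetaData
Column('user_id', Integer, ForeignKey('users.user_id'))

# referencing the target Column object directly
Column('user_id', Integer, ForeignKey(users_table.c.user_id))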
Next, let's put a few rows in:
>>> email_addresses_table.insert().execute(
...     {'email_address':'tom@tom.com', 'user_id':2},
...     {'email_address':'mary@mary.com', 'user_id':1})
INSERT INTO email_addresses (email_address, user_id) VALUES (?, ?)
[['tom@tom.com', 2], ['mary@mary.com', 1]]
COMMIT
<sqlalchemy.engine.base.ResultProxy object at 0x...>
With two related tables, we can now construct a join between them using the join method:
>>> r = users_table.join(email_addresses_table).select().execute()
SELECT users.user_id, users.user_name, users.password, email_addresses.address_id, email_addresses.email_address, email_addresses.user_id
FROM users JOIN email_addresses ON users.user_id = email_addresses.user_id
[]
>>> print [row for row in r]
[(1, u'Mary', u'secure', 2, u'mary@mary.com', 1), (2, u'Tom', None, 1, u'tom@tom.com', 2)]
The join method is also available as a standalone function in the sqlalchemy namespace. The join condition is figured out from the foreign keys of the Table objects given. The condition (also called the "ON clause") can be specified explicitly, such as in this example, where we locate all users who used their email address as their password:
>>> print join(users_table, email_addresses_table,
...     and_(users_table.c.user_id==email_addresses_table.c.user_id,
...          users_table.c.password==email_addresses_table.c.email_address)
...     )
users JOIN email_addresses ON users.user_id = email_addresses.user_id AND users.password = email_addresses.email_address
Now that we have a little bit of Table and SQL operations covered, let's look into SQLAlchemy's ORM (Object Relational Mapper). With the ORM, you associate Tables (and other Selectable units, like queries and table aliases) with Python classes, into units called Mappers. Then you can execute queries that return lists of object instances, instead of result sets. The object instances themselves are associated with an object called a Session, which automatically tracks changes on each object and supports a "save all at once" operation called a flush.
To start, we will import the names necessary to use SQLAlchemy's ORM, again using import * for simplicity's sake, even though we all know that in real life we should be importing individual names via "from sqlalchemy.orm import symbol1, symbol2, ..." or "import sqlalchemy.orm as orm":
>>> from sqlalchemy.orm import *
It should be noted that the above step is technically not needed when working with the 0.3 series of SQLAlchemy; all symbols from the orm package are also included in the sqlalchemy package. However, a future release (most likely the 0.4 series) will make the separate orm import required in order to use the object relational mapper, so it's a good practice for now.
A Mapper is usually created once per Python class, and at its core primarily says, "objects of this class are to be stored as rows in this table". Let's create a class called User, which will represent a user object stored in our users table:
>>> class User(object):
...     def __repr__(self):
...         return "%s(%r,%r)" % (
...             self.__class__.__name__, self.user_name, self.password)
The class is a new-style class (i.e. it extends object) and does not require a constructor (although one may be provided if desired). We just have one __repr__ method on it, which will display basic information about the User. Note that the __repr__ method references the instance variables user_name and password, which otherwise aren't defined. While we are free to explicitly define these attributes and treat them normally, this is optional; SQLAlchemy's Mapper construct will manage them for us, since their names correspond to the names of columns in the users table. Let's create a mapper, and observe that these attributes are now defined:
>>> mapper(User, users_table)
<sqlalchemy.orm.mapper.Mapper object at 0x...>
>>> u1 = User()
>>> print u1.user_name
None
>>> print u1.password
None
The mapper function returns a new instance of Mapper. As it is the first Mapper we have created for the User class, it is known as the class's primary mapper. We generally don't need to hold onto the return value of the mapper function; SA can automatically locate this Mapper as needed when it deals with the User class.
After you create a Mapper, all operations with that Mapper require the usage of an important object called a Session. All objects loaded or saved by the Mapper must be attached to a Session object, which represents a kind of "workspace" of objects that are loaded into memory. A particular object instance can only be attached to one Session at a time (but of course it can be moved around or detached altogether, as sketched below).
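Here is a rough sketch of moving an instance between sessions, assuming the Session's expunge() and update() methods and the create_session() function introduced just below; someuser stands for a hypothetical already-loaded instance:

# detach an instance from its current session...
session.expunge(someuser)          # someuser is now a "detached" instance

# ...then attach it to a different session
other_session = create_session()
other_session.update(someuser)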
By default, you have to create a Session object explicitly before you can load or save objects. There are several ways to manage sessions, but the most straightforward is to just create one, which we will do by calling create_session():
>>> session = create_session()
>>> session
<sqlalchemy.orm.session.Session object at 0x...>
The Session has all kinds of methods on it to manage and inspect its collection of objects. The Session also provides an easy interface which can be used to query the database, by giving you an instance of a Query object corresponding to a particular Python class:
>>> query = session.query(User)
>>> print query.select_by(user_name='Harry')
SELECT users.user_name AS users_user_name, users.password AS users_password, users.user_id AS users_user_id
FROM users
WHERE users.user_name = ? ORDER BY users.oid
['Harry']
[User(u'Harry',None)]
All querying for objects is performed via an instance of Query. The various select methods on an instance of Mapper also use an underlying Query object to perform the operation. A Query is always bound to a specific Session.
Let's turn off the database echoing for a moment, and try out a few methods on Query. Methods that end with the suffix _by primarily take keyword arguments which correspond to properties on the object. Other methods take ClauseElement objects, which are constructed by using Column objects inside of Python expressions, in the same way as we did with our SQL select example in the previous section of this tutorial. Using ClauseElement structures to query objects is more verbose but more flexible:
>>> metadata.engine.echo = False
>>> print query.select(User.c.user_id==3)
[User(u'Fred',None)]
>>> print query.get(2)
User(u'Tom',None)
>>> print query.get_by(user_name='Mary')
User(u'Mary',u'secure')
>>> print query.selectfirst(User.c.password==None)
User(u'Tom',None)
>>> print query.count()
4
Notice that our User class has a special attribute c attached to it. This 'c' represents the columns of the User mapper's Table object. Saying User.c.user_name is synonymous with saying users_table.c.user_name, recalling that User is the Python class and users_table is our Table object.
With a little experience in loading objects, let's see what it's like to make changes. First, let's create a new user, "Ed". We do this by just constructing the new object. Then, we just add it to the session:
>>> ed = User()
>>> ed.user_name = 'Ed'
>>> ed.password = 'edspassword'
>>> session.save(ed)
>>> ed in session
True
Let's also make a few changes to some of the objects in the database. We will load them with our Query object, and then change some things.
>>> mary = query.get_by(user_name='Mary')
>>> harry = query.get_by(user_name='Harry')
>>> mary.password = 'marysnewpassword'
>>> harry.password = 'harrysnewpassword'
At the moment, nothing has been saved to the database; all of our changes are in memory only. What happens if some other part of the application also tries to load 'Mary' from the database and make some changes before we have had a chance to save them? Assuming that the same Session is used, loading 'Mary' from the database a second time will issue a second query in order to locate the primary key of 'Mary', but it will return the same object instance as the one already loaded. This behavior is due to an important property of the Session known as the identity map:
>>> mary2 = query.get_by(user_name='Mary')
>>> mary is mary2
True
With the identity map, a single Session can be relied upon to keep all loaded instances straight.
As far as the issue of the same object being modified in two different Sessions, that's an issue of concurrency detection; SQLAlchemy does some basic concurrency checks when saving objects, with the option for a stronger check using version ids. See the advanced data mapping documentation for more details.
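As a rough sketch of the stronger check, the mapper() function accepts a version_id_col argument naming a column that is incremented and checked on each UPDATE; the versioned_users table and VersionedUser class here are hypothetical and not part of the doctest session:

# a hypothetical table carrying an extra integer column used as a version counter
versioned_users = Table('versioned_users', metadata,
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(40)),
    Column('version_id', Integer, nullable=False))

class VersionedUser(object):
    pass

# the mapper bumps version_id on each UPDATE, and can detect a concurrent
# modification that changed it underneath us (a sketch)
mapper(VersionedUser, versioned_users, version_id_col=versioned_users.c.version_id)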
With a new user "ed" and some changes made on "Mary" and "Harry", let's also mark "Fred" as deleted:
>>> fred = query.get_by(user_name='Fred')
>>> session.delete(fred)
Then, to send all of our changes to the database, we flush() the Session. Let's turn echo back on to see this happen:
>>> metadata.engine.echo = True
>>> session.flush()
BEGIN
UPDATE users SET password=? WHERE users.user_id = ?
['marysnewpassword', 1]
UPDATE users SET password=? WHERE users.user_id = ?
['harrysnewpassword', 4]
INSERT INTO users (user_name, password) VALUES (?, ?)
['Ed', 'edspassword']
DELETE FROM users WHERE users.user_id = ?
[3]
COMMIT
When our User object contains relationships to other kinds of information, such as a list of email addresses, we can indicate this by using a function called relation() when creating the Mapper. While there is a lot you can do with relations, we'll cover a simple one here. First, recall that our users table has a foreign key relationship to another table called email_addresses. A single row in email_addresses has a column user_id that references a row in the users table; since many rows in the email_addresses table can reference a single row in users, this is called a one-to-many relationship.
To illustrate this relationship, we will start with a new mapper configuration. Since our User class already has a mapper assigned to it, we want to discard it and start over again. So we first issue the clear_mappers() function, which removes all mapping associations from classes:
>>> clear_mappers()
When removing mappers, it is usually best to remove all mappings at the same time, since mappers usually have relationships to each other which will become invalid if only part of the mapper collection is removed. In practice, a particular mapping setup will usually remain throughout the lifetime of an application. Clearing out the mappers and making new ones is a practice that is generally limited to writing mapper unit tests and experimenting from the console.
Next, we want to create a class/mapping that corresponds to the email_addresses table. We will create a new class, Address, which represents a single row in the email_addresses table, and a corresponding Mapper which will associate the Address class with the email_addresses table:
>>> class Address(object):
...     def __init__(self, email_address):
...         self.email_address = email_address
...     def __repr__(self):
...         return "%s(%r)" % (
...             self.__class__.__name__, self.email_address)
>>> mapper(Address, email_addresses_table)
<sqlalchemy.orm.mapper.Mapper object at 0x...>
We then create a mapper for the User class which contains a relationship to the Address class using the relation() function:
>>> mapper(User, users_table, properties={
...     'addresses':relation(Address)
... })
<sqlalchemy.orm.mapper.Mapper object at 0x...>
The relation() function takes either a class or a Mapper as its first argument, and has many options to further control its behavior. When this mapping relationship is used, each new User instance will contain an attribute called addresses. SQLAlchemy will automatically determine that this relationship is one-to-many, and will subsequently create addresses as a list. When a new User is created, this list begins empty.
The order in which the mapping definitions for User and Address are created is not significant. When the mapper() function is called, it creates an uncompiled mapping record corresponding to the given class/table combination. When the mappers are first used, the entire collection of mappers created up until that point will be compiled, which involves the establishment of class instrumentation as well as the resolution of all mapping relationships.
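A small sketch of triggering this compilation explicitly; class_mapper() is assumed here to return a class's primary mapper:

# mappers compile lazily on first use, but compilation can be requested directly
from sqlalchemy.orm import class_mapper

m = class_mapper(User)   # locate the primary mapper for the User class
m.compile()              # resolve relationships and instrument classes, if not already done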
Let's try out this new mapping configuration, and see what we get for the email addresses already in the database. Since we have made a new mapping configuration, it's best that we clear out our Session, which is currently holding onto every User object we have already loaded:
>>> session.clear()
We can then treat the addresses attribute on each User object like a regular list:
>>> mary = query.get_by(user_name='Mary')
SELECT users.user_name AS users_user_name, users.password AS users_password, users.user_id AS users_user_id
FROM users
WHERE users.user_name = ? ORDER BY users.oid LIMIT 1 OFFSET 0
['Mary']
>>> print [a for a in mary.addresses]
SELECT email_addresses.user_id AS email_addresses_user_id, email_addresses.address_id AS email_addresses_address_id, email_addresses.email_address AS email_addresses_email_address
FROM email_addresses
WHERE ? = email_addresses.user_id ORDER BY email_addresses.oid
[1]
[Address(u'mary@mary.com')]
Adding to the list is just as easy. New Address objects will be detected and saved when we flush the Session:
>>> mary.addresses.append(Address('mary2@gmail.com'))
>>> session.flush()
BEGIN
INSERT INTO email_addresses (email_address, user_id) VALUES (?, ?)
['mary2@gmail.com', 1]
COMMIT
Main documentation for using mappers: Data Mapping
You may have noticed in the example above that when we say session.flush(), SQLAlchemy prints BEGIN and COMMIT, indicating a transaction with the database. The flush() method, since it may execute many statements in a row, will automatically use a transaction in order to execute these instructions. But what if we want to use flush() inside of a larger transaction? This is performed via the SessionTransaction object, which we can establish using session.create_transaction(). Below, we will perform a more complicated SELECT statement, make several changes to our collection of users and email addresses, and then create a new user with two email addresses, all within the context of a transaction. We will perform a flush() in the middle of it to write the changes we have so far, and then allow the remaining changes to be written when we finally commit() the transaction. We enclose our operations within a try/except block to ensure that resources are properly freed:
>>> transaction = session.create_transaction()
>>> try:
...     (ed, harry, mary) = session.query(User).select(
...         User.c.user_name.in_('Ed', 'Harry', 'Mary'), order_by=User.c.user_name
...     )
...     del mary.addresses[1]
...     harry.addresses.append(Address('harry2@gmail.com'))
...     session.flush()
...     print "***flushed the session***"
...     fred = User()
...     fred.user_name = 'fred_again'
...     fred.addresses.append(Address('fred@fred.com'))
...     fred.addresses.append(Address('fredsnewemail@fred.com'))
...     session.save(fred)
...     transaction.commit()
... except:
...     transaction.rollback()
...     raise
BEGIN
SELECT users.user_name AS users_user_name, users.password AS users_password, users.user_id AS users_user_id
FROM users
WHERE users.user_name IN (?, ?, ?) ORDER BY users.user_name
['Ed', 'Harry', 'Mary']
SELECT email_addresses.user_id AS email_addresses_user_id, email_addresses.address_id AS email_addresses_address_id, email_addresses.email_address AS email_addresses_email_address
FROM email_addresses
WHERE ? = email_addresses.user_id ORDER BY email_addresses.oid
[4]
UPDATE email_addresses SET user_id=? WHERE email_addresses.address_id = ?
[None, 3]
INSERT INTO email_addresses (email_address, user_id) VALUES (?, ?)
['harry2@gmail.com', 4]
***flushed the session***
INSERT INTO users (user_name, password) VALUES (?, ?)
['fred_again', None]
INSERT INTO email_addresses (email_address, user_id) VALUES (?, ?)
['fred@fred.com', 6]
INSERT INTO email_addresses (email_address, user_id) VALUES (?, ?)
['fredsnewemail@fred.com', 6]
COMMIT
Main documentation: Session / Unit of Work
That covers a quick tour through the basic idea of SQLAlchemy, in its simplest form. Beyond that, one should familiarize oneself with the basics of Sessions, the various patterns that can be used to define different kinds of Mappers and relations among them, the rudimentary SQL types that are available when constructing Tables, and the basics of Engines, SQL statements, and database Connections.
A database engine is a subclass of sqlalchemy.sql.Engine, and is the starting point for where SQLAlchemy provides a layer of abstraction on top of the various DBAPI2 database modules. For all databases supported by SA, there is a specific "implementation" module, found in the sqlalchemy.databases package, that provides all the objects an Engine needs in order to perform its job. A typical user of SQLAlchemy never needs to deal with these modules directly. For many purposes, the only knowledge that's needed is how to create an Engine for a particular connection URL; this is done via the create_engine() function. When dealing with direct execution of SQL statements, one would also be aware of Result, Connection, and Transaction objects; the Transaction object provides begin(), commit() and rollback() methods that support basic "nestable" behavior, meaning an outermost transaction is maintained against multiple nested calls to begin/commit.
Underneath the public-facing API of Engine, several components are provided by database implementations to provide the full behavior. Among them is a compiler, whose compile() method receives SQL expression objects and assembles them into strings that are suitable for direct execution. It also collects default bind parameters into a data structure that will be converted at execution time into a dictionary or list, depending on the dialect's paramstyle.
Engines exist for SQLite, Postgres, MySQL, and Oracle, using the Pysqlite, Psycopg2 (Psycopg1 will work to a limited degree, but it is not supported), MySQLdb, and cx_Oracle modules. There is also preliminary support for MS-SQL using adodbapi or pymssql, as well as Firebird. For each engine, a distinct Python module exists in the sqlalchemy.databases package, which provides implementations of some of the objects mentioned in the previous section.
Downloads for each DBAPI at the time of this writing can be found on each project's website.
The SQLAlchemy Wiki contains a page of database notes, describing whatever quirks and behaviors have been observed. It's a good place to check for issues with specific databases: see Database Notes.
SQLAlchemy indicates the source of an Engine strictly via RFC-1738 style URLs, combined with optional keyword arguments to specify options for the Engine. The form of the URL is:
driver://username:password@host:port/database
Available driver names are sqlite, mysql, postgres, oracle, mssql, and firebird. For sqlite, the database name is the filename to connect to, or the special name ":memory:" which indicates an in-memory database. The URL is typically sent as a string to the create_engine() function:
# postgres
pg_db = create_engine('postgres://scott:tiger@localhost:5432/mydatabase')

# sqlite (note the four slashes for an absolute path)
sqlite_db = create_engine('sqlite:////absolute/path/to/database.txt')
sqlite_db = create_engine('sqlite:///relative/path/to/database.txt')
sqlite_db = create_engine('sqlite://')  # in-memory database

# mysql
mysql_db = create_engine('mysql://localhost/foo')

# oracle via TNS name
oracle_db = create_engine('oracle://scott:tiger@dsn')

# oracle will feed host/port/SID into cx_oracle.makedsn
oracle_db = create_engine('oracle://scott:tiger@127.0.0.1:1521/sidname')
The Engine will create its first connection to the database when a SQL statement is executed. As concurrent statements are executed, the underlying connection pool will grow to a default size of five connections, and will allow a default "overflow" of ten. Since the Engine is essentially "home base" for the connection pool, it follows that you should keep a single Engine per database established within an application, rather than creating a new one for each connection.
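Both defaults can be tuned when the engine is created; a sketch with illustrative values, using the pool_size and max_overflow arguments listed among the create_engine() options later in this document:

# keep up to 20 pooled connections, allowing 10 more to open during bursts
db = create_engine('mysql://localhost/test', pool_size=20, max_overflow=10)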
Custom arguments can be passed to the underlying DBAPI in three ways. String-based arguments can be passed directly from the URL string as query arguments:
db = create_engine('postgres://scott:tiger@localhost/test?argument1=foo&argument2=bar')
If SQLAlchemy's database connector is aware of a particular query argument, it may convert its type from string to its proper type.
create_engine() also takes a connect_args argument, which is an additional dictionary that will be passed to the DBAPI's connect() function. This can be used when arguments of a type other than string are required, and SQLAlchemy's database connector has no type conversion logic present for that parameter:
db = create_engine('postgres://scott:tiger@localhost/test', connect_args = {'argument1':17, 'argument2':'bar'})
The most customizable connection method of all is to pass a creator argument, which specifies a callable that returns a DBAPI connection:
def connect():
    return psycopg.connect(user='scott', host='localhost')

db = create_engine('postgres://', creator=connect)
Keyword options can also be specified to create_engine(), following the string URL, as follows:
db = create_engine('postgres://...', encoding='latin1', echo=True, module=psycopg1)
A list of all standard options, as well as several that are used by particular database dialects, is as follows:
convert_unicode=False - if set to True, all String/character-based types will convert Unicode values to raw byte values going into the database, and all raw byte values to Python Unicode coming out in result sets. This is an engine-wide setting; to enable unicode conversion on a per-column basis, use the Unicode column type instead, described in The Types System.
echo=False - if True, the Engine will log all statements as well as a repr() of their parameter lists to the engine's log, which defaults to sys.stdout. The echo attribute of Engine can be modified at any time to turn logging on and off. If set to the string "debug", result rows will be printed to the standard output as well. This flag ultimately controls a Python logger; see Configuring Logging for information on how to configure logging directly.
encoding='utf-8' - the encoding to use for all Unicode translations, both by the engine-wide convert_unicode setting as well as the Unicode type object.
max_overflow=10 - the number of connections to allow in "overflow", that is, connections that can be opened above and beyond the pool_size setting. This is only used with QueuePool.
pool=None - an already-constructed instance of sqlalchemy.pool.Pool to be used as the underlying source for connections. For more on connection pooling, see Connection Pooling.
Example of a manual invocation of pool.QueuePool (which is the pool instance used for all databases except sqlite):
from sqlalchemy import *
import sqlalchemy.pool as pool
import MySQLdb

def getconn():
    return MySQLdb.connect(user='ed', dbname='mydb')

engine = create_engine('mysql://', pool=pool.QueuePool(getconn, pool_size=20, max_overflow=40))
poolclass=None - a sqlalchemy.pool.Pool subclass that will be instantiated in place of the default connection pool.
pool_size=5 - the number of connections to keep open inside the connection pool. This is used with QueuePool as well as SingletonThreadPool.
pool_timeout=30 - the number of seconds to wait before giving up on getting a connection from the pool. This is only used with QueuePool.
strategy='plain' - the strategy used by the Engine for implicit execution; setting it to 'threadlocal' selects the TLEngine class that provides a modified connection scope for implicit executions. Implicit execution as well as further detail on this setting are described in Implicit Execution.
threaded=True - used only by cx_Oracle; sets the threaded parameter of the connection, indicating thread-safe usage. cx_Oracle docs indicate setting this flag to False will speed performance by 10-15%. While this defaults to False in cx_Oracle, SQLAlchemy defaults it to True, preferring stability over early optimization.
use_ansi=True - used only by Oracle; when False, Oracle-8 compatible join constructs are generated, in which the syntax <column1>(+)=<column2> must be used in order to achieve a LEFT OUTER JOIN.
As of the 0.3 series of SQLAlchemy, Python's standard logging module is used to implement informational and debug log output. This allows SQLAlchemy's logging to integrate in a standard way with other applications and libraries. The echo and echo_pool flags that are present on create_engine(), as well as the echo_uow flag used on Session, all interact with regular loggers.
This section assumes familiarity with the standard Python logging module. All logging performed by SQLAlchemy exists underneath the sqlalchemy namespace, as used by logging.getLogger('sqlalchemy'). When logging has been configured (i.e. such as via logging.basicConfig()), the general namespace of SA loggers that can be turned on is as follows:
sqlalchemy.engine - controls SQL echoing. Set to logging.INFO for SQL query output, logging.DEBUG for query + result set output.
sqlalchemy.pool - controls connection pool logging. Set to logging.INFO or lower to log connection pool checkouts/checkins.
sqlalchemy.orm - controls logging of various ORM functions. Set to logging.INFO for configurational logging as well as unit of work dumps, logging.DEBUG for extensive logging during query and flush() operations. Subcategories of sqlalchemy.orm include:

sqlalchemy.orm.attributes - logs certain instrumented attribute operations, such as triggered callables

sqlalchemy.orm.mapper - logs Mapper configuration and operations

sqlalchemy.orm.unitofwork - logs flush() operations, including dependency sort graphs and other operations

sqlalchemy.orm.strategies - logs relation loader operations (i.e. lazy and eager loads)

sqlalchemy.orm.sync - logs synchronization of attributes from parent to child instances during a flush()
For example, to log SQL queries as well as unit of work debugging:
import logging
logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)
logging.getLogger('sqlalchemy.orm.unitofwork').setLevel(logging.DEBUG)
By default, the log level is set to logging.ERROR within the entire sqlalchemy namespace, so that no log operations occur even within an application that has logging enabled otherwise.
The echo flags present as keyword arguments to create_engine() and others, as well as the echo property on Engine, when set to True, will first attempt to ensure that logging is enabled. Unfortunately, the logging module provides no way of determining if output has already been configured (note we are referring to whether a logging configuration has been set up, not just whether the logging level is set). For this reason, any echo=True flag will result in a call to logging.basicConfig() using sys.stdout as the destination. It also sets up a default format using the level name, timestamp, and logger name. Note that this configuration has the effect of being applied in addition to any existing logger configurations. Therefore, when using Python logging, ensure all echo flags are set to False at all times, to avoid getting duplicate log lines.
In this section we describe the SQL execution interface available from an Engine instance. Note that when using the Object Relational Mapper (ORM), as well as when dealing with "bound" metadata objects, SQLAlchemy deals with the Engine and Connections for you and you generally don't need to know much about them; in those cases, you can skip this section and go to Database Meta Data. "Bound" metadata is described in Binding MetaData to an Engine.
The Engine provides a connect() method which returns a Connection object. Connection is a proxy object which maintains a reference to a DBAPI connection instance. This object provides methods by which literal SQL text as well as SQL clause constructs can be compiled and executed.
engine = create_engine('sqlite:///:memory:')
connection = engine.connect()
result = connection.execute("select * from mytable where col1=:col1", col1=5)
for row in result:
    print row['col1'], row['col2']
connection.close()
The close() method on Connection does not actually remove the underlying connection to the database, but rather indicates that the underlying resources can be returned to the connection pool. When using the connect() method, the DBAPI connection referenced by the Connection object is not referenced anywhere else.
In both execution styles above, the Connection object will also automatically return its resources to the connection pool when the object is garbage collected, i.e. when its __del__() method is called. When using the standard C implementation of Python, this method is usually called immediately, as soon as the object is dereferenced. With other Python implementations such as Jython, this is not so guaranteed; a try/finally block, as sketched below, makes the return of resources deterministic.
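A minimal sketch, assuming a mytable table exists:

connection = engine.connect()
try:
    result = connection.execute("select * from mytable")
    for row in result:
        print row['col1']
finally:
    # always executed, whether or not the statements above raised
    connection.close()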
The execute() method on Engine and Connection can also receive SQL clause constructs, which are described in Constructing SQL Queries via Python Expressions:
connection = engine.connect()
result = connection.execute(select([table1], table1.c.col1==5))
for row in result:
    print row['col1'], row['col2']
connection.close()
Both Connection and Engine fulfill an interface known as Connectable, which specifies common functionality between the two objects, such as getting a Connection and executing queries. Therefore, most SQLAlchemy functions which take an Engine as a parameter with which to execute SQL will also accept a Connection (and the name of the argument is typically connectable):
engine = create_engine('sqlite:///:memory:')

# specify some Table metadata
metadata = MetaData()
table = Table('sometable', metadata, Column('col1', Integer))

# create the table with the Engine
table.create(connectable=engine)

# drop the table with a Connection off the Engine
connection = engine.connect()
table.drop(connectable=connection)
One important Connection fact: the Connection object is not threadsafe (see the note under Implicit Execution below).
The Connection object provides a begin() method which returns a Transaction object. This object is usually used within a try/except clause so that it is guaranteed to rollback() or commit():
trans = connection.begin()
try:
    r1 = connection.execute(table1.select())
    connection.execute(table1.insert(), col1=7, col2='this is some data')
    trans.commit()
except:
    trans.rollback()
    raise
The Transaction object also handles "nested" behavior by keeping track of the outermost begin/commit pair. In this example, two functions both issue a transaction on a Connection, but only the outermost Transaction object actually takes effect when it is committed:
# method_a starts a transaction and calls method_b
def method_a(connection):
    trans = connection.begin()  # open a transaction
    try:
        method_b(connection)
        trans.commit()  # transaction is committed here
    except:
        trans.rollback()  # this rolls back the transaction unconditionally
        raise

# method_b also starts a transaction
def method_b(connection):
    trans = connection.begin()  # open a transaction - this runs in the context of method_a's transaction
    try:
        connection.execute("insert into mytable values ('bat', 'lala')")
        connection.execute(mytable.insert(), col1='bat', col2='lala')
        trans.commit()  # transaction is not committed yet
    except:
        trans.rollback()  # this rolls back the transaction unconditionally
        raise

# open a Connection and call method_a
conn = engine.connect()
method_a(conn)
conn.close()
Above, method_a is called first, which calls connection.begin(). Then it calls method_b. When method_b calls connection.begin(), it just increments a counter that is decremented when it calls commit(). If either method_a or method_b calls rollback(), the whole transaction is rolled back. The transaction is not committed until method_a calls the commit() method.
Note that SQLAlchemy's Object Relational Mapper also provides a way to control transaction scope at a higher level; this is described in SessionTransaction.
One important Transaction fact: like the Connection it was created from, the Transaction object is not threadsafe.
Implicit execution refers to the execution of SQL without the explicit usage of a Connection object. This occurs when you call the execute() method of an Engine object, or of a SQL expression or table that is associated with "bound" metadata:
engine = create_engine('sqlite:///:memory:')
result = engine.execute("select * from mytable where col1=:col1", col1=5)
for row in result:
    print row['col1'], row['col2']
result.close()
Using "bound" metadata:
engine = create_engine('sqlite:///:memory:')
meta = BoundMetaData(engine)
table = Table('mytable', meta,
    Column('col1', Integer),
    Column('col2', String(20)))
r = table.insert().execute(col1=5, col2='some record')
Notice that in the above two examples, no connect() method is ever called, nor do we ever see a Connection anywhere; the Connection is created for you automatically via the execute() method, and a handle to the execution's cursor remains open in the returned result set. When the result set is closed via the close() method, or if the result set object falls out of scope and is garbage collected, the underlying cursor is closed, the Connection is discarded, and the underlying DBAPI connection is returned to the connection pool.
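In other words, promptly closing the result is all that is needed to release the implicit connection; a short sketch, reusing the bound table from above:

r = table.select().execute()
rows = r.fetchall()
r.close()   # the underlying DBAPI connection is returned to the pool here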
The purpose of the "implicit" connection is strictly one of convenience; while in SQLAlchemy 0.1 it was the only style of operation, it is now optional.
The internal behavior of the engine during implicit execution can be affected by the strategy keyword argument to create_engine(). Generally this setting can be left at its default value of plain. However, for the advanced user, the threadlocal option can provide the service of managing connections against the current thread from which they were pulled from the connection pool, so that the same underlying DBAPI connection, as well as a single database-level transaction, can be shared by many operations without explicitly passing a Connection or Transaction object around. It also may reduce the number of connections checked out from the connection pool at a given time.
Note that this setting does not change the fact that Connection and Transaction objects are not threadsafe. The "threadlocal" strategy affects the selection of DBAPI connections which are pulled from the connection pool when a Connection object is created, but it does not synchronize method access to the Connection or Transaction instances themselves, which are only proxy objects. It is instead intended that many Connection instances would share access to a single "connection" object that is referenced in relation to the current thread.
When strategy is set to plain, each implicit execution requests a unique connection from the connection pool, which is returned to the pool when the resulting ResultProxy falls out of scope (i.e. __del__() is called) or its close() method is called. If a second implicit execution occurs while the ResultProxy from the previous execution is still open, then a second connection is pulled from the pool.
When strategy is set to threadlocal, the Engine still checks out a connection which is closeable in the same manner via the ResultProxy, except the connection it checks out will be the same connection as one which is already checked out, assuming the operation is in the same thread. When all ResultProxy objects are closed in a particular thread, the connection is returned to the pool normally.
An additional feature of the threadlocal selection is that Transaction objects can be managed implicitly as well, by calling the begin(), commit() and rollback() methods off of the Engine, or by using Transaction objects from the thread-local connection.
It is crucial to note that the plain and threadlocal contexts do not impact the connect() method on the Engine; connect() always returns a unique connection. Implicit connections use a different method off of Engine for their operations, called contextual_connect().
By default, every call to execute pulls a dedicated DBAPI connection from the connection pool:
db = create_engine('mysql://localhost/test', strategy='plain')

# execute one statement and receive results.  r1 now references a DBAPI connection resource.
r1 = db.execute("select * from table1")

# execute a second statement and receive results.  r2 now references a *second* DBAPI connection resource.
r2 = db.execute("select * from table2")

for row in r1:
    ...
for row in r2:
    ...

# release connection 1
r1.close()

# release connection 2
r2.close()
Using the "threadlocal" strategy, all calls to execute within the same thread are guaranteed to use the same underlying DBAPI connection, which is only returned to the connection pool when all ResultProxy instances have been closed:
db = create_engine('mysql://localhost/test', strategy='threadlocal')

# execute one statement and receive results.  r1 now references a DBAPI connection resource.
r1 = db.execute("select * from table1")

# execute a second statement and receive results.  r2 now references the *same* resource as r1
r2 = db.execute("select * from table2")

for row in r1:
    ...
for row in r2:
    ...

# dereference r1.  the connection is still held by r2.
r1 = None

# dereference r2.  with no more references to the underlying connection resources, they
# are returned to the pool.
r2 = None
To get at the actual Connection object which is used by implicit executions, call the contextual_connection() method on Engine:
# threadlocal strategy
db = create_engine('mysql://localhost/test', strategy='threadlocal')

conn1 = db.contextual_connection()
conn2 = db.contextual_connection()

>>> conn1.connection is conn2.connection
True
When the plain strategy is used, the contextual_connection() method is synonymous with the connect() method; both return a distinct connection from the pool.
One programming pattern that the threadlocal strategy supports is transparent connection and transaction sharing:
db = create_engine('mysql://localhost/test', strategy='threadlocal')

def dosomethingimplicit():
    table1.execute("some sql")
    table1.execute("some other sql")

def dosomethingelse():
    table2.execute("some sql")
    conn = db.contextual_connection()
    # do stuff with conn
    conn.execute("some other sql")
    conn.close()

def dosomethingtransactional():
    conn = db.contextual_connection()
    trans = conn.begin()
    # do stuff
    trans.commit()

db.create_transaction()
try:
    dosomethingimplicit()
    dosomethingelse()
    dosomethingtransactional()
    db.commit()
except:
    db.rollback()
In the above example, the program calls three functions: dosomethingimplicit(), dosomethingelse() and dosomethingtransactional(). In all three functions, either implicit execution is used, or an explicit Connection is used via the contextual_connection() method. This indicates that they all will share the same underlying DBAPI connection as well as the same parent Transaction instance, which is created in the main body of the program via the call to db.create_transaction(). So while there are several calls that return "new" Transaction or Connection objects, in reality only one "real" connection is ever used, and there is only one transaction (i.e. one begin/commit pair) executed.
The core of SQLAlchemy's query and object mapping operations is database metadata: Python objects that describe tables and other schema-level objects. Metadata objects can be created by explicitly naming the various components and their properties, using the Table, Column, ForeignKey, Index, and Sequence objects imported from sqlalchemy.schema. There is also support for reflection, which means you only specify the name of the entities and they are recreated from the database automatically.
A collection of metadata entities is stored in an object aptly named MetaData. This object takes an optional name parameter:
from sqlalchemy import *

metadata = MetaData(name='my metadata')
Then to construct a Table, use the Table class:
users = Table('users', metadata,
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(16), nullable = False),
    Column('email_address', String(60), key='email'),
    Column('password', String(20), nullable = False)
)

user_prefs = Table('user_prefs', metadata,
    Column('pref_id', Integer, primary_key=True),
    Column('user_id', Integer, ForeignKey("users.user_id"), nullable=False),
    Column('pref_name', String(40), nullable=False),
    Column('pref_value', String(100))
)
The specific datatypes for each Column, such as Integer, String, etc., are described in The Types System, and exist within the module sqlalchemy.types as well as the global sqlalchemy namespace.
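A tiny sketch demonstrating that both import paths reference the same objects:

import sqlalchemy
import sqlalchemy.types

# the global namespace re-exports the contents of sqlalchemy.types
assert sqlalchemy.String is sqlalchemy.types.String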
Foreign keys are most easily specified by the ForeignKey object within a Column object. For a composite foreign key, i.e. a foreign key that contains multiple columns referencing multiple columns of a composite primary key, an explicit syntax is provided which allows the correct table CREATE statements to be generated:
# a table with a composite primary key
invoices = Table('invoices', metadata,
    Column('invoice_id', Integer, primary_key=True),
    Column('ref_num', Integer, primary_key=True),
    Column('description', String(60), nullable=False)
)

# a table with a composite foreign key referencing the parent table
invoice_items = Table('invoice_items', metadata,
    Column('item_id', Integer, primary_key=True),
    Column('item_name', String(60), nullable=False),
    Column('invoice_id', Integer, nullable=False),
    Column('ref_num', Integer, nullable=False),
    ForeignKeyConstraint(['invoice_id', 'ref_num'],
                         ['invoices.invoice_id', 'invoices.ref_num'])
)
Above, the invoice_items table will have ForeignKey objects automatically added to the invoice_id and ref_num Column objects as a result of the additional ForeignKeyConstraint object.
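A quick sketch verifying this, using the Column.foreign_key accessor shown later in this section:

# each constrained column now carries its own ForeignKey object
assert invoice_items.c.invoice_id.foreign_key is not None
assert invoice_items.c.ref_num.foreign_key is not None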
The MetaData object supports some handy methods, such as getting a list of Tables in the order (or reverse order) of their dependency:
>>> for t in metadata.table_iterator(reverse=False):
...     print t.name
users
user_prefs
And Table provides an interface to the table's properties as well as those of its columns:
employees = Table('employees', metadata,
    Column('employee_id', Integer, primary_key=True),
    Column('employee_name', String(60), nullable=False, key='name'),
    Column('employee_dept', Integer, ForeignKey("departments.department_id"))
)

# access the column "employee_id":
employees.columns.employee_id

# or just
employees.c.employee_id

# via string
employees.c['employee_id']

# iterate through all columns
for c in employees.c:
    pass  # ...

# get the table's primary key columns
for primary_key in employees.primary_key:
    pass  # ...

# get the table's foreign key objects:
for fkey in employees.foreign_keys:
    pass  # ...

# access the table's MetaData:
employees.metadata

# access the table's Engine, if its MetaData is bound:
employees.engine

# access a column's name, type, nullable, primary key, foreign key
employees.c.employee_id.name
employees.c.employee_id.type
employees.c.employee_id.nullable
employees.c.employee_id.primary_key
employees.c.employee_dept.foreign_key

# get the "key" of a column, which defaults to its name, but can
# be any user-defined string:
employees.c.name.key

# access a column's table:
employees.c.employee_id.table is employees  # evaluates to True

# get the table related by a foreign key
fcolumn = employees.c.employee_dept.foreign_key.column.table
A MetaData object can be associated with one or more Engine instances. This allows the MetaData and the elements within it to perform operations automatically, using the connection resources of that Engine. This includes being able to "reflect" the columns of tables, as well as to perform create and drop operations, without needing to pass an Engine or Connection around. It also allows SQL constructs to be created which know how to execute themselves (called "implicit execution").
To bind MetaData to a single Engine, use BoundMetaData:
engine = create_engine('sqlite://', **kwargs)

# create BoundMetaData from an Engine
meta = BoundMetaData(engine)

# create the Engine and MetaData in one step
meta = BoundMetaData('postgres://db/', **kwargs)
Another form of MetaData exists which allows connecting to any number of engines, within the context of the current thread. This is DynamicMetaData:
meta = DynamicMetaData()

meta.connect(engine)                   # connect to an existing Engine
meta.connect('mysql://user@host/dsn')  # create a new Engine and connect
DynamicMetaData is ideal for applications that need to use the same set of Tables for many different database connections in the same process, such as a CherryPy web application which handles multiple application instances in one process.
Some users prefer to create Table objects without specifying a MetaData object at all, having Tables scoped on an application-wide basis. For them, the default_metadata object and the global_connect() function are supplied. default_metadata is simply an instance of DynamicMetaData that exists within the sqlalchemy namespace, and global_connect() is a synonym for default_metadata.connect(). Defining a Table that has no MetaData argument will automatically use this default metadata as follows:
from sqlalchemy import *

# a Table with just a name and its Columns
mytable = Table('mytable',
    Column('col1', Integer, primary_key=True),
    Column('col2', String(40))
)

# connect all the "anonymous" tables to a postgres uri in the current thread
global_connect('postgres://foo:bar@lala/test')

# create all tables in the default metadata
default_metadata.create_all()

# the table is bound
mytable.insert().execute(col1=5, col2='some value')
Once you have a BoundMetaData or a connected DynamicMetaData, you can create Table objects without specifying their columns, just their names, using autoload=True:
>>> messages = Table('messages', meta, autoload=True)
>>> [c.name for c in messages.columns]
['message_id', 'message_name', 'date']
At the moment the Table is constructed, it will query the database for the columns and constraints of the messages table.
Note that if a reflected table has a foreign key referencing another table, then the metadata for the related table will be loaded as well, even if it has not been defined by the application:
>>> shopping_cart_items = Table('shopping_cart_items', meta, autoload=True)
>>> print shopping_cart_items.c.cart_id.table.name
shopping_carts
To get direct access to 'shopping_carts', simply instantiate it via the Table constructor. You'll get the same instance of the shopping cart Table as the one that is attached to shopping_cart_items:
>>> shopping_carts = Table('shopping_carts', meta)
>>> shopping_carts is shopping_cart_items.c.cart_id.table
True
This works because when the Table constructor is called for a particular name and MetaData object, if the table has already been created then the instance returned will be the same as the original; the Table constructor acts as a singleton constructor:
>>> news_articles = Table('news', meta,
...    Column('article_id', Integer, primary_key=True),
...    Column('url', String(250), nullable=False)
... )
>>> othertable = Table('news', meta)
>>> othertable is news_articles
True
Individual columns can be overridden with explicit values when reflecting tables; this is handy for specifying custom datatypes, constraints such as primary keys that may not be configured within the database, etc.
>>> mytable = Table('mytable', meta,
...    Column('id', Integer, primary_key=True),   # override reflected 'id' to have primary key
...    Column('mydata', Unicode(50)),             # override reflected 'mydata' to be Unicode
...    autoload=True)
Some databases support the concept of multiple schemas. A Table can reference its schema by specifying the schema keyword argument:
financial_info = Table('financial_info', meta,
    Column('id', Integer, primary_key=True),
    Column('value', String(100), nullable=False),
    schema='remote_banks'
)
Within the MetaData collection, this table will be identified by the combination of financial_info and remote_banks. If another table called financial_info is referenced without the remote_banks schema, it will refer to a different Table. ForeignKey objects can reference columns in this table using the form remote_banks.financial_info.id.
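For example, a child table might reference the schema-qualified column above; this is a minimal sketch, and the refunds table here is invented for illustration:

# hypothetical table referencing a column in the 'remote_banks' schema
refunds = Table('refunds', meta,
    Column('refund_id', Integer, primary_key=True),
    Column('bank_info_id', Integer, ForeignKey('remote_banks.financial_info.id'))
)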
ON UPDATE and ON DELETE clauses for a table create are specified within the ForeignKeyConstraint object, using the onupdate and ondelete keyword arguments:
foobar = Table('foobar', meta,
    Column('id', Integer, primary_key=True),
    Column('lala', String(40)),
    ForeignKeyConstraint(['lala'], ['hoho.lala'], onupdate="CASCADE", ondelete="CASCADE")
)
Note that these clauses are not supported on SQLite, and require InnoDB tables when used with MySQL. They may also not be supported on other databases.
Feature Status: Alpha Implementation
Many table, schema, or column names require quoting to be enabled. Reasons for this include names that are the same as a database reserved word, or identifiers that use MixedCase, where the database would normally "fold" the case convention into lower or uppercase (such as Postgres). SQLAlchemy will attempt to determine automatically when quoting should be used. It determines a value for every identifier name called case_sensitive, which defaults to False if the identifier name uses no uppercase letters, or True otherwise. This flag may also be explicitly set on any schema item (schema items include Table, Column, MetaData, Sequence, etc.) to override this default setting; objects will inherit the setting from an enclosing object if not explicitly overridden.
When case_sensitive is True, the dialect will do what it has to in order for the database to recognize the casing. For Postgres and Oracle, this means using quoted identifiers.
Identifiers that match known SQL reserved words (such as "asc", "union", etc.) will also be quoted according to the dialect's quoting convention, regardless of the case_sensitive setting.
To force quoting for an identifier, set the quote=True flag on Column or Table, as well as the quote_schema=True flag for Table.
table2 = Table('WorstCase2', metadata,
    # desc is a reserved word, which will be quoted.
    Column('desc', Integer, primary_key=True),

    # if using a reserved word which SQLAlchemy doesn't know about,
    # specify quote=True
    Column('some_reserved_word', Integer, quote=True, primary_key=True),

    # MixedCase uses a mixed case convention.
    # it will be automatically quoted since it is case sensitive
    Column('MixedCase', Integer),

    # Union is both a reserved word and mixed case
    Column('Union', Integer),

    # normal_column doesn't require quoting
    Column('normal_column', String(30))
)

# to use tables where case_sensitive is False by default regardless
# of identifier casing, set "case_sensitive" to False at any level
# (or True to force case sensitivity for lowercase identifiers as well)
lowercase_metadata = MetaData(case_sensitive=False)
Tables may support database-specific options, such as MySQL's engine option that can specify "MyISAM", "InnoDB", and other backends for the table:
addresses = Table('engine_email_addresses', meta,
    Column('address_id', Integer, primary_key=True),
    Column('remote_user_id', Integer, ForeignKey(users.c.user_id)),
    Column('email_address', String(20)),
    mysql_engine='InnoDB'
)
Creating and dropping individual tables can be done via the create() and drop() methods of Table; these methods take an optional connectable parameter which references an Engine or a Connection. If not supplied, the Engine bound to the MetaData will be used, or an error is raised if no engine is bound:
meta = BoundMetaData('sqlite:///:memory:')
employees = Table('employees', meta,
    Column('employee_id', Integer, primary_key=True),
    Column('employee_name', String(60), nullable=False, key='name'),
    Column('employee_dept', Integer, ForeignKey("departments.department_id"))
)
employees.create()
Dropping a table is performed similarly, via the drop() method:
employees.drop(connectable=e)
The create() and drop() methods also support an optional keyword argument checkfirst, which will issue the database's appropriate pragma statements to check if the table exists before creating or dropping:
employees.create(connectable=e, checkfirst=True)
employees.drop(checkfirst=False)
Entire groups of Tables can be created and dropped directly from the MetaData object with create_all() and drop_all(). These methods always check for the existence of each table before creating or dropping. Each method takes an optional connectable keyword argument which can reference an Engine or a Connection. If no engine is specified, the underlying bound Engine, if any, is used:
engine = create_engine('sqlite:///:memory:')

metadata = MetaData()

users = Table('users', metadata,
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(16), nullable=False),
    Column('email_address', String(60), key='email'),
    Column('password', String(20), nullable=False)
)

user_prefs = Table('user_prefs', metadata,
    Column('pref_id', Integer, primary_key=True),
    Column('user_id', Integer, ForeignKey("users.user_id"), nullable=False),
    Column('pref_name', String(40), nullable=False),
    Column('pref_value', String(100))
)

metadata.create_all(connectable=engine)
SQLAlchemy includes flexible constructs for creating default values for columns upon the insertion of rows, as well as upon update. These defaults can take several forms: a constant, a Python callable to be pre-executed before the SQL is executed, a SQL expression or function to be pre-executed before the SQL is executed, a pre-executed Sequence (for databases that support sequences), or a "passive" default, which is a default function triggered by the database itself upon insert, the value of which can then be post-fetched by the engine, provided the table has a primary key with which to locate the row.
A basic default is most easily specified by the "default" keyword argument to Column. This defines a value, function, or SQL expression that will be pre-executed to produce the new value, before the row is inserted:
# a function to create primary key ids
i = 0
def mydefault():
    global i
    i += 1
    return i

t = Table("mytable", meta,
    # function-based default
    Column('id', Integer, primary_key=True, default=mydefault),

    # a scalar default
    Column('key', String(10), default="default")
)
The "default" keyword can also take SQL expressions, including select statements or direct function calls:
t = Table("mytable", meta, Column('id', Integer, primary_key=True), # define 'create_date' to default to now() Column('create_date', DateTime, default=func.now()), # define 'key' to pull its default from the 'keyvalues' table Column('key', String(20), default=keyvalues.select(keyvalues.c.type='type1', limit=1)) )
The "default" keyword argument is shorthand for using a ColumnDefault object in a column definition. This syntax is optional, but is required for other types of defaults, futher described below:
Column('mycolumn', String(30), ColumnDefault(func.get_data()))
Similar to an on-insert default is an on-update default, which is most easily specified by the "onupdate" keyword to Column, which also can be a constant, plain Python function or SQL expression:
t = Table("mytable", meta, Column('id', Integer, primary_key=True), # define 'last_updated' to be populated with current_timestamp (the ANSI-SQL version of now()) Column('last_updated', DateTime, onupdate=func.current_timestamp()), )
To use an explicit ColumnDefault object to specify an on-update, use the "for_update" keyword argument:
Column('mycolumn', String(30), ColumnDefault(func.get_data(), for_update=True))
A PassiveDefault indicates a column default that is executed upon INSERT by the database. This construct is used to specify a SQL function that will be specified as "DEFAULT" when creating tables.
t = Table('test', meta,
    Column('mycolumn', DateTime, PassiveDefault("sysdate"))
)
A create call for the above table will produce:
CREATE TABLE test (
    mycolumn datetime default sysdate
)
PassiveDefault also sends a message to the Engine that data is available after an insert. The object-relational mapper system uses this information to post-fetch rows after the insert, so that instances can be refreshed with the new data. Below is a simplified version:
# table with passive defaults
mytable = Table('mytable', meta,
    Column('my_id', Integer, primary_key=True),

    # an on-insert database-side default
    Column('data1', Integer, PassiveDefault("d1_func"))
)

# insert a row
r = mytable.insert().execute()

# check the result: were there defaults fired off on that row ?
if r.lastrow_has_defaults():
    # postfetch the row based on primary key.
    # this only works for a table with primary key columns defined
    primary_key = r.last_inserted_ids()
    row = mytable.select(mytable.c.my_id == primary_key[0]).execute().fetchone()
When Tables are reflected from the database using autoload=True, any DEFAULT values set on the columns will be reflected in the Table object as PassiveDefault instances.
Current Postgres support does not rely upon OIDs to determine the identity of a row. This is because the usage of OIDs has been deprecated with Postgres, and they are disabled by default for table creates as of PG version 8. Psycopg2's "cursor.lastrowid" function only returns OIDs. Therefore, when inserting a new row which has passive defaults set on the primary key columns, the default function is still pre-executed, since SQLAlchemy would otherwise have no way of retrieving the row just inserted.
A table with a sequence looks like:
table = Table("cartitems", meta, Column("cart_id", Integer, Sequence('cart_id_seq'), primary_key=True), Column("description", String(40)), Column("createdate", DateTime()) )
The Sequence is used with Postgres or Oracle to indicate the name of a database sequence that will be used to create default values for a column. When a table with a Sequence on a column is created in the database by SQLAlchemy, the database sequence object is also created. Similarly, the database sequence is dropped when the table is dropped. Sequences are typically used with primary key columns. When using Postgres, if an integer primary key column defines no explicit Sequence or other default method, SQLAlchemy will create the column with the SERIAL keyword, and will pre-execute a sequence named "tablename_columnname_seq" in order to retrieve new primary key values when they are not otherwise provided. Oracle, which has no "auto-increment" keyword, requires that a Sequence be created for a table if automatic primary key generation is desired.
A Sequence object can be defined on a Table that is then used for a non-sequence-supporting database. In that case, the Sequence object is simply ignored. Note that a Sequence object is entirely optional for all databases except Oracle, as other databases offer options for auto-creating primary key values, such as AUTOINCREMENT, SERIAL, etc. SQLAlchemy will use these default methods for creating primary key values if no Sequence is present on the table metadata.
A sequence can also be specified with optional=True, which indicates that the Sequence should only be used on a database that requires an explicit sequence, and not on those that supply some other method of providing integer values. At the moment, it essentially means "use this sequence only with Oracle and not Postgres".
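As a minimal sketch, the cartitems table above could mark its sequence as optional like this:

table = Table("cartitems", meta,
    # the sequence fires on Oracle; on Postgres the column is
    # created as SERIAL and the sequence is skipped
    Column("cart_id", Integer, Sequence('cart_id_seq', optional=True), primary_key=True),
    Column("description", String(40)),
    Column("createdate", DateTime())
)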
Unique constraints can be created anonymously on a single column using the unique keyword on Column. Explicitly named unique constraints and/or those with multiple columns are created via the UniqueConstraint table-level construct.
meta = MetaData()
mytable = Table('mytable', meta,
    # per-column anonymous unique constraint
    Column('col1', Integer, unique=True),

    Column('col2', Integer),
    Column('col3', Integer),

    # explicit/composite unique constraint.  'name' is optional.
    UniqueConstraint('col2', 'col3', name='uix_1')
)
Check constraints can be named or unnamed and can be created at the Column or Table level, using the CheckConstraint construct. The text of the check constraint is passed directly through to the database, so there is limited "database independent" behavior. Column-level check constraints generally should refer only to the column to which they are placed, while table-level constraints can refer to any columns in the table.
Note that some databases, such as MySQL and SQLite, do not actively support check constraints.
meta = MetaData()
mytable = Table('mytable', meta,
    # per-column CHECK constraint
    Column('col1', Integer, CheckConstraint('col1>5')),

    Column('col2', Integer),
    Column('col3', Integer),

    # table level CHECK constraint.  'name' is optional.
    CheckConstraint('col2 > col3 + 5', name='check1')
)
Indexes can be created anonymously (using an auto-generated name "ix_<column label>") for a single column using the inline index keyword on Column, which also modifies the usage of unique to apply the uniqueness to the index itself, instead of adding a separate UNIQUE constraint. For indexes with specific names or which encompass more than one column, use the Index construct, which requires a name.
Note that the Index construct is created externally to the table to which it corresponds, using Column objects and not strings.
meta = MetaData()
mytable = Table('mytable', meta,
    # an indexed column, with index "ix_mytable_col1"
    Column('col1', Integer, index=True),

    # a uniquely indexed column with index "ix_mytable_col2"
    Column('col2', Integer, index=True, unique=True),

    Column('col3', Integer),
    Column('col4', Integer),
    Column('col5', Integer),
    Column('col6', Integer),
)

# place an index on col3, col4
Index('idx_col34', mytable.c.col3, mytable.c.col4)

# place a unique index on col5, col6
Index('myindex', mytable.c.col5, mytable.c.col6, unique=True)
The Index objects will be created along with the CREATE statements for the table itself. An index can also be created on its own, independently of the table:
# create a table
sometable.create()

# define an index
i = Index('someindex', sometable.c.col5)

# create the index; this will use the table's connectable, or
# the connectable specified via the keyword argument
i.create()
A Table object created against a specific MetaData object can be re-created against a new MetaData using the tometadata method:
# create two metadata
meta1 = BoundMetaData('sqlite:///querytest.db')
meta2 = MetaData()

# load 'users' from the sqlite engine
users_table = Table('users', meta1, autoload=True)

# create the same Table object for the plain metadata
users_table_2 = users_table.tometadata(meta2)
Note: This section describes how to use SQLAlchemy to construct SQL queries and receive result sets. It does not cover the object relational mapping capabilities of SQLAlchemy; that is covered later on in Data Mapping. However, both areas of functionality work similarly in how selection criteria are constructed, so if you are interested just in ORM, you should probably skim through basic WHERE Clause construction before moving on.
Once you have used the sqlalchemy.schema module to construct your tables and/or reflect them from the database, performing SQL queries using those table metadata objects is done via the sqlalchemy.sql package. This package defines a large set of classes, each of which represents a particular kind of lexical construct within a SQL query; all are descendants of the common base class sqlalchemy.sql.ClauseElement. A full query is represented via a structure of ClauseElements. A set of reasonably intuitive creation functions is provided by the sqlalchemy.sql package to create these structures; these functions are described in the rest of this section.
Executing a ClauseElement structure can be performed in two general ways. You can use an Engine or a Connection object's execute() method, to which you pass the query structure; this is known as explicit style. Or, if the ClauseElement structure is built upon Table metadata which is bound to an Engine directly, you can simply call execute() on the structure itself, known as implicit style. In both cases, the execution returns a cursor-like object (more on that later). The same clause structure can be executed repeatedly. The ClauseElement is compiled into a string representation by an underlying Compiler object which is associated with the Engine via its Dialect.
The examples below all include a dump of the generated SQL corresponding to the query object, as well as a dump of the statement's bind parameters. In all cases, bind parameters are shown as named parameters using the colon format (i.e. ':name'). When the statement is compiled into a database-specific version, the named-parameter statement and its bind values are converted to the proper paramstyle for that database automatically.
For this section, we will mostly use the implicit style of execution, meaning the Table objects are associated with an instance of BoundMetaData, and constructed ClauseElement objects support self-execution. Assume the following configuration:
from sqlalchemy import *

metadata = BoundMetaData('sqlite:///mydb.db', strategy='threadlocal', echo=True)

# a table to store users
users = Table('users', metadata,
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(40)),
    Column('password', String(80))
)

# a table that stores mailing addresses associated with a specific user
addresses = Table('addresses', metadata,
    Column('address_id', Integer, primary_key=True),
    Column('user_id', Integer, ForeignKey("users.user_id")),
    Column('street', String(100)),
    Column('city', String(80)),
    Column('state', String(2)),
    Column('zip', String(10))
)

# a table that stores keywords
keywords = Table('keywords', metadata,
    Column('keyword_id', Integer, primary_key=True),
    Column('name', VARCHAR(50))
)

# a table that associates keywords with users
userkeywords = Table('userkeywords', metadata,
    Column('user_id', INT, ForeignKey("users")),
    Column('keyword_id', INT, ForeignKey("keywords"))
)
A select is done by constructing a Select object with the proper arguments, adding any extra arguments if desired, then calling its execute() method.
from sqlalchemy import *

# use the select() function defined in the sql package
s = select([users])

# or, call the select() method off of a Table object
s = users.select()

# then, call execute on the Select object:
result = s.execute()
# the SQL text of any clause object can also be viewed via the str() call:
>>> str(s)
SELECT users.user_id, users.user_name, users.password FROM users
As mentioned above, ClauseElement structures can also be executed with a Connection object explicitly:
engine = create_engine('sqlite:///myfile.db')
conn = engine.connect()

s = users.select()
result = conn.execute(s)

conn.close()
For queries that don't contain any tables, ClauseElements that represent a fully executable statement support an engine keyword parameter which can bind the object to an Engine, thereby allowing implicit execution:
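A minimal sketch of that pattern, reusing the literal() construct described later in this section and assuming an Engine named db:

# no tables are involved, so the statement is bound to an
# engine explicitly via the 'engine' keyword argument
s = select([literal('foo') + literal('bar')], engine=db)
result = s.execute()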
The object returned by execute() is a sqlalchemy.engine.ResultProxy object, which acts much like a DBAPI cursor object in the context of a result set, except that the rows returned can address their columns by ordinal position, column name, or even column object:
# select rows, get resulting ResultProxy object
result = users.select().execute()
# get one row
row = result.fetchone()

# get the 'user_id' column via integer index:
user_id = row[0]

# or column name
user_name = row['user_name']

# or column object
password = row[users.c.password]

# or column accessor
password = row.password

# ResultProxy object also supports fetchall()
rows = result.fetchall()

# or get the underlying DBAPI cursor object
cursor = result.cursor

# close the result.  if the statement was implicitly executed
# (i.e. without an explicit Connection), this will return the
# underlying connection resources back to the connection pool.
# de-referencing the result will also have the same effect.  if an
# explicit Connection was used, then close() just closes the
# underlying cursor object.
result.close()
A common need when writing statements that reference multiple tables is to create labels for columns, thereby separating columns from different tables with the same name. The Select construct supports automatic generation of column labels via the use_labels=True parameter:
c = select([users, addresses],
           users.c.user_id==addresses.c.address_id,
           use_labels=True).execute()
The table name part of the label is affected if you use a construct such as a table alias:
person = users.alias('person')
c = select([person, addresses],
           person.c.user_id==addresses.c.address_id,
           use_labels=True).execute()
Labels are also generated in such a way as to never go beyond 30 characters. Most databases impose a limit on the length of symbols, such as Postgres, and particularly Oracle, which has a rather short limit of 30:
long_named_table = users.alias('this_is_the_person_table')
c = select([long_named_table], use_labels=True).execute()
You can also specify custom labels on a per-column basis using the label() function:
c = select([users.c.user_id.label('id'), users.c.user_name.label('name')]).execute()
Calling select off a table automatically generates a column clause which includes all the table's columns, in the order they are specified in the source Table object.
But in addition to selecting all the columns off a single table, any set of columns can be specified, as well as full tables, and any combination of the two:
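A minimal sketch, combining individual columns from one table with all the columns of another:

# two columns from 'users', plus every column of 'addresses'
c = select([users.c.user_id, users.c.user_name, addresses]).execute()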
The WHERE condition is specified either via the named keyword argument whereclause, as the second positional argument to the select() constructor, or as the first positional argument to the select() method of Table.
WHERE conditions are constructed using column objects, literal values, and functions defined in the sqlalchemy.sql module. Column objects override the standard Python operators to provide clause compositional objects, which compile down to SQL operations:
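For example (a minimal sketch; the generated SQL places the value 7 into a bind parameter, as the next paragraph describes):

# the Python == operator between a Column and a literal value
# produces a ClauseElement rather than a boolean
c = users.select(users.c.user_id == 7).execute()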
Notice that the literal value "7" was broken out of the query and placed into a bind parameter. Databases such as Oracle must parse incoming SQL and create a "plan" when new queries are received, which is an expensive process. By using bind parameters, the same query with various literal values can have its plan compiled only once, and used repeatedly with less overhead.
More where clauses:
# another comparison operator
c = select([users], users.c.user_id>7).execute()

# OR keyword
c = users.select(or_(users.c.user_name=='jack', users.c.user_name=='ed')).execute()

# AND keyword
c = users.select(and_(users.c.user_name=='jack', users.c.password=='dog')).execute()

# NOT keyword
c = users.select(not_(
    or_(users.c.user_name=='jack', users.c.password=='dog')
)).execute()

# IN clause
c = users.select(users.c.user_name.in_('jack', 'ed', 'fred')).execute()

# join users and addresses together
c = select([users, addresses], users.c.user_id==addresses.c.address_id).execute()

# join users and addresses together, but don't specify "addresses" in the
# selection criterion.  The WHERE criterion adds it to the FROM list
# automatically.
c = select([users], and_(
    users.c.user_id==addresses.c.user_id,
    users.c.user_name=='fred'
)).execute()
Select statements can also generate a WHERE clause based on the parameters you give it. If a given parameter matches the name of a column or its "label" (the combined tablename + "_" + column name), and does not already correspond to a bind parameter in the select object, it will be added as a comparison against that column. This is a shortcut to creating a full WHERE clause:
# specify a match for the "user_name" column
c = users.select().execute(user_name='ed')

# specify a full where clause for the "user_name" column, as well as a
# comparison for the "user_id" column
c = users.select(users.c.user_name=='ed').execute(user_id=10)
Supported column operators so far are all the numerical comparison operators, i.e. '==', '>', '>=', etc., as well as like(), startswith(), endswith(), between(), and in_(). Boolean operators include not_(), and_() and or_(), which can also be used inline via '~', '&', and '|'. Math operators are '+', '-', '*', '/'. Any custom operator can be specified via the op() function shown below.
# "like" operator users.select(users.c.user_name.like('%ter')) # equality operator users.select(users.c.user_name == 'jane') # in opertator users.select(users.c.user_id.in_(1,2,3)) # and_, endswith, equality operators users.select(and_(addresses.c.street.endswith('green street'), addresses.c.zip=='11234')) # & operator subsituting for 'and_' users.select(addresses.c.street.endswith('green street') & (addresses.c.zip=='11234')) # + concatenation operator select([users.c.user_name + '_name']) # NOT operator users.select(~(addresses.c.street == 'Green Street')) # any custom operator select([users.c.user_name.op('||')('_category')]) # "null" comparison via == (converts to IS) sqlusers.select(users.c.user_name==None).execute()
# or via explicit null() construct sqlusers.select(users.c.user_name==null()).execute()
Functions can be specified using the func keyword:
select([func.count(users.c.user_id)]).execute()

users.select(func.substr(users.c.user_name, 1) == 'J').execute()
Functions also are callable as standalone values:
# call the "now()" function time = func.now(engine=myengine).scalar() # call myfunc(1,2,3) myvalue = func.myfunc(1, 2, 3, engine=db).execute() # or call them off the engine db.func.now().scalar()
You can drop in a literal value anywhere there isn't a column to attach to, via the literal keyword:
select([literal('foo') + literal('bar'), users.c.user_name]).execute()

# literals have all the same comparison functions as columns
select([literal('foo') == literal('bar')], engine=myengine).scalar()
Literals also take an optional type parameter to give literals a type. This can sometimes be significant; for example, when using the "+" operator with SQLite, the String type is detected and the operator is converted to "||":
select([literal('foo', type=String) + 'bar'], engine=e).execute()
The ORDER BY clause of a select statement can be specified via the order_by parameter, as a list of individual columns to order by, with optional usage of the asc() and desc() functions:
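A minimal sketch, using the asc()/desc() functions named above:

# order by user_name descending, then user_id ascending
c = users.select(order_by=[desc(users.c.user_name), asc(users.c.user_id)]).execute()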
DISTINCT, LIMIT, and OFFSET are specified as keyword arguments:
c = select([users.c.user_name], distinct=True).execute()

c = users.select(limit=10, offset=20).execute()
The Oracle driver does not support LIMIT and OFFSET directly, but instead wraps the generated query into a subquery and uses the "rownum" variable to control the rows selected (this is somewhat experimental).
As some of the examples indicated above, a regular inner join can be implicitly stated, just like in a SQL expression, by just specifying the tables to be joined as well as their join conditions:
addresses.select(addresses.c.user_id==users.c.user_id).execute()
There is also an explicit join constructor, which can be embedded into a select query via the from_obj parameter of the select statement:
addresses.select(from_obj=[
    addresses.join(users, addresses.c.user_id==users.c.user_id)
]).execute()
The join constructor can also be used by itself:
join(users, addresses, users.c.user_id==addresses.c.user_id).select().execute()
The join criterion in a join() call is optional. If not specified, the condition will be derived from the foreign key relationships of the two tables. If no criterion can be constructed, an exception will be raised.
join(users, addresses).select().execute()
Notice that this is the first example where the FROM criterion of the select statement is explicitly specified. In most cases, the FROM criterion is automatically determined from the columns requested as well as the WHERE clause. The from_obj keyword argument indicates a list of explicit FROM clauses to be used in the statement.
A join can be created on its own using the join or outerjoin functions, or can be created off of an existing Table or other selectable unit via the join or outerjoin methods:
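A minimal sketch of the method form, using the tables defined earlier:

# create a join off of the users table, then select from it
j = users.outerjoin(addresses, users.c.user_id==addresses.c.user_id)
result = j.select().execute()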
Aliases are used primarily when you want to use the same table more than once as a FROM expression in a statement:
address_b = addresses.alias('addressb')

# select users who have an address on Green street as well as Orange street
users.select(and_(
    users.c.user_id==addresses.c.user_id,
    addresses.c.street.like('%Green%'),
    users.c.user_id==address_b.c.user_id,
    address_b.c.street.like('%Orange%')
)).execute()
SQLAlchemy allows the creation of select statements from not just Table objects, but from a whole class of objects that implement the Selectable interface. This includes Tables, Aliases, Joins, and Selects. Therefore, if you have a Select, you can select from the Select:
>>> s = users.select()
>>> str(s)
SELECT users.user_id, users.user_name, users.password FROM users
>>> s = s.select()
>>> str(s)
SELECT user_id, user_name, password
FROM (SELECT users.user_id, users.user_name, users.password FROM users)
Any Select, Join, or Alias object supports the same column accessors as a Table:
>>> s = users.select()
>>> [c.key for c in s.columns]
['user_id', 'user_name', 'password']
When you use use_labels=True in a Select object, the label version of the column names become the keys of the accessible columns. In effect you can create your own "view objects":
s = select([users, addresses], users.c.user_id==addresses.c.user_id, use_labels=True)

select([s.c.users_user_name, s.c.addresses_street, s.c.addresses_zip],
       s.c.addresses_city=='San Francisco').execute()
To specify a SELECT statement as one of the selectable units in a FROM clause, it usually should be given an alias.
s = users.select().alias('u')
select([addresses, s]).execute()
Select objects can be used in a WHERE condition, in operators such as IN:
# select user ids for all users whose name starts with a "p"
s = select([users.c.user_id], users.c.user_name.like('p%'))

# now select all addresses for those users
addresses.select(addresses.c.user_id.in_(s)).execute()
The sql package supports embedding select statements into other select statements as the criterion in a WHERE condition, or as one of the "selectable" objects in the FROM list of the query. It does not at the moment directly support embedding a SELECT statement as one of the column criterion for a statement, although this can be achieved via direct text insertion, described later.
Subqueries can be used in the column clause of a select statement by specifying the scalar=True flag:
select([table2.c.col1, table2.c.col2,
        select([table1.c.col1], table1.c.col2==7, scalar=True)])
When a select object is embedded inside of another select object, and both objects reference the same table, SQLAlchemy makes the assumption that the table should be correlated from the child query to the parent query. To disable this behavior, specify the flag correlate=False to the Select statement.
# make an alias of a regular select.
s = select([addresses.c.street], addresses.c.user_id==users.c.user_id).alias('s')
>>> str(s)
SELECT addresses.street FROM addresses, users
WHERE addresses.user_id = users.user_id

# now embed that select into another one.  the "users" table is removed from
# the embedded query's FROM list and is instead correlated to the parent query
s2 = select([users, s.c.street])
>>> str(s2)
SELECT users.user_id, users.user_name, users.password, s.street
FROM users, (SELECT addresses.street FROM addresses
WHERE addresses.user_id = users.user_id) s
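Conversely, a minimal sketch of disabling that behavior via the correlate=False flag mentioned above:

# the embedded query keeps 'users' in its own FROM list instead of
# correlating it to the enclosing statement
s = select([addresses.c.street], addresses.c.user_id==users.c.user_id,
           correlate=False).alias('s')
s2 = select([users, s.c.street])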
An EXISTS clause can function as a higher-scaling version of an IN clause, and is usually used in a correlated fashion:
# find all users who have an address on Green street:
users.select(
    exists(
        [addresses.c.address_id],
        and_(
            addresses.c.user_id==users.c.user_id,
            addresses.c.street.like('%Green%')
        )
    )
)
Unions come in two flavors, UNION and UNION ALL, which are available via module level functions or methods off a Selectable:
union(
    addresses.select(addresses.c.street=='123 Green Street'),
    addresses.select(addresses.c.street=='44 Park Ave.'),
    addresses.select(addresses.c.street=='3 Mill Road'),
    order_by=[addresses.c.street]
).execute()

users.select(
    users.c.user_id==7
).union_all(
    users.select(users.c.user_id==9),
    order_by=[users.c.user_id]  # order_by is an argument to union_all()
).execute()
Throughout all these examples, SQLAlchemy is busy creating bind parameters wherever literal expressions occur. You can also specify your own bind parameters with your own names, and use the same statement repeatedly. The bind parameters, shown here in the "named" format, will be converted to the appropriate named or positional style according to the database implementation being used.
s = users.select(users.c.user_name==bindparam('username'))

# execute implicitly
s.execute(username='fred')

# execute explicitly
conn = engine.connect()
conn.execute(s, username='fred')
executemany() is also available by supplying multiple dictionary arguments instead of keyword arguments to the execute() method of ClauseElement or Connection. Examples can be found later in the sections on INSERT/UPDATE/DELETE.
By throwing the compile() method onto the end of any query object, the query can be "compiled" by the SQLEngine into a sqlalchemy.sql.Compiled object just once, and the resulting compiled object reused, which eliminates repeated internal compilation of the SQL string:
s = users.select(users.c.user_name==bindparam('username')).compile()
s.execute(username='fred')
s.execute(username='jane')
s.execute(username='mary')
The sql package tries to allow free textual placement in as many ways as possible. In the examples below, note that the from_obj parameter is used only when no other information exists within the select object with which to determine table metadata. Also note that in a query where there isn't even table metadata used, the SQLEngine to be used for the query has to be explicitly specified:
# strings as column clauses
select(["user_id", "user_name"], from_obj=[users]).execute()

# strings for full column lists
select(["user_id, user_name, password, addresses.*"],
       from_obj=[users.alias('u'), addresses]).execute()

# functions, etc.
select([users.c.user_id, "process_string(user_name)"]).execute()

# where clauses
users.select(and_(users.c.user_id==7, "process_string(user_name)=27")).execute()

# subqueries
users.select(
    "exists (select 1 from addresses where addresses.user_id=users.user_id)").execute()

# custom FROM objects
select(["*"], from_obj=["(select user_id, user_name from users)"], engine=db).execute()

# a full query
text("select user_name from users", engine=db).execute()
Use the format ':paramname' to define bind parameters inside of a text block. They will be converted to the appropriate format upon compilation:
t = engine.text("select foo from mytable where lala=:hoho") r = t.execute(hoho=7)
Bind parameters can also be explicit, which allows typing information to be added. Just specify them as a list with keys that match those inside the textual statement:
t = engine.text("select foo from mytable where lala=:hoho", bindparams=[bindparam('hoho', type=types.String)]) r = t.execute(hoho="im hoho")
Result-row type processing can be added via the typemap argument, which is a dictionary of return columns mapped to types:
# specify DateTime type for the 'foo' column in the result set
# sqlite, for example, uses result-row post-processing to construct dates
t = engine.text("select foo from mytable where lala=:hoho",
                bindparams=[bindparam('hoho', type=types.String)],
                typemap={'foo':types.DateTime})
r = t.execute(hoho="im hoho")

# 'foo' is a datetime
year = r.fetchone()['foo'].year
One of the primary motivations for a programmatic SQL library is to allow the piecemeal construction of a SQL statement based on program variables. All the above examples typically show Select objects being created all at once. The Select object also includes "builder" methods to allow building up an object. The below example is a "user search" function, where users can be selected based on primary key, user name, street address, keywords, or any combination:
def find_users(id=None, name=None, street=None, keyword_names=None):
    # note the parameter is named keyword_names so it does not shadow
    # the 'keywords' table defined earlier
    statement = users.select()
    if id is not None:
        statement.append_whereclause(users.c.user_id==id)
    if name is not None:
        statement.append_whereclause(users.c.user_name==name)
    if street is not None:
        # append_whereclause joins "WHERE" conditions together with AND
        statement.append_whereclause(users.c.user_id==addresses.c.user_id)
        statement.append_whereclause(addresses.c.street==street)
    if keyword_names is not None:
        statement.append_from(
            users.join(userkeywords, users.c.user_id==userkeywords.c.user_id).join(
                keywords, userkeywords.c.keyword_id==keywords.c.keyword_id))
        statement.append_whereclause(keywords.c.name.in_(*keyword_names))
        # to avoid multiple repeats, set query to be DISTINCT:
        statement.distinct = True
    return statement.execute()

find_users(id=7)

find_users(street='123 Green Street')

find_users(name='Jack', keyword_names=['jack','foo'])
An INSERT involves just one table. The Insert object is used via the insert() function, and the specified columns determine what columns show up in the generated SQL. If primary key columns are left out of the criterion, the SQL generator will try to populate them as specified by the particular database engine and sequences, i.e. relying upon an auto-incremented column or explicitly calling a sequence beforehand. Insert statements, as well as updates and deletes, can also execute multiple parameters in one pass via specifying an array of dictionaries as parameters.
The values to be populated for an INSERT or an UPDATE can be specified to the insert()/update() functions as the values named argument, or the query will be compiled based on the values of the parameters sent to the execute() method.
# basic insert
users.insert().execute(user_id=1, user_name='jack', password='asdfdaf')

# insert just user_name, NULL for others
# will auto-populate primary key columns if they are configured
# to do so
users.insert().execute(user_name='ed')

# INSERT with a list:
users.insert(values=(3, 'jane', 'sdfadfas')).execute()

# INSERT with user-defined bind parameters
i = users.insert(
    values={'user_name':bindparam('name'), 'password':bindparam('pw')}
)
i.execute(name='mary', pw='adas5fs')

# INSERT many - if no explicit 'values' parameter is sent,
# the first parameter list in the list determines
# the generated SQL of the insert (i.e. what columns are present)
# executemany() is used at the DBAPI level
users.insert().execute(
    {'user_id':7, 'user_name':'jack', 'password':'asdfasdf'},
    {'user_id':8, 'user_name':'ed', 'password':'asdffcadf'},
    {'user_id':9, 'user_name':'fred', 'password':'asttf'},
)
Updates work a lot like INSERTs, except there is an additional WHERE clause that can be specified.
# change 'jack' to 'ed'
users.update(users.c.user_name=='jack').execute(user_name='ed')

# use bind parameters
u = users.update(users.c.user_name==bindparam('name'),
                 values={'user_name':bindparam('newname')})
u.execute(name='jack', newname='ed')

# update a column to another column
users.update(values={users.c.password:users.c.user_name}).execute()

# multi-update
users.update(users.c.user_id==bindparam('id')).execute(
    {'id':7, 'user_name':'jack', 'password':'fh5jks'},
    {'id':8, 'user_name':'ed', 'password':'fsr234ks'},
    {'id':9, 'user_name':'mary', 'password':'7h5jse'},
)
A correlated update lets you update a table using selection from another table, or the same table:
s = select([addresses.c.city], addresses.c.user_id==users.c.user_id)
users.update(
    and_(users.c.user_id>10, users.c.user_id<20),
    values={users.c.user_name:s}
).execute()
A delete is formulated like an update, except there's no values parameter:
users.delete(users.c.user_id==7).execute()

users.delete(users.c.user_name.like(bindparam('name'))).execute(
    {'name':'%Jack%'},
    {'name':'%Ed%'},
    {'name':'%Jane%'},
)

# delete based on an EXISTS criterion (the criterion here is illustrative)
users.delete(exists([addresses.c.address_id],
                    addresses.c.user_id==users.c.user_id)).execute()
Data mapping describes the process of defining Mapper objects, which associate table metadata with user-defined classes.
The Mapper's role is to perform SQL operations upon the database, associating individual table rows with instances of those classes, and individual database columns with properties upon those instances, to transparently associate in-memory objects with a persistent database representation.

When a Mapper is created to associate a Table object with a class, all of the columns defined in the Table object are associated with the class via property accessors, which add overriding functionality to the normal process of setting and getting object attributes. These property accessors keep track of changes to object attributes; these changes will be stored to the database when the application "flushes" the current state of objects (known as a Unit of Work).
Two objects provide the primary interface for interacting with Mappers and the "unit of work": the Query object and the Session object. Query deals with selecting objects from the database, whereas Session provides a context for loaded objects and the ability to communicate changes on those objects back to the database.
The primary method on Query for loading objects is its select() method, which has similar arguments to a sqlalchemy.sql.Select object. But this select method executes automatically and returns results, instead of awaiting an execute() call; and instead of returning a cursor-like object, it returns an array of objects.
The three configurational elements to be defined, i.e. the Table metadata, the user-defined class, and the Mapper, are typically defined as module-level variables, and may be defined in any fashion suitable to the application, with the only requirement being that the class and table metadata are described before the mapper. For the sake of example, we will be defining these elements close together, but this should not be construed as a requirement; since SQLAlchemy is not a framework, those decisions are left to the developer or an external framework.
Also, keep in mind that the examples in this section deal with explicit Session objects mapped directly to Engine objects, which represents the most explicit style of using the ORM. Options exist for how this is configured, including binding Table objects directly to Engines (described in Binding MetaData to an Engine), as well as using the "Threadlocal" plugin which provides various code shortcuts by using an implicit Session associated with the current thread (described in threadlocal).
First, the metadata/mapper configuration code:
from sqlalchemy import *

# metadata
meta = MetaData()

# table object
users_table = Table('users', meta,
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(16)),
    Column('password', String(20))
)

# class definition
class User(object):
    pass

# create a mapper and associate it with the User class.
# technically we don't really need the 'usermapper' variable.
usermapper = mapper(User, users_table)
Note that no database definitions are required. Next we will define an Engine and connect a Session to it, and perform a simple select:
The method session.query(class_or_mapper) returns a Query object. Below is a synopsis of things you can do with Query:
# get a query from a Session based on class:
query = session.query(User)

# get a query from a Session given a Mapper:
query = session.query(usermapper)

# select_by, which takes keyword arguments.  the keyword arguments
# represent property names and the values represent values which will
# be compared via the = operator.  the comparisons are joined together
# via "AND".
result = query.select_by(name='john', street='123 green street')

# select_by can also combine ClauseElements with key/value properties.
# all ClauseElements and keyword-based criterion are combined together
# via "AND".
result = query.select_by(users_table.c.user_name=='john',
                         addresses_table.c.zip_code=='12345',
                         street='123 green street')

# get_by, which takes the same arguments as select_by.
# returns a single scalar result or None if no results
user = query.get_by(id=12)

# "dynamic" versions of select_by and get_by - everything past the
# "select_by_" or "get_by_" is used as the key, and the function argument
# as the value
result = query.select_by_name('fred')
u = query.get_by_name('fred')

# get an object directly from its primary key.  this will bypass the SQL
# call if the object has already been loaded
u = query.get(15)

# get an object that has a composite primary key of three columns.
# the order of the arguments matches that of the table meta data.
myobj = query.get((27, 3, 'receipts'))

# using a WHERE criterion
result = query.select(or_(users_table.c.user_name == 'john',
                          users_table.c.user_name=='fred'))

# using a WHERE criterion to get a scalar
u = query.selectfirst(users_table.c.user_name=='john')

# selectone() is a stricter version of selectfirst() which
# will raise an exception if there is not exactly one row
u = query.selectone(users_table.c.user_name=='john')

# using a full select object
result = query.select(users_table.select(users_table.c.user_name=='john'))
Some of the examples above illustrate the usage of the mapper's Table object to provide the columns for a WHERE clause. These columns are also accessible off of the mapped class directly. When a mapper is assigned to a class, it also attaches a special property accessor c to the class itself, which can be used just like the table metadata to access the columns of the table:
userlist = session.query(User).select(User.c.user_id==12)
When objects corresponding to mapped classes are created or manipulated, all changes are logged by the Session object. The changes are then written to the database when an application calls flush(). This pattern is known as a Unit of Work, and has many advantages over saving individual objects or attributes on those objects with individual method invocations. Domain models can be built with far greater complexity with no concern over the order of saves and deletes, excessive database round-trips and write operations, or deadlocking issues. The flush() operation batches its SQL statements into a transaction, and can also perform optimistic concurrency checks (using a version id column) to ensure the proper number of rows were in fact affected (not supported with the current MySQL drivers).
The Unit of Work is a powerful tool, and has some important concepts that should be understood in order to use it effectively. See the Session / Unit of Work section for a full description on all its operations.
When a mapper is created, the target class has its mapped properties decorated by specialized property accessors that track changes. New objects by default must be explicitly added to the Session; however, this can be made automatic by using threadlocal or SessionContext.
mapper(User, users_table)

# create a new User
myuser = User()
myuser.user_name = 'jane'
myuser.password = 'hello123'

# create another new User
myuser2 = User()
myuser2.user_name = 'ed'
myuser2.password = 'lalalala'

# create a Session and save them
sess = create_session()
sess.save(myuser)
sess.save(myuser2)

# load a third User from the database
myuser3 = sess.query(User).select(User.c.user_name=='fred')[0]
myuser3.user_name = 'fredjones'

# save all changes
sess.flush()
The mapped class can also specify whatever methods and/or constructor it wants:
class User(object):
    def __init__(self, user_name, password):
        self.user_id = None
        self.user_name = user_name
        self.password = password
    def get_name(self):
        return self.user_name
    def __repr__(self):
        return "User id %s name %s password %s" % (
            repr(self.user_id), repr(self.user_name), repr(self.password))

mapper(User, users_table)

sess = create_session()
u = User('john', 'foo')
sess.save(u)
sess.flush()

>>> u
User id 1 name 'john' password 'foo'
Note that the __init__() method is not called when the instance is loaded. This is so that classes can define operations that are specific to their initial construction which are not re-called when the object is restored from the database, and is similar in concept to how Python's pickle module calls __new__() when deserializing instances. To allow __init__() to be called at object load time, or to define any other sort of on-load operation, create a MapperExtension which supplies the create_instance() method (see Extending Mapper, as well as the example in the FAQ).
SQLAlchemy will only put modified object attributes into the UPDATE statements generated upon flush. This is to conserve database traffic and also to successfully interact with a "deferred" attribute, which is a mapped object attribute against the mapper's primary table that isn't loaded until referenced by the application.
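A minimal sketch of this behavior, continuing the mapping above:

# load a user and modify a single attribute; the flush() issues an
# UPDATE containing only the 'password' column
u = sess.query(User).get_by_name('john')
u.password = 'newpassword'
sess.flush()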
So that covers how to map the columns in a table to an object, how to load objects, create new ones, and save changes. The next step is how to define an object's relationships to other database-persisted objects. This is done via the relation function provided by the orm module.
So with our User class, let's also define the User as having one or more mailing addresses. First, the table metadata:
from sqlalchemy import *

metadata = MetaData()

# define user table
users_table = Table('users', metadata,
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(16)),
    Column('password', String(20))
)

# define user address table
addresses_table = Table('addresses', metadata,
    Column('address_id', Integer, primary_key=True),
    Column('user_id', Integer, ForeignKey("users.user_id")),
    Column('street', String(100)),
    Column('city', String(80)),
    Column('state', String(2)),
    Column('zip', String(10))
)
Of importance here is the addresses table's definition of a foreign key relationship to the users table, relating the user_id column into a parent-child relationship. When a Mapper wants to indicate a relation of one object to another, the ForeignKey relationships are the default method by which the relationship is determined (options also exist to describe the relationships explicitly).
So then let's define two classes: the familiar User class, as well as an Address class:
class User(object):
    def __init__(self, user_name, password):
        self.user_name = user_name
        self.password = password

class Address(object):
    def __init__(self, street, city, state, zip):
        self.street = street
        self.city = city
        self.state = state
        self.zip = zip
And then a Mapper that will define a relationship of the User and the Address classes to each other as well as their table metadata. We will add an additional mapper keyword argument properties, which is a dictionary relating the names of class attributes to database relationships, in this case a relation object against a newly defined mapper for the Address class:
mapper(Address, addresses_table)
mapper(User, users_table, properties = {
    'addresses' : relation(Address)
})
Let's do some operations with these classes and see what happens:
engine = create_engine('sqlite:///mydb.db')
metadata.create_all(engine)

session = create_session(bind_to=engine)

u = User('jane', 'hihilala')
u.addresses.append(Address('123 anywhere street', 'big city', 'UT', '76543'))
u.addresses.append(Address('1 Park Place', 'some other city', 'OK', '83923'))

session.save(u)
session.flush()
A lot just happened there! The Mapper figured out how to relate rows in the addresses table to the users table, and also upon flush had to determine the proper order in which to insert rows. After the insert, all the User and Address objects have their new primary and foreign key attributes populated.
Also notice that when we created a Mapper on the User class which defined an addresses relation, the newly created User instance magically had an "addresses" attribute which behaved like a list. This list is in reality a property function which returns an instance of sqlalchemy.util.HistoryArraySet. This object fulfills the full set of Python list accessors, but maintains a unique set of objects (based on their in-memory identity), and also tracks additions and deletions to the list:
del u.addresses[1]
u.addresses.append(Address('27 New Place', 'Houston', 'TX', '34839'))

session.flush()
Note that when creating a relation with the relation() function, the target can either be a class, in which case the primary mapper for that class is used as the target, or a Mapper instance itself, as returned by the mapper() function.
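The two alternatives, shown side by side as a minimal sketch (only one form would be used in practice):

# target as a class; the primary mapper for Address is located
mapper(User, users_table, properties = {
    'addresses' : relation(Address)
})

# or, target as a Mapper instance returned by mapper()
address_mapper = mapper(Address, addresses_table)
mapper(User, users_table, properties = {
    'addresses' : relation(address_mapper)
})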
In the previous example, a single address was removed from the addresses attribute of a User object, resulting in the corresponding database row being updated to have a user_id of None. But now there's a mailing address with no user_id floating around in the database, of no use to anyone. How can we avoid this? This is achieved by using the cascade parameter of relation:
# clear mappers from the previous example
clear_mappers()

mapper(Address, addresses_table)
mapper(User, users_table, properties = {
    'addresses' : relation(Address, cascade="all, delete-orphan")
})

del u.addresses[1]
u.addresses.append(Address('27 New Place', 'Houston', 'TX', '34839'))

session.flush()
In this case, with the delete-orphan cascade rule set, the element that was removed from the addresses list was also removed from the database. Specifying cascade="all, delete-orphan" means that every persistence operation performed on the parent object will be cascaded to the child object or objects handled by the relation, and additionally that each child object cannot exist without being attached to a parent. Such a relationship indicates that the lifecycle of the Address objects is bounded by that of their parent User object.
Cascading is described fully in Cascade rules.
By creating relations with the backref keyword, a bi-directional relationship can be created which will keep both ends of the relationship updated automatically, independently of database operations. Below, the User mapper is created with an addresses property, and the corresponding Address mapper receives a "backreference" to the User object via the property name user:
mapper(Address, addresses_table)
mapper(User, users_table, properties = {
    'addresses' : relation(Address, backref='user')
})

u = User('fred', 'hi')
a1 = Address('123 anywhere street', 'big city', 'UT', '76543')
a2 = Address('1 Park Place', 'some other city', 'OK', '83923')

# append a1 to u
u.addresses.append(a1)

# attach u to a2
a2.user = u

# the bi-directional relation is maintained
>>> u.addresses == [a1, a2]
True
>>> a1.user is u and a2.user is u
True
The backreference feature also works with many-to-many relationships, which are described later. When creating a backreference, a corresponding property (i.e. a second relation()) is placed on the child mapper. The default arguments to this property can be overridden using the backref() function:
mapper(User, users_table)
mapper(Address, addresses_table, properties={
    'user' : relation(User, backref=backref('addresses', cascade="all, delete-orphan"))
})
The backref() function is often used to set up a bi-directional one-to-one relationship. This is because the relation() function by default creates a "one-to-many" relationship when presented with a primary key/foreign key relationship, but the backref() function can redefine the uselist property to make it a scalar:
mapper(User, users_table)
mapper(Address, addresses_table, properties={
    'user' : relation(User, backref=backref('address', uselist=False))
})
We've seen how the relation specifier affects the saving of an object and its child items; how does it affect selecting them? By default, the relation keyword indicates that the related property should be populated by a lazy loader when instances of the parent object are loaded from the database; this is just a callable function that, when accessed, will invoke a second SQL query to load the child objects of the parent.
# define a mapper
mapper(User, users_table, properties = {
    'addresses' : relation(mapper(Address, addresses_table))
})

# select users where username is 'jane', get the first element of the list.
# this will incur a load operation for the parent table.
user = session.query(User).select(User.c.user_name=='jane')[0]

# iterate through the User object's addresses.  this will incur an
# immediate load of those child items.
for a in user.addresses:
    print repr(a)
When using mappers that have relationships to other mappers, the need arises to specify query criteria across multiple tables. SQLAlchemy provides several core techniques which offer this functionality.
When specifying columns to the select method of Query, if the columns are attached to a table other than the mapped table, that table is automatically added to the "FROM" clause of the query. This is the same behavior that occurs when creating a non-ORM select object. Using this feature, joins can be created when querying:
l = session.query(User).select(and_(users.c.user_id==addresses.c.user_id,
                                    addresses.c.street=='123 Green Street'))
Above, we specified selection criteria that included columns from both the users and the addresses tables. Note that in this case, we had to specify not just the matching condition to the street column on addresses, but also the join condition between the users and addresses tables. Even though the User mapper has a relationship to the Address mapper where the join condition is known, the select method does not assume how you want to construct joins. If we did not include the join clause, we would get:
# this is usually not what you want to do
l = session.query(User).select(addresses.c.street=='123 Green Street')
The above query will return all rows of the users table, even those that do not correspond to the addresses table, in a cartesian product with the matching rows from the addresses table.
Another way to specify joins more explicitly is to use the from_obj parameter of select(). This allows you to explicitly place elements in the FROM clause of the query, which could include lists of tables and/or Join constructs:
l = session.query(User).select(addresses.c.street=='123 Green Street',
                               from_obj=[users.join(addresses)])
In the above example, the join function by default creates a natural join between the two tables, so we were able to avoid having to specify the join condition between users and addresses explicitly.
Another way that joins can be created is by using the select_by method of Query, which has the ability to create joins across relationships automatically. This method is in many circumstances more convenient than, but not as flexible as, the more SQL-level approach using the select() method described in the previous section.
To issue a join using select_by, just specify a key in the argument list which is not present in the primary mapper's list of properties or columns, but is present in the property list of some relationship down the line of objects. The Query object will recursively traverse along the mapped relationships starting with the lead class and descending into child classes, until it finds a property matching the given name. For each new mapper it encounters along the path to the located property, it constructs a join across that relationship:
l = session.query(User).select_by(street='123 Green Street')
The above example is shorthand for:
l = session.query(User).select(and_(Address.c.user_id==User.c.user_id,
                                    Address.c.street=='123 Green Street'))
All keyword arguments sent to select_by are used to create query criteria. This means that familiar select keyword options like order_by and limit are not directly available. To enable these options with select_by, you can try the SelectResults extension, which offers transitive methods off the result of a select or select_by, such as order_by(), limit(), etc.
Note that the select_by method, while it primarily uses keyword arguments, can also accommodate ClauseElement objects positionally; recall that a ClauseElement is generated when producing a comparison off of a Column expression, such as users.c.name=='ed'. When using ClauseElements with select_by, these clauses are passed directly to the generated SQL and are not used to further locate join criteria. If criteria are being constructed with these kinds of expressions, consider using the select() method, which is better designed to accommodate them.
As of SA 0.3.4, select_by() and related functions can compare not only column-based attributes to column-based values, but also relations to object instances:
# get an instance of Address
someaddress = session.query(Address).get_by(street='123 Green Street')

# look for User instances which have the
# "someaddress" instance in their "addresses" collection
l = session.query(User).select_by(addresses=someaddress)
Above, the comparison denoted by addresses=someaddress is constructed by comparing all the primary key columns in the Address mapper to each corresponding primary key value in the someaddress entity. In other words, it's equivalent to saying select_by(address_id=someaddress.address_id).
Feature Status: Alpha API
The join_to method of Query is a component of the select_by operation, and is given a keyname in order to return a "join path" from the Query's mapper to the mapper which is referenced by a relation() of the given name:
>>> q = session.query(User)
>>> j = q.join_to('addresses')
>>> print j
users.user_id=addresses.user_id
join_to can also be given the name of a column-based property, in which case it will locate a path to the nearest mapper which has that property as a column:
>>> q = session.query(User)
>>> j = q.join_to('street')
>>> print j
users.user_id=addresses.user_id
Also available is the join_via function, which is similar to join_to, except instead of traversing through all properties to find a path to the given key, it's given an explicit path to the target property:
>>> q = session.query(User)
>>> j = q.join_via(['orders', 'items'])
>>> print j
users.c.user_id==orders.c.user_id AND orders.c.item_id==items.c.item_id
Expressions produced by join_to and join_via can be used with select to create more complicated query criteria across multiple relations:
>>> l = q.select(
...     (addresses_table.c.street=='some address') &
...     (items_table.c.item_name=='item #4') &
...     q.join_to('addresses') &
...     q.join_via(['orders', 'items'])
... )
Note that the from_obj parameter of select(), described previously, allows finer-grained control of joins, allowing any combination of inner and outer joins.
Eager Loading describes the loading of parent and child objects across a relation using a single query. The purpose of eager loading is strictly one of performance enhancement; eager loading has no impact on the results of a query, except that when traversing child objects within the results, lazy loaders will not need to issue separate queries to load those child objects.
Eager Loading is enabled on a per-relationship basis, either as the default for a particular relationship, or for a single query using query options, described later.
With just a single parameter, lazy=False, specified to the relation object, the parent and child SQL queries can be joined together:
mapper(Address, addresses_table)
mapper(User, users_table, properties = {
    'addresses' : relation(Address, lazy=False)
})

users = session.query(User).select(User.c.user_name=='Jane')
for u in users:
    print repr(u)
    for a in u.addresses:
        print repr(a)
Above, a pretty ambitious query is generated just by specifying that the User should be loaded with its child Addresses in one query. When the mapper processes the results, it uses an Identity Map to keep track of objects that were already loaded, based on their primary key identity. Through this method, the redundant rows produced by the join are organized into the distinct object instances they represent.
Recall that eager loading has no impact on the results of the query. What if our query included our own join criteria? The eager loading query accommodates this using aliases, and is immune to the effects of additional joins being specified in the original query. To use our select_by example above, joining against the "addresses" table to locate users with a certain street results in this behavior:
users = session.query(User).select_by(street='123 Green Street')
The join implied by passing the "street" parameter is separate from the join produced by the eager join, which is "aliasized" to prevent conflicts.
The options method on the Query object provides an easy way to get alternate forms of a mapper query from an original one. The most common use of this feature is to change the "eager/lazy" loading behavior of a particular mapper, via the functions eagerload(), lazyload() and noload():
# user mapper with lazy addresses
mapper(User, users_table, properties = {
    'addresses' : relation(mapper(Address, addresses_table))
})

# query object
query = session.query(User)

# make an eager loading query
eagerquery = query.options(eagerload('addresses'))
u = eagerquery.select()

# make another query that won't load the addresses at all
plainquery = query.options(noload('addresses'))

# multiple options can be specified
myquery = query.options(lazyload('tracker'), noload('streets'), eagerload('members'))

# to specify a relation on a relation, separate the property names by a "."
myquery = query.options(eagerload('orders.items'))
The above examples focused on the "one-to-many" relationship. Other forms of relationship are just as easy, as the relation function can usually figure out what you want:
metadata = MetaData()

# a table to store a user's preferences for a site
prefs_table = Table('user_prefs', metadata,
    Column('pref_id', Integer, primary_key = True),
    Column('stylename', String(20)),
    Column('save_password', Boolean, nullable = False),
    Column('timezone', CHAR(3), nullable = False)
)

# user table with a 'preference_id' column
users_table = Table('users', metadata,
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(16), nullable = False),
    Column('password', String(20), nullable = False),
    Column('preference_id', Integer, ForeignKey("user_prefs.pref_id"))
)

# engine and some test data
engine = create_engine('sqlite:///', echo=True)
metadata.create_all(engine)
engine.execute(prefs_table.insert(), dict(pref_id=1, stylename='green',
                                          save_password=1, timezone='EST'))
engine.execute(users_table.insert(), dict(user_name='fred', password='45nfss',
                                          preference_id=1))

# classes
class User(object):
    def __init__(self, user_name, password):
        self.user_name = user_name
        self.password = password

class UserPrefs(object):
    pass

mapper(UserPrefs, prefs_table)

mapper(User, users_table, properties = dict(
    preferences = relation(UserPrefs, lazy=False, cascade="all, delete-orphan"),
))

# select
session = create_session(bind_to=engine)
user = session.query(User).get_by(user_name='fred')
save_password = user.preferences.save_password

# modify
user.preferences.stylename = 'bluesteel'

# flush
session.flush()
The relation function handles a basic many-to-many relationship when you specify the association table:
metadata = MetaData()

articles_table = Table('articles', metadata,
    Column('article_id', Integer, primary_key = True),
    Column('headline', String(150), key='headline'),
    Column('body', TEXT, key='body'),
)

keywords_table = Table('keywords', metadata,
    Column('keyword_id', Integer, primary_key = True),
    Column('keyword_name', String(50))
)

itemkeywords_table = Table('article_keywords', metadata,
    Column('article_id', Integer, ForeignKey("articles.article_id")),
    Column('keyword_id', Integer, ForeignKey("keywords.keyword_id"))
)

engine = create_engine('sqlite:///')
metadata.create_all(engine)

# class definitions
class Keyword(object):
    def __init__(self, name):
        self.keyword_name = name

class Article(object):
    pass

mapper(Keyword, keywords_table)

# define a mapper that does many-to-many on the 'itemkeywords' association
# table
mapper(Article, articles_table, properties = dict(
    keywords = relation(Keyword, secondary=itemkeywords_table, lazy=False)
))

session = create_session(bind_to=engine)

article = Article()
article.headline = 'a headline'
article.body = 'this is the body'
article.keywords.append(Keyword('politics'))
article.keywords.append(Keyword('entertainment'))
session.save(article)

session.flush()
# select articles based on a keyword.  select_by will handle the extra joins.
articles = session.query(Article).select_by(keyword_name='politics')

a = articles[0]

# clear out keywords with a new list
a.keywords = []
a.keywords.append(Keyword('topstories'))
a.keywords.append(Keyword('government'))

# flush
session.flush()
Many-to-many can also be done with an association object, which adds additional information about how two items are related. In this pattern, the "secondary" option to relation() is no longer used; instead, the association object becomes a mapped entity itself, mapped to the association table. If the association table has no explicit primary key columns defined, you also have to tell the mapper what columns will compose its "primary key", which are typically the two (or more) columns involved in the association. Also, the relation between the parent and association mapping is typically set up with a cascade of all, delete-orphan. This is to ensure that when an association object is removed from its parent collection, it is deleted (otherwise, the unit of work tries to null out one of the foreign key columns, which raises an error condition since that column is also part of its "primary key").
from sqlalchemy import *

metadata = MetaData()

users_table = Table('users', metadata,
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(16), nullable = False),
)

articles_table = Table('articles', metadata,
    Column('article_id', Integer, primary_key = True),
    Column('headline', String(150), key='headline'),
    Column('body', TEXT, key='body'),
)

keywords_table = Table('keywords', metadata,
    Column('keyword_id', Integer, primary_key = True),
    Column('keyword_name', String(50))
)

# add "attached_by" column which will reference the user who attached this keyword
itemkeywords_table = Table('article_keywords', metadata,
    Column('article_id', Integer, ForeignKey("articles.article_id")),
    Column('keyword_id', Integer, ForeignKey("keywords.keyword_id")),
    Column('attached_by', Integer, ForeignKey("users.user_id"))
)

engine = create_engine('sqlite:///', echo=True)
metadata.create_all(engine)

# class definitions
class User(object):
    pass

class Keyword(object):
    def __init__(self, name):
        self.keyword_name = name

class Article(object):
    pass

class KeywordAssociation(object):
    pass

# Article mapper, relates to Keyword via KeywordAssociation
mapper(Article, articles_table, properties={
    'keywords' : relation(KeywordAssociation, lazy=False, cascade="all, delete-orphan")
})

# mapper for KeywordAssociation
# specify "primary key" columns manually
mapper(KeywordAssociation, itemkeywords_table,
    primary_key=[itemkeywords_table.c.article_id, itemkeywords_table.c.keyword_id],
    properties={
        'keyword' : relation(Keyword, lazy=False),
        'user' : relation(User, lazy=False)
    }
)

# user mapper
mapper(User, users_table)

# keyword mapper
mapper(Keyword, keywords_table)

session = create_session(bind_to=engine)

# select by keyword
alist = session.query(Article).select_by(keyword_name='jacks_stories')

# user is available
for a in alist:
    for k in a.keywords:
        if k.keyword.keyword_name == 'jacks_stories':
            print k.user.user_name
Keep in mind that the association object works a little differently from a plain many-to-many relationship. Members have to be added to the list via instances of the association object, which in turn point to the associated object:
user = User()
user.user_name = 'some user'

article = Article()

assoc = KeywordAssociation()
assoc.keyword = Keyword('blue')
assoc.user = user

assoc2 = KeywordAssociation()
assoc2.keyword = Keyword('green')
assoc2.user = user

article.keywords.append(assoc)
article.keywords.append(assoc2)

session.save(article)
session.flush()
SQLAlchemy includes an extension module which can be used in some cases to decrease the explicitness of the association object pattern; this extension is described in associationproxy.
The concept behind Unit of Work is to track modifications to a set of objects, and then be able to flush those changes to the database in a single operation. There are a lot of advantages to this: your application doesn't need to worry about individual save operations on objects, nor about the required order for those operations, nor about excessive repeated calls to save operations that would be more efficiently aggregated into one step. It also simplifies database transactions, providing a neat package with which to insert into the traditional database begin/commit phase.
SQLAlchemy's unit of work functionality is provided by the Session object. Transactional capability, which rides on top of the transactions provided by Engine objects, is provided by the SessionTransaction object.
When dealing with mapped instances with regard to Sessions, an instance may be attached or unattached to a particular Session. An instance also may or may not correspond to an actual row in the database. The product of these two binary conditions yields four general states a particular instance can have within the perspective of the Session:
Transient - a transient instance exists within memory only and is not associated with any Session. It also has no database identity and does not have a corresponding record in the database. When a new instance of a class is constructed, and no default session context exists with which to automatically attach the new instance, it is a transient instance. The instance can then be saved to a particular session in which case it becomes a pending instance. If a default session context exists, new instances are added to that Session by default and therefore become pending instances immediately.
Pending - a pending instance is a Session-attached object that has not yet been assigned a database identity. When the Session is flushed (i.e. changes are persisted to the database), a pending instance becomes persistent.
Persistent - a persistent instance has a database identity and a corresponding record in the database, and is also associated with a particular Session. By "database identity" we mean the object is associated with a table or relational concept in the database combined with a particular primary key in that table. Objects that are loaded by SQLAlchemy in the context of a particular session are automatically considered persistent, as are formerly pending instances which have been subject to a session flush().
Detached - a detached instance is an instance which has a database identity and a corresponding row in the database, but is not attached to any Session. This occurs when an instance has been removed from a Session, either because the session itself was cleared or closed, or because the instance was explicitly removed from the Session. The object can be re-attached to a session, in which case it becomes persistent again. Detached instances are useful when an application needs to represent a long-running operation across multiple Sessions, needs to store an object in a serialized state and then restore it later (such as within an HTTP "session" object), or in some cases where code needs to load instances locally which will later be associated with some other Session.
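As a rough sketch of how a single instance moves between these states, assuming the User class and mapper configured earlier in this tutorial:

session = create_session()

u = User('ed', 'fullstack')   # transient: no session, no database identity
session.save(u)               # pending: attached to the session, not yet flushed
session.flush()               # persistent: row inserted, primary key assigned

session.expunge(u)            # detached: has a database identity, but no session
session.update(u)             # persistent again: re-attached to the session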
A new Session object is constructed via the create_session() function:
session = create_session()
A common option used with create_session() is to specify a specific Engine or Connection to be used for all operations performed by this Session:
# create an engine
e = create_engine('postgres://some/url')

# create a Session that will use this engine for all operations.
# it will open and close Connections as needed.
session = create_session(bind_to=e)

# open a Connection
conn = e.connect()

# create a Session that will use this specific Connection for all operations
session = create_session(bind_to=conn)
The session to which an object is attached can be acquired via the object_session() function, which returns the appropriate Session if the object is pending or persistent, or None if the object is transient or detached:
session = object_session(obj)
We will now cover some of the key concepts used by Sessions and their underlying Unit of Work.
A primary concept of the Session's underlying Unit of Work is that it keeps track of all persistent instances; recall that a persistent instance has a database identity and is attached to a Session. In particular, the Unit of Work must ensure that only one copy of a particular persistent instance exists within the Session at any given time. The UOW accomplishes this task using a dictionary known as an Identity Map.
When a Query is used to issue select or get requests to the database, it will in nearly all cases result in an actual SQL execution to the database, and a corresponding traversal of the rows received from that execution. However, when the underlying mapper actually creates objects corresponding to the result set rows it receives, it will check the session's identity map first before instantiating a new object, and will return the same instance already present in the identity map if it already exists, essentially ignoring the object state represented by that row. There are several ways to override this behavior and truly refresh an already-loaded instance, which are described later, but the main idea is that once your instance is loaded into a particular Session, it will never change its state without your explicit approval, regardless of what the database says about it.
For example: below, two separate calls to load an instance with database identity "15" are issued, and the results are assigned to two separate variables. However, since the same Session was used, the two variables reference the same instance:
mymapper = mapper(MyClass, mytable)

session = create_session()

obj1 = session.query(MyClass).selectfirst(mytable.c.id==15)
obj2 = session.query(MyClass).selectfirst(mytable.c.id==15)

>>> obj1 is obj2
True
The Identity Map is an instance of dict by default (this is new as of version 0.3.2). As an option, you can specify the flag weak_identity_map=True to the create_session function, so that it will use a weakref.WeakValueDictionary instead; when an in-memory object falls out of scope, it will be removed automatically, thereby providing some automatic management of memory. However, this may not be instant if there are circular references upon the object. To guarantee that an instance is removed from the identity map before removing references to it, use the expunge() method, described later, to remove it. Additionally, note that an object that has changes marked on it (i.e. "dirty") can still fall out of scope when using weak_identity_map.
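For illustration, a minimal sketch of that pattern, reusing the MyClass mapping from the previous example:

# use a WeakValueDictionary for the identity map
session = create_session(weak_identity_map=True)

obj = session.query(MyClass).selectfirst(mytable.c.id==15)

# guarantee removal from the identity map before dropping
# the last in-memory reference
session.expunge(obj)
del obj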
The Session supports an iterator interface in order to see all objects in the identity map:
for obj in session:
    print obj
as well as __contains__():

if obj in session:
    print "Object is present"
The identity map itself is accessible via the identity_map accessor:
>>> session.identity_map.values()
[<__main__.User object at 0x712630>, <__main__.Address object at 0x712a70>]
The identity of each object instance is available via the _instance_key property attached to each object instance, and is a tuple consisting of the object's class and an additional tuple of primary key values, in the order that they appear within the table definition:
>>> obj._instance_key
(<class 'test.tables.User'>, (7,))
At the moment that an object is assigned this key within a flush() operation, it is also added to the session's identity map.
The get() method on Query, which retrieves an object based on its primary key identity, also checks the Session's identity map first, to save a database round-trip if possible. In the case of an object lazy-loading a single child object, the get() method is used as well, so scalar-based lazy loads may in some cases not query the database; this is particularly important for backreference relationships, as it can save a lot of queries.
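As a brief illustration of the identity-map shortcut (a sketch, assuming a mapped class MyClass with an integer primary key):

session = create_session()

obj1 = session.query(MyClass).get(15)   # issues a SELECT, loads the row
obj2 = session.query(MyClass).get(15)   # found in the identity map; no SQL issued

assert obj1 is obj2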
The next concept is that, in addition to the Session storing a record of all objects loaded or saved, it also stores lists of all newly created (i.e. pending) objects and lists of all persistent objects that have been marked as deleted. These lists are used when a flush() call is issued to save all changes. During a flush operation, it also scans its list of persistent instances for changes which are marked as dirty.
These records are all tracked by collection functions that are also viewable off the Session as properties:
# pending objects recently added to the Session
session.new

# persistent objects which currently have changes detected
# (this collection is now created on the fly each time the property is called)
session.dirty

# persistent objects that have been marked as deleted via session.delete(obj)
session.deleted
Note that if a session is created with the weak_identity_map flag, an item which is marked as "dirty" will be silently removed from the session if the item falls out of scope in the user application. This is because the unit of work does not look for "dirty" changes except within a flush operation (or any time the session.dirty collection is accessed).
As for objects inside of new and deleted: if you abandon all references to new or modified objects within a session, they are still present in either of those two lists, and will be saved on the next flush operation, unless they are removed from the Session explicitly (more on that later).
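A sketch of that behavior, assuming the User class from earlier:

session = create_session()
session.save(User('ed', 'fullstack'))

# no application-level reference to the new User remains, but the
# session still holds it and will INSERT it on the next flush
assert len(session.new) == 1

# to truly abandon it, expunge it from the session explicitly
for obj in list(session.new):
    session.expunge(obj)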
The query() function takes a class or Mapper as an argument, along with an optional entity_name parameter, and returns a new Query object which will issue mapper queries within the context of this Session. If a Mapper is passed, then the Query uses that mapper. Otherwise, if a class is passed, it will locate the primary mapper for that class, which is used to construct the Query.
# query from a class
session.query(User).select_by(name='ed')

# query from a mapper
query = session.query(usermapper)
x = query.get(1)

# query from a class mapped with entity name 'alt_users'
q = session.query(User, entity_name='alt_users')
y = q.options(eagerload('orders')).select()
entity_name is an optional keyword argument sent with a class object, in order to further qualify which primary mapper is to be used; this only applies if there was a Mapper created with that particular class/entity name combination, else an exception is raised. All of the methods on Session which take a class or mapper argument also take the entity_name argument, so that a given class can be properly matched to the desired primary mapper.
All instances retrieved by the returned Query object will be stored as persistent instances within the originating Session.
The get() method, given a class or mapper, a scalar or tuple-based identity, and an optional entity_name keyword argument, creates a Query corresponding to the given mapper or class/entity_name combination, and calls its get() method with the given identity value. If the object already exists within this Session, it is simply returned; otherwise it is queried from the database. If the instance is not found, the method returns None.
# get Employer primary key 5
employer = session.get(Employer, 5)

# get Report composite primary key 7,12, using mapper 'report_mapper_b'
report = session.get(Report, (7,12), entity_name='report_mapper_b')
load() is similar to get(), except it will raise an exception if the instance does not exist in the database. It will also load the object's data from the database in all cases, and, if the object already exists in the session, will overwrite any changes on it with the latest data from the database.
# load Employer primary key 5
employer = session.load(Employer, 5)

# load Report composite primary key 7,12, using mapper 'report_mapper_b'
report = session.load(Report, (7,12), entity_name='report_mapper_b')
save() is called with a single transient (unsaved, unattached) instance as an argument, which is then added to the Session and becomes pending. When the session is next flushed, the instance will be saved to the database, upon which it becomes persistent (saved, attached). If the given instance is not transient, meaning it is either attached to an existing Session or it has a database identity, an exception is raised.
user1 = User(name='user1')
user2 = User(name='user2')

session.save(user1)
session.save(user2)

session.flush()  # write changes to the database
save() is called automatically for new instances by the class's associated mapper, if a default Session context is in effect (such as a thread-local session), which means that newly created instances automatically become pending. If there is no default session available, then the instance remains transient (unattached) until it is explicitly added to a Session via the save() method.
A transient instance can also be automatically saved if it is associated with a parent object which specifies save-update within its cascade rules, and that parent is already attached or becomes attached to a Session. For more information on cascade, see the next section.
The save_or_update() method, covered later, is a convenience method which will call the save() or update() methods appropriately depending on whether or not the instance has a database identity (but the instance still must be unattached).
The flush() method is the main gateway to what the Unit of Work does best, which is save everything! It should be clear by now what a flush looks like:
session.flush()
It can also be called with a list of objects; in this form, the flush operation will be limited only to the objects specified in the list, as well as any child objects within private relationships for a delete operation:
# saves only user1 and address2.  all other modified
# objects remain present in the session.
session.flush([user1, address2])
This second form of flush should be used carefully as it will not necessarily locate other dependent objects within the session, whose database representation may have foreign constraint relationships with the objects being operated upon.
A common misconception about the flush() operation is that, once performed, the newly persisted instances will automatically have related objects attached to them, based on the values of primary key identities that were assigned to the instances before they were persisted. An example would be: you create a new Address object, set address.user_id to 5, and then flush() the session. The erroneous assumption would be that there is now a User object of identity "5" attached to the Address object; in fact, this is not the case. If you were to refresh() the Address, invalidating its current state and re-loading, then it would have the appropriate User object present.
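A sketch of the scenario just described, assuming the User/Address mappers from earlier with a "user" backref configured:

a = Address('123 anywhere street', 'big city', 'UT', '76543')
a.user_id = 5        # manually assign the foreign key value
session.save(a)
session.flush()      # the row is INSERTed with user_id=5...

# ...but the in-memory 'user' relation was never populated
assert a.user is None

session.refresh(a)   # invalidate and re-load from the database
assert a.user is not None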
This misunderstanding is related to the observed behavior of backreferences (Backreferences), which automatically associates an instance "A" with another instance "B", in response to the manual association of instance "B" to instance "A" by the user. The backreference operation occurs completely externally to the flush() operation, and is pretty much the only example of a SQLAlchemy feature that manipulates the relationships of persistent objects.
The primary guideline for dealing with flush() is: the developer is responsible for maintaining in-memory objects and their relationships to each other, while the unit of work is responsible for maintaining the database representation of those in-memory objects. The typical pattern is that the manipulation of objects is the way that changes get communicated to the unit of work, so that when the flush occurs, the objects are already in their correct in-memory representation and problems don't arise. The manipulation of identifier attributes like integer key values, as well as deletes in particular, are a frequent source of confusion.
The close() method first calls clear(), removing all objects from this Session, and then ensures that any transactional resources are closed.
The delete() method places an instance into the Unit of Work's list of objects to be marked as deleted:
# mark two objects to be deleted
session.delete(obj1)
session.delete(obj2)

# flush
session.flush()
The delete operation will have an effect on instances that are attached to the deleted instance according to the cascade style of the relationship; cascade rules are described further in the following section. By default, associated instances may need to be updated in the database to reflect that they are no longer associated with the parent object, before the parent is deleted. If the relationship specifies cascade="delete", then the associated instance will also be deleted upon flush, assuming it is still attached to the parent. If the relationship additionally includes the delete-orphan cascade style, the associated instance will be deleted if it is still attached to the parent, or is unattached to any other parent.
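For example, given the earlier User/Address mappers configured with cascade="all, delete-orphan" on the addresses relation, a delete of the parent cascades to the children (a sketch):

jane = session.query(User).get_by(user_name='jane')

session.delete(jane)   # mark the parent as deleted
session.flush()        # the child Address rows are DELETEd as well,
                       # rather than having user_id set to NULL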
The delete() operation has no relationship to the in-memory status of the instance, including usage of the del Python statement. An instance marked as deleted and flushed will still exist within memory until references to it are freed; similarly, removing an instance from memory via the del statement will have no effect, since the persistent instance will still be referenced by its Session. Obviously, if the instance is removed from the Session and then totally dereferenced, it will no longer exist in memory, but it also won't exist in any Session, and is therefore not deleted from the database.
The clear() method detaches all instances from the Session, sending them to the detached or transient state as applicable, and replaces the underlying UnitOfWork with a new one:
session.clear()
The clear() method is particularly useful with a "default context" session such as a thread-local session, which can stay attached to the current thread to handle a new set of objects without having to re-attach a new Session.
To assist with the Unit of Work's "sticky" behavior, the refresh() and expire() methods allow individual objects to have all of their attributes immediately re-loaded from the database, or marked as "expired", which will cause a re-load to occur upon the next access of any of the object's mapped attributes. This includes all relationships, so lazy-loaders will be re-initialized and eager relationships will be repopulated. Any changes marked on the object are discarded:
# immediately re-load attributes on obj1, obj2
session.refresh(obj1)
session.refresh(obj2)

# expire objects obj1, obj2; attributes will be reloaded
# on the next access:
session.expire(obj1)
session.expire(obj2)
The expunge() method removes an object from the Session, sending persistent instances to the detached state, and pending instances to the transient state:
session.expunge(obj1)
Use expunge() when you'd like to remove an object altogether from memory, such as before calling del on it, which will prevent any "ghost" operations from occurring when the session is flushed.
The bind_mapper() and bind_table() methods each receive two arguments: in the case of bind_mapper(), a Mapper and an Engine or Connection instance; in the case of bind_table(), a Table instance or other Selectable (such as an Alias, Select, etc.), and an Engine or Connection instance.
engine1 = create_engine('sqlite:///file1.db')
engine2 = create_engine('mysql://localhost')

sqlite_connection = engine1.connect()

sess = create_session()

# bind mymapper operations to a single SQLite connection
sess.bind_mapper(mymapper, sqlite_connection)

# bind operations with the email_addresses_table to mysql
sess.bind_table(email_addresses_table, engine2)
Normally, when a Session is created via create_session() with no arguments, the Session has no awareness of individual Engines, and when mappers use the Session to retrieve connections, the underlying MetaData each Table is associated with is expected to be "bound" to an Engine, else no engine can be located and an exception is raised. A second form of create_session() takes the argument bind_to=engine_or_connection, where all SQL operations performed by this Session use the single Engine or Connection (collectively known as a Connectable) passed to the constructor. With bind_mapper() and bind_table(), the operations of individual mappers and/or tables are bound to distinct engines or connections, thereby overriding not only the engine which may be "bound" to the underlying MetaData, but also the Engine or Connection which may have been passed to the create_session() function. Configurations which interact with multiple explicit database connections at one time must use either or both of these methods in order to associate Session operations with the appropriate connection resource.
Binding a Mapper to a resource takes precedence over a Table bind, meaning if mapper A is associated with table B, and the Session binds mapper A to connection X and table B to connection Y, an operation with mapper A will use connection X, not connection Y.
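A sketch of that precedence rule, using the hypothetical names mapper_a, table_b and ClassA (mapper A is assumed to be mapped against table B):

engine_x = create_engine('sqlite:///file_x.db')
engine_y = create_engine('sqlite:///file_y.db')

sess = create_session()
sess.bind_mapper(mapper_a, engine_x)   # mapper A -> engine X
sess.bind_table(table_b, engine_y)     # table B  -> engine Y

# operations with mapper A use engine X, since the mapper bind
# takes precedence over the bind of its underlying table
result = sess.query(ClassA).select()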
The update() method is used only with detached instances. A detached instance only exists if its Session was cleared or closed, or the instance was expunge()d from its session. update() will re-attach the detached instance to this Session, bringing it back to the persistent state and allowing any changes on the instance to be saved when the Session is next flushed. If the instance is already attached to an existing Session, an exception is raised.
A detached instance can also be automatically updated if it is associated with a parent object which specifies save-update within its cascade rules, and that parent is already attached or becomes attached to a Session. For more information on cascade, see the next section.
The save_or_update() method is a convenience method which will call the save() or update() methods appropriately depending on whether or not the instance has a database identity (but the instance still must be unattached).
This method is a combination of the save() and update() methods: it will examine the given instance for a database identity (i.e. whether it is transient or detached), and will call the implementation of save() or update() as appropriate. Use save_or_update() to add unattached instances to a session when you're not sure if they were newly created or not. Like save() and update(), save_or_update() cascades along the save-update cascade indicator, described in the cascade section below.
Feature Status: Alpha Implementation
merge() is used to return the persistent version of an instance that is not attached to this Session. When passed an instance, if an instance with its database identity already exists within this Session, it is returned. If the instance does not exist in this Session, it is loaded from the database and then returned.
A future version of merge() will also update the Session's instance with the state of the given instance (hence the name "merge").
This method is useful for bringing in objects which may have been restored from a serialization, such as those stored in an HTTP session:
# deserialize an object
myobj = pickle.loads(mystring)

# "merge" it.  if the session already had this object in the
# identity map, then you get back the one from the current session.
myobj = session.merge(myobj)
Note that merge() does not associate the given instance with the Session; it remains detached (or attached to whatever Session it was already attached to).
Feature Status: Alpha Implementation
Mappers support the concept of configurable cascade behavior on relation()s. This behavior controls how the Session should treat the instances that have a parent-child relationship with another instance that is operated upon by the Session. Cascade is indicated as a comma-separated list of string keywords, with the possible values all, delete, save-update, refresh-expire, merge, expunge, and delete-orphan.
Cascading is configured by setting the cascade keyword argument on a relation():
mapper(Order, order_table, properties={
    'items' : relation(Item, items_table, cascade="all, delete-orphan"),
    'customer' : relation(User, users_table, user_orders_table, cascade="save-update"),
})
The above mapper specifies two relations, items and customer. The items relationship specifies "all, delete-orphan" as its cascade value, indicating that all save, update, merge, expunge, refresh, delete and expire operations performed on a parent Order instance should also be performed on the child Item instances attached to it (save and update are cascaded using the save_or_update() method, so that the database identity of the instance doesn't matter). The delete-orphan cascade value additionally indicates that if an Item instance is no longer associated with an Order, it should also be deleted. The "all, delete-orphan" cascade argument allows a so-called lifecycle relationship between an Order and an Item object.
The customer relationship specifies only the "save-update" cascade value, indicating that most operations will not be cascaded from a parent Order instance to a child User instance, except for when the Order is attached to a particular session, either via the save(), update(), or save_or_update() method.
Additionally, when a child item is attached to a parent item that specifies the "save-update" cascade value on the relationship, the child is automatically passed to save_or_update() (and the operation is further cascaded to the child item).
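A sketch of this automatic attachment, assuming the User/Address mappers from earlier (the default save-update cascade is in effect on the addresses relation):

session = create_session()

u = User('jane', 'hihilala')
session.save(u)        # u is now pending

a = Address('123 anywhere street', 'big city', 'UT', '76543')
u.addresses.append(a)  # the "save-update" cascade attaches a to the
                       # same session automatically

session.flush()        # both rows are inserted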
Note that cascading doesn't do anything that isn't possible by manually calling Session methods on individual instances within a hierarchy, it merely automates common operations on a group of associated instances.
The default value for cascade on relation()s is save-update, and the private=True keyword argument is a synonym for cascade="all, delete-orphan".
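For example, this relation() configures the same lifecycle behavior that private=True would (a sketch reusing the User/Address mappers from earlier):

mapper(Address, addresses_table)
mapper(User, users_table, properties={
    # equivalent to relation(Address, private=True)
    'addresses' : relation(Address, cascade="all, delete-orphan")
})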
SessionTransaction is a multi-engine transaction manager, which aggregates one or more Engine/Connection pairs and keeps track of a Transaction object for each one. As the Session receives requests to execute SQL statements, it uses the Connection that is referenced by the SessionTransaction. At commit time, the underlying Session is flushed, and each Transaction is then committed.
Example usage is as follows:
sess = create_session()
trans = sess.create_transaction()
try:
    item1 = sess.query(Item).get(1)
    item2 = sess.query(Item).get(2)
    item1.foo = 'bar'
    item2.bar = 'foo'
except:
    trans.rollback()
    raise
trans.commit()
The create_transaction() method creates a new SessionTransaction object, but does not declare any connection/transaction resources. At the point of the first get() call, a connection resource is opened off the engine that corresponds to the Item class's mapper, and is stored within the SessionTransaction with an open Transaction. When trans.commit() is called, the flush() method is called on the Session, and the corresponding update statements are issued to the database within the scope of the transaction already opened; afterwards, the underlying Transaction is committed, and connection resources are freed.
SessionTransaction, like the Transaction off of Connection, also supports "nested" behavior, and is safe to pass to other functions which then issue their own begin()/commit() pair; only the outermost begin()/commit() pair actually affects the transaction, and any call to rollback() within a particular call stack will issue a rollback.
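A sketch of the nested pattern just described, assuming SessionTransaction exposes the begin()/commit() pair the passage above implies (someitem is a hypothetical mapped instance):

def do_inner_work(trans, session, item):
    # inner begin()/commit() pair; only the outermost
    # pair actually commits the transaction
    trans.begin()
    item.foo = 'bar'
    session.flush()
    trans.commit()

trans = session.create_transaction()
try:
    do_inner_work(trans, session, someitem)
except:
    trans.rollback()
    raise
trans.commit()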
Note that while SessionTransaction is capable of tracking multiple transactions across multiple databases, it is currently in no way a fully functioning two-phase commit engine; generally, when dealing with multiple databases simultaneously, there is the distinct possibility that a transaction can succeed on the first database and fail on the second, which for some applications may be an invalid state. If this is an issue, it's best to either refrain from spanning transactions across databases, or to look into some of the available technologies in this area, such as Zope, which offers a two-phase commit engine; some users have already created their own SQLAlchemy/Zope hybrid implementations to deal with scenarios like these.
The SessionTransaction can interact with direct SQL queries in two general ways: either specific Connection objects can be associated with the SessionTransaction, which are then usable both for direct SQL as well as within flush() operations performed by the SessionTransaction, or the Connection object automatically referenced within the SessionTransaction can be accessed directly.
To associate a specific Connection with the SessionTransaction, use the add() method:
connection = engine.connect()

trans = session.create_transaction()
try:
    trans.add(connection)
    connection.execute(mytable.update(), {'col1':4, 'col2':17})
    session.flush()  # flush() operation will use the same connection
except:
    trans.rollback()
    raise
trans.commit()
The add() method will key the Connection's underlying Engine to this SessionTransaction. When mapper operations are performed against this Engine, the Connection explicitly added will be used. This overrides any other Connection objects that the underlying Session was associated with, corresponding to the underlying Engine of that Connection. However, if the SessionTransaction itself is already associated with a Connection, then an exception is thrown.
The other way is just to use the Connection referenced by the SessionTransaction. This is performed via the connection() method, and requires passing in a class or Mapper which indicates which underlying Connection should be returned (recall that different Mappers may use different underlying Engines). If the class_or_mapper argument is None, then the Session must have been globally bound to a specific Engine when it was constructed, else the method returns None.
trans = session.create_transaction()
try:
    # get the Connection used by the UserClass' Mapper
    connection = trans.connection(UserClass)
    connection.execute(mytable.update(), {'col1':4, 'col2':17})
except:
    trans.rollback()
    raise
trans.commit()
The connection() method also exists on the Session object itself, and can be called regardless of whether or not a SessionTransaction is in progress. If a SessionTransaction is in progress, it will return the connection referenced by the transaction. If an Engine is being used with the threadlocal strategy, the Connection returned will correspond to the connection resources that are bound to the current thread, if any (i.e. it is obtained by calling contextual_connection()).
The transactions issued by SessionTransaction, as well as those issued internally by the Session's flush() operation, use the same Transaction object off of Connection that is publicly available. Recall that this object supports "nestable" behavior, meaning any number of actors can call begin() off a particular Connection object, and they will all be managed within the scope of a single transaction. Therefore, the flush() operation can similarly take place within the scope of a regular Transaction:
connection = engine.connect()                 # Connection
session = create_session(bind_to=connection)  # Session bound to the Connection

trans = connection.begin()                    # start transaction
try:
    stuff = session.query(MyClass).select()   # Session operation uses connection
    stuff[2].foo = 'bar'
    connection.execute(mytable.insert(), dict(id=12, value="bar"))  # use connection explicitly
    session.flush()                           # Session flushes with "connection", using transaction "trans"
except:
    trans.rollback()                          # or rollback
    raise
trans.commit()                                # commit
The session module can log an extensive display of its "flush plans", which is a graph of its internal representation of objects before they are written to the database. To turn this logging on:
# make a Session with echo_uow
session = create_session(echo_uow=True)
The flush() operation will then dump displays like the following to the standard output:
Task dump:

 UOWTask(6034768, 'User/users/None')
   |
   |- Save User(6016624)
   |       |-Process User(6016624).addresses
   |
   |- UOWTask(6034832, 'Address/email_addresses/None')
   |       |- Save Address(6034384)
   |       |- Save Address(6034256)
   |       |----
   |
   |----
The above graph can be read straight downwards to determine the order of operations. It indicates "save User 6016624, process each element in the 'addresses' list on User 6016624, save Address 6034384, Address 6034256".
Of course, one can also get a good idea of the order of operations just by logging the actual SQL statements executed.
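For example, statement logging can be enabled when the engine is created, as seen in earlier examples:

engine = create_engine('sqlite:///mydb.db', echo=True)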
This section details all the options available to Mappers, as well as advanced patterns.
To start, here are the tables we will work with again:
from sqlalchemy import *

metadata = MetaData()

# a table to store users
users_table = Table('users', metadata,
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(40)),
    Column('password', String(80))
)

# a table that stores mailing addresses associated with a specific user
addresses_table = Table('addresses', metadata,
    Column('address_id', Integer, primary_key = True),
    Column('user_id', Integer, ForeignKey("users.user_id")),
    Column('street', String(100)),
    Column('city', String(80)),
    Column('state', String(2)),
    Column('zip', String(10))
)

# a table that stores keywords
keywords_table = Table('keywords', metadata,
    Column('keyword_id', Integer, primary_key = True),
    Column('name', VARCHAR(50))
)

# a table that associates keywords with users
userkeywords_table = Table('userkeywords', metadata,
    Column('user_id', INT, ForeignKey("users.user_id")),
    Column('keyword_id', INT, ForeignKey("keywords.keyword_id"))
)
When mappers are constructed, by default the column names in the Table metadata are used as the names of attributes on the mapped class. This can be customized within the properties by stating the key/column combinations explicitly:
user_mapper = mapper(User, users_table, properties={
    'id' : users_table.c.user_id,
    'name' : users_table.c.user_name,
})
In the situation where column names overlap in a mapper against multiple tables, columns may be referenced together with a list:
# join users and addresses
usersaddresses = sql.join(users_table, addresses_table,
                          users_table.c.user_id == addresses_table.c.user_id)

m = mapper(User, usersaddresses, properties = {
    'id' : [users_table.c.user_id, addresses_table.c.user_id],
})
A common request is the ability to create custom class properties that override the behavior of setting/getting an attribute. Currently, the easiest way to do this in SQLAlchemy is how it would be done in any Python program: define your attribute with a different name, such as "_attribute", and use a property to get/set its value. The mapper just needs to be told of the special name:
class MyClass(object):
    def _set_email(self, email):
        self._email = email
    def _get_email(self):
        return self._email
    email = property(_get_email, _set_email)

mapper(MyClass, mytable, properties = {
    # map the '_email' attribute to the "email" column
    # on the table
    '_email': mytable.c.email
})
It is also possible to route the select_by and get_by functions on Query using the new property name, by establishing a synonym:
mapper(MyClass, mytable, properties = {
    # map the '_email' attribute to the "email" column
    # on the table
    '_email': mytable.c.email,

    # make a synonym 'email'
    'email' : synonym('_email')
})

# now you can select_by(email)
result = session.query(MyClass).select_by(email='john@smith.com')
A synonym can be established with the flag proxy=True, to create a class-level proxy to the actual property. This has the effect of creating a fully functional synonym on class instances:
mapper(MyClass, mytable, properties = {
    '_email': mytable.c.email,
    'email' : synonym('_email', proxy=True)
})

x = MyClass()
x.email = 'john@doe.com'

>>> x._email
'john@doe.com'
Feature Status: Alpha API
A one-to-many or many-to-many relationship results in a list-holding element being attached to all instances of a class. The actual list is an "instrumented" list, which transparently maintains a relationship to a plain Python list. The implementation of the underlying plain list can be changed to be any object that implements a list-style append and __iter__ method. A common need is for a list-based relationship to actually be a dictionary. This can be achieved by subclassing dict to have list-like behavior.
In this example, a class MyClass is defined, which is associated with a parent object MyParent. The collection of MyClass objects on each MyParent object will be a dictionary, storing each MyClass instance keyed to its name attribute:
# a class to be stored in the list
class MyClass(object):
    def __init__(self, name):
        self.name = name

# create a dictionary that will act like a list, and store
# instances of MyClass
class MyDict(dict):
    def append(self, item):
        self[item.name] = item
    def __iter__(self):
        return iter(self.values())

# parent class
class MyParent(object):
    pass

# mappers, constructed normally
mapper(MyClass, myclass_table)
mapper(MyParent, myparent_table, properties={
    'myclasses' : relation(MyClass, collection_class=MyDict)
})

# elements on 'myclasses' can be accessed via string keyname
myparent = MyParent()
myparent.myclasses.append(MyClass('this is myclass'))
myclass = myparent.myclasses['this is myclass']
When creating relations on a mapper, most examples so far have illustrated the mapper and relationship joining up based on the foreign keys of the tables they represent. In fact, this "automatic" inspection can be completely circumvented using the primaryjoin and secondaryjoin arguments to relation, as in this example which creates a User object which has a relationship to all of its Addresses which are in Boston:
class User(object):
    pass
class Address(object):
    pass

mapper(Address, addresses_table)
mapper(User, users_table, properties={
    'boston_addresses' : relation(Address, primaryjoin=
                and_(users_table.c.user_id==addresses_table.c.user_id,
                     addresses_table.c.city=='Boston'))
})
Many-to-many relationships can be customized by one or both of primaryjoin and secondaryjoin, shown below with just the default many-to-many relationship explicitly set:
class User(object):
    pass
class Keyword(object):
    pass

mapper(Keyword, keywords_table)
mapper(User, users_table, properties={
    'keywords' : relation(Keyword,
        secondary=userkeywords_table,
        primaryjoin=users_table.c.user_id==userkeywords_table.c.user_id,
        secondaryjoin=userkeywords_table.c.keyword_id==keywords_table.c.keyword_id
    )
})
The previous example leads in to the idea of joining against the same table multiple times. Below is a User object that has lists of its Boston and New York addresses:
mapper(User, users_table, properties={
    'boston_addresses' : relation(Address, primaryjoin=
                and_(users_table.c.user_id==addresses_table.c.user_id,
                addresses_table.c.city=='Boston')),
    'newyork_addresses' : relation(Address, primaryjoin=
                and_(users_table.c.user_id==addresses_table.c.user_id,
                addresses_table.c.city=='New York')),
})
Both lazy and eager loading support multiple joins equally well.
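As a brief usage sketch under the mapping above (the user_name value is illustrative, and get_by is used as described in the data mapping documentation), each relation loads only its own filtered subset of rows:

# a sketch: each relation loads its own filtered subset independently
u = session.query(User).get_by(user_name='jack')
print u.boston_addresses     # only Address rows where city == 'Boston'
print u.newyork_addresses    # only Address rows where city == 'New York'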
This feature allows particular columns of a table to not be loaded by default, instead being loaded later on when first referenced. It is essentially "column-level lazy loading". This feature is useful when one wants to avoid loading a large text or binary field into memory when it is not needed. Individual columns can be lazy loaded by themselves or placed into groups that lazy-load together.
book_excerpts = Table('books', db,
    Column('book_id', Integer, primary_key=True),
    Column('title', String(200), nullable=False),
    Column('summary', String(2000)),
    Column('excerpt', String),
    Column('photo', Binary)
)

class Book(object):
    pass

# define a mapper that will load each of 'excerpt' and 'photo' in
# separate, individual-row SELECT statements when each attribute
# is first referenced on the individual object instance
mapper(Book, book_excerpts, properties = {
    'excerpt' : deferred(book_excerpts.c.excerpt),
    'photo' : deferred(book_excerpts.c.photo)
})
Deferred columns can be placed into groups so that they load together:
book_excerpts = Table('books', db,
    Column('book_id', Integer, primary_key=True),
    Column('title', String(200), nullable=False),
    Column('summary', String(2000)),
    Column('excerpt', String),
    Column('photo1', Binary),
    Column('photo2', Binary),
    Column('photo3', Binary)
)

class Book(object):
    pass

# define a mapper with a 'photos' deferred group.  when one photo is referenced,
# all three photos will be loaded in one SELECT statement.  The 'excerpt' will
# be loaded separately when it is first referenced.
mapper(Book, book_excerpts, properties = {
    'excerpt' : deferred(book_excerpts.c.excerpt),
    'photo1' : deferred(book_excerpts.c.photo1, group='photos'),
    'photo2' : deferred(book_excerpts.c.photo2, group='photos'),
    'photo3' : deferred(book_excerpts.c.photo3, group='photos')
})
You can defer or undefer columns at the Query level with the options method:
query = session.query(Book)
query.options(defer('summary')).select()
query.options(undefer('excerpt')).select()
SQLAlchemy relations are generally simplistic; the lazy loader loads in the full list of child objects when accessed, and the eager load builds a query that loads the full list of child objects. Additionally, when you are deleting a parent object, SQLAlchemy ensures that it has loaded the full list of child objects, so that it can mark them as deleted as well (or update their parent foreign key to NULL). It does not issue an en-masse "delete from table where parent_id=?" type of statement in such a scenario. This is because the child objects themselves may also have further dependencies, and additionally may also exist in the current session, in which case SA needs to know their identity so that their state can be properly updated.
So there are several techniques that can be used individually or combined together to address these issues, in the context of a large collection where you normally would not want to load the full list of relationships:
Use lazy=None to disable child object loading (i.e. noload):
mapper(MyClass, table, properties={
    'children':relation(MyOtherClass, lazy=None)
})
To load child objects, just use a query:
class Organization(object):
    def __init__(self, name):
        self.name = name
    def find_members(self, criterion):
        """locate a subset of the members associated with this Organization"""
        return object_session(self).query(Member).select(
            and_(member_table.c.name.like(criterion),
                 org_table.c.org_id==self.org_id),
            from_obj=[org_table.join(member_table)])
Use passive_deletes=True to disable child object loading on a DELETE operation, in conjunction with "ON DELETE (CASCADE|SET NULL)" on your database to automatically cascade deletes to child objects. Note that "ON DELETE" is not supported on SQLite, and requires InnoDB tables when using MySQL:
mytable = Table('mytable', meta,
    Column('id', Integer, primary_key=True),
)

myothertable = Table('myothertable', meta,
    Column('id', Integer, primary_key=True),
    Column('parent_id', Integer),
    ForeignKeyConstraint(['parent_id'], ['mytable.id'], ondelete="CASCADE"),
)

mapper(MyOtherClass, myothertable)
mapper(MyClass, mytable, properties={
    'children':relation(MyOtherClass, passive_deletes=True)
})
As an alternative to using "ON DELETE CASCADE", for very simple scenarios you can create a simple MapperExtension that will issue a DELETE for child objects before the parent object is deleted:
class DeleteMemberExt(MapperExtension):
    def before_delete(self, mapper, connection, instance):
        connection.execute(member_table.delete(member_table.c.org_id==instance.org_id))

mapper(Organization, org_table, extension=DeleteMemberExt(), properties = {
    'members' : relation(Member, lazy=None, passive_deletes=True,
                         cascade="all, delete-orphan")
})
Note that this approach is not nearly as efficient or general-purpose as "ON DELETE CASCADE", since the database itself can cascade the operation along any number of tables.
The latest distribution includes an example, examples/collection/large_collection.py, which illustrates most of these techniques.
Options which can be sent to the relation() function. For arguments to mapper(), see Mapper Options.
backref - indicates the name of a property to be placed on the related mapper's class that will handle this relationship in the other direction; can also point to a backref() construct for more configurability. See Backreferences.
foreignkey - deprecated in favor of remote_side, which is used to indicate the direction of self-referential relationships. For ForeignKey specification, this field is only partially functional (i.e. it does not work for many-to-many relationships), but is needed in rare cases when both sides of the primaryjoin condition contain foreign keys to either side of the relationship, and relation() cannot determine which side of the relationship is "foreign". A new field foreign_keys will be added in a future release to fully implement this functionality.
passive_deletes=False - when True, indicates that related child objects need not be loaded during the flush() process, which normally occurs in order to locate all existing child items when a parent item is to be deleted. Setting this flag to True is appropriate when ON DELETE CASCADE rules have been set up on the actual tables, so that the database may handle cascading deletes automatically. This strategy is useful particularly for handling the deletion of objects that have very large (and/or deep) child-object collections. See the example in Working with Large Collections.
post_update - when True, the dependent row of this relationship will be updated via a second UPDATE statement, issued after the INSERT or before the DELETE of the related rows. If a flush() operation returns an error that a "cyclical dependency" was detected, this is a cue that you might want to use post_update to "break" the cycle.
private - deprecated. Setting private=True is the equivalent of setting cascade="all, delete-orphan", and indicates the lifecycle of child objects should be contained within that of the parent. See the example in Lifecycle Relations.
secondary - for a many-to-many relationship, specifies the intermediary table. The secondary keyword argument should generally only be used for a table that is not otherwise expressed in any class mapping; in particular, using the Association Object Pattern is generally mutually exclusive against using the secondary keyword argument.
uselist - a boolean that indicates if this property should be loaded as a list or a scalar. In most cases, this value is determined automatically by relation(), based on the type and direction of the relationship - one to many forms a list, many to one forms a scalar, many to many is a list. If a scalar is desired where normally a list would be present, such as a bi-directional one-to-one relationship, set uselist to False; a brief sketch follows.
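For example, a minimal sketch of forcing a scalar attribute on a one-to-one relationship (the Preference class and preferences_table here are hypothetical):

# a sketch: 'preference' will be a scalar Preference instance, not a list
mapper(Preference, preferences_table)
mapper(User, users_table, properties={
    'preference' : relation(Preference, uselist=False)
})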
By default, mappers will attempt to ORDER BY the "oid" column of a table, or the primary key column, when selecting rows. This can be modified in several ways.
The "order_by" parameter can be sent to a mapper, overriding the per-engine ordering if any. A value of None means that the mapper should not use any ordering. A non-None value, which can be a column, an asc
or desc
clause, or an array of either one, indicates the ORDER BY clause that should be added to all select queries:
# disable all ordering
mapper = mapper(User, users_table, order_by=None)

# order by a column
mapper = mapper(User, users_table, order_by=users_table.c.user_id)

# order by multiple items
mapper = mapper(User, users_table,
                order_by=[users_table.c.user_id, desc(users_table.c.user_name)])
"order_by" can also be specified to an individual select
method, overriding all other per-engine/per-mapper orderings:
# order by a column
l = mapper.select(users_table.c.user_name=='fred', order_by=users_table.c.user_id)

# order by multiple criteria
l = mapper.select(users_table.c.user_name=='fred',
                  order_by=[users_table.c.user_id, desc(users_table.c.user_name)])
For relations, the "order_by" property can also be specified to all forms of relation:
# order address objects by address id
mapper = mapper(User, users_table, properties = {
    'addresses' : relation(mapper(Address, addresses_table),
                           order_by=addresses_table.c.address_id)
})

# eager load with ordering - the ORDER BY clauses of parent/child will be organized properly
mapper = mapper(User, users_table, properties = {
    'addresses' : relation(mapper(Address, addresses_table),
                           order_by=desc(addresses_table.c.email_address), lazy=False)
}, order_by=users_table.c.user_id)
You can limit rows in a regular SQL query by specifying limit and offset. A Mapper can handle the same concepts:
class User(object):
    pass

mapper(User, users_table)
r = session.query(User).select(limit=20, offset=10)
However, things get tricky when dealing with eager relationships, since a straight LIMIT of rows does not represent the count of items when joining against other tables to load related items as well. So here is what SQLAlchemy will do when you use limit or offset with an eager relationship:
class User(object):
    pass
class Address(object):
    pass

mapper(User, users_table, properties={
    'addresses' : relation(mapper(Address, addresses_table), lazy=False)
})
r = session.query(User).select(User.c.user_name.like('F%'), limit=20, offset=10)
The main WHERE clause as well as the limiting clauses are coerced into a subquery; this subquery represents the desired result of objects. A containing query, which handles the eager relationships, is joined against the subquery to produce the result.
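A rough sketch of that construction using the SQL layer (illustrative only; the exact statement SQLAlchemy emits may differ in its labeling and ordering):

# the filtered, limited rows become a subquery...
subq = select([users_table], users_table.c.user_name.like('F%'),
              limit=20, offset=10).alias('subq')

# ...which is then outer joined against the eagerly loaded table
stmt = subq.outerjoin(addresses_table,
                      subq.c.user_id==addresses_table.c.user_id).select(use_labels=True)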
Feature Status: Alpha Implementation
Inheritance in databases comes in three forms: single table inheritance, where several types of classes are stored in one table, concrete table inheritance, where each type of class is stored in its own table, and multiple table inheritance, where the parent/child classes are stored in their own tables that are joined together in a select.
There is also a concept of polymorphic loading, which indicates if multiple kinds of classes can be loaded in one pass.
SQLAlchemy supports all three kinds of inheritance. Additionally, true polymorphic loading is supported in a straightforward way for single table inheritance, and has some more manually-configured features that can make it happen for concrete and multiple table inheritance.
Working examples of polymorphic inheritance come with the distribution in the directory examples/polymorphic.
Here are the classes we will use to represent an inheritance relationship:
class Employee(object):
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.__class__.__name__ + " " + self.name

class Manager(Employee):
    def __init__(self, name, manager_data):
        self.name = name
        self.manager_data = manager_data
    def __repr__(self):
        return self.__class__.__name__ + " " + self.name + " " + self.manager_data

class Engineer(Employee):
    def __init__(self, name, engineer_info):
        self.name = name
        self.engineer_info = engineer_info
    def __repr__(self):
        return self.__class__.__name__ + " " + self.name + " " + self.engineer_info
Each class supports a common name attribute, while the Manager class has its own attribute manager_data and the Engineer class has its own attribute engineer_info.
This will support polymorphic loading via the Employee mapper.
employees_table = Table('employees', metadata,
    Column('employee_id', Integer, primary_key=True),
    Column('name', String(50)),
    Column('manager_data', String(50)),
    Column('engineer_info', String(50)),
    Column('type', String(20))
)

employee_mapper = mapper(Employee, employees_table,
                         polymorphic_on=employees_table.c.type)
manager_mapper = mapper(Manager, inherits=employee_mapper,
                        polymorphic_identity='manager')
engineer_mapper = mapper(Engineer, inherits=employee_mapper,
                         polymorphic_identity='engineer')
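Under this mapping, a single query against the base Employee mapper returns instances of all three classes, with each row's 'type' column determining the class (a sketch, assuming some rows have been saved):

# one SELECT returns Employee, Manager and Engineer instances;
# repr() shows the actual class of each instance
for emp in session.query(Employee).select():
    print emp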
Without polymorphic loading, you just define a separate mapper for each class.
managers_table = Table('managers', metadata,
    Column('employee_id', Integer, primary_key=True),
    Column('name', String(50)),
    Column('manager_data', String(50)),
)

engineers_table = Table('engineers', metadata,
    Column('employee_id', Integer, primary_key=True),
    Column('name', String(50)),
    Column('engineer_info', String(50)),
)

manager_mapper = mapper(Manager, managers_table)
engineer_mapper = mapper(Engineer, engineers_table)
With polymorphic loading, the SQL query to do the actual polymorphic load must be constructed, usually as a UNION. There is a helper function to create these UNIONs called polymorphic_union.
pjoin = polymorphic_union({
    'manager':managers_table,
    'engineer':engineers_table
}, 'type', 'pjoin')

employee_mapper = mapper(Employee, pjoin, polymorphic_on=pjoin.c.type)
manager_mapper = mapper(Manager, managers_table,
                        inherits=employee_mapper, concrete=True,
                        polymorphic_identity='manager')
engineer_mapper = mapper(Engineer, engineers_table,
                         inherits=employee_mapper, concrete=True,
                         polymorphic_identity='engineer')
A future release of SQLAlchemy might better merge the generated UNION into the mapper construction phase.
Like concrete table inheritance, this can be done non-polymorphically, or with a little more complexity, polymorphically:
employees = Table('employees', metadata,
    Column('person_id', Integer, primary_key=True),
    Column('name', String(50)),
    Column('type', String(30)))

engineers = Table('engineers', metadata,
    Column('person_id', Integer, ForeignKey('employees.person_id'), primary_key=True),
    Column('engineer_info', String(50)),
)

managers = Table('managers', metadata,
    Column('person_id', Integer, ForeignKey('employees.person_id'), primary_key=True),
    Column('manager_data', String(50)),
)

person_mapper = mapper(Employee, employees)
mapper(Engineer, engineers, inherits=person_mapper)
mapper(Manager, managers, inherits=person_mapper)
Polymorphic:
person_join = polymorphic_union({
    'engineer':employees.join(engineers),
    'manager':employees.join(managers),
    'person':employees.select(employees.c.type=='person'),
}, None, 'pjoin')

person_mapper = mapper(Employee, employees, select_table=person_join,
                       polymorphic_on=person_join.c.type,
                       polymorphic_identity='person')
mapper(Engineer, engineers, inherits=person_mapper, polymorphic_identity='engineer')
mapper(Manager, managers, inherits=person_mapper, polymorphic_identity='manager')
The join condition in a multiple table inheritance relationship can be specified explicitly, using inherit_condition:
AddressUser.mapper = mapper(
    AddressUser, addresses_table,
    inherits=User.mapper,
    inherit_condition=users_table.c.user_id==addresses_table.c.user_id
)
Mappers can be constructed against arbitrary relational units (called Selectables) as well as plain Tables. For example, the join keyword from the SQL package creates a neat selectable unit comprised of multiple tables, complete with its own composite primary key, which can be passed in to a mapper as the table.
# a class
class AddressUser(object):
    pass

# define a Join
j = join(users_table, addresses_table)

# map to it - the identity of an AddressUser object will be
# based on (user_id, address_id) since those are the primary keys involved
m = mapper(AddressUser, j, properties={
    'user_id':[users_table.c.user_id, addresses_table.c.user_id]
})
A second example:
# many-to-many join on an association table
j = join(users_table, userkeywords,
        users_table.c.user_id==userkeywords.c.user_id).join(keywords,
        userkeywords.c.keyword_id==keywords.c.keyword_id)

# a class
class KeywordUser(object):
    pass

# map to it - the identity of a KeywordUser object will be
# (user_id, keyword_id) since those are the primary keys involved
m = mapper(KeywordUser, j, properties={
    'user_id':[users_table.c.user_id, userkeywords.c.user_id],
    'keyword_id':[userkeywords.c.keyword_id, keywords.c.keyword_id]
})
In both examples above, "composite" columns were added as properties to the mappers; these are aggregations of multiple columns into one mapper property, which instructs the mapper to keep both of those columns set at the same value.
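As a brief sketch using the AddressUser mapping above, assigning to the composite property affects both underlying columns at flush time:

# a sketch: both users.user_id and addresses.user_id receive the new value
au = session.query(AddressUser).selectfirst()
au.user_id = 12
session.flush()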
Similar to mapping against a join, a plain select() object can be used with a mapper as well. Below, an example select which contains two aggregate functions and a group_by is mapped to a class:
s = select([customers,
            func.count(orders.c.order_id).label('order_count'),
            func.max(orders.c.price).label('highest_order')],
           customers.c.customer_id==orders.c.customer_id,
           group_by=[c for c in customers.c]
)

class Customer(object):
    pass

m = mapper(Customer, s)
Above, the "customers" table is joined against the "orders" table to produce a full row for each customer row, the total count of related rows in the "orders" table, and the highest price in the "orders" table, grouped against the full set of columns in the "customers" table. That query is then mapped against the Customer class. New instances of Customer will contain attributes for each column in the "customers" table as well as an "order_count" and "highest_order" attribute. Updates to the Customer object will only be reflected in the "customers" table and not the "orders" table. This is because the primary keys of the "orders" table are not represented in this mapper and therefore the table is not affected by save or delete operations.
The first mapper created for a certain class is known as that class's "primary mapper". Other mappers can be created as well; these come in two varieties.
The first is the secondary mapper, which is created with the keyword argument non_primary=True and represents a load-only mapper. Objects that are loaded with a secondary mapper will have their save operation processed by the primary mapper. It is also invalid to add new relation()s to a non-primary mapper. To use this mapper with the Session, specify it to the query method:
example:
# primary mapper
mapper(User, users_table)

# make a secondary mapper to load User against a join
othermapper = mapper(User, users_table.join(someothertable), non_primary=True)

# select
result = session.query(othermapper).select()
The second is the entity name mapper, a primary mapper that is distinguished from the default mapper by an additional string entity_name parameter. Instances loaded with this mapper will be totally managed by this new mapper and have no connection to the original one. Most methods on Session include an optional entity_name parameter in order to specify this condition.
example:
# primary mapper
mapper(User, users_table)

# make an entity name mapper that stores User objects in another table
mapper(User, alternate_users_table, entity_name='alt')

# make two User objects
user1 = User()
user2 = User()

# save one in the "users" table
session.save(user1)

# save the other in the "alternate_users_table"
session.save(user2, entity_name='alt')
session.flush()

# select from the alternate mapper
session.query(User, entity_name='alt').select()
A self-referential mapper is a mapper that is designed to operate with an adjacency list table. This is a table that contains one or more foreign keys back to itself, and is usually used to create hierarchical tree structures. SQLAlchemy's default model of saving items based on table dependencies is not sufficient in this case, as an adjacency list table introduces dependencies between individual rows. Fortunately, SQLAlchemy will automatically detect a self-referential mapper and do the extra lifting to make it work.
# define a self-referential table
trees = Table('treenodes', engine,
    Column('node_id', Integer, primary_key=True),
    Column('parent_node_id', Integer, ForeignKey('treenodes.node_id'), nullable=True),
    Column('node_name', String(50), nullable=False),
)

# treenode class
class TreeNode(object):
    pass

# mapper defines "children" property, pointing back to TreeNode class,
# with the mapper unspecified.  it will point back to the primary
# mapper on the TreeNode class.
TreeNode.mapper = mapper(TreeNode, trees, properties={
    'children' : relation(TreeNode, cascade="all"),
})
This kind of mapper goes through a lot of extra effort when saving and deleting items, to determine the correct dependency graph of nodes within the tree.
A self-referential mapper where there is more than one relationship on the table requires that all join conditions be explicitly spelled out. Below is a self-referring table that contains a "parent_node_id" column to reference parent/child relationships, and a "root_node_id" column which points child nodes back to the ultimate root node:
# define a self-referential table with several relations
trees = Table('treenodes', engine,
    Column('node_id', Integer, primary_key=True),
    Column('parent_node_id', Integer, ForeignKey('treenodes.node_id'), nullable=True),
    Column('root_node_id', Integer, ForeignKey('treenodes.node_id'), nullable=True),
    Column('node_name', String(50), nullable=False),
)

# treenode class
class TreeNode(object):
    pass

# define the "children" property as well as the "root" property
TreeNode.mapper = mapper(TreeNode, trees, properties={
    'children' : relation(
        TreeNode,
        primaryjoin=trees.c.parent_node_id==trees.c.node_id,
        cascade="all",
        backref=backref("parent", remote_side=[trees.c.node_id])
    ),
    'root' : relation(
        TreeNode,
        primaryjoin=trees.c.root_node_id==trees.c.node_id,
        remote_side=[trees.c.node_id],
        uselist=False
    )
})
The "root" property on a TreeNode is a many-to-one relationship. By default, a self-referential mapper declares relationships as one-to-many, so the extra parameter remote_side
, pointing to a column or list of columns on the remote side of a relationship, is needed to indicate a "many-to-one" self-referring relationship (note the previous keyword argument foreignkey
is deprecated).
Both TreeNode examples above are available in functional form in the examples/adjacencytree directory of the distribution.
Take any result set and feed it into a Query to produce objects. Multiple mappers can be combined to retrieve unrelated objects from the same row in one step. The instances method on Query takes a ResultProxy object, which is the result type generated from SQLEngine, and delivers object instances. (Note: this method has been moved off of Mapper, where it is deprecated.)
class User(object):
    pass

mapper(User, users_table)

# select users
c = users_table.select().execute()

# get objects
userlist = session.query(User).instances(c)

# define a second class/mapper
class Address(object):
    pass

mapper(Address, addresses_table)

# select users and addresses in one query
# use_labels is so that the user_id column in both tables are distinguished
s = select([users_table, addresses_table],
           users_table.c.user_id==addresses_table.c.user_id, use_labels=True)

# execute it, and process the results with the User mapper, chained to the Address mapper
r = session.query(User).instances(s.execute(), class_mapper(Address))

# result rows are an array of objects, one for each mapper used
for entry in r:
    user = entry[0]
    address = entry[1]
When result-set mapping is used with a particular Mapper, SQLAlchemy has no access to the query being used and therefore has no way of tacking on its own LEFT OUTER JOIN conditions that are normally used to eager load relationships. If the query being constructed is created in such a way that it returns rows not just from a parent table (or tables) but also returns rows from child tables, the result-set mapping can be notified as to which additional properties are contained within the result set. This is done using the contains_eager() query option, which specifies the name of the relationship to be eagerly loaded, and optionally a decorator function that can translate aliased column names when results are received.
# mapping is the users->addresses mapping
mapper(User, users_table, properties={
    'addresses':relation(Address, addresses_table, lazy=False)
})

# define a query on USERS with an outer join to ADDRESSES
statement = users_table.outerjoin(addresses_table).select(use_labels=True)

# construct a Query object which expects the "addresses" results
query = session.query(User).options(contains_eager('addresses'))

# get results normally
r = query.instances(statement.execute())
It is often the case with large queries that some of the tables within the query need to be aliased in order to distinguish them from other occurrences of the same table within the query. A query that attempts to add eagerly loaded child items will often have this condition. Therefore, with a little more effort, a decorator function can be used to produce translated rows (in the form of a dictionary which accepts Column instances), in the case that aliasing is used for the relationship tables.
# use an alias of the addresses table
adalias = addresses_table.alias('adalias')

# define a query on USERS with an outer join to adalias
statement = users_table.outerjoin(adalias).select(use_labels=True)

# define a row-translation function.  this should return a dictionary-like
# object that will receive Column instances from the normally expected
# table (i.e. addresses_table), and produce results from the actual result set
def adtranslator(row):
    d = {}
    for c in addresses_table.columns:
        d[c] = row[adalias.corresponding_column(c)]
    return d

# construct a Query object which expects the "addresses" results
query = session.query(User).options(contains_eager('addresses', decorator=adtranslator))

# get results normally
r = query.instances(statement.execute())
Options which can be sent to the mapper() function. For arguments to relation(), see Relation Options.
allow_column_override=False - if True, allows the creation of a relation() which has the same name as a column in the mapped table; the table column will no longer be mapped.
batch=True - when False, indicates that when a flush() occurs, each object is fully saved to the database before the next object is saved. Set this to False when MapperExtension objects are used to attach logic to before_insert(), before_update(), etc., and the user-defined logic requires that the full persistence of each instance must be completed before moving onto the next (such as logic which queries the tables for the most recent ID). Note that this flag has a significant impact on the efficiency of a large save operation.
column_prefix - a string which will be prepended to the name of each column-based attribute placed on the mapped class. Setting column_prefix='_' is equivalent to defining all column-based properties as _columnname=table.c.columnname. See Overriding Column Names for information on overriding column-based attribute names.
concrete=False - if True, indicates that this mapper should use concrete table inheritance with its parent mapper; requires inherits to be set.
non_primary=False - if True, constructs an additional load-only mapper for the class; to use it with the Session, pass it via session.query(somemapper). Note that it is usually invalid to define additional relationships on a non_primary mapper as they will conflict with those of the primary. See Multiple Mappers for One Class.
polymorphic_on - a Column used to distinguish the "class identity" of each row when several classes are mapped against the same table; requires the polymorphic_identity value to be set for all mappers in the inheritance hierarchy.
polymorphic_identity - a value which will be stored in the column denoted by polymorphic_on, corresponding to the "class identity" of this mapper. See Mapping a Class with Table Inheritance.
primary_key - a list of Column objects which define the "identity" columns of the mapped selectable unit, which is normally the local_table of the mapper combined against any inherited tables. When this argument is specified, the primary keys of the mapped table if any are disregarded in place of the columns given. This can be used to provide primary key identity to a table that has no PKs defined at the schema level, or to modify what defines "identity" for a particular table.
select_table - a Selectable which will take the place of the Mapper's main table argument when performing queries.
Mappers can have their functionality augmented or replaced at many points in their execution via the usage of the MapperExtension class. This class is just a series of "hooks" where various functionality takes place. An application can make its own MapperExtension objects, overriding only the methods it needs. Methods that are not overridden return the special value sqlalchemy.orm.mapper.EXT_PASS, which indicates the operation should proceed as normally.
class MapperExtension(object):
    """base implementation for an object that provides overriding
    behavior to various Mapper functions.  For each method in
    MapperExtension, a result of EXT_PASS indicates the functionality
    is not overridden."""
    def get_session(self):
        """called to retrieve a contextual Session instance with which to
        register a new object.  Note: this is not called if a session is
        provided with the __init__ params (i.e. _sa_session)"""
        return EXT_PASS
    def select_by(self, query, *args, **kwargs):
        """overrides the select_by method of the Query object"""
        return EXT_PASS
    def select(self, query, *args, **kwargs):
        """overrides the select method of the Query object"""
        return EXT_PASS
    def create_instance(self, mapper, selectcontext, row, class_):
        """called when a new object instance is about to be created from a row.
        the method can choose to create the instance itself, or it can return
        None to indicate normal object creation should take place.

        mapper - the mapper doing the operation

        selectcontext - SelectionContext corresponding to the instances() call

        row - the result row from the database

        class_ - the class we are mapping.
        """
        return EXT_PASS
    def append_result(self, mapper, selectcontext, row, instance, identitykey, result, isnew):
        """called when an object instance is being appended to a result list.

        If this method returns EXT_PASS, it is assumed that the mapper should do
        the appending, else if this method returns any other value or None, it is
        assumed that the append was handled by this method.

        mapper - the mapper doing the operation

        selectcontext - SelectionContext corresponding to the instances() call

        row - the result row from the database

        instance - the object instance to be appended to the result

        identitykey - the identity key of the instance

        result - list to which results are being appended

        isnew - indicates if this is the first time we have seen this object instance
        in the current result set.  if you are selecting from a join, such as an
        eager load, you might see the same object instance many times in the same
        result set.
        """
        return EXT_PASS
    def populate_instance(self, mapper, selectcontext, row, instance, identitykey, isnew):
        """called right before the mapper, after creating an instance from a row,
        passes the row to its MapperProperty objects which are responsible for
        populating the object's attributes.  If this method returns EXT_PASS, it
        is assumed that the mapper should do the populating, else if this method
        returns any other value or None, it is assumed that the populating was
        handled by this method.

        Essentially, this method is used to have a different mapper populate
        the object:

            def populate_instance(self, mapper, selectcontext, row, instance, identitykey, isnew):
                othermapper.populate_instance(selectcontext, instance, row, identitykey,
                                              isnew, frommapper=mapper)
                return True
        """
        return EXT_PASS
    def before_insert(self, mapper, connection, instance):
        """called before an object instance is INSERTed into its table.

        this is a good place to set up primary key values and such that
        aren't handled otherwise."""
        return EXT_PASS
    def before_update(self, mapper, connection, instance):
        """called before an object instance is UPDATEd"""
        return EXT_PASS
    def after_update(self, mapper, connection, instance):
        """called after an object instance is UPDATEd"""
        return EXT_PASS
    def after_insert(self, mapper, connection, instance):
        """called after an object instance has been INSERTed"""
        return EXT_PASS
    def before_delete(self, mapper, connection, instance):
        """called before an object instance is DELETEd"""
        return EXT_PASS
    def after_delete(self, mapper, connection, instance):
        """called after an object instance is DELETEd"""
        return EXT_PASS
To use MapperExtension, make your own subclass of it and just send it off to a mapper:
m = mapper(User, users_table, extension=MyExtension())
Multiple extensions will be chained together and processed in order; they are specified as a list:
m = mapper(User, users_table, extension=[ext1, ext2, ext3])
The package sqlalchemy.types defines the datatype identifiers which may be used when defining metadata. This package includes a set of generic types, a set of SQL-specific subclasses of those types, and a small extension system used by specific database connectors to adapt these generic types into database-specific type objects.
SQLAlchemy comes with a set of standard generic datatypes, which are defined as classes.
The standard set of generic types are:
class String(TypeEngine):
    def __init__(self, length=None)

class Integer(TypeEngine)

class SmallInteger(Integer)

class Numeric(TypeEngine):
    def __init__(self, precision=10, length=2)

class Float(Numeric):
    def __init__(self, precision=10)

# DateTime, Date and Time types deal with datetime objects
# from the Python datetime module
class DateTime(TypeEngine)

class Date(TypeEngine)

class Time(TypeEngine)

class Binary(TypeEngine):
    def __init__(self, length=None)

class Boolean(TypeEngine)

# converts unicode strings to raw bytes as bind params,
# raw bytes to unicode as rowset values, using the unicode
# encoding setting on the engine (defaults to 'utf-8')
class Unicode(TypeDecorator):
    impl = String

# uses the pickle protocol to serialize data in/out of Binary columns
class PickleType(TypeDecorator):
    impl = Binary
More specific subclasses of these types are available, which various database engines may choose to implement specifically, allowing finer grained control over types:
class FLOAT(Numeric)
class TEXT(String)
class DECIMAL(Numeric)
class INT(Integer)
INTEGER = INT
class TIMESTAMP(DateTime)
class DATETIME(DateTime)
class CLOB(String)
class VARCHAR(String)
class CHAR(String)
class BLOB(Binary)
class BOOLEAN(Boolean)
When using a specific database engine, these types are adapted even further via a set of database-specific subclasses defined by the database engine. There may eventually be more type objects that are defined for specific databases. An example of this would be Postgres' Array type.
Type objects are specified to table meta data using either the class itself, or an instance of the class. Creating an instance of the class allows you to specify parameters for the type, such as string length, numerical precision, etc.:
mytable = Table('mytable', engine,
    # define type using a class
    Column('my_id', Integer, primary_key=True),

    # define type using an object instance
    Column('value', Numeric(7,4))
)
User-defined types can be created, to support either database-specific types, or customized pre-processing of query parameters as well as post-processing of result set data. You can make your own classes to perform these operations. To augment the behavior of a TypeEngine type, such as String, the TypeDecorator class is used:
import sqlalchemy.types as types

class MyType(types.TypeDecorator):
    """basic type that decorates String, prefixes values with
    "PREFIX:" on the way in and strips it off on the way out."""
    impl = types.String

    def convert_bind_param(self, value, engine):
        return "PREFIX:" + value

    def convert_result_value(self, value, engine):
        return value[7:]
The Unicode and PickleType classes are instances of TypeDecorator already and can be subclassed directly.
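For example, a minimal sketch of subclassing Unicode to layer extra conversion on top of its encoding behavior (the class name and the case-folding rule are made up for illustration):

import sqlalchemy.types as types

class CaseFoldedUnicode(types.Unicode):
    # encode via the normal Unicode behavior, then case-fold the raw bytes
    def convert_bind_param(self, value, engine):
        value = types.Unicode.convert_bind_param(self, value, engine)
        if value is not None:
            value = value.lower()
        return value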
To build a type object from scratch, which will not have a corresponding database-specific implementation, subclass TypeEngine:
import sqlalchemy.types as types

class MyType(types.TypeEngine):
    def __init__(self, precision=8):
        self.precision = precision

    def get_col_spec(self):
        return "MYTYPE(%s)" % self.precision

    def convert_bind_param(self, value, engine):
        return value

    def convert_result_value(self, value, engine):
        return value
This section describes the connection pool module of SQLAlchemy. The Pool object it provides is normally embedded within an Engine instance. For most cases, explicit access to the pool module is not required. However, the Pool object can be used on its own, without the rest of SA, to manage DBAPI connections; this section describes that usage. This section will also describe in more detail how to customize the pooling strategy used by an Engine.
At the base of any database helper library is a system of efficiently acquiring connections to the database. Since the establishment of a database connection is typically a somewhat expensive operation, an application needs a way to get at database connections repeatedly without incurring the full overhead each time. Particularly for server-side web applications, a connection pool is the standard way to maintain a "pool" of database connections which are used over and over again among many requests. Connection pools typically are configured to maintain a certain "size", which represents how many connections can be used simultaneously without resorting to creating more newly-established connections.
Any DBAPI module can be "proxied" through the connection pool using the following technique (note that the usage of 'psycopg2' is just an example; substitute whatever DBAPI module you'd like):
import sqlalchemy.pool as pool
import psycopg2 as psycopg

psycopg = pool.manage(psycopg)

# then connect normally
connection = psycopg.connect(database='test', user='scott', password='tiger')
This produces a sqlalchemy.pool.DBProxy object which supports the same connect() function as the original DBAPI module. Upon connection, a connection proxy object is returned, which delegates its calls to a real DBAPI connection object. This connection object is stored persistently within a connection pool (an instance of sqlalchemy.pool.Pool) that corresponds to the exact connection arguments sent to the connect() function.
The connection proxy supports all of the methods on the original connection object, most of which are proxied via __getattr__(). The close() method will return the connection to the pool, and the cursor() method will return a proxied cursor object. Both the connection proxy and the cursor proxy will also return the underlying connection to the pool after they have both been garbage collected, which is detected via the __del__() method.
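A short sketch of a normal checkout/checkin cycle through the proxy, continuing the psycopg example above:

# connect() checks a connection out of the pool (or creates one)
connection = psycopg.connect(database='test', user='scott', password='tiger')
cursor = connection.cursor()
cursor.execute("select 1")
cursor.close()

# close() checks the underlying DBAPI connection back into the pool
connection.close()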
Additionally, when connections are returned to the pool, a rollback() is issued on the connection unconditionally. This is to release any locks still held by the connection that may have resulted from normal activity.
By default, the connect() method will return the same connection that is already checked out in the current thread. This allows a particular connection to be used in a given thread without needing to pass it around between functions. To disable this behavior, specify use_threadlocal=False to the manage() function.
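For example, a brief sketch of disabling the thread-local behavior at manage() time:

import sqlalchemy.pool as pool
import psycopg2 as psycopg

# each connect() call now draws a distinct connection from the pool,
# even within the same thread
psycopg = pool.manage(psycopg, use_threadlocal=False)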
For all types of Pool construction, which includes the "transparent proxy" described in the previous section, using an Engine via create_engine(), or constructing a pool through direct class instantiation, the options are generally the same. Additional options may be available based on the specific subclass of Pool being used.
For a description of all pool classes, see the generated documentation.
Common options include:
echo=False - if set to True, the pool will log informational output, such as when connections are invalidated and when they are recycled, to the standard output; this is equivalent to the echo_pool flag on create_engine().

use_threadlocal=True - if set to True, repeated calls to connect() within the same application thread will return the same connection object that is already checked out; this flag is also understood by create_engine(), corresponding to the "plain" or "threadlocal" connection strategy.
QueuePool options include:

pool_size=5 - the number of connections to keep open inside the connection pool.

max_overflow=10 - the number of connections to allow in "overflow", beyond the pool_size setting, during periods of high demand.
Besides using the transparent proxy, instances of sqlalchemy.pool.Pool can be created directly. Constructing your own pool involves passing a callable used to create a connection. Through this method, custom connection schemes can be made, such as a connection that automatically executes some initialization commands to start.
import sqlalchemy.pool as pool
import psycopg2

def getconn():
    c = psycopg2.connect(user='ed', host='127.0.0.1', dbname='test')
    # execute an initialization function on the connection before returning
    c.cursor().execute("setup_encodings()")
    return c

p = pool.QueuePool(getconn, max_overflow=10, pool_size=5, use_threadlocal=True)
Or with SingletonThreadPool:
import sqlalchemy.pool as pool
import sqlite

def getconn():
    return sqlite.connect(filename='myfile.db')

# SQLite connections require the SingletonThreadPool
p = pool.SingletonThreadPool(getconn)
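Once constructed, the pool is used by checking connections out and in directly (a brief sketch):

# check a connection out of the pool
conn = p.connect()
cursor = conn.cursor()
cursor.execute("select * from mytable")
cursor.close()

# return it to the pool
conn.close()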
SQLAlchemy has a variety of extensions and "mods" available which provide extra functionality to SA, either via explicit usage or by augmenting the core behavior. Several of these extensions are designed to work together.
Author: Daniel Miller
This plugin is used to instantiate and manage Session objects. It is the preferred way to provide thread-local session functionality to an application. It provides several services:
It serves as a factory to create sessions. It can be configured to call create_session() with a particular set of arguments, or to instantiate a different implementation of Session if one is available.

By maintaining a context in which to place the Session objects it creates, it provides the ability to maintain a single Session per distinct application thread. The Session returned by a SessionContext is called the contextual session. Providing at least a thread-local context to sessions is important because the Session object is not threadsafe, and is intended to be used with localized sets of data, as opposed to a single session being used application wide.

While the default behavior is to maintain a single Session per thread, the contextual algorithm can be changed to support any kind of contextual scheme.

It provides a MapperExtension that can enhance a Mapper, such that it can automatically save() newly instantiated objects to the current contextual session. It also allows Query objects to be created without an explicit Session. While this is very convenient functionality, having it switched on without understanding it can be very confusing. Note that this feature is optional when using SessionContext.
Using the SessionContext in its most basic form involves just instantiating a SessionContext:
import sqlalchemy
from sqlalchemy.ext.sessioncontext import SessionContext

ctx = SessionContext(sqlalchemy.create_session)

class User(object):
    pass

mapper(User, users_table)
u = User()

# the contextual session is referenced by the "current" property on SessionContext
ctx.current.save(u)
ctx.current.flush()
From this example, one might see that the SessionContext's typical scope is at the module or application level. Since the Session itself is better suited to be used in per-user-request or even per-function scope, the SessionContext provides an easy way to manage the scope of those Session objects.
The construction of each Session instance can be customized by providing a "creation function" which returns a new Session. A common customization is a Session which needs to explicitly bind to a particular Engine:
import sqlalchemy
from sqlalchemy.ext.sessioncontext import SessionContext

# create an engine
someengine = sqlalchemy.create_engine('sqlite:///')

# a function to return a Session bound to our engine
def make_session():
    return sqlalchemy.create_session(bind_to=someengine)

# SessionContext
ctx = SessionContext(make_session)

# get the session bound to engine "someengine":
session = ctx.current
The above pattern is more succinctly expressed using Python lambdas:
ctx = SessionContext(lambda:sqlalchemy.create_session(bind_to=someengine))
The default creation function is simply:
ctx = SessionContext(sqlalchemy.create_session)
The "scope" to which the session is associated, which by default is a thread-local scope, can be customized by providing a "scope callable" which returns a hashable key that represents the current scope:
import sqlalchemy
from sqlalchemy.ext.sessioncontext import SessionContext

# global declaration of "scope"
scope = "scope1"

# a function to return the current "session scope"
def global_scope_func():
    return scope

# create SessionContext with a custom "scopefunc"
ctx = SessionContext(sqlalchemy.create_session, scopefunc=global_scope_func)

# get the session corresponding to "scope1":
session = ctx.current

# switch the "scope"
scope = "scope2"

# get the session corresponding to "scope2":
session = ctx.current
Examples of customized scope can include user-specific sessions or requests, or even sub-elements of an application, such as a graphical application which maintains a single Session per application window (this was the original motivation to create SessionContext).
This is a MapperExtension which allows a Mapper to be automatically associated with a SessionContext. Newly constructed objects get save()d to the session automatically, and Query objects can be constructed without a session. The instance of SessionContextExt is provided by the SessionContext itself:
import sqlalchemy
from sqlalchemy.ext.sessioncontext import SessionContext

ctx = SessionContext(sqlalchemy.create_session)

class User(object):
    pass

mapper(User, users_table, extension=ctx.mapper_extension)

# 'u' is automatically added to the current session of 'ctx'
u = User()
assert u in ctx.current

# get the current session and flush
ctx.current.flush()
The MapperExtension can be configured either per-mapper as above, or on an application-wide basis using:
import sqlalchemy
from sqlalchemy.orm.mapper import global_extensions
from sqlalchemy.ext.sessioncontext import SessionContext

ctx = SessionContext(sqlalchemy.create_session)
global_extensions.append(ctx.mapper_extension)
SessionContextExt allows Query objects to be created against the mapped class without specifying a Session. Each Query will automatically make usage of the current contextual session:
# create a Query from a class
query = Query(User)

# specify entity name
query = Query(User, entity_name='foo')

# create a Query from a mapper
query = Query(mapper)

# then use it
result = query.select()
When installed globally, all Mapper objects will contain a built-in association to the SessionContext. This means that once a mapped instance is created, creating a new Session and calling save() with the instance as an argument will raise an error stating that the instance is already associated with a different session. While you can always remove the object from its original session, SessionContextExt is probably convenient only for an application that does not need much explicit manipulation of sessions.
The user still has some control over which session gets used at instance construction time. An instance can be redirected at construction time to a different Session by specifying the keyword parameter _sa_session to its constructor, which is decorated by the mapper:
# create a new session distinct from the contextual session
session = create_session()

# make a new User that is saved to this session
myuser = User(_sa_session=session)
Similarly, the entity_name parameter, which specifies an alternate Mapper to be used when attaching this instance to the Session, can be specified via _sa_entity_name:
myuser = User(_sa_session=session, _sa_entity_name='altentity')
The decoration of mapped instances' __init__() method is similar to this example:
oldinit = class_.__init__   # the previous init method

def __init__(self, *args, **kwargs):
    session = kwargs.pop('_sa_session', None)
    entity_name = kwargs.pop('_sa_entity_name', None)
    if session is None:
        # get Session from this Mapper's MapperExtension
        session = ext.get_session()
        if session is EXT_PASS:
            session = None
    if session is not None:
        # attach to the current session
        session.save(self, entity_name=entity_name)
    # call previous init method
    oldinit(self, *args, **kwargs)
Author: Jonas Borgström
SelectResults gives transformative behavior to the results returned from the select and select_by methods of Query.
from sqlalchemy.ext.selectresults import SelectResults

query = session.query(MyClass)
res = SelectResults(query)

# adds a WHERE clause (or appends to the existing one via "and")
res = res.filter(table.c.column == "something")

# adds an ORDER BY clause
res = res.order_by([table.c.column])

# fetch and print the top ten instances - adds OFFSET 0 LIMIT 10 or equivalent
for x in res[:10]:
    print x.column2

# evaluate as a list, which executes the query
x = list(res)

# count how many instances have column2 > 42 and column == "something"
print res.filter(table.c.column2 > 42).count()

# select() is a synonym for filter()
session.query(MyClass).select(mytable.c.column=="something").order_by([mytable.c.column])[2:7]
An important facet of SelectResults is that the actual SQL execution does not occur until the object is used in a list or iterator context. This means you can call any number of transformative methods (including filter, order_by, list range expressions, etc.) before any SQL is actually issued.
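A sketch illustrating the deferred execution (assuming the MyClass mapping above):

res = SelectResults(session.query(MyClass))

# no SQL has been issued yet; each call returns a new SelectResults
res = res.filter(mytable.c.column2 > 42)
res = res.order_by([mytable.c.column])
top = res[:5]            # the range is recorded as LIMIT/OFFSET, still no SQL

results = list(top)      # the query is compiled and executed here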
Configuration of SelectResults may be per-Query, per Mapper, or per application:
from sqlalchemy.ext.selectresults import SelectResults, SelectResultsExt

# construct a SelectResults for an individual Query
sel = SelectResults(session.query(MyClass))

# construct a Mapper where the Query.select()/select_by() methods will return a SelectResults:
mapper(MyClass, mytable, extension=SelectResultsExt())

# globally configure all Mappers to return SelectResults, using the "selectresults" mod
import sqlalchemy.mods.selectresults
SelectResults greatly enhances querying and is highly recommended. For example, here is a query constructed using a combination of joins and outerjoins:
mapper(User, users_table, properties={
    'orders':relation(mapper(Order, orders_table, properties={
        'items':relation(mapper(Item, items_table))
    }))
})

session = create_session()
query = SelectResults(session.query(User))
result = query.outerjoin_to('orders').outerjoin_to('items').select(
    or_(Order.c.order_id==None, Item.c.item_id==2))
For a full listing of methods, see the generated documentation.
Author: Mike Bayer
This extension is used to decorate a mapped class with direct knowledge about its own Mapper, a contextual Session, as well as functions provided by the Query and Session objects. The methods will automatically make usage of a contextual session with which all newly constructed objects are associated. assign_mapper operates as a MapperExtension, and requires the usage of a SessionContext as well as SessionContextExt, described in SessionContext. It replaces the usage of the normal mapper function with its own version that adds a SessionContext, specified as the first argument:
import sqlalchemy
from sqlalchemy.ext.sessioncontext import SessionContext
from sqlalchemy.ext.assignmapper import assign_mapper

# session context
ctx = SessionContext(sqlalchemy.create_session)

# assign mapper to class MyClass using table 'sometable', getting
# Sessions from 'ctx'.
assign_mapper(ctx, MyClass, sometable, properties={...}, ...)
Above, all new instances of MyClass will be associated with the contextual session, ctx.current. Additionally, MyClass and instances of MyClass now contain a large set of methods, including get, select, flush, and delete. The full list is as follows:
# Query methods:
['get', 'select', 'select_by', 'selectone', 'get_by', 'join_to', 'join_via',
 'count', 'count_by']

# Session methods:
['flush', 'delete', 'expire', 'refresh', 'expunge', 'merge', 'save', 'update',
 'save_or_update']
To continue the MyClass example:
# create a MyClass.  it will be automatically assigned to the contextual Session.
mc = MyClass()

# save MyClass - this will call flush() on the session, specifying 'mc' as the only
# object to be affected
mc.flush()

# load an object, using Query methods attached to MyClass
result = MyClass.get_by(id=5)

# delete it
result.delete()

# commit the change
result.flush()
It should be noted that the flush() method on the instance need not be called. You're probably better off calling flush() on the actual session, so that all changes are properly written to the database simultaneously:
# create a MyClass.
mc = MyClass()

# load some MyClass objects
result = MyClass.select(MyClass.c.name=='bar')

# delete one of them
result[1].delete()

# commit all changes
ctx.current.flush()
Author: Mike Bayer
Version: 0.3.1 or greater
associationproxy is used to create a transparent proxy to the associated object in an association relationship, thereby decreasing the verbosity of the pattern in cases where explicit access to the association object is not required. The association relationship pattern is a richer form of a many-to-many relationship, which is described in Association Object. It is strongly recommended to fully understand the association object pattern in its explicit form before using this extension; see the examples in the SQLAlchemy distribution under the directory examples/association/.
When dealing with association relationships, the association object refers to the object that maps to a row in the association table (i.e. the many-to-many table), while the associated object refers to the "endpoint" of the association, i.e. the ultimate object referenced by the parent. The proxy can return collections of objects attached to association objects, and can also create new association objects given only the associated object. An example using the Keyword mapping described in the data mapping documentation is as follows:
from sqlalchemy.ext.associationproxy import AssociationProxy

class User(object):
    pass

class Keyword(object):
    def __init__(self, name):
        self.keyword_name = name

class Article(object):
    # create "keywords" proxied association.
    # the collection is called 'keyword_associations', the endpoint
    # attribute of each association object is called 'keyword'.  the
    # class itself of the association object will be figured out automatically.
    keywords = AssociationProxy('keyword_associations', 'keyword')

class KeywordAssociation(object):
    pass

# create mappers normally
# note that we set up 'keyword_associations' on Article,
# and 'keyword' on KeywordAssociation.
mapper(Article, articles_table, properties={
    'keyword_associations':relation(KeywordAssociation, lazy=False,
                                    cascade="all, delete-orphan")
})
mapper(KeywordAssociation, itemkeywords_table,
    primary_key=[itemkeywords_table.c.article_id, itemkeywords_table.c.keyword_id],
    properties={
        'keyword' : relation(Keyword, lazy=False),
        'user' : relation(User, lazy=False)
    }
)
mapper(User, users_table)
mapper(Keyword, keywords_table)

# now, Keywords can be attached to an Article directly;
# KeywordAssociation will be created by the AssociationProxy, and have the
# 'keyword' attribute set to the new Keyword.
# note that these KeywordAssociation objects will not have a User attached to them.
article = Article()
article.keywords.append(Keyword('blue'))
article.keywords.append(Keyword('red'))
session.save(article)
session.flush()

# the "keywords" collection also returns the underlying Keyword objects
article = session.query(Article).get_by(id=12)
for k in article.keywords:
    print "Keyword:", k.keyword_name

# the original 'keyword_associations' relation exists normally with no awareness of the proxy
article.keyword_associations.append(KeywordAssociation())
print [ka for ka in article.keyword_associations]
Note that the above operations on the keywords collection are proxying operations to and from the keyword_associations collection, which exists normally and can be accessed directly. AssociationProxy will also detect whether the collection is list- or scalar-based and will configure the proxied property to act the same way.
For the common case where the association object's creation needs to be specified by the application, AssociationProxy takes an optional callable creator(), which takes a single associated object as an argument and returns a new association object:
def create_keyword_association(keyword):
    ka = KeywordAssociation()
    ka.keyword = keyword
    return ka

class Article(object):
    # create "keywords" proxied association
    keywords = AssociationProxy('keyword_associations', 'keyword',
                                creator=create_keyword_association)
Author: Mike Bayer and Daniel Miller
threadlocal is an extension that was created primarily to provide backwards compatibility with the older SQLAlchemy 0.1 series. It uses three features which SQLAlchemy 0.2 and above provide as distinct features: SessionContext, assign_mapper, and the TLEngine, which is the Engine used with the threadlocal create_engine() strategy. It is strongly recommended that these three features be understood individually before using threadlocal.
In SQLAlchemy 0.1, users never dealt with explicit connections and didn't have a very explicit Session interface, instead relying upon a more magical global object called objectstore. The objectstore idea was wildly popular with about half of SA's users, and completely unpopular with the other half. The threadlocal mod basically brings back objectstore, which is in fact just a SessionContext where you can call Session methods directly off of it, instead of saying context.current. For threadlocal to faithfully produce 0.1 behavior, it is invoked as a mod which globally installs the objectstore's mapper extension, such that all Mappers will automatically assign all new instances of mapped classes to the objectstore's contextual Session. Additionally, it also changes the default engine strategy used by create_engine to be the "threadlocal" strategy, which in normal practice does not affect much.
When you import threadlocal, what you get is:
A SessionContext called objectstore is added to the sqlalchemy namespace.

A MapperExtension is set up for all mappers which assigns "objectstore"'s session as the default session context, used by new instances as well as Query objects (see the section Using SessionContextExt).

An assign_mapper function is added to the sqlalchemy namespace, which calls the assignmapper mapper function using the new "objectstore" context.

The create_engine function is modified so that "threadlocal", and not "plain", is the default engine strategy.
So an important point to understand is: don't use the threadlocal mod unless you are explicitly looking for that behavior. Unfortunately, the easy import of the "threadlocal" mod has found its way into several tutorials on external websites, which produces application-wide behavior that is in conflict with the SQLAlchemy tutorial and data mapping documentation.
While "threadlocal" is only about 10 lines of code, it is strongly advised that users instead make usage of SessionContext
and assign_mapper
explictly to eliminate confusion. Additionally, the "threadlocal" strategy on create_engine()
also exists primarily to provide patterns used in 0.1 and is probably not worth using either, unless you specifically need those patterns.
Basic usage of threadlocal involves importing the mod, before any usage of the sqlalchemy
namespace, since threadlocal is going to add the "objectstore" and "assign_mapper" keywords to "sqlalchemy".
To use objectstore
:
import sqlalchemy.mods.threadlocal
from sqlalchemy import *

metadata = BoundMetaData('sqlite:///')
user_table = Table('users', metadata,
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(50), nullable=False))

class User(object):
    pass

mapper(User, user_table)

# "user" object is added to the session automatically
user = User()

# flush the contextual session
objectstore.flush()
The actual Session
is available as:
objectstore.get_session()
To use assign_mapper
:
import sqlalchemy.mods.threadlocal
from sqlalchemy import *

metadata = BoundMetaData('sqlite:///')
user_table = Table('users', metadata,
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(50), nullable=False))

class User(object):
    pass

# note that no "context" object is needed
assign_mapper(User, user_table)

# call up a user
user = User.selectfirst(user_table.c.user_id == 7)

# call 'delete' on the user
user.delete()

# flush
objectstore.flush()
Author: Jonathan LaCour
ActiveMapper is a so-called "declarative layer" which allows the construction of a class, a Table
, and a Mapper
all in one step:
class Person(ActiveMapper):
    class mapping:
        id          = column(Integer, primary_key=True)
        full_name   = column(String)
        first_name  = column(String)
        middle_name = column(String)
        last_name   = column(String)
        birth_date  = column(DateTime)
        ssn         = column(String)
        gender      = column(String)
        home_phone  = column(String)
        cell_phone  = column(String)
        work_phone  = column(String)
        prefs_id    = column(Integer, foreign_key=ForeignKey('preferences.id'))
        addresses   = one_to_many('Address', colname='person_id', backref='person')
        preferences = one_to_one('Preferences', colname='pref_id', backref='person')

    def __str__(self):
        s =  '%s\n' % self.full_name
        s += '  * birthdate: %s\n' % (self.birth_date or 'not provided')
        s += '  * fave color: %s\n' % (self.preferences.favorite_color or 'Unknown')
        s += '  * personality: %s\n' % (self.preferences.personality_type or 'Unknown')
        for address in self.addresses:
            s += '  * address: %s\n' % address.address_1
            s += '             %s, %s %s\n' % (address.city, address.state, address.postal_code)
        return s

class Preferences(ActiveMapper):
    class mapping:
        __table__        = 'preferences'
        id               = column(Integer, primary_key=True)
        favorite_color   = column(String)
        personality_type = column(String)

class Address(ActiveMapper):
    class mapping:
        id          = column(Integer, primary_key=True)
        type        = column(String)
        address_1   = column(String)
        city        = column(String)
        state       = column(String)
        postal_code = column(String)
        person_id   = column(Integer, foreign_key=ForeignKey('person.id'))
More discussion on ActiveMapper can be found at Jonathan LaCour's Blog as well as the SQLAlchemy Wiki.
Author: Jonathan Ellis
SqlSoup creates mapped classes on the fly from tables, which are automatically reflected from the database based on name. It is essentially a nicer version of the "row data gateway" pattern.
>>> from sqlalchemy.ext.sqlsoup import SqlSoup
>>> db = SqlSoup('sqlite:///')
>>> db.users.select(order_by=[db.users.c.name])
[MappedUsers(name='Bhargan Basepair',email='basepair@example.edu',password='basepair',classname=None,admin=1),
 MappedUsers(name='Joe Student',email='student@example.edu',password='student',classname=None,admin=0)]
Full SqlSoup documentation is on the SQLAlchemy Wiki.
Author: Jason Pellerin
The ProxyEngine
is used to "wrap" an Engine
, and via subclassing ProxyEngine
one can instrument the functionality of an arbitrary Engine
instance through the decorator pattern. It also provides a connect()
method which will send all Engine
requests to different underlying engines. Its functionality in that regard is largely superseded now by DynamicMetaData
which is a better solution.
from sqlalchemy.ext.proxy import ProxyEngine

proxy = ProxyEngine()
proxy.connect('postgres://user:pw@host/db')
defines the base components of SQL expression trees.
joins a list of clauses together by the AND operator. the & operator can be used as well.
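For illustration, a minimal sketch assuming a users table like the tutorial's; both forms produce the same AND clause:

and_clause = and_(users.c.user_name == 'ed', users.c.user_id > 5)

# the & operator produces the same clause; the parenthesis are
# required due to Python operator precedence
and_clause = (users.c.user_name == 'ed') & (users.c.user_id > 5)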
returns BETWEEN predicate clause (clausetest BETWEEN clauseleft AND clauseright).
this is better called off a ColumnElement directly, i.e.
column.between(value1, value2).
creates a bind parameter clause with the given key.
An optional default value can be specified by the value parameter, and the optional type parameter is a sqlalchemy.types.TypeEngine object which indicates bind-parameter and result-set translation for this bind parameter.
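A brief sketch, assuming the tutorial's users table is bound to an engine; the parameter value is supplied at execution time:

# a bind parameter named 'name', with an explicit String type
s = select([users], users.c.user_name == bindparam('name', type=String))
result = s.execute(name='ed')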
SQL CASE statement -- whens are a sequence of pairs to be translated into "when / then" clauses; optional [value] for simple case statements, and [else_] for case defaults
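As a sketch, again assuming a users table, a searched CASE might look like:

# each pair is (condition, result); 'else_' supplies the default
expr = case([
    (users.c.user_name == 'ed', 'is ed'),
    (users.c.user_name == 'fred', 'is fred')
], else_='unknown')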
returns CAST function CAST(clause AS totype). Use with a sqlalchemy.types.TypeEngine object, i.e. cast(table.c.unit_price * table.c.qty, Numeric(10,4)) or cast(table.c.timestamp, DATE)
returns a textual column clause, relative to a table. this is also the primitive version of a schema.Column which is a subclass.
returns a DELETE clause element.
This can also be called from a table directly via the table's delete() method.
'table' is the table to delete rows from. 'whereclause' is a ClauseElement describing the WHERE condition of the DELETE statement.
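For example, both forms below produce the same DELETE construct against the tutorial's users table:

d = delete(users, users.c.user_id == 7)
d = users.delete(users.c.user_id == 7)  # equivalent form
d.execute()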
returns an INSERT clause element.
This can also be called from a table directly via the table's insert() method.
'table' is the table to be inserted into.
'values' is a dictionary which specifies the column specifications of the INSERT, and is optional. If left as None, the column specifications are determined from the bind parameters used during the compile phase of the INSERT statement. If the bind parameters also are None during the compile phase, then the column specifications will be generated from the full list of table columns.
If both 'values' and compile-time bind parameters are present, the compile-time bind parameters override the information specified within 'values' on a per-key basis.
The keys within 'values' can be either Column objects or their string identifiers. Each key may reference one of: a literal data value (i.e. string, number, etc.), a Column object, or a SELECT statement. If a SELECT statement is specified which references this INSERT statement's table, the statement will be correlated against the INSERT statement.
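A sketch of both styles against the tutorial's users table:

# column specifications given up front via 'values'
insert(users, values={'user_name': 'ed'}).execute()

# column specifications determined from the execution-time bind parameters
users.insert().execute(user_name='harry')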
return a JOIN clause element (regular inner join).
left - the left side of the join right - the right side of the join onclause - optional criterion for the ON clause, is derived from foreign key relationships otherwise
To chain joins together, use the resulting Join object's "join()" or "outerjoin()" methods.
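A sketch of join chaining; the addresses and orders tables and their foreign keys to users are assumptions made for illustration:

# the ON clause is derived from the users/addresses foreign key
j = join(users, addresses)

# chain a second join, this time with an explicit ON clause
j = j.join(orders, users.c.user_id == orders.c.user_id)
result = select([users], from_obj=[j]).execute()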
returns a literal clause, bound to a bind parameter.
literal clauses are created automatically when used as the right-hand side of a boolean or math operation against a column object. use this function when a literal is needed on the left-hand side (and optionally on the right as well).
the optional type parameter is a sqlalchemy.types.TypeEngine object which indicates bind-parameter and result-set translation for this literal.
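For instance, to place a literal on the left-hand side of a comparison (users table assumed):

clause = literal('ed') == users.c.user_name

# with an explicit type for bind-parameter and result-set translation
clause = literal('ed', type=String) == users.c.user_name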
returns a negation of the given clause, i.e. NOT(clause). the ~ operator can be used as well.
joins a list of clauses together by the OR operator. the | operator can be used as well.
return an OUTER JOIN clause element.
left - the left side of the join right - the right side of the join onclause - optional criterion for the ON clause, is derived from foreign key relationships otherwise
To chain joins together, use the resulting Join object's "join()" or "outerjoin()" methods.
returns a SELECT clause element.
this can also be called via the table's select() method.
'columns' is a list of columns and/or selectable items to select columns from.
'whereclause' is a text or ClauseElement expression which will form the WHERE clause.
'from_obj' is a list of additional "FROM" objects, such as Join objects, which will extend or override the default "from" objects created from the column list and the whereclause.
**kwargs - additional parameters for the Select object.
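A sketch combining the three arguments; the addresses table is an assumption for illustration:

s = select(
    [users.c.user_name, addresses.c.email_address],   # columns
    users.c.user_name == 'ed',                        # whereclause
    from_obj=[join(users, addresses)])                # explicit FROM
result = s.execute()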
returns a table clause. this is a primitive version of the schema.Table object, which is a subclass of this object.
creates literal text to be inserted into a query.
When constructing a query from a select(), update(), insert() or delete(), using plain strings for argument values will usually result in text objects being created automatically. Use this function when creating textual clauses outside of other ClauseElement objects, or optionally wherever plain text is to be used.
Arguments include:
text - the text of the SQL statement to be created. use :<param> to specify bind parameters; they will be compiled to their engine-specific format.
engine - an optional engine to be used for this text query.
bindparams - a list of bindparam() instances which can be used to define the types and/or initial values for the bind parameters within the textual statement; the keynames of the bindparams must match those within the text of the statement. The types will be used for pre-processing on bind values.
typemap - a dictionary mapping the names of columns represented in the SELECT clause of the textual statement to type objects, which will be used to perform post-processing on columns within the result set (for textual statements that produce result sets).
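A sketch tying these arguments together, with db an Engine as created earlier:

t = text("SELECT user_id, user_name FROM users WHERE user_name = :name",
         engine=db,
         bindparams=[bindparam('name', type=String)],
         typemap={'user_id': Integer, 'user_name': String})
result = t.execute(name='ed')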
returns an UPDATE clause element.
This can also be called from a table directly via the table's update() method.
'table' is the table to be updated. 'whereclause' is a ClauseElement describing the WHERE condition of the UPDATE statement. 'values' is a dictionary which specifies the SET conditions of the UPDATE, and is optional. If left as None, the SET conditions are determined from the bind parameters used during the compile phase of the UPDATE statement. If the bind parameters also are None during the compile phase, then the SET conditions will be generated from the full list of table columns.
If both 'values' and compile-time bind parameters are present, the compile-time bind parameters override the information specified within 'values' on a per-key basis.
The keys within 'values' can be either Column objects or their string identifiers. Each key may reference one of: a literal data value (i.e. string, number, etc.), a Column object, or a SELECT statement. If a SELECT statement is specified which references this UPDATE statement's table, the statement will be correlated against the UPDATE statement.
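A sketch of both styles against the tutorial's users table:

# SET conditions given up front via 'values'
update(users, users.c.user_id == 7, values={'user_name': 'ed'}).execute()

# SET conditions determined from the execution-time bind parameters,
# which override 'values' on a per-key basis
users.update(users.c.user_id == 7).execute(user_name='harry')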
represents the behavior of a particular database. Used by Compiled objects.
base class for elements of a programmatically constructed SQL expression.
compare this ClauseElement to the given ClauseElement.
Subclasses should override the default behavior, which is a straight identity comparison.
compile this SQL expression.
Uses the given Compiler, or the given AbstractDialect or Engine to create a Compiler. If no compiler arguments are given, tries to use the underlying Engine this ClauseElement is bound to to create a Compiler, if any. Finally, if there is no bound Engine, uses an ANSIDialect to create a default Compiler.
bindparams is a dictionary representing the default bind parameters to be used with the statement. If bindparams is a list, it is assumed to be a list of dictionaries and the first dictionary in the list is used to compile against. The bind parameters can in some cases determine the output of the compilation, such as for UPDATE and INSERT statements, where the bind parameters that are present determine the SET and VALUES clauses of those statements.
return a copy of this ClauseElement, iff this ClauseElement contains other ClauseElements.
If this ClauseElement is not a container, it should return self. This is used to create copies of expression trees that still reference the same "leaf nodes". The new structure can then be restructured without affecting the original.
represents a dictionary/iterator of bind parameter key names/values.
Tracks the original BindParam objects as well as the keys/position of each parameter, and can return parameters as a dictionary or a list. Will process parameter values according to the TypeEngine objects present in the BindParams.
an ordered dictionary that stores a list of ColumnElement instances.
overrides the __eq__() method to produce SQL clauses between sets of correlated columns.
represents a column element within the list of a Selectable's columns. A ColumnElement can either be directly associated with a TableClause, or a free-standing textual column with no table, or is a "proxy" column, indicating it is placed on a Selectable such as an Alias or Select statement and ultimately corresponds to a TableClause-attached column (or in the case of a CompositeSelect, a proxy ColumnElement may correspond to several TableClause-attached columns).
Columns accessor which just returns self, to provide compatibility with Selectable objects.
foreign key accessor. points to a ForeignKey object which represents a Foreign Key placed on this column's ultimate ancestor.
a Set containing TableClause-bound, non-proxied ColumnElements for which this ColumnElement is a proxy. In all cases except for a column proxied from a Union (i.e. CompoundSelect), this set will be just one element.
represents a compiled SQL expression. the __str__ method of the Compiled object should produce the actual text of the statement. Compiled objects are specific to the database library that created them, and also may or may not be specific to the columns referenced within a particular set of bind parameters. In no case should the Compiled object be dependent on the actual values of those bind parameters, even though it may reference those values as defaults.
construct a new Compiled object.
statement - ClauseElement to be compiled
parameters - optional dictionary indicating a set of bind parameters specified with this Compiled object. These parameters are the "default" values corresponding to the ClauseElement's _BindParamClauses when the Compiled is executed. In the case of an INSERT or UPDATE statement, these parameters will also result in the creation of new _BindParamClause objects for each key and will also affect the generated column list in an INSERT statement and the SET clauses of an UPDATE statement. The keys of the parameter dictionary can either be the string names of columns or _ColumnClause objects.
engine - optional Engine to compile this statement against
returns the bind params for this compiled object.
Will start with the default parameters specified when this Compiled object was first constructed, and will override those values with those sent via **params, which are key/value pairs. Each key should match one of the _BindParamClause objects compiled into this object; either the "key" or "shortname" property of the _BindParamClause.
represents a 'thing that can produce Compiled objects and execute them'.
represents an element that can be used within the FROM clause of a SELECT statement.
given a ColumnElement, return the ColumnElement object from this Selectable which corresponds to that original Column via a proxy relationship.
True if the name of this FromClause may be prepended to a column in a generated SQL statement
represents a SELECT statement, with appendable clauses, as well as the ability to execute itself and return a result set.
the schema module provides the building blocks for database metadata. This means all the entities within a SQL database that we might want to look at, modify, or create and delete are described by these objects, in a database-agnostic way.
A structure of SchemaItems also provides a "visitor" interface which is the primary method by which other methods operate upon the schema. The SQL package extends this structure with its own clause-specific objects as well as the visitor interface, so that the schema package "plugs in" to the SQL package.
builds upon MetaData to provide the capability to bind to an Engine implementation.
represents a column in a database table. this is a subclass of sql.ColumnClause and represents an actual existing table in the database, in a similar fashion as TableClause/Table.
constructs a new Column object. Arguments are:
name : the name of this column. this should be the identical name as it appears, or will appear, in the database.
type: the TypeEngine for this column. This can be any subclass of types.AbstractType, including the database-agnostic types defined in the types module, database-specific types defined within specific database modules, or user-defined types. If the column contains a ForeignKey, the type can also be None, in which case the type assigned will be that of the referenced column.
*args: Constraint, ForeignKey, ColumnDefault and Sequence objects should be added as list values.
**kwargs : keyword arguments include:
key=None : an optional "alias name" for this column. The column will then be identified everywhere in an application, including the column list on its Table, by this key, and not the given name. Generated SQL, however, will still reference the column by its actual name.
primary_key=False : True if this column is a primary key column. Multiple columns can have this flag set to specify composite primary keys. As an alternative, the primary key of a Table can be specified via an explicit PrimaryKeyConstraint instance appended to the Table's list of objects.
nullable=True : True if this column should allow nulls. Defaults to True unless this column is a primary key column.
default=None : a scalar, python callable, or ClauseElement representing the "default value" for this column, which will be invoked upon insert if this column is not present in the insert list or is given a value of None. The default expression will be converted into a ColumnDefault object upon initialization.
_is_oid=False : used internally to indicate that this column is used as the quasi-hidden "oid" column
index=False : Indicates that this column is indexed. The name of the index is autogenerated. To specify indexes with explicit names or indexes that contain multiple columns, use the Index construct instead.
unique=False : Indicates that this column contains a unique constraint, or if index=True as well, indicates that the Index should be created with the unique flag. To specify multiple columns in the constraint/index or to specify an explicit name, use the UniqueConstraint or Index constructs instead.
autoincrement=True : Indicates that integer-based primary key columns should have autoincrementing behavior, if supported by the underlying database. This will affect CREATE TABLE statements such that they will use the database's "auto-incrementing" keyword (such as SERIAL for Postgres, AUTO_INCREMENT for MySQL) and will also affect the behavior of some dialects during INSERT statement execution such that they will assume primary key values are created in this manner. If a Column has an explicit ColumnDefault object (such as via the "default" keyword, or a Sequence or PassiveDefault), then the value of autoincrement is ignored and is assumed to be False. The autoincrement value is only significant for a column with a type or subtype of Integer.
quote=False : indicates that the Column identifier must be properly escaped and quoted before being sent to the database. This flag should normally not be required as dialects can auto-detect conditions where quoting is required.
case_sensitive=True : indicates that the identifier should be interpreted by the database in the natural case for identifiers. Mixed case is not sufficient to cause this identifier to be quoted; it must contain an illegal character.
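A brief sketch of several of these arguments in combination, assuming a metadata object as in the earlier examples:

from datetime import datetime

users = Table('users', metadata,
    Column('user_id', Integer, primary_key=True, autoincrement=True),
    # 'key' gives the column an application-side alias name
    Column('user_name', String(40), key='name', nullable=False, index=True),
    # a python callable used as an insert-time default
    Column('created', DateTime, default=datetime.now))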
A plain default value on a column. this could correspond to a constant, a callable function, or a SQL clause.
represents a table-level Constraint such as a composite primary key, foreign key, or unique constraint.
Implements a hybrid of dict/set-like behavior with regard to the list of underlying columns.
builds upon MetaData to provide the capability to bind to multiple Engine implementations on a dynamically alterable, thread-local basis.
defines a column-level ForeignKey constraint between two columns.
ForeignKey is specified as an argument to a Column object.
One or more ForeignKey objects are used within a ForeignKeyConstraint object which represents the table-level constraint definition.
Construct a new ForeignKey object.
"column" can be a schema.Column object representing the relationship, or just its string name given as "tablename.columnname". schema can be specified as "schema.tablename.columnname"
"constraint" is the owning ForeignKeyConstraint object, if any. if not given, then a ForeignKeyConstraint will be automatically created and added to the parent table.
table-level foreign key constraint; represents a collection of ForeignKey objects.
Represents an index of columns from a database table
Constructs an index object. Arguments are:
name : the name of the index
*columns : columns to include in the index. All columns must belong to the same table, and no column may appear more than once.
**kw : keyword arguments include:
unique=True : create a unique index
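A sketch of a composite unique index; the two name columns are assumptions for illustration:

i = Index('idx_full_name', users.c.first_name, users.c.last_name, unique=True)
i.create()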
represents a collection of Tables and their associated schema constructs.
create all tables stored in this metadata.
This will conditionally create tables depending on if they do not yet exist in the database.
connectable - a Connectable used to access the database; or use the engine bound to this MetaData.
tables - optional list of tables, which is a subset of the total tables in the MetaData (others are ignored)
drop all tables stored in this metadata.
This will conditionally drop tables depending on if they currently exist in the database.
connectable - a Connectable used to access the database; or use the engine bound to this MetaData.
tables - optional list of tables, which is a subset of the total tables in the MetaData (others are ignored)
a default that takes effect on the database side
represents a sequence, which applies to Oracle and Postgres databases.
represents a relational database table. This subclasses sql.TableClause to provide a table that is "wired" to an engine. Whereas TableClause represents a table as it is used in a SQL expression, Table represents a table as it is created in the database.
Be sure to look at sqlalchemy.sql.TableImpl for additional methods defined on a Table.
Construct a Table.
Table objects can be constructed directly. The init method is actually called via the TableSingleton metaclass. Arguments are:
name : the name of this table, exactly as it appears, or will appear, in the database. This property, along with the "schema", indicates the "singleton identity" of this table. Further tables constructed with the same name/schema combination will return the same Table instance.
*args : should contain a listing of the Column objects for this table.
**kwargs : options include:
schema=None : the "schema name" for this table, which is required if the table resides in a schema other than the default selected schema for the engine's database connection.
autoload=False : the Columns for this table should be reflected from the database. Usually there will be no Column objects in the constructor if this property is set.
mustexist=False : indicates that this Table must already have been defined elsewhere in the application, else an exception is raised.
useexisting=False : indicates that if this Table was already defined elsewhere in the application, disregard the rest of the constructor arguments.
owner=None : optional owning user of this table. useful for databases such as Oracle to aid in table reflection.
quote=False : indicates that the Table identifier must be properly escaped and quoted before being sent to the database. This flag overrides all other quoting behavior.
quote_schema=False : indicates that the Namespace identifier must be properly escaped and quoted before being sent to the database. This flag overrides all other quoting behavior.
case_sensitive=True : indicates that the identifier should be interpreted by the database in the natural case for identifiers. Mixed case is not sufficient to cause this identifier to be quoted; it must contain an illegal character.
case_sensitive_schema=True : indicates that the identifier should be interpreted by the database in the natural case for identifiers. Mixed case is not sufficient to cause this identifier to be quoted; it must contain an illegal character.
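A sketch of reflection plus the singleton behavior, assuming a BoundMetaData as in the threadlocal examples above:

# reflect the columns from the database rather than listing them
users = Table('users', metadata, autoload=True)

# a later construction with the same name/schema returns the same instance
same_table = Table('users', metadata, useexisting=True)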
issue a CREATE statement for this table.
see also metadata.create_all().
creates a new Engine instance. Using the given strategy name, locates that strategy and invokes its create() method to produce the Engine. The strategies themselves are instances of EngineStrategy, and the built in ones are present in the sqlalchemy.engine.strategies module. Current implementations include "plain" and "threadlocal". The default used by this function is "plain".
"plain" provides support for a Connection object which can be used to execute SQL queries with a specific underlying DBAPI connection.
"threadlocal" is similar to "plain" except that it adds support for a thread-local connection and transaction context, which allows a group of engine operations to participate using the same connection and transaction without the need for explicit passing of a Connection object.
The standard method of specifying the engine is via URL as the first positional argument, to indicate the appropriate database dialect and connection arguments, with additional keyword arguments sent as options to the dialect and resulting Engine.
The URL is an RFC-1738-style string of the form <dialect>://username:password@host:port/database, where <dialect> is a name such as "mysql", "oracle" or "postgres". Supported keynames for connection arguments include "username", "user", "password", "pw", "db", "database", "host" and "filename".
**kwargs represents options to be sent to the Engine itself as well as the components of the Engine, including the Dialect, the ConnectionProvider, and the Pool. A list of common options is as follows:
pool=None : an instance of sqlalchemy.pool.DBProxy or sqlalchemy.pool.Pool to be used as the underlying source for connections (DBProxy/Pool is described in the previous section). If None, a default DBProxy will be created using the engine's own database module with the given arguments.
echo=False : if True, the Engine will log all statements as well as a repr() of their parameter lists to the engine's logger, which defaults to sys.stdout. An Engine instance's "echo" data member can be modified at any time to turn logging on and off. If set to the string 'debug', result rows will be printed to the standard output as well.
logger=None : a file-like object where logging output can be sent, if echo is set to True. This defaults to sys.stdout.
encoding='utf-8' : the encoding to be used when encoding/decoding Unicode strings
convert_unicode=False : True if unicode conversion should be applied to all str types
module=None : used by Oracle and Postgres, this is a reference to a DBAPI2 module to be used instead of the engine's default module. For Postgres, the default is psycopg2, or psycopg1 if 2 cannot be found. For Oracle, it's cx_Oracle; for MySQL, MySQLdb.
use_ansi=True : used only by Oracle; when False, the Oracle driver attempts to support a particular "quirk" of some Oracle databases, that the LEFT OUTER JOIN SQL syntax is not supported, and the "Oracle join" syntax of using <column1>(+)=<column2> must be used in order to achieve a LEFT OUTER JOIN. It's advised that the Oracle database be configured to have full ANSI support instead of using this feature.
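A sketch of a URL plus keyword options; the credentials and database name are placeholders:

db = create_engine('postgres://scott:tiger@localhost/test',
                   strategy='threadlocal',   # engine strategy, per above
                   echo=True,
                   convert_unicode=True)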
provides a listing of all the database implementations supported. this data is provided as a list of dictionaries, where each dictionary contains the following key/value pairs:
name : the name of the engine, suitable for use in the create_engine function
description: a plain description of the engine.
arguments : a dictionary describing the name and description of each parameter used to connect to this engine's underlying DBAPI.
This function is meant for usage in automated configuration tools that wish to query the user for database and connection information.
interface for an object that can provide an Engine and a Connection object which corresponds to that Engine.
represents a single DBAPI connection returned from the underlying connection pool. Provides execution support for string-based SQL statements as well as ClauseElement, Compiled and DefaultGenerator objects. provides a begin method to return Transaction objects.
The Connection object is **not** threadsafe.
connect() is implemented to return self so that an incoming Engine or Connection object can be treated similarly.
contextual_connect() is implemented to return self so that an incoming Engine or Connection object can be treated similarly.
executes the given statement string and parameter object. the parameter object is expected to be the result of a call to compiled.get_params(). This callable is a generic version of a connection/cursor-specific callable that is produced within the execute_compiled method, and is used for objects that require this style of proxy when outside of an execute_compiled method, primarily the DefaultRunner.
defines an interface that returns raw Connection objects (or compatible).
releases all resources corresponding to this ConnectionProvider, such as any underlying connection pools.
this method should return a Connection or compatible object from a DBAPI which also contains a close() method. It is not defined what context this connection belongs to. It may be newly connected, returned from a pool, part of some other kind of context such as thread-local, or can be a fixed member of this object.
a visitor which accepts ColumnDefault objects, produces the dialect-specific SQL corresponding to their execution, and executes the SQL, returning the result value.
DefaultRunners are used internally by Engines and Dialects. Specific database modules should provide their own subclasses of DefaultRunner to allow database-specific behavior.
Defines the behavior of a specific database/DBAPI.
Any aspect of metadata definition, SQL query generation, execution, result-set handling, or anything else which varies between databases is defined under the general category of the Dialect. The Dialect acts as a factory for other database-specific object implementations including ExecutionContext, Compiled, DefaultGenerator, and TypeEngine.
All Dialects implement the following attributes:
positional - True if the paramstyle for this Dialect is positional
paramstyle - the paramstyle to be used (some DBAPIs support multiple paramstyles)
supports_autoclose_results - usually True; if False, indicates that rows returned by fetchone() might not be just plain tuples, and may be "live" proxy objects which still require the cursor to be open in order to be read (such as pyPgSQL which has active filehandles for BLOBs). in that case, an auto-closing ResultProxy cannot automatically close itself after results are consumed.
convert_unicode - True if unicode conversion should be applied to all str types
encoding - type of encoding to use for unicode, usually defaults to 'utf-8'
compile the given ClauseElement using this Dialect.
a convenience method which simply flips around the compile() call on ClauseElement.
returns a sql.ClauseVisitor which will produce a string representation of the given ClauseElement and parameter dictionary. This object is usually a subclass of ansisql.ANSICompiler.
compiler is called within the context of the compile() method.
given a sql.ClauseParameters object, returns an array or dictionary suitable to pass directly to this Dialect's DBAPI's execute method.
given a dictionary of key-valued connect parameters, returns a tuple consisting of a *args/**kwargs suitable to send directly to the dbapi's connect function. The connect args will have any number of the following keynames: host, hostname, database, dbname, user, username, password, pw, passwd, filename.
subclasses override this method to provide the DBAPI module used to establish connections.
returns a schema.SchemaVisitor instance that can execute defaults.
returns the currently selected schema given a connection
returns the oid column name for this dialect, or None if the dialect can't/won't support OID/ROWID.
given a Connection and a Table object, reflects its columns and properties from the database.
returns a schema.SchemaVisitor instance that can drop schemas, when it is invoked to traverse a set of schema objects.
the schemadropper is called via the drop() method on Table, Index, and others.
returns a schema.SchemaVisitor instance that can generate schemas, when it is invoked to traverse a set of schema objects.
schemagenerator is called via the create() method on Table, Index, and others.
Connects a ConnectionProvider, a Dialect and a CompilerFactory together to provide a default implementation of SchemaEngine.
returns a Connection object which may be newly allocated, or may be part of some ongoing context. This Connection is meant to be used by the various "auto-connecting" operations.
creates a table or index within this engine's database connection given a schema.Table object.
drops a table or index within this engine's database connection given a schema.Table object.
given a Table object, reflects its columns and properties from the database.
executes the given function within a transaction boundary. this is a shortcut for explicitly calling begin() and commit() and optionally rollback() when exceptions are raised. The given *args and **kwargs will be passed to the function, as well as the Connection used in the transaction.
a messenger object for a Dialect that corresponds to a single execution. The Dialect should provide an ExecutionContext via the create_execution_context() method. The pre_exec and post_exec methods will be called for compiled statements, after which it is expected that the various methods last_inserted_ids, last_inserted_params, etc. will contain appropriate values, if applicable.
returns the count of rows updated/deleted for an UPDATE/DELETE statement
return the list of the primary key values for the last insert statement executed.
This does not apply to straight textual clauses; only to sql.Insert objects compiled against a schema.Table object, which are executed via statement.execute(). The order of items in the list is the same as that of the Table's 'primary_key' attribute.
In some cases, this method may invoke a query back to the database to retrieve the data, based on the "lastrowid" value in the cursor.
return a dictionary of the full parameter dictionary for the last compiled INSERT statement.
Includes any ColumnDefaults or Sequences that were pre-executed.
return a dictionary of the full parameter dictionary for the last compiled UPDATE statement.
Includes any ColumnDefaults that were pre-executed.
return True if the last row INSERTED via a compiled insert statement contained PassiveDefaults.
The presence of PassiveDefaults indicates that the database inserted data beyond that which we passed to the query programmatically.
called after the execution of a compiled statement. proxy is a callable that takes a string statement and a bind parameter list/dictionary.
wraps a DBAPI cursor object to provide access to row columns based on integer position, case-insensitive column name, or by schema.Column object. e.g.:
row = fetchone()
col1 = row[0] # access via integer position
col2 = row['col2'] # access via name
col3 = row[mytable.c.mycol] # access via Column object.
ResultProxy also contains a map of TypeEngine objects and will invoke the appropriate convert_result_value() method before returning columns, as well as the ExecutionContext corresponding to the statement execution. It provides several methods with which to obtain information from the underlying ExecutionContext.
ResultProxy objects are constructed via the execute() method on SQLEngine.
close this ResultProxy, and the underlying DBAPI cursor corresponding to the execution.
If this ResultProxy was generated from an implicit execution, the underlying Connection will also be closed (returns the underlying DBAPI connection to the connection pool.)
This method is also called automatically when all result rows are exhausted.
fetch many rows, just like DBAPI cursor.fetchmany(size=cursor.arraysize)
return last_inserted_ids() from the underlying ExecutionContext.
See ExecutionContext for details.
return last_inserted_params() from the underlying ExecutionContext.
See ExecutionContext for details.
return last_updated_params() from the underlying ExecutionContext.
See ExecutionContext for details.
proxies a single cursor row for a parent ResultProxy. Mostly follows "ordered dictionary" behavior, mapping result values to the string-based column name, the integer position of the result in the row, as well as Column instances which can be mapped to the original Columns that produced this result set (for results that correspond to constructed SQL expressions).
a visitor that can gather text into a buffer and execute the contents of the buffer.
construct a new SchemaIterator.
engine - the Engine used by this SchemaIterator
proxy - a callable which takes a statement and bind parameters and executes it, returning the cursor (the actual DBAPI cursor). The callable should use the same cursor repeatedly.
represents the components of a URL used to connect to a database.
This object is suitable to be passed directly to a create_engine() call. The fields of the URL are parsed from a string by the module-level make_url() function. the string format of the URL is an RFC-1738-style string.
Attributes on URL include:
drivername
username
password
host
port
database
query - a dictionary containing key/value pairs representing the URL's query string.
translate this URL's attributes into a dictionary of connection arguments.
given a list of argument names corresponding to the URL attributes ('host', 'database', 'username', 'password', 'port'), will assemble the attribute values of this URL into the dictionary using the given names.
the mapper package provides object-relational functionality, building upon the schema and sql packages and tying operations to class properties and constructors.
create a BackRef object with explicit arguments, which are the same arguments one can send to relation().
used with the "backref" keyword argument to relation() in place of a string argument.
attempt to create a series of relations() between mappers automatically, via introspecting the foreign key relationships of the underlying tables.
given a list of classes and/or mappers, identifies the foreign key relationships between the given mappers or corresponding class mappers, and creates relation() objects representing those relationships, including a backreference. Attempts to find the "secondary" table in a many-to-many relationship as well. The names of the relations will be a lowercase version of the related class. In the case of one-to-many or many-to-many, the name will be "pluralized", which currently is based on the English language (i.e. an 's' or 'es' added to it).
NOTE: this method usually works poorly, and its usage is generally not advised.
given a ClassKey, returns the primary Mapper associated with the key.
remove the given mapper from the storage of mappers.
when a new mapper is created for the previous mapper's class, it will be used as that class's new primary mapper.
remove all mappers that have been created thus far.
when new mappers are created, they will be assigned to their classes as their primary mapper.
compile all mappers that have been defined.
this is equivalent to calling compile() on any individual mapper.
return a MapperOption that will indicate to the query that the given attribute will be eagerly loaded without any row decoration, or using a custom row decorator.
used when feeding SQL result sets directly into query.instances().
return a MapperOption that will convert the column property of the given name into a deferred load.
used with query.options()
return a DeferredColumnProperty, which indicates this object attributes should only be loaded from its corresponding table column when first accessed.
used with the 'properties' dictionary sent to mapper().
return a MapperOption that will convert the property of the given name into an eager load.
used with query.options().
return a MapperOption that will insert the given MapperExtension to the beginning of the list of extensions that will be called in the context of the Query.
used with query.options().
return a MapperOption that will convert the property of the given name into a lazy load.
used with query.options().
return a new Mapper object.
See the Mapper class for a description of arguments.
return a MapperOption that will convert the property of the given name into a non-load.
used with query.options().
given an object, returns the primary Mapper associated with the object instance
create a UNION statement used by a polymorphic mapper.
See the SQLAlchemy advanced mapping docs for an example of how this is used.
provide a relationship of a primary Mapper to a secondary Mapper.
This corresponds to a parent-child or associative table relationship.
base implementation for an object that provides overriding behavior to various Mapper functions. For each method in MapperExtension, a result of EXT_PASS indicates the functionality is not overridden.
receive an object instance after that instance is DELETEed.
receive an object instance after that instance is INSERTed.
receive an object instance after that instance is UPDATEed.
receive an object instance before that instance is appended to a result list.
If this method returns EXT_PASS, result appending will proceed normally. if this method returns any other value or None, result appending will not proceed for this instance, giving this extension an opportunity to do the appending itself, if desired.
mapper - the mapper doing the operation
selectcontext - SelectionContext corresponding to the instances() call
row - the result row from the database
instance - the object instance to be appended to the result
identitykey - the identity key of the instance
result - list to which results are being appended
isnew - indicates if this is the first time we have seen this object instance in the current result set. if you are selecting from a join, such as an eager load, you might see the same object instance many times in the same result set.
receive an object instance before that instance is DELETEed.
receive an object instance before that instance is INSERTed into its table.
this is a good place to set up primary key values and other things that aren't handled otherwise.
receive an object instance before that instance is UPDATEed.
receive a row when a new object instance is about to be created from that row. the method can choose to create the instance itself, or it can return None to indicate normal object creation should take place.
mapper - the mapper doing the operation
selectcontext - SelectionContext corresponding to the instances() call
row - the result row from the database
class_ - the class we are mapping.
override the get method of the Query object.
the return value of this method is used as the result of query.get() if the value is anything other than EXT_PASS.
override the get_by method of the Query object.
the return value of this method is used as the result of query.get_by() if the value is anything other than EXT_PASS.
retrieve a contextual Session instance with which to register a new object.
Note: this is not called if a session is provided with the __init__ params (i.e. _sa_session)
override the load method of the Query object.
the return value of this method is used as the result of query.load() if the value is anything other than EXT_PASS.
receive a newly-created instance before that instance has its attributes populated.
The normal population of attributes is according to each attribute's corresponding MapperProperty (which includes column-based attributes as well as relationships to other classes). If this method returns EXT_PASS, instance population will proceed normally. If any other value or None is returned, instance population will not proceed, giving this extension an opportunity to populate the instance itself, if desired.
Defines the correlation of class attributes to database table columns.
Instances of this class should be constructed via the sqlalchemy.orm.mapper() function.
construct a new mapper.
All arguments may be sent to the sqlalchemy.orm.mapper() function where they are passed through to here.
class_ - the class to be mapped.
local_table - the table to which the class is mapped, or None if this mapper inherits from another mapper using concrete table inheritance.
properties - a dictionary mapping the string names of object attributes to MapperProperty instances, which define the persistence behavior of that attribute. Note that the columns in the mapped table are automatically converted into ColumnProperty instances based on the "key" property of each Column (although they can be overridden using this dictionary).
primary_key - a list of Column objects which define the "primary key" to be used against this mapper's selectable unit. This is normally simply the primary key of the "local_table", but can be overridden here.
non_primary - construct a Mapper that will define only the selection of instances, not their persistence.
inherits - another Mapper for which this Mapper will have an inheritance relationship with.
inherit_condition - for joined table inheritance, a SQL expression (constructed ClauseElement) which will define how the two tables are joined; defaults to a natural join between the two tables.
extension - a MapperExtension instance or list of MapperExtension instances which will be applied to all operations by this Mapper.
order_by - a single Column or list of Columns for which selection operations should use as the default ordering for entities. Defaults to the OID/ROWID of the table if any, or the first primary key column of the table.
allow_column_override - if True, allows the usage of a `relation()` which has the same name as a column in the mapped table. The table column will no longer be mapped.
entity_name - a name to be associated with the class, to allow alternate mappings for a single class.
always_refresh - if True, all query operations for this mapped class will overwrite all data within object instances that already exist within the session, erasing any in-memory changes with whatever information was loaded from the database.
version_id_col - a Column which must have an integer type that will be used to keep a running "version id" of mapped entities in the database. this is used during save operations to ensure that no other thread or process has updated the instance during the lifetime of the entity, else a ConcurrentModificationError exception is thrown.
polymorphic_on - used with mappers in an inheritance relationship, a Column which will identify the class/mapper combination to be used with a particular row. requires the polymorphic_identity value to be set for all mappers in the inheritance hierarchy.
_polymorphic_map - used internally to propagate the full map of polymorphic identifiers to surrogate mappers.
polymorphic_identity - a value which will be stored in the Column denoted by polymorphic_on, corresponding to the "class identity" of this mapper.
concrete - if True, indicates this mapper should use concrete table inheritance with its parent mapper.
select_table - a Table or (more commonly) Selectable which will be used to select instances of this mapper's class. usually used to provide polymorphic loading among several classes in an inheritance hierarchy.
allow_null_pks - indicates that composite primary keys where one or more (but not all) columns contain NULL is a valid primary key. Primary keys which contain NULL values usually indicate that a result row does not contain an entity and should be skipped.
batch - indicates that save operations of multiple entities can be batched together for efficiency. setting to False indicates that an instance will be fully saved before saving the next instance, which includes inserting/updating all table rows corresponding to the entity as well as calling all MapperExtension methods corresponding to the save operation.
column_prefix - a string which will be prepended to the "key" name of all Columns when creating column-based properties from the given Table. does not affect explicitly specified column-based properties
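A sketch combining a few of these arguments; the User and Address classes and their tables are assumptions carried over from earlier examples:

mapper(User, users_table,
    properties={
        'addresses': relation(Address, cascade="all, delete-orphan")
    },
    order_by=users_table.c.user_name,
    always_refresh=False)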
adds the given dictionary of properties to this mapper, using add_property.
add an individual MapperProperty to this mapper.
If the mapper has not been compiled yet, just adds the property to the initial properties dictionary sent to the constructor. if this Mapper has already been compiled, then the given MapperProperty is compiled immediately.
execute a callable for each element in an object graph, for all relations that meet the given cascade rule.
type - the name of the cascade rule (i.e. save-update, delete, etc.)
object - the lead object instance. child items will be processed per the relations defined for this object's mapper.
callable_ - the callable function.
recursive - used by the function for internal context during recursive calls, leave as None.
iterate each element in an object graph, for all relations that meet the given cascade rule.
type - the name of the cascade rule (i.e. save-update, delete, etc.)
object - the lead object instance. child items will be processed per the relations defined for this object's mapper.
recursive - used by the function for internal context during recursive calls, leave as None.
return true if the given mapper shares a common inherited parent as this mapper
compile this mapper into its final internal format.
this is the 'external' version of the method which is not reentrant.
issue DELETE statements for a list of objects.
this is called within the context of a UOWTransaction during a flush operation.
return an instance attribute using a Column as the key.
return the mapper used for issuing selects.
this mapper is the same mapper as 'self' unless the select_table argument was specified for this mapper.
return the contextual session provided by the mapper extension chain, if any.
raises InvalidRequestError if a session cannot be retrieved from the extension chain
return the identity key for the given instance, based on its primary key attributes.
this value is typically also found on the instance itself under the attribute name '_instance_key'.
return an identity-map key for use in storing/retrieving an item from an identity map.
primary_key - a list of values indicating the identifier.
return an identity-map key for use in storing/retrieving an item from the identity map.
row - a sqlalchemy.dbengine.RowProxy instance, or another mapping of result-set column names to their values within a row.
return a list of mapped instances corresponding to the rows in a given ResultProxy.
return True if this mapper handles the given instance.
this is dependent not only on class assignment but the optional "entity_name" parameter as well.
iterates through the collection including this mapper and all descendant mappers.
this includes not just the immediately inheriting mappers but all their inheriting mappers as well.
To iterate through an entire hierarchy, use mapper.base_mapper().polymorphic_iterator().
populate an instance from a result row.
This method iterates through the list of MapperProperty objects attached to this Mapper and calls each property's execute() method.
return the list of primary key values for the given instance.
returns the primary mapper corresponding to this mapper's class key (class + entity_name)
compiles this mapper if needed, and returns the dictionary of MapperProperty objects associated with this mapper.
register DependencyProcessor instances with a unitofwork.UOWTransaction.
this calls register_dependencies on all attached MapperProperty instances.
issue INSERT and/or UPDATE statements for a list of objects.
this is called within the context of a UOWTransaction during a flush operation.
save_obj issues SQL statements not just for instances mapped directly by this mapper, but for instances mapped by all inheriting mappers as well. This is to maintain proper insert ordering among a polymorphic chain of instances. Therefore save_obj is typically called only on a "base mapper", or a mapper which does not inherit from any other mapper.
encapsulates the object-fetching operations provided by Mappers.
given a WHERE criterion, produce a ClauseElement-based statement suitable for usage in the execute() method.
given a WHERE criterion, create a SELECT COUNT statement, execute and return the resulting count value.
returns the count of instances based on the given clauses and key/value criterion. The criterion is constructed in the same way as the select_by() method.
execute the given ClauseElement-based statement against this Query's session/mapper, return the resulting list of instances.
After execution, closes the ResultProxy and its underlying resources. This method is one step above the instances() method, which takes the executed statement's ResultProxy directly.
return an instance of the object based on the given identifier, or None if not found.
The ident argument is a scalar or tuple of primary key column values in the order of the table def's primary key columns.
return a single object instance based on the given key/value criterion.
this is either the first value in the result list, or None if the list is empty.
the keys are mapped to property or column names mapped by this mapper's Table, and the values are coerced into a WHERE clause separated by AND operators. If the local property/column names don't contain the key, a search will be performed against this mapper's immediate list of relations as well, forming the appropriate join conditions if a matching property is located.
e.g. u = usermapper.get_by(user_name = 'fred')
return a list of mapped instances corresponding to the rows in a given "cursor" (i.e. ResultProxy).
return a ClauseElement representing the WHERE clause that would normally be sent to select_whereclause() by select_by().
given the key name of a property, will recursively descend through all child properties from this Query's mapper to locate the property, and will return a ClauseElement representing a join from this Query's mapper to the endmost mapper.
given a list of keys that represents a path from this Query's mapper to a related mapper based on names of relations from one mapper to the next, returns a ClauseElement representing a join from this Query's mapper to the endmost mapper.
return an instance of the object based on the given identifier.
If not found, raises an exception. The method will *remove all pending changes* to the object already existing in the Session. The ident argument is a scalar or tuple of primary key column values in the order of the table def's primary key columns.
return a new Query object, applying the given list of MapperOptions.
selects instances of the object from the database.
arg can be any ClauseElement, which will form the criterion with which to load the objects.
For more advanced usage, arg can also be a Select statement object, which will be executed and its resulting rowset used to build new object instances. in this case, the developer must ensure that an adequate set of columns exists in the rowset with which to build new object instances.
return an array of object instances based on the given clauses and key/value criterion.
*args is a list of zero or more ClauseElements which will be connected by AND operators.
**params is a set of zero or more key/value parameters which are converted into ClauseElements. the keys are mapped to property or column names mapped by this mapper's Table, and the values are coerced into a WHERE clause separated by AND operators. If the local property/column names don't contain the key, a search will be performed against this mapper's immediate list of relations as well, forming the appropriate join conditions if a matching property is located.
e.g. result = usermapper.select_by(user_name = 'fred')
given a ClauseElement-based statement, execute and return the resulting instances.
given a literal string-based statement, execute and return the resulting instances.
given a WHERE criterion, create a SELECT statement, execute and return the resulting instances.
works like select(), but returns only the first result, or None if no objects were returned.
works like select_by(), but returns only the first result, or None if no objects were returned. Synonymous with get_by().
works like selectfirst(), but raises an error unless exactly one result was returned.
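A short sketch contrasting the single-row variants (users_table and User are assumed from the previous example):

    u = query.selectfirst(users_table.c.user_name == 'fred')   # first result or None
    u = query.selectfirst_by(user_name='fred')                 # key/value form
    u = query.selectone(users_table.c.user_id == 7)            # errors unless exactly one result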
created within the Query.compile() method to store and share state among all the Mappers and MapperProperty objects used in a query construction.
created within the query.instances() method to store and share state among all the Mappers and MapperProperty objects used in a load operation.
SelectionContext contains these attributes:
mapper - the Mapper which originated the instances() call.
session - the Session that is relevant to the instances call.
identity_map - a dictionary which stores newly created instances that have not yet been added as persistent to the Session.
attributes - a dictionary to store arbitrary data; eager loaders use it to store additional result lists
populate_existing - indicates whether it is OK to overwrite the attributes of instances that were already present in the Session
version_check - indicates whether mappers that have a version_id column should, for instances already present in the Session, compare that attribute against the freshly loaded value
given a ClassKey, returns the primary Mapper associated with the key.
encapsulates a set of objects being operated upon within an object-relational operation.
The Session object is **not** threadsafe. For thread-management of Sessions, see the sqlalchemy.ext.sessioncontext module.
bind the given Mapper to the given Engine or Connection.
All subsequent operations involving this Mapper will use the given bindto.
bind the given Table to the given Engine or Connection.
All subsequent operations involving this Table will use the given bindto.
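As a hedged sketch of multi-engine routing, assuming two pre-created engines along with a user_mapper and orders_table from an application's own setup:

    session = Session()

    # all operations involving this Mapper will use engine_one
    session.bind_mapper(user_mapper, engine_one)

    # all operations involving this Table will use engine_two
    session.bind_table(orders_table, engine_two)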
removes all object instances from this Session. this is equivalent to calling expunge() for all objects in this Session.
returns a unique connection corresponding to the given mapper. this connection will not be part of any pre-existing transactional context.
returns a Connection corresponding to the given mapper. used by the execute() method which performs select operations for Mapper and Query. if this Session is transactional, the connection will be in the context of this session's transaction. otherwise, the connection is returned by the contextual_connect method, which some Engines override to return a thread-local connection, and will have close_with_result set to True.
the given **kwargs will be sent to the engine's contextual_connect() method, if no transaction is in progress.
returns a new SessionTransaction corresponding to an existing or new transaction. if the transaction is new, the returned SessionTransaction will have commit control over the underlying transaction, else will have rollback control only.
mark the given instance as deleted.
the delete operation occurs upon flush().
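For example (myuser assumed to be a persistent instance):

    session.delete(myuser)   # marked as deleted; no SQL issued yet
    session.flush()          # the DELETE statement is emitted here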
using the given mapper to identify the appropriate Engine or Connection to be used for statement execution, executes the given ClauseElement using the provided parameter dictionary. Returns a ResultProxy corresponding to the execution's results. If this method allocates a new Connection for the operation, then the ResultProxy's close() method will release the resources of the underlying Connection; otherwise it's a no-op.
mark the given object as expired.
this will add an instrumentation to all mapped attributes on the instance such that when an attribute is next accessed, the session will reload all attributes on the instance from the database.
remove the given object from this Session.
this will free all internal references to the object. cascading will be applied according to the 'expunge' cascade rule.
flush all the object modifications present in this session to the database.
'objects' is a list or tuple of objects specifically to be flushed; if None, all new and modified objects are flushed.
return an instance of the object based on the given identifier, or None if not found.
The ident argument is a scalar or tuple of primary key column values in the order of the table def's primary key columns.
the entity_name keyword argument may also be specified which further qualifies the underlying Mapper used to perform the query.
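A small sketch of both identifier forms, assuming a User class with a single-column primary key and a hypothetical OrderItem class with a composite primary key:

    user = session.get(User, 15)            # scalar for a single-column key

    item = session.get(OrderItem, (15, 3))  # tuple, in primary key column order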
return the Engine or Connection which is used to execute statements on behalf of the given Mapper.
Calling connect() on the return value will always result in a Connection object. This method disregards any SessionTransaction that may be in progress.
The order of searching is as follows:
if an Engine or Connection was bound to this Mapper specifically within this Session, returns that Engine or Connection.
if an Engine or Connection was bound to this Mapper's underlying Table within this Session (i.e. to the Table, not to the Mapper directly), returns that Engine or Connection.
if an Engine or Connection was bound to this Session, returns that Engine or Connection.
finally, returns the Engine which was bound directly to the Table's MetaData object.
If no Engine is bound to the Table, an exception is raised.
a dictionary consisting of all objects within this Session keyed to their _instance_key value.
return True if the given object has been marked as expired.
return an instance of the object based on the given identifier.
If not found, raises an exception. The method will *remove all pending changes* to the object already existing in the Session. The ident argument is a scalar or tuple of primary key column values in the order of the table def's primary key columns.
the entity_name keyword argument may also be specified which further qualifies the underlying Mapper used to perform the query.
given a Class, return the primary Mapper responsible for persisting it.
copy the state of the given object onto the persistent object with the same identifier.
If there is no persistent instance currently associated with the session, it will be loaded. Return the persistent instance. If the given instance is unsaved, save a copy of it and return that copy as a newly persistent instance. The given instance does not become associated with the session. This operation cascades to associated instances if the association is mapped with cascade="merge".
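A hedged sketch of merge(), assuming detached_user was loaded in an earlier session and modified while detached:

    merged = session.merge(detached_user)   # state is copied onto the persistent instance
    session.flush()                         # the copied changes are persisted

    # detached_user itself remains unassociated with the session; work with 'merged'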
return a new Query object corresponding to this Session and the given mapper, or the class's primary mapper.
reload the attributes for the given object from the database, clearing any changes made.
Add a transient (unsaved) instance to this Session.
This operation cascades the "save_or_update" method to associated instances if the relation is mapped with cascade="save-update".
The 'entity_name' keyword argument will further qualify the specific Mapper used to handle this instance.
save or update the given object into this Session.
The presence of an '_instance_key' attribute on the instance determines whether to save() or update() the instance.
works like execute() but returns a scalar result.
Bring the given detached (saved) instance into this Session.
If there is a persistent instance with the same identifier already associated with this Session, an exception is thrown.
This operation cascades the "save_or_update" method to associated instances if the relation is mapped with cascade="save-update".
represents a Session-level Transaction. This corresponds to one or more sqlalchemy.engine.Transaction instances behind the scenes, with one Transaction per Engine in use.
the SessionTransaction object is **not** threadsafe.
provides a connection pool implementation, which optionally manages connections on a thread local basis. Also provides a DBAPI2 transparency layer so that pools can be managed automatically, based on module type and connect arguments, simply by calling regular DBAPI connect() methods.
given a DBAPI2 module and pool management parameters, returns a proxy for the module that will automatically pool connections, creating new connection pools for each distinct set of connection arguments sent to the decorated module's connect() function.
Arguments:
module : a DBAPI2 database module.
poolclass=QueuePool : the class used by the pool module to provide pooling.
Options: See Pool for options.
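As a minimal sketch (using the sqlite3 module available in Python 2.5; the pool parameters shown are assumptions passed through to the underlying QueuePool):

    import sqlalchemy.pool as pool
    import sqlite3

    # proxy the DBAPI module; a pool is created per distinct set of connect() arguments
    sqlite = pool.manage(sqlite3, pool_size=5, max_overflow=10)

    conn = sqlite.connect('tutorial.db')   # checked out from the pool
    conn.close()                           # returned to the pool, not actually closed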
a Pool implementation which will raise an exception if more than one connection is checked out at a time. Useful for debugging code that is using more connections than desired.
TODO: modify this to handle an arbitrary connection count.
a Pool implementation which does not pool connections; instead it literally opens and closes the underlying DBAPI connection per each connection open/close.
Base Pool class. This is an abstract class, which is implemented by various subclasses including:
QueuePool - pools multiple connections using Queue.Queue
SingletonThreadPool - stores a single connection per execution thread
NullPool - doesn't do any pooling; opens and closes connections
AssertionPool - stores only one connection, and asserts that only one connection is checked out at a time.
the main argument, "creator", is a callable function that returns a newly connected DBAPI connection object.
Options that are understood by Pool are:
echo=False : if set to True, connections being pulled from and returned to the pool will be logged to the standard output, as well as pool sizing information. Echoing can also be achieved by enabling logging for the "sqlalchemy.pool" namespace.
use_threadlocal=True : if set to True, repeated calls to connect() within the same application thread will be guaranteed to return the same connection object, if one has already been retrieved from the pool and has not been returned yet. This allows code to retrieve a connection from the pool, and then while still holding on to that connection, to call other functions which also ask the pool for a connection of the same arguments; those functions will act upon the same connection that the calling method is using.
recycle=-1 : if set to a value other than -1, the number of seconds between connection recycling; upon checkout, if this timeout has been surpassed, the connection will be closed and replaced with a newly opened connection.
auto_close_cursors=True : cursors, returned by connection.cursor(), are tracked and are automatically closed when the connection is returned to the pool. some DBAPIs, such as MySQLdb, become unstable if cursors remain open.
disallow_open_cursors=False : if auto_close_cursors is False and disallow_open_cursors is True, an exception will be raised if an open cursor is detected upon connection checkin.
If auto_close_cursors and disallow_open_cursors are both False, then no cursor processing occurs upon checkin.
uses Queue.Queue to maintain a fixed-size list of connections.
Arguments include all those used by the base Pool class, as well as:
pool_size=5 : the size of the pool to be maintained. This is the largest number of connections that will be kept persistently in the pool. Note that the pool begins with no connections; once this number of connections is requested, that number of connections will remain.
max_overflow=10 : the maximum overflow size of the pool. When the number of checked-out connections reaches the size set in pool_size, additional connections will be returned up to this limit. When those additional connections are returned to the pool, they are disconnected and discarded. It follows then that the total number of simultaneous connections the pool will allow is pool_size + max_overflow, and the total number of "sleeping" connections the pool will allow is pool_size. max_overflow can be set to -1 to indicate no overflow limit; no limit will be placed on the total number of concurrent connections.
timeout=30 : the number of seconds to wait before giving up on returning a connection
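A hedged sketch of constructing a QueuePool directly (the creator function and connection arguments are illustrative):

    import sqlalchemy.pool as pool
    import sqlite3

    # 'creator' is any zero-argument callable returning a new DBAPI connection
    def getconn():
        return sqlite3.connect('tutorial.db')

    p = pool.QueuePool(getconn, pool_size=5, max_overflow=10, timeout=30)

    conn = p.connect()   # checkout from the pool
    conn.close()         # checkin back to the pool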
Maintains one connection per each thread, never moving a connection to a thread other than the one which it was created in.
this is used for SQLite, which does not handle multithreading by default, and which also requires a single connection if a :memory: database is being used.
options are the same as those of Pool, as well as:
pool_size=5 - the number of threads in which to maintain connections at once.
A simple wrapper for ScopedRegistry that provides a "current" property which can be used to get, set, or remove the session in the current scope.
By default this object provides thread-local scoping, which is the default scope provided by sqlalchemy.util.ScopedRegistry.
Usage:

    engine = create_engine(...)

    def session_factory():
        return Session(bind_to=engine)

    context = SessionContext(session_factory)

    s = context.current                              # get thread-local session
    context.current = Session(bind_to=other_engine)  # set current session
    del context.current                              # discard the thread-local session (a new one
                                                     # will be created on the next call to
                                                     # context.current)
this plugin installs thread-local behavior at the Engine and Session level.
The default Engine strategy will be "threadlocal", producing TLocalEngine instances for create_engine by default. With this engine, the connect() method will return the same connection on the same thread, if it is already checked out from the pool. this makes it easy for a function that issues multiple statements to share a single connection without explicit "close" statements on result handles.
on the Session side, module-level methods will be installed within the objectstore module, such as flush(), delete(), etc., each of which calls the corresponding method on the thread-local session.
Note: this mod creates a global, thread-local session context named sqlalchemy.objectstore. All mappers created while this mod is installed will reference this global context when creating new mapped object instances.
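Based on the note above, a minimal sketch of installing and using the mod (the flush() call simply illustrates one of the installed module-level methods):

    import sqlalchemy.mods.threadlocal   # install before creating engines/mappers
    import sqlalchemy

    # engines now default to the "threadlocal" strategy
    engine = sqlalchemy.create_engine('sqlite:///tutorial.db')

    # ... define tables and mappers, create instances ...

    sqlalchemy.objectstore.flush()   # flush the thread-local session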
Builds a query one component at a time via separate method calls, each call transforming the previous SelectResults instance into a new SelectResults instance with further limiting criterion added. When interpreted in an iterator context (such as via calling list(selectresults)), executes the query.
constructs a new SelectResults using the given Query object and optional WHERE clause. ops is an optional dictionary of bind parameter values.
join the table of this SelectResults to the table located against the given property name.
subsequent calls to join_to or outerjoin_to will join against the rightmost table located from the previous join_to or outerjoin_to call, searching for the property starting with the rightmost mapper last located.
return the results represented by this SelectResults as a list.
this results in an execution of the underlying query.
outer join the table of this SelectResults to the table located against the given property name.
subsequent calls to join_to or outerjoin_to will join against the rightmost table located from the previous join_to or outerjoin_to call, searching for the property starting with the rightmost mapper last located.
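Drawing only on the methods described above, a minimal sketch, assuming a User mapper with an 'addresses' relation:

    from sqlalchemy.ext.selectresults import SelectResults

    results = SelectResults(session.query(User))   # wrap a Query
    results = results.outerjoin_to('addresses')    # each call returns a new SelectResults

    # no SQL has been issued yet; list() executes the underlying query
    users = results.list()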
raised for all those conditions where invalid arguments are sent to constructed objects. This error generally corresponds to construction time state errors.
corresponds to internal state being detected as invalid
raised when a concurrent modification condition is detected
sqlalchemy was asked to do something it can't do, return nonexistent data, etc. This error generally corresponds to runtime state errors.
raised by RowProxy when a nonexistent column is requested from a row
sqlalchemy was asked to load a table's definition from the database, but the table doesn't exist.
raised when the execution of a SQL statement fails. includes accessors for the underlying exception, as well as the SQL and bind parameters
raised when a connection pool times out on getting a connection
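A hedged sketch of catching SQLError, using the engine created at the start of this tutorial (the accessor attribute names shown are assumptions consistent with the description above):

    from sqlalchemy import exceptions

    try:
        db.execute("SELECT * FROM no_such_table")
    except exceptions.SQLError, e:     # Python 2 syntax, matching this tutorial
        print e.orig                   # the underlying DBAPI exception
        print e.statement, e.params    # the SQL string and bind parameters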
An SQLEngine proxy that automatically connects when necessary.
Basis for all proxy engines.
this method is required to be present as it overrides the compiler method present in sql.Engine