
Datastore Python API Overview

The Google App Engine datastore provides robust, scalable data storage for your web application. The datastore is designed with web applications in mind, with an emphasis on read and query performance. It stores data entities with properties, organized by application-defined kinds. It can perform queries over entities of the same kind, with filters and sort orders on property values and keys. All queries are pre-indexed for fast results over very large data sets. The datastore supports transactional updates, using entity groupings defined by the application as the unit of transactionality in the distributed data network.

Introducing the Datastore

The App Engine datastore stores and performs queries over data objects, known as entities. An entity has one or more properties, named values of one of several supported data types. A property can be a reference to another entity.

The datastore can execute multiple operations in a single transaction, and roll back the entire transaction if any of the operations fail. This is especially useful for distributed web applications, where multiple users may be accessing or manipulating the same data object at the same time.

Unlike traditional databases, the datastore uses a distributed architecture to manage scaling to very large data sets. An App Engine application can optimize how data is distributed by describing relationships between data objects, and by defining indexes for queries.

The App Engine datastore is strongly consistent, but it's not a relational database. While the datastore interface has many of the same features as traditional databases, the datastore's unique characteristics call for a different way of designing and managing data to take advantage of its ability to scale automatically.

Data Modeling With Python

Datastore entities are schemaless: Two entities of the same kind are not obligated to have the same properties, or use the same value types for the same properties. The application is responsible for ensuring that entities conform to a schema when needed. For this purpose, the Python SDK includes a rich library of data modeling features that make enforcing a schema easy.

In the Python API, a model describes a kind of entity, including the types and configuration for its properties. An application defines a model using Python classes, with class attributes describing the properties. Entities of a kind are represented by instances of the corresponding model class, with instance attributes representing the property values. An entity can be created by calling the constructor of the class, then stored by calling the put() method.

import datetime
from google.appengine.ext import db
from google.appengine.api import users

class Employee(db.Model):
  name = db.StringProperty(required=True)
  role = db.StringProperty(required=True, choices=set(["executive", "manager", "producer"]))
  hire_date = db.DateProperty()
  new_hire_training_completed = db.BooleanProperty()
  account = db.UserProperty()

e = Employee(name="John Doe",  # placeholder value; a required property must not be empty
             role="manager",
             account=users.get_current_user())
e.hire_date = datetime.datetime.now().date()  # DateProperty takes a date, not a datetime
e.put()

The datastore API provides two interfaces for queries: a query object interface, and a SQL-like query language called GQL. A query returns entities in the form of instances of the model classes that can be modified and put back into the datastore.

training_registration_list = [users.User("Alfred.Smith@example.com"),
                              users.User("jharrison@example.com"),
                              users.User("budnelson@example.com")]
employees_trained = db.GqlQuery("SELECT * FROM Employee WHERE account IN :1",
                                training_registration_list)
for e in employees_trained:
    e.new_hire_training_completed = True
    db.put(e)

Entities and Properties

A data object in the App Engine datastore is known as an entity. An entity has one or more properties, named values of one of several data types, including integers, floating point values, strings, dates, binary data, and more.

Each entity also has a key that uniquely identifies the entity. The simplest key has a kind and a unique numeric ID provided by the datastore. The ID can also be a string provided by the application.

An application can fetch an entity from the datastore by using its key, or by performing a query that matches the entity's properties. A query can return zero or more entities, and can return the results sorted by property values. A query can also limit the number of results returned by the datastore to conserve memory and run time.

Unlike relational databases, the App Engine datastore does not require that all entities of a given kind have the same properties. The application can specify and enforce its data model using libraries included with the SDK, or its own code.

A property can have one or more values. A property with multiple values can have values of mixed types. A query on a property with multiple values tests whether any of the values meets the query criteria. This makes such properties useful for testing for membership.
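The matching rule for multi-valued properties can be pictured with a small pure-Python model. This is only a sketch of the semantics described above, not the datastore API: the plain dictionary standing in for an entity and the `matches_equal` helper are hypothetical names chosen for illustration.

```python
def matches_equal(entity, prop, wanted):
    """Return True if any value of entity[prop] equals `wanted`.

    `entity` is a plain dict standing in for a datastore entity. A
    property holds either a single value or a list of values.
    """
    if prop not in entity:
        return False
    values = entity[prop]
    if not isinstance(values, list):
        values = [values]  # treat a single value as a one-element list
    return any(v == wanted for v in values)


# A multi-valued property may mix value types.
article = {"title": "Scaling reads", "tags": ["datastore", "python", 2008]}

is_python = matches_equal(article, "tags", "python")  # membership test
is_java = matches_equal(article, "tags", "java")
```

Here `is_python` is true because one of the values of `tags` matches, which is what makes such properties convenient for membership tests.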

Queries and Indexes

An App Engine datastore query operates on every entity of a given kind (a data class). It specifies zero or more filters on entity property values and keys, and zero or more sort orders. An entity is returned as a result for a query if the entity has at least one value (possibly null) for every property mentioned in the query's filters and sort orders, and all of the filter criteria are met by the property values.

Every datastore query uses an index, a table that contains the results for the query in the desired order. An App Engine application defines its indexes in a configuration file. The development web server automatically adds suggestions to this file as it encounters queries that do not yet have indexes configured. You can tune indexes manually by editing the file before uploading the application. As the application makes changes to datastore entities, the datastore updates the indexes with the correct results. When the application executes a query, the datastore fetches the results directly from the corresponding index.
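The idea of an index as a pre-sorted table that is maintained on every write can be illustrated with a toy single-property index. Everything here is hypothetical (the `PropertyIndex` class, the `put` helper, the `salary` property); it is a sketch of the concept under the assumption of one ascending index, not how the datastore stores its indexes.

```python
import bisect


class PropertyIndex(object):
    """Toy ascending index on one property: a sorted list of (value, key) rows."""

    def __init__(self, prop):
        self.prop = prop
        self.rows = []

    def on_put(self, key, old_entity, new_entity):
        # Remove the stale row (if any), then insert the new row in order.
        if old_entity is not None and self.prop in old_entity:
            self.rows.remove((old_entity[self.prop], key))
        if self.prop in new_entity:
            bisect.insort(self.rows, (new_entity[self.prop], key))

    def scan(self, low, high):
        """Return keys whose value v satisfies low <= v < high, in index order."""
        start = bisect.bisect_left(self.rows, (low,))
        end = bisect.bisect_left(self.rows, (high,))
        return [key for _, key in self.rows[start:end]]


store = {}
index = PropertyIndex("salary")


def put(key, entity):
    # Every write updates the index before the entity is stored.
    index.on_put(key, store.get(key), entity)
    store[key] = entity


put("emp1", {"salary": 50})
put("emp2", {"salary": 30})
put("emp3", {"salary": 70})
put("emp2", {"salary": 60})  # the update moves emp2 within the index

mid_range = index.scan(40, 65)
```

The query in `scan` never inspects the entities themselves. It reads a contiguous run of index rows, which is why the cost of a query in this model depends on the size of the result rather than the size of the data set.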

This mechanism supports a wide range of queries and is suitable for most applications. However, it does not support some kinds of queries you may be used to from other database technologies.

Transactions and Entity Groups

With the App Engine datastore, every attempt to create, update or delete an entity happens in a transaction. A transaction ensures that every change made to the entity is saved to the datastore, or, in the case of failure, none of the changes are made. This ensures consistency of data within an entity.

You can perform multiple actions on an entity within a single transaction using the transaction API. For example, say you want to increment a counter field in an object. To do so, you need to read the value of the counter, calculate the new value, then store it. Without a transaction, it is possible for another process to increment the counter between the time you read the value and the time you update the value, causing your app to overwrite the updated value. Doing the read, calculation and write in a single transaction ensures that no other process interferes with the increment.
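The lost update described above can be reproduced deterministically with a dictionary standing in for the datastore. This is a plain-Python sketch of the interleaving only; both function names are hypothetical, and a real application would instead run the read, calculation and write through the transaction API (`db.run_in_transaction()` in the Python SDK).

```python
def unsafe_interleaving(start):
    """Two handlers increment a counter without a transaction."""
    store = {"counter": start}
    seen_by_a = store["counter"]      # handler A reads
    seen_by_b = store["counter"]      # handler B reads the same value
    store["counter"] = seen_by_a + 1  # A writes
    store["counter"] = seen_by_b + 1  # B overwrites A's increment
    return store["counter"]


def serialized(start):
    """Each read, calculation and write runs as one uninterrupted unit."""
    store = {"counter": start}
    for _ in range(2):
        store["counter"] = store["counter"] + 1
    return store["counter"]


lost = unsafe_interleaving(5)
correct = serialized(5)
```

Starting from 5, the unsafe interleaving ends at 6 even though two increments ran, while the serialized version ends at 7.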

You can make changes to multiple entities within a single transaction. To support this, App Engine needs to know in advance which entities will be updated together, so it knows to store them in a way that supports transactions. You must declare that an entity belongs to the same entity group as another entity when you create the entity. All entities fetched, created, updated or deleted in a transaction must be in the same entity group.

Entity groups are defined by a hierarchy of relationships between entities. To create an entity in a group, you declare that the entity is a child of another entity already in the group. The other entity is the parent. An entity created without a parent is a root entity. A root entity without any children exists in an entity group by itself. Each entity has a path of parent-child relationships from a root entity to itself (the shortest path being no parent). This path is an essential part of the entity's complete key. A complete key can be represented by the kind and ID or key name of each entity in the path.
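One way to picture a complete key is as a tuple of (kind, ID or key name) pairs, ordered from the root entity down to the entity itself. The helpers below are a hypothetical plain-Python model of that idea, not the SDK's `Key` class; the kinds `Album`, `Photo` and `Comment` are made up for the example.

```python
def make_key(kind, id_or_name, parent=None):
    """Return a key as a tuple of (kind, id_or_name) pairs, root first."""
    return (parent or ()) + ((kind, id_or_name),)


def root_of(key):
    """The first element of the path identifies the entity group."""
    return key[:1]


def same_entity_group(a, b):
    return root_of(a) == root_of(b)


album = make_key("Album", "vacation")        # root entity, no parent
photo = make_key("Photo", 17, parent=album)  # child of the album
comment = make_key("Comment", 3, parent=photo)
other_album = make_key("Album", "work")      # a different root

comment_path = comment
in_same_group = same_entity_group(photo, comment)
in_other_group = same_entity_group(photo, other_album)
```

In this model `photo` and `comment` share the root `("Album", "vacation")`, so they could be changed together in one transaction, whereas `other_album` is a root entity in an entity group of its own.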

The datastore uses optimistic concurrency to manage transactions. While one app instance is applying changes to entities in an entity group, all other attempts to update any entity in the group fail instantly. The app can try the transaction again to apply it to the updated data.
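Optimistic concurrency with retries can be sketched with a version number per entity group. This is a minimal in-memory model under stated assumptions, not the datastore's implementation: `EntityGroup`, `ConcurrentUpdate` and `run_with_retries` are hypothetical names, and the version check stands in for whatever mechanism the datastore actually uses to detect a competing update.

```python
class ConcurrentUpdate(Exception):
    """Raised when the group changed after the transaction read it."""


class EntityGroup(object):
    def __init__(self, data):
        self.data = dict(data)
        self.version = 0

    def begin(self):
        # Hand out a private snapshot and the version it was read at.
        return self.version, dict(self.data)

    def commit(self, read_version, new_data):
        # Fail immediately if someone else committed in the meantime.
        if read_version != self.version:
            raise ConcurrentUpdate()
        self.data = dict(new_data)
        self.version += 1


def run_with_retries(group, update, retries=3):
    """Apply `update` to a fresh snapshot, retrying on conflict."""
    for attempt in range(retries):
        version, snapshot = group.begin()
        update(snapshot)
        try:
            group.commit(version, snapshot)
            return attempt + 1  # number of attempts used
        except ConcurrentUpdate:
            continue
    raise ConcurrentUpdate("gave up after %d attempts" % retries)


group = EntityGroup({"count": 5})

# A rival writer commits between our read and our commit.
version, snapshot = group.begin()
rival_version, rival = group.begin()
rival["count"] += 1
group.commit(rival_version, rival)

snapshot["count"] += 1
try:
    group.commit(version, snapshot)
    first_attempt_failed = False
except ConcurrentUpdate:
    first_attempt_failed = True


def increment(data):
    data["count"] += 1


attempts = run_with_retries(group, increment)
final_count = group.data["count"]
```

The stale commit is rejected rather than overwriting the rival's change, and the retry re-reads the updated data before applying the increment, so neither increment is lost.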

Quotas and Limits

Each call to the datastore API counts toward the Datastore API Calls quota. Note that some library calls result in multiple calls to the underlying datastore API.

Data sent to the datastore by the app counts toward the Data Sent to (Datastore) API quota. Data received by the app from the datastore counts toward the Data Received from (Datastore) API quota.

The total amount of data currently stored in the datastore for the app cannot exceed the Stored Data (adjustable) quota. This includes entity properties and keys, but does not include indexes.

The amount of CPU time consumed by datastore operations applies to the following quotas:

  • CPU Time (adjustable)
  • Datastore CPU Time

For more information on quotas, see Quotas, and the "Quota Details" section of the Admin Console.

In addition to quotas, the following limits apply to the use of the datastore:

Limit                                                        Amount
maximum entity size                                          1 megabyte
maximum number of values in an index for an entity (1)       1,000 values
maximum number of entities in a batch put or batch delete    500 entities
maximum number of entities in a batch get                    1,000 entities
maximum results offset for a query                           1,000
  1. An entity uses one value in an index for every column × every row that refers to the entity, in all indexes. The number of index values for an entity can grow large if an indexed property has multiple values, requiring multiple rows with repeated values in the table.
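An application that saves many entities at once has to stay within the batch put limit listed above. A minimal sketch of one way to do that is to split the work into slices of at most 500 items; the `batches` helper and the `MAX_BATCH_PUT` constant are hypothetical names, and the integers stand in for model instances.

```python
MAX_BATCH_PUT = 500  # batch put/delete limit from the table above


def batches(items, size=MAX_BATCH_PUT):
    """Yield consecutive slices of `items`, each at most `size` long."""
    items = list(items)
    for start in range(0, len(items), size):
        yield items[start:start + size]


pending = list(range(1234))  # stand-ins for entities waiting to be saved
batch_sizes = [len(batch) for batch in batches(pending)]
```

In an App Engine application, each batch could then be passed to a single `db.put()` call, keeping in mind that every such call also counts toward the API call quota.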