With the Task Queue API, applications can perform work outside of a user request but initiated by a user request. If an app needs to execute some background work, it may use the Task Queue API to organize that work into small, discrete units, called Tasks. The app then inserts these Tasks into one or more Queues. App Engine automatically detects new Tasks and executes them when system resources permit.
The Task Queue service is currently released as an experimental feature. This means the API and behavior of the service may change in ways that are not compatible with earlier releases, even for minor releases of the runtime environments. Please try out this feature and let us know what you think!
A Python app sets up queues using a configuration file named queue.yaml. See Python Task Queue Configuration. If an app does not have a queue.yaml file, it has a queue named default with some default settings.
To enqueue a task, you call the taskqueue.add() function. (You can also create a Task object and call its add() method.) The task consists of data for a request, including a URL path, parameters, HTTP headers, and an HTTP payload. It can also include the earliest time to execute the task (the default is as soon as possible) and a name for the task. The task is added to a queue, then performed by the Task Queue service as the queue is processed.
Note: While this feature is experimental, the Python module for the Task Queue API is google.appengine.api.labs.taskqueue. When the feature is formally added to the main API, the package path will change.
The following example defines a task handler (CounterWorker) that increments a counter in the datastore, mapped to the URL /worker. It also defines a user-accessible request handler that displays the current value of the counter for a GET request, and for a POST request enqueues a task and returns. Please note that the task in this example should run at a rate no greater than once per second.
import wsgiref.handlers

from google.appengine.api.labs import taskqueue
from google.appengine.ext import db
from google.appengine.ext import webapp
from google.appengine.ext.webapp import template


class Counter(db.Model):
    count = db.IntegerProperty(indexed=False)


class CounterHandler(webapp.RequestHandler):
    def get(self):
        self.response.out.write(template.render('counters.html',
                                                {'counters': Counter.all()}))

    def post(self):
        key = self.request.get('key')
        # Add the task to the default queue.
        taskqueue.add(url='/worker', params={'key': key})
        self.redirect('/')


class CounterWorker(webapp.RequestHandler):
    def post(self):  # should run at most 1/s
        key = self.request.get('key')

        def txn():
            counter = Counter.get_by_key_name(key)
            if counter is None:
                counter = Counter(key_name=key, count=1)
            else:
                counter.count += 1
            counter.put()

        db.run_in_transaction(txn)


def main():
    wsgiref.handlers.CGIHandler().run(webapp.WSGIApplication([
        ('/', CounterHandler),
        ('/worker', CounterWorker),
    ]))


if __name__ == '__main__':
    main()
(In this example, 'counters.html' refers to a Django template that contains the HTML for a page that displays the counter value, and a button to trigger a POST request to the / URL.)
Note that this example is not idempotent. It is possible for the task queue to execute a task more than once. In this case, the counter is incremented each time the task is run, possibly skewing the results.
In App Engine background processing, a task is a complete description of a small unit of work. This description consists of two parts: a data payload that parameterizes the task, and a reference to the code that implements it.
As an example, consider a calendaring application which needs to notify an invitee, via email, that an event has been updated. The particular 'email notification task' for this might be defined as:
Perhaps there are multiple invitees who need to be updated for a given event. The developer may choose to create a new notification Task for each attendee individually:
As this example shows, multiple tasks may share the same code and differ only in their data payload. Similarly, multiple tasks may share the same data payload but reference different code, as in this example for an e-commerce site:
More examples of Tasks include:
For App Engine to support concrete Task instances, two mechanisms are needed:
Fortunately, the Internet provides such a solution already, in the form of an HTTP request and its response. The data payload is the content of the HTTP request, such as web form variables, XML, JSON, or encoded binary data. The code reference is the URL itself; the actual code is whatever logic the server executes in preparing the response.
Revisiting the calendar app example above, the tasks can be revised as:
Using this model, App Engine's Task Queue API allows you to specify tasks as HTTP Requests (both the contents of the request as its data, and the target URL of the request as its code reference). Programmatically referring to a bundled HTTP request in this fashion is sometimes called a "web hook."
Importantly, the offline nature of the Task Queue API allows you to specify web hooks ahead of time, without waiting for their actual execution. Thus, an application might create many web hooks at once and then hand them off to App Engine; the system will then process them asynchronously in the background (by 'invoking' the HTTP request). This web hook model enables efficient parallel processing - App Engine may invoke multiple tasks, or web hooks, simultaneously.
To summarize, the Task Queue API allows a developer to execute work in the background, asynchronously, by chunking that work into offline web hooks. The system will invoke those web hooks on the application's behalf, scheduling for optimal performance by possibly executing multiple webhooks in parallel. This model of granular units of work, based on the HTTP standard, allows App Engine to efficiently perform background processing in a way that works with any programming language or web application framework.
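As a rough illustration of the web hook model summarized above, the sketch below describes a task purely as an HTTP request: a worker URL (the code reference) plus form-encoded data (the payload). The `describe_task` helper and the `/app_worker/notify` URL are hypothetical names invented for this example; they are not part of the Task Queue API. The import is written for Python 3; on a Python 2 runtime the same function lives at `urllib.urlencode`.

```python
from urllib.parse import urlencode


def describe_task(url, params):
    """Bundle a worker URL (code reference) with form data (payload).

    Hypothetical helper: it only builds a description of the request,
    it does not enqueue or send anything.
    """
    return {
        'method': 'POST',
        'url': url,
        'headers': {'Content-Type': 'application/x-www-form-urlencoded'},
        # Sort the parameters so the encoded body is deterministic.
        'body': urlencode(sorted(params.items())),
    }


# One task per invitee: same code reference, different data payload.
invitees = ['alice@example.com', 'bob@example.com']
tasks = [describe_task('/app_worker/notify', {'event': '1234', 'to': addr})
         for addr in invitees]
```

Each entry in `tasks` is self-describing, which is what lets a system hold it for later and invoke it independently of the others.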
As mentioned above, a Task references its implementation via URL. For example, a task which fetches and parses an RSS feed might use a worker URL called /app_worker/fetch_feed. With the App Engine Task Queue API, you may use any URL as the worker for a task, so long as it is within your application; all Task worker URLs must be specified as relative URLs.
taskqueue.add(url='/path/to/my/worker')
taskqueue.add(url='/path?a=b&c=d', method='GET')
In addition to a Task's contents (its data payload and worker URL), it is also possible to specify a Task's name. This provides a lightweight mechanism for ensuring once-only semantics. Once a Task with name N is written, any subsequent attempts to insert a Task named N will fail. Eventually (at least seven days after the task successfully executes), the task will be deleted and the name N can be reused.
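The once-only behavior of task names can be modeled with a small in-memory sketch. This is not the real service: `NamedTaskLog`, `DuplicateTaskName`, and `enqueue_once` are hypothetical names, and the model ignores the eventual expiry of names described above.

```python
class DuplicateTaskName(Exception):
    """Hypothetical stand-in for the error raised when a name is reused."""


class NamedTaskLog(object):
    """In-memory model of a queue that remembers every task name it has seen."""

    def __init__(self):
        self._seen = set()

    def add(self, name):
        if name in self._seen:
            raise DuplicateTaskName(name)
        self._seen.add(name)


def enqueue_once(log, name):
    """Return True if the task was newly added, False if the name was taken."""
    try:
        log.add(name)
        return True
    except DuplicateTaskName:
        return False
```

A caller that derives the name from the work itself (for example, an event ID plus a revision number) can retry the insert safely, since a second insert with the same name is rejected rather than duplicated.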
Requests from the Task Queue service contain the following HTTP headers:
- X-AppEngine-QueueName, the name of the queue (possibly default)
- X-AppEngine-TaskName, the name of the task, or a system-generated unique ID if no name was specified
- X-AppEngine-TaskRetryCount, the number of times this task has been retried; for the first attempt, this value is 0

You can prevent users from accessing URLs of tasks by restricting access to administrator accounts. Task queues can access admin-only URLs. You can restrict a URL by adding login: admin to the handler configuration in app.yaml.
An example might look like this in app.yaml:
application: hello-tasks
version: 1
runtime: python
api_version: 1

handlers:
- url: /tasks/process
  script: process.py
  login: admin
For more information see Python Application Configuration: Requiring Login or Administrator Status.
To test a task web hook, sign in as an administrator and visit the URL of the handler in your browser.
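When you visit a worker URL in a browser this way, the Task Queue request headers listed earlier are absent. The following sketch shows one way a worker might read them defensively. The `task_info` helper is a hypothetical name, and it assumes the headers are available as a dict-like mapping (as with `self.request.headers` in a webapp handler).

```python
def task_info(headers):
    """Extract Task Queue metadata from a mapping of request headers.

    Missing headers (for example, when an administrator opens the URL
    in a browser) yield None for the names and 0 for the retry count.
    """
    return {
        'queue': headers.get('X-AppEngine-QueueName'),
        'task': headers.get('X-AppEngine-TaskName'),
        'retries': int(headers.get('X-AppEngine-TaskRetryCount', 0)),
    }
```

A worker could log the retry count on each attempt, which makes repeated failures easier to spot in the application logs.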
Once you have created a Task and inserted it into a queue for processing, App Engine will execute it as soon as possible (subject to the app's scheduling criteria, if specified). Since a task is a web hook, its lifecycle is the same as any other request in App Engine: it may use the same APIs and is subject to the same constraints as an 'online' request from a user's browser. Notably, this means that the execution of a single task is limited to 30 seconds. If your task's execution nears the 30-second limit, App Engine will raise an exception, which you may catch in order to quickly save your work or log progress.
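The pattern of catching the deadline exception and saving progress might look roughly like the sketch below. It runs outside App Engine, so `DeadlineExceeded` is a hypothetical local stand-in for the exception the runtime raises, and `process_batch`, `handle`, and `save_checkpoint` are illustrative names.

```python
class DeadlineExceeded(Exception):
    """Hypothetical stand-in for the exception raised near the time limit."""


def process_batch(items, handle, save_checkpoint):
    """Process items in order; on a deadline, record how many were finished.

    Returns True if every item was handled, False if the deadline hit.
    """
    done = 0
    try:
        for item in items:
            handle(item)
            done += 1
    except DeadlineExceeded:
        # Keep this path short: little time remains once the exception fires.
        save_checkpoint(done)
        return False
    return True
```

A worker that stops early like this could enqueue a follow-up task carrying the checkpoint, so the remaining items are picked up in a fresh request.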
When the developer inserts a new task into a queue, the order in which that task will execute (relative to other tasks) is governed by the contents and properties of that queue. However, it is possible to specify certain properties (such as an ETA) which request special scheduling on a per-task basis.
If the execution of a particular Task fails (by returning any HTTP status code outside of the range 200-299), App Engine will retry it until it succeeds. The system backs off gradually so as not to flood your application with too many requests, but it will retry failed tasks at least once a day.
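The success rule and the gradual backoff can be sketched as follows. The status check follows the rule stated above. The backoff schedule, however, is purely an assumption made for illustration (exponential doubling from one second, capped at one day); the text only says that the system backs off gradually and retries at least daily.

```python
def is_success(status_code):
    """A task succeeds only if the worker returns a status in 200-299."""
    return 200 <= status_code <= 299


def retry_delay(retry_count, base=1.0, cap=86400.0):
    """Assumed backoff: double the delay per retry, capped at one day.

    The real schedule is not specified here; this only illustrates the
    'back off gradually, but retry at least once a day' behavior.
    """
    return min(base * (2 ** retry_count), cap)
```

Note that a redirect (3xx) counts as a failure under this rule, so a worker should respond directly instead of redirecting.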
When implementing the code for Tasks (as worker URLs within your app), it is important that you consider whether the task is idempotent. App Engine's Task Queue API is designed to invoke a given task only once; however, it is possible in exceptional circumstances that a Task may execute multiple times (e.g. in the unlikely case of major system failure). Thus, your code must ensure that repeated execution has no harmful side effects.
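One possible way to make a counter-style worker tolerate repeated execution is to record which task names have already been applied, for instance using the value of the X-AppEngine-TaskName header. The sketch below models this in memory; `CounterStore` and `increment_once` are hypothetical names, and in a datastore-backed version the two writes would presumably need to happen in a single transaction.

```python
class CounterStore(object):
    """In-memory model of counters plus a record of applied task names."""

    def __init__(self):
        self.counts = {}
        self.applied = set()

    def increment_once(self, key, task_name):
        """Increment the counter unless this task was already applied.

        Returns True if the increment happened, False on a repeat.
        """
        if task_name in self.applied:
            return False
        self.counts[key] = self.counts.get(key, 0) + 1
        self.applied.add(task_name)
        return True
```

With this guard, a second delivery of the same task leaves the counter unchanged instead of skewing it.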
If a task performs sensitive operations (such as modifying important data), the developer may wish to protect the worker URL to prevent a malicious external user from calling it directly. This is possible by marking the worker URL as admin-only in the app configuration.
Thus far, this document has explained how Tasks can be used to encapsulate small chunks of work for asynchronous execution. When used in large numbers, Tasks provide an efficient and powerful tool for background processing. However, the developer may need to manage the execution of these tasks, in particular to control the rate at which tasks are invoked so as not to exhaust resources.
The Task Queue API provides Queues as a container for tasks. All new tasks must be inserted into a queue; a developer can influence task execution by modifying properties of the queue. As an example, a developer may wish to ensure that the system sends no more than 10 emails per second. This can be accomplished with a queue called email-throttle, configured with a rate of 10 invocations per second. The app's code then inserts every task that sends email into this email-throttle queue. App Engine respects the configuration of the email-throttle queue and invokes its tasks at a rate of at most 10 per second, even if thousands of tasks are inserted at a single instant.
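To get a feel for what a rate limit means for a large burst, the following sketch computes the earliest dispatch times for tasks that are all enqueued at the same instant. It assumes perfectly even spacing, which is a simplification for illustration and not a description of the actual scheduler; `dispatch_times` and `seconds_to_drain` are hypothetical helpers.

```python
def dispatch_times(num_tasks, rate_per_second):
    """Earliest dispatch offsets, in seconds, for tasks enqueued at t=0.

    Simplified model: tasks are spaced evenly at the configured rate.
    """
    return [i / float(rate_per_second) for i in range(num_tasks)]


def seconds_to_drain(num_tasks, rate_per_second):
    """Minimum time needed to invoke every task in the burst."""
    return num_tasks / float(rate_per_second)
```

For instance, under this model a burst of 5,000 email tasks in a queue limited to 10 invocations per second takes at least 500 seconds to drain, so a strict rate limit trades latency for protection of downstream resources.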
Beyond influencing the rate of task execution, Queues also determine the order in which tasks are consumed by the system (notwithstanding the special task scheduling parameters). Fundamentally, Queues deliver a best-effort FIFO (first in, first out) order: a developer creates new tasks and inserts them at the tail of the queue, and the system removes tasks from the head for execution. However, the system attempts to deliver the lowest latency possible for any given task via specially optimized notifications to the scheduler. Thus, when a queue has a large backlog of tasks, the system's scheduling may "jump" new tasks to the head of the queue.
Although a queue defines a general FIFO ordering, tasks are not executed entirely serially. Multiple tasks from a single queue may be executed simultaneously by the scheduler, so the usual locking and transaction semantics need to be observed for any work performed by a task.
It is important to understand that Queues are a mechanism for manipulating tasks in aggregate; Queues do not dictate the contents of any given Task. It is possible for a single Queue to contain many different types of tasks, which have varying data payloads and worker URLs.
For convenience, App Engine provides a default queue to all applications. A developer may use this queue immediately without any additional configuration. This queue automatically has a throughput rate of five task invocations per second; however, you may configure its properties in the same fashion as any user-defined queue by adding configuration for a queue named default. Code may always insert new Tasks into the default queue, but if you wish to disable execution of these tasks, you may do so by adding the queue to your configuration and lowering its rate to zero.
You can specify the worker URL for a Task by passing it to the Task object constructor. If you do not specify a worker URL, the Task uses a default worker URL named after the queue:
/_ah/queue/queue_name
As an example, for a queue named email-worker-queue, you may implement a default request handler at /_ah/queue/email-worker-queue (within your application). Any new Task inserted into the email-worker-queue which does not have a worker URL of its own (in other words, it's purely data with no code reference) will be invoked using the URL of /_ah/queue/email-worker-queue.
A queue's default URL is used if, and only if, a Task does not have a worker URL of its own. If a task does have its own worker URL, then it will only be invoked at that URL, never another. (Once inserted into a Queue, a Task is immutable).
Please note that if a Task does not have a worker URL, then the Task will be invoked against the queue's default URL even if there is currently no handler defined for the queue's default URL! In this case, the system would attempt to invoke the Task with a nonexistent URL which would fail with a 404 (this 404, along with the exact URL that was tried, will be available in your application's logs). Due to the failure state of this 404, the system will save the Task and retry it until it is eventually successful. You can clear (or 'flush') tasks that can't complete successfully using the Administration Console.
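The URL resolution rule described above is small enough to sketch directly. `worker_url` is a hypothetical helper written for this example; it only mirrors the rule and does not check whether a handler actually exists at the resulting path.

```python
def worker_url(queue_name, task_url=None):
    """Return the URL a task is invoked at.

    A task's own worker URL always wins; the queue's default URL is
    used if, and only if, the task has none.
    """
    if task_url:
        return task_url
    return '/_ah/queue/' + queue_name
```

Because the fallback is applied even when no handler is mapped to it, it may be worth defining a handler for the default URL of every queue that receives tasks without their own worker URL.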
All versions of an application share the same task queues. The version of the app used to perform a task depends on how the task was enqueued.
If the version of the app that enqueues a task is the default version (http://app-id.appspot.com or a Google Apps domain), the queue uses whichever version is the default when the task runs, even if the default version has changed since the task was enqueued. In other words, if the app enqueues a task and the default version then changes, the queue performs the task by requesting the task's URL path from the new default version.
If the version of the app that enqueues a task is not the default version when the task is enqueued (such as http://3.latest.app-id.appspot.com/), the queue will use that version of the app to perform the task, regardless of which version is the default version.
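The two version rules above can be condensed into a small decision function. This is only a model of the described behavior; `version_for_task` and its parameter names are hypothetical, and versions are represented as plain strings.

```python
def version_for_task(enqueued_by, default_at_enqueue, default_now):
    """Pick the app version that performs a task.

    enqueued_by:        version that enqueued the task
    default_at_enqueue: default version at the time of enqueueing
    default_now:        default version when the task is performed
    """
    if enqueued_by == default_at_enqueue:
        # Enqueued by the default version: follow the current default.
        return default_now
    # Enqueued by a non-default version: stay pinned to that version.
    return enqueued_by
```

One consequence is that tasks enqueued by the default version may be performed by newer code than the code that created them, so worker handlers should stay compatible with payloads produced by the previous version during a rollout.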
You can manage task queues for an application using the Administration Console. You can use the Console to list queues and their configuration, inspect the tasks currently waiting to be executed, and manually delete individual tasks or every task in a queue. This is useful if a task cannot be completed successfully and is stuck waiting to be retried.
To manage queues, sign in to the Administration Console and select "Task Queues." Note that the default queue only appears in the Console after the app has enqueued a task to it for the first time.
When your app is running in the development server, task queues are not processed automatically. Instead, task queues accrue tasks which you can examine and execute from the developer console:
http://localhost:8080/_ah/admin/taskqueue
To execute tasks, select the queue by clicking on its name, then select the tasks to execute. To clear a queue without executing any tasks, click the "Flush Queue" button.
Execution of a task counts toward the following quotas:
The act of executing a task consumes bandwidth-related quotas for the request and response data, just as if the request handler were called by a remote client. When the task queue processes a task, the response data is discarded.
For more information on quotas, see Quotas, and the "Quota Details" section of the Admin Console.
In addition to quotas, the following limits apply to the use of Task Queues:
| Limit | Amount |
|---|---|
| task object size | 10 kilobytes |
| number of active queues (not including the default queue) | 10 |
| total queue execution rate | 20 task invocations per second |
| maximum countdown/ETA for a task | 30 days from the current date and time |
Note: Quotas and limits for the Task Queue service may change while the feature is still experimental.
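An app might check tasks against the per-task limits in the table before enqueueing them, along the lines of the sketch below. Several details are assumptions: 10 kilobytes is taken to mean 10 * 1024 bytes, the payload length is used as an approximation of the full task object size, and `check_task` is a hypothetical helper. Since the limits may change, the constants would need to be kept in sync with the current documentation.

```python
import datetime

# Assumed interpretation of the limits in the table above.
MAX_TASK_BYTES = 10 * 1024
MAX_ETA_DELTA = datetime.timedelta(days=30)


def check_task(payload, countdown_seconds=0):
    """Return a list of limit violations for a prospective task.

    The payload length only approximates the task object size, which
    also includes the URL, headers, and other metadata.
    """
    problems = []
    if len(payload) > MAX_TASK_BYTES:
        problems.append('task too large')
    if datetime.timedelta(seconds=countdown_seconds) > MAX_ETA_DELTA:
        problems.append('countdown too far in the future')
    return problems
```

An empty list means the task passes both checks under these assumptions; it does not guarantee that the service will accept it.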
google.appengine.api.labs.taskqueue
In particular, we are looking for feedback on the following:
Once the API is finalized, we'll move it out of Labs to its final location at:
google.appengine.api.taskqueue
Of course, we'll give several weeks' notice before this happens, and we'll work with developers on the best transition possible. Please keep in mind that the following may change before the Task Queue API leaves Labs: