.. _async:

``madrona.async`` - Asynchronous Processing
===========================================

Madrona includes a strategy for running lengthy processes in the background. To
implement this strategy we use Celery as our distributed task queue, and we
created the ``madrona.async`` app for easy exchanges between the Celery tables
and the codebase. For more information on how to get Celery working on your
machine, see the :ref:`Asynchronous Task Queue` documentation.

Overview
********

The ``madrona.async`` app simplifies many of the typical interactions with
Celery. It provides the ability to store and retrieve process results based on
a url (as it is often the case that the same url request expects the same
result), and it makes common Celery requests and interactions, such as checking
the status or retrieving the results of a task, easier by abstracting away the
need to manually dig through the Celery tables yourself.

.. note::

    If Celery is not yet set up on your machine, you'll want to see the
    :ref:`Asynchronous Task Queue` documentation to get Celery up and running.

How to Use the Async App
************************

First, you will want to add a ``tasks.py`` file to the app that contains the
process you want to run asynchronously. This file will house the tasks that
will be called by the ``async`` app, and each of the tasks should define a
discrete operation and be tagged as such (``@task``) so that they will be
registered with Celery when the ``celeryd`` process is started. The following
shows a sample ``tasks.py`` with a simple ``add`` method that may be useful for
testing:

.. code-block:: python

    from celery.decorators import task

    @task
    def add(x, y):
        return x + y

Once you have a ``tasks.py``, you'll be able to use the ``async`` app to start
your tasks, check their status, and retrieve their results.

Basic Uses
----------

Oftentimes, you'll want to simply run a process in the background. In these
cases a simple call to ``async.check_status_or_begin`` with the task method and
the task method arguments will suffice.

.. code-block:: python

    from my_app import tasks

    status_text, task_id = tasks.check_status_or_begin(tasks.add, task_args=(3,5))

The above call returns an id that helps to identify the process for later
retrieval. Rather than force you to store this ``task_id`` somewhere for later
use, it may be more useful to use the url that triggered this process in the
first place.

.. code-block:: python

    from my_app import tasks

    url = request.META['PATH_INFO']
    status_text, task_id = tasks.check_status_or_begin(tasks.add, task_args=(3,5), polling_url=url)

The advantage to this url strategy is that there is often a one-to-one
correlation between a url and an expected result. When the same url is
accessed, such as ``/add/3/5/``, the ``async`` methods can be used to determine
whether the process associated with that url has already completed and its
result can be returned to the user (without running the process again), or
whether the process is still running and the user should receive some sort of
'process is still running...' message.

For the most part, methods in the ``async`` app have been configured to use
both task ids and polling urls as identifiers (or keys) to a process. A common
flow of control may be as follows:

.. code-block:: python

    # get the url that caused this view to execute
    url = request.META['PATH_INFO']

    # check to see if the requested process has been run already
    if process_is_complete(url):
        return HttpResponse(str(get_process_result(url)))
    else:
        # start the process or continue to wait for the process to complete
        from my_app import tasks
        status_text, task_id = check_status_or_begin(tasks.add, task_args=(3,5), polling_url=url)
        return render_to_response(my_template, RequestContext(
            request, {'status': status_text}
        ))

The above strategy allows the code to deal with the possibility that the
process has already completed and the results are cached, that the process is
still running in the background, or that the process hasn't begun at all. If
the results have already been cached, they can be retrieved with the
``get_process_result`` method. In the other cases, the
``check_status_or_begin`` method will provide the user with an explanation of
whether the process is still running or has just begun. In both of these latter
cases, the task_id is returned as well in case you wish to use that as an
identifier rather than the url.

.. note::

    The manner in which the import tasks statement is structured is very
    important to Celery. Where one of the following strategies may work on one
    machine or platform, the other strategy might be necessary on another
    machine or platform.

    .. code-block:: python

        >>> from my_proj.my_app.tasks import add
        >>> result = add.delay(2,2)
        >>> result.status
        PENDING

        >>> from my_proj.my_app import tasks
        >>> result = tasks.add.delay(2,2)
        >>> result.status
        SUCCESS

    If the process seems to register with Celery but never completes (status
    equals ``PENDING`` and never changes), then your import command may not be
    structured correctly for your platform. If ``result.status`` eventually
    returns ``STARTED`` or ``SUCCESS``, then your import command is structured
    correctly and should be written as such in your code.

madrona.async API
-----------------

The following is a list of all the functions included with the ``async`` app.

**check_status_or_begin(task_method, task_args=(), task_kwargs={}, polling_url=None, task_id=None, check_running=True, cache_results=True)**

If ``check_running`` is left as ``True``, this method begins the process only
if the process is not already running.

.. note::

    In order to check whether the process is running or not, either a
    ``polling_url`` or a ``task_id`` must be passed. If neither is provided,
    the method assumes that this check should not be made.

If ``check_running`` is set to ``False`` (or if neither ``task_id`` nor
``polling_url`` is provided), this method begins the process. In such cases,
the function referred to by ``task_method`` will be called with the arguments
included in the ``task_args`` parameter. If ``polling_url`` is given a value
and ``cache_results`` remains set to ``True``, then the ``polling_url`` can, in
the future, be used as a key for cache retrieval.

.. note::

    This method does not check to see if the process has already been
    completed. The ``process_is_complete`` method can be used to check for
    process completion, and the ``get_process_result`` method can be used for
    retrieving the results.

The return values are a rendered template, explaining whether the process was
already running or has just been started, and the task_id of that process.
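
The following is a minimal sketch exercising the remaining parameters of
``check_status_or_begin``. It assumes the function is imported the same way as
in the earlier examples, and the keyword argument ``y`` is only illustrative
for the sample ``add`` task shown above.

.. code-block:: python

    from my_app import tasks

    url = request.META['PATH_INFO']

    # Begin the task without checking for an already-running process, passing
    # both positional and keyword arguments through to tasks.add, and cache
    # the result under the polling url so it can be retrieved later with
    # get_process_result(url).
    status_text, task_id = check_status_or_begin(
        tasks.add,
        task_args=(3,),
        task_kwargs={'y': 5},
        polling_url=url,
        check_running=False,
        cache_results=True,
    )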

**process_is_running_or_complete(polling_url=None, task_id=None)**

This method takes either the polling url or the task id as a unique identifier.
Returns ``True`` if the process is currently running, or if the process is
complete.

**process_is_running(polling_url=None, task_id=None)**

This method takes either the polling url or the task id as a unique identifier.
Returns ``True`` if the process is running (``status=='STARTED'``).

**process_is_complete(polling_url=None, task_id=None)**

This method takes either the polling url or the task id as a unique identifier.
Returns ``True`` if the process is complete (``status=='SUCCESS'``).

**get_process_result(polling_url=None, task_id=None)**

This method takes either the polling url or the task id as a unique identifier.
Returns the cached result of the process.

**get_taskid_from_url(polling_url=None)**

This method takes a polling url and returns the related task id.

**get_url_from_taskid(task_id=None)**

This method takes a task id and returns the related polling url.
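
As a closing sketch, the lookup helpers can be combined with the status checks
above. This snippet assumes the helpers are imported as in the earlier examples
and that a task was previously started with ``polling_url`` set; the url and
messages are only illustrative.

.. code-block:: python

    url = '/add/3/5/'

    if process_is_running(polling_url=url):
        # recover the Celery task id that was associated with this url
        task_id = get_taskid_from_url(polling_url=url)
        status_message = 'task %s is still running...' % task_id
    elif process_is_complete(polling_url=url):
        status_message = 'result: %s' % get_process_result(polling_url=url)
    else:
        status_message = 'no process has been started for this url'

    # the mapping also works in reverse: given a task id, recover its url
    # polling_url = get_url_from_taskid(task_id=task_id)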