Django Celery

  • Celery 4.0 supports Django 1.8 and newer versions. Please use Celery 3.1 for versions of Django older than 1.8. To use Celery with your Django project, you must first define an instance of the Celery library (called an “app”).
  • django_celery_beat.models.PeriodicTasks: this model is used only as an index to keep track of when the schedule has changed. Whenever you update a PeriodicTask, a counter in this table is incremented, which tells the celery beat service to reload the schedule from the database.

Introduction

With Celery, you can manage asynchronous tasks in Django: schedule work that runs outside the HTTP request/response flow, ensuring that your users are never slowed down by jobs like running machine learning models.

The time has come: the application we created and developed is ready for deployment. In this post, we are going to show a quick way of setting it up for “production” using:

  • RabbitMQ as a message broker,
  • Gunicorn for running the application,
  • Supervisor for monitoring both the Django and the Celery parts, and
  • Nginx to be our proxy server.

The deployment strategy is largely inspired by this work. However, it is slightly more tuned towards the author’s own needs for this project.

Creating space

Linux permissions

The space for our application will be a designated Linux user with non-sudo privileges, sharing permissions through the devapps group.

The following series of commands sets up the user space for the project with all required permissions.
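
The exact commands are not shown here; a minimal sketch could look as follows (the hello_celery user and devapps group names match the paths used later in this post — adjust to your setup):

    # create the shared group and the application user (no sudo rights)
    sudo groupadd devapps
    sudo useradd --create-home --home-dir /devapps/hello_celery \
         --shell /bin/bash --gid devapps hello_celery

    # let the group share access to the application directory
    sudo chown -R hello_celery:devapps /devapps/hello_celery
    sudo chmod -R g+rwX /devapps/hello_celery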

Virtual environment

As mentioned at the end of the earlier posts, we use a virtual environment to keep track of all package dependencies of each project. Once again, we need to set up the virtual environment, this time on the server. For that, we will switch to the hello_celery user.
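
A sketch of those steps, assuming virtualenv is already installed system-wide (the env directory name is a convention, not taken from the original post):

    sudo su - hello_celery
    virtualenv /devapps/hello_celery/env
    source /devapps/hello_celery/env/bin/activate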

Porting the project

Django

Last time, we committed our project to GitHub. Having prepared the space for it, we can now clone the repository and use the requirements.txt file to recreate our environment.
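
Roughly (the repository URL below is a placeholder for your own):

    cd /devapps/hello_celery
    git clone https://github.com/<your-account>/hello_celery.git
    pip install -r hello_celery/requirements.txt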

The project is now stored under /devapps/hello_celery/hello_celery.

Updating settings

There are two things we must not forget when moving the project towards production:

  • We will use a different IP address or domain name than 127.0.0.1.
  • We do not want DEBUG = True, as this will expose sensitive information about the application to the outside world.

For this reason, settings.py needs to be updated with the proper domain name (or IP address), and DEBUG set to False.
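
For example (assuming Django 1.5 or newer, where ALLOWED_HOSTS is enforced; the host values are placeholders):

    # settings.py
    DEBUG = False
    ALLOWED_HOSTS = ['your-domain.example', '192.168.xxx.yyy']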

RabbitMQ

RabbitMQ should already be installed on Ubuntu by default. Still, it can make sense to update it to the newest version.
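
On Ubuntu, that amounts to:

    sudo apt-get update
    sudo apt-get install rabbitmq-server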

Sometimes RabbitMQ may refuse to work, which happens if, for example, the server was previously operating under a different IP address. If this could be the case, inspect the /etc/hosts file for consistency and execute sudo /etc/init.d/rabbitmq-server restart.

Gunicorn

The project is now ready to be served by proper tools. As the first tool, we will configure Gunicorn. Assuming we are still logged in as the hello_celery user and operating inside the virtual environment, we can install Gunicorn through pip.
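
That is simply:

    pip install gunicorn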

To test it, we can point Gunicorn at the project’s wsgi.py module in the following way.
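
A plausible form of that command, run from the project directory (the address placeholders below are the same ones used in the text; substitute real values):

    cd /devapps/hello_celery/hello_celery
    gunicorn hello_celery.wsgi:application --bind 192.168.xxx.yyy:pppp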

Here, 192.168.xxx.yyy:pppp refers to a local IP address and a designated port we can use in our LAN to test the application. At this stage, the Celery-related part will not function, though.

Daemonizing Gunicorn

To daemonize Gunicorn, we can set up a script (like the one here) and place it under /devapps/hello_celery/bin/gunicorn_start.
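
A condensed sketch of such a script, using this post’s paths (the worker count and the local bind address are assumptions, not values from the original):

    #!/bin/bash
    # /devapps/hello_celery/bin/gunicorn_start

    NAME="hello_celery"                            # process name
    DJANGODIR=/devapps/hello_celery/hello_celery   # Django project directory
    USER=hello_celery
    GROUP=devapps
    NUM_WORKERS=3                                  # 2 * CPUs + 1 is a common rule
    DJANGO_WSGI_MODULE=hello_celery.wsgi

    cd $DJANGODIR
    source /devapps/hello_celery/env/bin/activate

    exec gunicorn ${DJANGO_WSGI_MODULE}:application \
        --name $NAME \
        --workers $NUM_WORKERS \
        --user=$USER --group=$GROUP \
        --bind=127.0.0.1:8000 \
        --log-level=info \
        --log-file=-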

At this point, the script should be able to run the application when executed as the hello_celery user. Note that the file needs to be granted execute permission (chmod +x /devapps/hello_celery/bin/gunicorn_start).

Supervisor

Remember, earlier we had to use two terminal windows to run the project: one for the Django application and the other for Celery. Here, we are going to use Supervisor to monitor both of them and let them operate in the background.

Configuration

First, we must install Supervisor, if it is not already present on our machine.
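
On Ubuntu:

    sudo apt-get install supervisor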

Once installed, Supervisor is configured through a set of .conf files specific to the processes being run. Both of the files below should reside under /etc/supervisor/conf.d/.

hello_celery.conf (for the main application)
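
A sketch of what this file could contain, assuming the gunicorn_start script from above (paths follow this post’s conventions, not the original file verbatim):

    [program:hello_celery]
    command = /devapps/hello_celery/bin/gunicorn_start
    user = hello_celery
    stdout_logfile = /devapps/hello_celery/logs/hello_celery.log
    redirect_stderr = true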

hello_celery_celery.conf (for the Celery part)
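
A corresponding sketch for the Celery worker (assuming the Celery app is named hello_celery; again, not reproduced verbatim from the original):

    [program:hello_celery_celery]
    command = /devapps/hello_celery/env/bin/celery -A hello_celery worker -l info
    directory = /devapps/hello_celery/hello_celery
    user = hello_celery
    stdout_logfile = /devapps/hello_celery/logs/hello_celery-celery.log
    redirect_stderr = true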

Launching

Once both files are saved, execute:
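
Supervisor’s standard sequence for picking up new configuration files is:

    sudo supervisorctl reread
    sudo supervisorctl update
    sudo supervisorctl start hello_celery
    sudo supervisorctl start hello_celery_celery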

Make sure that the /devapps/hello_celery/logs directory exists. If necessary, use the touch command to create empty hello_celery.log and hello_celery-celery.log files.

Additional useful commands include replacing the word start with restart, stop, or status. Also, in case of errors, the logs can be inspected using tail -f /devapps/hello_celery/logs/<logfile>.

Nginx

The last piece of the puzzle is the proxy server. Here, we use Nginx.

If the server is, for some reason, not set to listen on the usual port 80, the default configuration must be updated under /etc/nginx/sites-available/default, in which the line listen 80; should be replaced with the correct port (e.g. listen 1234;). When updated, the service can be initiated by making a symbolic link in sites-enabled and executing the following:
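
That would be, roughly:

    sudo ln -s /etc/nginx/sites-available/default /etc/nginx/sites-enabled/default
    sudo service nginx restart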

Configuration

To configure Nginx to serve our application, we define the settings in /etc/nginx/sites-available/hello_celery.

hello_celery (excluding all comments)
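
The original file is not reproduced here; a minimal sketch that matches the Gunicorn bind address used above (server_name and the static path are placeholders):

    upstream hello_celery_app_server {
        server 127.0.0.1:8000 fail_timeout=0;
    }

    server {
        listen 80;
        server_name your-domain.example;

        access_log /devapps/hello_celery/logs/nginx-access.log;
        error_log  /devapps/hello_celery/logs/nginx-error.log;

        location /static/ {
            alias /devapps/hello_celery/hello_celery/static/;
        }

        location / {
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Host $http_host;
            proxy_redirect off;
            proxy_pass http://hello_celery_app_server;
        }
    }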

Launching

Finally, we create another symbolic link and restart Nginx.
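
    sudo ln -s /etc/nginx/sites-available/hello_celery /etc/nginx/sites-enabled/hello_celery
    sudo service nginx restart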

Now, the application should be running on the server.

Final Words

In this series of three posts, we have come a long way. We created an application from scratch, used Celery as an asynchronous task scheduler, and added an AJAX-based mechanism for monitoring progress. Finally, we presented a quick way towards deployment in a few steps.

Obviously, Celery offers many more ways of scheduling tasks, just as AJAX APIs offer many more options for handling requests. The core part is there, however, and it is up to your needs and imagination what you will use it for.

Hey! Do you mind helping me out?

It's been 4 years since I launched this blog. Now, I would like to bring it to the next level. I want to record some screencast tutorial videos on the very topics that brought you here!

If you want more of this stuff, you will help me greatly by filling out a survey I have prepared for you. By clicking below, you will be redirected to Google Forms with a few questions. Please answer them; they won't take more than 5 minutes, and I do not collect any personal data.

Thank you! I appreciate it.

Debugging Celery Tasks in Django Projects

I recently had the opportunity to work on a Django project that was using Celery with RabbitMQ to handle long-running server-side processing tasks. Some of the tasks took several hours to complete. The tasks had originally been executed with the at command and others had been managed with cron jobs. The client had started to migrate several of the tasks to use Celery when I joined the project.

As I started to debug the Celery tasks, I quickly discovered there were many moving parts involved, and it was not immediately clear which piece of the system was the cause of the problem. I was working in a local development environment that closely matched the production system. The dev environment consisted of a virtualenv created using the same method Szymon Guz wrote about in his article Django and virtualenvwrapper. Django, Celery and their related dependencies were installed with pip, and I installed RabbitMQ with Homebrew. After activating my virtualenv and starting everything up (or so I thought), I jumped into an IPython shell and began to debug the tasks interactively. Some tasks completed successfully, but they finished almost instantaneously, which didn’t seem right. The client had experienced the same issue when they executed tasks on their development server.

Because I was joining an existing project in progress, the system administration and configuration had already been taken care of by other members of the team. However, in the process of configuring my local development server to mimic the production systems, I learned a few things along the way, described below.

RabbitMQ

RabbitMQ is a message broker; at its most basic it sends and receives messages between sender (publisher) and receiver (consumer) applications. It’s written in Erlang which helps to make it highly parallel and reliable. The RabbitMQ web site is a great place to learn more about the project. For my purposes I needed to create a user and virtual host (vhost) and set up permissions for Celery to communicate with the RabbitMQ server. This was done with the rabbitmqctl command. I issued the following command to start up the server and let the process run in the background.
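
The commands themselves are not shown above; they presumably looked something like this (user, password and vhost names are placeholders):

    rabbitmqctl add_user celeryuser celerypassword
    rabbitmqctl add_vhost celeryvhost
    rabbitmqctl set_permissions -p celeryvhost celeryuser ".*" ".*" ".*"

    # start the broker and leave it running in the background
    rabbitmq-server -detached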

I also enabled the management plugin, which provides both a web-based UI and a command line interface for managing and monitoring RabbitMQ.
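
Enabling the plugin is a one-liner (followed by a broker restart):

    rabbitmq-plugins enable rabbitmq_management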

django-celery

Celery works very well with Django thanks in large part to the django-celery module. The django-celery module includes the djcelery app which can be plugged in to the Django admin site for your project. Connecting Django to Celery and RabbitMQ requires a few simple steps:

  1. Add djcelery to the list of INSTALLED_APPS in the settings.py file for the project.
  2. Add the djcelery loader lines to settings.py (see the sketch after this list).
  3. Create the celery database tables using the syncdb management command.
  4. Configure the broker settings in settings.py (also covered in the sketch below).
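
A sketch of the resulting settings.py, in the django-celery / Celery 2.x style this post describes (broker credentials are placeholders matching the user and vhost created with rabbitmqctl above):

    # settings.py
    INSTALLED_APPS = (
        # ... the project's other apps ...
        'djcelery',
    )

    import djcelery
    djcelery.setup_loader()

    # broker settings for RabbitMQ
    BROKER_HOST = "localhost"
    BROKER_PORT = 5672
    BROKER_USER = "celeryuser"
    BROKER_PASSWORD = "celerypassword"
    BROKER_VHOST = "celeryvhost"

The celery tables are then created with:

    python manage.py syncdb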

Celery

With the RabbitMQ server up and running and Django configured to connect to Celery the last few steps involved starting up the Celery worker and its related monitoring apps. The Celery daemon (celeryd) has lots of options that you can check out by running the following command:
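
With django-celery, the daemon is wrapped by manage.py:

    python manage.py celeryd --help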

For my purposes, I wanted Celery to broadcast events which the various monitoring applications could then subscribe to. It would also be good to print some helpful debugging info to the logs. I started up the Celery worker daemon with the following command:
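
A likely form of that command, with -E enabling event broadcasting and -l DEBUG raising the log level:

    python manage.py celeryd -E -l DEBUG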

Because I specified the -E flag, the celeryev application could be used to monitor and manage the Celery worker inside a terminal, which was very helpful.
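
celeryev ships with Celery itself as a curses monitor; presumably it was started along these lines:

    celeryev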

For Django to capture and save Celery task information to the database, the celerycam application needs to be running. This command line app takes a snapshot of Celery every few seconds or at an interval you specify on the command line:
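
For instance, taking a snapshot every two seconds (the interval here is illustrative):

    python manage.py celerycam --frequency=2.0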

With celerycam running, the Django admin interface is updated as Celery tasks are executed.

You can also view the detail for a particular task, including any error messages from the task code.

With RabbitMQ, celeryd and celerycam ready to go, the Django development server could be started to begin testing and debugging Celery task code. To demonstrate this workflow in action, I wrote a simple Celery task that could be used to simulate how Django, Celery and RabbitMQ all work together.
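
The task itself is not reproduced above; a hypothetical equivalent in the Celery 2.x API of the era could be:

    # tasks.py inside a Django app
    import time

    from celery.task import task

    @task
    def add(x, y):
        """Simulate a long-running job, then return the sum."""
        time.sleep(5)
        return x + y

From the IPython shell it can then be exercised with add.delay(2, 3), and the result fetched with .get() once a result backend (such as django-celery's database backend) is configured.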

Tying it all Together

With everything configured, I was ready to get to work debugging some Celery tasks. I set up a dashboard of sorts in tmux to keep an eye on everything as I worked on the code for a particular task.

Clockwise from the bottom left, you’ll see an IPython shell for debugging the code interactively, the Django development server log, celeryev, the Celery daemon (with debugging info), and the task code in Vim.

When I started developing task-related code I wasn’t sure why my changes were not showing up in the Celery or Djcelery logs. Although I had made some changes, the same errors persisted. When I looked into this further I found that Celery caches the code used for a particular task and re-uses it the next time said task is executed. In order for my new changes to take effect I needed to restart the Celery daemon. As of Celery version 2.5 there is an option to have Celery autoreload tasks. However, the version of Celery used in this client project did not yet support this feature. If you find yourself working with Django, Celery and RabbitMQ, I hope you’ll find this helpful.
