Cron and Queues in Drupal 8

Cron is used to perform periodic actions. For example, you might want to:

  • Send a weekly newsletter every Monday at 12:00 a.m.
  • Create a database backup once per day.
  • Publish or unpublish a scheduled node.
  • Send reminder emails to users to activate their accounts.

... or any other task that has to be automated and run at specific intervals.

Cron in Drupal

Cron configuration can be found at Administration > Configuration > System > Cron

What tasks does Drupal perform when cron is run?

This depends entirely on which modules you have enabled, but here are some common examples of tasks that run during cron:

  • Updating search indexes when using the core Search module.
  • Publishing or unpublishing nodes when using the Scheduler module.
  • If you have the Update Manager module enabled, a task checks for available updates. It also sends an email if you configured it to do so.
  • If you have dblog (Database Logging) enabled, a task deletes old log messages once a set limit is exceeded.
  • Temporary uploaded files are deleted by the File module.
  • Fetching aggregated content when using the Aggregator module.

Running cron

First we have the Automated Cron core module (sometimes referred to as "poor man's cron"), which checks during a page request when cron was last run. If it has been too long, it processes the cron tasks as part of that request.

Cron is set to run every three hours.

There are two things to consider when using this approach. First, if no one visits your website, cron doesn't run. Second, if the website is complex or the cron tasks are heavy, they can exceed the memory limit and slow down the page request.

The second approach is to actually set up a cron job that runs at the intervals you specify. How you configure this depends on the system you use, but it typically isn't hard to do. If you use a shared host, you can most likely do it right in your control panel, and if you have your own server you can use the crontab command.

Read "Configuring cron jobs using the cron command" on drupal.org for more details.

Implementing Cron tasks in Drupal

Cron tasks are defined by implementing the hook_cron hook in your module, just like in previous Drupal versions.

/**
 * Implements hook_cron().
 */
function example_cron() {  
  // Do something here.
}

And that's pretty much it. Rebuild the cache, and the next time cron runs your hook will be called.

There are a couple of things we have to take into consideration:

When did my Cron task run the last time?

One way to remember that is to use the State API, which stores transient information. The documentation explains it like this:

It is specific to an individual environment. You will never want to deploy it between environments.
You can reset a system, losing all state. Its configuration remains.
So, use State API to store transient information, that is okay to lose after a reset. Think: CSRF tokens, tracking when something non-critical last happened …

With that in mind, we could do something like:

$last_run = \Drupal::state()->get('example.last_run', 0);

// If 60 minutes passed since last time.
if ((REQUEST_TIME - $last_run) > 3600) {
  // Do something.

  // Update last run.
  \Drupal::state()->set('example.last_run', REQUEST_TIME);
}

This ensures that our task runs at most once per hour. Note, though, that if cron itself is set to run at an interval longer than one hour, the task won't run every hour. (Who could have guessed that?) And if you use Automated Cron and the site has no activity for a few hours, cron won't run during that time either.
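
Put together, the check could live in a complete hook implementation. The following is a minimal sketch, assuming a module named example: the helper example_cron_is_due() and the 3600-second interval are illustrative and not part of Drupal. It reads the request time through \Drupal::time()->getRequestTime(), which is available from Drupal 8.3 onwards, instead of the REQUEST_TIME constant, and it treats "exactly one hour" as due.

```php
<?php

/**
 * Returns TRUE when at least $interval seconds have passed since $last_run.
 *
 * Hypothetical helper, not part of Drupal. It is kept free of Drupal calls
 * so the time logic can be checked on its own.
 */
function example_cron_is_due($last_run, $now, $interval) {
  return ($now - $last_run) >= $interval;
}

/**
 * Implements hook_cron().
 */
function example_cron() {
  $now = \Drupal::time()->getRequestTime();
  $last_run = (int) \Drupal::state()->get('example.last_run', 0);

  if (example_cron_is_due($last_run, $now, 3600)) {
    // Do something.

    // Remember when the task last ran.
    \Drupal::state()->set('example.last_run', $now);
  }
}
```

Keeping the comparison in its own function is only a convenience: it makes the interval easy to change and easy to test without a running site.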

How time consuming is my task?

Operations like deleting rows from a database table with a timestamp as the condition are pretty light tasks and can be executed directly in the hook_cron implementation. Like so:

// Example from the docs.
$expires = \Drupal::state()->get('mymodule.last_check', 0);
\Drupal::database()->delete('mymodule_table')
  ->condition('expires', $expires, '>=')
  ->execute();
\Drupal::state()->set('mymodule.last_check', REQUEST_TIME);

But if you have to run tasks that take time, such as generating PDFs, updating a lot of nodes, or importing aggregated content, you should use QueueWorkers instead. They let you split the work into a queue whose items are processed over the course of later cron runs, which prevents a single cron run from failing due to a timeout.

QueueWorkers and Queues

So, we have a long-running task we want to process. As mentioned earlier, we shouldn't put all the processing into the hook, as that can lead to timeouts and failures. Instead we split the work into queue items, which are processed during later cron runs.
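
As a rough illustration of splitting work into queue items, here is a minimal sketch. Everything named example_bulk is hypothetical (the module, the queue name, the state key, and the helper example_bulk_build_items()), and a queue filled this way would need its own QueueWorker that understands items carrying a list of IDs. Only \Drupal::state(), \Drupal::queue(), and createItem() are Drupal APIs here.

```php
<?php

/**
 * Splits a list of IDs into queue items holding at most $size IDs each.
 *
 * Hypothetical helper, not part of Drupal.
 */
function example_bulk_build_items(array $ids, $size) {
  $items = [];
  foreach (array_chunk($ids, $size) as $chunk) {
    $items[] = (object) ['ids' => $chunk];
  }
  return $items;
}

/**
 * Implements hook_cron().
 */
function example_bulk_cron() {
  // Assumption: the IDs waiting to be processed are tracked in state under
  // this hypothetical key. Replace it with your own lookup.
  $ids = \Drupal::state()->get('example_bulk.pending_ids', []);

  $queue = \Drupal::queue('example_bulk');
  foreach (example_bulk_build_items($ids, 50) as $item) {
    $queue->createItem($item);
  }

  // Clear the list so the same IDs are not queued again on the next run.
  \Drupal::state()->set('example_bulk.pending_ids', []);
}
```

The chunk size of 50 is arbitrary. Smaller items mean each one finishes quickly, so a cron run can stop between items without losing much work.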

So let's pretend we've created a site where users can subscribe to things, and when they do, they get an email with an attached PDF. For the sake of the example we'll also send emails to the admins to tell them that someone subscribed. Both sending emails and generating PDFs are long-running tasks, especially if we do them at the same time, so let's add those items to a queue and let a queue worker process them instead.

To add an item to a queue, we first get the queue and then add the item to it:

// Get queue.
$queue = \Drupal::queue('example_queue');

// Add some fake data.
$uid = 1;
$subscriber_id = 2;
$item = (object) ['uid' => $uid, 'subscriber_id' => $subscriber_id];

// Create item to queue.
$queue->createItem($item);

So we get a queue object by name, and that name is later used to identify which QueueWorker should process it. Then we add an item to it by simply calling the createItem method.

Next we'll have to create a QueueWorker plugin. The QueueWorker is responsible for processing a given queue, a set of items.

Let's define a plugin with some pseudo long running task:

modules/custom/example_queue/src/Plugin/QueueWorker/ExampleQueueWorker.php:

<?php  
/**
 * @file
 * Contains \Drupal\example_queue\Plugin\QueueWorker\ExampleQueueWorker.
 */

namespace Drupal\example_queue\Plugin\QueueWorker;

use Drupal\Core\Queue\QueueWorkerBase;

/**
 * Processes tasks for example module.
 *
 * @QueueWorker(
 *   id = "example_queue",
 *   title = @Translation("Example: Queue worker"),
 *   cron = {"time" = 90}
 * )
 */
class ExampleQueueWorker extends QueueWorkerBase {

  /**
   * {@inheritdoc}
   */
  public function processItem($item) {
    // $item is the data that was passed to createItem().
    $uid = $item->uid;
    $subscriber_id = $item->subscriber_id;

    $user = \Drupal\user\Entity\User::load($uid);

    // Get some email service (a pseudo service for this example).
    $email_service = \Drupal::service('example.email');

    // Generate the PDF (a pseudo service for this example).
    $subscriber_service = \Drupal::service('example.subscriber_pdf');
    $pdf_attachment = $subscriber_service->buildPdf($subscriber_id, $user);

    // Do some stuff and send a mail.
    $email_service->prepareEmail($pdf_attachment);
    $email_service->send();

    // Let the admins know that someone subscribed.
    $email_service->notifyAdmins($subscriber_id, $user);
  }

}

So let's break it down.

We use the annotation to tell Drupal that the class we created is a QueueWorker plugin.

/**
 * Processes tasks for example module.
 *
 * @QueueWorker(
 *   id = "example_queue",
 *   title = @Translation("Example: Queue worker"),
 *   cron = {"time" = 90}
 * )
 */

The id argument is the most important, since it must match the name of the queue we added items to earlier (example_queue).

The cron argument is optional and tells Drupal the maximum time, in seconds, that it should spend processing the queue during a cron run; for this example we used 90 seconds. If you leave it out, cron does not process the queue at all and you have to process it yourself.
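
To make the time limit more concrete, here is a simplified sketch of a loop that drains a queue within a time budget. It is an illustration only and not Drupal's actual cron code: the function name example_queue_process() is hypothetical, and the error handling (release the item and stop) is an assumption made for this sketch. The methods claimItem(), deleteItem(), and releaseItem() belong to Drupal's queue interface, and processItem() is the worker method we implement.

```php
<?php

/**
 * Processes queue items until the queue is empty or the time budget is spent.
 *
 * Hypothetical helper, not part of Drupal. $queue is expected to behave like
 * \Drupal\Core\Queue\QueueInterface and $worker like a QueueWorker plugin.
 *
 * @return int
 *   The number of items that were processed successfully.
 */
function example_queue_process($queue, $worker, $time_limit) {
  $end = time() + $time_limit;
  $processed = 0;

  // The clock is checked before each item is claimed, so an item that is
  // already being processed is never interrupted.
  while (time() < $end && ($item = $queue->claimItem())) {
    try {
      $worker->processItem($item->data);
      // Only delete the item once it was processed without errors.
      $queue->deleteItem($item);
      $processed++;
    }
    catch (\Exception $e) {
      // Assumption for this sketch: hand the item back and stop, so that it
      // can be retried on a later run.
      $queue->releaseItem($item);
      break;
    }
  }

  return $processed;
}

// Usage inside a bootstrapped Drupal site:
// $queue = \Drupal::queue('example_queue');
// $worker = \Drupal::service('plugin.manager.queue_worker')
//   ->createInstance('example_queue');
// example_queue_process($queue, $worker, 90);
```

Because the time is only checked between items, a single slow item can push a run past its budget. That is one more reason to keep each queue item small.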

Then we implement the processItem($item) method, which receives the data we passed to createItem() for each queued item.

In the pseudo example I load the user by the uid we passed in the queue item and then get two services: one that generates a PDF (a pretty heavy operation) and one that emails it. We then send emails to all the admins through the notifyAdmins method. So that was pretty simple: we create a new plugin class, use the annotation to tell Drupal it's a plugin, and then implement the method that receives the data we added to the queue.

For this example we queued an operation that doesn't necessarily belong in the cron hook, but rather at the point where the user actually subscribes to something. So what I'm essentially saying here is that you don't need to create queue items in a cron hook; you can do that anywhere in your code.
In practice it's the same thing: you get the queue with $queue = \Drupal::queue('example_queue'), add an item with $queue->createItem($data), and define a QueueWorker which then processes the queue items when cron runs.

So the question we should ask ourselves here is: should we add individual tasks to a queue and let cron process them? And the answer: it depends. If the task slows down the request and keeps the user waiting, it's definitely something to consider. These things may be a better case for something like a background job, but you may not always be able to use one (and nothing like it comes out of the box in Drupal). In that case, moving the work to cron takes significant time off the request, so it's not too slow for the user (or prone to timeouts, for that matter).

Here's all the code without the pseudo code, which you can use as boilerplate:

<?php  
/**
 * @file
 * Contains \Drupal\example_queue\Plugin\QueueWorker\ExampleQueueWorker.
 */

namespace Drupal\example_queue\Plugin\QueueWorker;

use Drupal\Core\Queue\QueueWorkerBase;

/**
 * Processes tasks for example module.
 *
 * @QueueWorker(
 *   id = "example_queue",
 *   title = @Translation("Example: Queue worker"),
 *   cron = {"time" = 90}
 * )
 */
class ExampleQueueWorker extends QueueWorkerBase {

  /**
   * {@inheritdoc}
   */
  public function processItem($item) {
  }

}

For a real example, take a look at the Aggregator module which uses Cron and QueueWorkers.