Dylan Tack's blog

  • Choosier cron runs

    Jun 19, 2009

    hook_cron() is widely implemented in the Drupal ecosystem – but what if your modules have varying frequency needs? For example, perhaps you'd like your aggregator feeds to update every fifteen minutes, and notifications should fire every minute to keep emails timely. But system_cron() should run as infrequently as practicable, because it calls cache_clear_all()!

    Here is a small cron.php replacement that accomplishes this task. Save it as cron_selective.php (the name used in the example crontab lines in its header) and place it in your Drupal root directory. (You could move it to your sites folder if you modified the include path.) You can then configure a separate cron job for each module or set of modules.

    <?php
    // $Id$
     
    /**
     * @file
     * Handles incoming requests to fire off regularly-scheduled tasks (cron jobs).
     *
     * The file executes cron hooks selectively, instead of all-or-nothing.
     * This allows cron jobs to be configured with variable frequency.
     * Example usage:
     * * * * * * curl example.com/cron_selective.php?modules=notifications
     * *∕15 * * * * curl example.com/cron_selective.php?modules=aggregator
     * 0 0 * * * curl example.com/cron_selective.php?modules=system,dblog
     *
     */
     
    include_once './includes/bootstrap.inc';
    drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);
     
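    // Run cron only for enabled modules named in the comma-separated "modules"
    // parameter; a missing parameter means nothing is run.
    $requested = isset($_GET['modules']) ? explode(',', $_GET['modules']) : array();
    $modules = array_intersect(module_list(), $requested);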
     
    drupal_cron_run_selective($modules);
     
    /**
     * Executes the cron hooks of the given modules only.
     *
     * @param $modules
     *   An array of module names whose hook_cron() implementations should run.
     */
    function drupal_cron_run_selective($modules) {
      // If not in 'safe mode', increase the maximum execution time:
      if (!ini_get('safe_mode')) {
        set_time_limit(240);
      }
     
      // Fetch the cron semaphore
      $semaphore = variable_get('cron_semaphore', FALSE);
     
      if ($semaphore) {
        if (time() - $semaphore > 3600) {
          // Either cron has been running for more than an hour or the semaphore
          // was not reset due to a database error.
          watchdog('cron', 'Cron has been running for more than an hour and is most likely stuck.', array(), WATCHDOG_ERROR);
     
          // Release cron semaphore
          variable_del('cron_semaphore');
        }
        else {
          // Cron is still running normally.
          watchdog('cron', 'Attempting to re-run cron while it is already running.', array(), WATCHDOG_WARNING);
        }
      }
      else {
        // Register shutdown callback
        register_shutdown_function('drupal_cron_cleanup');
     
        // Lock cron semaphore
        variable_set('cron_semaphore', time());
     
        // Iterate through the modules calling their cron handlers (if any):
        foreach ($modules as $module) {
          module_invoke($module, 'cron');
          watchdog('cron', 'Selective cron run completed for %module.', array('%module' => $module), WATCHDOG_NOTICE);
        }
     
        // Release cron semaphore
        variable_del('cron_semaphore');
      }
    }
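
    Because the module list comes straight from the query string, it can help to normalize it before it reaches array_intersect(). Here is a minimal sketch, assuming a hypothetical helper named selective_cron_filter_modules() (it is not part of Drupal) that takes the raw parameter plus an array of enabled module names, such as the one module_list() returns:

```php
<?php
/**
 * Hypothetical helper, not part of Drupal: turns the raw "modules" query
 * parameter into a list of enabled module names.
 *
 * @param $raw
 *   The raw comma-separated string, e.g. "system, dblog", or NULL if absent.
 * @param $enabled
 *   An array of enabled module names.
 * @return
 *   The requested modules that are enabled, in the order of $enabled.
 */
function selective_cron_filter_modules($raw, array $enabled) {
  if (!is_string($raw) || $raw === '') {
    return array();
  }
  // Tolerate stray whitespace and empty entries such as "system,,dblog ".
  $requested = array_filter(array_map('trim', explode(',', $raw)), 'strlen');
  // array_intersect() preserves the keys of $enabled; reindex for a clean list.
  return array_values(array_intersect($enabled, $requested));
}
```

    Unknown or disabled module names are dropped silently, which matches the behavior of the array_intersect() call in the script.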

  • Running mod_ssl with Virtual Hosts

    Apr 17, 2009

    Like many Drupal developers, at any particular time I'm running dozens of Apache virtual hosts on my workstation. This allows access to each project under a friendly URL. We frequently create sites that employ SSL, which creates a wrinkle: officially, it's not possible to use name-based virtual hosting with SSL.

    Simply disabling the SSL features on a development copy isn't the best option, because we need to test this functionality as we're developing.

    It turns out that you can use SSL and vhosts together, sort of. Name-based virtual hosts can be configured on port 443 just like any other port. Apache won't stop you, though it will throw a stern warning when it starts up:

    [warn] Init: You should not use name-based virtual hosts in conjunction with SSL!!

    Double exclamation! This is SERIOUS!! Down to business:

    # httpd-vhosts.conf
    # Use name-based virtual hosting.
    NameVirtualHost *:80
    NameVirtualHost *:443
     
    # [PROJECT].conf
    # Create two virtual hosts, one on each port
    # You can have as many of these as you want
    <VirtualHost *:80>
      DocumentRoot [/OVER/THE/RAINBOW/TRUNK/DRUPAL]
      ServerName [PROJECT].[HOST].example.com
    </VirtualHost>
     
    <VirtualHost *:443>
      DocumentRoot [/OVER/THE/RAINBOW/TRUNK/DRUPAL]
      ServerName [PROJECT].[HOST].example.com
     
      SSLEngine on
      SSLCertificateFile ["/ETC/APACHE2/SERVER.CRT"]
      SSLCertificateKeyFile ["/ETC/APACHE2/SERVER.KEY"]
    </VirtualHost>   

    Now, the limitation described in the Apache manual still applies – we can only have one SSL certificate. This means that without doing anything further, you're going to get "host mismatch" validation errors in your browser. This can be a real pain, especially with Firefox 3's more aggressive stance on untrusted certs (which I fully support, for reasons that have been discussed to death).

    So what's needed is a wildcard certificate, which can be created by putting a * in the cert's common name, like *.[HOST].example.com.
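
    To see which virtual hosts such a certificate covers, here is a minimal sketch of the matching rule as RFC 2818 describes it, where the * stands in for exactly one leftmost label. The function name wildcard_cert_matches() and the host names are hypothetical, and real browsers apply their own additional checks, so treat this purely as an illustration:

```php
<?php
/**
 * Hypothetical illustration, not browser code: does $host match the
 * certificate common name $pattern?
 */
function wildcard_cert_matches($pattern, $host) {
  $pattern = strtolower($pattern);
  $host = strtolower($host);
  if (strpos($pattern, '*.') !== 0) {
    // No wildcard: the names must be identical.
    return $pattern === $host;
  }
  // Strip the "*" and keep ".dev.example.com" as the required suffix.
  $suffix = substr($pattern, 1);
  $suffix_length = strlen($suffix);
  if (strlen($host) <= $suffix_length || substr($host, -$suffix_length) !== $suffix) {
    return FALSE;
  }
  // The part covered by "*" must be a single label, i.e. contain no dots.
  $label = substr($host, 0, strlen($host) - $suffix_length);
  return strpos($label, '.') === FALSE;
}
```

    With a common name of *.dev.example.com, the host project1.dev.example.com matches, while a.b.dev.example.com and the bare dev.example.com do not. This is why every virtual host needs to sit exactly one level below the shared domain.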

    Since we want the browser to trust our cert, without needing to manually add an exception for each virtual host, we also need to create our own certificate authority. Here is a straightforward recipe for setting up a CA; it's an easy process requiring only a few more commands than a self-signed certificate. When you're finished, you can import your ca.crt into your browser's list of trusted roots.

    In Firefox, this is done under
    Preferences » Advanced » Encryption » View Certificates » Authorities » Import.
    For Safari, simply double-click the ca.crt file; it will open in Keychain Access. When asked, say "Always Trust".

    The conclusion is that SSL and virtual hosts can play nice, as long as they share a common domain. Looking to the future, there is a solution in a newer set of TLS extensions, called Server Name Indication. This supersedes an earlier, now abandoned approach known as TLS Upgrade, described in RFC 2817.

  • Content Delivery for the Masses

    Mar 17, 2009

    I attended several sessions at Drupalcon on performance and scalability. One common thread was the need to offload static files to dedicated servers or a content delivery network (CDN). The use of a CDN is also part of the well-known YSlow test for measuring front-end performance. With CDN providers getting cheaper by the day, content delivery is not just for big media companies anymore! A bit more explanation is required to understand why this is complicated in Drupal, and what's being done about it. As a bonus, I'll share a trick for getting a useful chunk of your files onto a CDN in less than three minutes.

    The state of content delivery in Drupal

    With the advent of cheap, auto-magical mirroring CDNs, this should be easy, right? One of the challenges is that links to files can come from several disparate Drupal subsystems:

    • Links to Drupal's filesystem are created with file_create_url(), which doesn't filter links through custom_url_rewrite_outbound(). There are several approaches in the works for D7 to address this; I've worked on one of the patches, though I'm not yet sure how the reworked File API impacts this.
    • Some files, like CSS, come from functions such as drupal_get_css() that do their own string URL manipulations.
    • Files may be linked from inline image tags added by content editors, either manually or via a WYSIWYG editor.
    • Modules and themes may add their own inline images: print '<img src="' . $base_path . path_to_theme() . '/images/logo_bw.png" />' isn't uncommon.

    At Drupalcon, Wim Leers presented his work on the CDN Integration module, and it seemed very promising to me. There are plans to support both push and pull CDN providers, and possibly even things like video transcoding servers. However, it is not yet production-ready, nor is it available for Drupal 6.

    So what are we to do? To cover all these bases, we might imagine a core hack or two, an implementation of hook_filter, custom_url_rewrite_outbound, and more. But I promised you a win in three minutes, so...

    • Grab the low-hanging fruit (CSS files)
    • Abuse the theme layer!
        /* This goes in your theme's template.php, and will modify the $styles
         * variable that gets passed to your page template.  CSS background
         * images will also be pulled from your CDN.
         */
        function mytheme_preprocess_page(&$vars, $hook) {
          if ($url = variable_get('cdn_url', '')) {
            $vars['styles'] = str_replace('href="/', 'href="' . $url, $vars['styles']);
          }
        }
         
        /* This goes in your server's settings.php,
         * and assumes you have a "mirror bucket" set up at this URL
         */
        $conf = array(
          'cdn_url' => 'http://mirror-e7.your-cdn-provider.net/',
        );
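
    The str_replace() call above assumes that the site lives at the web root and that cdn_url ends in a slash. A slightly more defensive sketch follows, assuming a hypothetical helper named mytheme_cdn_rewrite_styles() (it is not part of Drupal) and an example.net CDN host that is only a placeholder. It normalizes the trailing slash and leaves protocol-relative links such as href="//..." alone:

```php
<?php
/**
 * Hypothetical helper: points root-relative stylesheet links at a CDN.
 *
 * @param $styles
 *   The HTML string of <link> tags, like the $styles page variable.
 * @param $cdn_url
 *   The CDN base URL, with or without a trailing slash.
 * @return
 *   The rewritten HTML, or the original string if no CDN URL is set.
 */
function mytheme_cdn_rewrite_styles($styles, $cdn_url) {
  if ($cdn_url === '') {
    return $styles;
  }
  // Make sure the base URL ends in exactly one slash.
  $base = rtrim($cdn_url, '/') . '/';
  // Match href="/path" but not href="//host/path" (protocol-relative).
  return preg_replace('#href="/(?!/)#', 'href="' . $base, $styles);
}
```

    In the preprocess function, this could stand in for the str_replace() line: $vars['styles'] = mytheme_cdn_rewrite_styles($vars['styles'], variable_get('cdn_url', ''));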

  • Dylan Tack

    OpenSourcery Alumnus