
Documentation for this functionality can be found at Managing Scheduled Jobs

Some remarks from conversations/discussions about this spec that already happened:

  • don't use Cron Expression; just provide the ability to configure a job to run once a day or every time the "base job" invoking other cron jobs is run
  • cron jobs must be runnable through both web and CLI
  • we need to address the timeouts issue somehow
  • administrators should be able to go to a page and see which cron jobs ran, when, and what the output was (if any) - included
  • each cron job should implement a simple interface (or rather extend an abstract class) containing at least 4 methods:
    • run - used to kick-start the job; any "run piece by piece" logic is implemented here if needed. This method is CLI/web agnostic and gets all its parameters from the base class / environment
    • logStart - writes a log entry that the cron job started at a given time (can be implemented in the abstract class)
    • logEnd - writes a log entry that the cron job finished at a given time, plus additional information about success/failure (can be implemented in the abstract class)
    • logEntry - writes a log entry during cron execution
  • integrate with Drupal's hook_cron (already included)
  • remove the need for username and password in web script calls; introduce a setting for default contact (id)
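
The four-method interface from the remarks above could be sketched roughly as follows. This is only an illustration in Python, not CiviCRM code: the class names, the in-memory log list and the execute() wrapper are assumptions, and the method names are Python-style equivalents of run, logStart, logEnd and logEntry.

```python
from abc import ABC, abstractmethod
from datetime import datetime, timezone


class ScheduledJob(ABC):
    """Hypothetical base class for a job; not an existing CiviCRM class."""

    def __init__(self, name):
        self.name = name
        self.log = []  # in-memory stand-in for a persistent job log

    def log_entry(self, message):
        # logEntry: record a message while the job is executing
        self.log.append((datetime.now(timezone.utc), self.name, message))

    def log_start(self):
        # logStart: implemented once in the base class
        self.log_entry("started")

    def log_end(self, success, details=""):
        # logEnd: record the finish plus success/failure information
        status = "finished: success" if success else "finished: failure"
        self.log_entry(f"{status} {details}".strip())

    @abstractmethod
    def run(self):
        """Do the actual work; return True on success."""

    def execute(self):
        # Assumed wrapper: the same entry point for web and CLI callers
        self.log_start()
        try:
            ok = bool(self.run())
            self.log_end(ok)
        except Exception as exc:
            ok = False
            self.log_end(False, str(exc))
        return ok


class ExampleJob(ScheduledJob):
    """Hypothetical concrete job used only to exercise the base class."""

    def run(self):
        self.log_entry("processed 0 records")
        return True
```

With this shape, a concrete job only supplies run(); start and end logging come from the base class, which matches the remark that those two methods can be implemented in the abstract class.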

This page refers to Consolidated CiviCRM Cron MIH and describes the concept of implementation of this feature.

Basic concept

CiviCRM will have a single cron script/URL (depending on user preference), which will need to be configured in the "traditional way": by adding a line to the server's crontab. The recommended configuration is to run this cron job every minute for best accuracy, or at least every hour for acceptable accuracy.

This script/URL will trigger a check of the Scheduled Jobs list configured in CiviCRM's administration panel and stored in the CiviCRM database. It will execute those jobs based on their individual "cron-like" configuration. All Scheduled Job runs are logged, both their start and their end. To ease the transition from the current way of handling things to the new one, in this phase a Scheduled Job is equivalent to an existing script (e.g. UpdateMembershipRecord.php) and is executed by "emulating" the URL call method.

The name "Scheduled Job" is used to avoid confusion with the cron job (which is required to run Scheduled Jobs).
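
As a rough illustration of the check described above, the sketch below decides which jobs are due at a given moment and logs the start and end of each run. It is written in Python with a deliberately simplified, hand-rolled matcher that stands in for a real cron expression library: it handles only "*", single numbers, ranges, comma lists and "/n" steps, and it requires all five fields to match. The function names and the (name, cron string, callable) job tuple are assumptions made for the example.

```python
from datetime import datetime

# Lowest legal value per field: minute, hour, day of month, month, day of week
FIELD_MINIMUMS = (0, 0, 1, 1, 0)


def field_matches(spec, value, minimum):
    """Match one cron field against a value (simplified syntax only)."""
    for part in spec.split(","):
        base, _, step = part.partition("/")
        step = int(step) if step else 1
        if base == "*":
            if (value - minimum) % step == 0:
                return True
        elif "-" in base:
            low, high = (int(x) for x in base.split("-"))
            if low <= value <= high and (value - low) % step == 0:
                return True
        elif int(base) == value:
            return True
    return False


def is_due(cron_string, now):
    """Return True if the five-field cron string matches the minute of `now`."""
    fields = cron_string.split()
    if len(fields) != 5:
        raise ValueError("expected five cron fields")
    # Python counts Monday as 0; cron counts Sunday as 0
    day_of_week = (now.weekday() + 1) % 7
    values = (now.minute, now.hour, now.day, now.month, day_of_week)
    return all(
        field_matches(spec, value, minimum)
        for spec, value, minimum in zip(fields, values, FIELD_MINIMUMS)
    )


def run_due_jobs(jobs, now, log):
    """Run each (name, cron_string, action) job that is due; log start and end."""
    for name, cron_string, action in jobs:
        if not is_due(cron_string, now):
            continue  # skipped: not due at this moment
        log.append((name, "start"))
        try:
            action()
            log.append((name, "end: success"))
        except Exception as exc:
            log.append((name, f"end: failure ({exc})"))
```

Because the match is made against the current minute, a job set to "5 * * * *" is only picked up if the main cron job actually fires at minute 5, which is why running the main job every minute gives the best accuracy.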

Permissions

"Administer CiviCRM" permission is required to manage Scheduled Jobs.

Schema changes

Two tables will be added: civicrm_job, to store information about scheduled jobs, and civicrm_job_log, to save job run information.

User interface

Administer -> Manage -> Scheduled Jobs (admin/job)

Default view

This screen presents the list of configured Scheduled Jobs and allows editing, enabling/disabling, and deleting them. It also contains a link to the Add Job screen.

Add action

Each scheduled job has the following properties, which can be configured by an administrator:

  • Name - required, unique name of the Job
  • Description - optional description
  • Cron String - required, cron string stating the frequency of runs (e.g. "5 * * * *" )
  • Script - required, file name of the script. The URL is constructed from this name; the assumed directory is $civicrm_root/bin and the assumed base URL is the "CiviCRM Resource URL" from global settings.
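
The URL construction for the Script property might look something like the sketch below. It assumes that the Resource URL points at the web-visible copy of $civicrm_root, so that the script sits under its bin/ directory; the function name and the rejection of path separators are additions for the example, not part of the spec.

```python
from urllib.parse import quote


def job_url(resource_url, script):
    """Build the URL for a script in <resource URL>/bin (hypothetical helper)."""
    # Only a bare file name is accepted, so the URL cannot leave bin/
    if not script or "/" in script or "\\" in script or script.startswith("."):
        raise ValueError("script must be a plain file name")
    return resource_url.rstrip("/") + "/bin/" + quote(script)
```

For example, a Resource URL of https://example.org/civicrm/ (a placeholder) combined with UpdateMembershipRecord.php yields https://example.org/civicrm/bin/UpdateMembershipRecord.php.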

Administer -> Manage -> Scheduled Jobs Log (admin/joblog)

This screen presents the list of entries from the civicrm_job_log table. The only action available on this screen is purging the log (resulting in a DELETE from civicrm_job_log).

Other functionality

  • civicrm_job_log should be automatically purged (DELETE FROM civicrm_job_log WHERE run_datetime < 7 days ago) on every scheduled jobs run.
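
The automatic purge could be sketched as below. The example uses Python's sqlite3 module purely as a stand-in for the CiviCRM database, and it assumes run_datetime is stored as a "YYYY-MM-DD HH:MM:SS" string so that string comparison follows time order. The function name and the keep_days parameter are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_job_log(conn, now, keep_days=7):
    """Delete log rows older than keep_days; return the number of rows removed."""
    cutoff = (now - timedelta(days=keep_days)).strftime("%Y-%m-%d %H:%M:%S")
    cursor = conn.execute(
        "DELETE FROM civicrm_job_log WHERE run_datetime < ?", (cutoff,)
    )
    conn.commit()
    return cursor.rowcount
```

Calling this at the start of every scheduled jobs run keeps the log bounded without a separate maintenance job. On MySQL the same condition can be written directly in SQL as run_datetime < NOW() - INTERVAL 7 DAY.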

Implementation notes

  • In order to parse and interpret the Cron String property of a Scheduled Job, the Cron Expression library is going to be used. It parses cron expressions and checks whether a given expression makes a job due to run at the current moment. Every time the "main cron job" is run, all Scheduled Jobs are checked to see whether they are due, and are run or skipped accordingly. NOTE: exact accuracy of the cron expressions defined for Scheduled Jobs can be achieved only if the "main cron job" is set to run every minute (* * * * *); this should therefore be the recommended setting.
  • CRM_Utils_System::authenticateScript will need to be modified to provide authentication based on a short-lived hash string that is generated for each cron script before it is run and stored in civicrm_job. The hash string is passed as a parameter in the URL, and authenticateScript checks whether the parameter matches the contents of the database before it authenticates the script.
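
The short-lived hash idea in the last note can be illustrated with the following sketch. It is not the authenticateScript implementation: the dictionary stands in for the civicrm_job table, and the five-minute lifetime, the single-use behaviour and the function names are all assumptions chosen for the example.

```python
import hmac
import secrets
import time

TOKEN_LIFETIME_SECONDS = 300  # assumed lifetime; the spec only says "short"


def issue_token(store, job_name, now=None):
    """Generate a random token for a job just before its script is called."""
    now = time.time() if now is None else now
    token = secrets.token_hex(16)
    store[job_name] = (token, now + TOKEN_LIFETIME_SECONDS)
    return token


def authenticate(store, job_name, presented, now=None):
    """Check the token passed in the URL against the stored one."""
    now = time.time() if now is None else now
    # The stored token is consumed on the first attempt (assumed single use)
    entry = store.pop(job_name, None)
    if entry is None:
        return False
    token, expires_at = entry
    if now > expires_at:
        return False
    # Constant-time comparison avoids leaking how many characters matched
    return hmac.compare_digest(token.encode(), presented.encode())
```

The main cron job would call issue_token() for a due job, append the token to the script URL, and the script would pass it to authenticate() before doing any work. No username or password appears in the URL.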

Second phase - potential tasks and improvements

  • implementing the Drupal cron hook, so it's not necessary to configure a separate cron job just for CiviCRM
  • implementing a user-friendly interface to replace the Cron String
  • designing the CiviCRM Scheduled Job API - each script (to be specific, PHP class) will need to have specific methods, e.g. it will have to implement a class->run() method, a class->resultLog() method for storing the run log message (success/failure), etc.
  • reworking existing scripts to follow the above "API"
  • adding an alternative job call method - a "callback" pointing to a specific PHP class
  1. Jul 23, 2011

    Since we are controlling the environment and have defined all the possible individual scripts that could be run, it seems like it should be pretty easy to structure this such that the user does not need to manually add the cron statement. i.e. -- we should be able to create a new job and select from a drop down list the desired script to run. I'm envisioning the user creates a job, selects the script to run, adds parameters to pass (via a text box -- but eventually it would be great if those are presented as user friendly options), and sets the processing frequency.

    Also -- if I read correctly, we are planning to run the scripts using the URL method (wget). It would *really* be great if they could be processed as shell scripts. For some of the scripts, the URL method adds significant overhead and performance issues (UpdateMembershipRecord.php in particular).

    1. Aug 26, 2011

      I agree about running it via cli. I think preserving the ability to kick off the parent cron job via wget is important, in case people need to run the cron job from a remote server. However, it seems like we should be encouraging everyone to run the script from the CLI as the first choice. In addition to better performance, it is more secure (many people seem to be sending their username and password in clear text via GET).

  2. Aug 26, 2011

    Lobo refactored the migrate/import script (http://issues.civicrm.org/jira/browse/CRM-8718?page=com.atlassian.jira.ext.fisheye%3Afisheye-issuepanel) in a way that helps us toward this transition.

    I think the method Lobo used could be a model for the other scripts and first step toward completing this project.

    In short - he moved the class definition contained in the migrate/import and migrate/export scripts into the CRM directory so it's bootstrapped with the rest of the code. The only code left in the import and export scripts is the code needed to call and execute the class.

    Therefore, the script is fully backward compatible with people who are calling the script via wget and CiviCRM or drush can easily run the same code without needing to exec the script or pass any username/password values before including it.

    Perhaps re-factoring all the scripts in this manner should be the first step in the process?

    jamie

  3. Aug 26, 2011

    And, one more question: do any of the scheduled jobs require a UF username? If not, I'd like us to remove all authentication when running from the CLI. If you can run a script from the CLI that can properly read the civicrm.settings.php file, you already have full access to the database, so requiring the user to put their username and password and site key into a cron job seems both tedious and less secure.

    If any of the scripts do need a username (for logging perhaps?), then (as discussed over lunch yesterday), perhaps we could have a default user - e.g. for Drupal it could be the user with UID 1. At a later stage we could perhaps have an option to map a unix user to a UF user.

    jamie

  4. Jul 06, 2013

    We've been having problems with 'daily' jobs that lock tables and interfere with other operations running during the day.  In the short term, our solution will probably be to run these as separate system crons, but it would be good to fix this up within CiviCRM.

    It would be good to see an ability to specify within CiviCRM what time of day the daily cron jobs should start running at, with CiviCRM serialising these jobs starting from the next cron job to run after that time.

    We have quite a few different domains, with their own crons running.  If civi could serialise daily jobs in a way that coordinates between domains well, that would be  very nice.  Failing that, it's important that the setting for what time the daily jobs should start from can be set separately for each domain.

    In some ways, it might be best to stick to running the daily jobs with a separate system cron calling CiviCRM for a given domain, telling it to run the daily jobs only.  That means that the administration of what jobs run on what schedule is in the hands of the civicrm administrator, but the system administrator is able to coordinate the timing of the heavier processing tasks with other system tasks (which the civicrm administrator is likely not to have much awareness of or visibility into).

  5. Jul 06, 2013

    Regarding authentication, on a typical linux system, anyone who has shell access can see the command line options (and environment variables) of every command running on the system, so if they can run a cli command, they can most likely also see the password that's already being used to run cron.


Creative Commons License
Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-Share Alike 3.0 United States Licence.