Documentation for this functionality can be found at Managing Scheduled Jobs.
Some remarks from conversations/discussions about this spec that already happened:
This page refers to the Consolidated CiviCRM Cron MIH and describes the implementation concept for this feature.
Basic concept
CiviCRM will have a single cron script/URL (depending on user preference), which needs to be configured in the "traditional way": by adding a line to the server's crontab. The recommended configuration is to run this cron job every minute for best accuracy, or at least every hour for acceptable accuracy.
This script/URL will trigger a check of the Scheduled Jobs list that is configured in CiviCRM's administration panel and stored in the CiviCRM database. It will execute those jobs based on their individual "cron-like" configuration. All Scheduled Job runs are logged, both their start and their end. To ease the transition from the current way of handling things to the new one, in this phase a Scheduled Job is equivalent to an existing script (e.g. UpdateMembershipRecord.php) and is executed by "emulating" the URL call method.
The name "Scheduled Job" is used to avoid confusion with the cron job (which is required to run Scheduled Jobs).
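As a rough illustration of the flow described above, the following Python sketch checks a list of jobs against the current time and records the start and end of each run. It is not CiviCRM code: `field_matches`, `is_due`, `run_due_jobs`, the dictionary keys and the `HypotheticalNightly.php` script name are all assumptions made for the example. The matcher is a deliberately small stand-in that understands only `*`, single numbers, ranges and comma lists, and it requires all five fields to match (standard cron treats the two day fields differently when both are restricted).

```python
from datetime import datetime


def field_matches(field, value):
    """Match one cron field: '*', 'n', 'a-b', or a comma list of those."""
    for part in field.split(","):
        if part == "*":
            return True
        low, _, high = part.partition("-")
        if int(low) <= value <= int(high or low):
            return True
    return False


def is_due(cron_string, now):
    """Return True if the five-field cron string matches the minute `now`."""
    minute, hour, dom, month, dow = cron_string.split()
    return (
        field_matches(minute, now.minute)
        and field_matches(hour, now.hour)
        and field_matches(dom, now.day)
        and field_matches(month, now.month)
        # cron counts Sunday as 0; isoweekday() counts it as 7
        and field_matches(dow, now.isoweekday() % 7)
    )


def run_due_jobs(jobs, now, run_job, log):
    """Run every job that is due at `now`, logging its start and end."""
    ran = []
    for job in jobs:
        if not is_due(job["cron_string"], now):
            continue
        log.append((job["name"], "start"))
        run_job(job)  # in this phase: would emulate the URL call
        log.append((job["name"], "end"))
        ran.append(job["name"])
    return ran


jobs = [
    {"name": "membership", "cron_string": "5 * * * *",
     "script": "UpdateMembershipRecord.php"},
    {"name": "nightly", "cron_string": "0 2 * * *",
     "script": "HypotheticalNightly.php"},  # hypothetical script name
]
log = []
ran = run_due_jobs(jobs, datetime(2011, 7, 23, 14, 5), lambda job: None, log)
print(ran)  # ['membership']
```

With this exact-minute matching, a main cron job that fires only on the hour would never land on minute 5, which is one way to see why an every-minute schedule gives the best accuracy.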
Permissions
The "Administer CiviCRM" permission is required to manage Scheduled Jobs.
Schema changes
Two tables will be added: civicrm_job, to store information about scheduled jobs, and civicrm_job_log, to save job run information.
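One possible shape for the two tables is sketched below, using Python's built-in `sqlite3` module only so that the example runs on its own (CiviCRM itself uses MySQL). Apart from the table names and `run_datetime`, every column name here is an assumption derived from the properties listed elsewhere on this page: `is_active`, `auth_hash`, `job_id` and `event` in particular are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE civicrm_job (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE,        -- required, unique job name
    description TEXT,                        -- optional
    cron_string TEXT NOT NULL,               -- e.g. '5 * * * *'
    script      TEXT NOT NULL,               -- file name of the script
    is_active   INTEGER NOT NULL DEFAULT 1,  -- assumed enable/disable flag
    auth_hash   TEXT                         -- assumed column for the hash
);
CREATE TABLE civicrm_job_log (
    id           INTEGER PRIMARY KEY,
    job_id       INTEGER NOT NULL REFERENCES civicrm_job(id),
    run_datetime TEXT NOT NULL,              -- when the entry was written
    event        TEXT NOT NULL               -- assumed: 'start' or 'end'
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO civicrm_job (name, cron_string, script) VALUES (?, ?, ?)",
    ("Update memberships", "5 * * * *", "UpdateMembershipRecord.php"),
)
conn.commit()
```

The `UNIQUE` constraint on `name` reflects the requirement that each job name be unique; a second insert with the same name would be rejected by the database.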
User interface
Administer -> Manage -> Scheduled Jobs (admin/job)
Default view
This screen presents the list of configured Scheduled Jobs and allows editing, enabling/disabling and deleting them. It also contains a link to the Add Job screen.
Add action
Each Scheduled Job has the following properties, which can be configured by an administrator:
- Name - required, unique name of the Job
- Description - optional description
- Cron String - required, cron string stating the frequency of runs (e.g. "5 * * * *")
- Script - required, file name of the script. The URL is constructed from this name: the assumed directory is $civicrm_root/bin and the assumed base URL is the "CiviCRM Resource URL" from global settings.
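The URL construction for the Script property might look roughly like the Python sketch below. The function name `job_url` and the example resource URL are assumptions, and the exact way the "CiviCRM Resource URL" and the bin directory are joined is inferred from the description rather than confirmed.

```python
from urllib.parse import quote


def job_url(resource_url, script):
    """Build <resource URL>/bin/<script> from a bare script file name."""
    # Reject anything that is not a plain file name, so a job cannot
    # point outside the bin directory.
    if not script or "/" in script or "\\" in script:
        raise ValueError("script must be a bare file name")
    return resource_url.rstrip("/") + "/bin/" + quote(script)


print(job_url("http://example.org/sites/all/modules/civicrm/",
              "UpdateMembershipRecord.php"))
# http://example.org/sites/all/modules/civicrm/bin/UpdateMembershipRecord.php
```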
Administer -> Manage -> Scheduled Jobs Log (admin/joblog)
This screen presents the list of entries from the civicrm_job_log table. The only action available on this screen is purging the log (resulting in a DELETE from civicrm_job_log).
Other functionality
- civicrm_job_log should be automatically purged (DELETE FROM civicrm_job_log WHERE run_datetime < 7 days ago) on every scheduled jobs run.
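The automatic purge could be sketched as follows. This is a minimal Python example against an in-memory SQLite table that stands in for civicrm_job_log; the function name `purge_job_log`, the `keep_days` parameter and the text timestamp format are assumptions, while the seven-day cutoff and the `run_datetime` column come from the description above.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_job_log(conn, now, keep_days=7):
    """Delete log rows older than `keep_days` days; return the row count."""
    cutoff = (now - timedelta(days=keep_days)).strftime("%Y-%m-%d %H:%M:%S")
    cur = conn.execute(
        "DELETE FROM civicrm_job_log WHERE run_datetime < ?", (cutoff,)
    )
    conn.commit()
    return cur.rowcount


# Stand-in table holding only the column the purge needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE civicrm_job_log (id INTEGER PRIMARY KEY, run_datetime TEXT)"
)
conn.executemany(
    "INSERT INTO civicrm_job_log (run_datetime) VALUES (?)",
    [("2011-07-10 08:00:00",), ("2011-07-22 08:00:00",)],
)
removed = purge_job_log(conn, now=datetime(2011, 7, 23, 12, 0))
print(removed)  # 1
```

Passing `now` in explicitly keeps the cutoff computation in one place and makes the purge easy to exercise with fixed dates.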
Implementation notes
- To parse and interpret the Cron String property of a Scheduled Job, the Cron Expression library will be used. It parses cron expressions and checks whether a given expression makes a job due to run at the current moment in time. Every time the "main cron job" runs, each Scheduled Job is checked for whether it is due, and is run or skipped according to the result. NOTE: exact accuracy of the cron expressions defined for Scheduled Jobs can be achieved only if the "main cron job" is set to run every minute (* * * * *), so this should be the recommended setting.
- CRM_Utils_System::authenticateScript will need to be modified to provide authentication based on a short-lived hash string that is generated for each cron script before it is run and stored in civicrm_job. The hash string is passed as a parameter in the URL, and authenticateScript checks that the parameter matches the value in the database before it authenticates the script.
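The short-lived hash check can be pictured with the Python sketch below. It does not show the real authenticateScript: the function names `issue_token` and `authenticate`, the five-minute lifetime, the single-use behaviour and the dictionary standing in for the civicrm_job column are all assumptions chosen for illustration.

```python
import hmac
import secrets
import time

TOKEN_TTL_SECONDS = 300  # assumed lifetime; the spec only says "short living"


def issue_token(store, job_id, now=None):
    """Generate a random token for a job and remember it with an expiry."""
    now = time.time() if now is None else now
    token = secrets.token_hex(16)
    store[job_id] = (token, now + TOKEN_TTL_SECONDS)
    return token


def authenticate(store, job_id, presented, now=None):
    """Accept the presented token once, and only before it expires."""
    now = time.time() if now is None else now
    entry = store.pop(job_id, None)  # assumed: a token is valid for one call
    if entry is None:
        return False
    token, expires_at = entry
    # Constant-time comparison avoids leaking how many characters matched.
    same = hmac.compare_digest(token.encode(), presented.encode())
    return same and now <= expires_at


store = {}  # stands in for the hash stored in civicrm_job
token = issue_token(store, job_id=1)
print(authenticate(store, 1, token))  # True
print(authenticate(store, 1, token))  # False: already used
```

Removing the token on first use is a design choice of this sketch; it means a URL captured from a log cannot be replayed later.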
Second phase - potential tasks and improvements
- implementing the Drupal cron hook, so that it is not necessary to configure a separate cron job just for CiviCRM
- implementing a user-friendly interface to replace the Cron String
- designing a CiviCRM Scheduled Job API: each script (to be specific, each PHP class) will need to provide specific methods, e.g. a class->run() method, a class->resultLog() method for storing the run log message (success/failure), etc.
- reworking existing scripts to follow the above "API"
- adding an alternative job call method: a "callback" pointing to a specific PHP class
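To make the proposed job API and the "callback" call method a little more concrete, here is a hedged sketch in Python rather than PHP. The method names mirror the run() and resultLog() methods mentioned above; `ScheduledJob`, `NoopJob` and `call_job` are hypothetical names, and nothing here reflects an agreed design.

```python
from abc import ABC, abstractmethod


class ScheduledJob(ABC):
    """Interface each job class would implement under the proposed API."""

    @abstractmethod
    def run(self):
        """Do the work; return True on success, False on failure."""

    @abstractmethod
    def result_log(self):
        """Return the message to store in the run log."""


class NoopJob(ScheduledJob):
    """Hypothetical job used only to exercise the interface."""

    def __init__(self):
        self._message = "not run"

    def run(self):
        self._message = "success: nothing to do"
        return True

    def result_log(self):
        return self._message


def call_job(job_class):
    """'Callback' call method: instantiate the class and run it directly."""
    job = job_class()
    ok = job.run()
    return ok, job.result_log()


print(call_job(NoopJob))  # (True, 'success: nothing to do')
```

Calling a class directly like this avoids the URL round trip, which is the overhead raised in the comments below.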

4 Comments
Jul 23, 2011
Brian Shaughnessy
Since we are controlling the environment and have defined all the possible individual scripts that could be run, it seems like it should be pretty easy to structure this such that the user does not need to manually add the cron statement. i.e. -- we should be able to create a new job and select from a drop down list the desired script to run. I'm envisioning the user creates a job, selects the script to run, adds parameters to pass (via a text box -- but eventually it would be great if those are presented as user friendly options), and sets the processing frequency.
Also -- if I read correctly, we are planning to run the scripts using the URL method (wget). It would *really* be great if they could be processed as shell scripts. For some of the scripts, the URL method adds significant overhead and performance issues (UpdateMembershipRecord.php in particular).
Aug 26, 2011
Jamie McClelland
I agree about running it via cli. I think preserving the ability to kick off the parent cron job via wget is important, in case people need to run the cron job from a remote server. However, it seems like we should be encouraging everyone to run the script from the CLI as the first choice. In addition to better performance, it is more secure (many people seem to be sending their username and password in clear text via GET).
Aug 26, 2011
Jamie McClelland
Lobo refactored the migrate/import script (http://issues.civicrm.org/jira/browse/CRM-8718?page=com.atlassian.jira.ext.fisheye%3Afisheye-issuepanel) in a way that helps us toward this transition.
I think the method Lobo used could be a model for the other scripts and first step toward completing this project.
In short - he moved the class definition contained in the migrate/import and migrate/export scripts into the CRM directory so it's bootstrapped with the rest of the code. The only code left in the import and export scripts is the code needed to call and execute the class.
Therefore, the script is fully backward compatible with people who are calling the script via wget and CiviCRM or drush can easily run the same code without needing to exec the script or pass any username/password values before including it.
Perhaps re-factoring all the scripts in this manner should be the first step in the process?
jamie
Aug 26, 2011
Jamie McClelland
And, one more question: do any of the scheduled jobs require a UF username? If not, I'd like us to remove all authentication when running from the CLI. If you can run a script from the CLI that can properly read the civicrm.settings.php file, you already have full access to the database, so requiring the user to put their username and password and site key into a cron job seems both tedious and less secure.
If any of the scripts do need a username (for logging perhaps?), then (as discussed over lunch yesterday), perhaps we could have a default user - e.g. for Drupal it could be the user with UID 1. At a later stage we could perhaps have an option to map a unix user to a UF user.
jamie