Project Description
This project syncs one or more child databases to a master database. The databases may be in different geographic locations. In the first phase we will consider only the core contact and related objects and run the sync via a regular cron job. In later phases we will add hook support to sync in real time.
We will use a JSON format identical to the one we intend to use for SOLR. There will be two modes of syncing:
- Incremental sync
- This will incorporate only the changes made since the last sync. The cron script will remember when the last sync ran
- In 4.2 and later we will use the modified_date in the contact record. In earlier versions we will use the change log modified date
- Complete sync
- This will be a migration of all records from the child to the master
- This will delete all child records from the master and will repopulate the master with new data
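The incremental mode can be sketched as follows. This is only an illustration under stated assumptions, not the planned implementation: it uses Python with an SQLite stand-in for the child database, a simplified hypothetical `contact` table, a hypothetical state file in which the cron script remembers the last sync time, and a placeholder JSON layout (the real layout would follow the SOLR format).

```python
import json
import sqlite3
from pathlib import Path

# Used when no previous sync has been recorded (assumption: first run exports everything).
EPOCH = "1970-01-01 00:00:00"


def read_last_sync(state_file: Path) -> str:
    """Return the timestamp of the previous sync, or EPOCH if none is recorded."""
    return state_file.read_text().strip() if state_file.exists() else EPOCH


def export_incremental(conn: sqlite3.Connection, state_file: Path, now: str) -> str:
    """Export contacts modified after the last sync and up to `now` as JSON.

    Timestamps are 'YYYY-MM-DD HH:MM:SS' strings, which sort chronologically
    when compared as text.
    """
    last_sync = read_last_sync(state_file)
    rows = conn.execute(
        "SELECT id, display_name, modified_date FROM contact "
        "WHERE modified_date > ? AND modified_date <= ? ORDER BY id",
        (last_sync, now),
    ).fetchall()
    # Remember this run so the next one only picks up later changes.
    state_file.write_text(now)
    return json.dumps(
        [{"id": r[0], "display_name": r[1], "modified_date": r[2]} for r in rows]
    )
```

Bounding the query at `now` and storing that same value means a record modified while the export is running is picked up by the next run instead of being skipped.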
Hours
- We will spend a total of 50 hours on this project
- 5 hours specification
- 30 hours design and development
- 10 hours unit tests
- 5 hours documentation
Other research
Assumptions and Requirements
- Each child database will have a distinct data set for contact data in the master database. Thus if contact A appears in multiple child databases, it will also appear multiple times in the master database.
- The CiviCRM install on the master database is not aware of the multiple child databases. It will see one large unified set of contacts, with potentially some duplicates.
- This is a one way child -> master sync only. The master database is read only. Any changes made to the master database are liable to get lost in a sync operation.
- It is the responsibility of the administrator to set up the cron jobs and transfer the files between the master and the various child machines.
- The meta data across all child databases is exactly the same. Thus all location types, website types, activity types, relationship types, tags, option groups and values are the same.
Phase 1 - Core Data
- Contacts
- Address
- Phone
- IM
- Website
- Note
- Relationship
- Group
- Group Nesting (is this still used?)
- Group Organization
- Tag
- Saved Search
- Activity
Process:
- A regular cron job generates an incremental sync file, in JSON SOLR format, of the objects in phase 1. We will create a framework to make adding additional objects relatively easy (similar to adding additional objects in our views integration)
- The master database reads this sync file along with the "child prefix", adds new objects and updates existing objects
- To keep track of new objects vs existing objects, the master database will have one new table for each child instance:
- mapping_master_CHILDPREFIX
- entity_table - varchar(255)
- child_entity_id - int
- master_entity_id - int
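A minimal sketch of how the master side might use the mapping table, again in Python with SQLite standing in for the master database. The simplified `contact` table, the function names, the alphanumeric restriction on the child prefix and the primary key on (entity_table, child_entity_id) are all assumptions made for illustration. Only the table name pattern and its three columns come from the description above.

```python
import sqlite3


def ensure_mapping_table(conn: sqlite3.Connection, child_prefix: str) -> str:
    """Create mapping_master_<child_prefix> if needed and return its name."""
    # The prefix becomes part of a table name, so restrict it (assumption).
    if not child_prefix.isalnum():
        raise ValueError("child prefix must be alphanumeric")
    table = f"mapping_master_{child_prefix}"
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "entity_table VARCHAR(255) NOT NULL, "
        "child_entity_id INTEGER NOT NULL, "
        "master_entity_id INTEGER NOT NULL, "
        "PRIMARY KEY (entity_table, child_entity_id))"
    )
    return table


def upsert_contact(conn, child_prefix: str, child_id: int, display_name: str) -> int:
    """Update the mapped master contact, or insert a new one and record the mapping."""
    table = ensure_mapping_table(conn, child_prefix)
    row = conn.execute(
        f"SELECT master_entity_id FROM {table} "
        "WHERE entity_table = ? AND child_entity_id = ?",
        ("contact", child_id),
    ).fetchone()
    if row:  # existing object: update it in place
        conn.execute(
            "UPDATE contact SET display_name = ? WHERE id = ?", (display_name, row[0])
        )
        return row[0]
    # New object: insert it, then remember which master id it received.
    master_id = conn.execute(
        "INSERT INTO contact (display_name) VALUES (?)", (display_name,)
    ).lastrowid
    conn.execute(
        f"INSERT INTO {table} (entity_table, child_entity_id, master_entity_id) "
        "VALUES (?, ?, ?)",
        ("contact", child_id, master_id),
    )
    return master_id


def purge_child(conn, child_prefix: str) -> None:
    """Complete sync, step one: drop every master contact that came from this child."""
    table = ensure_mapping_table(conn, child_prefix)
    conn.execute(
        f"DELETE FROM contact WHERE id IN "
        f"(SELECT master_entity_id FROM {table} WHERE entity_table = 'contact')"
    )
    conn.execute(f"DELETE FROM {table}")
```

Because each child has its own mapping table, the same child id arriving from two different children produces two separate master records, which matches the assumption that contacts are not deduplicated across children.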
Example JSON Format:
Future Work:
Features
- Allow reverse sync, i.e. the ability to create a child database from the master database
- Allow real time syncing via hooks
- Allow meta data to vary between child databases
- Use the above format to implement SOLR search and autocomplete
- Use the JSON format to help implement Civi Offline
Phase 2 - Meta Data
- Option Groups / Values
- Custom Group, Fields and Value tables
- Custom File Tables
- Profiles
- ACL
Phase 3 - Other Objects
- Contributions
- Financial Trxn
- Pledge
- Soft Contributions
- Recurring Contributions
- PCP
- Contribution Page
- Event
- Price Sets and Fields
- Line Item
- Event Discount
- Participant
- Membership
- Mailing
- Case
- Campaign
- Survey
- Grant

1 Comment
May 15, 2012
Shawn Duncan
This looks very useful!