Aller directement à la fin des métadonnées
Aller au début des métadonnées

Chapter 19 of the CiviCRM book has an excellent overview of the import process.

This page has some additional details and specific ideas for dealing with bugs and features of import in the current CiviCRM version.

Importing Contacts in CiviCRM 4.2 - notes and comments

  • Contact subtypes (ie, Media Contact subtype of Organization), are CASE SENSITIVE.  If subtype doesn’t match an existing subtype, the import of that record fails.
  • Best to make a group AND a tag for each import step, plus a group for the entire import. It is easy to delete any unneeded groups or tags later.
  • Including groups/tags helps in an important way to work around some bugs in 4.2: Because the CiviCRM import stats report records as updated when they are actually skipped, the group/tag update is the only way to tell how many contacts have actually been updated or added by the import.
  • Deceased contacts:  The boolean field in CiviCRM 4.2 is called "Is deceased".  There is a "Deceased Date" field also.  If you import to the Deceased Date field without also setting the "Is Deceased" field your contacts won't show as deceased, so set both. 
  • If your CiviCRM installation is on a shared host, you are going to be limited as to how many contacts you can upload at once.  On CiviHosting.com, 400 contacts at once is possible, or 1000 memberships or relationships.  If your files are larger, just split them into several files.  On a dedicated, powerful server one user was able to import 16,000 contacts at once, but the import took six hours.
  • Does your data have 'deleted' or 'inactive' records?  You might carefully consider whether or not to import these into CiviCRM.  Perhaps you will import some of them but not all. 
    • For instance, one company keeps inactive records for people who have asked for no further contact.  You will want to import these with the appropriate no contact settings.
    • The same company keeps marks 'inactive' records that were accidentally created, incomplete, or merged with another contact record (duplicates).  It is these that may cause problems in your CiviCRM import–if your import dedupe setting causes the inactive record to match with the similar active record it can cause confusion and problems.  These are the type of record you might consider excluding from your CiviCRM import entirely.

Importing with Custom Fields

These comments may apply to custom fields whether imported as part of a contact, membership, or contribution.  CiviCRM uses different import routines for each type of import, so some issues may be restricted to one type of import only; others may apply to all type of data import.  The issues below were particularly noted when importing contacts:

  • Custom Fields with select, multi-select, or ADVMulti-select, are CASE SENSITIVE for import
    • However, if you have case mis-matches the import for that record will fail silently in CiviCRM 4.2.  So you'll have to double-check exact matches manually.
  • Best to import Custom Data with ‘value’ rather than ‘label’, particularly so if the label is long & complicated (partially because the the bug mentioned above, the import will sometimes fail silently if there is a mismatch).
    • Particularly: In CiviCRM 4.2, Custom Data fields which include a comma or dash don't match even if they appear to match (ie, copy/pasted from same source).  Some other 'problem characters' may have this problem as well-- ' " ; % and the like.  Fields including these characters will be silently skipped on import, even if the fields actually match character for character.  The rest of the record will be imported, only the non-matching field will be skipped. Bug report for this issue: CRM-10839
  • Custom Data fields that are dates or boolean (for instance, yes/no fields) cannot be empty.  In CiviCRM 4.2, records with empty date or boolean fields will be silently skipped–the entire record will be skipped, not just the problem field.  Solution: Fill out boolean fields with 0 in any empty fields.  For dates, fill out empty fields with 1/1/1900 or some other nonsense date.  Another possible solution is separate imports for contacts with data in those fields.  For the contacts that have empty fields, just omit that field entirely for those contact imports.
  • In CiviCRM 4.2, a bug prevents import of address custom fieldsCRM-10822

Dedupe rules and external keys on import

These comments may apply to custom fields whether imported as part of a contact, membership, or contribution:

  • Dedupe vs External ID:  Which wins? In 4.2, if both external ID AND dedupe rule selected, the External ID overrules the dedupe rule match.  If the External ID doesn’t match and the dedupe rule does, there will be no match and the data will not be imported for that record.  Beware: On the final import screen it will show those records as being imported in the final tally even though they are not.
  • If no dedupe rule is chosen, CiviCRM seems to use the default dedupe rule for that contact type.
  • In some cases, rather than using External ID, cases it might be better to import the ID key from the external database to a specifically created Custom Data Field.  Then define a dedupe rule to control dedupe detection using this new Custom Data Field however you like.  In fact, you could create several custom data fields to hold the external DB keys for data from several different external databases if necessary.  This would have most of the advantages of the External ID without the disadvantages.  (The one disadvantage of this method is that it would be possible to create two different records with the same external database ID, which is not possible to with with CiviCRM’s External ID.  However it would be very easy to detect and merge these duplicates using dedupe rules if that situation occurs.) You may lose some ease of import of memberships, relationships, and contributions via external id, but you may be able to create and use dedupe rules that will give the same end result.
  • External IDs are helpful but they also cause problems. Extraneous matches by External ID are a potential problem.  If you don't need existing External IDs in your contact database, consider deleting all external ids from the SQL DB except the ones you know will match or are trying to match.

Memberships Import in CiviCRM 4.2

  • Max memberships import on CiviHosting is about 1000 at a time.  If larger than that, the import must be split into smaller files.
  • If a too-large memberships import fails, it will most probably have imported many of those memberships.  Check before proceeding to re-import–you may end up with many double memberships.
  • You generally cannot import memberships with start date in the future because that doesn't match any membership status that is defined.  Of course you could define a membership status ("advance payment" or the like) to match this possibility and then proceed.
  • Leaving out the membership status and just allowing the system to calculate it from the given start and end dates appears to be the best procedure.  If you specify a status and it conflicts with the membership status rules then that membership is simply not imported.
  • When memberships and contributions are both imported, unfortunately they are not linked together as they normally are when a membership payment is made in CiviCRM.  This forum post has a small PHP program that can be run to re-connect the memberships with their associated payment. This can be important for instance if you want the membership renewal link on the users profile page to take them to the contribution/renewal page.

Contributions Import in CiviCRM 4.2

  • The main thing you need to know about importing Contributions in 4.2 and 4.2.1 is there is a major bug and Contribution Import was broken.  Import appears to work but results in internal database problems. So don't import any contributions into CiviCRM 4.2 or 4.2.1. If you upgrade to 4.2.2 or later you can import contributions again.
  • Contribution import in versions prior to 4.1 or earlier works fine and 4.2.2 and later works fine.
  • You'll need to make sure your previous DB's Contribution Types (which should be one field of your .csv import file) match up with the Types you have set up in CiviContribute.  
  • Before import, double check that all External IDs will match actual CCRM contact External IDs.  If you are editing/merging/consolidating contacts as you proceed through your import, you may find that some External IDs no longer match with contacts.  How to check all the External IDs in your import file:
    • In CiviCRM, find all contacts with a non-empty external id
    • Export to a CSV file
    • Opening with Excel
    • Use Excel vlookup function to match the External IDs in your import file with those in the CSV download file you just downloaded from CiviCRM search
    • If any External IDs don't match, you'll need to individually find/search/correct, etc. in your import file before importing.  These 'missing external ids' are likely contacts that have been merged, deleted, etc., between the time you imported your contact list and the time you imported this contribution list.  So if you find the merged/updated/deleted contact, find its External ID (or create one) and update the External Id in your contribution import file to match.
  • Back up the entire CIviCRM SQL database before starting contributions import and at each sub-step.
  • Remember that contributions from individuals & organizations must be separated & uploaded separately. 
  • Field mappings: How do you know which field in your data matches with which field in CiviCRM?  Look at these resources to help:
    • Contribution entry screens
    • Contribution view and edit screens
    • The civicrm_contribution SQL table
  • Suggestion: Add a source code in the Source field, for example, “My Org PreviousDBSoftware DB import 2012-08-11”, so that this batch of imports can be identified and also traced back to their original source in database of your previous software.
  • Suggestion: Create a different source code for each separate file you upload.  For instance, if you break your import into three import files, use source codes  “My Org PreviousDBSoftware DB import 2012-08-11-01", “My Org PreviousDBSoftware DB import 2012-08-11-02", “My Org PreviousDBSoftware DB import 2012-08-11-03".  This way you can easily search for all records in the import OR only the records imported from a particular file.
  • Suggestion:  After reach import step, search on the source code for that import file under Contributions/Find Contributions.  The Search Results screen shows the number of contributions and the Total Amount–which you can double check against the similar numbers you calculate from your import file.
  • If you don't have specific data in your transaction ID field, you might put the record ID from your previous DB here.  This ties the new CiviCRM contribution back to the specific record ID of the transaction in your previous software.
  • The field “Amount Label” corresponds to SQL DB field amount_level which allows you to include detail about the transaction more detailed the Contribution Type.  For instance, CiviCRM lists all line items here.
  • Suggestion: First test your contribution import with a VERY small file.  Try just one entry for each of the Contribution Types you have.  If that is successful, move on to a slightly larger import (50-100 entries) and then move forward with a larger import.  Removing hundreds to thousands of incorrectly imported contributions is no fun.  You'll save more time by doing very small, then moderate sized imports to test the process.
  • How many contributions can you import at once?  You may have to determine this by trial and error.  On one shared hosting service, about 800 records was the limit.  The good news is that if a file import fails, you can simply re-import the same file again.  The importer will skip the previously imported records quickly and proceed to import the previously skipped records.  The disadvantage of this is that each previously imported record will generate an error and since there is a limit of 250 records in the error file you will miss any errors within the new imports, if you have more than 250 previously imported records in your file.  So a better procedure would be to edit the file to remove those previously imported records before importing again. 
  • Suggestion: If external id and/or dedupe matching somehow and contributions are matched with the wrong contact, it will be almost impossible to detect.  Consider adding something like Contact Name, Contact ID, or a similar identifier to the beginning of the Notes field for each contribution. That way if you're browsing contributions later, you'll immediately spot the mismatch.
  • Net Amount, and Non-deductible amount: This will NOT be auto-calculated from the Total Amount, so if you wish these fields to be populated, calculate them as separate fields and include them in your CSV upload file.
  • Import contributions will not accept $0 contributions (normally CiviCRM allows and records $0 contributions–so this is a bit of a discrepancy). If you need to import $0 contributions the fix is pretty easy and described here: http://issues.civicrm.org/jira/browse/CRM-10994
  • In CiviCRM 4.2.3, certain errors are stopping execution.  If you see an error message, rather than the normal landing page, and find that only a portion of your records were imported, there is a fairly simple fix: http://issues.civicrm.org/jira/browse/CRM-10996  The good news is that you can just reimport any files again, after patching the code or removing/fixing the offending entry in your import file.  The Contribution Import will simply skip any records that have already been imported.
  • As mentioned above, when memberships and contributions are both imported, unfortunately they are not linked together as they normally are when a membership payment is made in CiviCRM.  This forum post has a small PHP program that can be run to re-connect the memberships with their associated payment. 
  • More detailed notes from a large contribution import are found in this forum post.

Using 'Master Address Belongs To' Field

The Master Address Belongs To field is unlike all the other fields in the import dialogs.  Similar to the Relationship fields, this field allows you to associate a contact with a specific address.  For instance, to link the home address of individuals to the household address or to link the work address of individuals to their company.  However, this field can only import the civicrm_address.id field.  Unfortunately, this field is not accessible in CiviCRM directly via reports, exports or in any other mechanism.  Currently (4.5.3), this information can only be pulled from the raw database tables directly.

A Step-by-Step Method to pull the data to use 'Master Address Belongs To' field for shared addresses

This describe a mainly GUI approach to making the 'Master Address Belongs to' field useful.  You will need access to the MySQL Datatables through phpmyadmin as described below or use another tool to pull the data in a similar fashion.

Of course, before importing anything, clean up your data as much as possible.  Assuming none of the needed data is currently in the system, we will be importing data in two parts and you will need two CSV files (the primary contacts, and the secondary contacts).  The first file should include the basic information for the primary contacts (the entity that will hold the master address location, typically household or organization, but sometimes individuals).  This file should include all the relevant address information for the primary contact.

In the second CSV file, the secondary contacts information should be listed along with an identification field (e.g. a household or company name) for the primary contact but not include the address information.  Each entry in the individuals file should include any other fields not shared with the primary.

Use the import wizard to create the new contact entries from the primary contacts csv file.  This will also create the address.id’s needed within CiviCRM.  Once all of the primary contacts are imported, the address.id information must be pulled from the raw data tables.  You’ll need to use phpmyadmin or a similar route to grab the data.

Assuming you use phpmyadmin, select your civicrm database, click the SQL tab and run:

SELECT civicrm_contact.display_name, civicrm_address.id, civicrm_address.street_address, civicrm_address.city, civicrm_address.postal_code
FROM civicrm_contact, civicrm_address 
WHERE civicrm_contact.contact_type = "Household" and civicrm_address.contact_id = civicrm_contact.id

(Note: The display_name and id in the select clause are required, the others are optional - see the note below.)

Run this query and use the Export link to download the file listing all the households with their name and the address id as a csv.  (Note: This is all that is needed for contacts with just one address.  If you have multiple addresses per contact then you probably need additional Where clauses to select the right address.)

Now you have a csv file with the necessary data include the relevant civicrm_address.id in the secondary contact csv file. To combine these into one file for eventual import, one option is to use the vlookup spreadsheet function.  To do so, arrange your data in one spreadsheet workbook with two worksheets: one with the secondary contact information, the other with the SQL Export.  Add a column on the secondary contact with a heading of Address.ID on cell 1 and formula something like =VLOOKUP($A2,Sheet2!$A2:$E200,2,0) in subsequent cells for each row.  (This assumes column A is the primary contact identification column (household or company name) you included and your mapping data is A2:E200.  The 2 is the index id – this indicates that the address.id is in the 2nd column of the SQL Export sheet. The result should be the relevant address id.)

Now insert additional columns and change the index number to populate the street address(3), city(4) and postal code(5).  This is not required, however when a shared address is imported with just the address.id, CiviCRM shows that the individual shares an address with the household but does not show the actual address as expected until the first click through or update to the master address page.  This provides a work around by allowing an import of the address fields as well as the shared address id - hence the street_address, city, postal_code in the SQL query and the individual import file.

Check for any lines where vlookup has failed.  Correct or remove them otherwise the import will fail.  Then save it as a CSV.

Use the CiviCRM import wizard and map the Address.id column to “Master Address belongs to” and select an appropriate location type.  Map all the other address fields as usual & import.

Once the import has finished, you should find the contacts location information is correctly mapped to the primary address.  CiviCRM will also attempt to create a relationship when you link addresses this way.  For instance, it will automatically create a ‘Member of Household’ relationship if importing to ‘Home’ address.

 

Étiquette
  • Aucun
  1. Oct 08, 2012

    Hey Brent, thanks for these additions to the import process - great work. let us know if you want to get more involved in helping out with documentation. you would be more than welcome to merge relevant insights here with the import chapter / improve it in other ways if you are up for it.