Data Migration Roadmap
Data Migration Roadmap
This document is a work in progress. It is meant to be a sketch of a roadmap or guide to data migration for Mifos. The goal of this document is to provide information on tools and techniques that may be useful for data migration projects and to compile information based on the experience of MFIs who have gone through the data migration process.
Background Work
Prior to considering data migration an MFI should have gone through the information concerning Mifos deployment planning. In particular, a gap analysis that includes a detailed comparison of the organization and features of Mifos to the existing system is important. This comparison should provide a good starting point for understanding the relationship between data in the existing system to data in Mifos.
What Data Can Be Migrated?
Mifos provides support for importing a subset of data used by the system.
The following data must be entered manually, prior to migration:
- Offices
- Roles
- System Users
- Fees
- Funds (if applicable)
- Product categories
- Products (ie Product Definitions)
The following data can be imported into Mifos:
- Active Centers
- Active Groups
- Active Clients
- Active Accounts: _ Client/Group/Center Accounts ("Client/Group/Center Charges" in the UI): Fee schedule, opening balance, amount overdue. _ Savings Accounts: Mandatory/Voluntary savings deposit schedule, opening balance, deposits overdue (if mandatory savings) * Loan Accounts (including full transactional history)
- Selected Historical Data for the above entities: _ Centers: Center Start Date _ Groups: Amount of Last Loan; Loan Cycle per Product; Group Approved Date _ Clients: Amount of Last loan; Loan Cycle Number; Loan Cycle per Product; Meetings Attended; Meetings Missed; Client Start Date _ Savings Accounts: Date Account Opened; Total Deposits; Total Withdrawals; Total interest earned; Missed Deposits * Loan Accounts: Full performance statistics will be available since the full transactional history is being migrated.
The Mifos Data Exchange XML (MDEX) Format
Data imported into Mifos is read from XML files that follow the Mifos Data Exchange XML (MDEX) format. This format is defined by an XML schema that is available in two basic formats: RELAX NG and W3C XML Schema. (The wikipedia pages just referenced for RELAX NG and W3C XML Schema both include links for learning about and understanding those schema languages).
The schema which defines the MDEX format can found in the Mifos Subversion repository using the following links:
All three of these MDEX schemas are equivalent. They are just different representations of the same information. Some people may find the RELAX NG schema easier to read and understand. The W3C? XML Schema is provided for those already familiar with that format and for use with software tools that require it.
An example of an MDEX file that follows the format specified by the MDEX schema can be found below:
Understanding the structure of an MDEX file and the constraints the MDEX format places on the data it contains is the key to migrating data into Mifos.
The main task of a Mifos data migration project will be the transformation of data from the existing system into the MDEX format.
What Will the Data Migration Process Be?
The basic pattern which will be followed during data migration is to:
- Extract data from the existing system
- Transform and clean the data
- Generate MDEX files from the data
- Load the MDEX files into Mifos
- Validate that the data in Mifos is correct
Tools for Data Migration
The approach that is best for a given Mifos data migration project will depend on what tools and development environments are most familiar to the Mifos specialist working on the project.
As more Mifos data migration projects are completed there will be a larger body of experience to draw upon for making suggestions. Possible approaches include:
- extracting data into an intermediate database, performing transformations on the SQL data and then writing it out into MDEX format.
- using an ETL (Extract, Transform, Load) tool to do the extraction, transformation, and generation of MDEX data.
There are a number of open source ETL tools available. Some of the better known tools include:
Thus far, Mifos specialists have reported using the following ETL tools for Mifos data migration projects
Kettle - this tool was used for extracting data from an MS Access database and Excel spreadsheets and load it into a MySQL database. An attempt was made to use it for transformation, but it was not judged to be stable enough to do that.
Extraction Tools and Techniques +++++++++++++++++++++++++++++++ The form of the source data will help determine the choice of a method for extracting data from the source system. If the source data is already in a database, then extracting may just be a matter of copying data into a new set of tables to use for transformation, cleaning and MDEX file generation. ETL tools may be useful for this step.
Transformation Tools and Techniques
The choice of tools for transformation and cleaning of data will depend on what intermediate form is chosen for the data. ETL tools may provide a convenient way of performing transformations, doing transformation in SQL is another option, or programming transformations directly in a your programming language of choice may work best.
MDEX File Generation Tools and Techniques
Since MDEX files are just XML files, any code or tool that can generate XML files can be used to generate MDEX files. Some ETL tools include components for mapping data into XML. There are tools available that can generate code for reading and writing XML files that follow a given XML schema. These tools insure that valid XML is generated. Tools or libraries are listed below for reading and writing XML from a given programming language:
- Java: JAXB from Sun Microsystems. (In the future Mifos may include a library that provides prebuilt java classes for writing MDEX files).
- .NET: xsd.exe from Microsoft (for C# and Visual Basic) `xsd.exe documentation <http://msdn2.microsoft.com/en-us/library/x6c1kb0s(vs.80).aspx>`, `xsd download <http://www.microsoft.com/downloads/details.aspx?familyid=89e6b1e5-f66c-4a4d-933b-46222bb01eb0&displaylang=en>`
The Data Migration Process
For a given MFI, data migration is likely to be a process that will need to be repeated more than once. For example, after an initial data migration, both Mifos and the legacy system are likely to be run in parallel for some length of time to insure that the newly migrated Mifos installation is correct. To support multiple migrations, it is suggested that the data migration process should be automated as much as possible.
Data Migration Experiences
As data migration projects are completed, feedback from each project can be assembled and presented here to share experiences, offer suggestions, and provide tips and help for MFIs working on their own data migration projects.