Problems occur in production, and you must be prepared for them.  Review this checklist and make sure you have considered, DOCUMENTED, and TESTED the response to any problems that may occur.

Testing the Process

...

  • Letting database backup scripts run and performing a full restore. Note the amount of time the process takes and any scope for improvement in the steps involved.
  • Verifying O/S backup restore.
  • Verifying all phone numbers on contact lists.
  • Noting the amount of time elapsed from the start of the outage to full recovery.
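
The timing notes in the checklist above could be captured with a small helper during a recovery drill. This is only a minimal sketch, assuming Python is available on the machine running the drill; `DrillLog`, `run_step`, and the step names are hypothetical and not part of Mifos.

```python
import time
from dataclasses import dataclass, field


@dataclass
class DrillLog:
    """Records how long each step of a recovery drill takes (hypothetical helper)."""
    steps: list = field(default_factory=list)

    def run_step(self, name, action):
        """Run one drill step (any callable) and record its elapsed time in seconds."""
        start = time.monotonic()
        action()
        elapsed = time.monotonic() - start
        self.steps.append((name, elapsed))
        return elapsed

    def total_seconds(self):
        """Total time across all recorded steps."""
        return sum(elapsed for _, elapsed in self.steps)

    def slowest_step(self):
        """Name of the step that took longest, or None if nothing was recorded."""
        if not self.steps:
            return None
        return max(self.steps, key=lambda step: step[1])[0]
```

In a real drill each `action` would wrap an actual step, such as restoring the database from the latest backup; the slowest step is a natural first candidate when looking for improvements.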

Documenting the Process

  • Compile a list of emergency contact information, including vendors and contacts at the financial institution (Who should be called in the event of an outage?)
  • Instructions for restoring a system should be detailed and clear. Try to write them for a new user.
  • Make sure a procedure exists for updating existing disaster recovery documentation as new scripts are added/removed (setting up a monthly email reminder to make these updates is often a good idea).
  • Good versus bad examples <we will document some examples of good versus bad documentation here in the future>

...

  • Natural disasters (e.g. flooding of main office or branch offices), fires, and political emergencies
    • OFF SITE DATA STORAGE: make sure data is stored at a location aside from the head office, for BOTH the database and the OS. <link>
    • Well-documented procedures for recovery steps <link>
    • Contact list <link to template>
  • Security breach of Mifos server and/or act of sabotage by staff
    • What are the processes for immediately changing passwords? Are they documented?
    • What needs to be evaluated for your organization (check accounts, database evaluation)?

  • Failure or loss of Mifos server, database and/or server disk storage
    • Make sure scripts are running for database backups.
    • Make sure scripts are running for OS backups.
    • Make sure backup and recovery procedures are DOCUMENTED IN DETAIL, such as:
      • Restoring the OS
      • Restoring the database
      • Mifos configuration settings
      • Any custom scripts that may be running
      • Verification test plan (10-12 trials to ensure the system is functioning properly and stable)
    • Make sure all the scripts are stored in a documented location and include instructions for recreating the production setup.

  • Loss of Internet access/power at main office or branch office
    • What procedures need to be in place for working around the problem at a branch? Document and test.
    • What procedures need to be in place at the Head Office in the event of an extended power outage?

  • Loss of key staff members
    • Staff turnover is a fact of running an organization. Make sure all procedures, instructions, and important details are documented and available to newer staff members who may be forced to troubleshoot or process a recovery. 
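
For the items above about making sure backup scripts are running, one way to check is to confirm that the newest file in the backup directory is recent. The following is a minimal sketch under stated assumptions: backups are written as files into a single directory, and a 26-hour threshold suits a nightly schedule. The function names and the threshold are illustrative, not Mifos settings.

```python
import time
from pathlib import Path


def newest_backup_age_hours(backup_dir, now=None):
    """Return the age in hours of the most recently modified file in
    backup_dir, or None if the directory contains no files."""
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in Path(backup_dir).iterdir() if p.is_file()]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600.0


def backup_is_fresh(backup_dir, max_age_hours=26, now=None):
    """True only if at least one backup file exists and the newest one is
    no older than max_age_hours (assumed threshold for nightly backups)."""
    age = newest_backup_age_hours(backup_dir, now)
    return age is not None and age <= max_age_hours
```

A check like this only shows that a file was written recently. It does not prove the backup can be restored, so it complements the full restore test rather than replacing it.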

...

All system administration and maintenance processes should be well-documented.