GK Performance Issues

Notes on current (May 2008) performance issues at Grameen Koota.

Symptoms

  • page load time: several minutes
  • server error (HTTP 500) from Web UI
  • "database error" from Web UI
  • MySQL database "grinds to a halt"... can't connect to database
  • number of MySQL threads monotonically increasing
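
The last symptom can be checked mechanically. Below is a minimal sketch, assuming thread counts are sampled at regular intervals (for example the `Threads_connected` value reported by `SHOW STATUS`) and collected into a list. The function name and the `min_samples` threshold are hypothetical, chosen only for illustration.

```python
def is_monotonically_increasing(samples, min_samples=3):
    """Return True if the thread count never drops and has grown overall.

    samples: thread counts in the order they were taken (assumed to be
    collected elsewhere, e.g. by polling the database periodically).
    min_samples: hypothetical minimum number of readings required before
    a trend is reported at all.
    """
    if len(samples) < min_samples:
        return False
    # Every reading must be at least as large as the one before it.
    never_drops = all(later >= earlier
                      for earlier, later in zip(samples, samples[1:]))
    # A flat series is healthy; require actual growth from first to last.
    return never_drops and samples[-1] > samples[0]
```

A healthy server should see the count fall back after a burst of activity, so a series that only ever climbs points toward connections that are never released.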

Repro

  • ~50 concurrent users
  • running reports and/or just plain using Mifos
  • Selenium work
  • JMeter work (see matrix)
    .. csv-table:: Repro matrix
       :header: repro,application,dataset,why,results

       clicks & reports against testblue,v1.0,large (based on GK prod),closest to GK prod,successful repro
       clicks & reports against testgreen,v1.0,minimal,would be easier to test,no repro
       clicks against testblue while executing reports manually as SQL scripts,v1.0,large (based on GK prod),to isolate problem to ReportsConnectionPool,TBD
       "clicks & reports against testblue; different JBoss, MySQL, or Java versions",v1.0,large (based on GK prod),look for problems in rest of Mifos stack,TBD
       only clicks against testgreen,v1.1,large (based on GK prod),TBD,TBD
       clicks & reports against testgreen,v1.1,large (based on GK prod),TBD,TBD
       clicks & reports against testgreen,v1.1,minimal,TBD,TBD

Firefighting

Brainstorm stopgap fixes here.

  • only allow a certain number of reports to run at a time. Attempts beyond that limit would show the client an error such as: "Sorry, system is too busy. Please wait."
  • play with hibernate connection pool size / application connection pool in general
    • con: reduces app to one connection (we think)
    • con: the "real" fix is to make sure Session.close() is always called
  • tune jasper.pool.size
  • patch v1.0 code by replacing ReportsConnectionPool with DBCP (Adam Feuer)
  • experiment with MySQL settings in my.ini (Aliya Walji, Jeff Brewster)
  • tune GK's MySQL database (consider paid MySQL support contract)
  • `script to assist Naganand in killing sleeping threads <http://mifos.svn.sourceforge.net/viewvc/mifos/documents/deployment/GrameenKoota/tools/sleeper_killer.py?revision=HEAD&view=markup>`_
  • try turning on c3p0
  • try some other connection pooling software
  • play with jasper.pool.size db connection config setting
  • upgrade MySQL from 4.x to 5.x
    • pro: repro doesn't work against 5.x!
    • pro: more/better data from show status
    • pro: get bugfixes/improvements included in 5.x
    • con: problem is most likely the application
    • con: have to update MySQL JDBC driver, too
    • con: might break their custom SQL (what custom SQL exists?)
  • migrate GK to alpha release of v1.1
    • con: bulk loan entry broken
    • con: batch jobs broken?
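
The first stopgap above (capping concurrent reports) could look roughly like the following. This is only a sketch of the idea, written in Python for brevity, whereas the real change would live in the Java application. `ReportLimiter`, `SystemBusyError`, and `BUSY_MESSAGE` are hypothetical names, and the right limit would have to be found by experiment.

```python
import threading

BUSY_MESSAGE = "Sorry, system is too busy. Please wait."


class SystemBusyError(Exception):
    """Raised when the report limit is already reached (hypothetical type)."""


class ReportLimiter:
    """Allow at most max_concurrent reports to run at the same time."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, report_fn):
        # Do not queue the request: fail fast so the client sees the
        # "too busy" message instead of holding another connection open.
        if not self._slots.acquire(blocking=False):
            raise SystemBusyError(BUSY_MESSAGE)
        try:
            return report_fn()
        finally:
            # Always give the slot back, even if the report raised.
            self._slots.release()
```

Rejecting immediately rather than waiting is deliberate here: a waiting request would still tie up a server thread, which is the resource that appears to be running out.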

Potentially Related/Similar Problems

Development

Brainstorm starting points for more in-depth fixes here.