!!!!!! THIS PAGE MAY BE OUTDATED !!!!!!

Instructions for Local Development Database Setup

Things are different here because we are wiping and rebiulding the databases frequently. And it takes some time (about 4 hours) to reload everything from complete scratch, plus we have not automated all processes to create certain knowledge from UMLS and RxNorm combined with the SPL labels.

We have 4 systems:

  1. hsqldb, this cannot use all data, reloaddb target still applies.
  2. oracle (madhuri) - used currently for creating the merger with RxNorm and UMLS, SNOMED etc.
  3. pgsql-aishwarya (alias psqldev)
  4. pgsql (aurora) (public test database)

The approach is for reloading from total scratch we begin with oracle and then create augmented knowledge. It's described somewhere here but not fully automated and still managed by myself at this point.

The next step is to load stuff into pgsql-aishwarya. This is load all from XML, SPL, RxNorm, SNOMED. Takes 2 hours or so.

Then the knowledge content is dumped out into reloadable database dumps.

  • mw.knowledge.dump.gz - SPL labels
  • mw.snomed.dump.gz - SNOMED problem list concepts

Other targets to reload are:

  • setupDB - wipes a database, creates new schema
  • loadSampleData - basic system, command definitions
  • loadSamplePatientData - the detailed sample patient data
  • loadSamplePatients - dummy patients from scambled public data

Now, whenever we reload a system without wanting to change the knowdledge content. We begin on aishwarya. There we log in as user pgsql.

First we wipe all data, which may require to take the database offline to clear all locks.

pg_ctl -m immediate stop
pg_ctl restart
psql mw < mw.schema.wipe.sql

The mw.schema.wipe does:

drop schema mw cascade;
create schema authorization mw;
grant all on schema mw to mw;

so we really delete everything in the mw schema.

Then we rebuild the schema. On some machine (I use manisha alias scienceworkbench)

ant/bin/ant -Dedition.dir=../encumbered -Dweb.port=9090 -Ddatabase=pgsql-aishwarya setupDB

After which I take a dump of just the schema as pgsql@aishwarya

pg_dump --schema-only mw |gzip -cv9 > mw.schema.gz

Then on manisha again I load the basic setup:

ant/bin/ant -Dedition.dir=../encumbered -Dweb.port=9090 -Ddatabase=pgsql-aishwarya reloaddb

and take another dump of just that, and we now load the knowledge

pg_dump --data-only mw |gzip -cv9 > mw.basic.gz
gzcat mw.snomed.dump.gz |psql mw
gzcat mw.knowledge.dump.gz |psql mw

that takes just a few minutes. Back on manisha I loadSamplePatienData. It's important that that step happen after the knowledge is loaded, because references to the knowledge base are being created now. We dump the sample patients in as well. The final step is run the wordIndexer. This

ant/bin/ant -Dedition.dir=../encumbered -Dweb.port=9090 -Ddatabase=pgsql-aishwarya loadSamplePatientData loadSamplePatients wordIndexer

The word indexer takes some time. Now we have the system loaded and ready to take our final snapshot. A complete dump is created and then sent to aurora or a complete rebiuld from one single database dump.

pg_dump --clean mw |gzip -cv9 > mw.dump.gz
(cat mw.schema.wipe.sql ; gzcat mw.dump.gz) |psql -h aurora.regenstrief.org -U mw

From that point on, rebuilding requires only

pg_ctl -m immediate stop
pg_ctl restart
gzcat mw.dump.gz |psql mw

which is done in a few minutes.