Recovering a crashed ADIT-NMR session

There are two possible scenarios: restart ID problems and submission problems.

Restart ID problems

The restart ID contains the date the session was created and the name of the server where it was created. Before digging into ADIT-NMR sessions, take a closer look at both:

  • The server name is not deposit.bmrb.wisc.edu: chances are the user started the deposition in Osaka and is trying to continue it here. Just tell them to continue on the Osaka server.
  • The date is before Dec 6 2010 (or the server name is batfish.bmrb.wisc.edu): this is an old session, started before the required-chemical-shifts version of ADIT-NMR went online. See the instructions for recovering old sessions.
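
The two checks above can be sketched as a small shell function. This is only an illustration: triage_restart_id is a hypothetical helper, it expects the server name and the creation date (as YYYY-MM-DD) to have already been read off the restart ID by hand, and the order in which the checks are applied is an assumption.

```shell
# Hypothetical helper: classify a restart ID from its two components.
# $1 = server name, $2 = session creation date written as YYYY-MM-DD.
# How these are extracted from the restart ID is not covered here.
triage_restart_id() (
    server=$1
    created=$2
    # Turn 2010-12-06 into 20101206 so dates compare as integers.
    created_num=$(printf '%s' "$created" | tr -d '-')
    if [ "$server" = "batfish.bmrb.wisc.edu" ]; then
        echo "old-session"
    elif [ "$server" != "deposit.bmrb.wisc.edu" ]; then
        # Assumption: any other server means the session lives elsewhere (e.g. Osaka).
        echo "other-server"
    elif [ "$created_num" -lt 20101206 ]; then
        echo "old-session"
    else
        echo "check-for-corrupt-file"
    fi
)
```

For example, triage_restart_id deposit.bmrb.wisc.edu 2010-11-30 prints old-session.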

If it's none of the above and you're getting an internal server error when trying to continue the session, it's likely that ADIT-NMR crashed in the middle of a previous session and left behind a corrupt file. The error is caused by ADIT-NMR trying to read that corrupt file back in.

The good news is that most of the time the file is BmrbUtils.odb (or BmrbUtils.psf), and all it does is hold the state of the tree menu on the left side of the ADIT-NMR window. That makes it perfectly safe to delete: you will not lose any data. So delete the BmrbUtils.odb file from the session directory and try the restart ID again. Most of the time this gets rid of the problem.
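
A cautious variant of that step is sketched below. It renames the menu-state file instead of deleting it, so it can be put back if something unexpected happens; a plain rm does the job just as well. The function name and the .bak suffix are made up for this example, and handling the .psf name the same way is an assumption.

```shell
# Hypothetical helper: move the tree-menu state file(s) out of the way.
# $1 = path to the session directory.
reset_menu_state() (
    session_dir=$1
    for f in BmrbUtils.odb BmrbUtils.psf; do
        if [ -f "$session_dir/$f" ]; then
            # Rename rather than delete; the data file Table-db.odb is not touched.
            mv "$session_dir/$f" "$session_dir/$f.bak"
            echo "moved $f"
        fi
    done
)
```

After running it on the session directory, try the restart ID again.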

If that doesn't work, look for corruption in the data file itself (Table-db.odb) by using the /cgi-bin/bmrb-adit/obj2cif program to try to turn it into a text cif file. If the conversion succeeds, it's possible to recreate the session by starting a new one and copying Table-db.odb and the contents of upload/ over into it. If it fails, the problem is serious and the data is most likely lost. If there is a saved text representation (preview file), it can be turned into Table-db.odb by the cif2obj program (or uploaded into a new session to populate the fields). Otherwise you'll have to apologize to the depositor and ask them to start again (the best you can do is create a new session and re-upload everything from upload/ for them). (Also see It still doesn't work below.)
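
The copy into a fresh session might look like the sketch below. It assumes both session directories already exist and that you know their paths; copy_session_data is a hypothetical name, and the obj2cif check itself is not shown because its command-line arguments are not described here.

```shell
# Hypothetical helper: copy the data file and uploads from a broken
# session into a freshly created one.
# $1 = old session directory, $2 = new session directory.
copy_session_data() (
    old=$1
    new=$2
    if [ ! -f "$old/Table-db.odb" ]; then
        echo "no Table-db.odb in $old" >&2
        exit 1
    fi
    # -p keeps timestamps and permissions where possible.
    cp -p "$old/Table-db.odb" "$new/Table-db.odb" || exit 1
    if [ -d "$old/upload" ]; then
        mkdir -p "$new/upload" || exit 1
        # The trailing /. copies the contents of upload/, not the directory itself.
        cp -Rp "$old/upload/." "$new/upload/" || exit 1
    fi
)
```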

Submission problems

E.g. RCSB never received rcsbNNNNN CVS from us, or files are missing from it. The likely cause is that ADIT-NMR crashed in the middle of processing and never created the CVS “project”. The basic recovery procedure is to copy the session over to the QA or programmer (PROG) platform and try to re-submit from there. Chances are the problem was intermittent and the submission will work there.

  1. Use scp -r or some other way to copy the entire session directory to the programmer (PROG) platform. (Remember that you can't run scp on PROD and push to PROG; you have to run it on PROG and pull from PROD.)
  2. On the PROG copy, if there is an rcsbNNNNN directory created by the submission attempt, mv it out of the way so the session is in the same state it was in before the submission ran. (Don't delete it: after you re-submit, you can compare the new result with the one from the PROD run and see whether the re-submit fixed the problem.)
  3. Remove the lockFlagFilePdb and lockFlagFileBmrb files from the PROG copy of the session so it can be resubmitted there.
  4. (Important) Go into the adit/config/bmrb-adit/adit-nmr-config.cif settings file for the PROG platform and double-check that the e-mailer is turned off (set to cat > /dev/null instead of /usr/sbin/sendmail). If this is not done, the real authors will get mail when you re-run the submission.
  5. After that, use the restart ID to continue the deposition on the PROG platform and re-submit it there. Check that it's in PDB view mode first: sometimes the submission on the PROD platform got far enough to flip the view mode flag to BMRB mode before it crashed and sometimes it didn't, so you have to check this case by case.
    At that point the re-submit either works or it doesn't:
    1. If it works on the re-try, take the rcsbNNNNN directory it produces and manually check it into the CVS archive (/cvspdb) to send to PDB: as an initial check-in if the previous run never got that far, or as a commit of updates (a “1.2” version) if the previous run did make an rcsbNNNNN directory and check it in but the files had incomplete contents. Either way, PDB will see the change on their next nightly update. Important: e-mail Monica and ask her to check whether they received the entry, because the scripts on the PDB side may ignore it if it's “too old”.
    2. If it doesn't work: see It still doesn't work below.
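
Steps 2 and 3 of the list above can be sketched as one shell function, to be run on the PROG copy only. prepare_prog_copy and the .prod-run suffix are names invented for this example; the session directory is assumed to have been pulled over already (step 1), and the mailer setting from step 4 still has to be checked by hand.

```shell
# Hypothetical helper: put a PROG copy of a session back into its
# pre-submission state. $1 = session directory on the PROG platform.
prepare_prog_copy() (
    cd "$1" || exit 1
    for d in rcsb*; do
        # Skip directories this function has already set aside.
        case $d in *.prod-run) continue ;; esac
        [ -d "$d" ] || continue
        # Keep the PROD result for later comparison instead of deleting it.
        mv "$d" "$d.prod-run" || exit 1
    done
    # Without the lock files the session can be submitted again.
    rm -f lockFlagFilePdb lockFlagFileBmrb
)
```

Keeping the old rcsbNNNNN directory under a different name makes it easy to diff against the directory produced by the re-submit.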

It still doesn't work

Unfortunately, the instructions here can't be very specific. If it is actually crashing, the cause could be just about anything, and you'll have to figure it out.

If you run it on the PROG platform and it coredumps, you can usually load the coredump into gdb, provided you know which program produced it, to at least find the line it died on. Use

strings coredumpfilenamehere | head

to see the name of the binary program that produced the crash, then cd to mmcif-input-tool/ and run

gdb bin/$BINARYPROGNAME $COREDUMPNAME

and run the where command to see the line it died on.
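
If you'd rather not type where at the gdb prompt, the same backtrace can be requested non-interactively. The wrapper below is a sketch: core_where is a hypothetical name, the GDB variable is only there so a different debugger binary can be substituted, and the binary name still has to be read from the strings output by hand.

```shell
# Hypothetical helper: print a backtrace for a coredump and exit.
# $1 = path to mmcif-input-tool/, $2 = binary name, $3 = coredump file.
core_where() (
    tool_dir=$1
    binary=$2
    core=$3
    # -batch exits after the commands have run; -ex where prints the stack.
    "${GDB:-gdb}" -batch -ex where "$tool_dir/bin/$binary" "$core"
)
```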

(If it does coredump, the coredump will be found in one of two places: either in ADIT-NMR's cgi-bin/bmrb-adit/ directory or in the session directory, depending on whether the program did a cd to the session directory first.

If it doesn't coredump, check that /etc/sysconfig/httpd has

ulimit -Hc 32768
ulimit -Sc 32768

in it. If not, add those, restart httpd, and try again.)
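
A quick way to check for those two lines is sketched below. check_core_limits is a hypothetical helper; it only tests that a ulimit -Hc and a ulimit -Sc line with some numeric value are present, not that the value is 32768.

```shell
# Hypothetical helper: report which core-size ulimit lines are missing.
# $1 = config file to inspect (defaults to /etc/sysconfig/httpd).
check_core_limits() (
    conf=${1:-/etc/sysconfig/httpd}
    missing=0
    for flag in Hc Sc; do
        if ! grep -Eq "^[[:space:]]*ulimit[[:space:]]+-$flag[[:space:]]+[0-9]+" "$conf"; then
            echo "missing: ulimit -$flag"
            missing=1
        fi
    done
    # Exit status 0 means both lines were found.
    exit $missing
)
```

If it reports a missing line, add it, restart httpd, and reproduce the crash.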

If all else fails, sprinkle lines like

fprintf( stderr, "I got this far %s:%d\n", __FILE__,__LINE__);

all over the relevant routine (for submission problems: mk_deposition) to trace where it crashed.
