merge_MD_batches.py
This function compiles the outputs of MD_batch_rin.py across multiple directories, so that entire trajectories can be broken up into sections and analysed “in parallel” for speed.
Note that the MD_batch_rin outputs in each directory must have the standard file names.
Help message
usage: merge_MD_batches.py [-h] [-s SEED] [-dirs [DIRS ...]] [-addwat]
merge data from batch processed dirs
options:
-h, --help show this help message and exit
-s SEED, -seed SEED seed ch:ID[,ch:ID,ch:ID]
-dirs [DIRS ...] directories to merge data from
-addwat add waters to batch rin for each frame
Inputs
-s: seed, specified and used exactly the same way as in individual processing-dirs: list of directories to combine MD outputs from-type: which of probe and arpeggio to use to generate the RINs-addwat: include the median number of waters in the batch_res_atoms files- off by default for now because distance analysis can be slow and so far we’ve mainly used this function for looking at the summary statistics
Outputs
batch_res_atoms.dat: res atoms file containing all non-water groups that have contacts in any of the given frames.- if
-addwatused, abatch_res_atoms.[frame].datfile will be created for each frame with the median number of waters added as well
- if
batch_stats.dat: summary statistics table of contact counts for each identified functional groupbatch_rawdf.pkl: python pickle file containing pandas dataframe with all contact count info for every functional group in every frame. can be used for data analysis/plotting/etcbatch_atominfo.pkl: lists of selected atoms for each functional group used to create batch_res_atoms.datwaters_per_frame.dat: list of how many water molecules have contacts in each of the given frames