Scripts.DatabankLib.analyze_nmrpca module
This code aims to calculate the relaxation time based on the PCAlipids analysis.
For details check:
[Principal Component Analysis of Lipid Molecule Conformational Changes in Molecular Dynamics Simulations Pavel Buslaev, Valentin Gordeliy, Sergei Grudinin, and Ivan Gushchin Journal of Chemical Theory and Computation 2016 12 (3), 1019-1028 DOI: 10.1021/acs.jctc.5b01106 ]
and
[Principal component analysis highlights the influence of temperature, curvature and cholesterol on conformational dynamics of lipids Pavel Buslaev, Khalid Mustafin, and Ivan Gushchin Biochimica et Biophysica Acta (BBA) - Biomembranes Volume 1862, Issue 7, 1 July 2020, 183253 DOI: 10.1016/j.bbamem.2020.183253]
The main idea is to run PCA on concatenated lipid trajectory, calculate the autocorrelation times, which are linearly related to equilibration times of individual lipids. The parameters of linear transform are calculated based on the trajectories from NMRlipids databank.
The code was developed by Dr. Pavel Buslaev, pavel.i.buslaev@jyu.fi Dr. Samuli Olilla Patrik Kula Alexander Kuzmin
- class Scripts.DatabankLib.analyze_nmrpca.Parser(system: System, eq_time_fname='eq_times.json', path=None, v=True)[source]
Bases:
object
Class Parser is a basic class to work with the trajectory. It has basic utility for preparing trajectories:
Checking if simulation has a correct name
Checking if trajectory is downloaded. Downloading, if not
3. Calling AmberMerger to merge head groups and tails for Amber trajectories 4. Concatenate trajectories
- validate_path()[source]
Path validation. Behaviour depends on the input. 1. If ther were any errors, validation fails. Currently the only error tested is that .xtc and .tpr files are not present. This is the case for non-GROMACS trajectories. 2. If path is not provided, parser is iterating over all trajectories. 3. If path is provided and current path is equal to the provided one, parser reports that it finds the trajectory for the analysis. TODO: must be removed. Substitute path-validation part.
- download_traj()[source]
TODO: must be removed. We don’t download trajectory in the analysis part Basic trajectory and TPR download.
- prepare_gmx_traj()[source]
Preparing trajectory. If centered trajectory is found, use it. If whole trajectory is found, use it. Otherwise, call gmx trjconv to make whole trajectory. The selected trajectory is loaded into Universe.
- class Scripts.DatabankLib.analyze_nmrpca.Topology(traj, lipid_resname: str, mapping_dict: dict)[source]
Bases:
object
Class Topology is a class needed to extract lipid specific data and also to merge lipids from Amber trajectories, where lipid is often represented as 3 residue types. It has the following methods:
1. Simple constructor, which sets the force field,residue names, trajectory, and loads mapping_file 2. Mapping file loader 3. Function that outputs atomNames 4. Checker if the merge is needed. Currently defunct 5. Runner, that returns lists for head, tail1, tail2 orders for merging
Currently defunct
- is_merge_needed()[source]
Checker for merge. Currently it checks if the RESIDUE key is in the mapping file. NOTE: This has to be changed if the content of mapping file changes
- get_lipid_resnames()[source]
Helper function that gets the residue names for lipid, if merge is needed
- assign_resnames(resnames)[source]
Helper function that finds head, sn-1 and sn-2 tails NOTE: currently there are only lipids with 2 different tails available in databank: e.g. POPC or POPG. This leads to different names of tails. This won’t be the case for DOPC. Currently the algorithm is using that all three groups differ
- run_merger()[source]
Find lipid tails that correspond to a particular head-group. NOTE: currently there are only lipids with 2 different tails available in databank: e.g. POPC or POPG. This leads to different names of tails. This won’t be the case for DOPC. Currently the algorithm is using that all three groups differ TODO: get the correspondence from structure
- class Scripts.DatabankLib.analyze_nmrpca.Concatenator(topology: Topology, traj, lipid_resname: str)[source]
Bases:
object
Class Concatenator is a class needed to concatenate trajectory for lipid types. It has the following methods: 1. Simple constructor, which sets the topology,residue names, and trajectory 2. concatenateTraj method to do the basic by lipid concatenation 3. alignTraj method to perform the alignment of the concatenated trajectory 4. The enveloping concatenate method
- concatenate_traj()[source]
concatenateTraj performs basic trajectory concatination. First, it extracts coordinates from trajectory, next, it reshapes the coordinate array, swaps time and resid axes to obtain continuous trajectories of individual lipids (this is needed for autocorrelation time analysis), and finally merges individual lipid trajectories.
- concatenate_traj_with_merging()[source]
concatenateTrajWithMerging performs basic trajectory concatination. In contrast to basic concatenateTraj it additionally merges splitted lipids. First, it creates extracts coordinates from trajectory, next, it reshapes the coordinate array, swaps time and resid axes to obtain continuous trajectories of individual lipids (this is needed for autocorrelation time analysis), and finally merges individual lipid trajectories.
- class Scripts.DatabankLib.analyze_nmrpca.PCA(aligned_traj, av_pos, n_lipid, n_frames, traj_time)[source]
Bases:
object
Class PCA is a class that actually performs PCA. It has the following methods: 1. Simple constructor, which sets the aligned trajtory, average coordinates, number of lipids, number of frames in the concatenated trajectory and trajectory length in ns 2. PCA prepares the trajectory coordinates for analysis and calculates principal components 3. get_proj projects the trajectory on principal components 4. get_lipid_autocorrelation calculates the autocorrelation timeseries for individual lipid 5. get_autocorrelations calculates the autocorrelation timeseries for trajectory
- PCA()[source]
PCA calculates the PCA. First the data is centered and then covariance matrix is calculated manually.
- class Scripts.DatabankLib.analyze_nmrpca.TimeEstimator(autocorrelation)[source]
Bases:
object
Class TimeEstimator is a class that estimates equilbration time from autocorrelation data. It includes the following methods:
Simple constructor, which sets the autocorrelation data
2. Helper method get_nearest_value that finds the interval in the data where the autocorrelation falls below specific value 3. timerelax method calculates the logarithms of autocorrelation and calculates the decay time 4. calculate_time method calculates the estimated equilibration time
- get_nearest_value(iterable, value)[source]
get_nearest_value method return the indices that frame the particular value As an input it gets an array, which is expected to decay. The method tries to find and index (index_A), for which the array gets below the cutoff value for the first time. Then, for index_A-1 the array was larger than the cutoff value for the last time. It means that the function, represented by the array data is equal to cutoff somewhere between timesteps that correspond to index_A-1 and index_A. It can happen, that this index_A is equal 0. Then, method searches for the first occurance, where array is smaller than array[0]. The method returns the interval inbetween the array gets below cutoff.