fairmd.lipids.analib.analyze_nmrpca module

NMRPCA module calculate the relaxation time based on the PCAlipids analysis.

For details check:

Principal Component Analysis of Lipid Molecule Conformational Changes in Molecular Dynamics Simulations Pavel Buslaev, Valentin Gordeliy, Sergei Grudinin, and Ivan Gushchin. Journal of Chemical Theory and Computation (2016), 12(3), 1019–1028. DOI: 10.1021/acs.jctc.5b01106

and

Principal component analysis highlights the influence of temperature, curvature and cholesterol on conformational dynamics of lipids Pavel Buslaev, Khalid Mustafin, and Ivan Gushchin Biochimica et Biophysica Acta (BBA) - Biomembranes Volume 1862, Issue 7, 1 July 2020, 183253 DOI: 10.1016/j.bbamem.2020.183253

The main idea is to run PCA on concatenated lipid trajectory, calculate the autocorrelation times, which are linearly related to equilibration times of individual lipids. The parameters of linear transform are calculated based on the trajectories from NMRlipids databank.

The initial code was developed by Dr. Pavel Buslaev (pavel.i.buslaev@jyu.fi), Dr. Samuli Olilla, Patrik Kula, Alexander Kuzmin

MEM_CUTOFF

Environmental variable which sets the trajectory size cutoff (in Bytes) after which the algorithm will start skipping frames for in-memory representation for computing ACF.

Default: 1,500,000,000.

class fairmd.lipids.analib.analyze_nmrpca.Parser(system: System, eq_time_fname='eq_times.json', path=None, *, verbose=True)[source]

Bases: object

A basic class to work with the trajectory.

It has basic utility for preparing trajectories:

  1. Calling AmberMerger to merge head groups and tails for Amber trajectories

  2. Concatenate trajectories

Parameters:
  • system – simulation data

  • eq_time_fname – name of the output file

  • path – name of a particular trajectory

  • v – verbosity for testing

prepare_traj()[source]
static verify_lipid(lipid: Lipid) bool[source]

Verify that the lipid is supported by the method.

concatenate_traj() None[source]

Create Concatenator and corresponding concatenated trajectories.

Done for all lipids available in the trajectory.

dump_data(data)[source]

Write data to json file.

class fairmd.lipids.analib.analyze_nmrpca.Topology(traj, lipid_resname: str, mapping_dict: dict)[source]

Bases: object

Extract lipid specific data and also to merge lipids if they are more than 1 residue.

In Amber FF naming convention, lipid is often represented as several residue types. The clases has the following methods:

  1. Simple constructor, which sets the force field,residue names, trajectory, and loads mapping_file

  2. Mapping file loader

  3. Function that outputs atomNames

  4. Checker if the merge is needed. Currently defunct

  5. Runner, that returns lists for head, tail1, tail2 orders for merging

Currently defunct.

Parameters:
  • traj – MDAnalysis trajectory

  • lipid_resname – resname of the lipid (str)

  • mapping_dict – preloaded mapping_dict

HEADGRP = 'headgroup'
TAILSN1 = 'sn-1'
TAILSN2 = 'sn-2'
atom_names() list[source]

Extract all names of heavy atoms from the mapping.

is_merge_needed() set[source]

Check if lipid needs to be merged. Empty set means no merge needed.

Currently it checks if the RESIDUE key is in the mapping file. NOTE: This has to be changed if the content of mapping file changes.

get_lipid_resnames()[source]

Get the residue names for lipid, if merge is needed

assign_resnames(resnames)[source]

Find head, sn-1 and sn-2 tails

NOTE: currently there are only lipids with 2 different tails available in databank: e.g. POPC or POPG. This leads to different names of tails. This won’t be the case for DOPC. Currently the algorithm is using that all three groups differ

run_merger() tuple[list, list, list][source]

Find lipid tails that correspond to a particular head-group.

NOTE: currently there are only lipids with 2 different tails available in databank: e.g. POPC or POPG. This leads to different names of tails. This won’t be the case for DOPC. Currently the algorithm is using that all three groups differ

TODO: get the correspondence from structure

class fairmd.lipids.analib.analyze_nmrpca.Concatenator(topology: Topology, traj, lipid_resname: str)[source]

Bases: object

Concatenate trajectory for lipid types.

It has the following methods:

  1. Simple constructor, which sets the topology,residue names, and trajectory

  2. concatenateTraj() method to do the basic by lipid concatenation

  3. alignTraj() method to perform the alignment of the concatenated trajectory

  4. The enveloping concatenate method

Parameters:
  • topology – topology for lipid

  • traj – MDAnalysis trajectory

  • lipid_resname – lipid resname in the trajectory

concatenate_traj() tuple[Universe, int, int][source]

Perform basic trajectory concatination.

First, it extracts coordinates from trajectory, next, it reshapes the coordinate array, swaps time and resid axes to obtain continuous trajectories of individual lipids (this is needed for autocorrelation time analysis), and finally merges individual lipid trajectories.

concatenate_traj_with_merging()[source]

Perform basic trajectory concatination with merging.

In contrast to basic concatenate_traj() it additionally merges splitted lipids. First, it creates extracts coordinates from trajectory, next, it reshapes the coordinate array, swaps time and resid axes to obtain continuous trajectories of individual lipids (this is needed for autocorrelation time analysis), and finally merges individual lipid trajectories.

align_traj(concatenated_traj) tuple[ndarray, ndarray][source]

Align the concatenated trajectory.

Alignment occurs in two steps: (1) it computes average structure after alignment to the first frame, and (2) it alignes the structure to the calculated average structure in (1).

concatenate()[source]

Enveloping function to perform concatenation.

class fairmd.lipids.analib.analyze_nmrpca.PCA(aligned_traj, av_pos, n_lipid, n_frames, traj_time)[source]

Bases: object

Perform Pinciple Component Analysis (PCA).

It has the following methods:

  1. Simple constructor, which sets the aligned trajtory, average coordinates, number of lipids, number of frames in the concatenated trajectory and trajectory length in ns

  2. PCA() prepares the trajectory coordinates for analysis and calculates principal components

  3. get_proj() projects the trajectory on principal components

  4. get_lipid_autocorrelation() calculates the autocorrelation timeseries for individual lipid

  5. get_autocorrelations() calculates the autocorrelation timeseries for trajectory

Parameters:
  • aligned_traj – np.array with positions of concatenated and aligned trajectory

  • av_pos – np.array with average positions for the lipid

  • n_lipid – number of lipids of particular type in the system

  • n_frames – number of frames in the concatenated trajectory

  • traj_time – trajectory length in ns

PCA()[source]

Calculate the PCA.

First the data is centered and then covariance matrix is calculated manually.

get_proj(cdata) None[source]

Projecting the trajectory on the 1st principal component.

get_lipid_autocorrelation(data, variance, mean)[source]

Autocorrelation calculation for individual lipid.

get_autocorrelations() None[source]

Autocorrelation calculation for the trajectory.

class fairmd.lipids.analib.analyze_nmrpca.TimeEstimator(autocorrelation)[source]

Bases: object

Estimate equilbration time from autocorrelation data.

It includes the following methods:

  1. Simple constructor, which sets the autocorrelation data

  2. Helper method get_nearest_value() that finds the interval in the data where the autocorrelation falls below specific value

  3. timerelax() method calculates the logarithms of autocorrelation and calculates the decay time

  4. calculate_time() method calculates the estimated equilibration time

Parameters:

autocorrelation – autocorrelation data

get_nearest_value(iterable, value)[source]

Return the indices that frame the particular value.

As an input it gets an array, which is expected to decay. The method tries to find and index (index_A), for which the array gets below the cutoff value for the first time. Then, for index_A-1 the array was larger than the cutoff value for the last time. It means that the function, represented by the array data is equal to cutoff somewhere between timesteps that correspond to index_A-1 and index_A. It can happen, that this index_A is equal 0. Then, method searches for the first occurance, where array is smaller than array[0]. The method returns the interval inbetween the array gets below cutoff.

timerelax() float[source]

Estimates autocorrelation decay time.

calculate_time() float[source]

Estimate equilibration time from autocorrelation decay time.

The times are linearly connected and the coefficient is calculated experimentally.