MapAlignerPoseClustering

Corrects retention time distortions between maps, using a pose clustering approach.

potential predecessor tools → MapAlignerPoseClustering → potential successor tools
Predecessor: FeatureFinderCentroided (or another feature finding algorithm)
Successor: FeatureLinkerUnlabeled or FeatureLinkerUnlabeledQT

This tool provides an algorithm to align the retention time scales of multiple input files, correcting shifts and distortions between them. Retention time adjustment may be necessary to correct for chromatography differences e.g. before data from multiple LC-MS runs can be combined (feature grouping), or when one run should be annotated with peptide identifications obtained in a different run.

All map alignment tools (MapAligner...) collect retention time data from the input files and, by fitting a model to this data, compute transformations that map all runs to a common retention time scale.

The map alignment tools differ in how they obtain retention time data for the modeling of transformations, and consequently what types of data they can be applied to. The alignment algorithm implemented here is the pose clustering algorithm as described in doi:10.1093/bioinformatics/btm209. It is used to find an affine transformation, which is further refined by a feature grouping step. This algorithm can be applied to features (featureXML) and peaks (mzML), but it has mostly been developed and tested on features. For more details and algorithm-specific parameters (set in the INI file) see "Detailed Description" in the algorithm documentation.
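To make the idea of voting for an affine transformation concrete, here is a minimal sketch in Python. It is a toy illustration under simplifying assumptions, not the OpenMS implementation: it uses retention times only, enumerates all pairs exhaustively, and the function names and bin widths (`estimate_affine_by_voting`, `apply_affine`, `scale_bin`, `shift_bin`) are hypothetical. Each pair of points in one map, matched against each pair in the reference, yields one (scale, shift) hypothesis; the most frequently voted hypothesis is taken as the transformation.

```python
from collections import Counter
from itertools import combinations


def estimate_affine_by_voting(ref_rts, other_rts, scale_bin=0.01, shift_bin=1.0):
    """Return (scale, shift) such that ref_rt ~= scale * other_rt + shift.

    Toy pose clustering on RT values only: every pair of reference points
    combined with every pair of points from the other map casts one vote
    for a binned (scale, shift) hypothesis.
    """
    ref_sorted = sorted(ref_rts)
    other_sorted = sorted(other_rts)
    votes = Counter()
    for r1, r2 in combinations(ref_sorted, 2):
        for o1, o2 in combinations(other_sorted, 2):
            if r2 == r1 or o2 == o1:
                continue  # degenerate pair, no scale can be derived
            scale = (r2 - r1) / (o2 - o1)
            shift = r1 - scale * o1
            key = (round(scale / scale_bin), round(shift / shift_bin))
            votes[key] += 1
    if not votes:
        raise ValueError("need at least two distinct RT values per map")
    (scale_idx, shift_idx), _count = votes.most_common(1)[0]
    # Report the value represented by the winning bin.
    return scale_idx * scale_bin, shift_idx * shift_bin


def apply_affine(rts, scale, shift):
    """Map retention times onto the reference time scale."""
    return [scale * rt + shift for rt in rts]
```

The exhaustive double loop grows quickly with the number of points, which is one way to see why limiting the number of points considered speeds up the alignment.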

See also
MapAlignerPoseClustering, MapAlignerSpectrum, MapRTTransformer

This algorithm uses an affine transformation model.

To speed up the alignment, consider reducing 'max_number_of_peaks_considered'. If the alignment is not good enough, consider increasing this number, at the cost of a longer run time.

The command line parameters of this tool are:

MapAlignerIdentification -- Corrects retention time distortions between maps based on common peptide
identifications.
Version: 2.0.0 May 29 2015, 13:57:09, Revision: GIT-NOTFOUND

Usage:
  MapAlignerIdentification <options>

This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed
description or use the --helphelp option.

Options (mandatory options marked with '*'):
  -in <files>*               Input files separated by blanks (all must have the same file type)
                             (valid formats: 'featureXML', 'consensusXML', 'idXML')
  -out <files>               Output files separated by blanks. Either 'out' or 'trafo_out' has to be
                             provided. They can be used together.
                             (valid formats: 'featureXML', 'consensusXML', 'idXML')
  -trafo_out <files>         Transformation output files separated by blanks. Either 'out' or
                             'trafo_out' has to be provided. They can be used together.
                             (valid formats: 'trafoXML')
                             

Options to define a reference file (use either 'file' or 'index', not both; if neither is given,
'index' is used):
  -reference:file <file>     File to use as reference (same file format as input files required)
                             (valid formats: 'featureXML', 'consensusXML', 'idXML')
  -reference:index <number>  Use one of the input files as reference ('1' for the first file, etc.).
                             If '0', no explicit reference is set - the algorithm will select a
                             reference. (default: '0' min: '0')

                             
Common TOPP options:
  -ini <file>                Use the given TOPP INI file
  -threads <n>               Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>          Writes the default configuration file
  --help                     Shows options
  --helphelp                 Shows all options (including advanced)

The following configuration subsections are valid:
 - algorithm   Algorithm parameters section
 - model       Options to control the modeling of retention time transformations from data

You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
Have a look at the OpenMS documentation for more information.
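The constraints stated in the help text above (at least one of 'out' or 'trafo_out', and either a reference file or a reference index but not both) can be checked before the tool is called. The following Python sketch assembles an argument list from the options listed above; the helper name `build_aligner_command` and all file names are hypothetical, and the sketch only builds the command without running it.

```python
def build_aligner_command(tool, inputs, outputs=None, trafo_outputs=None,
                          reference_file=None, reference_index=None):
    """Build an argument list using only the options shown in the help text."""
    if not inputs:
        raise ValueError("'in' is mandatory")
    if not outputs and not trafo_outputs:
        raise ValueError("either 'out' or 'trafo_out' has to be provided")
    if reference_file is not None and reference_index is not None:
        raise ValueError("use either 'reference:file' or 'reference:index', not both")
    if reference_index is not None and reference_index < 0:
        raise ValueError("'reference:index' has a minimum of 0")

    cmd = [tool, "-in", *inputs]
    if outputs:
        cmd += ["-out", *outputs]
    if trafo_outputs:
        cmd += ["-trafo_out", *trafo_outputs]
    if reference_file is not None:
        cmd += ["-reference:file", reference_file]
    if reference_index is not None:
        cmd += ["-reference:index", str(reference_index)]
    return cmd


# Hypothetical file names; the first input file serves as the reference.
command = build_aligner_command(
    "MapAlignerIdentification",
    inputs=["run1.featureXML", "run2.featureXML"],
    outputs=["run1_aligned.featureXML", "run2_aligned.featureXML"],
    trafo_outputs=["run1.trafoXML", "run2.trafoXML"],
    reference_index=1,
)
```

The resulting list could be passed to `subprocess.run(command, check=True)` on a system where the tool is installed.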

INI file documentation of this tool:

+ MapAlignerIdentification: Corrects retention time distortions between maps based on common peptide identifications.
    version: Version of the tool that generated this parameters file. (default: '2.0.0')
  ++ 1: Instance '1' section for 'MapAlignerIdentification'
      in: Input files separated by blanks (all must have the same file type) (default: '[]'; input file; valid formats: '*.featureXML', '*.consensusXML', '*.idXML')
      out: Output files separated by blanks. Either 'out' or 'trafo_out' has to be provided. They can be used together. (default: '[]'; output file; valid formats: '*.featureXML', '*.consensusXML', '*.idXML')
      trafo_out: Transformation output files separated by blanks. Either 'out' or 'trafo_out' has to be provided. They can be used together. (default: '[]'; output file; valid formats: '*.trafoXML')
      log: Name of log file (created only when specified) (default: empty)
      debug: Sets the debug level (default: '0')
      threads: Sets the number of threads allowed to be used by the TOPP tool (default: '1')
      no_progress: Disables progress logging to command line (default: 'false'; valid values: 'true', 'false')
      force: Overwrite tool specific checks. (default: 'false'; valid values: 'true', 'false')
      test: Enables the test mode (needed for internal use only) (default: 'false'; valid values: 'true', 'false')
    +++ reference: Options to define a reference file (use either 'file' or 'index', not both; if neither is given 'index' is used).
        file: File to use as reference (same file format as input files required) (default: empty; input file; valid formats: '*.featureXML', '*.consensusXML', '*.idXML')
        index: Use one of the input files as reference ('1' for the first file, etc.). If '0', no explicit reference is set - the algorithm will select a reference. (default: '0' min: '0')
    +++ algorithm: Algorithm parameters section
        peptide_score_threshold: Score threshold for peptide hits to be used in the alignment. Select a value that allows only 'high confidence' matches. (default: '0')
        min_run_occur: Minimum number of runs (incl. reference, if any) a peptide must occur in to be used for the alignment. Unless you have very few runs or identifications, increase this value to focus on more informative peptides. (default: '2' min: '2')
        max_rt_shift: Maximum realistic RT difference for a peptide (median per run vs. reference). Peptides with higher shifts (outliers) are not used to compute the alignment. If 0, no limit (disable filter); if > 1, the final value in seconds; if <= 1, taken as a fraction of the range of the reference RT scale. (default: '0.5' min: '0')
        use_unassigned_peptides: Should unassigned peptide identifications be used when computing an alignment of feature maps? If 'false', only peptide IDs assigned to features will be used. (default: 'true'; valid values: 'true', 'false')
        use_feature_rt: When aligning feature maps, don't use the retention time of a peptide identification directly; instead, use the retention time of the centroid of the feature (apex of the elution profile) that the peptide was matched to. If different identifications are matched to one feature, only the peptide closest to the centroid in RT is used. Precludes 'use_unassigned_peptides'. (default: 'false'; valid values: 'true', 'false')
    +++ model: Options to control the modeling of retention time transformations from data
        type: Type of model (default: 'b_spline'; valid values: 'linear', 'b_spline', 'interpolated')
      ++++ linear: Parameters for 'linear' model
          symmetric_regression: Perform linear regression on 'y - x' vs. 'y + x', instead of on 'y' vs. 'x'. (default: 'false'; valid values: 'true', 'false')
      ++++ b_spline: Parameters for 'b_spline' model
          wavelength: Determines the amount of smoothing by setting the number of nodes for the B-spline. The number is chosen so that the spline approximates a low-pass filter with this cutoff wavelength. The wavelength is given in the same units as the data; a higher value means more smoothing. '0' sets the number of nodes to twice the number of input points. (default: '0' min: '0')
          num_nodes: Number of nodes for B-spline fitting. Overrides 'wavelength' if set (to two or greater). A lower value means more smoothing. (default: '5' min: '0')
          extrapolate: Method to use for extrapolation beyond the original data range. 'linear': Linear extrapolation using the slope of the B-spline at the corresponding endpoint. 'b_spline': Use the B-spline (as for interpolation). 'constant': Use the constant value of the B-spline at the corresponding endpoint. 'global_linear': Use a linear fit through the data (which will most probably introduce discontinuities at the ends of the data range). (default: 'linear'; valid values: 'linear', 'b_spline', 'constant', 'global_linear')
          boundary_condition: Boundary condition at B-spline endpoints: 0 (value zero), 1 (first derivative zero) or 2 (second derivative zero) (default: '2' min: '0' max: '2')
      ++++ interpolated: Parameters for 'interpolated' model
          interpolation_type: Type of interpolation to apply. (default: 'cspline'; valid values: 'linear', 'cspline', 'akima')
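The three-way interpretation of 'max_rt_shift' described above (0 disables the filter, values greater than 1 are seconds, values up to 1 are a fraction of the reference RT range) can be written out as a small Python sketch. This follows the parameter description only and is not taken from the tool's source; the function names `resolve_max_rt_shift` and `is_outlier` are hypothetical.

```python
def resolve_max_rt_shift(max_rt_shift, reference_rt_min, reference_rt_max):
    """Translate the 'max_rt_shift' setting into a limit in seconds.

    Returns None when the filter is disabled (setting equal to 0).
    """
    if max_rt_shift < 0:
        raise ValueError("'max_rt_shift' has a minimum of 0")
    if max_rt_shift == 0:
        return None  # no limit, filter disabled
    if max_rt_shift > 1:
        return float(max_rt_shift)  # already the final value in seconds
    # Values in (0, 1] are a fraction of the reference RT range.
    return max_rt_shift * (reference_rt_max - reference_rt_min)


def is_outlier(rt_shift, limit):
    """A peptide is skipped when its RT shift exceeds the resolved limit."""
    return limit is not None and abs(rt_shift) > limit
```

With the default of 0.5 and a reference run spanning 0 to 3600 seconds, the limit resolves to 1800 seconds, so only peptides that shift by more than half the run would be excluded.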

OpenMS / TOPP release 2.0.0 Documentation generated on Fri May 29 2015 17:20:34 using doxygen 1.8.9.1