The Proteome Discovery Pipeline for LC-MS Data
The Open Proteomics Journal, "Data Analysis Pipeline for Mass Spectrometry-Based Differential Proteomics Discovery", 2010, 3, pp 8-19
cceHUB Data Repository or User Data Collection
Data Tip
Instrument-generated data files are archived in the cceHUB data repository for the CCE Colorectal Cancer collection, the CPTC collection, and other data collections produced by the XCTPLUS-IONTRAP, MicroMass-QTOF, and LC-MSD-TOF instruments. You can browse the files in the cceHUB repository and use them as input to the discovery pipeline. You can also input data files directly from your own collection.
Workflow Tip
When using the discovery pipeline to analyze instrument-generated data files associated with your study, you will need to process each instrument-generated data file through the Spectrum Deconvolution Tool.
Spectrum Deconvolution Tool
Data Tip
Input = one instrument-generated data file, with a format type of mzXML, mzData or netCDF.
Output = one "dlt" deconvoluted data file for each input data file. You can select the location for the output "dlt" file.
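As a pre-flight check before launching the tool, a short script can confirm that each file is in one of the accepted formats. The sketch below is illustrative only and is not part of cceHUB: the function name is hypothetical, and the extension-to-format mapping (.mzXML, .mzData, and .cdf or .nc for netCDF) is an assumption about how your files are named.

```python
from pathlib import Path

# Assumed mapping from file extension to format name; adjust it to match
# how the files in your own collection are actually named.
ACCEPTED_FORMATS = {
    ".mzxml": "mzXML",
    ".mzdata": "mzData",
    ".cdf": "netCDF",
    ".nc": "netCDF",
}

def input_format(path):
    """Return the format name for a data file, or None if the extension is not recognized."""
    return ACCEPTED_FORMATS.get(Path(path).suffix.lower())
```

A file for which `input_format` returns None would need to be converted or checked by hand before it is given to the Spectrum Deconvolution Tool.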
Tool Tip
You will run this tool once for each data file in your study. The parameters for the deconvolution specify information about the instrument type and the instrument runtime settings, such as analyte type (peptide or metabolite), instrument type (ion trap or TOF), data acquisition mode (positive or negative), and retention time minimum and maximum. Click here for more information about the parameters.
Workflow Tip
The "dlt" files generated by the spectrum deconvolution tool should be placed in a folder in your cceHUB home. You can identify the folder where the "dlt" files should be placed, and then you can direct the peak alignment tool to access them from that folder. You can align all the "dlt" files as a single group. More often, though, you will align the "dlt" files in groups. For example, you may have files representing both cancer and healthy patient samples. In the alignment tool, you may want to load all the files in your study in two groups: one group for cancer patients and the other group for healthy patients.
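One way to keep the groups straight is to store each group's "dlt" files in its own subfolder and list them before loading them into the alignment tool. The following is a minimal sketch under that assumption: the one-subfolder-per-group layout, the ".dlt" file extension, and the function name are all hypothetical conventions, not something cceHUB requires.

```python
from pathlib import Path

def collect_dlt_groups(study_dir):
    """Map each subfolder name to the sorted list of .dlt file names it contains.

    Assumes one subfolder per group (for example "cancer" and "healthy").
    Subfolders without any .dlt files are skipped.
    """
    groups = {}
    for sub in sorted(Path(study_dir).iterdir()):
        if sub.is_dir():
            files = sorted(p.name for p in sub.glob("*.dlt"))
            if files:
                groups[sub.name] = files
    return groups
```

The resulting dictionary gives the number of groups (its length) and the files in each group, which is the information you enter when selecting files by group in the Peak Alignment Tool.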
Peak Alignment Tool
Data Tip
Input = some or all of the "dlt" data files in your study, split into a number of groups. The number of groups can be one, two or more, and (for the cancer studies) can be based on diagnosis, age, gender, treatment outcome, or some other characteristic of the data.
Output = a single "org" file with all input data aligned.
Tool Tip
You will specify the number of groups and then select the data files by group. The tool can search the data repository or any folder from your cceHUB home. You will enter parameters that control the alignment, such as m/z variation. Click here for more information about the parameters.
Workflow Tip
There is one single "org" file generated by the peak alignment tool, and it contains aligned data from all the input "dlt" files. You may run the alignment tool many times, using different groupings of "dlt" files or different parameters. Every run produces a new "org" file.
Normalization Tool
Data Tip
Input = one "org" data file output by the Peak Alignment Tool. The file represents the aligned data for the data files grouped according to the input specification for the Peak Alignment Tool.
Output = a single file with all input data normalized. This file has an "ntx" extension.
Tool Tip
You need to specify the number of groups and data files per group, as identified to the Peak Alignment Tool. The group information is not maintained in the "org" file, so you need to specify it again for your input file before you can initiate a normalization run. The tool can search the data repository or any folder from your cceHUB home. You will enter parameters that control the normalization, such as the maximum and minimum of samples for peak detection. Click here for more information about the tool methods and parameters.
Workflow Tip
There is one single "ntx" file generated by the normalization tool, and it contains the normalized data from the input "org" file. It represents the aligned, normalized data from your original raw data, organized by groups. You may run the normalization tool many times, using different alignment groupings or different parameters. The output "ntx" file is used as input by the last two analysis programs in the workflow: Significance Testing Tool and Pattern Recognition Tool.
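Because the group information is not maintained between tools, it has to be re-entered for the Normalization, Significance Testing, and Pattern Recognition tools. A simple safeguard is to write it down next to each aligned file at alignment time. The sketch below is one hypothetical way to do that; the ".groups.json" sidecar name and both function names are assumptions, and none of the pipeline tools read this file.

```python
import json
from pathlib import Path

def save_group_spec(aligned_file, groups):
    """Write a JSON note beside the aligned file recording the grouping used.

    `groups` maps a group name to the list of "dlt" files placed in that group.
    """
    spec = {
        "number_of_groups": len(groups),
        "files_per_group": {name: len(files) for name, files in groups.items()},
        "groups": groups,
    }
    note = Path(str(aligned_file) + ".groups.json")
    note.write_text(json.dumps(spec, indent=2))
    return note

def load_group_spec(aligned_file):
    """Read back the grouping note written by save_group_spec."""
    return json.loads(Path(str(aligned_file) + ".groups.json").read_text())
```

When you later run a downstream tool on the file, the recorded `number_of_groups` and `files_per_group` values are the ones to enter by hand.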
Significance Testing Tool
Data Tip
Input = one "ntx" data file output by the Normalization Tool. The file represents the aligned, normalized data for the original raw data files, grouped according to the input specification for the Peak Alignment Tool.
Output = significance results, charts and images.
Tool Tip
You need to specify the number of groups and data files per group, as identified to the Peak Alignment Tool. The group information is not maintained in the "ntx" file, so you need to specify it again for your input file before you can initiate a run. The tool can search the data repository or any folder from your cceHUB home. Click here for more information about the tool methods and parameters.
Workflow Tip
No output generated by the Significance Testing Tool is needed by other workflow tools. You may run the Significance Testing Tool many times, using different alignment groupings.
Pattern Recognition Tool
Data Tip
Input = one "ntx" data file output by the Normalization Tool. The file represents the aligned, normalized data for the original raw data files, grouped according to the input specification for the Peak Alignment Tool.
Output = charts and images.
Tool Tip
You need to specify the number of groups and data files per group, as identified to the Peak Alignment Tool. The group information is not maintained in the "ntx" file, so you need to specify it again for your input file before you can initiate a run. The tool can search the data repository or any folder from your cceHUB home. Click here for more information about the tool methods and parameters.
Workflow Tip
No output generated by the Pattern Recognition Tool is needed by other workflow tools. You may run the Pattern Recognition Tool many times, using different alignment groupings.
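The file hand-offs between the five tools can be summarized as a small lookup table. The sketch below transcribes the inputs and outputs listed on this page; the table layout and the function name are illustrative choices, not part of the pipeline itself.

```python
# Stage table transcribed from the tool descriptions: (tool, input types, output type).
# An output of None means the tool produces results, charts or images rather than
# a file consumed by a later tool.
STAGES = [
    ("Spectrum Deconvolution Tool", ("mzXML", "mzData", "netCDF"), "dlt"),
    ("Peak Alignment Tool", ("dlt",), "org"),
    ("Normalization Tool", ("org",), "ntx"),
    ("Significance Testing Tool", ("ntx",), None),
    ("Pattern Recognition Tool", ("ntx",), None),
]

def tools_accepting(file_type):
    """Return the names of the tools that take the given file type as input."""
    return [name for name, inputs, _ in STAGES if file_type in inputs]
```

For example, `tools_accepting("ntx")` lists the two analysis tools at the end of the workflow, both of which read the same normalized file.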