Data Formats:

Example datasets are available for download, including:

  • Compound concentration data - cow, four groups (download)
  • Compound concentration data - human, two groups (download)
  • Binned NMR/MS spectra data (download)
  • Processed peak intensity table (download)
  • Time-series peak intensity data (download)
  • Multi-factor / covariates data - COVID-19 peak intensity table (download) and metadata (download)
  • mzTab 2.0-M file (download)
  • MS peak list data (3 columns - mass, p value, and t-score) (download)

Datasets in zipped (.zip) format, including:

  • NMR peak lists (2 columns - chemical shift and intensity) (download)
  • MS peak lists (2 columns - mass and intensity) (download)
  • MS peak lists (3 columns - mass, retention time, and intensity) (download)
  • LC-MS raw spectra (mzML) (download)

Note: please refer to the detailed instructions and screenshots listed below.

Comma Separated Values (.csv) or Tab Delimited Text (.txt):

These two formats are used for concentration data, peak intensity tables, and MS/NMR spectral bins. Samples can be in either rows or columns. Note:
  1. Both sample and feature names must be unique and consist only of common English letters, underscores, and numbers. Latin/Greek letters are not supported.
  2. Class labels must immediately follow sample names; for two-factor and time-series data, there must be two class labels corresponding to the two factors.
  3. For time-series data, the time-point group must be named Time. In addition, samples collected from the same subject at different time points should be consecutive (see the screenshot demo for "Two-factor / Time series").
  4. Data values (concentrations, bins, peak intensities) should contain only positive numeric values (use an empty cell or NA for missing values). In addition, there should be no spaces within numbers. For instance, 1 600 should be formatted as 1600; otherwise the value will be read as 1.
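
The rules above can be checked locally before uploading. The following Python sketch is not part of MetaboAnalyst; the function name check_table and the assumed layout (samples in rows, sample name in column 1, a single class label in column 2, features in the remaining columns) are illustrative choices. The text does not say whether zero counts as an acceptable value, so this sketch only flags negative numbers.

```python
import csv
import io
import re

# Names may only use English letters, digits and underscores (rule 1).
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")
# Missing values are written as an empty cell or NA (rule 4).
MISSING = {"", "NA"}


def check_table(csv_text):
    """Return a list of problems found in a samples-in-rows table.

    Assumed layout (hypothetical, for illustration): column 1 holds the
    sample name, column 2 the class label, the rest are feature values.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    feature_names = header[2:]
    sample_names = [row[0] for row in body]
    problems = []

    for kind, names in (("feature", feature_names), ("sample", sample_names)):
        if len(set(names)) != len(names):
            problems.append(f"duplicate {kind} names")
        for name in names:
            if not NAME_PATTERN.fullmatch(name):
                problems.append(f"unsupported characters in {kind} name: {name!r}")

    for row in body:
        for feature, value in zip(feature_names, row[2:]):
            value = value.strip()
            if value in MISSING:
                continue
            try:
                number = float(value)  # "1 600" fails here because of the space
            except ValueError:
                problems.append(f"non-numeric value {value!r} for {row[0]}/{feature}")
                continue
            if number < 0:
                problems.append(f"negative value {value!r} for {row[0]}/{feature}")
    return problems
```

An empty list means no problems were detected under these assumptions; it does not guarantee that the upload will be accepted.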

mzTab 2.0-M files (.mzTab)

MetaboAnalyst now supports the upload of mzTab files in the Statistical Analysis module. MetaboAnalyst parses both the Metadata Table (MTD) and the Small Molecule Table (SML) into a MetaboAnalyst-ready data table. From the SML, users can choose to have their features named using either the "chemical_name" or the "theoretical_neutral_mass". However, if too many of these are missing, the features will be named with the "SML_ID". Further, if there are duplicate names, the "SML_ID" will be appended to the end of the selected feature identifier. From the MTD, any "study_variable" labeled "Blank" will be excluded from the final data table. Note that MetaboAnalyst supports only mzTab-M 2.0 files that have been validated, to ensure that the files can be read by our software.
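
The naming rule described above can be illustrated with a short Python sketch. This is not MetaboAnalyst's own code: the function name_features, the 50% cutoff used for "too many missing", the underscore used when appending the SML_ID, and the per-row fallback to the SML_ID are all assumptions made for illustration. SML rows are represented as plain dictionaries, and missing entries are assumed to be written as "null" or left empty.

```python
from collections import Counter


def name_features(sml_rows, key="chemical_name", max_missing_fraction=0.5):
    """Pick one feature name per SML row.

    key is "chemical_name" or "theoretical_neutral_mass".
    max_missing_fraction is a hypothetical cutoff; the real one is not documented here.
    """
    def missing(value):
        return value is None or value.strip() in ("", "null")

    values = [row.get(key) for row in sml_rows]
    n_missing = sum(missing(v) for v in values)

    # Too many missing identifiers: name every feature by its SML_ID.
    if n_missing > max_missing_fraction * len(sml_rows):
        return [str(row["SML_ID"]) for row in sml_rows]

    counts = Counter(v for v in values if not missing(v))
    names = []
    for row, value in zip(sml_rows, values):
        sml_id = str(row["SML_ID"])
        if missing(value):
            names.append(sml_id)  # assumption: fall back to SML_ID for this row
        elif counts[value] > 1:
            names.append(f"{value}_{sml_id}")  # duplicate: append the SML_ID
        else:
            names.append(value)
    return names
```

For example, two rows that are both named glucose with SML_ID 1 and 2 would come out as glucose_1 and glucose_2 under these assumptions.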

Zipped files (.zip)

For NMR/MS peak list files and GC/LC-MS spectra data, users need to upload a zipped folder containing data files from the different groups under study (one file per spectrum and one sub-folder for each group). For paired comparisons, users need to upload a separate text file specifying the pairing information.

GC/LC-MS spectra must be in NetCDF, mzXML, or mzDATA format. The spectra should be stored in two separate folders according to their class labels and then compressed into a zip file. Please note that the program is not compatible with the default options of WinZip v12.0. Make sure to select Legacy compression (Zip 2.0 compatible) when compressing files. No spaces are allowed in either the folder names or the spectra names. The size limit for each uploaded zip file is 50 MB. Please contact the author if you wish to upload a larger dataset.

Peak list data consist of peak list files organized into separate folders named by their class labels. For example, if your data contain three groups, the peak list files should be organized into three folders accordingly. Compress these folders into a single zip file, then upload it to MetaboAnalyst.

NMR peak list files should contain two comma-separated columns, with the 1st column for peak positions (ppm) and the 2nd column for peak intensities. MS peak list files can be in either two-column (mass and intensity) or three-column (mass, retention time, and intensity) format, but not a mixture of both. The first line of each peak list file is reserved for column labels. Files must be saved in comma separated values (.csv) format.
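
As an illustration of the layout described above, the following Python sketch writes MS peak lists into a zip archive with one folder per class label and a header line in each file. The function name build_peaklist_zip, the column labels, and the in-memory input structure are hypothetical. The check for spaces in names is a precaution borrowed from the rule stated for GC/LC-MS spectra. The archive uses standard deflate compression from Python's zipfile module; whether this matches the Legacy compression setting mentioned for WinZip is an assumption.

```python
import csv
import io
import zipfile

# Hypothetical column labels; the first line of each file is reserved for labels.
HEADERS = {2: ["mass", "intensity"], 3: ["mass", "rt", "intensity"]}


def build_peaklist_zip(groups, zip_path):
    """Write peak lists to a zip archive, one folder per class label.

    groups maps class label -> {file name -> list of peak tuples}.
    zip_path may be a file path or a writable binary file object.
    """
    # All files must share one format: two-column or three-column, not both.
    widths = {len(peak) for files in groups.values()
              for peaks in files.values() for peak in peaks}
    if len(widths) != 1 or not widths <= set(HEADERS):
        raise ValueError("peak lists must be all two-column or all three-column")
    header = HEADERS[widths.pop()]

    # Validate names before writing anything (precaution: no spaces).
    for label, files in groups.items():
        for file_name in files:
            if " " in label or " " in file_name:
                raise ValueError(f"space in name: {label}/{file_name}")

    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for label, files in groups.items():
            for file_name, peaks in files.items():
                buffer = io.StringIO()
                writer = csv.writer(buffer, lineterminator="\n")
                writer.writerow(header)
                writer.writerows(peaks)
                archive.writestr(f"{label}/{file_name}", buffer.getvalue())
```

A call such as build_peaklist_zip({"control": {"c1.csv": [(100.1, 5.0)]}, "treated": {"t1.csv": [(100.2, 7.5)]}}, "peaks.zip") would produce an archive containing control/c1.csv and treated/t1.csv.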
