Parameter Setting

Pathway Database
The KEGG pathway database was updated in December 2024 using the KEGG API. Integration can be performed in one of two "universes": metabolic pathways or all pathways. Metabolic pathways contain both metabolites and metabolic genes, whereas all pathways comprise the metabolic pathways plus gene-only pathways (i.e., regulatory pathways). Users can also perform enrichment analysis for metabolites only, using metabolic pathways (metabolite only), or for genes only, using all pathways (gene only).
Algorithm Selection

Topology analysis evaluates the potential importance of a particular molecule (a node) based on its position within a pathway. Degree Centrality measures the number of links connected to a node. Betweenness Centrality measures the number of shortest paths between all pairs of other nodes that pass through a given node. Closeness Centrality measures the overall shortest-path distance from a given node to all other nodes.
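
The three measures can be illustrated with a minimal sketch on a toy pathway graph. Everything here is an assumption made for illustration: the graph is treated as undirected and unweighted, the node names are made up, closeness is reported as the common reciprocal form (number of reachable nodes divided by the summed distances), and betweenness uses the standard Brandes accumulation. The scaling and edge direction used by the tool itself may differ.

```python
from collections import deque


def degree_centrality(adj):
    """Number of links attached to each node."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def bfs_distances(adj, source):
    """Shortest-path length (in links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness_centrality(adj):
    """Reachable nodes divided by the summed distance to them (higher = closer)."""
    result = {}
    for node in adj:
        dist = bfs_distances(adj, node)
        total = sum(dist.values())
        result[node] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


def betweenness_centrality(adj):
    """Brandes' algorithm for unweighted, undirected graphs (unnormalised)."""
    score = {node: 0.0 for node in adj}
    for s in adj:
        order = []                             # nodes in order of discovery
        preds = {node: [] for node in adj}     # predecessors on shortest paths
        n_paths = {node: 0 for node in adj}    # number of shortest paths from s
        n_paths[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    n_paths[v] += n_paths[u]
                    preds[v].append(u)
        dependency = {node: 0.0 for node in adj}
        for w in reversed(order):              # farthest nodes first
            for u in preds[w]:
                dependency[u] += n_paths[u] / n_paths[w] * (1 + dependency[w])
            if w != s:
                score[w] += dependency[w]
    # Each unordered pair is visited from both ends in an undirected graph.
    return {node: value / 2 for node, value in score.items()}


# Hypothetical four-node linear pathway: a - b - c - d
toy_pathway = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

print(degree_centrality(toy_pathway))
print(betweenness_centrality(toy_pathway))
print(closeness_centrality(toy_pathway))
```

In this toy chain the two interior nodes score higher than the end nodes on all three measures, which matches the intuition that molecules in the middle of a pathway are positioned to affect more of it.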

Two general approaches are available for integration. Tight integration (combining queries) pools genes and metabolites into a single query and performs enrichment analysis within their "pooled universe". Loose integration (combining p values) performs enrichment analysis separately for genes and metabolites, each in its "individual universe", and then combines the individual p values via a weighted Z-test.

There are three options for computing the weights. As an example, assume the pathway database contains 100 pathways covering a total of 1000 metabolites and 4000 genes, where pathway A contains 5 compounds and 45 genes and pathway B contains 20 compounds and 30 genes:

  • Unweighted or equal weights (i.e., metabolite: 0.5, gene: 0.5);
  • Weights based on the overall proportion of each omics type within the "universe" (i.e., metabolite: 0.2, gene: 0.8 for all pathways);
  • Weights based on the pathway-level proportion within the individual "pathway space" (i.e., pathway A: metabolite 0.1, gene 0.9; pathway B: metabolite 0.4, gene 0.6).
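
The arithmetic behind the three weighting options can be reproduced with a short sketch. The function name `omics_weights` and the scheme labels "equal", "overall", and "pathway" are hypothetical and chosen only for this illustration; the counts are the ones from the example above.

```python
def omics_weights(scheme, universe_counts, pathway_counts=None):
    """Return (metabolite_weight, gene_weight) for one pathway.

    scheme: "equal", "overall", or "pathway" (hypothetical labels).
    universe_counts / pathway_counts: (n_metabolites, n_genes) tuples.
    """
    if scheme == "equal":
        return 0.5, 0.5
    if scheme == "overall":
        n_met, n_gene = universe_counts
    elif scheme == "pathway":
        if pathway_counts is None:
            raise ValueError("pathway_counts is required for the 'pathway' scheme")
        n_met, n_gene = pathway_counts
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")
    total = n_met + n_gene
    return n_met / total, n_gene / total


universe = (1000, 4000)   # metabolites, genes covered by all pathways
pathway_a = (5, 45)       # compounds, genes in pathway A
pathway_b = (20, 30)      # compounds, genes in pathway B

for name, counts in [("A", pathway_a), ("B", pathway_b)]:
    for scheme in ("equal", "overall", "pathway"):
        w_met, w_gene = omics_weights(scheme, universe, counts)
        print(f"pathway {name}, {scheme}: metabolite {w_met:.1f}, gene {w_gene:.1f}")
```

Only the pathway-level scheme gives different weights for pathways A and B; the other two options apply the same pair of weights to every pathway.
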
Note that combining p values can only be applied to pathways that receive hits from both input types (genes and metabolites). For pathways with hits from only one input type, the p value calculated from that type's individual universe is used. Combining p values can be viewed as adjusting the confidence level based on new evidence (i.e., input from another omics layer); if no new evidence is available, the current confidence level is retained.
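
A minimal sketch of the combination step, including the single-input fallback, might look as follows. It assumes the common Stouffer-style form of the weighted Z-test for one-sided p values, Z = sum(w_i * z_i) / sqrt(sum(w_i^2)) with z_i taken from the standard normal quantile of 1 - p_i; the exact formula used by the tool is not stated here, and the function names and the clamping constant are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()   # standard normal, mean 0 and sd 1
_EPS = 1e-12                 # assumed clamp to keep p values strictly inside (0, 1)


def weighted_z_combine(p_values, weights):
    """Combine one-sided p values with a weighted Z-test (Stouffer-style)."""
    z_scores = []
    for p in p_values:
        p = min(max(p, _EPS), 1.0 - _EPS)           # avoid infinite z-scores
        z_scores.append(_STD_NORMAL.inv_cdf(1.0 - p))
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    return 1.0 - _STD_NORMAL.cdf(numerator / denominator)


def pathway_p_value(p_met, p_gene, w_met, w_gene):
    """Combined p value for one pathway; None marks an input type with no hits."""
    if p_met is not None and p_gene is not None:
        return weighted_z_combine([p_met, p_gene], [w_met, w_gene])
    # Hits from only one input type: keep that type's own p value.
    return p_met if p_met is not None else p_gene


# Hits from both input types, pathway-level weights of pathway A (0.1, 0.9)
print(pathway_p_value(0.04, 0.01, 0.1, 0.9))
# Metabolite hits only: the metabolite p value is returned unchanged
print(pathway_p_value(0.04, None, 0.1, 0.9))
```

With equal weights, two moderately significant p values reinforce each other and yield a smaller combined p value, which is the "new evidence" effect described in the note.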