Please choose one of the following analysis methods:

Hypergeometric test
Global Test
Gene Set Enrichment Analysis
Signaling Pathway Impact Analysis
CliPPER

Select a species:

Homo sapiens
Mus musculus
Drosophila melanogaster

Choose a pathway database:

KEGG
Reactome

Set the minimum number of genes in common between experimental data and pathways:

Have you already computed Differentially Expressed Genes (DEGs)?

Yes
No

Please select a file with a matrix containing the expression values for each gene:

Experimental data source:

Microarray
RNA-seq

Significance level:

Please select a file containing the list of all the genes of your dataset (only the list of gene IDs):


Upload the list of DEGs with their fold change:


Analysis Methods

The hypergeometric test estimates the probability of observing, by chance alone, at least a given number of genes from a pathway among the selected differentially expressed genes.
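
As a rough illustration of the calculation, the sketch below computes the upper-tail hypergeometric probability using only the Python standard library. The function name and its parameters are hypothetical and chosen for this example; the tool's own implementation may differ, for instance in how the gene universe is defined.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_universe, n_pathway, n_degs, n_overlap):
    """Return P(X >= n_overlap) for a hypergeometric variable X.

    n_universe: number of genes on the platform (hypothetical parameter name)
    n_pathway:  number of those genes that belong to the pathway
    n_degs:     number of differentially expressed genes (the "draws")
    n_overlap:  number of DEGs observed inside the pathway
    """
    total = comb(n_universe, n_degs)
    largest_possible = min(n_pathway, n_degs)
    # Sum the probabilities of every overlap at least as large as the observed one.
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_degs - k)
        for k in range(n_overlap, largest_possible + 1)
    )
    return tail / total
```

For instance, `hypergeom_enrichment_pvalue(20000, 100, 500, 10)` would give the probability of finding 10 or more genes of a 100-gene pathway among 500 DEGs drawn from a 20000-gene platform; these numbers are invented for illustration only.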

GSEA compares the distribution of the pathway genes, adjusted for their correlation structure, with that of all the genes in the list.

SPIA captures several aspects of the data combining the fold change of the differentially expressed genes, the pathway enrichment and the topology of pathways.

Global Test uses a penalized logistic regression model to identify the genes in a pathway that best predict the class of each sample.

CliPPER uses graphical Gaussian models to compare the mean and the correlation structure of the genes of a pathway in two conditions. For significant pathways, CliPPER identifies the portion of the pathway most strongly associated with the phenotype.

Inputs

The normalized expression matrix must be a tab-delimited text file. The first row must contain the sample names, and each sample name must indicate the class of the sample; the first column must contain the gene IDs.
If the matrix contains missing values (NAs), they will be automatically imputed using the K-nearest neighbour algorithm (impute Bioconductor package).
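
To make the expected layout concrete, here is a minimal parsing sketch in Python. It is not the tool's own code: the function name is hypothetical, the sample labels and gene IDs in the example are made up, and it assumes that missing values are written as NA (or left empty) and that the header row may or may not include a label above the gene ID column. No imputation is performed here; missing values are simply returned as None.

```python
import csv
import io


def read_expression_matrix(handle):
    """Parse a tab-delimited expression matrix into (classes, values_by_gene)."""
    rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    if len(rows) < 2 or len(rows[1]) < 2:
        raise ValueError("expected a header row and at least one gene row")
    header, data = rows[0], rows[1:]
    n_samples = len(data[0]) - 1
    # Assumption: keep the last n_samples header cells, so a header with or
    # without a corner label above the gene ID column is handled the same way.
    classes = header[-n_samples:]
    values_by_gene = {}
    for row in data:
        gene_id, cells = row[0], row[1:]
        if len(cells) != n_samples:
            raise ValueError(f"row for {gene_id} has {len(cells)} values")
        values_by_gene[gene_id] = [
            None if cell in ("NA", "") else float(cell) for cell in cells
        ]
    return classes, values_by_gene


# Invented example: three samples in two classes, one missing value.
example = "geneID\tcontrol\tcontrol\ttreated\n1017\t5.1\tNA\t6.3\n"
classes, values_by_gene = read_expression_matrix(io.StringIO(example))
```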

As the data can be either Microarray or RNA-seq data, "Experimental data source" allows the selection of the method for DEG computation.

  • Microarray: empirical Bayes test from limma Bioconductor package.
  • RNA-seq: negative binomial test from edgeR Bioconductor package.

When RNA-seq is selected, the user can also apply a correction for gene length.

The "Significance level" box sets the False Discovery Rate threshold for the identification of DEGs.
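
The text does not say which False Discovery Rate procedure is applied. Assuming the common Benjamini-Hochberg adjustment, the selection of DEGs at a given threshold can be sketched as follows; the function names and the input dictionary are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def select_degs(pvalue_by_gene, fdr_threshold):
    """Keep the genes whose adjusted p-value does not exceed the threshold."""
    genes = list(pvalue_by_gene)
    adjusted = benjamini_hochberg([pvalue_by_gene[g] for g in genes])
    return [g for g, q in zip(genes, adjusted) if q <= fdr_threshold]
```

With this sketch, a lower significance level yields a shorter and more conservative DEG list, which in turn changes the input of the downstream pathway analysis.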

Download a sample matrix (microarray expression data - Homo sapiens).

Inputs

The complete list of the genes on the platform must be a plain text file with one gene ID per line, so that the number of rows equals the number of genes.

The list of differentially expressed genes must be a tab-delimited file with two columns:

  1. Gene ID;
  2. log(Fold Change).
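
A minimal sketch of how the two files could be read and cross-checked, assuming neither file has a header line. All function names are hypothetical and the gene IDs in the example are invented.

```python
import csv
import io


def read_gene_universe(handle):
    """Read one gene ID per line, skipping blank lines."""
    return [line.strip() for line in handle if line.strip()]


def read_deg_table(handle):
    """Read a two-column, tab-delimited file: gene ID and log fold change."""
    log_fc_by_gene = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        if len(row) != 2:
            raise ValueError(f"expected 2 columns, found {len(row)}: {row}")
        log_fc_by_gene[row[0]] = float(row[1])
    return log_fc_by_gene


def degs_missing_from_universe(universe, log_fc_by_gene):
    """List DEG IDs that do not appear in the complete gene list."""
    known = set(universe)
    return sorted(g for g in log_fc_by_gene if g not in known)


# Invented example data.
universe = read_gene_universe(io.StringIO("1017\n1018\n1019\n"))
degs = read_deg_table(io.StringIO("1017\t1.8\n1019\t-2.3\n"))
missing = degs_missing_from_universe(universe, degs)
```

Checking that every DEG also appears in the complete gene list is a design choice of this sketch, not a documented requirement of the tool.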

Download Homo sapiens sample files.

Inputs

The expression matrix must be a tab-delimited text file. The first row must contain the sample names, and each sample name must indicate the class of the sample; the first column must contain the gene IDs.
If the matrix contains missing values (NAs), they will be automatically imputed using the K-nearest neighbour algorithm (impute Bioconductor package).

"Experimental data source" allows the selection between Microarray or RNA-seq data.

Download a sample matrix (microarray expression data - Homo sapiens).