Frequently Asked Questions

Running GSEA with DecoPath

How can I run GSEA using DecoPath?

To run GSEA using DecoPath, you will need:

An expression dataset
A file containing class labels (e.g., normal and tumor) for samples in the expression dataset

What kind of data can I submit to run my experiments?

You can submit RNA-Seq, microarray, and ChIP-Seq data. Gene identifiers must be HUGO Gene Nomenclature Committee (HGNC) symbols.

How do I format expression data to run GSEA with DecoPath?

The expression data you submit to run GSEA should be in a comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) file. The first and second column labels should be 'name' and 'description', respectively, and all subsequent column labels should be the sample identifiers. Each row should contain an HGNC symbol, a description (which can be left as NA) and the expression value of that gene in each of the samples:

gene_symbol	description	sample_1	sample_2	sample_3
TSPAN6	NA	17.844	15.288	19.747
CFH	NA	145.610	209.655	93.282
C1orf112	NA	0.387	0.562	0.144

What is a class labels file?

Similar to the categorical class (CLS) file format, the class labels file specifies the class label of each sample in the dataset. It should be a comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) file.

For example, for a dataset with expression values for samples that fall into two distinct classes (e.g., normal and tumor), the class labels file should have a column with the same sample identifiers as the ones in the expression dataset file you submit and another column with their corresponding class labels. Please ensure the order of the samples in the expression dataset is the same as the order of samples in the class labels file. Below is an example of a class labels file:

identifier	class_label
sample_1	normal
sample_2	normal
sample_3	tumor
sample_4	tumor
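The ordering requirement above can be checked before uploading. The following is a minimal sketch, not part of DecoPath: the function names are hypothetical, the files are assumed to be tab-separated, and the expression columns are read by position (gene symbol first, description second, samples after that).

```python
import csv
import io


def read_rows(text, delimiter="\t"):
    """Parse delimited text into a list of non-empty rows."""
    return [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]


def samples_in_same_order(expression_text, labels_text, delimiter="\t"):
    """Return True if the sample columns of the expression dataset match,
    in order, the identifiers listed in the class labels file."""
    expression_header = read_rows(expression_text, delimiter)[0]
    # Assumption: the first two columns hold the gene symbol and description.
    expression_samples = expression_header[2:]
    # Skip the header row of the class labels file.
    label_samples = [row[0] for row in read_rows(labels_text, delimiter)[1:]]
    return expression_samples == label_samples


expression = (
    "gene_symbol\tdescription\tsample_1\tsample_2\tsample_3\n"
    "TSPAN6\tNA\t17.844\t15.288\t19.747\n"
)
labels = (
    "identifier\tclass_label\n"
    "sample_1\tnormal\n"
    "sample_2\tnormal\n"
    "sample_3\ttumor\n"
)
print(samples_in_same_order(expression, labels))  # True
```

If the function returns False, reorder one of the two files so that the sample identifiers line up.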

Can I run GSEA against my own list of ranked genes?

Yes, DecoPath can run GSEA on a pre-ranked list of genes.

To run GSEA pre-ranked, upload a file in a comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) format that contains the following 2 columns:

HGNC symbols
Class difference metric for the HGNC symbols

In this case, do not include a header (column names) in the file. See the example below:

TSPAN6	2.1
CFH	1.5
C1orf112	0.6
AKT1	-1.3
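A file in this layout could be produced as in the sketch below. The function name is hypothetical, and sorting the genes from the highest to the lowest metric is an assumption made for illustration; this FAQ does not state whether DecoPath requires the rows to be sorted.

```python
import csv
import io


def write_preranked(gene_scores, delimiter="\t"):
    """Return the text of a two-column, header-free ranked gene file.

    gene_scores maps HGNC symbols to a class difference metric.
    Sorting from highest to lowest score is a choice made for this sketch.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    ranked = sorted(gene_scores.items(), key=lambda item: item[1], reverse=True)
    for symbol, score in ranked:
        writer.writerow([symbol, score])
    return buffer.getvalue()


scores = {"AKT1": -1.3, "CFH": 1.5, "TSPAN6": 2.1, "C1orf112": 0.6}
print(write_preranked(scores))
```

Writing the returned text to a *.tsv file gives a file with no header row, as required above.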

Running ORA with DecoPath

How can I run ORA using DecoPath?

To run ORA using DecoPath, you will need:

A gene list

What kind of file can I submit to run ORA?

To run ORA, submit a file that contains a single column with HGNC symbols in a comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) format.

Submitting your own results

Can I submit my own ORA or GSEA results?

Yes, you can submit the results of your own enrichment analysis and use the DecoPath visualizations and functionalities to compare and explore the consensus across different pathway databases. If you do opt to submit your own results, you must use the databases provided in DecoPath. We strongly recommend using gene set files from the following links:

How can I submit results for ORA?

If you submit ORA results, ensure they are in a comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) file that contains the columns pathway, p_value and q_value. For example:

pathway	p_value	q_value
WP1403	1.18e-06	9.35e-06
WP411	2.61e-09	5.72e-08
hsa04210	0.074	0.1

How can I submit results for GSEA?

If you submit GSEA results, ensure they are in a comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) file that contains the columns pathway, es, nes, p_value and q_value. For example:

pathway	es	nes	p_value	q_value
WP2636	-0.25	-0.25	0.2	0.22
hsa05031	-0.3	-0.13	2.61e-07	2.61e-09
hsa04210	0.074	0.1	2.3e-02	1.2e-03
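Before uploading ORA or GSEA results, it may help to confirm that the required columns are present and that the p-values and q-values are valid probabilities. The sketch below is one way to do this for a tab-separated file; the function and constant names are hypothetical, and the range check is an extra sanity check rather than a documented DecoPath rule.

```python
import csv
import io

ORA_COLUMNS = ["pathway", "p_value", "q_value"]
GSEA_COLUMNS = ["pathway", "es", "nes", "p_value", "q_value"]


def check_results(text, required_columns, delimiter="\t"):
    """Return a list of problems found in an enrichment results file."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    header = reader.fieldnames or []
    missing = [column for column in required_columns if column not in header]
    if missing:
        return ["missing columns: " + ", ".join(missing)]
    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        for column in ("p_value", "q_value"):
            try:
                value = float(row[column])
            except (TypeError, ValueError):
                problems.append(f"line {line_number}: {column} is not a number")
                continue
            if not 0.0 <= value <= 1.0:
                problems.append(f"line {line_number}: {column} is outside [0, 1]")
    return problems


gsea = "pathway\tes\tnes\tp_value\tq_value\nWP2636\t-0.25\t-0.25\t0.2\t0.22\n"
print(check_results(gsea, GSEA_COLUMNS))  # []
```

An empty list means no problems were found. Pass ORA_COLUMNS instead of GSEA_COLUMNS to check an ORA results file.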

Performing differential gene expression analysis

Do I need to upload fold changes or files to generate them to use DecoPath?

No, uploading fold changes is optional and can be skipped.

You can run DecoPath without performing differential expression analysis. If you do provide fold changes, DecoPath also generates visualizations that identify genes that are differentially expressed according to a fold change cutoff.

What kind of format do I need to upload fold changes in?

The file you submit containing log2 fold changes should be a comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) file with the columns gene_symbol, log2fc, p_value and q_value. For GSEA, please ensure that the log2 fold changes come from the same experimental groups as the expression dataset or GSEA results file. For ORA, please ensure that they are the log2 fold changes of the genes in the gene list. For example:

gene_symbol	log2fc	p_value	q_value
AK2	-0.16	0.0017	0.002
CD38	1.97	0.1	0.11
FKBP4	-0.79	4.15e-5	2.59e-4
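To illustrate what a fold change cutoff does, the sketch below selects genes from a file in the format above. The function name, the default cutoff values and the additional q-value filter are assumptions made for this example; they are not DecoPath's documented defaults.

```python
import csv
import io


def differentially_expressed(text, log2fc_cutoff=1.0, q_cutoff=0.05, delimiter="\t"):
    """Return gene symbols whose absolute log2 fold change is at least
    log2fc_cutoff and whose q-value is at most q_cutoff."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    selected = []
    for row in reader:
        passes_fold_change = abs(float(row["log2fc"])) >= log2fc_cutoff
        passes_q_value = float(row["q_value"]) <= q_cutoff
        if passes_fold_change and passes_q_value:
            selected.append(row["gene_symbol"])
    return selected


fold_changes = (
    "gene_symbol\tlog2fc\tp_value\tq_value\n"
    "AK2\t-0.16\t0.0017\t0.002\n"
    "CD38\t1.97\t0.1\t0.11\n"
    "FKBP4\t-0.79\t4.15e-5\t2.59e-4\n"
)
print(differentially_expressed(fold_changes, log2fc_cutoff=0.5))  # ['FKBP4']
```

With these cutoffs, AK2 is excluded because its fold change is too small and CD38 is excluded because its q-value is too large.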

What kind of format do I need to upload read counts in to perform differential gene expression analysis?

You will need two files, each in a comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) format:

Un-normalized read counts in the form of a matrix of integer values
Class labels

The two files must be consistently ordered: the samples in the counts matrix must be in the same order as the samples in the class labels file. If you are running GSEA, you only need to submit one class labels file, which is used both for the expression dataset when performing GSEA and for the counts matrix when calculating the fold changes. Example files are given below:

Counts matrix

gene_symbol	sample_1	sample_2	sample_3
TSPAN6	2393	3187	4964
CFH	641	182	404
C1orf112	321	474	198

Class labels

identifier	class_label
sample_1	normal
sample_2	normal
sample_3	tumor
sample_4	tumor
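Because the counts matrix must hold un-normalized integer values, a quick check for stray decimals or negative numbers can catch a normalized matrix uploaded by mistake. The sketch below assumes a tab-separated file with gene symbols in the first column; the function name is hypothetical, and one value from the example above has been altered to 182.5 to trigger the check.

```python
import csv
import io


def non_integer_counts(text, delimiter="\t"):
    """Return (gene, sample, value) triples whose value is not a
    non-negative integer."""
    rows = [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]
    header, body = rows[0], rows[1:]
    samples = header[1:]
    bad = []
    for row in body:
        gene, values = row[0], row[1:]
        for sample, value in zip(samples, values):
            # str.isdigit() rejects decimals ("12.5"), negatives ("-3")
            # and empty cells.
            if not value.isdigit():
                bad.append((gene, sample, value))
    return bad


counts = (
    "gene_symbol\tsample_1\tsample_2\tsample_3\n"
    "TSPAN6\t2393\t3187\t4964\n"
    "CFH\t641\t182.5\t404\n"
)
print(non_integer_counts(counts))  # [('CFH', 'sample_2', '182.5')]
```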

Running DecoPath with other databases

Can I run DecoPath with other databases/gene sets?

Currently, we only provide four databases (i.e., KEGG, Reactome, WikiPathways and PathBank) to run pathway analysis. You can add more databases by uploading files in the gene matrix transposed (GMT) format containing gene sets and their corresponding HGNC symbols. You must also upload mapping files that link the pathways in your database to those in each of the other databases you select as input.

For instance, if you wish to run the comparative analysis on KEGG, Reactome and your uploaded database, you first need to create mappings between KEGG and your uploaded database as well as between Reactome and your uploaded database. A mapping pairs equivalent pathways: for example, the 'Citric acid cycle (TCA cycle)' pathway from Reactome is equivalent to 'TCA Cycle' from WikiPathways.
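For reference, each line of a GMT file holds a gene set name, a description and the member genes, separated by tabs. The sketch below writes gene sets in that layout; the function name and the 'my_tca_cycle' gene set are hypothetical, and using NA as the description is an assumption rather than a documented DecoPath requirement.

```python
def to_gmt(gene_sets, description="NA"):
    """Serialize {gene set name: [HGNC symbols]} as GMT text.

    Each GMT line holds a gene set name, a description and the member
    genes, all separated by tabs.
    """
    lines = []
    for name, genes in gene_sets.items():
        lines.append("\t".join([name, description, *genes]))
    return "\n".join(lines) + "\n"


# Hypothetical gene set used only for illustration.
custom_sets = {"my_tca_cycle": ["ACO2", "CS", "IDH2"]}
print(to_gmt(custom_sets))
```

This sketch covers only the gene set file. The layout of the mapping files is not described in this FAQ, so it is not sketched here.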

You can see example files here:

If you have prepared these files, you can follow this link to run DecoPath using your preferred gene set databases.

Managing your account

Do I have to register to use DecoPath?

You must register with an email address to use DecoPath. Once you have registered, you can view the experiments you have run on the Experiments page.

How can I delete my account?

You can delete your account at any time on the Account page.