Frequently Asked Questions
Running GSEA with DecoPath
How can I run GSEA using DecoPath?
To run GSEA using DecoPath, you will need:
- An expression dataset
- A file containing class labels (e.g., normal and tumor) for samples in the expression dataset.
What kind of data can I submit to run my experiments?
You can submit RNA-Seq, microarray, and ChIP-Seq data. Gene identifiers must be HUGO Gene Nomenclature Committee (HGNC) symbols.
How do I format expression data to run GSEA with DecoPath?
The expression data you submit to run GSEA should be in a comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) file. The first and second column labels should be 'name' and 'description', respectively. All subsequent column labels should be the sample identifiers. Each row should contain an HGNC symbol, a description (which can be left as NA), and then the expression value of that gene in each sample:
gene_symbol | description | sample_1 | sample_2 | sample_3 |
---|---|---|---|---|
TSPAN6 | NA | 17.844 | 15.288 | 19.747 |
CFH | NA | 145.610 | 209.655 | 93.282 |
C1orf112 | NA | 0.387 | 0.562 | 0.144 |
Much like the categorical class (CLS) file format, the class labels file (a comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) file) should specify the class label of each sample in the dataset.
For example, for a dataset with expression values for samples that fall into two distinct classes (e.g., normal and tumor), the class labels file should have a column with the same sample identifiers as the ones in the expression dataset file you submit and another column with their corresponding class labels. Please ensure the order of the samples in the expression dataset is the same as the order of samples in the class labels file. Below is an example of a class labels file:
identifier | class_label |
---|---|
sample_1 | normal |
sample_2 | normal |
sample_3 | tumor |
sample_4 | tumor |
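The ordering requirement can be checked before uploading. The following is a minimal sketch, assuming comma-separated files laid out as in the examples above; the function name `sample_order_matches` is hypothetical and is not part of DecoPath.

```python
import csv
import io


def sample_order_matches(expression_text, labels_text, delimiter=","):
    """Return True if both files list the same samples in the same order."""
    # In the expression file, sample identifiers are the header columns
    # after the first two (gene symbol and description).
    expression_header = next(csv.reader(io.StringIO(expression_text), delimiter=delimiter))
    expression_samples = [name.strip() for name in expression_header[2:]]

    # In the class labels file, sample identifiers are the first column,
    # skipping the header row.
    label_rows = list(csv.reader(io.StringIO(labels_text), delimiter=delimiter))
    label_samples = [row[0].strip() for row in label_rows[1:] if row]

    return expression_samples == label_samples


expression_csv = (
    "gene_symbol,description,sample_1,sample_2,sample_3\n"
    "TSPAN6,NA,17.844,15.288,19.747\n"
)
labels_csv = (
    "identifier,class_label\n"
    "sample_1,normal\n"
    "sample_2,normal\n"
    "sample_3,tumor\n"
)
print(sample_order_matches(expression_csv, labels_csv))
```

The check compares the two lists directly, so it fails both when a sample is missing from one file and when the same samples appear in a different order.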
Can I run GSEA against my own list of ranked genes?
Yes, GSEA can be used to analyze a pre-ranked list of genes using DecoPath.
To run GSEA pre-ranked, upload a file in a comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) format that contains the following 2 columns:
- HGNC symbols
- Class difference metric for the HGNC symbols
In this case, do not include a header (column names) in the file. See the example below:
TSPAN6 | 2.1 |
CFH | 1.5 |
C1orf112 | 0.6 |
AKT1 | -1.3 |
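A file in this layout can be produced with a few lines of Python. This is a minimal sketch: the function name `write_ranked_list` is hypothetical, the tab delimiter is one of the accepted options, and sorting from highest to lowest metric is an assumption based on the example above rather than a documented requirement.

```python
import csv
import io


def write_ranked_list(scores, handle, delimiter="\t"):
    """Write gene/metric pairs, highest metric first, without a header row."""
    writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    for gene, metric in ranked:
        writer.writerow([gene, metric])


# Values taken from the example table above.
scores = {"AKT1": -1.3, "CFH": 1.5, "TSPAN6": 2.1, "C1orf112": 0.6}
buffer = io.StringIO()
write_ranked_list(scores, buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with `open(path, "w", newline="")` as `handle`.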
Running ORA with DecoPath
How can I run ORA using DecoPath?
To run ORA using DecoPath, you will need:
- A gene list
What kind of file can I submit to run ORA?
To run ORA, submit a file that contains a single column with HGNC symbols in a comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) format.
Submitting your own results
Can I submit my own ORA or GSEA results?
You can also submit the results of an enrichment analysis and use the DecoPath visualizations and functionalities to compare and explore the consensus around different pathway databases. If you do opt to submit your own results, you must use the databases provided in DecoPath. We strongly recommend using gene set files from the following links:
How can I submit results for ORA?
If you submit ORA results, ensure they are in a comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) file that contains the columns pathway, p_value and q_value. For example:
pathway | p_value | q_value |
---|---|---|
WP1403 | 1.18e-06 | 9.35e-06 |
WP411 | 2.61e-09 | 5.72e-08 |
hsa04210 | 0.074 | 0.1 |
How can I submit results for GSEA?
If you submit GSEA results, ensure they are in a comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) file that contains the columns pathway, es, nes, p_value and q_value. For example:
pathway | es | nes | p_value | q_value |
---|---|---|---|---|
WP2636 | -0.25 | -0.25 | 0.2 | 0.22 |
hsa05031 | -0.3 | -0.13 | 2.61e-07 | 2.61e-09 |
hsa04210 | 0.074 | 0.1 | 2.3e-02 | 1.2e-03 |
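The column requirements for ORA and GSEA results can be checked before submission. The following is a minimal sketch, assuming a comma-separated file with a header row; `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names used only for illustration.

```python
import csv
import io

# Required column names as listed in the two answers above.
REQUIRED_COLUMNS = {
    "ora": ["pathway", "p_value", "q_value"],
    "gsea": ["pathway", "es", "nes", "p_value", "q_value"],
}


def missing_columns(results_text, analysis, delimiter=","):
    """Return the required column names absent from the header of a results file."""
    header = next(csv.reader(io.StringIO(results_text), delimiter=delimiter))
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS[analysis] if name not in present]


ora_results = "pathway,p_value,q_value\nWP1403,1.18e-06,9.35e-06\n"
print(missing_columns(ora_results, "ora"))   # nothing missing for ORA
print(missing_columns(ora_results, "gsea"))  # es and nes are missing for GSEA
```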
Performing differential gene expression analysis
Do I need to upload fold changes or files to generate them to use DecoPath?
No, this is an optional field and can be skipped.
You can run DecoPath without performing differential expression analysis. If you do provide fold changes, we also generate visualizations that identify genes that are differentially expressed according to a fold change cutoff.
What kind of format do I need to upload fold changes in?
The file you submit containing log2 fold changes should be a comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) file with the columns gene_symbol, log2fc, p_value and q_value. For GSEA, please ensure that the log2 fold changes come from the same experimental groups as the expression dataset or GSEA results file you submit. For ORA, please ensure that they are the log2 fold changes of the genes in your gene list. For example:
gene_symbol | log2fc | p_value | q_value |
---|---|---|---|
AK2 | -0.16 | 0.0017 | 0.002 |
CD38 | 1.97 | 0.1 | 0.11 |
FKBP4 | -0.79 | 4.15e-5 | 2.59e-4 |
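To illustrate what a fold change cutoff selects, the sketch below reads a file in this layout and keeps genes whose absolute log2 fold change meets a threshold. It is an illustration only: the function name `genes_passing_cutoff` and the cutoff of 0.5 are hypothetical, and the way DecoPath applies its own cutoff is not described here.

```python
import csv
import io


def genes_passing_cutoff(fold_change_text, cutoff, delimiter=","):
    """Return gene symbols whose absolute log2 fold change is at least `cutoff`."""
    reader = csv.DictReader(io.StringIO(fold_change_text), delimiter=delimiter)
    return [
        row["gene_symbol"]
        for row in reader
        if abs(float(row["log2fc"])) >= cutoff
    ]


# Rows taken from the example table above.
fold_changes = (
    "gene_symbol,log2fc,p_value,q_value\n"
    "AK2,-0.16,0.0017,0.002\n"
    "CD38,1.97,0.1,0.11\n"
    "FKBP4,-0.79,4.15e-5,2.59e-4\n"
)
print(genes_passing_cutoff(fold_changes, 0.5))
```

Taking the absolute value keeps both up-regulated and down-regulated genes.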
If you would rather have the fold changes calculated for you, you will require two files (comma-separated (*.csv), tab-separated (*.tsv) or plain-text (*.txt) files):
- un-normalized read counts in the form of a matrix of integer values
- class labels
The files must be consistently ordered such that the samples in the counts matrix are in the same order as the samples in the class labels file. If you are running GSEA, you only need to submit one class labels file corresponding to both the expression dataset to perform GSEA and the counts matrix to calculate the fold changes. Example files are given below:
Counts matrix
gene_symbol | sample_1 | sample_2 | sample_3 |
---|---|---|---|
TSPAN6 | 2393 | 3187 | 4964 |
CFH | 641 | 182 | 404 |
C1orf112 | 321 | 474 | 198 |
Class labels
identifier | class_label |
---|---|
sample_1 | normal |
sample_2 | normal |
sample_3 | tumor |
sample_4 | tumor |
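Because the counts matrix must hold un-normalized integer values, it can help to scan it for stray decimals before uploading. The following is a minimal sketch, assuming a comma-separated file with a header row; the function name `non_integer_cells` is hypothetical, and rejecting negative numbers is an assumption based on the values being read counts.

```python
import csv
import io


def non_integer_cells(counts_text, delimiter=","):
    """Return (gene, sample, value) for every cell that is not a non-negative integer."""
    rows = csv.reader(io.StringIO(counts_text), delimiter=delimiter)
    header = [name.strip() for name in next(rows)]
    problems = []
    for row in rows:
        if not row:
            continue  # skip blank lines
        gene = row[0].strip()
        for sample, value in zip(header[1:], row[1:]):
            value = value.strip()
            # isdecimal() is False for "", "641.5", "-3" and "NA".
            if not value.isdecimal():
                problems.append((gene, sample, value))
    return problems


counts_csv = (
    "gene_symbol,sample_1,sample_2,sample_3\n"
    "TSPAN6,2393,3187,4964\n"
    "CFH,641.5,182,404\n"
)
print(non_integer_cells(counts_csv))
```

An empty list means every count passed the check.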
Running DecoPath with other databases
Can I run DecoPath with other databases/gene sets?
Currently, we only provide 4 databases (i.e., KEGG, Reactome, WikiPathways and PathBank) to run pathway analysis. You can add more databases by uploading files in the gene matrix transposed (GMT) format containing HGNC symbols and their corresponding gene sets. You must also upload corresponding mapping files between each database you select as input.
So, if you wish to run the comparative analysis on KEGG, Reactome and your uploaded database, you need to first create mappings between KEGG and your uploaded database as well as Reactome and your uploaded database. For example, 'Citric acid cycle (TCA cycle)' pathway from Reactome is equivalent to 'TCA Cycle' from WikiPathways.
You can see example files here:
If you have prepared these files, you can follow this link to run DecoPath using your preferred gene set databases.
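For reference, a GMT file is tab-delimited with one gene set per line: the gene set name, a description, and then the member genes. The sketch below writes such a file with the Python standard library. The function name `write_gmt`, the placeholder pathway names and the gene memberships are hypothetical and purely illustrative; the format of the mapping files is not covered here.

```python
import io


def write_gmt(gene_sets, handle, description="NA"):
    """Write one tab-separated line per gene set: name, description, genes."""
    for name, genes in gene_sets.items():
        handle.write("\t".join([name, description, *genes]) + "\n")


# Placeholder gene sets, not real pathway definitions.
gene_sets = {
    "example_pathway_1": ["AKT1", "CFH"],
    "example_pathway_2": ["TSPAN6"],
}
buffer = io.StringIO()
write_gmt(gene_sets, buffer)
print(buffer.getvalue())
```

Lines may have different numbers of fields, since each gene set can contain a different number of genes.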
Managing your account
Do I have to register to use DecoPath?
You must register with an email address to use DecoPath. Once you have registered, you can view the experiments you have run on the Experiments page.
You can delete your account at any time on the Account page.