- Functional profiling documentation
- Single enrichment analysis
- Set enrichment analysis
- Module enrichment analysis
- Tissue phenotype based profiling
- De novo annotations (Blast2GO)
Functional profiling documentation¶
Single enrichment analysis¶
This is the conventional enrichment test. FatiGO takes two lists of genes (ideally a group of interest and the rest of the genes in the experiment, although any two groups, formed in any way, can be tested against each other) and converts them into two lists of annotations using the corresponding gene-annotation association table. Annotations can be GO terms, pathways (KEGG, BioCarta, Reactome), regulatory annotations (TFBSs, miRNA targets), etc. A Fisher's exact test for 2×2 contingency tables is then used to check for significant over-representation of each annotation in one of the sets with respect to the other. Because one hypothesis is tested per annotation, a multiple testing correction is applied.
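The test described above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not FatiGO's own code: the function names, the two-sided form of the test, and the choice of Benjamini-Hochberg as the multiple testing correction are all assumptions, since the text does not name the correction used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    a, b: genes of list 1 with / without the annotation
    c, d: genes of list 2 with / without the annotation
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x annotated genes falling in list 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed correction method)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def enrich(list1, list2, annotations):
    """Test every annotation found in either gene list.

    annotations: dict mapping gene -> set of annotation terms (hypothetical
    stand-in for the gene-annotation association table).
    Returns {term: (raw p-value, adjusted p-value)}.
    """
    genes1, genes2 = set(list1), set(list2)
    terms = sorted({t for g in genes1 | genes2 for t in annotations.get(g, ())})
    pvals = []
    for t in terms:
        a = sum(t in annotations.get(g, ()) for g in genes1)
        c = sum(t in annotations.get(g, ()) for g in genes2)
        pvals.append(fisher_exact_two_sided(a, len(genes1) - a, c, len(genes2) - c))
    return dict(zip(terms, zip(pvals, benjamini_hochberg(pvals))))
```

Calling `enrich(list1, list2, annotations)` with two gene lists and a gene-to-terms dictionary returns, for each term, the raw and adjusted p-values, which can then be filtered at the chosen significance level.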
Marmite. Enrichment analysis using text-mining derived annotations¶
Marmite stands for My Accurate Resource for MIning TExt and implements single enrichment analysis with text-mining derived annotations. Text-mining methods make it possible to extract informative annotations (bioentities) with functional, chemical, clinical and other meanings that can be associated with genes. Here, the association of an annotation with a gene has a strength derived from the number of times the gene and the annotation are co-cited in a PubMed abstract. A Kolmogorov-Smirnov test is used instead of the conventional Fisher's exact test. Because one hypothesis is tested per annotation, a multiple testing correction is applied.
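To illustrate why a Kolmogorov-Smirnov test fits weighted annotations, the sketch below computes the two-sample KS statistic for one bioentity, given the co-citation scores of the genes in each list. It is a minimal sketch under assumptions: the function name and the input layout (one list of scores per gene list) are hypothetical, and the p-value calculation is left out.

```python
from bisect import bisect_right


def ks_statistic(scores1, scores2):
    """Two-sample Kolmogorov-Smirnov statistic.

    scores1, scores2: co-citation scores of one bioentity for the genes of
    each list (hypothetical input layout). Returns the largest absolute
    difference between the two empirical cumulative distributions.
    """
    xs, ys = sorted(scores1), sorted(scores2)
    d = 0.0
    for v in xs + ys:
        # Empirical CDFs of both samples evaluated at v.
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d
```

A statistic close to 1 means the bioentity is cited much more strongly with one list than with the other; a value close to 0 means the score distributions are nearly identical. In practice a library routine such as `scipy.stats.ks_2samp` returns both the statistic and its p-value.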
SNOW. Network enrichment analysis¶
The SNOW tool introduces protein-protein interaction data into the functional profiling of genomic data. SNOW performs two different and complementary types of analysis on the submitted list of proteins/genes:
- SNOW identifies hubs in the list of proteins/genes (nodes) and evaluates the global degree of connections, centrality and clustering by comparing the distributions of these parameters for the nodes in the list against their distributions in the complete interactome.
- SNOW calculates the MCN, the minimum network that connects the proteins/genes in the list, with or without using an external node (a non-listed protein) to connect nodes in the list. The topology of this network is evaluated by comparing the distributions of its node parameters against those of a set of random MCNs in the same size range. This approach is similar to that of other functional enrichment tools such as FatiGO or Marmite, with the difference that there are no pre-annotated functional modules to evaluate; instead, SNOW has to build the module itself, namely the MCN.
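The two analyses above can be made concrete with a toy interactome. The sketch below is one simple reading of the description, not SNOW's actual algorithm: the interactome is assumed to be a dictionary mapping each protein to the set of its interaction partners, the function names are hypothetical, and an external node is admitted only when it directly links at least two listed proteins. The comparison against random lists is not shown.

```python
def degrees(interactome, nodes):
    """Number of interaction partners of each node in the full interactome."""
    return {n: len(interactome.get(n, ())) for n in nodes}


def connecting_network(interactome, listed, allow_external=True):
    """Edges of a small network linking the listed proteins.

    interactome: dict mapping protein -> set of partners (assumed symmetric).
    Direct interactions between listed proteins are always kept. When
    allow_external is True, a non-listed protein is added if it interacts
    with two or more listed proteins, together with those interactions.
    """
    listed = set(listed)
    edges = set()
    for u in listed:
        for v in interactome.get(u, ()):
            if v in listed and v != u:
                edges.add(frozenset((u, v)))
    if allow_external:
        for ext, partners in interactome.items():
            if ext in listed:
                continue
            hits = listed & set(partners)
            if len(hits) >= 2:
                edges.update(frozenset((ext, h)) for h in hits)
    return edges
```

Running `connecting_network` on the submitted list and on many random lists of the same size would give the reference distributions against which the node parameters of the real network are compared.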
Set enrichment analysis¶
Gene set analysis¶
Gene set methods are much more sensitive than single enrichment methods in detecting gene sets (defined as sets of genes with a common annotation) that show a collective behaviour in a genomic experiment. These methods very efficiently detect gene sets (annotations) that are consistently associated with high or low values in a ranked list of genes.
Two methods are implemented here: FatiScan, a segmentation test, and logistic regression. Both detect asymmetrical distributions of annotations within ranked lists of genes.
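A segmentation test can be sketched as follows: the ranked list is cut at several positions, and at each cut the genes above and below it are compared for one annotation with a Fisher's exact test. This is a minimal sketch assuming evenly spaced cuts and a two-sided test; the function names and the `n_partitions` parameter are hypothetical and not taken from FatiScan itself.

```python
from math import comb


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def segmentation_test(ranked_genes, annotated, n_partitions=4):
    """Test one annotation at several cuts of a ranked gene list.

    ranked_genes: genes ordered by the experimental value.
    annotated: set of genes carrying the annotation.
    Returns a list of (cut position, p-value), one entry per cut.
    """
    n = len(ranked_genes)
    results = []
    for k in range(1, n_partitions + 1):
        # Evenly spaced cut positions (assumption of this sketch).
        cut = round(k * n / (n_partitions + 1))
        upper, lower = ranked_genes[:cut], ranked_genes[cut:]
        a = sum(g in annotated for g in upper)
        c = sum(g in annotated for g in lower)
        results.append((cut, fisher_two_sided(a, len(upper) - a, c, len(lower) - c)))
    return results
```

An annotation concentrated at the top or bottom of the ranking yields a small p-value at one or more cuts. Since every annotation is tested at every cut, the resulting p-values would still need a multiple testing correction.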
MarmiteScan expands the concept of gene set analysis from gene sets defined with conventional annotations (GO, KEGG, etc.) to text-mining derived annotations (bioentities).
This is a novel web-based resource that checks whether pathways (or GO terms) are associated with diseases (or any other trait) in genome-wide association studies (GWAS) based on SNPs or CNVs.
Module enrichment analysis¶
The GeneCodis method searches for annotations that frequently co-occur in a set of genes and ranks them by statistical significance. The analysis of concurrent annotations provides significant information for the biological interpretation of genomic data and offers a perspective complementary to that of the conventional enrichment methods.
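The idea of ranking co-occurring annotations can be illustrated with pairs of terms. The sketch below counts how many genes in a study set carry both terms of each pair and scores the pair with a hypergeometric tail probability against a background set. It is a simplified illustration, not the GeneCodis algorithm: restricting combinations to pairs, the `min_support` threshold, the hypergeometric scoring and all function names are assumptions.

```python
from collections import Counter
from itertools import combinations
from math import comb


def pair_counts(gene_annotations):
    """Count, for every pair of terms, the genes annotated with both."""
    counts = Counter()
    for terms in gene_annotations.values():
        counts.update(combinations(sorted(terms), 2))
    return counts


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the pair."""
    hits = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, min(n, K) + 1))
    return hits / comb(N, n)


def rank_pairs(study, background, min_support=2):
    """Rank co-occurring term pairs in the study set by p-value.

    study, background: dicts mapping gene -> set of terms; the study genes
    are assumed to be a subset of the background genes.
    Returns a sorted list of (p-value, pair, genes in study with the pair).
    """
    bg = pair_counts(background)
    scored = [
        (hypergeom_tail(k, len(study), bg[pair], len(background)), pair, k)
        for pair, k in pair_counts(study).items()
        if k >= min_support
    ]
    return sorted(scored)
```

The first entries of the returned list are the term pairs whose joint occurrence in the study set is least expected by chance under this simplified model.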
Tissue phenotype based profiling¶
Resource to extract differences between the distributions of the expression values of two groups of genes in a set of tissues. To broaden the possibilities of your analysis and to cover most of the experiments users are interested in, we provide data from two types of platforms: SAGE Tags and Microarray (Affymetrix) expression data.
Tissue phenotype profiling based on SAGE Tags.
Tissue phenotype profiling based on Microarray Affymetrix expression data.
De novo annotations (Blast2GO)¶
Tool for the functional annotation of (novel) sequence data. The annotations produced can be used for further functional interpretation of genomic data by the different enrichment methods or gene-set methods described above.