Jose Carbonell, 01/15/2010 01:21 pm

FatiGO takes two lists of genes (ideally a group of interest and the rest of the genes in the experiment, although any two groups, formed in any way, can be tested against each other) and converts them into two lists of GO terms using the corresponding gene-GO association table. A Fisher's exact test for 2×2 contingency tables is then used to check for significant over-representation of GO terms in one of the sets with respect to the other. Multiple-testing correction is applied, as previously described, to account for the multiple hypotheses tested (one for each GO term).
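The per-term test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FatiGO's actual implementation: the function names, the toy gene names and the `annotation` mapping are hypothetical, and the two-sided p-value is computed by summing the hypergeometric probabilities of all tables that are no more likely than the observed one, which is one common definition.

```python
from math import comb


def term_table(term, list1, list2, annotation):
    """Build the 2x2 table for one term: annotated / not annotated in each list.

    `annotation` is a hypothetical gene -> set-of-terms mapping standing in
    for the gene-GO association table.
    """
    in1 = sum(1 for gene in list1 if term in annotation.get(gene, ()))
    in2 = sum(1 for gene in list2 if term in annotation.get(gene, ()))
    return in1, len(list1) - in1, in2, len(list2) - in2


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x annotated genes in the first list.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is as likely as, or less likely than, the observed one.
    # The small relative tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Toy data (hypothetical genes and term name).
annotation = {
    "g1": {"TERM_X"}, "g2": {"TERM_X"}, "g3": {"TERM_X"}, "g4": set(),
    "g5": {"TERM_X"}, "g6": set(), "g7": set(), "g8": set(),
}
list1 = ["g1", "g2", "g3", "g4"]
list2 = ["g5", "g6", "g7", "g8"]
table = term_table("TERM_X", list1, list2, annotation)
p_value = fisher_exact_two_sided(*table)
```

In a real analysis one such table and p-value would be computed for every term found in either list, and the resulting p-values would then be corrected for multiple testing.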

In addition to Gene Ontology terms (Ashburner et al., 2000), it can test simultaneously for KEGG pathways (Kanehisa et al., 2004), InterPro motifs (Mulder et al., 2003), Swiss-Prot keywords (Boeckmann et al., 2003), microRNAs (Griffiths-Jones et al., 2006), TFBSs (Wingender et al., 2000), cisRED motifs (Robertson et al., 2006) and BioCarta pathways. The distribution of any combination (or all) of these terms between two groups of genes can be tested simultaneously by means of Fisher's exact test. All p-values are adjusted by FDR. The functionality of the old modules FatiWise and TransFat (Al-Shahrour et al., 2005) and FatiGO+ (Al-Shahrour et al., 2007) has been completely included here and, consequently, these modules have been discontinued.
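As an illustration of the FDR adjustment, the following Python sketch applies the Benjamini-Hochberg step-up procedure to a list of p-values. The text above does not state which FDR method FatiGO uses, so the choice of Benjamini-Hochberg is an assumption made for this example, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: BH is used here only as a common FDR procedure; the
    documentation does not specify the method FatiGO applies.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# One raw p-value per tested term (made-up numbers for illustration).
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
```

Terms whose adjusted p-value falls below the chosen threshold would then be reported as significantly over-represented.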

Data and format

FatiGO supports many gene identifiers for each organism (HGNC symbol, EMBL acc, UniProt/Swiss-Prot, UniProtKB/TrEMBL, Ensembl IDs, RefSeq, EntrezGene, Affymetrix, Agilent, PDB, Protein Id, IPI…); the supported identifiers can be checked in the ID converter. These identifiers must be annotated in Ensembl: any gene not annotated in Ensembl will be lost in the analysis (please see the Ensembl documentation).

The input is a list or a plain text file with one gene or protein identifier per line. See an example list of Saccharomyces cerevisiae identifiers:
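To make the one-identifier-per-line format concrete, here is a minimal Python sketch for reading such a list. The function name `read_identifiers` is hypothetical, skipping blank lines and trimming whitespace are assumptions about sensible handling rather than documented FatiGO behaviour, and the two identifiers are placeholders chosen only to show the layout.

```python
import io


def read_identifiers(handle):
    """Read one gene or protein identifier per line from a text handle.

    Assumption: surrounding whitespace is trimmed and blank lines are ignored.
    """
    identifiers = []
    for line in handle:
        identifier = line.strip()
        if identifier:
            identifiers.append(identifier)
    return identifiers


# Placeholder content standing in for an uploaded file.
sample = io.StringIO("YAL001C\n\n  YBR072W  \n")
genes = read_identifiers(sample)
```

The same function works on a real file opened with `open(path)`, since it only iterates over lines.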



  • Al-Shahrour, F., Minguez, P., Tárraga, J., Medina, I., Alloza, E., Montaner, D., & Dopazo, J. (2007). FatiGO+: a functional profiling tool for genomic data. Integration of functional annotation, regulatory motifs and interaction data with microarray experiments. Nucleic Acids Research 35 (Web Server issue): W91-96
  • Al-Shahrour, F., Minguez, P., Vaquerizas, J.M., Conde, L. & Dopazo, J. (2005). BABELOMICS: a suite of web-tools for functional annotation and analysis of group of genes in high-throughput experiments. Nucleic Acids Research, 33 (Web Server issue): W460-W464
  • Al-Shahrour, F., Díaz-Uriarte, R. & Dopazo, J. (2004). FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics 20: 578-580
  • Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T., Harris, M.A., Hill, D.P., Issel-Traver, L., Kasarskis, A., Lewis, S., Matese, J.C., Richardson, J.E., Ringwald, M., Rubin, G.M., Sherlock G. (2000) Gene Ontology: tool for the unification of biology. Nat. Genet. 25, 25-29.
  • Boeckmann B., Bairoch A., Apweiler R., Blatter M.-C., Estreicher A., Gasteiger E., Martin M.J., Michoud K., O'Donovan C., Phan I., Pilbout S. & Schneider M. (2003). The Swiss-Prot Protein Knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res. 31: 365-370
  • Griffiths-Jones S., Grocock R.J., van Dongen S., Bateman A. & Enright A.J. (2006). miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Research, 34 (Database issue): D140-D144
  • Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M. (2004) The KEGG resource for deciphering the genome. Nucleic Acids Res.32:D277-280
  • Mulder N.J., Apweiler R., Attwood T.K., Bairoch A., Barrell D., Bateman A., Binns D., Biswas M., Bradley P., Bork P., Bucher P., Copley R.R., Courcelle E., Das U., Durbin R., Falquet L., Fleischmann W., Griffiths-Jones S., Haft D., Harte N., Hulo N., Kahn D., Kanapin A., Krestyaninova M., Lopez R., Letunic I., Lonsdale D., Silventoinen V., Orchard S.E., Pagni M., Peyruc D., Ponting C.P., Selengut J.D., Servant F., Sigrist C.J.A., Vaughan R, Zdobnov E.M. (2003) The InterPro Database, 2003 brings increased coverage and new features. Nucl. Acids. Res. 31: 315-318.
  • Robertson, G., Bilenky, M., Lin, K., He, A., Yuen, W., Dagpinar, M., Varhol, R., Teague, K., Griffith, O.L., Zhang, X. et al. (2006) cisRED: a database system for genome-scale computational discovery of regulatory elements. Nucleic Acids Res. 34, D68-73
  • Wingender, E., Chen, X., Hehl, R., Karas, H., Liebich, I., Matys, V., Meinhardt, T., Prüß, M., Reuter, I. and Schacherer, F. (2000). TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res. 28, 316-319

example.motor (753 Bytes) Jose Carbonell, 01/15/2010 01:36 pm

example.apoptosis (628 Bytes) Jose Carbonell, 01/15/2010 01:36 pm

example.overinteracting_05.txt - yeast.overinteracting.proteins (8.1 kB) Ana Conesa, 05/12/2010 12:29 am

example.underinteracting_05.txt - yeast.underinteracting.proteins (10.1 kB) Ana Conesa, 05/12/2010 12:30 am