Documentation

Understanding RESTful WS API

The general structure of a RESTful call is:

ws.bioinfo.cipf.es/cellbase/rest/{version}/{species}/{category}/{subcategory}/{id}/{resource}?{filters}

Sections in braces are parameters, i.e. variables to be replaced with actual values.

Example:

ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/feature/gene/BRCA2/transcript

As explained in this documentation, this REST call retrieves all the transcripts of the human gene BRCA2 in the latest version.

Note: All sections of the URL must be in lower case!
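
The URL structure above can be sketched as a small helper. This is only an illustration: the function name build_url and the http:// scheme are assumptions, not part of CellBase. Path sections are lower-cased as the note requires, while the ID is passed through unchanged because the example above uses BRCA2.

```python
# Assumption: the service is reached over plain HTTP; the document gives no scheme.
BASE_URL = "http://ws.bioinfo.cipf.es/cellbase/rest"


def build_url(version, species, category, subcategory, id_, resource):
    """Assemble a REST URL from its path sections (hypothetical helper)."""
    # Lower-case every section except the ID, which is kept as given.
    head = [s.lower() for s in (version, species, category, subcategory)]
    return "/".join([BASE_URL, *head, id_, resource.lower()])


print(build_url("latest", "hsa", "feature", "gene", "BRCA2", "transcript"))
# -> http://ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/feature/gene/BRCA2/transcript
```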

Version and species

Versions are numbered v1, v2, v3, ... At present the latest stable version is v2. The latest stable version can also always be addressed as latest.

Species available for version v2 are:

Common name  Species                   Short name  Long name
Human        Homo sapiens              hsa         hsapiens
Mouse        Mus musculus              mmu         mmusculus
Rat          Rattus norvegicus         rno         rnorvegicus
Zebrafish    Danio rerio               dre         drerio
Fruitfly     Drosophila melanogaster   dme         dmelanogaster
Worm         Caenorhabditis elegans    cel         celegans
Yeast        Saccharomyces cerevisiae  sce         scerevisiae
Dog          Canis familiaris          cfa         cfamiliaris
Pig          Sus scrofa                ssc         sscrofa
Mosquito     Anopheles gambiae         aga         agambiae
Plasmodium   Plasmodium falciparum     pfa         pfalciparum

Both the short three-letter code and the long name are accepted, e.g.:

ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/genomic/region/3:1000-200000/gene

must give the same result as:

ws.bioinfo.cipf.es/cellbase/rest/latest/hsapiens/genomic/region/3:1000-200000/gene
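
Since both forms are accepted, a client may want to normalise species names before building a URL. A minimal sketch, with the codes copied from the species table for v2; the names SPECIES and short_code are hypothetical and not part of CellBase:

```python
# Short code -> long name, copied from the species table for version v2.
SPECIES = {
    "hsa": "hsapiens",
    "mmu": "mmusculus",
    "rno": "rnorvegicus",
    "dre": "drerio",
    "dme": "dmelanogaster",
    "cel": "celegans",
    "sce": "scerevisiae",
    "cfa": "cfamiliaris",
    "ssc": "sscrofa",
    "aga": "agambiae",
    "pfa": "pfalciparum",
}
LONG_TO_SHORT = {long_name: short for short, long_name in SPECIES.items()}


def short_code(name):
    """Return the three-letter code for either accepted form of a species name."""
    key = name.strip().lower()
    if key in SPECIES:
        return key
    if key in LONG_TO_SHORT:
        return LONG_TO_SHORT[key]
    raise ValueError(f"unknown species: {name!r}")


print(short_code("hsapiens"))  # -> hsa
```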

Categories and subcategories

There are 4 main categories:

Category: Genomic
Description: refers to the coordinates that locate a position in the genome.
Subcategories: Position, Variant, Region

Category: Feature
Description: covers all elements that have a defined location on the genome, and provides an easy way to retrieve cross references for an ID.
Subcategories: Gene, SNP, Transcript, Protein, Xref, ...

Category: Regulatory
Description: refers to all regulatory interactions involving transcription factors and microRNAs.
Subcategories: TFBSs, miRNAs

Category: Network
Description: refers to all types of networks and pathways, including the protein interactome, the regulatory network and Reactome.
Subcategories: Pathway

Help and metadata

The available main categories are listed at:

ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/

The available subcategories of each main category are listed at:

ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/genomic
ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/feature
ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/regulatory
ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/network

The subcategory specifies the type of the ID field. Subcategories differ between categories and are described within each of the categories mentioned above.

The usage of each subcategory is shown by appending /help to the subcategory name, for example:

ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/feature/gene/help
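
The listing and help URLs follow a regular pattern, which can be sketched as below. The function names and the http:// scheme are assumptions made for illustration; only the URL layout comes from this page.

```python
# Assumption: the service is reached over plain HTTP; the document gives no scheme.
BASE_URL = "http://ws.bioinfo.cipf.es/cellbase/rest"
CATEGORIES = ("genomic", "feature", "regulatory", "network")


def categories_url(species="hsa", version="latest"):
    """URL that lists the main categories."""
    return f"{BASE_URL}/{version}/{species}/"


def subcategories_url(category, species="hsa", version="latest"):
    """URL that lists the subcategories of one main category."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    return f"{BASE_URL}/{version}/{species}/{category}"


def help_url(category, subcategory, species="hsa", version="latest"):
    """URL that shows the usage of one subcategory."""
    return f"{subcategories_url(category, species, version)}/{subcategory}/help"


print(help_url("feature", "gene"))
# -> http://ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/feature/gene/help
```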

Id field

The ID is the query parameter: the feature or term about which we want to retrieve information (through a resource or action). Its type must correspond to the subcategory.

NOTE: to improve performance, several IDs can be passed in a single REST call, separated by commas. At most 200 IDs are allowed per call, e.g.:

ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/genomic/region/3:1000-200000,X:35-459000,4:2334-555555/gene
ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/feature/gene/brca2,bcl2,p53/snp
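
For longer ID lists, a client has to split the list so that no call exceeds the 200-ID limit. A minimal sketch of that batching, assuming a hypothetical helper named batch_ids; the IDs in the usage lines are placeholders, not real gene names:

```python
MAX_IDS_PER_CALL = 200  # limit stated in the note above


def batch_ids(ids, size=MAX_IDS_PER_CALL):
    """Yield comma-joined ID strings holding at most `size` IDs each."""
    ids = list(ids)
    for start in range(0, len(ids), size):
        yield ",".join(ids[start:start + size])


# Placeholder IDs, used only to show how the list is split.
placeholder_ids = [f"id{i}" for i in range(450)]
batches = list(batch_ids(placeholder_ids))
print([batch.count(",") + 1 for batch in batches])  # -> [200, 200, 50]
```

Each yielded string can be used directly as the ID section of one REST call.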

Resources and actions

Each category and subcategory can allow different resources and actions. They specify the type of result we want to obtain for the ID, e.g.:

ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/feature/gene/brca2/transcript
ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/genomic/region/3:1000-200000/gene
ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/genomic/region/3:100000-200000/snp

Resources and actions must always be written in the singular.

Filters and extra-options

We can add filters or extra options to the REST query. Some of them can be applied to all queries; others are specific to each subcategory.

  • General options, all of them optional:
      • output format: coded as of; allowed values are text, json, zip. Default: text. E.g.:

ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/genomic/region/3:10-2000/gene?of=json

      • separator: the character used to separate columns in the results. Default: tab.
      • user: in case of a private version.
      • password: in case of a private version.
  • Specific options:

Every resource or action may have different filters or options, which are described in each category and subcategory, e.g.:

ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/genomic/region/3:1000-200000/gene?biotype=protein_coding,mirna_gene
ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/genomic/region/3:100000-200000/snp?consequence_type=non_synonymous_coding,splice_site
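
Filters are ordinary query-string parameters, with multiple values joined by commas. A minimal sketch of appending them to a URL, using Python's standard urllib.parse.urlencode; the helper name add_filters is hypothetical, and the filter names are the ones shown in the examples above:

```python
from urllib.parse import urlencode


def add_filters(url, **filters):
    """Append filters to a REST URL as a query string (hypothetical helper)."""
    params = {}
    for name, value in filters.items():
        # Several values for one filter are joined with commas.
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        params[name] = value
    if not params:
        return url
    # safe="," keeps the commas literal instead of encoding them as %2C.
    return url + "?" + urlencode(params, safe=",")


url = "ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/genomic/region/3:1000-200000/gene"
print(add_filters(url, biotype=["protein_coding", "mirna_gene"]))
# -> ws.bioinfo.cipf.es/cellbase/rest/latest/hsa/genomic/region/3:1000-200000/gene?biotype=protein_coding,mirna_gene
```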

Releases and downloads

The current release is v2 and comprises 11 species. For details about database versions and new web services, follow the release notes link:

Release notes v2.0

To view details about older releases:

All releases

To download the SQL schema, diagram and data files:

Downloads

Perl RESTful WS client

Users can query CellBase using the CLI (Command-line Interface) with Perl. The script can be downloaded here: cellbase_client.pl

Some parameters must be specified for the correct retrieval of the results.

  • --species: Name of the species, either as the three-letter code (e.g. hsa) or in the abbreviated form (e.g. hsapiens). Default: Homo sapiens.
  • --input-type or --i: the type of your input IDs. Following the category and subcategory structure, the user must indicate the appropriate subcategory.
  • --id: ID or IDs to query. To query more than one identifier, use whitespace ' ' or comma ',' to separate them (e.g. BRCA2 BCL2 CDKN2A).
  • --file or --f: File of IDs, one ID per line.
  • --get: the resource or action parameter. Different resources are available depending on the category and subcategory used.
  • --outfile or --o: Name of the output file. If no name is specified, the information is printed to the standard output (STDOUT).
  • --verbose: Print logs and parameters used.
  • --help: Print help.

Some examples:

./cellbase_client.pl --input-type gene --id BRCA2 --get tfbs
./cellbase_client.pl --input-type protein --id BRCA2,BCL2,CDKN2A --get info
./cellbase_client.pl --input-type mirna_gene --id hsa-mir-149 --get disease
./cellbase_client.pl --input-type gene --file my_gene_list.txt --get info
./cellbase_client.pl --species mmu --i id --file mouse_genes.txt --get xref?dbname=ensembl_gene
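
When the client is driven from another program, the command line can be assembled from the options listed above. A minimal sketch in Python, assuming the script sits in the current directory; the function client_command and the rule that exactly one of ids or id_file is given are assumptions of this sketch, not behaviour documented for cellbase_client.pl.

```python
import shlex


def client_command(input_type, get, ids=None, id_file=None, species=None,
                   outfile=None, verbose=False, script="./cellbase_client.pl"):
    """Build the argument list for the Perl client (hypothetical helper)."""
    # Assumption of this sketch: IDs come either inline or from a file, not both.
    if (ids is None) == (id_file is None):
        raise ValueError("give exactly one of ids or id_file")
    argv = [script]
    if species:
        argv += ["--species", species]
    argv += ["--input-type", input_type]
    if ids is not None:
        argv += ["--id", ",".join(ids)]
    else:
        argv += ["--file", id_file]
    argv += ["--get", get]
    if outfile:
        argv += ["--outfile", outfile]
    if verbose:
        argv.append("--verbose")
    return argv


print(shlex.join(client_command("protein", "info", ids=["BRCA2", "BCL2", "CDKN2A"])))
# -> ./cellbase_client.pl --input-type protein --id BRCA2,BCL2,CDKN2A --get info
```

The returned list can be passed as-is to subprocess.run, which avoids shell quoting issues with characters such as '?' in the --get value.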