Projects

  • Babelomics

    Babelomics is a complete suite of web tools for the analysis, integration and interpretation of different types of genomic data.

    Its methods for the analysis of gene expression data include normalization (covering the most common commercial platforms), pre-processing, differential gene expression (case-control, multiclass, survival or continuous values), predictors and several clustering methods....

  • Java REST WS Server

    A remote web API built on REST web services. REST web services provide a clean, lightweight interface to remote data and procedures. The project is written in Java and uses the jersey-server REST API to access biological data stored in a SQL cluster.

  • BioPax-SQL

    BioPax-SQL aims to store, in a single database, all the pathway information that is scattered across repositories in BioPax format. In addition, a pathway viewer has been developed to help users visualize and manipulate the data....

  • CellBase

    In recent years, advances in high-throughput technologies have produced an unprecedented growth in repositories and databases storing relevant biological data. There is now more biological information than ever, but the current state of many of these repositories is often far from optimal. Some of the most common problems are: a) information is spread across many small repositories and databases, b) a lack of standards between repositories, c) unsupported databases, d) specific and unconnected information, etc. These problems make it very difficult: a) to integrate or join many different sources into a single database for working with or analyzing experiments; b) to access and query this information programmatically....

  • CellBrowser

    Genes, proteins and regulatory elements operate within an intricate network of interactions. A new, holistic paradigm has emerged to study these biological systems. It aims to understand how the interactions among the components of biological systems give rise to function, and how they participate in phenotypes and diseases. We are interested in developing new algorithms and tools to model and analyze these biological networks....

  • Genome Maps

    Genome browsers are extremely useful for representing genomic data, such as SNPs, gene expression, methylation, etc., in their genomic context. Several genome browsers are available on the web, the most popular being Ensembl and UCSC. However, with the continuous increase in available genomic data and metadata, along with the limitations that extensive data traffic imposes on a client/server architecture, such browsers inevitably become slower. Genome Maps is based on the new HTML5 standards, including SVG and JavaScript, and runs entirely in modern web browsers (a philosophy similar to Google's). This makes it unnecessary to install a Flash plug-in, Java applet or any other technology, and results in a fast and dynamic response to user requests. Genome Maps allows real-time navigation along chromosomes and karyotypes, representing different types of data over many types of genomic information. There are numerous pre-configured tracks such as genes, transcripts, SNPs, mutations, miRNA targets, conserved regions, TFBS, etc. Several DAS sources are also available, and new ones can easily be added to the system....

  • High Performance Genomics (HPG)

    HPG stands for High Performance Genomics. The goal of this project is to provide a complete suite of advanced computing solutions to the current computational problems in genomics, especially in massive sequencing or NGS. These computing solutions include High Performance Computing (HPC) and cloud-based solutions for the processing, analysis or visualization of genome-scale data....

    • HPG Aligner

      As sequencing technologies progress, the amount of data produced grows exponentially, shifting the bottleneck of discovery towards the data analysis phase. Here we present an innovative approach that combines reengineering, optimization and parallelization of the algorithms. It results in a significant increase in mapping sensitivity over a wide range of read lengths and substantially shorter runtimes compared to other NGS mapping solutions currently available. Moreover, our software has been implemented using High Performance Computing (HPC) technologies such as OpenMP for multi-core CPUs, SSE, and Nvidia CUDA for GPUs. The performance of this approach scales very efficiently with the number of processors, so our implementation is ready to take advantage of new CPUs and GPUs....

    • HPG-fastq tools

      New high-throughput sequencers are able to produce data at an unprecedented scale while sequencing costs are in free fall. Primary sequence data management involves an unavoidable step of quality control and pre-processing, which is computationally expensive. Some solutions are available to carry out a quality control check; however, they are slow and their report is based only on a partial analysis of the data. We present a High Performance Computing (HPC) solution for quality control of the widely used FASTQ standard that identifies and exploits the hardware (CPUs and GPUs) available on the computer on which it is running. This solution outperforms any conventional CPU-based solution by a factor of 5. Moreover, the QC is exhaustive and it also carries out several preprocessing steps on the data....
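As a rough illustration of the kind of per-read check a FASTQ quality control step performs, here is a minimal single-threaded Python sketch. It is not the HPG-fastq implementation (which targets CPUs and GPUs); the function names, the Phred+33 quality encoding and the mean-quality threshold of 20 are all assumptions made for this example.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        yield header[1:], seq, qual


def mean_quality(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 assumed)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def qc_summary(handle, min_mean_quality=20.0):
    """Count reads and how many reach the (hypothetical) mean-quality threshold."""
    total = passed = 0
    for _name, _seq, qual in read_fastq(handle):
        total += 1
        if mean_quality(qual) >= min_mean_quality:
            passed += 1
    return {"reads": total, "passed": passed}


if __name__ == "__main__":
    sample = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n!!!!\n")
    print(qc_summary(sample))
```

Unlike the partial analyses mentioned above, this sketch visits every read, which is the property the description calls exhaustive; it makes no attempt at the hardware-aware parallelism that gives HPG-fastq its speed.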

    • HPG-SW

      HPG-SW is a modern implementation of the Smith-Waterman algorithm based on high-performance computing techniques. HPG-SW uses the OpenMP parallel programming model and SSE instructions to take advantage of the multi-core processors and SIMD registers of current CPUs....
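For readers unfamiliar with the algorithm itself, the following is a plain, scalar Python sketch of Smith-Waterman local alignment with a linear gap penalty. It is not HPG-SW's code and uses neither OpenMP nor SSE; the function name and the scoring values (match 2, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores never drop below zero: that is what makes the alignment local.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("TTACGTT", "GGACGGG"))
```

Each cell depends only on its upper, left and upper-left neighbours, so cells along an anti-diagonal are independent of one another. That independence is what SIMD and multi-core implementations generally exploit, although the source does not describe the specific scheme HPG-SW uses.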

    • HPG-Variant

      The massive use of Next Generation Sequencing (NGS) technologies is uncovering an unexpected amount of variability. The functional characterization of such variability, particularly of its most common form, Single Nucleotide Variants (SNVs), has become a priority that needs to be addressed in a systematic way. VARIANT (VARIant ANalysis Tool) reports information on the variants found, including consequence type and annotations taken from different databases and repositories (SNPs and variants from dbSNP and 1000 genomes, and disease-related variants from the GWAS catalog, OMIM, COSMIC mutations, etc.)...

    • HPG-VCF tools

      Biologists receive so much biological data that they have to spend a lot of time cleaning it up in order to get just the data they are interested in. HPG VCF Tools is a set of tools for preprocessing, filtering and manipulating VCF files. It aims to avoid excessive time consumption in tedious preprocessing tasks....
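To make the idea of VCF filtering concrete, here is a minimal Python sketch that keeps header lines and drops records below a quality threshold. It does not reproduce the HPG VCF Tools interface; the function name and the threshold of 30 are assumptions, and only the standard tab-separated column order (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) is relied on.

```python
def filter_vcf(lines, min_qual=30.0):
    """Yield header lines plus records with QUAL >= min_qual and FILTER PASS or '.'."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            yield line  # meta-information and column header pass through unchanged
            continue
        fields = line.split("\t")
        qual, filter_status = fields[5], fields[6]
        if qual == ".":
            continue  # missing quality: treated as failing in this sketch
        if float(qual) >= min_qual and filter_status in ("PASS", "."):
            yield line


if __name__ == "__main__":
    vcf = [
        "##fileformat=VCFv4.1",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "1\t100\trs1\tA\tG\t50\tPASS\t.",
        "1\t200\trs2\tC\tT\t10\tPASS\t.",
        "1\t300\trs3\tG\tA\t60\tq10\t.",
    ]
    for kept in filter_vcf(vcf):
        print(kept)
```

Because the function is a generator, it processes one record at a time and never holds the whole file in memory, which matters for the file sizes the description has in mind.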

  • Pupasuite

    PupaSuite is a web tool for the selection of SNPs with potential phenotypic effect, intended to help in the design of large-scale genotyping projects and in the characterization of new SNPs from next generation technologies. PupaSuite uses a collection of data on SNPs from heterogeneous sources and a large number of pre-calculated predictions to offer a flexible and intuitive interface for selecting an optimal set of SNPs. It implements new facilities such as the analysis of users' data to derive haplotypes with functional information. A new estimator of the putative effect of polymorphisms, based on evolutionary information, has been included. Predictions from the SNPeffect database have also been included....

  • RENATO

    IMPORTANT: The documentation site has moved to http://bioinfo.cipf.es/docs/renato/ and this site will be unavailable soon.

    RENATO (REgulatory Network Analysis TOol) is a network-based analysis web tool for the interpretation and visualization of transcriptional and post-transcriptional regulatory information, designed to identify common regulatory elements in a list of genes. RENATO maps these genes to the regulatory network, extracts the corresponding regulatory connections and evaluates each regulator for significant over-representation in the list. Ranked gene lists can also be analysed with RENATO....
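The description does not say which statistical test RENATO applies. One common way to assess over-representation is a one-sided hypergeometric test, sketched below in Python purely as an illustration; the function name, parameter names and example counts are all hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background, K: background genes targeted by the regulator,
    n: genes in the user's list, k: listed genes targeted by the regulator.
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probability of seeing k or more targets in the list by chance.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


if __name__ == "__main__":
    # Hypothetical counts: 8 of 50 listed genes are targets of a regulator
    # that targets 200 of 20000 background genes.
    print(hypergeom_pvalue(k=8, n=50, K=200, N=20000))
```

When many regulators are tested against the same list, the resulting p-values would normally need a multiple-testing correction; whether and how RENATO does this is not stated in the text above.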

  • Variant

    VARIANT (VARIant ANalysis Tool) can report the functional properties of any variant in all human, mouse or rat genes (new model organisms will be added soon) and their corresponding neighborhoods. Other non-coding extra-genic regions, such as miRNAs, are also included in the analysis....