Ana Conesa, 05/26/2010 12:25 pm

Time dosage methods

maSigPro identifies genes with significant profile differences in single and multiple time (or dose) series gene expression experiments. The Babelomics implementation of maSigPro provides easy access to the regression models and graphical output.

maSigPro is a regression-based approach with the following steps:

  • Definition of a regression model that includes continuous (time or dose) and categorical (series) variables. The model represents the different series (i.e. treatments, tissues, strains, etc.) with dummy (binary) variables. The dummies model the differences between a control profile and a treatment profile: if a dummy term is significant, the corresponding series differs from the control. In the single-series case, the model just evaluates the gene expression changes along the continuous variable. The model considered in the first regression step is a global one: given a polynomial degree, it contains all polynomial terms on the continuous variable and their interactions with the dummies. For example, for an experiment with 2 series (control and treatment) and 3 time points, we can consider a quadratic model:

Y = B<sub>0</sub> + B<sub>1</sub> * time + B<sub>2</sub> * time<sup>2</sup> + B<sub>3</sub> * dummy + B<sub>4</sub> * dummy * time + B<sub>5</sub> * dummy * time<sup>2</sup>

This models gene expression (the dependent variable, Y) as a function of time (the independent variable). The meaning of the coefficients B is:\\
B<sub>0</sub>: general intercept; if significant, the gene has a non-zero expression at time zero\\
B<sub>1</sub>: slope; if significant, the gene is induced or repressed along time\\
B<sub>2</sub>: curvature; if significant, there is a change in induction or repression along time\\
B<sub>3</sub>: treatment intercept; if significant, there is a difference in expression between control and treatment at time zero\\
B<sub>4</sub>: treatment slope; if significant, there is a difference in induction or repression between treatment and control along time\\
B<sub>5</sub>: treatment curvature; if significant, there is a difference in the change in induction or repression along time between treatment and control
  • Fitting of this global model. If the model is significant for a gene, there is some change in the expression profile of that gene, either because it changes over time or because it shows different profiles between series. In this step, genes are selected according to a chosen significance level. Multiple testing correction is applied before the differentially expressed profiles are selected.
  • Fitting a stepwise regression on the selected genes. This step selects, from the global model, those terms which are significant. The result is an optimized model for each gene. The terms retained in each of these models are significant and indicate the specific kind of expression change the corresponding gene undergoes. For example, if for gene1 B<sub>1</sub> is significant but B<sub>2</sub> is not, the gene is induced or repressed at a constant rate over the time course studied. If gene2 has B<sub>4</sub> significant but B<sub>1</sub> and B<sub>2</sub> not, the gene is flat in the control series but shows some induction or repression in the treatment series.
  • Final selection of significant genes. After the regression fit, an additional filter is applied to the genes based on the R<sup>2</sup> (goodness of fit) of the stepwise regression model. Only genes with an R<sup>2</sup> above a given threshold are selected. Additionally, the selected genes are grouped by series: a list of significant genes is generated for each of the possible contrasts between the control and the experimental series, i.e. the genes that have different expression profiles between the control and each other series. If only one series is provided, the list of significant genes reports those genes with time/dosage-associated changes.
  • Visualizing results. Once a selection of differentially expressed genes is obtained, the genes are divided into clusters of similar expression patterns and displayed as trajectory plots that show the time/dosage evolution of gene expression across the different series.
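
As a rough illustration of the first two steps, the sketch below builds the quadratic two-series design described above and fits it by ordinary least squares for a single gene, then reports the R<sup>2</sup> used in the final filtering step. It is not the maSigPro or Babelomics code: the function names (`design_row`, `solve`, `fit_global_model`) and the toy data are assumptions made for this example, and the significance tests, multiple testing correction and stepwise selection are left out. The toy gene is flat in the control series and rises linearly in the treatment series, so only B<sub>0</sub> and B<sub>4</sub> should come out non-zero.

```python
# Minimal sketch of the global quadratic model for one gene, assuming
# ordinary least squares. All names and data here are hypothetical.

def design_row(time, dummy):
    """Terms multiplied by B0..B5: 1, t, t^2, d, d*t, d*t^2."""
    return [1.0, time, time ** 2, dummy, dummy * time, dummy * time ** 2]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_global_model(times, dummies, y):
    """Return (coefficients B0..B5, R^2) of the least-squares fit."""
    rows = [design_row(t, d) for t, d in zip(times, dummies)]
    p = len(rows[0])
    # Normal equations: (X'X) B = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Toy data: 2 series, 3 time points, 2 replicates per point (noise-free).
times = [0, 0, 1, 1, 2, 2, 0, 0, 1, 1, 2, 2]
dummies = [0] * 6 + [1] * 6            # 0 = control, 1 = treatment
expression = [2.0] * 6 + [2.0 + 1.5 * t for t in times[6:]]

beta, r2 = fit_global_model(times, dummies, expression)
for index, value in enumerate(beta):
    print(f"B{index} = {value:.3f}")
print(f"R2 = {r2:.3f}")
```

With real, noisy data each coefficient would additionally need a significance test before it could be interpreted as in the gene1 and gene2 examples above; this sketch only shows how the model terms map onto a fitted profile.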