



Emmanuelle Becker, Alain Guénoche and Christine Brun. Système de Classes Chevauchantes pour la Recherche de Protéines Multifonctionnelles.
Abstract : This work aims at developing a method to detect multifunctional proteins, i.e. proteins performing several apparently unrelated functions. To detect these proteins, we consider a network of binary direct interactions between proteins, which we decompose into an overlapping class system using a criterion based on graph topology that extends Newman’s modularity. As a result, some proteins are found in several final classes, meaning that they interact with several groups of proteins that are apparently functionally unrelated. These multiply classified proteins are thus good candidates for multifunctionality. In this paper, we first introduce the concept of multifunctionality, then fully explain the method, and finally present the preliminary results obtained by applying the method to a large human protein interaction network.
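For readers unfamiliar with the criterion mentioned in this abstract, the sketch below computes the standard Newman modularity of a non-overlapping partition of an undirected, unweighted graph. It is only an illustration of the baseline quantity: the overlapping-class extension used by the authors is not described in the abstract and is not reproduced here. The function name `modularity` and the toy graph are hypothetical.

```python
def modularity(edges, community):
    """Newman modularity Q of a non-overlapping partition.

    edges: list of (u, v) pairs of an undirected, unweighted graph.
    community: dict mapping each node to its class label.
    Q = (fraction of edges inside classes) - (expected fraction under
    a random graph with the same degrees).
    """
    m = len(edges)
    if m == 0:
        raise ValueError("the graph needs at least one edge")

    # Node degrees.
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Observed fraction of edges whose endpoints share a class.
    inside = sum(1 for u, v in edges if community[u] == community[v]) / m

    # Expected fraction: sum over classes of (total degree / 2m)^2.
    degree_sum = {}
    for node, k in degree.items():
        label = community[node]
        degree_sum[label] = degree_sum.get(label, 0) + k
    expected = sum((d / (2 * m)) ** 2 for d in degree_sum.values())

    return inside - expected


# Toy example (hypothetical): two triangles joined by a single edge.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
two_classes = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
one_class = {node: 1 for node in two_classes}
q_two = modularity(toy_edges, two_classes)
q_one = modularity(toy_edges, one_class)
```

On this toy graph, splitting the two triangles gives a positive score, whereas putting every node in a single class gives zero, which is why modularity-based criteria favour densely connected groups.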
Etienne Birmele. Detecting Network Motifs by Local Concentration
Abstract : Biological networks exhibit small over-represented subgraphs, called motifs, some of which are known to have a biological function. Several algorithms exist to detect motifs, but most of them rely on time-consuming simulations or lead to many false positives. We propose an efficient and conservative procedure to detect network motifs and apply it to the yeast gene regulation network.
Chun-Long Chen, Aurélien Rappailles, Lauranne Duquenne, Maxime Huvet, Guillaume Guilbaud, Benjamin Audit, Yves d’Aubenton-Carafa, Alain Arneodo, Olivier Hyrien and Claude Thermes. Single-nucleotide substitution rates increase during the replication S phase of the human genome
Abstract : Naturally occurring mutations in mammalian genomes play a key role in evolution and genetic disease, but their causes are still poorly understood. In particular, nucleotide substitutions occur at strongly variable rates along genomes, and it is essential to unravel the mechanisms responsible for these fluctuations. A number of evolutionary studies have revealed complex correlations between substitution rates and parameters such as regional or local nucleotide composition, crossover rate or distance to telomeres (1-6). Here, we study the role of replication in neutral substitution rates in the human genome. Using replication timing data determined by massive sequencing of replicating strands, we show that all non-CpG substitution rates correlate with timing: they are minimal in early replicating regions and increase to maximum values in late regions. These correlations are still observed after controlling for nucleotide composition, crossover rate and distance to telomeres. These data demonstrate for the first time that replication timing plays a key role in shaping the profile of mutations along the genome.
Anne Crumiere. Cellular automata modeling of intercellular genetic regulatory networks
Abstract : Biologists often represent genetic interactions by directed graphs, named genetic regulatory graphs. Vertices represent genes, whereas edges represent regulatory effects from one gene on another. Edges are labelled with a positive sign in the case of an activation and negative for an inhibition. This article deals with relationships between the structure of such graphs and their dynamical properties.
Thirty years ago, the biologist R. Thomas stated the following two general rules: the presence of a positive circuit in the regulatory graph is a necessary condition for multistability (the sign of a circuit being the product of the signs of its edges), and the presence of a negative circuit is a necessary condition for the existence of an attractive cycle. These rules concern the dynamics of a single cell, and they have given rise to mathematical statements and proofs. This article aims at extending these rules, in the discrete formalism, to regulatory interactions acting both within cells and between cells.
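The sign convention used in these rules can be illustrated with a few lines of Python. This is only a sketch of the definition given above (the sign of a circuit is the product of the signs of its edges); the function name `circuit_sign` and the example circuits are hypothetical and are not taken from the article.

```python
from math import prod


def circuit_sign(edge_signs):
    """Return +1 or -1, the product of the signs of the edges of a circuit.

    edge_signs: sequence of +1 (activation) or -1 (inhibition),
    one entry per edge of the circuit.
    """
    if not edge_signs:
        raise ValueError("a circuit has at least one edge")
    if any(sign not in (1, -1) for sign in edge_signs):
        raise ValueError("edge signs must be +1 or -1")
    return prod(edge_signs)


# Hypothetical examples.
mutual_inhibition = circuit_sign([-1, -1])       # two genes repressing each other
activation_inhibition = circuit_sign([1, -1])    # A activates B, B inhibits A
three_inhibitions = circuit_sign([-1, -1, -1])   # ring of three repressors
```

A circuit is therefore positive exactly when it contains an even number of inhibitions: mutual inhibition is a positive circuit, whereas the activation-inhibition loop and the ring of three repressors are negative circuits.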
Hugo Devillers, Hélène Chiapello, Meriem El Karoui and Sophie Schbath. How to measure the robustness of bacterial genome comparisons ?
Lionel Dupuy, Matthieu Vignes, Blair McKenzie and Philip White. Meristematic Waves, a new approach to model root architecture dynamics
Abstract : During their development, plants must build efficient root architectures to secure access to nutrients and water in soil. A series of expansion and branching mechanisms fulfils this aim in the proximity of root apical meristems, where the plant senses the environment and explores immediate regions of soil. We have developed a new approach to study the dynamics of root meristems in soil, using the relationship between the increase in root length density and the root meristem density. Initiated at the seed, the location of root meristems was shown to propagate, wave-like, through the soil, leaving behind a permanent network of roots for the plant to acquire water and nutrients. Models highlighted that the morphologies of the waves of meristems are inherent to individual root developmental processes, namely expansion, lateral root initiation and gravitropic responses. The "meristematic wave" observed in data collected on barley might be a more general and fundamental aspect of plant rooting strategies to access underground resources.
Paul Garcin and Yves Boulard. Construction et analyse d’un modèle tridimensionnel du complexe [(SLR1738-Zn-Fe)2-ADN]
Abstract : Slr1738 is the Peroxide regulon Repressor protein (PerR) of Synechocystis. Active as a dimer, this protein must contain an iron atom to be able to bind DNA and regulate its target genes. The binding mechanism involves a classic recognition helix inserted in the DNA major groove. To date, however, no three-dimensional structure is available for this kind of transcription factor complexed with DNA. As a consequence, both the global and the specific interactions that lead this protein to bind DNA and to recognize the specific ‘Per Box’ sequence are still poorly understood. In order to better define and analyse these interactions, we built in silico the first three-dimensional structure of a [PerR-DNA] complex. This article describes the method used to build the complex and presents an analysis of the contacts between the two partners.
Sandrine Grossetete, Bernard Labedan and Olivier Lespinet. FUNGIpath : a new tool for analysing the evolution of fungal metabolic pathways
Abstract : FUNGIpath is a new tool dedicated to the in-depth analysis of fungal metabolic pathways. It is freely accessible at http://www.fungipath.u-psud.fr. FUNGIpath consists of a collection of orthologous groups of proteins that have been predicted using complementary detection methods and then mapped onto KEGG and MetaCyc pathways. It allows an easy comparison of the primary and secondary metabolisms of the different fungal species present in the database, with the possibility of assessing the level of specificity of various pathways at different taxonomic distances. As more and more fungal genomes are expected to be sequenced in the coming years, this tool should help to progressively reconstruct the primary and secondary metabolisms of the ancestors of the main branches of the fungal tree and to understand how these ancestral metabolisms evolved into the various specific derived metabolisms.
Nicolas Lebreton, Christophe Blanchet, Julie Chabalier and Olivier Dameron. Utilisation d’ontologies de tâches et de domaine pour la composition semi-automatique de services Web bioinformatiques
Abstract : Nowadays, bioinformatics tasks typically involve large-scale data analysis that requires the integration of Web services from heterogeneous platforms. In spite of the efforts made to improve Web services interoperability, integration remains difficult and still has to be performed manually by users. Improving the composition of Web services requires analyzing what Web services do as well as the nature and the type of their input and output parameters. This work shows that existing technologies support automating the selection, composition and execution of Web services, and that the current limiting factor to wider use is the lack of sufficiently precise task and domain ontologies.
Aurelie Leduc, Stephane Robin, Philippe Bessieres and Pierre Nicolas. Probabilistic modeling of tiling array expression data
Abstract : For organisms with small genomes such as bacteria, current microarray technology allows a tiling design in which the whole genome is covered by overlapping probes. These arrays make it possible to measure the transcriptional activity of the whole genome with unprecedented resolution. However, the model-based approaches currently used to analyze these data remain very simple, the most popular model being the piecewise constant Gaussian model with a fixed number of breakpoints. Here we present a new approach based on hidden Markov modelling, designed for the probabilistic reconstruction of the trajectory of a continuous-valued signal. The use of this model does not require choosing a fixed number of breakpoints and makes it possible to account for subtle effects such as drift in the signal. The model also includes direct correction for variations in probe affinity via the use of covariates.
Celine Lefebvre, Mariano Alvarez, Presha Rajbhandari, Wei Keat Lim and Andrea Califano. Master regulator analysis reveals key transcription factors for Germinal Center formation
Abstract : We describe a new method for the identification of master regulators of a phenotype of interest. The master regulator analysis identifies transcription factors that are candidate master regulators of the phenotype based on their transcriptional targets. We applied this method to decipher the regulation of Germinal Center B cell programs, revealing the two transcription factors MYB and FOXM1 as synergistic master regulators.
Sébastien Loriot, Frédéric Cazals, Michael Levitt and Julie Bernauer. A geometric knowledge-based coarse-grained scoring potential for structure prediction evaluation
Abstract : Knowledge-based protein folding potentials have proven successful in recent years. Based on statistics of observed interatomic distances, they generally encode pairwise contact information. In this study we present a method that derives multibody contact potentials from measurements of surface areas using coarse-grained protein models. We show that this construction is able to distinguish native structures from decoys. We tested different potentials derived from a reference set of 66 protein structures. These functions, encoding up to 5-body contacts, are evaluated on the reference set and its 45000 decoys, and also on the often-used lattice_ssfit set from the Decoys ‘R’ Us database. We show that the most relevant information for discrimination resides in 2- and 3-body contacts. The potentials we have obtained can be used for the evaluation of putative structural models; they could also lead to a different type of protein structure refinement that uses multibody interactions.
Christine Martin and Antoine Cornuéjols. Using Frequent and Surprising Item Sets for the Characterization of Protein-Protein Interfaces
Abstract : Numerous research efforts have aimed to characterize and predict protein-protein interfaces. This paper introduces a method that relies only on known protein-protein interfaces (positive instances only). It combines frequent item set mining techniques with statistical tests to ensure the selection of interesting features. Starting from a database of known interfaces described with geometrical elements, the method produces the elements and combinations thereof that are characteristic of the interfaces. This approach allows one to easily interpret the results, in contrast to techniques that operate as "black boxes", and ensures a satisfactory proportion of reliable item sets. The results obtained on a set of 459 protein-protein interfaces from the DOCKGROUND database confirm that the findings are consistent with current knowledge about protein-protein interfaces.
Marie-José Mhawej and Claude H. Moog. Drug dosage control of the HIV infection dynamics
Abstract : An increasing number of sophisticated control algorithms are becoming available in the current literature to optimize HIV therapy. Unfortunately, the pharmacokinetics and pharmacodynamics of antiretroviral drugs are ignored, and these algorithms remain purely theoretical. This issue is investigated explicitly in this paper. An elementary pharmacodynamics model is combined with a nonlinear feedback control computed from standard engineering methods. It is shown that this results in the design of a realistic dosage regimen which drives the immunological system close to the healthy equilibrium state. Although the problem is treated as a single-input system, it is argued that the procedure can be extended to a multitherapy design or to any available control law.
Gregory Nuel. Counting patterns in degenerated sequences
Abstract : In this paper, we propose a rigorous method to take into account the uncertainty of sequencing for biological sequences (DNA, proteins). For example, this method makes it possible to study the distribution of a pattern of interest in a degenerated sequence defined on the standard IUPAC DNA alphabet. We first introduce a Forward-Backward approach to compute the marginal distribution of the constrained sequence, and use it both to perform an Expectation-Maximization estimation of the parameters and to derive a heterogeneous Markov distribution for the constrained sequence. This distribution is then used along with known DFA-based pattern approaches to obtain the exact distribution of the pattern count under the constraints. As an illustration, we consider an EST dataset from the EMBL database. Despite the fact that only 1% of the positions in this dataset are degenerated, we show that not taking these positions into account might lead to erroneous observations, further demonstrating the interest of our approach.
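To make the notion of a degenerated IUPAC sequence concrete, the sketch below computes the expected number of occurrences of a pattern under a deliberately simplified assumption: each degenerated symbol is resolved independently and uniformly among its compatible bases. This is not the Forward-Backward and Markov machinery of the paper, which yields the exact distribution of the count; the function name `expected_count` and the example sequences are hypothetical.

```python
# Standard IUPAC nucleotide codes and the bases each one is compatible with.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expected_count(sequence, pattern):
    """Expected number of (possibly overlapping) occurrences of `pattern`.

    sequence: string over the IUPAC DNA alphabet (may be degenerated).
    pattern: string over A, C, G, T.
    Assumption (illustrative only): every degenerated symbol is drawn
    uniformly and independently among its compatible bases.
    """
    total = 0.0
    for start in range(len(sequence) - len(pattern) + 1):
        probability = 1.0
        for symbol, base in zip(sequence[start:], pattern):
            compatible = IUPAC[symbol]
            if base not in compatible:
                probability = 0.0
                break
            probability *= 1.0 / len(compatible)
        total += probability
    return total


exact = expected_count("ACGT", "CG")      # no degenerated position
with_n = expected_count("ANG", "AG")      # N may resolve to G or to A
with_r = expected_count("RG", "AG")       # R is A or G
```

Even in this crude model, a single degenerated position changes the count from a whole number to a fractional expectation, which hints at why ignoring such positions can bias pattern statistics.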
Nicolas Terrapon, Olivier Gascuel and Laurent Brehelin. Détection de nouveaux domaines protéiques par co-occurrence : Application à P. falciparum
Abstract : Hidden Markov Models (HMMs) have proved to be powerful for protein domain identification. However, numerous domains may be missed in highly divergent proteins. This is the case for the proteins of Plasmodium falciparum, the main causal agent of human malaria. Here, we propose a method that uses domain co-occurrence to increase the sensitivity of the approach while controlling its false discovery rate. Applied to P. falciparum, our method identifies (with an error rate below 20%) 482 new domains (versus 3482 in PlasmoDB), which involve 158 new GO annotations.
Raluca Uricaru, Celia Michotey, Laurent Noe, Hélène Chiapello and Eric Rivals. Improved sensitivity and reliability of anchor based genome alignment
Abstract : Whole genome alignment is a challenging problem in computational comparative genomics. It is essential for the functional annotation of genomes, the understanding of their evolution, and for phylogenomics. Many global alignment programs are heuristic variations on the anchor-based strategy, which relies on the initial detection of similarities and their selection in an ordered chain. Considering that alignment tools fail to align some pairs of bacterial strains, we investigate whether this is intrinsically due to the strategy or to a lack of sensitivity of the similarity detection method. To this end, we implement and compare six programs based on three different detection methods (from exact matches to local alignments) on a large benchmark set. Our results suggest that the sensitivity of well-known methods, like MGA or Mauve, can be greatly improved in the case of divergent genomes if one exploits spaced seeds at the detection phase. In other cases, such methods yield alignments that cover nearly the whole genome. Then, we focus on the global reliability of alignments: should an aligned pair of segments be included in the global genome alignment? We investigate this reliability according to both the segment "alignability" and the inclusion of orthologs. Again, we provide evidence that for both close and divergent genomes, one of our programs, YH, achieves alignments with sometimes a lower coverage, but a higher inclusion of orthologs. It opens the way to the first reliable alignments for some highly divergent species like Buchnera aphidicola or Prochlorococcus marinus.
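The idea of a spaced seed mentioned in this abstract can be sketched as follows. A seed is written as a mask of '1' (position must match) and '0' (position is ignored), so that a hit tolerates mismatches at the ignored positions. This is a generic illustration of the technique, not the detection method of the programs compared in the paper; the function names `seed_key` and `seed_hits`, the mask and the toy sequences are hypothetical.

```python
from collections import defaultdict


def seed_key(text, start, mask):
    """Characters of text[start:start+len(mask)] at the '1' positions of mask."""
    return "".join(text[start + i] for i, m in enumerate(mask) if m == "1")


def seed_hits(a, b, mask):
    """Return sorted (i, j) pairs where a[i:] and b[j:] agree on the seed.

    mask: string of '1' (must match) and '0' (don't care) characters.
    """
    span = len(mask)

    # Index every window of `a` by its seed key.
    index = defaultdict(list)
    for i in range(len(a) - span + 1):
        index[seed_key(a, i, mask)].append(i)

    # Look up every window of `b` in the index.
    hits = []
    for j in range(len(b) - span + 1):
        for i in index.get(seed_key(b, j, mask), []):
            hits.append((i, j))
    return sorted(hits)


# Toy sequences (hypothetical) differing by one substitution at position 2.
seq_a = "ACGTAC"
seq_b = "ACTTAC"
spaced = seed_hits(seq_a, seq_b, "1101")      # ignores the third position
contiguous = seed_hits(seq_a, seq_b, "1111")  # exact 4-mer matches only
```

Here the spaced mask recovers the similarity at position 0 despite the substitution, whereas an exact match of the same span finds nothing; this tolerance is what can make seed-based detection more sensitive on divergent sequences.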