PLoS Computational Biology

Abbreviation:
Periodicity: Monthly

Created by Benjamin Schrauwen, last updated by Benjamin Schrauwen (23 Sep 2008 16:38).

PLoS Computational Biology » Recent news items

The Mechanism of Ubiquitination in the Cullin-RING E3 Ligase Machinery: Conformational Control of Substrate Orientation

02 Oct 2009

Author Summary

The Ubiquitin-Proteasome System regulates protein degradation via several steps. The cullin-RING E3 ligase machinery is involved in one of these. In this step, ubiquitin is transferred from E2 to the substrate protein, labeling the substrate protein for degradation. However, when E3, E3-substrate and E2-ubiquitin crystal structures are modeled together, the distance between ubiquitinated E2 and the substrate binding site is ~50–59 Å, raising the question of how the E3 machinery bridges the distance and orients the substrate for the ubiquitin transfer. We performed explicit solvent simulations for all nine available substrate binding protein complexes in the PDB, with and without the corresponding E3 components to which they are bound. In all nine of these substrate binding proteins, we noticed a flexible linker that rotates the substrate binding domain to a great extent in the same direction, toward the E2-ubiquitin. We further noticed that the flexibility is regulated allosterically by binding events associated with either domain. The results suggest that the flexible linker serves as a hinge to rotate the substrate binding domain and to accurately position the substrate for ubiquitination. As such, the simulations suggest an answer to the question of how the machinery operates to orient the substrate for ubiquitination.

Perturbation-based Markovian Transmission Model for Probing Allosteric Dynamics of Large Macromolecular Assembling: A Study of GroEL-GroES

02 Oct 2009

Author Summary

Biological processes in a cell often require complex molecular machineries with large macromolecular assemblies as components. An example is the chaperone system in the bacterium E. coli, which helps proteins to fold correctly. In these macromolecular machineries, signals are transmitted dynamically in order for biological functions to be carried out. Studying the dynamic process of signal transmission helps us to identify key elements of the macromolecular assemblies that are pivots for dynamic motions, communicators for interfacing with other molecules, and anchors that are key for signal transmission. In this study, we describe a novel computational method that can globally survey the dynamic responses of the macromolecular machinery to perturbation over the full time course by monitoring simultaneously all the elements at the amino acid residue level and at multiple time spans, from the initial perturbation until the system reaches equilibrium. We show that the key residues predicted by our computational method in the chaperone system of E. coli to a large extent are correct, as they often coincide with the ones identified by experimental studies. We also show that this computational method can make novel predictions about the importance of additional amino acid residues previously uncharacterized, which can be further tested in experimental studies. This approach can be applied to study other large macromolecular assemblies such as the virus capsid and ribosomal complex.

Combining Fungal Biopesticides and Insecticide-Treated Bednets to Enhance Malaria Control

02 Oct 2009

Author Summary

It has recently been proposed that mosquito vectors of malaria may be controlled by biopesticide sprays containing spores of fungi that are pathogenic to mosquitoes, causing reduced blood feeding activity and eventual death. This technique has been shown to have strong potential to reduce malaria transmission rates, and may be most effective when combined with other interventions as part of an integrated vector management strategy. I develop a model to quantify the total impact of combined interventions that can affect mosquitoes at different ages and stages in their lifecycle. As a case study, I consider the combined use of fungal biopesticides and insecticide-treated bednets (ITNs), a widespread and important vector control method. The model demonstrates that these interventions combined can have strong effects on malaria transmission even in situations where each intervention acting alone has relatively little impact. In situations difficult for malaria control due to high transmission intensity and widespread insecticide resistance, the performance of the combined interventions is improved by synergistic interactions between the interventions, whereby the ITN intervention improves the performance of the fungal biopesticide intervention. The results suggest that the combined use of ITNs and fungal biopesticides may be an efficient and effective method of malaria control.

The Modular Organization of Protein Interactions in Escherichia coli

02 Oct 2009

Author Summary

Genes and their protein products do not operate in isolation, but form components of highly interconnected biological systems. Identifying the connections between components is therefore critical to understanding how these processes are organized and operate. E. coli is the leading model bacterium; however, despite its importance in biological and medical discovery, an accurate atlas of these interactions is still lacking. On the other hand, several computational and experimental procedures have been applied on a high-throughput basis to provide collections of interaction data of varying quality and coverage. Using a sophisticated mathematical framework, we have combined and benchmarked these data to create a single, highly reliable set of interactions that encompasses almost 50% of the E. coli proteome. Organizing these data on the basis of their interactions, we identify groups of proteins representing functionally coordinated modules such as molecular machines (e.g., the flagellum) and biochemical pathways. Finally, through examining the organization of E. coli interactions in the context of evolution, we propose a new model of bacterial network evolution that accounts for the integration of foreign genes acquired through horizontal gene transfer mechanisms.

Integration of Evolutionary Features for the Identification of Functionally Important Residues in Major Facilitator Superfamily Transporters

02 Oct 2009

Author Summary

Major Facilitator Superfamily (MFS) transporters are one of the largest families of membrane protein transporters and are ubiquitous to all three kingdoms of life. Structural studies of MFS transporters have revealed that the members of this superfamily share structural homology; however, due to weak sequence similarity, their structural similarity has only been found after structural determination. Even after the structures were solved, painstaking efforts were needed to detect functionally important residues. The identification of functionally important cooperative residues from sequences may provide an alternative way to understanding the function of this important class of proteins. Here, we show that it is possible to identify functionally important residues of MFS transporters by integrating two different evolutionary features, sequence conservation and co-evolutionary information. Our results suggest that the conserved cores of evolutionarily coupled residues are involved in specific substrate recognition and translocation of membrane protein transporters. Also, a subset of the identified residues comprises an interaction network connecting functional sites in the protein structure. The ability to identify functional residues from protein sequences may be helpful for locating potential mutagenesis targets in mechanistic studies of membrane protein transporters.

Chemically Based Mathematical Model for Development of Cerebral Cortical Folding Patterns

25 Sep 2009

Author Summary

The size and shape of the cerebral cortex varies across species. The cortical folding pattern also varies from a smooth surface where no pattern is visible, as observed in the common treeshrew (Tupaia glis) and Eastern mole (Scalopus aquaticus), to an intricate labyrinthine pattern, as observed in humans. One current model, the intermediate progenitor model, describes the creation of a fold through local interactions in the ventricular zone which surrounds the lateral ventricle. Here we extend the local scenario described in the intermediate progenitor model to include global characteristics that differ between species. We approximate the lateral ventricle with a prolate spheroid and examine how patterns on a spheroidal surface change based on size and eccentricity. Our model reveals a direct correlation between pattern formation and lateral ventricular size and shape. This model's significance is that it elucidates the consistency of cortical patterns among individuals within a species and addresses inter-species variability based on global characteristics, such as size and shape of the lateral ventricle, and provides a critical piece to the puzzle of cortical pattern formation.

PLoS Computational Biology Issue Image | Vol. 5(9) September 2009

25 Sep 2009

The signature of stimulus-driven correlations in visual cortical synaptic activity.

This figure illustrates two contrasted epochs of sub-threshold (and sometimes spiking) responses that can be measured intracellularly in the primary visual cortex of a cat during presentation of a drifting grating (left, red trace) followed by the visuo-oculomotor exploration of a natural scene (right, blue trace). The corresponding dynamics of the membrane potential fluctuations exhibit long-range correlations for the low dimension periodic stimulus (the curtain). These correlations are strongly reduced when the visual system is driven by a natural scene (see El Boustani et al., doi:10.1371/journal.pcbi.1000519).

Image Credit: UNIC-CNRS

Brief Overview of Bioinformatics Activities in Singapore

25 Sep 2009

Conserved Expression Patterns Predict microRNA Targets

25 Sep 2009

Author Summary

microRNAs are small RNA molecules that regulate gene expression by controlling the output of proteins and other RNAs. The exact mechanism through which a microRNA binds to its target and how this affects the target is still a subject of much debate. In this article, the authors sought to find a reverse approach to discover the impact of microRNAs on gene expression. Instead of searching for specific targets of a given microRNA, they searched for microRNA signatures: changes in the levels of microRNAs across multiple tissues that impacted significantly the levels of messenger gene expression in these same tissues. Because many core biological functions are conserved between human and mouse, the authors compared these microRNA signatures between these two species. They found that identical microRNA signatures between these organisms could effectively predict microRNA targets and could estimate the global impact of individual microRNAs on gene output. They further demonstrated that many microRNAs act as expression enhancers by inhibiting gene repressors.

A Computational Analysis of ATP Binding of SV40 Large Tumor Antigen Helicase Motor

25 Sep 2009

Author Summary

The Large Tumor antigen (LTag) encoded by Simian Virus 40 (SV40) is a marvelous molecule that is not only a viral oncogene, but also an efficient molecular machine as a helicase that unwinds double helix DNA for genome replication, an essential process in all living organisms. LTag hexameric helicase uses the energy of ATP to power its conformational switch for DNA unwinding. Understanding how the LTag conformational switch is coupled to the energy from ATP usage by LTag to do the mechanical work of unwinding DNA is of great interest to biologists, and yet remains to be established. Based on our previous high-resolution structures of LTag helicase in different conformational states, we simulated an LTag conformational transition pathway in the ATP binding process using the targeted molecular dynamics method. Our simulation results suggest a three-step process for the ATP binding to the nucleotide pocket, in which ATP is eventually “locked” into the pocket by three pairs of “locker” interactions. We have also quantitatively evaluated the energy profile of ATP binding using a special computational simulation technique. Additionally, our simulation study of ATP binding by LTag and the accompanying conformational switches in the context of a hexamer leads to a refined cooperative iris model that may be used for DNA unwinding.

A Novel Scoring Approach for Protein Co-Purification Data Reveals High Interaction Specificity

25 Sep 2009

Author Summary

To understand and model cellular processes, we require accurate descriptions of the interactions occurring between constituent proteins. Large-scale protein interaction maps have typically been measured in two distinct ways. The first detects direct pair-wise associations by testing only two proteins at a time for an interaction. The second detects large groups of proteins that have conglomerated or purified together. With regard to the latter, it is difficult to deduce which pairs of proteins are physically interacting in the purification data, and interaction maps generally appear random and unstructured. We have developed a novel computational method to analyze the purification data (from the second method) and identify which proteins are directly interacting. The resultant protein interaction map is highly modular, meaning that the proteins organize themselves into localized, densely connected regions that likely represent individually functioning units. We also analyzed interaction maps of the first method and propose that their lack of modularity is a consequence of missing interactions that are undetected for unclear reasons. This study provides insights into the differences between the two interaction detection methods as well as the nature of biological organization.

Statistical Use of Argonaute Expression and RISC Assembly in microRNA Target Identification

25 Sep 2009

Author Summary

MicroRNAs are a family of small RNAs that play important roles in the development, physiological function and stress responses of a wide variety of organisms, and if abnormally expressed are associated with multiple types of cancer in humans. Rather than being translated into proteins, members of the family of microRNAs operate by preventing the translation of messenger RNAs to which they have some degree of sequence complementarity. Although sequence-based bioinformatics techniques have yielded large numbers of predicted messenger- and microRNA targeting relationships, verifying these as bona fide has proven difficult in practice. We have developed a novel statistical approach based on the systems biology of microRNAs in humans to detect such targeting relationships using high-throughput RNA expression data. Because our approach is not based on information from external target pair predictions, it can play a fully independent role in verifying such predictions as well as be used to obtain de novo target pair predictions. Using two separate data studies, we show that our approach is capable of both reproducing previously observed target pairs and verifying putative target pairs predicted from sequence data, at rates substantially better than marginal comparisons of messenger- and microRNA expression levels.

Ten Simple Rules for Chairing a Scientific Session

25 Sep 2009

Integrating Extrinsic and Intrinsic Cues into a Minimal Model of Lineage Commitment for Hematopoietic Progenitors

25 Sep 2009

Author Summary

Complex biomolecular interaction pathways in signaling networks can lead to non-intuitive behaviors that can prove critical for the regulation and robustness of biological processes. In this work, we present a signaling topology that can generate dynamic responses that are particularly pertinent to cell commitment in hematopoiesis. Our minimal model explores fundamental questions of instructive signaling that have persisted in cell-fate decisions. We show that even when lineage commitment decisions are inherently noisy, external cytokine signals, amplified by receptor upregulation, can bias the lineage choices of a progenitor cell. The multipotent progenitor, based on its differentiation potential, can exhibit several layers of memory to provide stability to both intermediate and mature states and can potentially bypass canonical intermediate states in generating mature cell types. Thus, our model provides a computational framework that can accommodate both classical and non-classical commitment paths in hematopoiesis.

Network-State Modulation of Power-Law Frequency-Scaling in Visual Cortical Neurons

25 Sep 2009

Author Summary

Intracellular recording of neocortical neurons provides an opportunity to characterize the statistical signature of the synaptic bombardment to which they are subjected. Indeed, the membrane potential displays intense fluctuations which reflect the cumulative activity of thousands of input neurons. In sensory cortical areas, this measure could be used to estimate the correlational structure of the external drive. We show that changes in the statistical properties of network activity, namely the local correlation between neurons, can be detected by analyzing the power spectral density (PSD) of the subthreshold membrane potential. These PSDs can be fitted by a power-law function 1/f^α in the upper temporal frequency range. In vivo recordings in primary visual cortex show that the α exponent varies with the statistics of the sensory input. Most remarkably, the exponent observed in the ongoing activity is indistinguishable from that evoked by natural visual statistics. These results are emulated by models which demonstrate that the exponent α is determined by the local level of correlation imposed in the recurrent network activity. Similar relationships are also reproduced in cortical neurons recorded in vitro with artificial synaptic inputs by controlling in computo the level of correlation in real time.
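
As a rough illustration of the power-law fit mentioned in this summary, the sketch below estimates the exponent α by ordinary least squares on log-log axes. It is not the authors' procedure: the function name `fit_power_law_exponent`, the synthetic spectrum, and the 50–200 Hz fitting range are all assumptions made for the example, since the summary does not say how the fit was carried out.

```python
import math

def fit_power_law_exponent(freqs, psd):
    """Estimate alpha in PSD(f) ~ C / f**alpha.

    Taking logs gives log PSD = log C - alpha * log f, so alpha is minus
    the slope of a straight-line fit in log-log coordinates.
    (Hypothetical helper, not taken from the paper.)
    """
    xs = [math.log(f) for f in freqs]
    ys = [math.log(p) for p in psd]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope

# Synthetic, noise-free spectrum with a known exponent (assumed values).
freqs = [float(f) for f in range(50, 201, 10)]  # assumed fitting range, in Hz
psd = [3.0 / f ** 2.5 for f in freqs]           # C = 3.0, alpha = 2.5
alpha = fit_power_law_exponent(freqs, psd)
print(round(alpha, 6))
```

On real recordings the spectrum is noisy and the result depends on the frequency range chosen for the fit, so the estimated exponent should be reported together with that range.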

Bayesian Phylogeography Finds Its Roots

25 Sep 2009

Author Summary

Spreading in time and space, rapidly evolving viruses can accumulate a considerable amount of genetic variation. As a consequence, viral genomes become valuable resources to reconstruct the spatial and temporal processes that are shaping epidemic or endemic dynamics. In molecular epidemiology, spatial inference is often limited to the interpretation of evolutionary histories with respect to the sampling locations of the pathogens. To test hypotheses about the spatial diffusion patterns of viruses, analytical techniques are required that enable us to reconstruct how viruses migrated in the past. Here, we develop a model to infer diffusion processes among discrete locations in timed evolutionary histories in a statistically efficient fashion. Applications to Avian Influenza A H5N1 and Rabies virus in Central and West African dogs demonstrate several advantages of simultaneously inferring spatial and temporal processes from gene sequences.

Disease-Aging Network Reveals Significant Roles of Aging Genes in Connecting Genetic Diseases

25 Sep 2009

Author Summary

Explaining the molecular mechanisms of complex genetic diseases is a crucial step toward curing them. Extensive studies have suggested close relationships between the aging process and genetic diseases. As a result, incorporation of the aging process in studying diseases may provide important insights both in biology and medicine. Here we construct a disease-aging network in humans to systematically explore and visualize the intricate relationships between diseases and the aging process. Instead of focusing on a specific disease or a single gene, we put all complex diseases and the aging process together and probe the interactions among the disease genes and aging genes under the network concept. By checking the network topological properties, we reveal that human disease genes are much closer to aging genes than expected by chance. Further analysis categorizes diseases into two types according to their relationships with aging. Our study provides important evidence to associate diseases and the aging process at the system level and helps to further our understanding of the molecular mechanisms of complex diseases.

Chemically Based Mathematical Model for Development of Cerebral Cortical Folding Patterns

25 Sep 2009

Author Summary

The size and shape of the cerebral cortex varies across species. The cortical folding pattern also varies from a smooth surface where no pattern is visible, as observed in the common treeshrew (Tupaia glis) and Eastern mole (Scalopus aquaticus), to an intricate labyrinthine pattern, as observed in humans. One current model, the intermediate progenitor model, describes the creation of a fold through local interactions in the ventricular zone which surrounds the lateral ventricle. Here we extend the local scenario described in the intermediate progenitor model to include global characteristics that differ between species. We approximate the lateral ventricle with a prolate spheroid and examine how patterns on a spheroidal surface change based on size and eccentricity. Our model reveals a direct correlation between pattern formation and lateral ventricular size and shape. This model's significance is that it elucidates the consistency of cortical patterns among individuals within a species and addresses inter-species variability based on global characteristics, such as size and shape of the lateral ventricle, and provides a critical piece to the puzzle of cortical pattern formation.

PLoS Computational Biology Issue Image | Vol. 5(9) September 2009

25 Sep 2009

The signature of stimulus-driven correlations in visual cortical synaptic activity.

This figure illustrates two contrasted epochs of sub-threshold (and sometimes spiking) responses that can be measured intracellularly in the primary visual cortex of a cat during presentation of a drifting grating (left, red trace) followed by the visuo-oculomotor exploration of a natural scene (right, blue trace). The corresponding dynamics of the membrane potential fluctuations exhibit long-range correlations for the low dimension periodic stimulus (the curtain). These correlations are strongly reduced when the visual system is driven by a natural scene (see El Boustani et al., doi:10.1371/journal.pcbi.1000519).

Image Credit: UNIC-CNRS

Brief Overview of Bioinformatics Activities in Singapore

25 Sep 2009

Conserved Expression Patterns Predict microRNA Targets

25 Sep 2009

Author Summary

microRNAs are small RNA molecules that regulate gene expression by controlling the output of proteins and other RNAs. The exact mechanism through which a microRNA binds to its target and how this affects the target is still a subject of much debate. In this article, the authors sought to find a reverse approach to discover the impact of microRNAs on gene expression. Instead of searching for specific targets of a given microRNA, they searched for microRNA signatures: changes in the levels of microRNAs across multiple tissues that impacted significantly the levels of messenger gene expression in these same tissues. Because many core biological functions are conserved between human and mouse, the authors compared these microRNA signatures between these two species. They found that identical microRNA signatures between these organisms could effectively predict microRNA targets and could estimate the global impact of individual microRNAs on gene output. They further demonstrated that many microRNAs act as expression enhancers by inhibiting gene repressors.

A Computational Analysis of ATP Binding of SV40 Large Tumor Antigen Helicase Motor

25 Sep 2009

Author Summary

The Large Tumor antigen (LTag) encoded by Simian Virus 40 (SV40) is a marvelous molecule that is not only a viral oncogene, but also an efficient molecular machine as a helicase that unwinds double helix DNA for genome replication, an essential process in all living organisms. LTag hexameric helicase uses the energy of ATP to power its conformational switch for DNA unwinding. Understanding how the LTag conformational switch is coupled to the energy from ATP usage by LTag to do the mechanical work of unwinding DNA is of great interest to biologists, and yet remains to be established. Based on our previous high-resolution structures of LTag helicase in different conformational states, we simulated an LTag conformational transition pathway in the ATP binding process using the targeted molecular dynamics method. Our simulation results suggest a three-step process for the ATP binding to the nucleotide pocket, in which ATP is eventually “locked” into the pocket by three pairs of “locker” interactions. We have also quantitatively evaluated the energy profile of ATP binding using a special computational simulation technique. Additionally, our simulation study of ATP binding by LTag and the accompanying conformational switches in the context of a hexamer leads to a refined cooperative iris model that may be used for DNA unwinding.

A Novel Scoring Approach for Protein Co-Purification Data Reveals High Interaction Specificity

25 Sep 2009

Author Summary

To understand and model cellular processes, we require accurate descriptions of the interactions occurring between constituent proteins. Large-scale protein interaction maps have typically been measured in two distinct ways. The first detects direct pair-wise associations by testing only two proteins at a time for an interaction. The second detects large groups of proteins that have conglomerated or purified together. With regard to the latter, it is difficult to deduce which pairs of proteins are physically interacting in the purification data, and interaction maps generally appear random and unstructured. We have developed a novel computational method to analyze the purification data (from the second method) and identify which proteins are directly interacting. The resultant protein interaction map is highly modular, meaning that the proteins organize themselves into localized, densely connected regions that likely represent individually functioning units. We also analyzed interaction maps of the first method and propose that their lack of modularity is a consequence of missing interactions that are undetected for unclear reasons. This study provides insights into the differences between the two interaction detection methods as well as the nature of biological organization.

Statistical Use of Argonaute Expression and RISC Assembly in microRNA Target Identification

25 Sep 2009

Author Summary

MicroRNAs are a family of small RNAs that play important roles in the development, physiological function and stress responses of a wide variety of organisms, and if abnormally expressed are associated with multiple types of cancer in humans. Rather than being translated into proteins, members of the family of microRNAs operate by preventing the translation of messenger RNAs to which they have some degree of sequence complementarity. Although sequence-based bioinformatics techniques have yielded large numbers of predicted messenger- and microRNA targeting relationships, verifying these as bona fide has proven practically difficult. We have developed a novel statistical approach based on the system biology of microRNAs in humans to detect such targeting relationships using high-throughput RNA expression data. Because our approach is not based on information from external target pair predictions, it can play a fully independent role in verifying such predictions as well as be used to obtain de novo target pair predictions. Using two separate data studies, we show that our approach is capable of both reproducing previously observed target pairs and verifying putative target pairs predicted from sequence data, at rates substantially better than marginal comparisons of messenger- and microRNA expression levels.

Ten Simple Rules for Chairing a Scientific Session

25 Sep 2009

Integrating Extrinsic and Intrinsic Cues into a Minimal Model of Lineage Commitment for Hematopoietic Progenitors

25 Sep 2009

Author Summary

Complex biomolecular interaction pathways in signaling networks can lead to non-intuitive behaviors that can prove critical for the regulation and robustness of biological processes. In this work, we present a signaling topology that can generate dynamic responses that are particularly pertinent to cell commitment in hematopoiesis. Our minimal model explores fundamental questions of instructive signaling that have persisted in cell-fate decisions. We show that even when lineage commitment decisions are inherently noisy, external cytokine signals, amplified by receptor upregulation, can bias the lineage choices of a progenitor cell. The multipotent progenitor, based on its differentiation potential, can exhibit several layers of memory to provide stability to both intermediate and mature states and can potentially bypass canonical intermediate states in generating mature cell types. Thus, our model provides a computational framework that can accommodate both classical and non-classical commitment paths in hematopoiesis.

Disease-Aging Network Reveals Significant Roles of Aging Genes in Connecting Genetic Diseases

25 Sep 2009

Author Summary

Explaining the molecular mechanisms of complex genetic diseases is a crucial step for curing them. Extensive studies have suggested close relationships between the aging process and genetic diseases. As a result, incorporation of the aging process in studying diseases may provide important insights both in biology and medicine. Here we construct a disease-aging network in humans to systematically explore and visualize the intricate relationships between diseases and the aging process. Instead of focusing on a specific disease or a single gene, we put all complex diseases and the aging process together and probe the interactions among the disease genes and aging genes under the network concept. By checking the network topological properties, we reveal that human disease genes are much closer to aging genes than expected by chance. Further analysis categorizes diseases into two types according to their relationships with aging. Our study provides important evidence to associate diseases and the aging process at the system level and helps to further our understanding in the molecular mechanisms of complex diseases.

Bayesian Phylogeography Finds Its Roots

25 Sep 2009

Author Summary

Spreading in time and space, rapidly evolving viruses can accumulate a considerable amount of genetic variation. As a consequence, viral genomes become valuable resources to reconstruct the spatial and temporal processes that are shaping epidemic or endemic dynamics. In molecular epidemiology, spatial inference is often limited to the interpretation of evolutionary histories with respect to the sampling locations of the pathogens. To test hypotheses about the spatial diffusion patterns of viruses, analytical techniques are required that enable us to reconstruct how viruses migrated in the past. Here, we develop a model to infer diffusion processes among discrete locations in timed evolutionary histories in a statistically efficient fashion. Applications to Avian Influenza A H5N1 and Rabies virus in Central and West African dogs demonstrate several advantages of simultaneously inferring spatial and temporal processes from gene sequences.

Network-State Modulation of Power-Law Frequency-Scaling in Visual Cortical Neurons

25 Sep 2009

Author Summary

Intracellular recording of neocortical neurons provides an opportunity to characterize the statistical signature of the synaptic bombardment to which they are subjected. Indeed, the membrane potential displays intense fluctuations which reflect the cumulative activity of thousands of input neurons. In sensory cortical areas, this measure could be used to estimate the correlational structure of the external drive. We show that changes in the statistical properties of network activity, namely the local correlation between neurons, can be detected by analyzing the power spectral density (PSD) of the subthreshold membrane potential. These PSDs can be fitted by a power-law function 1/f^α in the upper temporal frequency range. In vivo recordings in primary visual cortex show that the exponent α varies with the statistics of the sensory input. Most remarkably, the exponent observed in the ongoing activity is indistinguishable from that evoked by natural visual statistics. These results are emulated by models which demonstrate that the exponent α is determined by the local level of correlation imposed in the recurrent network activity. Similar relationships are also reproduced in cortical neurons recorded in vitro with artificial synaptic inputs by controlling in computo the level of correlation in real time.
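
As a rough illustration of the power-law fit mentioned above, the exponent α can be estimated by a straight-line fit in log-log coordinates, since log PSD = log C − α log f. The sketch below is not the authors' procedure: the function name, the frequency band (75 to 195 Hz) and the synthetic spectrum are all assumptions chosen for the example.

```python
import math

def fit_power_law_exponent(freqs, psd):
    """Estimate alpha in PSD(f) ~ C / f**alpha by ordinary least squares
    on (log f, log PSD). Returns the negated slope of the fitted line."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(p) for p in psd]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope

# Synthetic, noise-free spectrum with a known exponent (hypothetical values).
freqs = [float(f) for f in range(75, 200, 5)]
true_alpha = 2.5
psd = [3.0 / f ** true_alpha for f in freqs]

alpha_hat = fit_power_law_exponent(freqs, psd)
print(round(alpha_hat, 6))
```

On real recordings the spectrum is noisy, so the choice of frequency band and any averaging across trials would matter far more than in this idealized case.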

Transcriptional Profiling of the Dose Response: A More Powerful Approach for Characterizing Drug Activities

18 Sep 2009

Author Summary

Transcriptional profiling is arguably the most powerful hypothesis-free method for investigating biological effects of drugs—so why do the experiments typically use outmoded single-dose designs? Such single-dose experiments will co-mingle effects that can occur with different potency (e.g., effects on the known target versus effects on additional undesired targets). Single-dose experiments have little comparability to the dose-response bioassays, which are now used throughout the drug discovery processes. One reason for the disparity between experimental approaches is that existing analytical methods for dose-response bioassays can't cope with the dimensionality of microarray data: a typical bioassay is optimized for one response, then used to run a screen against thousands of compounds; whereas transcriptional profiling measures thousands of non-optimized responses to a single compound. Conversely, existing methods for microarray data analysis can identify patterns, but provide no quantitative dose-response information. To overcome these problems, we developed novel algorithms and visualization methods that allow anyone to apply transcriptional profiling as a conventional dose-response assay. The approach provides far more information than limited-dose designs, yet is economical (12 arrays/compound). With this new analytical framework, it is now possible to identify distinct transcriptional responses at distinct regions of the dose range, to link these impacts to biological pathways, and to make realistic connections to drug targets and to other bioassays.

Noise Management by Molecular Networks

18 Sep 2009

Author Summary

Within cells, fluctuations in molecule numbers are inevitable, since the synthesis and degradation of molecules are not synchronised. Such molecular noise can be transferred to other molecules through regulatory interactions. Noise in molecular networks, and especially in gene expression, has been studied extensively over the past years, both experimentally and through mathematical modelling. In this work, we present a theoretical framework that merges concepts derived from metabolic control analysis (which was originally developed to describe the control in metabolic pathways) with linear noise approximation (a concept from statistical physics). This framework is useful to analyse how noise propagates through molecular networks, how noise can be managed within the networks and how different network designs reduce or enhance noise. The present theory makes use of the natural, hierarchical organization of regulatory networks and makes their noise management more understandable in terms of network structure. Within this paper, we apply the framework to signaling and regulatory cascades, and analyse how feedback and time scale separation influence noise propagation in molecular networks.

Human miRNA Precursors with Box H/ACA snoRNA Features

18 Sep 2009

Author Summary

RNAs were long believed to function mainly as messenger RNAs, which act as intermediates between genes and proteins, or as ribosomal RNAs and transfer RNAs, which carry out the translation process. In recent years, however, newly discovered classes of small RNAs have been shown to play important cellular roles. These include microRNAs (miRNAs), which can regulate the production of specific proteins, and small nucleolar RNAs (snoRNAs), which recognise and chemically modify specific sequences in ribosomal RNA. Although miRNAs and snoRNAs are currently believed to be generated by different cellular pathways and to function in different cellular compartments, members of these two types of small RNAs display numerous genomic similarities, and a small number of snoRNAs have been shown to encode miRNAs in several organisms. Here we systematically investigate a possible evolutionary relationship between snoRNAs and miRNAs. Using computational analysis, we identify twenty genomic regions encoding miRNAs with highly significant similarity to snoRNAs, at the level of both their surrounding genomic context and their predicted folded structure. A subset of these miRNAs display functional snoRNA characteristics, strengthening the possibility that these miRNA molecules might have evolved from snoRNAs.

A Statistical Model to Identify Differentially Expressed Proteins in 2D PAGE Gels

18 Sep 2009

Author Summary

Many researchers use two dimensional polyacrylamide gel electrophoresis (2D PAGE) to identify proteins with different concentrations under different conditions. Several statistical methods have been used to identify these proteins, ranging from standard statistical tests to complex image analysis. Most of these methods fail to address the limitation of this technology, which is that when the concentration of a protein is too low, 2D PAGE is unable to detect this particular protein. Standard methodologies implemented in most software packages ignore these proteins completely. We propose an alternative approach based on the likelihood framework, which takes into account when the concentration of protein is above the detection level and below the threshold. Our results show that this model allows us to identify more proteins with different concentration levels under different conditions than the standard statistical approaches.
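
To make the idea of a likelihood that accounts for undetected spots more concrete, here is a minimal sketch of a left-censored normal log-likelihood. It is a generic textbook construction, not the model from the paper: the normal distribution on log-intensities, the single detection limit, and all names and numbers are assumptions for illustration. Detected spots contribute a density term; spots below the limit contribute the probability of falling below it.

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log density of a normal distribution at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def censored_loglik(observed, n_censored, limit, mu, sigma):
    """Log-likelihood when values below `limit` are not measured.

    observed   : values that were detected (assumed above the limit)
    n_censored : number of gels in which the spot was not detected
    """
    ll = sum(normal_logpdf(x, mu, sigma) for x in observed)
    if n_censored:
        # Each undetected spot only tells us "value <= limit".
        ll += n_censored * math.log(normal_cdf(limit, mu, sigma))
    return ll

# Hypothetical log-intensities: three detected spots, two below the limit.
detected = [2.1, 2.4, 1.9]
ll_naive_mean = censored_loglik(detected, 2, 1.5, mu=2.1, sigma=0.5)
ll_lower_mean = censored_loglik(detected, 2, 1.5, mu=1.8, sigma=0.5)
print(ll_lower_mean > ll_naive_mean)
```

In this toy data set the censored likelihood prefers a mean below the average of the detected values, which is the kind of correction that simply discarding the missing spots cannot provide.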

Predicting the Evolution of Sex on Complex Fitness Landscapes

18 Sep 2009

Author Summary

One of the biggest open questions in evolutionary biology is why sexual reproduction is so common despite its manifold costs. Many hypotheses have been proposed that can potentially explain the emergence and maintenance of sexual reproduction in nature, and currently the biggest challenge in the field is assessing their plausibility. Theoretical work has identified the conditions under which sexual reproduction is expected. However, these conditions were typically derived under strongly simplifying assumptions about the relationship between an organism's genotype and its fitness, known as the fitness landscape. Building on previous theoretical work, we here propose different population properties that can be used to predict when sex will be beneficial. We then use simulations across a range of simple and complex fitness landscapes to test whether such predictors generate accurate predictions of evolutionary outcomes. We find that one of the simplest predictors, related to variation of genetic distance between sequences, is also the most accurate one across our simulations. However, stochastic effects occurring in small populations compromise the accuracy of all predictors. Our study both illustrates the limitations of various predictors and suggests directions in which to search for new, experimentally attainable predictors.

Estimating the Continuous-Time Dynamics of Energy and Fat Metabolism in Mice

18 Sep 2009

Author Summary

The unrelenting obesity epidemic has resulted in intensive basic scientific investigation into the molecular mechanisms of body weight regulation—with the mouse being the organism of choice for such studies. We know that any mechanism of body weight regulation must exert its effect by influencing food intake, energy output, fuel selection, or some combination of these factors over extended time scales (~weeks for mice). While food intake and body weight can be frequently measured in mice, current methods prohibit corresponding measurements of energy output or fuel selection on such long time scales. We address this deficiency by developing a mathematical method that quantitatively relates measurements of food intake, body weight and body fat to calculate the dynamic changes of energy output and net fat oxidation rates during the development of obesity and weight loss in male C57BL/6 mice. The mathematical model is based on the law of energy conservation, makes very few assumptions, and provides the first continuous-time estimates of energy output and fuel selection over periods lasting many weeks. Application of our methodology to various mouse models of obesity will improve our understanding of body weight regulation by placing molecular mechanisms in their whole-body physiological context.
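
The energy-conservation idea can be sketched as a simple balance: energy output equals energy intake minus the rate of change of energy stored in the body. The code below is only an illustration under stated assumptions and not the published method, which produces continuous-time estimates. The energy densities (9.4 kcal/g for fat, 1.8 kcal/g for lean tissue), the finite-difference scheme, and the example measurements are all assumed values.

```python
RHO_FAT = 9.4   # kcal per g of fat mass (assumed illustrative value)
RHO_LEAN = 1.8  # kcal per g of fat-free mass (assumed illustrative value)

def energy_output(days, intake_kcal_per_day, body_weight_g, fat_mass_g):
    """Average energy output (kcal/day) for each interval between measurements.

    days                : measurement days, increasing
    intake_kcal_per_day : mean intake over each interval (len(days) - 1 values)
    body_weight_g       : body weight at each measurement day
    fat_mass_g          : body fat at each measurement day
    """
    outputs = []
    for i in range(1, len(days)):
        dt = days[i] - days[i - 1]
        d_fat = fat_mass_g[i] - fat_mass_g[i - 1]
        lean_now = body_weight_g[i] - fat_mass_g[i]
        lean_before = body_weight_g[i - 1] - fat_mass_g[i - 1]
        d_lean = lean_now - lean_before
        # Rate at which energy is being stored in body tissue.
        stored_rate = (RHO_FAT * d_fat + RHO_LEAN * d_lean) / dt
        # Conservation of energy: output = intake - storage.
        outputs.append(intake_kcal_per_day[i - 1] - stored_rate)
    return outputs

# Hypothetical mouse gaining weight over one week.
out = energy_output(days=[0, 7], intake_kcal_per_day=[14.0],
                    body_weight_g=[25.0, 27.0], fat_mass_g=[3.0, 4.4])
print(out)
```

When body composition is constant the storage term vanishes and the estimated output equals the intake, which is a useful sanity check on any implementation of this balance.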

Fast Mapping of Short Sequences with Mismatches, Insertions and Deletions Using Index Structures

11 Sep 2009

Author Summary

The successful mapping of high-throughput sequencing (HTS) reads to reference genomes largely depends on the accuracy of both the sequencing technologies and reference genomes. Current mapping algorithms focus on mapping with mismatches but largely neglect insertions and deletions, regardless of whether they are caused by sequencing errors or genomic variation. Furthermore, trailing primer contamination and declining read quality can be problematic for programs that allow only a maximum number of mismatches. We have developed and implemented a new approach for short read mapping that, in a first step, computes exact matches of the read and the reference genome. The exact matches are then modified by a limited number of mismatches, insertions and deletions. From the set of exact and inexact matches, we select those with minimum score-based E-values. This gives a set of regions in the reference genome that are aligned to the read using Myers' bit-vector algorithm [1]. Our method utilizes enhanced suffix arrays [2] to quickly find the exact and inexact matches. It maps more reads and achieves higher recall rates than previous methods. This consistently holds for reads produced by 454 as well as Illumina sequencing technologies.
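
The first step described above, finding exact matches through an index, can be illustrated with a plain suffix array and binary search. This is a simplified sketch, not the tool's implementation: it uses a naive suffix array rather than an enhanced one, it covers only exact matching (no mismatches, insertions, deletions, or E-value scoring), and the function names and example sequence are hypothetical.

```python
def build_suffix_array(text):
    """Naive suffix array: start positions of all suffixes in sorted order.
    Fine for short examples; real indexes use much faster construction."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def find_exact(text, sa, pattern):
    """Return sorted start positions of `pattern` in `text`.

    Suffixes are sorted, so their length-m prefixes are in non-decreasing
    order and all matches occupy one contiguous block of the suffix array.
    """
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])

reference = "GATTACAGATTACA"
sa = build_suffix_array(reference)
hits = find_exact(reference, sa, "TTAC")
print(hits)
```

Each query costs O(m log n) string comparisons here. Enhanced suffix arrays add auxiliary tables so that such lookups, and the extension to inexact matches, can be done more efficiently.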

Eradication of Chronic Myeloid Leukemia Stem Cells: A Novel Mathematical Model Predicts No Therapeutic Benefit of Adding G-CSF to Imatinib

11 Sep 2009

Author Summary

Imatinib mesylate (Gleevec) is currently the standard treatment for chronic myeloid leukemia (CML) and elicits a large reduction in leukemic cell burden in most patients. However, strong evidence suggests that imatinib does not cure the disease; approximately 20% of patients relapse within three years, and discontinuation of imatinib therapy often leads to a rebound of the leukemic cell burden. Laboratory studies have suggested that there exists a subpopulation of “quiescent” leukemia cells (i.e., cells that do not divide) that may be insensitive to imatinib treatment. It has been postulated that the disease outcome may be improved by administering imatinib in conjunction with the Granulocyte-Colony Stimulating Factor (G-CSF), a growth factor which “wakes up” the quiescent stem cells and sensitizes them to imatinib. In this study, we design a novel mathematical model of stem cell quiescence to investigate the treatment response to imatinib and G-CSF. We find that adding G-CSF to an imatinib treatment protocol leads to observable effects only if the majority of leukemic stem cells are quiescent. Our model also predicts that adding G-CSF leads to a higher risk of resistance, since it increases the number of leukemic stem cell divisions and thus the probability of acquiring a resistance mutation.

Evaluation of Objective Uncertainty in the Visual System

11 Sep 2009

Author Summary

Most work in vision science focuses on the question of why we perceive what we do, and we now have many models explaining what physical properties of a stimulus make us see depth, colour, etc. Here we ask instead what makes us feel confident in our visual perception: in the context of a visual task, what are the physical properties of the stimulus that will make us think we are doing the task well? The mathematical framework of Bayesian statistics provides an elegant way to frame the problem, by assuming that the visual system is trying to estimate physical properties of the world from incomplete, sometimes unreliable visual information. Objective uncertainty will therefore depend on the quality of the information available in the stimulus. In our experiments we compare objective uncertainty—as computed using the Bayesian framework—with subjective uncertainty, the confidence observers report about their visual percepts. To this end, we use a visual task with well-defined statistical properties, discrimination under noise. We report a surprising degree of agreement between objective and subjective uncertainty, and discuss possible computational models that could explain this ability of the visual system.

Species Tree Inference by Minimizing Deep Coalescences

11 Sep 2009

Author Summary

Inferring the evolutionary history of a set of species, known as the species tree, is a task of utmost significance in biology and beyond. The traditional approach to accomplishing this task from molecular sequences entails sequencing a gene in the set of species under consideration, reconstructing the gene's evolutionary history, and declaring it to be the species tree. However, recent analyses of multiple gene data sets, made available thanks to advances in sequencing technologies, have indicated that gene trees in the same group of species may disagree with each other, as well as with the species tree. Therefore, the development of methods for inferring the species tree despite such disagreements is imperative.

In this paper, we propose such a method, which seeks the tree that minimizes the amount of disagreement between the input set of gene trees and the inferred one. We have implemented our method and studied its performance, in terms of accuracy and computational efficiency, on two biological data sets and a large number of simulated data sets. Our analyses, of both the biological and synthetic data sets, indicate high accuracy of the method, as well as computationally efficient solutions in practice. Hence, our method makes a good candidate for inferring accurate species trees, despite gene tree disagreements, at a genomic scale.
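
One common way to quantify the disagreement between a gene tree and a candidate species tree is to count extra lineages: for every branch of the species tree, the number of gene lineages that must pass through it, minus one. The sketch below scores a single candidate under assumptions that are not taken from the source: trees are rooted nested tuples of leaf names, each species contributes exactly one gene copy, and the search over species trees is omitted entirely. All function names are hypothetical.

```python
def clusters(tree):
    """Return (root_cluster, edges) for a nested-tuple tree.
    Each edge is a pair (child_leaf_set, parent_leaf_set)."""
    edges = []

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        child_sets = [walk(child) for child in node]
        here = frozenset().union(*child_sets)
        for child_set in child_sets:
            edges.append((child_set, here))
        return here

    return walk(tree), edges

def deep_coalescence_cost(species_tree, gene_tree):
    """Total number of extra lineages over all species-tree branches."""
    _, species_edges = clusters(species_tree)
    _, gene_edges = clusters(gene_tree)
    extra = 0
    for s_child, _ in species_edges:
        # A gene lineage crosses this branch if it starts inside the
        # species cluster and coalesces only above it.
        lineages = sum(
            1 for g_child, g_parent in gene_edges
            if g_child <= s_child and not g_parent <= s_child
        )
        extra += lineages - 1
    return extra

species = (("A", "B"), ("C", "D"))
print(deep_coalescence_cost(species, (("A", "B"), ("C", "D"))))
print(deep_coalescence_cost(species, (("A", "C"), ("B", "D"))))
```

Summing this cost over all input gene trees gives a score for one candidate species tree. A full method would then search for the candidate with the smallest total.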

Parallel Computational Subunits in Dentate Granule Cells Generate Multiple Place Fields

11 Sep 2009

Author Summary

Neurons were originally divided into three morphologically distinct compartments: the dendrites receive the synaptic input, the soma integrates it, and the axon communicates the output of the cell to other neurons. Although several lines of evidence have challenged this oversimplified view, neurons are still considered to be the basic information processing units of the nervous system, as their output reflects the computations performed by the entire dendritic tree. In the present study, the authors build a simplified computational model and calculate that, in certain neurons, relatively small dendritic branches are able to independently trigger somatic firing. Therefore, in these cells, an action potential mirrors the activity of a small dendritic subunit rather than the input arriving at the whole dendritic tree. These neurons can be regarded as a network of a few independent integrator units connected to a common output unit. The authors demonstrate that the moderately branched dendritic tree of hippocampal granule cells may be optimized for these parallel computations. Finally, the authors show that these parallel dendritic computations could explain some aspects of the location-dependent activity of hippocampal granule cells.

Predicting Positive p53 Cancer Rescue Regions Using Most Informative Positive (MIP) Active Learning

04 Sep 2009

Author Summary

Engineering proteins to acquire or enhance a particular useful function is at the core of many biomedical problems. This paper presents Most Informative Positive (MIP) active learning, a novel integrated computational/biological approach designed to help guide biological discovery of novel and informative positive mutants. A classifier, together with modeled structure-based features, helps guide biological experiments and so accelerates protein engineering studies. MIP reduces the number of expensive biological experiments needed to achieve novel and informative positive results. We used the MIP method to discover novel p53 cancer rescue mutants. p53 is a tumor suppressor protein, and destructive p53 mutations have been implicated in half of all human cancers. Second-site cancer rescue mutations restore p53 activity and eventually may facilitate rational design of better cancer drugs. This paper shows that, even in the first round of in vivo experiments, MIP significantly increased the discovery rate of novel and informative positive mutants.

A Combinatorial Approach to Detect Coevolved Amino Acid Networks in Protein Families of Variable Divergence

04 Sep 2009

Author Summary

Fine analyses of families of protein sequences reveal the existence of networks of coevolved amino acids. These networks are clusters of residues that are often in physical contact with one another, and they also relate residues that are located far apart in the three-dimensional structure. Coevolved residues often play a major biological role in the protein, and their interactions may be of several kinds, including binding specificity, allosteric regulation and conformational change of the protein. By carefully tracing the way residues evolved within the phylogenetic tree of sequences of a protein family, the Maximal SubTree Method captures the transition, over evolutionary time, of a conserved position to a coevolved position, and provides a numerical evaluation of the degree of coevolution of pairs of coevolved residues in a protein. This combinatorial approach removes the requirement of high sequence divergence that limits the applicability of previously proposed statistical approaches, and it can be applied with high accuracy to families of protein sequences with variable divergence.

Thermodynamic Selection of Steric Zipper Patterns in the Amyloid Cross-β Spine

04 Sep 2009

Author Summary

Accumulation of amyloid fibrils is a salient feature of various protein misfolding diseases. Recent advances in precision experiments have begun to reveal their atomistic structures. Quantitative elucidation of how the observed structures are selected over other possible filament patterns would provide much insight into the formation and properties of amyloid fibrils. Using computer simulations and structural modeling, we demonstrate that the most stable filament pattern corresponds to the experimentally observed structure, and that molecular polymorphism, the selection of two or more patterns, is possible when there is more than one most stable structure. The ability to predict the structure allows for more detailed analysis, so that, for example, we can identify the most important residue for stabilizing the structure that could be therapeutically targeted. Our analysis will be useful for comparing different amyloid structures formed by the same protein or when delineating the roles of different intermolecular forces in filament formation.

The Role of Ongoing Dendritic Oscillations in Single-Neuron Dynamics

04 Sep 2009

Author Summary

A central issue in biology is how local processes yield global consequences. This is especially relevant for neurons since these spatially extended cells process local synaptic inputs to generate global action potential output. The dendritic tree of a neuron, which receives most of the inputs, expresses ion channels that can generate nonlinear dynamics. A prominent phenomenon resulting from such ion channels is voltage oscillation. The distribution of the active membrane channels throughout the cell is often highly non-uniform. This can turn the dendritic tree into a network of sparsely spaced local oscillators. Here we analyze whether local dendritic oscillators can produce cell-wide voltage oscillations. Our mathematical theory shows that, even when the dendritic oscillators are weakly coupled, they lock their phases and give global oscillations. We show how the biophysical properties of the dendrites determine the global locking and how it can be controlled by synaptic inputs. As a consequence of global locking, even individual synaptic inputs can affect the timing of action potentials. In fact, dendrites locking in synchrony can lead to sustained firing of the cell. We show that dendritic trees can be bistable, with dendrites locking in either synchrony or asynchrony, which may provide a novel mechanism for single cell-based memory.
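Phase locking of weakly coupled oscillators can be illustrated with a generic two-oscillator phase model. This is a minimal sketch with sine coupling and forward-Euler integration, not the authors' dendritic theory; the frequencies, coupling strengths and function name are hypothetical. For this model the phase difference settles at asin((omega1 - omega2) / (2 * coupling)) when the frequency mismatch is at most twice the coupling, and drifts otherwise.

```python
import math


def simulate_pair(omega1, omega2, coupling, dt=0.01, steps=20000):
    """Integrate two sine-coupled phase oscillators; return theta1 - theta2."""
    theta1, theta2 = 0.0, 0.0
    for _ in range(steps):
        # Each oscillator is pulled toward the phase of the other.
        d1 = omega1 + coupling * math.sin(theta2 - theta1)
        d2 = omega2 + coupling * math.sin(theta1 - theta2)
        theta1 += dt * d1
        theta2 += dt * d2
    return theta1 - theta2


# Hypothetical parameters: a 0.1 frequency mismatch.
locked = simulate_pair(1.1, 1.0, coupling=0.1)      # 2 * coupling > mismatch
predicted = math.asin((1.1 - 1.0) / (2 * 0.1))      # expected locked offset
drifting = simulate_pair(1.1, 1.0, coupling=0.01)   # 2 * coupling < mismatch
```

With the stronger coupling the phase difference converges to a constant offset; with the weaker coupling it keeps growing over the simulated interval.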

Googling Food Webs: Can an Eigenvector Measure Species' Importance for Coextinctions?

04 Sep 2009

Author Summary

Predicting the consequences of species' extinction is a crucial problem in ecology. Species are not isolated, but connected to each other in tangled networks of relationships known as food webs. In this work we want to determine which species are critical because they support many other species. The fact that species are not independent, however, makes the problem difficult to solve. Moreover, the number of possible “importance” rankings for species is too high to allow a solution by enumeration. Here we take a “reverse engineering” approach: we study how we can make biodiversity collapse in the most efficient way in order to investigate which species cause the most damage if removed. We show that adapting the algorithm Google uses for ranking web pages always solves this seemingly intractable problem, finding the most efficient route to collapse. In this sense the algorithm works better than all the others previously proposed, and it lays the foundation for a complete analysis of extinction risk in ecosystems.
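For readers unfamiliar with the ranking algorithm mentioned above, the following is a minimal sketch of standard PageRank by power iteration. It is not the authors' adaptation, whose details are not given in this summary; the toy food web, the choice to point each consumer at the species it eats, and the damping value are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-12, max_iter=1000):
    """Standard PageRank on a dict {node: [nodes it points to]}."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            targets = links.get(v, [])
            for t in targets:
                new[t] += damping * rank[v] / len(targets)
        change = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if change < tol:
            break
    return rank


# Hypothetical toy food web: each consumer points to the species it eats.
toy_web = {
    "fox": ["rabbit", "mouse"],
    "owl": ["mouse"],
    "rabbit": ["grass"],
    "mouse": ["grass"],
    "grass": [],
}
scores = pagerank(toy_web)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy web the basal species receives the highest score because every other species depends on it directly or indirectly, which matches the intuition that species supporting many others are the critical ones.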

Interrogating and Predicting Tolerated Sequence Diversity in Protein Folds: Application to E. elaterium Trypsin Inhibitor-II Cystine-Knot Miniprotein

04 Sep 2009

Author Summary

The use of engineered proteins in medicine and biotechnology has surged in recent years. An emerging approach for developing novel proteins is to use a naturally-occurring protein as a molecular framework, or scaffold, wherein amino acid mutations are introduced to elicit new properties, such as the ability to recognize a specific target molecule. Successful protein engineering with this strategy requires a dependable and customizable scaffold that tolerates modifications without compromising structure. An important consideration for scaffold utility is whether existing loops can be replaced with loops of different lengths and amino acid sequences without disrupting the protein framework. This paper offers a rigorous study of the effects of modifying the exposed loops of Ecballium elaterium trypsin inhibitor II (EETI), a member of a family of promising scaffold proteins called knottins. Through our work, we identified sequence patterns of modified EETI loops that are structurally tolerated. Using bioinformatics tools, we established molecular guidelines for designing peptides for substitution into EETI and successfully predicted loop-substituted EETI variants that retain the correct protein fold. This study provides a basis for understanding the versatility of the knottin scaffold as a protein engineering platform and can be applied for predictive interrogation of other scaffold proteins.

Influence of Sequence Changes and Environment on Intrinsically Disordered Proteins

04 Sep 2009

Author Summary

Intrinsically disordered proteins, proteins that exist as conformational ensembles without time-invariant residue positions, have emerged as an important and common class of proteins in all kingdoms of life. Disordered proteins are characterized by distinct amino acid preferences, distinct mechanisms of binding, distinct substitution patterns and rates of evolution, and functional roles predominantly related to signaling and regulation. In recent years, disordered proteins have also been linked to human disease, both through conformational diseases and via host-pathogen interactions. However, despite their increased importance, most studies of disordered proteins do not consider the environmental context in which the protein is found or the level of sequence change that would strongly influence the property of being disordered. To address this, we studied and quantified the variability of intrinsically disordered protein regions under different external conditions, such as temperature or pH, and compared it to the variability introduced by small sequence changes. We found that both have a strong impact on the existence of disordered regions, thus potentially regulating protein function by environmental factors or facilitating evolutionary change.

Global Motions of the Nuclear Pore Complex: Insights from Elastic Network Models

04 Sep 2009

Author Summary

The nuclear pore complex (NPC) serves as the sole gateway to the cell nucleus, and its proper functioning is therefore crucial for gene expression and many vital signaling pathways. Although it is typically circular, the overall structure of the NPC has been observed to change in response to the presence of cargo. Recently, the molecular architecture of the yeast NPC, including the shapes and relative positions of its constituent proteins, has been resolved. These new structural data provide us with a first opportunity to construct an accurate dynamical model of a macromolecular machine containing hundreds of proteins. By modeling the NPC as a network of masses connected by springs, we investigate its probable large-scale dynamics. We start from a very coarse model and gradually refine it, observing how the structural details influence the calculated dynamics. We find that the NPC dynamics are quite similar to those of a flexible toroid with an uneven mass distribution, and that the 8-fold symmetry that is universally observed in NPCs enables them to undergo certain collective motions that are inaccessible to structures of other symmetries.
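The idea of a network of masses connected by springs, and the role of ring symmetry, can be sketched with a toy example. The code below builds the connectivity (Kirchhoff) matrix of eight identical nodes on a unit circle, linked by unit springs whenever they lie within a cutoff distance, and checks that the cosine and sine waves around the ring are modes with the same eigenvalue. The node count, cutoff, unit spring constants and function names are assumptions for illustration and bear no relation to the authors' NPC model.

```python
import math


def ring_coordinates(n, radius=1.0):
    """Place n nodes evenly on a circle."""
    return [(radius * math.cos(2 * math.pi * j / n),
             radius * math.sin(2 * math.pi * j / n)) for j in range(n)]


def kirchhoff_matrix(coords, cutoff):
    """Connectivity matrix: -1 for each spring, node degree on the diagonal."""
    n = len(coords)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                K[i][j] = K[j][i] = -1.0
                K[i][i] += 1.0
                K[j][j] += 1.0
    return K


def matvec(K, u):
    """Multiply matrix K by vector u."""
    return [sum(k * x for k, x in zip(row, u)) for row in K]


def mode_residual(K, mode, eigenvalue):
    """Largest entry of K*mode - eigenvalue*mode (zero for an exact mode)."""
    return max(abs(a - eigenvalue * b) for a, b in zip(matvec(K, mode), mode))


N = 8
K = kirchhoff_matrix(ring_coordinates(N), cutoff=1.0)  # nearest neighbours only
angle = 2 * math.pi / N
eigenvalue = 2 - 2 * math.cos(angle)  # analytic value for the slowest ring mode
cos_mode = [math.cos(angle * j) for j in range(N)]
sin_mode = [math.sin(angle * j) for j in range(N)]
cos_residual = mode_residual(K, cos_mode, eigenvalue)
sin_residual = mode_residual(K, sin_mode, eigenvalue)
```

Both residuals vanish up to rounding, so the two waves form a degenerate pair, a generic consequence of rotational symmetry in ring-shaped networks.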

A Structured Model of Video Reproduces Primary Visual Cortical Organisation

04 Sep 2009

Author Summary

When we look at a visual scene, neurons in our eyes “fire” short, electrical pulses in a pattern that encodes information about the visual world. This pattern passes through a series of processing stages within the brain, eventually leading to cells whose firing encodes high-level aspects of the scene, such as the identity of a visible object regardless of its position, apparent size or angle. Remarkably, features of these firing patterns, at least at the earlier stages of the pathway, can be predicted by building “efficient” codes for natural images: that is, codes based on models of the statistical properties of the environment. In this study, we have taken a first step towards extending this theoretical success to describe later stages of processing, building a model that extracts a structured representation in much the same way as does the visual system. The model describes discrete, persistent visual elements, whose appearance varies over time—a simplified version of a world built of objects that move and rotate. We show that when fit to natural image sequences, features of the “code” implied by this model match many aspects of processing in the first cortical stage of the visual system, including: the individual firing patterns of types of cells known as “simple” and “complex”; the distribution of coding properties over these cells; and even how these properties depend on the cells' physical proximity. The model thus brings us closer to understanding the functional principles behind the organisation of the visual system.

Evolutionary Triplet Models of Structured RNA

28 Aug 2009

Author Summary

A number of leading methods for bioinformatics analysis of structural RNAs use probabilistic grammars as models for pairs of homologous RNAs. We show that any such pairwise grammar can be extended to an entire phylogeny by treating the pairwise grammar as a machine (a “transducer”) that models a single ancestor-descendant relationship in the tree, transforming one RNA structure into another. In addition to phylogenetic enhancement of current applications, such as RNA genefinding, homology detection, alignment and secondary structure prediction, this should enable probabilistic phylogenetic reconstruction of RNA sequences that are ancestral to present-day genes. We describe statistical inference algorithms, software implementations, and a simulation-based comparison of three-taxon maximum likelihood alignment to several other methods for aligning three sibling RNAs. In the Discussion we consider how the three-taxon RNA alignment-reconstruction-folding algorithm, which is currently very computationally-expensive, might be made more efficient so that larger phylogenies could be considered.

The Origins of Lactase Persistence in Europe

28 Aug 2009

Author Summary

Most adults worldwide do not produce the enzyme lactase and so are unable to digest the milk sugar lactose. However, most people in Europe and many from other populations continue to produce lactase throughout their life (lactase persistence). In Europe, a single genetic variant, −13,910*T, is strongly associated with lactase persistence and appears to have been favoured by natural selection in the last 10,000 years. Since adult consumption of fresh milk was only possible after the domestication of animals, it is likely that lactase persistence coevolved with the cultural practice of dairying, although it is not known when lactase persistence first arose in Europe or what factors drove its rapid spread. To address these questions, we have developed a simulation model of the spread of lactase persistence, dairying, and farmers in Europe, and have integrated genetic and archaeological data using newly developed statistical approaches. We infer that lactase persistence/dairying coevolution began around 7,500 years ago between the central Balkans and central Europe, probably among people of the Linearbandkeramik culture. We also find that lactase persistence was not more favoured in northern latitudes through an increased requirement for dietary vitamin D. Our results illustrate the possibility of integrating genetic and archaeological data to address important questions on human evolution.

Pushing Structural Information into the Yeast Interactome by High-Throughput Protein Docking Experiments

28 Aug 2009

Author Summary

Proteins are the main agents of most biological processes. However, they seldom act alone, and most cellular functions are, in fact, carried out by large macromolecular complexes and regulated through intricate protein-protein interaction networks. Consequently, large efforts have been devoted to unveiling protein interrelationships in a high-throughput manner, and the last several years have seen the completion of the first interactome drafts for several model organisms. Unfortunately, these studies only reveal whether two proteins interact, but not the molecular bases of these interactions. A full comprehension of how proteins bind and form complexes can only come from high-resolution, three-dimensional (3D) structures, since they provide the key quasi-atomic details necessary to understand how the individual components in a complex or pathway are assembled and coordinated to function as a molecular unit. Here, we use protein docking experiments, in a high-throughput manner, to predict the 3D structure of over 3,000 interactions in yeast, which will be used to complement the complex structures obtained within the 3D-Repertoire pan-European initiative (http://www.3drepertoire.org).

Interpreting Expression Data with Metabolic Flux Models: Predicting Mycobacterium tuberculosis Mycolic Acid Production

28 Aug 2009

Author Summary

The ability of cells to survive and grow depends on their ability to metabolize nutrients and create products vital for cell function. This is done through a complex network of reactions controlled by many genes. Changes in cellular metabolism play a role in a wide variety of diseases. However, despite the availability of genome sequences and of genome-scale expression data, which give information about which genes are present and how active they are, our ability to use these data to understand changes in cellular metabolism has been limited. We present a new approach to this problem, linking gene expression data with models of cellular metabolism. We apply the method to predict the effects of drugs and agents on Mycobacterium tuberculosis (M. tb). Virulence, growth in human hosts, and drug resistance are all related to changes in M. tb's metabolism. We predict the effects of a variety of conditions on the production of mycolic acids, essential cell wall components. Our method successfully identifies seven of the eight known mycolic acid inhibitors in a compendium of 235 conditions, and identifies the top anti-TB drugs in this dataset. We anticipate that the method will have a range of applications in metabolic engineering, the characterization of disease states, and drug discovery.

Hierarchical Modeling of Activation Mechanisms in the ABL and EGFR Kinase Domains: Thermodynamic and Mechanistic Catalysts of Kinase Activation by Cancer Mutations

28 Aug 2009

Author Summary

Mutations in protein kinases are implicated in many cancers, and an important goal of cancer research is to elucidate the molecular effects of mutated kinase genes that contribute to tumorigenesis. We present a comprehensive computational study of the molecular mechanisms of kinase activation by cancer-causing mutations. Using a battery of computational approaches, we have systematically investigated the effects of clinically important cancer mutants on the dynamics of the ABL and EGFR kinase domains and regulatory multi-protein complexes. The results of this study have illuminated common and specific features of the activation mechanism in the normal and oncogenic forms of ABL and EGFR. We have found that mutants with higher oncogenic activity may cause a partial destabilization of the inactive structure, while simultaneously facilitating activating transitions and enhancing the stabilization of the active conformation. Our results provide useful insights into thermodynamic and mechanistic aspects of the activation mechanism and highlight the role of structurally distinct conformational states in kinase regulation. Ultimately, molecular signatures of activation mechanisms in the normal and oncogenic states may aid in the correlation of mutational effects with clinical outcomes and facilitate the development of therapeutic strategies to combat kinase mutation-dependent tumorigenesis.

Stable, Precise, and Reproducible Patterning of Bicoid and Hunchback Molecules in the Early Drosophila Embryo

28 Aug 2009

Author Summary

For developing embryos, the precise, position-specific regulation of molecular processes is of vital importance. The widely accepted mechanism for such regulation is the intraembryonic distribution of regulatory molecules called “morphogens”. One of the best-studied morphogens is Bicoid in the early developmental stage of the Drosophila embryo. Synthesized around the anterior pole of the embryo, Bicoid forms an exponential gradient of concentration to initiate expression of a target gene, hunchback, in nuclei at the periphery of the embryo. This invariably forms a concentration boundary of the product protein Hunchback at around 49% embryo length. Remarkably, the embryo-to-embryo variability in the boundary position is less than 5%. Reactions in embryos, however, should be intrinsically noisy because the number of molecules involved is small, and those reactions are governed by randomly diffusing molecules. The mechanisms that generate the invariable Hunchback distribution by filtering the intense noise remain mysterious, and here we construct models to shed light on this problem. Stochastic simulations show that the slow diffusion of Hunchback averages out the intense noise, so that the coordinated rates of diffusion and transport of input Bicoid and output Hunchback play decisive roles in suppressing fluctuations.
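How an exponential gradient translates a concentration threshold into a position can be shown with a short worked example. It is a deterministic sketch only, not the stochastic model of the paper: the decay length of 0.2 embryo lengths, the normalised source concentration, the simple threshold readout and the function names are assumptions. For a gradient c(x) = c0 * exp(-x / decay_length), a threshold is crossed at x = decay_length * ln(c0 / threshold), so a relative change in c0 shifts the boundary by decay_length times the logarithm of that factor.

```python
import math


def bicoid_concentration(x, c0=1.0, decay_length=0.2):
    """Exponential gradient; x and decay_length in units of embryo length."""
    return c0 * math.exp(-x / decay_length)


def boundary_position(threshold, c0=1.0, decay_length=0.2):
    """Position at which the gradient falls to the given threshold."""
    return decay_length * math.log(c0 / threshold)


# Choose the threshold so that the reference boundary sits at 49% embryo length.
threshold = bicoid_concentration(0.49)
x_ref = boundary_position(threshold)

# Hypothetical perturbation: 10% more Bicoid at the source.
x_shifted = boundary_position(threshold, c0=1.1)
shift = x_shifted - x_ref  # equals decay_length * ln(1.1)
```

Under these assumed numbers a 10% change in source strength moves the boundary by roughly 2% of embryo length, which gives a sense of how tightly noise must be filtered to keep the boundary reproducible.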

FLORA: A Novel Method to Predict Protein Function from Structure in Diverse Superfamilies

28 Aug 2009

Author Summary

Understanding how the three-dimensional (3D) molecular structure of proteins influences their function can provide insights into the workings of biological systems. Structural Genomics Initiatives have been set up to investigate these structures on a large scale and make the data available to the wider biological research community. However, in a significant number of cases, there is little known about the functions of the structures that are solved. To address this, computational methods can be used as a predictive tool to guide future experimental investigations. One such approach is to exploit global structural comparison to assign the protein in question to an evolutionary family, which has already been functionally characterised. However, this is problematic in some large evolutionary families, which contain a number of different functional sub-families. We have developed a new method (FLORA) which is able to calculate 3D “motifs” which are specific to each of these sub-families. Any new protein structure can then be compared against these motifs to make a more accurate prediction of its function. Our paper shows that FLORA substantially outperforms other standard approaches for predicting function from structure. We use our method to make confident functional predictions for a set of proteins solved by the structural genomics projects, which could not have been assigned reliably by global structure comparison.

PLoS Computational Biology Issue Image | Vol. 5(8) August 2009

28 Aug 2009

Map of Europe showing a lactose molecule, Linearbandkeramik pottery, and the inferred origin location of lactase persistence/dairying coevolution.

Lactase persistence, the genetic trait that enables adult humans to digest the milk sugar lactose, is thought to have coevolved with the culturally transmitted practice of dairying. In this work the authors use computer simulations, conditioned on archaeological and genetic data, to infer that this coevolution process began about 7,500 years ago in a region between the northern Balkans and Central Europe, probably in association with the Neolithic Linearbandkeramik culture (see Itan et al., doi:10.1371/journal.pcbi.1000491).

Image Credit: Yael Pinchevsky, Yuval Itan, Joachim Burger, and Mark G. Thomas. Photograph credit: Sabine Schade-Lindig.

A Quick Guide to Teaching R Programming to Computational Biology Students

28 Aug 2009

Detection of Functional Modes in Protein Dynamics

28 Aug 2009

Author Summary

Proteins are flexible nanomachines that frequently accomplish their biological function by collective atomic motions. Such motions may be characterized by hinge, shear, or rotational motions of entire protein domains, loop movements, or subtle rearrangements of amino acid side chains. In many cases it is far from obvious how collective motions are related to a particular biological task. Therefore, we propose a novel technique termed “functional mode analysis” that, based on an ensemble of structures, aims to detect a collective motion that is directly related to a particular protein function. From the given set of protein structures, together with a “functional quantity”, the technique seeks the collective motion that is maximally correlated to the functional quantity. The chosen functional quantity can be quite general; typical examples could include the openness of a channel, active site geometry, or cleft solvent accessibility. Because the proposed framework is highly general, we expect the approach to be useful to a wide range of applications. To illustrate the new technique, we apply functional mode analysis to molecular dynamics trajectories of a polyalanine-helix, bacteriophage T4 lysozyme, Trp-cage, and Leucine-binding protein.

Autocatalytic Loop, Amplification and Diffusion: A Mathematical and Computational Model of Cell Polarization in Neural Chemotaxis

28 Aug 2009

Author Summary

The ability of cells to respond to chemical signals present in the environment is of utmost importance for life. In the developing embryo, cells crawl along graded fields of chemical cues to aggregate into organized patterns. This process is an example of chemotaxis. It is a complex phenomenon, where external signals are transduced into internal chemical pathways leading to directional movement. Differential reorganization of the internal structure is called polarization, and it involves regulatory proteins as well as cytoskeletal elements. In this work, we propose a mathematical and computational model for the quantitative study of chemotactic pathfinding in neural cells. Our starting point is the recent finding that, for such cells, an early polarization event is the redistribution on the membrane of cue–ligated receptors, transported by the cytoskeletal structures, which act as a sort of conveyor belt. We show that this proposed mechanism, connecting cue sensing and cytoskeleton dynamics in a closed loop, is qualitatively and quantitatively adequate to produce polarization. We also investigate the role of the internal biochemical chain in producing signal amplification and its tight interlacing with polarization. An extension of the model is used to study chemotactic behaviors such as the attractive/repulsive response of axons exposed to the same cue.

Strong Inference for Systems Biology

28 Aug 2009

Bioinformatics in Malaysia: Hope, Initiative, Effort, Reality, and Challenges

28 Aug 2009

Computation of Conformational Coupling in Allosteric Proteins

28 Aug 2009

Author Summary

A common means of biological regulation is allostery, in which an effector molecule binds to one site on a protein and induces a conformational change which changes activity at a distant active site. Frequently high resolution structures are determined for one state of an allosteric protein but not the other. To probe the allosteric conformational changes in such cases, we describe a computational method for predicting the structure of one allosteric state of a protein starting with knowledge of another. Our method also provides a detailed map of the free energy landscape traversed in an allosteric transition and reveals the coupling between interacting residue pairs that underlies the transition.

A Self-Organizing Algorithm for Modeling Protein Loops

21 Aug 2009

Author Summary

Protein loops play an important role in protein function, such as ligand binding, recognition, and allosteric regulation. However, due to their flexibility, it is notoriously difficult to determine their 3D structures using traditional experimental techniques. As a result, one can often find protein structures with missing loops in the Protein Data Bank. Their sequence variability also presents a particular challenge for homology modeling methods, which can only yield good overall structures given sufficient sequence identity and good experimental reference structures. Despite extensive research, the construction of protein loop 3D structures remains an open problem, since a sensible conformation should seamlessly bridge the anchor points without introducing steric clashes within the loop itself or between the loop and its surrounding environment. Here, we present a conceptually simple, mathematically straightforward, numerically robust and computationally efficient approach for building protein loop conformations that simultaneously satisfy end-point, steric, planar and chiral constraints. More importantly, additional constraints derived from experimental sources can be incorporated in a straightforward manner, allowing the processing of more complex structures involving multiple interlocking loops.

Can Molecular Motors Drive Distance Measurements in Injured Neurons?

21 Aug 2009

Author Summary

Neurons have extremely long axonal processes that can reach lengths of up to 1 meter in human peripheral nerves. The neuronal cell body response to nerve injury is dependent on signals carried by molecular motors from the lesion site in the axon. The distance between the injury site and the cell body influences the type of response, suggesting that neurons must be able to estimate the distance of an axonal injury site, although how they do this is unknown. We have used a computational approach to model intracellular distance measurement after nerve injury. The models show the feasibility of a mechanism based on a rapid, near instantaneous, signal carried by action potentials in the nerve, followed by multiple slower signals carried on molecular motors. Such a mechanism can enable a neuron to discriminate between distances as close as 10% of total axon length. The model provides insights on retrograde injury signaling in neurons, including the biological relevance of the mechanism over different scales of nerves and organisms. Moreover, if similar mechanisms function in synapse to nucleus signaling in uninjured neurons, this could enable estimation of relative process lengths, thus guiding metabolic output from cell bodies to axons.

Amyloidogenic Regions and Interaction Surfaces Overlap in Globular Proteins Related to Conformational Diseases

21 Aug 2009

Author Summary

The aggregation of proteins in tissues is associated with the pathogenesis of more than 40 human diseases. The polypeptides underlying disorders such as Alzheimer's and Parkinson's are devoid of any regular structure, whereas the polypeptides causing familial amyotrophic lateral sclerosis or nonneuropathic systemic amyloidosis correspond to globular proteins. Little is known about the mechanism by which globular proteins under physiological conditions aggregate from their initially folded and soluble conformations. Interestingly, several of these pathogenic proteins display quaternary structure or are bound to other proteins in their physiological context. In the present work, we show that protein-protein interaction surfaces and regions with high aggregation propensity significantly overlap in these polypeptides. This suggests that the formation of native complexes and self-aggregation reactions probably compete in the cell, explaining why point mutations affecting the interface or the stability of the protein complex lead in many cases to the formation of toxic aggregates. This study proposes general strategies to fight against diseases associated with the deposition of globular polypeptides.

Accurate Prediction of DnaK-Peptide Binding via Homology Modelling and Experimental Data

21 Aug 2009

Author Summary

Molecular chaperones are essential elements of the protein quality control machinery that governs translocation and folding of nascent polypeptides, refolding and degradation of misfolded proteins, and activation of a wide range of client proteins. This variety of functions results from the existence of multiple chaperones with different structures. Chaperones bind to exposed regions of proteins to fulfil their function. The chaperone must hereby recognise a certain signal sequence on the substrate protein. The nature of the sequence that is exposed will determine the types of chaperones that can interact with it, and in the end will also determine the fate of the substrate protein: refolding, translocation, degradation or activation. Knowledge of the binding sequence determinants of molecular chaperones will shed more light on the mechanism of how each chaperone contributes to the cellular protein quality control system.

In this study we have made an algorithm which accurately predicts binding sites for the well studied E. coli Hsp70 chaperone, DnaK, which is implicated in folding efficiency and prevention of aggregation. The ability to detect and design high-affinity DnaK binding sites enhances our understanding of chaperone-substrate recognition and opens great opportunities to enhance protein solubility using protein-DnaK binding motif fusions.

A Mapping of Drug Space from the Viewpoint of Small Molecule Metabolism

21 Aug 2009

Author Summary

All humans, plants, and animals use enzymes to metabolize food for energy, build and maintain the body, and get rid of toxins. Drugs used to clear infections or cure cancer often target enzymes in bacteria or cancer cells, but the drugs can interfere with the proper function of human enzymes as well. Recent studies have mapped drugs to enzymes and many other targets in humans and other organisms, but have not focused on metabolism. In this study, we present a new method to predict what enzymes drugs might affect based on the chemical similarity between classes of drugs and the natural chemicals used by enzymes. We have applied the method to 246 known drug classes and a collection of 385 organisms (including 65 National Institutes of Health Priority Pathogens) to create maps of potential drug action in metabolism. We also show how the predicted connections can be used to find new ways to kill pathogens and to avoid unintentionally interfering with human enzymes.

Microarray Comparative Genomic Hybridisation Analysis Incorporating Genomic Organisation, and Application to Enterobacterial Plant Pathogens

21 Aug 2009

Author Summary

We describe the first use of a method for the analysis of bacterial microarray comparative genomic hybridisation (aCGH) that includes information about the spatial organisation of genes in the reference bacterium. We demonstrate that using this information improves predictive performance over standard bacterial aCGH methods in discriminating between genes from the reference organism that either do or do not have putative orthologues in the comparator organism. Our approach produces good results on more distantly related bacteria than can successfully be analysed by the standard methods. We apply our analysis to comparisons between four commercially-significant plant pathogenic bacteria, and identify several regions of the genome that are likely to contribute to their ability to cause disease, and to proliferate in the environment, generating hypotheses for future experiments.

On the Accessibility of Adaptive Phenotypes of a Bacterial Metabolic Network

21 Aug 2009

Author Summary

Adaptation involves the discovery by mutation and spread through populations of traits (or “phenotypes”) that have high fitness under prevailing environmental conditions. While the spread of adaptive phenotypes through populations is mediated by natural selection, the likelihood of their discovery by mutation depends primarily on the relationship between genetic information and phenotypes (the genotype-phenotype mapping, or GPM). Elucidating the factors that influence the structure of the GPM is therefore critical to understanding the adaptation process. We investigated the influence of environmental quality on GPM structure for a well-studied model of Escherichia coli's metabolism. Our results suggest that the GPM is more rugged in qualitatively poorer environments and, therefore, the discovery of adaptive phenotypes may be intrinsically less likely in such environments. Nevertheless, we found that the GPM contains large neutral networks in all studied environments, suggesting that populations adapting to these environments could circumvent the frequent “hill descents” that would otherwise be required by a rugged GPM. Moreover, we demonstrated that adaptation proceeds faster in environments for which the GPM transmits information about phenotype differences more efficiently, providing a connection between information theory and evolutionary theory. These results have implications for understanding constraints on adaptation in nature.

Temporal Variability and Social Heterogeneity in Disease Transmission: The Case of SARS in Hong Kong

21 Aug 2009

Author Summary

Recent epidemics have shown that healthcare workers may be overrepresented among cases and how critical it is to protect them. For example, during the 2002–2003 severe acute respiratory syndrome (SARS) epidemic in Hong Kong, 27% of cases were healthcare workers, although they made up <1% of the population. Better means of protection require understanding how healthcare workers were infected and assessing their role in disease transmission. Here, we describe a method for estimating the temporal profile of the risk of infection and probability of transmission in the community and hospitals. The 2002–2003 SARS outbreak in Hong Kong is used as an example. For the SARS epidemic, we show that the risk of infection in the community and hospitals decreased with time, down to zero in hospitals, but remained larger in the community. This observation suggests that public health measures and behavioural changes most effectively reduced transmission in hospitals. In addition, we find that the large number of cases observed among healthcare workers is more likely a result of large and sustained exposure to hospitalized cases than of transmission among healthcare workers. These results are relevant to the design of control measures in the event of an influenza pandemic.

Investigating CTL Mediated Killing with a 3D Cellular Automaton

21 Aug 2009

Author Summary

The immune response mediated by cytotoxic T lymphocytes (CTLs), which kill infected cells, is thought to be essential to control viral infections. Experiments offer data which allow one to address the efficacy of this cell population in vivo and to estimate characterizing parameters. However, it is unclear which mathematical description reflects the experimental situation best and leads to reliable parameter estimates that quantify CTL efficacy. We simulate the spatial interaction of CTLs and infected cells in a 3-dimensional computer model to examine different mathematical descriptions of the experimental situation, independently of experimental data. Thereby we find an appropriate mathematical term to describe the killing process. Estimates obtained so far describe CTL efficacy on a population level. By varying the individual properties of simulated CTLs, such as the velocity, we find that the time a CTL needs to kill an infected cell is probably the key factor limiting CTL killing efficacy. Our analysis identifies additional experimental directions which could advance our quantitative understanding of CTL killing for different diseases.

A Condensation-Ordering Mechanism in Nanoparticle-Catalyzed Peptide Aggregation

14 Aug 2009

Author Summary

Protein misfolding and aggregation are associated with a wide variety of human disorders, which include Alzheimer's and Parkinson's diseases and late onset diabetes. It has been recently realised that the process of aggregation may be triggered by the presence of nanoparticles. We use here molecular dynamics simulations to characterise the molecular mechanism by which such nanoparticles are capable of enhancing the rate of formation of peptide aggregates. Our findings indicate that nanoparticle surfaces act as a catalyst that increases the local concentration of peptides, thus facilitating their subsequent assembly into stable fibrillar structures. The approach that we present, in addition to providing a description of the process of aggregation of peptides in the presence of nanoparticles, will enable the study of the mechanism of action of a variety of other potential aggregation-promoting agents present in living organisms, including lipid membranes and other cellular components.

Model-Based Deconvolution of Cell Cycle Time-Series Data Reveals Gene Expression Details at High Resolution

14 Aug 2009

Author Summary

Time-series analyses of cellular regulatory processes have successfully drawn attention to the importance of temporal regulation in biological systems. A number of model systems can be synchronized such that data collected on cell populations better reflect the dynamic properties of the individual cell. However, experimental synchronization is never perfect, and the degree of synchrony that does exist at the outset of an experiment is quickly lost over time as cells grow at different rates and enter different developmental or physiological states on cell division. Thus, data collected from a population of synchronized cells can lead to incorrect models of temporal regulation. Here we demonstrate that the problem of relating population data to the individual cell can be resolved with a computational method that effectively removes the effects of both imperfect synchrony and time-dependent loss of synchrony. Application of this deconvolution algorithm to a cell cycle time-series data set from the model bacterium Caulobacter crescentus uncovers critical temporal details in the expression of essential genes that are not evident in the raw population-based data. The deconvolution routine presented here is a robust and general tool for extracting biochemical parameters of the average single cell from population time-series data.

Red Queen Dynamics with Non-Standard Fitness Interactions

14 Aug 2009

Author Summary

The Red Queen has become an eponym for rapid and perpetual evolutionary arms races between hosts and parasites. The Red Queen also lends her name to the idea that such arms races are at the core of the question of why sexual reproduction is so widespread among higher-level organisms. According to this view, recombination provides the hosts with an advantage that allows faster adaptation to the parasite population. To date, mathematical models trying to quantify Red Queen dynamics and the Red Queen hypothesis for the evolution of sex have generally made several simplifying assumptions about how host and parasite genotypes interact with each other (i.e., how they influence each other's fitness). In this article we present a model that allows for arbitrary patterns of fitness interactions between both parties. We demonstrate that the degree of ‘antagonicity’ in these interactions is decisive for whether Red Queen dynamics are observed, and assess the robustness of various previous results concerning the Red Queen hypothesis with respect to fitness interactions. Our results also make clear how difficult predictions of coevolutionary dynamics and selection for recombination are likely to be in real host-parasite systems.

Nash Equilibria in Multi-Agent Motor Interactions

14 Aug 2009

Author Summary

Human motor interactions range from adversarial activities like judo and arm wrestling to more cooperative activities like tandem riding and tango dancing. In this study, we design a new methodology to study human sensorimotor interactions quantitatively based on game theory. We develop two motor tasks based on the prisoner's dilemma and the rope-pulling game in which we introduce an intrinsic cost related to effort rather than the typical monetary outcome used in cognitive game theory. We find that continuous motor interactions converged to game theoretic outcomes similar to the interaction dynamics reported for other dynamical systems in biology ranging in scale from microorganisms to population dynamics.
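
To make the game-theoretic outcome concrete, here is a minimal sketch that enumerates the pure-strategy Nash equilibria of a two-player game. The function name, the discrete two-action setting and the payoff numbers are all assumptions made for illustration. The study itself concerns continuous motor interactions with effort-related costs, which this toy example does not reproduce.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player game.

    payoffs[(a, b)] = (payoff to player 1, payoff to player 2).
    A profile is an equilibrium if neither player can gain by
    unilaterally switching to another action.
    """
    actions_1 = sorted({a for a, _ in payoffs})
    actions_2 = sorted({b for _, b in payoffs})
    equilibria = []
    for a, b in product(actions_1, actions_2):
        u1, u2 = payoffs[(a, b)]
        # Player 1 cannot improve while player 2 keeps playing b.
        best_1 = all(payoffs[(alt, b)][0] <= u1 for alt in actions_1)
        # Player 2 cannot improve while player 1 keeps playing a.
        best_2 = all(payoffs[(a, alt)][1] <= u2 for alt in actions_2)
        if best_1 and best_2:
            equilibria.append((a, b))
    return equilibria


# Hypothetical prisoner's dilemma: payoffs are negative costs (e.g. effort).
PRISONERS_DILEMMA = {
    ("cooperate", "cooperate"): (-1, -1),
    ("cooperate", "defect"): (-3, 0),
    ("defect", "cooperate"): (0, -3),
    ("defect", "defect"): (-2, -2),
}

if __name__ == "__main__":
    print(pure_nash_equilibria(PRISONERS_DILEMMA))
```

With these assumed payoffs, mutual defection is the only profile from which neither player benefits by deviating, even though mutual cooperation would cost both players less.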

Accelerated Immunodeficiency by Anti-CCR5 Treatment in HIV Infection

14 Aug 2009

Author Summary

HIV has caused over 30 million deaths. The virus is so fatal because it infects and depletes CD4+ T cells, “helper” immune cells critical for orchestrating and stimulating the overall immune response. No one understands why, in about 50% of HIV infections, a more deadly strain emerges late in infection. The new HIV strain, known as X4, differs from its predecessor, known as R5, because X4 only infects CD4+ T cells displaying the receptor CXCR4, while R5 only infects CD4+ T cells displaying the receptor CCR5. Because CXCR4 and CCR5 are found on different CD4+ T cells, X4 depletes a second set of critical immune cells, accelerating immunodeficiency and death. Recently, the FDA began approving drugs that selectively block R5, and some researchers have touted anti-R5 therapy alone as a potentially safer alternative to current anti-HIV drugs. But an open question is whether anti-R5 treatments push HIV toward the more deadly X4 variant earlier. To understand how X4 emerges and how anti-R5 treatments affect X4, we apply a combination of mathematical analysis and simulation. An important medical result of our work is that anti-R5 treatment alone can accelerate X4 emergence and immunodeficiency. Our results suggest that anti-R5 treatment only be used with anti-X4 treatment or anti-HIV drug “cocktails,” which combat R5 and X4 equally.

A Parsimony Approach to Biological Pathway Reconstruction/Inference for Genomes and Metagenomes

14 Aug 2009

Author Summary

Even though there is only a single large biological network within any cell and all pathways are to some extent connected, the partition of the entire cellular network into smaller units (e.g., KEGG pathways) is extremely important for understanding biological processes. Biological pathway reconstruction, therefore, is essential for understanding the biological functions that a newly sequenced genome encodes and recently for studying the functionality of a natural environment via metagenomics. The common practice of pathway reconstruction in metagenomics first identifies functions encoded by the metagenomic sequences and then reconstructs pathways from the annotated functions by mapping the functions to reference pathways. To address the issues of both incomplete data (e.g., metagenomes, unlike individual genomes, are most likely incomplete) and pathway redundancy (e.g., the same function is involved in multiple pathway units), we formulate a parsimony version of the pathway reconstruction/inference problem, called MinPath (Minimal set of Pathways): given a set of reference pathways and a set of functions that can be mapped to one or more pathways, MinPath aims at finding a minimum number of pathways that can explain all functions. MinPath achieves a more conservative, yet more faithful, estimation of the biological pathways encoded by genomes and metagenomes.
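
The parsimony formulation described above can be sketched as a small minimum set cover search. This is an illustration only: the summary does not say how MinPath solves the problem, so the exhaustive search below is a stand-in chosen for clarity, and the function name, pathway names and function labels are hypothetical. Exhaustive search is only practical for tiny inputs.

```python
from itertools import combinations


def minimal_pathway_sets(pathways, observed):
    """Return every smallest set of pathways that explains all observed functions.

    pathways: dict mapping a pathway name to the set of functions it contains.
    observed: iterable of functions identified in the (meta)genome.
    """
    observed = set(observed)
    names = sorted(pathways)
    coverable = set().union(*(pathways[n] for n in names))
    if not observed <= coverable:
        raise ValueError("some observed functions map to no reference pathway")
    # Try increasing subset sizes; the first size that works is the minimum.
    for size in range(len(names) + 1):
        hits = [
            combo
            for combo in combinations(names, size)
            if observed <= set().union(*(pathways[n] for n in combo))
        ]
        if hits:
            return hits
    return []


# Hypothetical reference pathways with overlapping functions.
REFERENCE = {
    "P1": {"f1", "f2"},
    "P2": {"f2"},
    "P3": {"f3", "f4"},
    "P4": {"f4"},
}
OBSERVED = {"f1", "f2", "f3", "f4"}

if __name__ == "__main__":
    print(minimal_pathway_sets(REFERENCE, OBSERVED))
```

In this toy input, naive mapping would report all four pathways because each contains at least one observed function, whereas the parsimonious answer needs only P1 and P3.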

Recognizing Sequences of Sequences

14 Aug 2009

Author Summary

Despite tremendous advances in neuroscience, we cannot yet build machines that recognize the world as effortlessly as we do. One reason might be that there are computational approaches to recognition that have not yet been exploited. Here, we demonstrate that the ability to recognize temporal sequences might play an important part. We show that an artificial decoding device can extract natural speech sounds from sound waves if speech is generated as dynamic and transient sequences of sequences. In principle, this means that artificial recognition can be implemented robustly and online using dynamic systems theory and Bayesian inference.

Temporal Controls of the Asymmetric Cell Division Cycle in Caulobacter crescentus

14 Aug 2009

Author Summary

Because of its small genome size and the ease with which it can be manipulated genetically and biochemically, Caulobacter crescentus provides unique opportunities to study the molecular circuitry controlling the asymmetric cell division cycle of bacteria. A large amount of experimental data accumulated on this model organism in recent years needs to be quantitatively reconciled and analyzed in order to generate a full description of the process. Here, from these experimental clues, we suggest a mechanism for the principal molecular interactions that control DNA synthesis and asymmetric cell division in Caulobacter and construct a quantitative (mathematical) model of the mechanism in order to analyze the temporal dynamics of the control system. The model is centered around three “master regulator” proteins, whose timing of expression is tightly controlled by the progression of DNA replication. The model has been validated against observed phenotypes of wild-type cells and relevant mutants, and predicts phenotypes of novel mutants and of known mutants under novel experimental conditions. It provides a rigorous account of current intuitive ideas of bacterial cell cycle control and advances our understanding of bacterial cell division.

 bioinformatics   computational biology   genetics   genomics   molecular biology   systems biology 

Pregolia © 2008-2009  |  Bridging academic communities  |  Concept and realization by Anthony Liekens  |  Pregolia is licensed under a Creative Commons 3.0 License