IEEE/ACM Transactions on Computational Biology and Bioinformatics

Issue 4 • July-Aug. 2013


Displaying Results 1 - 25 of 31
  • [Front Cover]

    Publication Year: 2013, Page(s): c1
    PDF (2073 KB)
    Freely Available from IEEE
  • [Front Inside Cover]

    Publication Year: 2013, Page(s): c2
    PDF (320 KB)
    Freely Available from IEEE
  • Guest Editorial for Special Section on BSB 2012

    Publication Year: 2013, Page(s): 817 - 818
    PDF (68 KB) | HTML
    Freely Available from IEEE
  • Extending the Algebraic Formalism for Genome Rearrangements to Include Linear Chromosomes

    Publication Year: 2013, Page(s): 819 - 831
    PDF (1410 KB) | HTML

    Algebraic rearrangement theory, as introduced by Meidanis and Dias, focuses on representing the order in which genes appear in chromosomes and applies to circular chromosomes only. By shifting our attention to genome adjacencies, we introduce the adjacency algebraic theory, extending the original algebraic theory to linear chromosomes in a very natural way and allowing the original algebraic distance formula to be used in the general multichromosomal case, with both linear and circular chromosomes. The resulting distance, which we call the algebraic distance here, is very similar to, but not quite the same as, the double-cut-and-join distance. We present linear-time algorithms to compute it and to sort genomes. We show how to compute the rearrangement distance from the adjacency graph, for an easier comparison with other rearrangement distances. A thorough discussion of the relationship between the chromosomal and adjacency representations is also given, and we show how all classic rearrangement operations can be modeled using the algebraic theory.

  • 2D Meets 4G: G-Quadruplexes in RNA Secondary Structure Prediction

    Publication Year: 2013, Page(s): 832 - 844
    PDF (1491 KB) | HTML

    G-quadruplexes are abundant locally stable structural elements in nucleic acids. The combinatorial theory of RNA structures and the dynamic programming algorithms for RNA secondary structure prediction are extended here to incorporate G-quadruplexes using a simple but plausible energy model. With preliminary energy parameters, we find that the overwhelming majority of putative quadruplex-forming sequences in the human genome are likely to fold into canonical secondary structures instead. Stable G-quadruplexes are strongly enriched, however, in the 5'UTR of protein coding mRNAs.

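    As an informal illustration of what counts as a putative quadruplex-forming sequence, the sketch below scans a sequence for four G-tracts separated by short loops; the motif bounds (G-runs of three or more, loops of 1-7 nucleotides) are common conventions assumed here, not the paper's energy model.

    import re

    # Assumed PQS motif: four G-tracts of length >= 3 separated by loops of 1-7 nt.
    # The lookahead lets overlapping hits be reported.
    PQS = re.compile(r"(?=(G{3,}\w{1,7}G{3,}\w{1,7}G{3,}\w{1,7}G{3,}))")

    def find_pqs(seq):
        """Return (start, matched substring) for each putative quadruplex site."""
        seq = seq.upper().replace("U", "T")
        return [(m.start(), m.group(1)) for m in PQS.finditer(seq)]

    if __name__ == "__main__":
        utr = "AUGGGGAGGGGCUGGGAGGGUCCAAAUUU"
        print(find_pqs(utr))
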
  • Proximity Measures for Clustering Gene Expression Microarray Data: A Validation Methodology and a Comparative Analysis

    Publication Year: 2013, Page(s): 845 - 857
    Multimedia
    PDF (829 KB) | HTML

    Cluster analysis is usually the first step adopted to unveil information from gene expression microarray data. Besides selecting a clustering algorithm, choosing an appropriate proximity measure (similarity or distance) is of great importance for achieving satisfactory clustering results. Nevertheless, to date there are no comprehensive guidelines concerning how to choose proximity measures for clustering microarray data. Pearson correlation is the most commonly used proximity measure, whereas the characteristics of other measures remain largely unexplored. In this paper, we investigate the choice of proximity measures for the clustering of microarray data by evaluating the performance of 16 proximity measures on 52 data sets from time-course and cancer experiments. Our results show that measures rarely employed in the gene expression literature can provide better results than commonly employed ones, such as Pearson, Spearman, and Euclidean distance. Given that different measures stood out in the time-course and cancer evaluations, the choice of measure should be specific to each scenario. To evaluate measures on time-course data, we preprocessed and compiled 17 data sets from the microarray literature into a benchmark, along with a new methodology called Intrinsic Biological Separation Ability (IBSA). Both can be employed in future research to assess the effectiveness of new measures for gene time-course data.

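    For readers unfamiliar with the proximity measures named above, the minimal sketch below computes Pearson, Spearman, and Euclidean proximities between two toy expression profiles with SciPy; it is only an illustration of the measures, not the authors' benchmark.

    import numpy as np
    from scipy.stats import pearsonr, spearmanr
    from scipy.spatial.distance import euclidean

    # Two toy expression profiles (e.g., one gene measured across six conditions each).
    x = np.array([2.1, 3.4, 1.0, 5.6, 4.2, 3.3])
    y = np.array([1.9, 3.1, 1.2, 5.0, 4.8, 2.9])

    print("Pearson r:    %.3f" % pearsonr(x, y)[0])
    print("Spearman rho: %.3f" % spearmanr(x, y)[0])
    print("Euclidean d:  %.3f" % euclidean(x, y))
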
  • A Closed-Loop Control Scheme for Steering Steady States of Glycolysis and Glycogenolysis Pathway

    Publication Year: 2013, Page(s): 858 - 868
    PDF (954 KB) | HTML

    Biochemical networks normally operate in the neighborhood of one of their multiple steady states, and may move from one steady state to another within a finite time span. In this paper, a closed-loop control scheme is proposed to steer the states of the glycolysis and glycogenolysis (GG) pathway from one of its steady states to another. The GG pathway is modeled in the synergism and saturation system formalism, known as the S-system. This S-system model is linearized into the controllable Brunovsky canonical form using a feedback linearization technique. For closed-loop control, the linear-quadratic regulator (LQR) and the linear-quadratic Gaussian (LQG) regulator are invoked to design a controller for tracking prespecified steady states. In the feedback linearization technique, a global diffeomorphism function is proposed that facilitates achieving the regulation requirement. The robustness of the regulated GG pathway is studied under input perturbation and measurement noise.

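    As a generic illustration of LQR design on a linearized model, the sketch below solves the continuous-time algebraic Riccati equation with SciPy; the system matrices are placeholder assumptions (a double integrator), not the linearized S-system of the GG pathway.

    import numpy as np
    from scipy.linalg import solve_continuous_are

    # Placeholder linearized system x_dot = A x + B u (Brunovsky-like double integrator).
    A = np.array([[0.0, 1.0],
                  [0.0, 0.0]])
    B = np.array([[0.0],
                  [1.0]])
    Q = np.eye(2)          # state weighting
    R = np.array([[1.0]])  # input weighting

    P = solve_continuous_are(A, B, Q, R)   # solves A'P + PA - P B R^-1 B' P + Q = 0
    K = np.linalg.solve(R, B.T @ P)        # LQR gain: u = -K x
    print("LQR gain K:", K)
    print("Closed-loop eigenvalues:", np.linalg.eigvals(A - B @ K))
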
  • A Divide and Conquer Approach for Construction of Large-Scale Signaling Networks from PPI and RNAi Data Using Linear Programming

    Publication Year: 2013, Page(s): 869 - 883
    Multimedia
    PDF (3039 KB) | HTML

    Inference of the topology of signaling networks from perturbation experiments is a challenging problem. Recently, the inference problem has been formulated as a reference network editing problem, and it has been shown that finding the minimum number of edit operations on a reference network to comply with perturbation experiments is an NP-complete problem. In this paper, we propose an integer linear programming (ILP) model for reconstruction of signaling networks from RNAi data and a reference network. The ILP model guarantees the optimal solution; however, it is practical only for small signaling networks of 10-15 genes due to its computational complexity. To scale to large signaling networks, we propose a divide-and-conquer-based heuristic, in which a given reference network is divided into smaller subnetworks that are solved separately, and the solutions are merged together to form the solution for the large network. We validate our proposed approach on real and synthetic data sets, and comparison with the state of the art shows that our approach scales better for large networks while attaining similar or better biological accuracy.

  • A Knowledge-Based Multiple-Sequence Alignment Algorithm

    Publication Year: 2013, Page(s): 884 - 896
    PDF (1192 KB) | HTML

    A common and cost-effective mechanism for identifying the functionalities, structures, or relationships between species is multiple-sequence alignment, in which DNA/RNA/protein sequences are arranged and aligned so that similarities between sequences are clustered together. Correctly identifying and aligning these biological similarities helps in applications ranging from unraveling the mystery of species evolution to drug design. We present our knowledge-based multiple-sequence alignment (KB-MSA) technique, which utilizes existing knowledge databases such as SWISSPROT, GENBANK, or HOMSTRAD to provide a more realistic and reliable sequence alignment. We also provide a modified version of this algorithm (CB-MSA) that utilizes sequence consistency information when knowledge databases are not available. Our benchmark tests on the BAliBASE, PREFAB, HOMSTRAD, and SABMARK references show accuracy improvements of up to 10 percent on twilight data sets against many leading alignment tools such as ISPALIGN, PADT, CLUSTALW, MAFFT, PROBCONS, and T-COFFEE.

  • A Two-Phase Bio-NER System Based on Integrated Classifiers and Multiagent Strategy

    Publication Year: 2013, Page(s): 897 - 904
    Multimedia
    PDF (785 KB) | HTML

    Biomedical named entity recognition (Bio-NER) is a fundamental step in biomedical text mining. This paper presents a two-phase Bio-NER model targeting the JNLPBA task. Our two-phase method divides the task into two subtasks: named entity detection (NED) and named entity classification (NEC). The NED subtask is accomplished in the first phase, where named entities (NEs) are distinguished from non-named entities (NNEs) in the biomedical literature without identifying their types: six classifiers are constructed with four toolkits (CRF++, YamCha, maximum entropy, Mallet) and different training methods, and are integrated using a two-layer stacking method. In the second phase, for the NEC subtask, a multiagent strategy is introduced to determine the correct entity type for the entities identified in the first phase. The experimental results show that the presented approach achieves an F-score of 76.06 percent, which outperforms most state-of-the-art systems.

  • An Improved Approximation Algorithm for Scaffold Filling to Maximize the Common Adjacencies

    Publication Year: 2013, Page(s): 905 - 913
    PDF (1644 KB) | HTML

    Scaffold filling is a new combinatorial optimization problem in genome sequencing. The one-sided scaffold filling problem can be described as follows: given an incomplete genome I and a complete (reference) genome G, fill the missing genes into I such that the number of common (string) adjacencies between the resulting genome I' and G is maximized. This problem is NP-complete for genomes with duplicated genes, and the best known approximation factor is 1.33, obtained with a greedy strategy. In this paper, we prove a better lower bound on the optimal solution and devise a new algorithm, exploiting a maximum matching method and a local improvement technique, that improves the approximation factor to 1.25. For genomes with gene repetitions, this is the only known NP-complete problem that admits an approximation with a small constant factor (less than 1.5).

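    The quantity being maximized, the number of common (string) adjacencies, can be illustrated with the minimal sketch below; the gene sequences are invented, and this shows only the objective, not the 1.25-approximation algorithm.

    from collections import Counter

    def adjacencies(genome):
        """Multiset of unordered adjacent gene pairs in a linear gene sequence."""
        return Counter(frozenset({a, b}) if a != b else (a, b)
                       for a, b in zip(genome, genome[1:]))

    def common_adjacencies(g1, g2):
        """Number of common adjacencies (multiset intersection)."""
        return sum((adjacencies(g1) & adjacencies(g2)).values())

    if __name__ == "__main__":
        I_filled = ["a", "b", "c", "d", "b"]   # hypothetical filled scaffold I'
        G        = ["a", "b", "d", "c", "b"]   # reference genome G
        print(common_adjacencies(I_filled, G))  # -> 4
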
  • An Optimization Rule for In Silico Identification of Targeted Overproduction in Metabolic Pathways

    Publication Year: 2013, Page(s): 914 - 926
    Multimedia
    PDF (1678 KB) | HTML

    In an extension of previous work, we introduce a second-order optimization method for determining optimal paths from the substrate to a target product of a metabolic network, through which the amount of the target is maximized. An objective function for this purpose, along with certain linear constraints, is formulated and minimized. The basis vectors spanning the null space of the stoichiometric matrix, which depicts the metabolic network, are computed, and their convex combinations satisfying the constraints are taken as flux vectors. A further set of constraints, incorporating weighting coefficients corresponding to the enzymes in the pathway, is considered; these weighting coefficients appear in the objective function to be minimized. During minimization, the values of the weighting coefficients are estimated and learned. On minimization, these values represent an optimal pathway, depicting optimal enzyme concentrations, that leads to overproduction of the target. Results on various networks demonstrate the usefulness of the methodology in the domain of metabolic engineering. A comparison with standard gradient descent and the extreme pathway analysis technique is also performed. Unlike the gradient descent method, the present method, being independent of the learning parameter, exhibits improved results.

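    The first step described above, computing a basis of the null space of the stoichiometric matrix, can be sketched with SciPy as below; the matrix is a made-up toy network, and any linear combination of the basis vectors satisfies S v = 0 (the paper then restricts attention to convex combinations satisfying its additional constraints).

    import numpy as np
    from scipy.linalg import null_space

    # Placeholder stoichiometric matrix S (rows: metabolites, columns: reactions)
    # for a toy pathway  -> A -> B ->  with an extra branch B -> A.
    S = np.array([
        [1, -1,  0,  1],   # metabolite A
        [0,  1, -1, -1],   # metabolite B
    ])

    N = null_space(S)            # columns span {v : S v = 0}
    print("Null-space basis:\n", N)

    # Any linear combination v = N @ w satisfies S v = 0.
    w = np.array([0.5, 0.5])     # assumed weighting coefficients
    v = N @ w
    print("S v =", S @ v)        # ~0, i.e., a steady-state flux vector
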
  • Algebraic Representation of Asynchronous Multiple-Valued Networks and Its Dynamics

    Publication Year: 2013, Page(s): 927 - 938
    Cited by: Papers (2)
    PDF (893 KB) | HTML

    In this paper, the dynamics of asynchronous multiple-valued networks (AMVNs) are investigated based on a linear representation. Using the semitensor product of matrices, we convert AMVNs into a discrete-time linear representation. A general formula for calculating all network transition matrices of a specific AMVN is obtained. A necessary and sufficient algebraic criterion is proposed for determining whether a given state belongs to a loose attractor of length s. Formulas for the numbers of attractors in AMVNs are provided. Finally, algorithms are presented to detect all of the attractors and their basins. Examples are shown to demonstrate the feasibility of the proposed scheme.

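    For readers unfamiliar with the semitensor product used for the linear representation, here is a compact sketch implementing the standard Kronecker-product definition; the Boolean AND example is an assumption for illustration and is not taken from the paper.

    import numpy as np
    from math import lcm  # Python 3.9+

    def stp(A, B):
        """Semitensor product of A (m x n) and B (p x q)."""
        n, p = A.shape[1], B.shape[0]
        t = lcm(n, p)
        return np.kron(A, np.eye(t // n)) @ np.kron(B, np.eye(t // p))

    if __name__ == "__main__":
        # Canonical logical vectors: True = [1,0]^T, False = [0,1]^T.
        T = np.array([[1.0], [0.0]])
        F = np.array([[0.0], [1.0]])
        # Structure matrix of Boolean AND; columns ordered (T,T), (T,F), (F,T), (F,F).
        M_AND = np.array([[1.0, 0.0, 0.0, 0.0],
                          [0.0, 1.0, 1.0, 1.0]])
        print(stp(stp(M_AND, T), F))  # -> [[0.],[1.]], i.e., AND(True, False) = False
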
  • Algorithms for Genome-Scale Phylogenetics Using Gene Tree Parsimony

    Publication Year: 2013, Page(s): 939 - 956
    Multimedia
    PDF (543 KB) | HTML

    The use of genomic data sets for phylogenetics is complicated by the fact that evolutionary processes such as gene duplication and loss, or incomplete lineage sorting (deep coalescence), cause incongruence among gene trees. One well-known approach that deals with this complication is gene tree parsimony, which, given a collection of gene trees, seeks a species tree that requires the smallest number of evolutionary events to explain the incongruence of the gene trees. However, a lack of efficient algorithms has limited the use of this approach. Here, we present efficient algorithms for SPR- and TBR-based local search heuristics for gene tree parsimony under the 1) duplication, 2) loss, 3) duplication-loss, and 4) deep coalescence reconciliation costs. These novel algorithms improve upon the time complexities of previous algorithms for these problems by a factor of n, where n is the number of species in the collection of gene trees. Our algorithms provide a substantial improvement in runtime and scalability compared to previous implementations, and enable large-scale gene tree parsimony analyses using any of the four reconciliation costs. They have been implemented in the software packages DupTree and iGTP, and have already been used to perform several compelling phylogenetic studies.

  • Analytical Solution of Steady-State Equations for Chemical Reaction Networks with Bilinear Rate Laws

    Publication Year: 2013, Page(s): 957 - 969
    Multimedia
    PDF (256 KB) | HTML

    True steady states are a rare occurrence in living organisms, yet their knowledge is essential for quasi-steady-state approximations, multistability analysis, and other important tools in the investigation of the chemical reaction networks (CRNs) used to describe molecular processes at the cellular level. Here, we present an approach that can provide closed-form steady-state solutions for complex systems resulting from CRNs with binary reactions and mass-action rate laws. We map the nonlinear algebraic problem of finding steady states onto a linear problem in a higher-dimensional space. We show that the linearized version of the steady-state equations obeys the linear conservation laws of the original CRN. We identify two classes of problems for which complete, minimally parameterized solutions may be obtained using only the machinery of linear systems and a judicious choice of the variables used as free parameters. We exemplify our method, providing explicit formulae, on CRNs describing signal initiation in two important types of RTK receptor-ligand systems, VEGF and EGF-ErbB1.

  • Characterizing the Topology of Probabilistic Biological Networks

    Publication Year: 2013, Page(s): 970 - 983
    PDF (4249 KB) | HTML

    Biological interactions are often uncertain events that may or may not take place with some probability. This uncertainty leads to a massive number of alternative interaction topologies for each such network. Existing studies analyze the degree distribution of biological networks by assuming that all the given interactions take place under all circumstances. This strong and often incorrect assumption can lead to misleading results. In this paper, we address this problem and develop a sound mathematical basis for characterizing networks in the presence of uncertain interactions. Using our mathematical representation, we develop a method that can accurately describe the degree distribution of such networks. We then take one more step and extend our method to accurately compute the joint-degree distributions of node pairs connected by edges. The number of possible network topologies grows exponentially with the number of uncertain interactions. However, the mathematical model we develop allows us to compute these degree distributions in time polynomial in the number of interactions. Our method works quickly even for entire protein-protein interaction (PPI) networks. It also helps us find an adequate mathematical model using MLE. We perform a comparative study of node-degree and joint-degree distributions in two types of biological networks: the classical deterministic networks and the more flexible probabilistic networks. Our results confirm that power-law and log-normal models best describe degree distributions for both probabilistic and deterministic networks. Moreover, the inverse correlation of degrees of neighboring nodes shows that, in probabilistic networks, nodes with a large number of interactions prefer to interact with those with a small number of interactions more frequently than expected. We also show that probabilistic networks are more robust for node-degree distribution computation than deterministic ones. Availability: all the data sets used, the software implemented, and the alignments found in this paper are available at http://bioinformatics.cise.ufl.edu/projects/probNet/.

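    As a simplified stand-in for the idea of degree distributions under uncertain interactions, the sketch below computes the exact degree distribution of a single node whose incident edges exist independently with given probabilities (a Poisson-binomial distribution evaluated by dynamic programming in polynomial time); it is not the authors' method.

    def node_degree_distribution(edge_probs):
        """P(degree = k) for a node whose incident edges exist independently
        with the given probabilities (Poisson-binomial, computed by DP)."""
        dist = [1.0]                       # distribution over degree after 0 edges
        for p in edge_probs:
            new = [0.0] * (len(dist) + 1)
            for k, q in enumerate(dist):
                new[k]     += q * (1 - p)  # edge absent
                new[k + 1] += q * p        # edge present
            dist = new
        return dist

    if __name__ == "__main__":
        # A node with three uncertain interactions, e.g. PPI confidence scores.
        probs = [0.9, 0.5, 0.2]
        for k, pk in enumerate(node_degree_distribution(probs)):
            print("P(degree = %d) = %.3f" % (k, pk))
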
  • Decomposition of Flux Distributions into Metabolic Pathways

    Publication Year: 2013, Page(s): 984 - 993
    Multimedia
    PDF (647 KB)

    Genome-scale reconstructions are often used for studying relationships between fundamental components of a metabolic system. In this study, we develop a novel computational method for analyzing predicted flux distributions for metabolic reconstructions. Because chemical reactions may have multiple reactants and products, a directed hypergraph, in which hyperarcs may have multiple tail vertices and head vertices, is a more appropriate representation of the metabolic network than a conventional network. We use this view to represent predicted flux distributions by maximum generalized flows on hypergraphs. We then demonstrate that the generalized hyperflow problem may be transformed into an equivalent network flow problem with side constraints. This transformation allows a flux to be decomposed into chains of reactions. Subsequent analysis of these chains helps to characterize active pathways in a flux distribution. Such characterizations facilitate comparisons of flux distributions for different environmental conditions. The proposed method is applied to compare predicted flux distributions for Salmonella typhimurium to study changes in metabolism that cause enhanced virulence during a space flight. The differences between flux distributions corresponding to normal and enhanced virulence states confirm previous observations concerning infection mechanisms and suggest new pathways for exploration.

  • Designing Template-Free Predictor for Targeting Protein-Ligand Binding Sites with Classifier Ensemble and Spatial Clustering

    Publication Year: 2013, Page(s): 994 - 1008
    Cited by: Papers (1)
    Multimedia
    PDF (2088 KB)

    Accurately identifying protein-ligand binding sites or pockets is of significant importance for both protein function analysis and drug design. Although much progress has been made, challenges remain, especially when the 3D structures of target proteins are not available or no homology templates can be found in the library, in which case template-based methods are hard to apply. In this paper, we report a new ligand-specific, template-free predictor called TargetS for targeting protein-ligand binding sites from primary sequences. TargetS first predicts the binding residues along the sequence with a ligand-specific strategy and then identifies the binding sites from the predicted binding residues through a recursive spatial clustering algorithm. Protein evolutionary information, predicted protein secondary structure, and ligand-specific binding propensities of residues are combined to construct discriminative features; an improved AdaBoost classifier ensemble scheme based on random undersampling is proposed to deal with the serious imbalance between positive (binding) and negative (nonbinding) samples. Experimental results demonstrate that TargetS achieves high performance and outperforms many existing predictors. The TargetS web server and data sets are freely available at http://www.csbio.sjtu.edu.cn/bioinf/TargetS/ for academic use.

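    The imbalance-handling idea, boosting on randomly undersampled negatives and averaging an ensemble, can be sketched generically with scikit-learn as below; the features and class ratio are random placeholders, not the sequence-derived features used by TargetS.

    import numpy as np
    from sklearn.ensemble import AdaBoostClassifier

    rng = np.random.default_rng(0)

    # Placeholder data: 1000 residues, 20 features, roughly 5% positive (binding) class.
    X = rng.normal(size=(1000, 20))
    y = (rng.random(1000) < 0.05).astype(int)
    X[y == 1] += 0.75                      # make positives weakly separable

    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    models = []
    for _ in range(5):                     # ensemble of undersampled AdaBoost members
        sub = np.concatenate([pos, rng.choice(neg, size=len(pos), replace=False)])
        clf = AdaBoostClassifier(n_estimators=50, random_state=0)
        models.append(clf.fit(X[sub], y[sub]))

    # Average the member probabilities to score residues.
    scores = np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)
    print("mean score, positives vs negatives: %.2f vs %.2f"
          % (scores[y == 1].mean(), scores[y == 0].mean()))
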
  • GeneOnEarth: Fitting Genetic PC Plots on the Globe

    Publication Year: 2013, Page(s): 1009 - 1016
    PDF (863 KB) | HTML

    Principal component (PC) plots have become widely used to summarize the genetic variation of individuals in a sample. The similarity between genetic distances in PC plots and geographical distances has been shown to be quite striking. However, in most situations, individual ancestral origins are not precisely known or are heterogeneously distributed; hence, they are hard to link to a geographical area. We have developed GeneOnEarth, a user-friendly web-based tool to help geneticists understand whether a linear isolation-by-distance model may apply to a genetic data set, that is, whether genetic distances among a set of individuals resemble the geographical distances among their origins. Its main goal is to allow users to first apply a by-view Procrustes method to visually assess whether this model holds. To do so, the user can choose the exact geographical area from an online 2D or 3D world map using, respectively, Google Maps or Google Earth, and rotate, flip, and resize the images. GeneOnEarth can also compute the optimal rotation angle using Procrustes analysis and assess the statistical evidence of similarity when a different rotation angle has been chosen by the user. An online version of GeneOnEarth is available for testing and use at http://bios.ugr.es/GeneOnEarth.

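    The core comparison, fitting PC coordinates to geographic coordinates and measuring the residual disparity, can be illustrated with SciPy's Procrustes routine; the coordinates below are invented and this is not the GeneOnEarth implementation.

    import numpy as np
    from scipy.spatial import procrustes

    # Invented example: first two genetic PCs and (longitude, latitude) of origins
    # for five individuals.
    pcs = np.array([[ 0.10,  0.02], [ 0.08, -0.05], [-0.02,  0.01],
                    [-0.07,  0.04], [-0.09, -0.02]])
    geo = np.array([[ 10.2, 45.1], [  8.9, 41.0], [  2.3, 48.8],
                    [ -3.7, 40.4], [ -9.1, 38.7]])

    m1, m2, disparity = procrustes(geo, pcs)   # optimal translation/scaling/rotation
    print("Procrustes disparity (lower = better geographic fit): %.4f" % disparity)
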
  • Identification of DNA-Binding and Protein-Binding Proteins Using Enhanced Graph Wavelet Features

    Publication Year: 2013, Page(s): 1017 - 1031
    PDF (4445 KB) | HTML

    Interactions between biomolecules play an essential role in various biological processes. For predicting DNA-binding or protein-binding proteins, many machine-learning-based techniques have used various types of features to represent the interface of the complexes, but they deal only with the properties of a single atom in the interface and do not directly take into account information from neighboring atoms. This paper proposes a new feature representation method for biomolecular interfaces based on the theory of graph wavelets. The enhanced graph wavelet features (EGWF) provide an effective way to characterize interface features by adding physicochemical features and exploiting a graph wavelet formulation. In particular, the graph wavelet condenses the information around the center atom and thus enhances the discrimination of features of biomolecule-binding proteins in the feature space. Experimental results show that EGWF performs effectively for predicting DNA-binding and protein-binding proteins in terms of the Matthews correlation coefficient (MCC) and the area under the receiver operating characteristic curve (AUC).

  • Pareto Optimality in Organelle Energy Metabolism Analysis

    Publication Year: 2013, Page(s): 1032 - 1044
    Multimedia
    PDF (3102 KB) | HTML

    In lower and higher eukaryotes, energy is collected or transformed in compartments, the organelles. The rich variety of sizes, characteristics, and densities of the organelles makes it difficult to build a general picture. In this paper, we make use of Pareto-front analysis to investigate the optimization of energy metabolism in mitochondria and chloroplasts. Using the Pareto optimality principle, we compare models of organelle metabolism on the basis of single- and multiobjective optimization, approximation techniques (Bayesian automatic relevance determination), robustness, and pathway sensitivity analysis. Finally, we report the first analysis of the metabolic model for the hydrogenosome of Trichomonas vaginalis, which is found in several protozoan parasites. Our analysis shows the importance of Pareto optimality for such comparisons and for insights into the evolution of metabolism from cytoplasmic to organelle-bound, involving a model-order reduction. We report that Pareto fronts represent an asymptotic analysis useful for describing the metabolism of an organism aimed at maximizing two or more metabolite concentrations concurrently.

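    A minimal sketch of extracting a Pareto front (the non-dominated points when two or more objectives are maximized concurrently) is given below; the objective values are synthetic and independent of the authors' organelle models.

    def pareto_front(points):
        """Return points not dominated by any other when maximizing all objectives."""
        front = []
        for p in points:
            dominated = any(all(q[i] >= p[i] for i in range(len(p))) and q != p
                            for q in points)
            if not dominated:
                front.append(p)
        return front

    if __name__ == "__main__":
        # Synthetic (objective 1, objective 2) pairs, e.g. two metabolite yields.
        pts = [(3.0, 1.0), (2.0, 2.5), (1.0, 3.0), (2.5, 2.0), (1.5, 1.5)]
        print(pareto_front(pts))   # the last point is dominated and dropped
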
  • Protein Function Prediction Using Multilabel Ensemble Classification

    Publication Year: 2013, Page(s): 1045 - 1057
    Cited by: Papers (2)
    Multimedia
    PDF (1493 KB) | HTML

    High-throughput experimental techniques produce several kinds of heterogeneous proteomic and genomic data sets. To computationally annotate proteins, it is necessary and promising to integrate these heterogeneous data sources. Some methods transform these data sources into different kernels or feature representations, and the kernels are then linearly (or nonlinearly) combined into a composite kernel, which is used to develop a predictive model for inferring the function of proteins. A protein can have multiple roles and functions (or labels); therefore, multilabel learning methods are also adapted for protein function prediction. We develop a transductive multilabel classifier (TMC) to predict multiple functions of proteins using unlabeled proteins, and we propose a method called the transductive multilabel ensemble classifier (TMEC) for integrating the different data sources using an ensemble approach. TMEC trains a graph-based multilabel classifier on each individual data source and then combines the predictions of the individual classifiers. We use a directed birelational graph to capture the relationships between pairs of proteins, between pairs of functions, and between proteins and functions. We evaluate the effectiveness of TMC and TMEC at predicting the functions of proteins on three benchmarks, and show that our approaches perform better than recently proposed protein function prediction methods on composite and multiple kernels. The code, the data sets used in this paper, and supplemental material are available at https://sites.google.com/site/guoxian85/tmec.

  • Temporal Logics for Phylogenetic Analysis via Model Checking

    Publication Year: 2013, Page(s): 1058 - 1070
    PDF (944 KB) | HTML

    The need for general-purpose algorithms for studying biological properties in phylogenetics motivates research into formal verification frameworks in which researchers can focus their efforts exclusively on evolution trees and property specifications. To this end, model checking, a mature automated verification technique originating in computer science, is applied to phylogenetic analysis. Our approach is based on three cornerstones: a logical modeling of evolution with transition systems; the specification of both phylogenetic properties and trees using flexible temporal logic formulas; and the verification of the latter by means of automated computer tools. The most conspicuous result is the inception of a formal framework that allows symbolic manipulation of biological data (based on the codification of the taxa). Additionally, different logical models of evolution can be considered, complex properties can be specified in terms of the logical composition of others, and the refinement of unfulfilled properties as well as the discovery of new properties can be undertaken by exploiting the verification results. Experimental results using a symbolic model verifier support the feasibility of the approach.

  • Application of Dempster-Shafer Method in Family-Based Association Studies

    Publication Year: 2013, Page(s): 1071 - 1075
    PDF (1156 KB) | HTML

    In experiments designed for family-based association studies, methods such as the transmission disequilibrium test require a large number of trios to identify single-nucleotide polymorphisms associated with a disease. However, the unavailability of a large number of trios is the Achilles' heel of studies of many complex diseases, especially late-onset diseases. In this paper, we propose a novel approach to this problem by means of the Dempster-Shafer method. Simulation studies show that the Dempster-Shafer method has promising overall performance in assigning single-nucleotide polymorphisms to the correct association class, achieving 90 percent accuracy even with 60 trios.

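    To make the Dempster-Shafer machinery concrete, here is a generic sketch of Dempster's rule of combination for two mass functions over association classes; the masses are invented and this is not the authors' test procedure.

    from itertools import product

    def combine(m1, m2):
        """Dempster's rule: combine two mass functions keyed by frozenset focal elements."""
        combined, conflict = {}, 0.0
        for (a, x), (b, y) in product(m1.items(), m2.items()):
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + x * y
            else:
                conflict += x * y
        return {k: v / (1.0 - conflict) for k, v in combined.items()}

    if __name__ == "__main__":
        # Invented evidence about whether a SNP is 'associated' (A) or 'not' (N).
        m1 = {frozenset({"A"}): 0.6, frozenset({"A", "N"}): 0.4}
        m2 = {frozenset({"A"}): 0.3, frozenset({"N"}): 0.5, frozenset({"A", "N"}): 0.2}
        print(combine(m1, m2))
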
  • Hamiltonian Walks of Phylogenetic Treespaces

    Publication Year: 2013, Page(s): 1076 - 1079
    PDF (192 KB) | HTML

    We answer Bryant's combinatorial challenge on minimal walks of phylogenetic treespace under the nearest-neighbor interchange (NNI) metric. We show that the shortest path through the NNI-treespace of n-leaf trees is Hamiltonian for all n. That is, there is a minimal path that visits all binary trees exactly once, under NNI moves.

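    For a sense of why a walk visiting each tree exactly once is the best one can hope for, the sketch below evaluates the standard count of unrooted binary trees on n labeled leaves, (2n-5)!!, i.e., the number of vertices such a walk must visit; the formula is textbook material, not taken from the paper.

    def num_unrooted_binary_trees(n):
        """(2n-5)!! = number of unrooted binary phylogenetic trees on n labeled leaves."""
        count = 1
        for k in range(3, 2 * n - 4, 2):   # 3 * 5 * ... * (2n-5)
            count *= k
        return count

    if __name__ == "__main__":
        for n in (4, 5, 10, 20):
            print(n, num_unrooted_binary_trees(n))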

Aims & Scope

This bimonthly journal publishes archival research results related to the algorithmic, mathematical, statistical, and computational methods that are central to bioinformatics and computational biology.


Meet Our Editors

Editor-in-Chief
Ying Xu
University of Georgia
xyn@bmb.uga.edu

Associate Editor-in-Chief
Dong Xu
University of Missouri
xudong@missouri.edu