• Editorial from the Editor-in-Chief

• Guest Editors’ Introduction: Selected Papers from ACM-BCB 2013

• MRFy: Remote Homology Detection for Beta-Structural Proteins Using Markov Random Fields and Stochastic Search

We introduce MRFy, a tool for protein remote homology detection that captures beta-strand dependencies in the Markov random field. Over a set of 11 SCOP beta-structural superfamilies, MRFy shows a 14 percent improvement in mean Area Under the Curve for the motif recognition problem as compared to HMMER, 25 percent improvement as compared to RAPTOR, 14 percent improvement as compared to HHPred, and...

• RLIMS-P 2.0: A Generalizable Rule-Based Information Extraction System for Literature Mining of Protein Phosphorylation Information

We introduce RLIMS-P version 2.0, an enhanced rule-based information extraction (IE) system for mining kinase, substrate, and phosphorylation site information from scientific literature. Consisting of natural language processing and IE modules, the system has integrated several new features, including the capability of processing full-text articles and generalizability towards different post-trans...

• Phenotype-Dependent Coexpression Gene Clusters: Application to Normal and Premature Ageing

Hutchinson-Gilford progeria syndrome (HGPS) is a rare genetic disease with symptoms of aging at a very early age. Its molecular basis is not entirely clear, although profound gene expression changes have been reported, and there are some known and other presumed overlaps with the normal aging process. Identification of genes with aging- or HGPS-associated expression changes is thus an important problem....

• Global Network Alignment in the Context of Aging

Analogous to sequence alignment, network alignment (NA) can be used to transfer biological knowledge across species between conserved network regions. NA faces two algorithmic challenges: 1) Which cost function to use to capture “similarities” between nodes in different networks? 2) Which alignment strategy to use to rapidly identify “high-scoring” alignments from all p...

• Reachability Analysis in Probabilistic Biological Networks

Extra-cellular molecules trigger a response inside the cell by initiating a signal at special membrane receptors (i.e., sources), which is then transmitted to reporters (i.e., targets) through various chains of interactions among proteins. Understanding whether such a signal can reach from membrane receptors to reporters is essential in studying the cell response to extra-cellular events. This pro...

• GLProbs: Aligning Multiple Sequences Adaptively

This paper introduces a simple and effective approach to improve the accuracy of multiple sequence alignment. We use a natural measure to estimate the similarity of the input sequences, and based on this measure, we align the input sequences differently. For example, for inputs with high similarity, we consider the whole sequences and align them globally, while for those with moderately low simila...

• Prediction and Informative Risk Factor Selection of Bone Diseases

With the boom of the healthcare industry and the overwhelming amount of electronic health records (EHRs) shared by healthcare institutions and practitioners, we take advantage of EHR data to develop an effective disease risk management model that not only models the progression of the disease, but also predicts the risk of the disease for early disease control or prevention. Existing models for ans...

• Capturing Uncertainty by Modeling Local Transposon Insertion Frequencies Improves Discrimination of Essential Genes

Transposon mutagenesis experiments enable the identification of essential genes in bacteria. Deep-sequencing of mutant libraries provides a large amount of high-resolution data on essentiality. Statistical methods developed to analyze this data have traditionally assumed that the probability of observing a transposon insertion is the same across the genome. This assumption, however, is inconsisten...

• A Deep Learning Network Approach to ab initio Protein Secondary Structure Prediction

Ab initio protein secondary structure (SS) predictions are utilized to generate tertiary structure predictions, which are increasingly demanded due to the rapid discovery of proteins. Although recent developments have slightly exceeded previous methods of SS prediction, accuracy has stagnated around 80 percent and many wonder if prediction cannot be advanced beyond this ceiling. Disciplines that h...

• A Hierarchical Clustering Method of Selecting Kernel SNP to Unify Informative SNP and Tag SNP

Various strategies can be used to select representative single nucleotide polymorphisms (SNPs) from a large number of SNPs, such as tag SNP for haplotype coverage and informative SNP for haplotype reconstruction, respectively. Representative SNPs are not only instrumental in reducing the cost of genotyping, but also serve an important function in narrowing the combinatorial space in epistasis anal...

• A Maximum A Posteriori Probability and Time-Varying Approach for Inferring Gene Regulatory Networks from Time Course Gene Microarray Data

Unlike most conventional techniques, which assume a static model, this paper aims to estimate the time-varying model parameters and identify significant genes involved at different timepoints from time course gene microarray data. We first formulate the parameter identification problem as a new maximum a posteriori probability estimation problem so that prior information can be incorporated as reg...

• An Algorithm for Motif Discovery with Iteration on Lengths of Motifs

Analysis of DNA sequence motifs is becoming increasingly important in the study of gene regulation, and the identification of motifs in DNA sequences is a complex problem in computational biology. Motif discovery has attracted the attention of more and more researchers, and a variety of algorithms have been proposed. Most existing motif discovery algorithms fix the motif's length as one of the inpu...

• Discovering Binding Cores in Protein-DNA Binding Using Association Rule Mining with Statistical Measures

Understanding binding cores is of fundamental importance in deciphering Protein-DNA (TF-TFBS) binding and for the deep understanding of gene regulation. Traditionally, binding cores are identified in resolved high-resolution 3D structures. However, it is expensive, labor-intensive and time-consuming to obtain these structures. Hence, it is promising to discover binding cores computationally on a l...

• Gene Tree Diameter for Deep Coalescence

The deep coalescence cost accounts for discord caused by deep coalescence between a gene tree and a species tree. It is a major concern that the diameter of a gene tree (the tree's maximum deep coalescence cost across all species trees) depends on its topology, which can largely obfuscate phylogenetic studies. While this bias can be compensated by normalizing the deep coalescence cost using diamet...

• Heterogeneous Cloud Framework for Big Data Genome Sequencing

The next generation genome sequencing problem with short (long) reads is an emerging field in numerous scientific and big data research domains. However, data sizes and ease of access for scientific researchers are growing and most current methodologies rely on one acceleration approach and so cannot meet the requirements imposed by explosive data scales and complexities. In this paper, we propose...

• Identification of Protein Complexes Using Weighted PageRank-Nibble Algorithm and Core-Attachment Structure

Protein complexes play a significant role in understanding the underlying mechanism of most cellular functions. Recently, many researchers have explored computational methods to identify protein complexes from protein-protein interaction (PPI) networks. One group of researchers focuses on detecting local dense subgraphs which correspond to protein complexes by considering local neighbors. The drawba...

• Identifying Affinity Classes of Inorganic Materials Binding Sequences via a Graph-Based Model

Rapid advances in bionanotechnology have recently generated growing interest in identifying peptides that bind to inorganic materials and classifying them based on their inorganic material affinities. However, there are some distinct characteristics of inorganic materials binding sequence data that limit the performance of many widely-used classification methods when applied to this problem. In th...

• Parallel Implementation of MAFFT on CUDA-Enabled Graphics Hardware

Multiple sequence alignment (MSA) constitutes an extremely powerful tool for many biological applications including phylogenetic tree estimation, secondary structure prediction, and critical residue identification. However, aligning large biological sequences with popular tools such as MAFFT requires long runtimes on sequential architectures. Due to the ever increasing sizes of sequence databases,...

• Predicting Protein Function Using Multiple Kernels

High-throughput experimental techniques provide a wide variety of heterogeneous proteomic data sources. To exploit the information spread across multiple sources for protein function prediction, these data sources are transformed into kernels and then integrated into a composite kernel. Several methods first optimize the weights on these kernels to produce a composite kernel, and then train a clas...

• Tractable Cases of $(*,2)$-Bounded Parsimony Haplotyping

Parsimony haplotyping is the problem of finding a set of haplotypes of minimum cardinality that explains a given set of genotypes, where a genotype is explained by two haplotypes if it can be obtained as a combination of the two. This problem is NP-complete in the general case, but polynomially solvable for (k, l)-bounded instances for certain k and l. Here, k denotes the maximum number of ambiguo...

• 2014 Reviewers List*

