• ### EIC Editorial

Publication Year: 2011, Page(s): 1
• ### A Fast Algorithm for Computing Geodesic Distances in Tree Space

Publication Year: 2011, Page(s):2 - 13
Cited by:  Papers (37)

Comparing and computing distances between phylogenetic trees are important biological problems, especially for models where edge lengths play an important role. The geodesic distance measure between two phylogenetic trees with edge lengths is the length of the shortest path between them in the continuous tree space introduced by Billera, Holmes, and Vogtmann. This tree space provides a powerful to...

• ### A General Framework for Analyzing Data from Two Short Time-Series Microarray Experiments

Publication Year: 2011, Page(s):14 - 26
Cited by:  Papers (3)

We propose a general theoretical framework for analyzing differentially expressed genes and behavior patterns from two homogeneous short time-course data sets. The framework generalizes the recently proposed Hilbert-Schmidt Independence Criterion (HSIC)-based framework, adapting it to the time-series scenario by utilizing tensor analysis for data transformation. The proposed framework is effective in yie...

• ### Efficient Formulations for Exact Stochastic Simulation of Chemical Systems

Publication Year: 2011, Page(s):27 - 35
Cited by:  Papers (20)

One can generate trajectories to simulate a system of chemical reactions using either Gillespie's direct method or Gibson and Bruck's next reaction method. Because one usually needs many trajectories to understand the dynamics of a system, performance is important. In this paper, we present new formulations of these methods that improve the computational complexity of the algorithms. We present op...

• ### Estimating Haplotype Frequencies by Combining Data from Large DNA Pools with Database Information

Publication Year: 2011, Page(s):36 - 44
Cited by:  Papers (5)

We assume that allele frequency data have been extracted from several large DNA pools, each containing genetic material of up to hundreds of sampled individuals. Our goal is to estimate the haplotype frequencies among the sampled individuals by combining the pooled allele frequency data with prior knowledge about the set of possible haplotypes. Such prior information can be obtained, for example, ...

• ### $F^2$Dock: Fast Fourier Protein-Protein Docking

Publication Year: 2011, Page(s):45 - 58
Cited by:  Papers (23)

The functions of proteins are often realized through their mutual interactions. Determining a relative transformation for a pair of proteins and their conformations which form a stable complex, reproducible in nature, is known as docking. It is an important step in drug design, structure determination, and understanding function and structure relationships. In this paper, we extend our nonuniform ...

• ### Fast Surface-Based Travel Depth Estimation Algorithm for Macromolecule Surface Shape Description

Publication Year: 2011, Page(s):59 - 68
Cited by:  Papers (4)

Travel Depth, introduced by Coleman and Sharp in 2006, is a physical interpretation of molecular depth, a term frequently used to describe the shape of a molecular active site or binding site. Travel Depth can be seen as the physical distance a solvent molecule would have to travel from a point of the surface, i.e., the Solvent-Excluded Surface (SES), to its convex hull. Existing algorithms provid...

• ### Finding Significant Matches of Position Weight Matrices in Linear Time

Publication Year: 2011, Page(s):69 - 79
Cited by:  Papers (4)

Position weight matrices are an important method for modeling signals or motifs in biological sequences, both in DNA and protein contexts. In this paper, we present fast algorithms for the problem of finding significant matches of such matrices. Our algorithms are of the online type, and they generalize classical multipattern matching, filtering, and superalphabet techniques of combinatorial strin...

• ### Fuzzy ARTMAP Prediction of Biological Activities for Potential HIV-1 Protease Inhibitors Using a Small Molecular Data Set

Publication Year: 2011, Page(s):80 - 93
Cited by:  Papers (3)

Obtaining satisfactory results with neural networks depends on the availability of large data samples. The use of small training sets generally reduces performance. Most classical Quantitative Structure-Activity Relationship (QSAR) studies for a specific enzyme system have been performed on small data sets. We focus on the neuro-fuzzy prediction of biological activities of HIV-1 protease inhibitor...

• ### Genetic Networks and Soft Computing

Publication Year: 2011, Page(s):94 - 107
Cited by:  Papers (24)

The analysis of gene regulatory networks provides enormous information on various fundamental cellular processes involving growth, development, hormone secretion, and cellular communication. Their extraction from available gene expression profiles is a challenging problem. Such reverse engineering of genetic networks offers insight into cellular activity toward prediction of adverse effects of new...

• ### Identification and Modeling of Genes with Diurnal Oscillations from Microarray Time Series Data

Publication Year: 2011, Page(s):108 - 121
Cited by:  Papers (5)

Behavior of living organisms is strongly modulated by the day and night cycle, giving rise to a cyclic pattern of activities. Such a pattern helps the organisms to coordinate their activities and maintain a balance between what could be performed during the “day” and what could be relegated to the “night.” This cyclic pattern, called the “Circadian Rhythm,” is a biological phen...

• ### Improving the Computational Efficiency of Recursive Cluster Elimination for Gene Selection

Publication Year: 2011, Page(s):122 - 129
Cited by:  Papers (13)

Gene expression data usually come with a large number of genes and a relatively small number of samples, which brings many new challenges. Selecting the informative genes becomes the main issue in microarray data analysis. Recursive cluster elimination based on support vector machine (SVM-RCE) has shown better classification accuracy on some microarray data sets than recursiv...

• ### Influence of Prior Knowledge in Constraint-Based Learning of Gene Regulatory Networks

Publication Year: 2011, Page(s):130 - 142
Cited by:  Papers (9)

Constraint-based structure learning algorithms generally perform well on sparse graphs. Although sparsity is not uncommon, there are some domains where the underlying graph can have some dense regions; one of these domains is gene regulatory networks, which is the main motivation to undertake the study described in this paper. We propose a new constraint-based algorithm that can both increase the ...

• ### Information-Theoretic Model of Evolution over Protein Communication Channel

Publication Year: 2011, Page(s):143 - 151
Cited by:  Papers (3)

In this paper, we propose a communication model of evolution and investigate its information-theoretic bounds. The process of evolution is modeled as the retransmission of information over a protein communication channel, where the transmitted message is the organism's proteome encoded in the DNA. We compute the capacity and the rate distortion functions of the protein communication system for the...

• Regular Papers
• ### Learning Genetic Regulatory Network Connectivity from Time Series Data

Publication Year: 2011, Page(s):152 - 165
Cited by:  Papers (8)

Recent experimental advances facilitate the collection of time series data that indicate which genes in a cell are expressed. This information can be used to understand the genetic regulatory network that generates the data. Typically, Bayesian analysis approaches are applied which neglect the time series nature of the experimental data, have difficulty in determining the direction of causality, a...

• ### Model Reduction Using Piecewise-Linear Approximations Preserves Dynamic Properties of the Carbon Starvation Response in Escherichia coli

Publication Year: 2011, Page(s):166 - 181
Cited by:  Papers (12)

The adaptation of the bacterium Escherichia coli to carbon starvation is controlled by a large network of biochemical reactions involving genes, mRNAs, proteins, and signalling molecules. The dynamics of these networks is difficult to analyze, notably due to a lack of quantitative information on parameter values. To overcome these limitations, model reduction approaches based on quasi-steady-state...

• ### New Methods for Inference of Local Tree Topologies with Recombinant SNP Sequences in Populations

Publication Year: 2011, Page(s):182 - 193
Cited by:  Papers (1)

Large amounts of population-scale genetic variation data are being collected in populations. One potentially important biological problem is to infer the population genealogical history from these genetic variation data. Partly due to recombination, the genealogical history of a set of DNA sequences in a population usually cannot be represented by a single tree. Instead, genealogy is better represented...

• ### Pairwise Statistical Significance of Local Sequence Alignment Using Sequence-Specific and Position-Specific Substitution Matrices

Publication Year: 2011, Page(s):194 - 205
Cited by:  Papers (12)

Pairwise sequence alignment is a central problem in bioinformatics, which forms the basis of various other applications. Two related sequences are expected to have a high alignment score, but relatedness is usually judged by statistical significance rather than by alignment score. Recently, it was shown that pairwise statistical significance gives promising results as an alternative to database st...

• ### Predicting Metabolic Fluxes Using Gene Expression Differences As Constraints

Publication Year: 2011, Page(s):206 - 216
Cited by:  Papers (25)

A standard approach to estimate intracellular fluxes on a genome-wide scale is flux-balance analysis (FBA), which optimizes an objective function subject to constraints on (relations between) fluxes. The performance of FBA models heavily depends on the relevance of the formulated objective function and the completeness of the defined constraints. Previous studies indicated that FBA predictions can...

• ### Probabilistic Analysis of Probe Reliability in Differential Gene Expression Studies with Short Oligonucleotide Arrays

Publication Year: 2011, Page(s):217 - 225
Cited by:  Papers (17)

Probe defects are a major source of noise in gene expression studies. While existing approaches detect noisy probes based on external information such as genomic alignments, we introduce and validate a targeted probabilistic method for analyzing probe reliability directly from expression data and independently of the noise source. This provides insights into the various sources of probe-level nois...

• ### Topology Improves Phylogenetic Motif Functional Site Predictions

Publication Year: 2011, Page(s):226 - 233
Cited by:  Papers (3)

Prediction of protein functional sites from sequence-derived data remains an open bioinformatics problem. We have developed a phylogenetic motif (PM) functional site prediction approach that identifies functional sites from alignment fragments that parallel the evolutionary patterns of the family. In our approach, PMs are identified by comparing tree topologies of each alignment fragment to that o...

• ### Twin Removal in Genetic Algorithms for Protein Structure Prediction Using Low-Resolution Model

Publication Year: 2011, Page(s):234 - 245
Cited by:  Papers (22)

This paper presents the impact of twins and the measures for their removal from the population of a genetic algorithm (GA) when applied to effective conformational searching. It is conclusively shown that a twin removal strategy for a GA provides considerably enhanced performance when investigating solutions to complex ab initio protein structure prediction (PSP) problems in a low-resolution model. Wi...

• ### A Weighted Principal Component Analysis and Its Application to Gene Expression Data

Publication Year: 2011, Page(s):246 - 252
Cited by:  Papers (13)  |  Patents (1)

In this work, we introduce new developments in Principal Component Analysis (PCA) in the first part and a new method to select variables (genes in our application) in the second part. Our focus is on problems where the values taken by each variable do not all have the same importance and where the data may be contaminated with noise and contain outliers, as is the case with microarray data. The us...

