• ### Bi-level error correction for PacBio long reads

The latest sequencing technologies, such as the Pacific Biosciences (PacBio) and Oxford Nanopore machines, can generate long reads thousands of bases in length, much longer than the reads of a few hundred bases generated by Illumina machines. However, these long reads are prone to much higher error rates, for example 15%, making downstream analysis and applications very dif...

• ### Detecting Population-differentiation Copy Number Variants in Human Population Tree by Sparse Group Selection

Copy-number variants (CNVs) account for a substantial proportion of human genetic variation. Understanding CNV diversity across populations is a computational challenge because CNV patterns are often present in several related populations and occur only in a subgroup of individuals within each population. This paper introduces a tree-guided sparse group selection algorithm (treeSGS) ...

• ### A sparse regression method for group-wise feature selection with false discovery rate control

The method of Sorted L-One Penalized Estimation, or SLOPE, is a sparse regression method recently introduced by Bogdan et al. [1]. It can be used to identify significant predictor variables in a linear model that may have more unknown parameters than observations. When the correlations between predictor variables are small, the SLOPE method is shown to successfully control the false discovery rat...

• ### Bayesian Network Construction and Genotype-Phenotype Inference Using GWAS Statistics

Genome-wide association studies (GWASs) have received increasing attention as a way to understand how genetic variation affects different human traits. In this paper, we study whether and to what extent GWAS statistics can be exploited to infer private information about a human individual. We first provide a method to construct a three-layered Bayesian network explicitly revealing the con...

• ### Estrogenic active stilbene derivatives as anti-cancer agents: A DFT and QSAR study

Exploring different quantum chemical quantities for lead compounds is an ongoing approach to identifying the structural features crucial to their biological activities. Herein, quantum chemical calculations are reported for selected estrogenic stilbene derivatives using density functional theory (DFT) with the B3LYP functional and the 6-311++G** basis set. In addition, specific activity-related geo...

• ### A Feature Sampling Strategy for Analysis of High Dimensional Genomic Data

With the development of high-throughput technology, it has become feasible and common to profile tens of thousands of gene activities simultaneously. These genomic data typically have a sample size of hundreds or fewer, which is much smaller than the feature size (number of genes). In addition, the genes, in particular those from the same pathway, are often highly correlated. These issues impose gre...

• ### An RJMCMC-Based Method for Tracking and Resolving Collisions of Drosophila Larvae

Drosophila melanogaster is an important model organism for ongoing research in neuro- and behavioral biology. Locomotion analysis in particular has become an integral part of such studies, and elaborate automated tracking systems have therefore been proposed in the past. However, most of these approaches share the inability to precisely segment the contours of colliding animals leading to the absence ...

• ### The Robust Classification Model Based on Combinatorial Features

Analyzing disease data from the perspective of combinatorial features may better characterize the disease phenotype. In this study, a novel method is proposed to construct feature combinations and a classification model (CFC-CM) by mining key feature relationships. CFC-CM iteratively tests for differences in the feature relationship between different groups. To do this, it uses a modified ...

• ### Theory and A Heuristic for the Minimum Path Flow Decomposition Problem

Motivated by multiple genome assembly problems and other applications, we study the following minimum path flow decomposition problem: given a directed acyclic graph $G=(V,E)$ with source $s$ and sink $t$ and a flow $f$, compute a set of $s$-$t$ paths ...

• ### Constructing Pathway-based Priors Within a Gaussian Mixture Model for Bayesian Regression and Classification

Gene-expression-based classification and regression are major concerns in translational genomics. If the feature-label distribution is known, then an optimal classifier can be derived. If the regressor-target distribution is known, then an optimal regression function can be derived. In practice, neither is known, data must be employed, and, for small samples, prior knowledge concerning the feature...

• ### Protein-protein interaction identification using a similarity-constrained graph model

Protein-protein interaction (PPI) identification is an important task in text mining. Most PPI detection systems make predictions solely based on evidence within a single sentence and often suffer from the heavy burden of manual annotation. This paper approaches the PPI detection task from a different paradigm by investigating the context of protein pairs collected from a large corpus and their relati...

• ### Computing Minimum Reaction Modifications in a Boolean Metabolic Network

In metabolic network modification, we add new enzymes and/or knock out genes to maximize biomass production with minimal side effects. Although this problem has been studied in various settings via mathematical models including flux balance analysis, elementary modes, and Boolean models, some important problem settings remain to be studied. In this paper, we consider Boolean Rea...

• ### Early Diagnosis of Alzheimer's Disease Based on Resting-State Brain Networks and Deep Learning

Computerized healthcare has undergone rapid development thanks to advances in medical imaging and machine learning technologies. In particular, recent progress in deep learning opens a new era for multimedia-based clinical decision support. In this paper, we use deep learning with brain network and clinically relevant text information to make an early diagnosis of Alzheimer's Disease (AD). The ...

• ### Meta-path methods for prioritizing candidate disease miRNAs

MicroRNAs (miRNAs) play critical roles in regulating gene expression at post-transcriptional levels. Predicting potential miRNA-disease association is beneficial not only to explore the pathogenesis of diseases, but also to understand biological processes. In this work, we propose two methods that can effectively predict potential miRNA-disease associations using our reconstructed miRNA and disease ...

• ### Efficient Detection of Communities in Biological Bipartite Networks

Methods to efficiently uncover community structures are required in a number of biological applications where observing tightly-knit groups of vertices (“communities”) can offer insights into the structural and functional building blocks. Classical applications of community detection have largely focused on unipartite networks. However, due to increased availability of biological dat...

• ### Elucidating Genome-Wide Protein-RNA Interactions using Differential Evolution

RNA-binding proteins (RBPs) play an important role in the post-transcriptional control of RNAs, such as splicing, polyadenylation, mRNA stabilization, mRNA localization, and translation. Thanks to the recent breakthrough, non-negative matrix factorization (NMF) has been developed to combine multiple data sources to discover non-overlapping and class-specific RNA binding patterns. However, several ...

• ### DNRLMF-MDA: Predicting microRNA-disease associations based on similarities of microRNAs and diseases

Discovering miRNA-disease associations is beneficial to understanding disease mechanisms, developing drugs, and treating complex diseases. Discovering miRNA-disease associations via biological experiments is a time-consuming and expensive process. Alternatively, computational models could provide a low-cost and efficient way of predicting miRNA-disease associations. In ...

• ### Statistical Framework for Uncertainty Quantification in Computational Molecular Modeling

As computational modeling, simulation, and predictions are becoming integral parts of biomedical pipelines, it behooves us to emphasize the reliability of the computational protocol. For any reported quantity of interest (QOI), one must also compute and report a measure of the uncertainty or error associated with the QOI. This is especially important in molecular modeling, since in most practical ...

• ### Moment-Based Parameter Estimation for Stochastic Reaction Networks in Equilibrium

Calibrating parameters is a crucial problem within quantitative modeling approaches to reaction networks. Existing methods for stochastic models rely either on statistical sampling or can only be applied to small systems. Here we present an inference procedure for stochastic models in equilibrium that is based on a moment matching scheme with optimal weighting and that can be used with high-throug...

• ### Human pathway-based disease network

Constructing a disease-disease similarity network is important for elucidating the associations between the origins and molecular mechanisms of diseases, and for research into disease function and medicine. In this paper, we use a high-quality protein interaction network and a collection of pathway databases to construct a Human Pathway-based Disease Network (HPDN) to explore the relationship betw...

• ### RF-NR: Random forest based approach for improved classification of Nuclear Receptors

The Nuclear Receptor (NR) superfamily plays an important role in key biological, developmental and physiological processes. Developing a method for the classification of NR proteins is an important step towards understanding the structure and functions of newly discovered NR proteins. Recent studies on NR classification are either unable to achieve optimum accuracy or are not designed for a...

• ### Sequence-based prediction of putative transcription factor binding sites in DNA sequences of any length

A transcription factor (TF) is a protein that regulates gene expression by binding to specific DNA sequences. Despite the recent advances in experimental techniques for identifying transcription factor binding sites (TFBS) in DNA sequences, a large number of TFBS remain to be unveiled in many species. Several computational methods developed for predicting TFBS in DNA are tissue- or species-specific m...

• ### ASSA-PBN: A Toolbox for Probabilistic Boolean Networks

As a well-established computational framework, probabilistic Boolean networks (PBNs) are widely used for modelling, simulation, and analysis of biological systems. Analysing the steady-state dynamics of PBNs is of crucial importance for exploring the characteristics of biological systems. However, the analysis of large PBNs, which often arise in systems biology, is prone to the infamous state-space ...

• ### Quasi-Newton Stochastic Optimization Algorithm for Parameter Estimation of a Stochastic Model of the Budding Yeast Cell Cycle

Parameter estimation in discrete or continuous deterministic cell cycle models is challenging for several reasons, including the nature of what can be observed, and the accuracy and quantity of those observations. The challenge is even greater for stochastic models, where the number of simulations and amount of empirical data must be even larger to obtain statistically valid parameter estimates. T...

• ### A Bi-objective RNN Model to Reconstruct Gene Regulatory Network: A Modified Multi-objective Simulated Annealing Approach

A Gene Regulatory Network (GRN) is a virtual network in the cellular context of an organism, comprising a set of genes and their internal relationships, through which the genes regulate each other's protein production rate (gene expression level) via coded proteins. Computational reconstruction of GRNs from gene expression data is a widely applied research area. Recurrent Neural Network (RNN) is a useful modeling schem...

