• ### Editorial from the Editor-in-Chief

Publication Year: 2014, Page(s):1 - 4
• ### Guest Editorial for the International Conference on Genome Informatics (GIW 2013)

Publication Year: 2014, Page(s):5 - 6
Cited by:  Papers (1)
• ### Coupling Graphs, Efficient Algorithms and B-Cell Epitope Prediction

Publication Year: 2014, Page(s):7 - 16
Cited by:  Papers (1)

Coupling graphs are newly introduced in this paper to meet many application needs particularly in the field of bioinformatics. A coupling graph is a two-layer graph complex, in which each node from one layer of the graph complex has at least one connection with the nodes in the other layer, and vice versa. The coupling graph model is sufficiently powerful to capture strong and inherent association...

• ### Quantifying Significance of MHC II Residues

Publication Year: 2014, Page(s):17 - 25

The major histocompatibility complex (MHC), a cell-surface protein mediating immune recognition, plays important roles in the immune response system of all higher vertebrates. MHC molecules are highly polymorphic and they are grouped into serotypes according to the specificity of the response. It is a common belief that a protein sequence determines its three dimensional structure and function. He...

• ### Residue-Specific Side-Chain Polymorphisms via Particle Belief Propagation

Publication Year: 2014, Page(s):33 - 41
Cited by:  Papers (2)

Protein side chains populate diverse conformational ensembles in crystals. Despite much evidence that there is widespread conformational polymorphism in protein side chains, most of the X-ray crystallography data are modeled by single conformations in the Protein Data Bank. The ability to extract or to predict these conformational polymorphisms is of crucial importance, as it facilitates deeper un...

• ### A New Unsupervised Binning Approach for Metagenomic Sequences Based on N-grams and Automatic Feature Weighting

Publication Year: 2014, Page(s):42 - 54
Cited by:  Papers (6)

The rapid development of high-throughput technologies enables researchers to sequence the whole metagenome of a microbial community sampled directly from the environment. The assignment of these sequence reads into different species or taxonomical classes is a crucial step for metagenomic analysis, which is referred to as binning of metagenomic data. Most traditional binning methods rely on known ...

• ### Intelligent Consensus Modeling for Proline Cis-Trans Isomerization Prediction

Publication Year: 2014, Page(s):26 - 32
Cited by:  Papers (2)

Proline cis-trans isomerization (CTI) plays a key role in the rate-determining steps of protein folding. Accurate prediction of proline CTI is of great importance for the understanding of protein folding, splicing, cell signaling, and transmembrane active transport in both the human body and animals. Our goal is to develop a state-of-the-art proline CTI predictor based on a biophysically motivated...

• ### Gene Name Disambiguation Using Multi-Scope Species Detection

Publication Year: 2014, Page(s):55 - 62

Species detection is an important topic in the text mining field. According to the importance of the research topics (e.g., species assignment to genes and document focus species detection), some studies are dedicated to an individual topic. However, no researcher to date has discussed species detection as a general problem. Therefore, we developed a multi-scope species detection model to identify...

• ### Reliable and Fast Estimation of Recombination Rates by Convergence Diagnosis and Parallel Markov Chain Monte Carlo

Publication Year: 2014, Page(s):63 - 72
Cited by:  Papers (1)

Genetic recombination is an essential event during the process of meiosis resulting in an exchange of segments between paired chromosomes. Estimating recombination rate is crucial for understanding the process of recombination. Experimental methods are normally difficult and limited to small scale estimations. Thus statistical methods using population genetics data are important for large-scale an...

• ### A Survey and Comparative Study of Statistical Tests for Identifying Differential Expression from Microarray Data

Publication Year: 2014, Page(s):95 - 115
Cited by:  Papers (14)

DNA microarray is a powerful technology that can simultaneously determine the levels of thousands of transcripts (generated, for example, from genes/miRNAs) across different experimental conditions or tissue samples. The motto of differential expression analysis is to identify the transcripts whose expressions change significantly across different types of samples or experimental conditions. A num...

• ### Identifying Cis-Regulatory Elements and Modules Using Conditional Random Fields

Publication Year: 2014, Page(s):73 - 82
Cited by:  Papers (2)

Accurate identification of cis-regulatory elements and their correlated modules is essential for analysis of transcriptional regulation, which is a challenging problem in computational biology. Unsupervised learning has the advantage of compensating for missing annotated data, and is thus promising to be effective to identify cis-regulatory elements and modules. We introduced a Conditional Random ...

• ### Evolution and Controllability of Cancer Networks: A Boolean Perspective

Publication Year: 2014, Page(s):83 - 94
Cited by:  Papers (8)

Cancer forms a robust system capable of maintaining stable functioning (cell sustenance and proliferation) despite perturbations. Cancer progresses as stages over time typically with increasing aggressiveness and worsening prognosis. Characterizing these stages and identifying the genes driving transitions between them is critical to understand cancer progression and to develop effective anti-canc...

• ### A New Path Based Hybrid Measure for Gene Ontology Similarity

Publication Year: 2014, Page(s):116 - 127
Cited by:  Papers (11)

Gene Ontology (GO) consists of a controlled vocabulary of terms, annotating a gene or gene product, structured in a directed acyclic graph. In the graph, semantic relations connect the terms, that represent the knowledge of functional description and cellular component information of gene products. GO similarity gives us a numerical representation of biological relationship between a gene set, whi...

• ### Constructing a Gene Team Tree in Almost $O(n \lg n)$ Time

Publication Year: 2014, Page(s):142 - 153
Cited by:  Papers (1)

An important model of a conserved gene cluster is called the gene team model, in which a chromosome is defined to be a permutation of distinct genes and a gene team is defined to be a set of genes that appear in two or more species, with the distance between adjacent genes in the team for each chromosome always no more than a certain threshold δ. A gene team tree is a succinct way to repres...

• ### CAMS-RS: Clustering Algorithm for Large-Scale Mass Spectrometry Data Using Restricted Search Space and Intelligent Random Sampling

Publication Year: 2014, Page(s):128 - 141
Cited by:  Papers (6)

High-throughput mass spectrometers can produce massive amounts of redundant data at an astonishing rate with many of them having poor signal-to-noise (S/N) ratio. These low S/N ratio spectra may not get interpreted using conventional spectra-to-database matching techniques. In this paper, we present an efficient algorithm, CAMS-RS (Clustering Algorithm for Mass Spectra using Restricted Space and S...

• ### Detecting Differentially Coexpressed Genes from Labeled Expression Data: A Brief Review

Publication Year: 2014, Page(s):154 - 167
Cited by:  Papers (6)

We review methods for capturing differential coexpression, which can be divided into two cases by the size of gene sets: 1) two paired genes and 2) multiple genes. In the first case, two genes are positively and negatively correlated with each other under one and the other conditions, respectively. In the second case, multiple genes are coexpressed and randomly expressed under one and the other co...

• ### DNA Copy Number Selection Using Robust Structured Sparsity-Inducing Norms

Publication Year: 2014, Page(s):168 - 181
Cited by:  Papers (2)

Array comparative genomic hybridization (aCGH) is a newly introduced method for the detection of copy number abnormalities associated with human diseases with special focus on cancer. Specific patterns in DNA copy number variations (CNVs) can be associated with certain disease types and can facilitate prognosis and progress monitoring of the disease. Machine learning techniques have been used to m...

• ### Improved and Promising Identification of Human MicroRNAs by Incorporating a High-Quality Negative Set

Publication Year: 2014, Page(s):192 - 201
Cited by:  Papers (59)

MicroRNA (miRNA) plays an important role as a regulator in biological processes. Identification of (pre-) miRNAs helps in understanding regulatory processes. Machine learning methods have been designed for pre-miRNA identification. However, most of them cannot provide reliable predictive performances on independent testing data sets. We assumed this is because the training sets, especially the neg...

• ### HIV Haplotype Inference Using a Propagating Dirichlet Process Mixture Model

Publication Year: 2014, Page(s):182 - 191
Cited by:  Papers (15)

This paper presents a new computational technique for the identification of HIV haplotypes. HIV tends to generate many potentially drug-resistant mutants within the HIV-infected patient and being able to identify these different mutants is important for efficient drug administration. With the view of identifying the mutants, we aim at analyzing short deep sequencing data called reads. From a stati...

• ### Incorporation of Biological Pathway Knowledge in the Construction of Priors for Optimal Bayesian Classification

Publication Year: 2014, Page(s):202 - 218
Cited by:  Papers (18)

Small samples are commonplace in genomic/proteomic classification, the result being inadequate classifier design and poor error estimation. The problem has recently been addressed by utilizing prior knowledge in the form of a prior distribution on an uncertainty class of feature-label distributions. A critical issue remains: how to incorporate biological knowledge into the prior distribution. For ...

• ### Local Exact Pattern Matching for Non-Fixed RNA Structures

Publication Year: 2014, Page(s):219 - 230
Cited by:  Papers (3)

Detecting local common sequence-structure regions of RNAs is a biologically important problem. Detecting such regions allows biologists to identify functionally relevant similarities between the inspected molecules. We developed dynamic programming algorithms for finding common structure-sequence patterns between two RNAs. The RNAs are given by their sequence and a set of potential base pairs with...

• ### Multiple Sequence Alignment with Hidden Markov Models Learned by Random Drift Particle Swarm Optimization

Publication Year: 2014, Page(s):243 - 257
Cited by:  Papers (7)

Hidden Markov Models (HMMs) are powerful tools for multiple sequence alignment (MSA), which is known to be an NP-complete and important problem in bioinformatics. Learning HMMs is a difficult task, and many meta-heuristic methods, including particle swarm optimization (PSO), have been used for that. In this paper, a new variant of PSO, called the random drift particle swarm optimization (RDPSO) al...

• ### Maximizing Deep Coalescence Cost

Publication Year: 2014, Page(s):231 - 242
Cited by:  Papers (4)

The minimizing deep coalescence (MDC) problem seeks a species tree that reconciles the given gene trees with the minimum number of deep coalescence events, called deep coalescence (DC) cost. To better assess MDC species trees we investigate into a basic mathematical property of the DC cost, called the diameter. Given a gene tree, a species tree, and a leaf labeling function that assigns leaf-genes...

