• Supervised, Unsupervised, and Semi-Supervised Feature Selection: A Review on Gene Selection

Publication Year: 2016, Page(s):971 - 989
Recently, feature selection and dimensionality reduction have become fundamental tools for many data mining tasks, especially for processing high-dimensional data such as gene expression microarray data. Gene expression microarray data comprises up to hundreds of thousands of features with relatively small sample size. Because learning algorithms usually do not work well with this kind of data, a ...

• MiRTDL: A Deep Learning Approach for miRNA Target Prediction

Publication Year: 2016, Page(s):1161 - 1169
MicroRNAs (miRNAs) regulate genes that are associated with various diseases. To better understand miRNAs, the miRNA regulatory mechanism needs to be investigated and the real targets identified. Here, we present miRTDL, a new miRNA target prediction algorithm based on convolutional neural network (CNN). The CNN automatically extracts essential information from the input data rather than completely...

• A Novel Cluster Head Selection Algorithm Based on Fuzzy Clustering and Particle Swarm Optimization

Publication Year: 2017, Page(s):76 - 84
An important objective of wireless sensor network is to prolong the network life cycle, and topology control is of great significance for extending the network life cycle. Based on previous work, for cluster head selection in hierarchical topology control, we propose a solution based on fuzzy clustering preprocessing and particle swarm optimization. More specifically, first, fuzzy clustering algor...

• Emerging Security Mechanisms for Medical Cyber Physical Systems

Publication Year: 2016, Page(s):401 - 416
The following decade will witness a surge in remote health-monitoring systems that are based on body-worn monitoring devices. These Medical Cyber Physical Systems (MCPS) will be capable of transmitting the acquired data to a private or public cloud for storage and processing. Machine learning algorithms running in the cloud and processing this data can provide decision support to healthcare profes...

• Multi-Objective Particle Swarm Optimization Approach for Cost-Based Feature Selection in Classification

Publication Year: 2017, Page(s):64 - 75
Feature selection is an important data-preprocessing technique in classification problems such as bioinformatics and signal processing. Generally, there are some situations where a user is interested in not only maximizing the classification performance but also minimizing the cost that may be associated with features. This kind of problem is called cost-based feature selection. However, most exis...

• Symbiosis-Based Alternative Learning Multi-Swarm Particle Swarm Optimization

Publication Year: 2017, Page(s):4 - 14
Inspired by the ideas from the mutual cooperation of symbiosis in natural ecosystem, this paper proposes a new variant of PSO, named Symbiosis-based Alternative Learning Multi-swarm Particle Swarm Optimization (SALMPSO). A learning probability to select one exemplar out of the center positions, the local best position, and the historical best position including the experience of internal and exter...

• A Survey on Filter Techniques for Feature Selection in Gene Expression Microarray Analysis

Publication Year: 2012, Page(s):1106 - 1119
A plenitude of feature selection (FS) methods is available in the literature, most of them rising as a need to analyze data of very high dimension, usually hundreds or thousands of variables. Such data sets are now available in various application areas like combinatorial chemistry, text mining, multivariate imaging, or bioinformatics. As a general accepted rule, these methods are grouped in filte...

• A New Approach for Feature Selection from Microarray Data Based on Mutual Information

Publication Year: 2016, Page(s):1004 - 1015
Mutual information (MI) is a powerful concept for correlation-centric applications. It has been used for feature selection from microarray gene expression data in many works. One of the merits of MI is that, unlike many other heuristic methods, it is based on a mature theoretic foundation. When applied to microarray data, however, it faces some challenges. First, due to the large number of feature...

• A Simple but Powerful Heuristic Method for Accelerating $k$-Means Clustering of Large-Scale Data in Life Science

Publication Year: 2014, Page(s):681 - 692
K-means clustering has been widely used to gain insight into biological systems from large-scale life science data. To quantify the similarities among biological data sets, Pearson correlation distance and standardized Euclidean distance are used most frequently; however, optimization methods have been largely unexplored. These two distance measurements are equivalent in the sense that they yield ...

• A Cooperative Framework for Fireworks Algorithm

Publication Year: 2017, Page(s):27 - 41
This paper presents a cooperative framework for fireworks algorithm (CoFFWA). A detailed analysis of existing fireworks algorithm (FWA) and its recently developed variants has revealed that (i) the current selection strategy has the drawback that the contribution of the firework with the best fitness (denoted as core firework) overwhelms the contributions of all other fireworks (non-core fireworks...

• Solving NP-Hard Problems with Physarum-Based Ant Colony System

Publication Year: 2017, Page(s):108 - 120
NP-hard problems exist in many real world applications. Ant colony optimization (ACO) algorithms can provide approximate solutions for those NP-hard problems, but the performance of ACO algorithms is significantly reduced due to premature convergence and weak robustness, etc. With these observations in mind, this paper proposes a Physarum-based pheromone matrix optimization strategy in ant colony ...

• Graphical Representation and Similarity Analysis of Protein Sequences Based on Fractal Interpolation

Publication Year: 2017, Page(s):182 - 192
A new graphical representation of protein sequences is introduced in this paper. Nine main physicochemical properties of amino acids were used to obtain a 2D discrete point set for protein sequences by applying principal component analysis. The fractal method was then employed to interpolate discrete points in constructing a graphical representation of protein sequences. Fractal dimension of the p...

• A New Method to Predict RNA Secondary Structure Based on RNA Folding Simulation

Publication Year: 2016, Page(s):990 - 995
RNA plays an important role in various biological processes; hence, it is essential when determining the functions of RNA to research its secondary structures. So far, the accuracy of RNA secondary structure prediction remains an area in need of improvement. This paper presents a novel method for predicting RNA secondary structure based on an RNA folding simulation model. This model assumes that t...

• Integrative Data Analysis of Multi-Platform Cancer Data with a Multimodal Deep Learning Approach

Publication Year: 2015, Page(s):928 - 937
Identification of cancer subtypes plays an important role in revealing useful insights into disease pathogenesis and advancing personalized therapy. The recent development of high-throughput sequencing technologies has enabled the rapid collection of multi-platform genomic data (e.g., gene expression, miRNA expression, and DNA methylation) for the same set of tumor samples. Although numerous integ...

• Biclustering algorithms for biological data analysis: a survey

Publication Year: 2004, Page(s):24 - 45
A large number of clustering approaches have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the results from the application of standard clustering methods to genes are limited. This limitation is imposed by the existence of a number of experimental conditions where the activity of genes is uncorrelated. A similar limitation exists when cluste...

• A Gene Selection Method for Microarray Data Based on Binary PSO Encoding Gene-to-Class Sensitivity Information

Publication Year: 2017, Page(s):85 - 96
Traditional gene selection methods for microarray data mainly considered the features' relevance by evaluating their utility for achieving accurate prediction or exploiting data variance and distribution, and the selected genes were usually poorly explicable. To improve the interpretability of the selected genes as well as prediction accuracy, an improved gene selection method based on binary par...

• Identification of Essential Proteins Based on Edge Clustering Coefficient

Publication Year: 2012, Page(s):1070 - 1080
Identification of essential proteins is key to understanding the minimal requirements for cellular life and important for drug design. The rapid increase of available protein-protein interaction (PPI) data has made it possible to detect protein essentiality on network level. A series of centrality measures have been proposed to discover essential proteins based on network topology. However, most o...

• A Novel Method to Detect Functional microRNA Regulatory Modules by Bicliques Merging

Publication Year: 2016, Page(s):549 - 556
MicroRNAs (miRNAs) are post-transcriptional regulators that repress the expression of their targets. They are known to work cooperatively with genes and play important roles in numerous cellular processes. Identification of miRNA regulatory modules (MRMs) would aid deciphering the combinatorial effects derived from the many-to-many regulatory relationships in complex cellular systems. Here, we dev...

• bLARS: An Algorithm to Infer Gene Regulatory Networks

Publication Year: 2016, Page(s):301 - 314
Inferring gene regulatory networks (GRNs) from high-throughput gene-expression data is an important and challenging problem in systems biology. Several existing algorithms formulate GRN inference as a regression problem. The available regression based algorithms are based on the assumption that all regulatory interactions are linear. However, nonlinear transcription regulation mechanisms are commo...

• Fast and Scalable Feature Selection for Gene Expression Data Using Hilbert-Schmidt Independence Criterion

Publication Year: 2017, Page(s):167 - 181
Goal: In computational biology, selecting a small subset of informative genes from microarray data continues to be a challenge due to the presence of thousands of genes. This paper aims at quantifying the dependence between gene expression data and the response variables and to identifying a subset of the most informative genes using a fast and scalable multivariate algorithm. ...

• Network-Based Method for Inferring Cancer Progression at the Pathway Level from Cross-Sectional Mutation Data

Publication Year: 2016, Page(s):1036 - 1044
Large-scale cancer genomics projects are providing a wealth of somatic mutation data from a large number of cancer patients. However, it is difficult to obtain several samples with a temporal order from one patient in evaluating the cancer progression. Therefore, one of the most challenging problems arising from the data is to infer the temporal order of mutations across many patients. To solve th...

• Bidirectional Long Short-Term Memory Networks for Predicting the Subcellular Localization of Eukaryotic Proteins

Publication Year: 2007, Page(s):441 - 446
An algorithm called bidirectional long short-term memory networks (BLSTM) for processing sequential data is introduced. This supervised learning method trains a special recurrent neural network to use very long-range symmetric sequence context using a combination of nonlinear processing elements and linear feedback loops for storing long-range context. The algorithm is applied to the sequence-base...

• A Comparative Study for Identifying the Chromosome-Wide Spatial Clusters from High-Throughput Chromatin Conformation Capture data

Publication Year: 2017, Page(s): 1
In the past years, the high-throughput sequencing technologies have enabled massive insights into genomic annotations. In contrast, the full-scale three-dimensional arrangements of genomic regions are relatively unknown. Thanks to the recent breakthroughs in High-throughput Chromosome Conformation Capture (Hi-C) techniques, non-negative matrix factorization (NMF) has been adopted to identify local...

• Private Data Analytics on Biomedical Sensing Data via Distributed Computation

Publication Year: 2016, Page(s):431 - 444
Advances in biomedical sensors and mobile communication technologies have fostered the rapid growth of mobile health (mHealth) applications in the past years. Users generate a high volume of biomedical data during health monitoring, which can be used by the mHealth server for training predictive models for disease diagnosis and treatment. However, the biomedical sensing data raise serious privacy ...

• A New Magnetotactic Bacteria Optimization Algorithm Based on Moment Migration

Publication Year: 2017, Page(s):15 - 26
Magnetotactic bacteria is a kind of polyphyletic group of prokaryotes with the characteristics of magnetotaxis that make them orient and swim along geomagnetic field lines. Its distinct biology characteristics are useful to design new optimization technology. In this paper, a new bionic optimization algorithm named Magnetotactic Bacteria Moment Migration Algorithm (MBMMA) is proposed. In the propo...

• Improving the Prediction of Clinical Outcomes from Genomic Data Using Multiresolution Analysis

Publication Year: 2012, Page(s):1442 - 1450
The prediction of a patient's future clinical outcome, such as Alzheimer's and cardiac disease, using only genomic information is an open problem. In cases when genome-wide association studies (GWASs) are able to find strong associations between genomic predictors (e.g., SNPs) and disease, pattern recognition methods may be able to predict the disease well. Furthermore, by using signal processing me...

• Constructing DNA Barcode Sets based on Particle Swarm Optimization

Publication Year: 2017, Page(s): 1
Following the completion of the human genome project, a large amount of high-throughput bio-data was generated. To analyze these data, massively parallel sequencing, namely next-generation sequencing, was rapidly developed. DNA barcodes are used to identify the ownership between sequences and samples when they are attached at the beginning or end of sequencing reads. Constructing DNA barcode sets ...

• A review on methods for detecting SNP interactions in high-dimensional genomic data

Publication Year: 2017, Page(s): 1
In this era of genome-wide association studies (GWAS), the quest for understanding the genetic architecture of complex diseases is rapidly increasing more than ever before. The development of high throughput genotyping and next generation sequencing technologies enables genetic epidemiological analysis of large scale data. These advances have led to the identification of a number of single nucleot...

• An Organelle Correlation-Guided Feature Selection Approach for Classifying Multi-Label Subcellular Bio-images

Publication Year: 2017, Page(s): 1
Nowadays, with the advances in microscopic imaging, accurate classification of bioimage-based protein subcellular location pattern has attracted as much attention as ever. One of the basic challenging problems is how to select the useful feature components among thousands of potential features to describe the images. This is not an easy task especially considering there is a high ratio of multi-lo...

• Three-Dimensional Path Planning for Uninhabited Combat Aerial Vehicle Based on Predator-Prey Pigeon-Inspired Optimization in Dynamic Environment

Publication Year: 2017, Page(s):97 - 107
Three-dimension path planning of uninhabited combat aerial vehicle (UCAV) is a complicated optimal problem, which mainly focuses on optimizing the flight route considering the different types of constraints under complex combating environment. A novel predator-prey pigeon-inspired optimization (PPPIO) is proposed to solve the UCAV three-dimension path planning problem in dynamic environment. Pigeon...

• Feature Selection for Optimized High-dimensional Biomedical Data using the Improved Shuffled Frog Leaping Algorithm

Publication Year: 2016, Page(s): 1
High dimensional biomedical datasets contain thousands of features which can be used in molecular diagnosis of disease; however, such datasets contain many irrelevant or weak correlation features which influence the predictive accuracy of diagnosis. Without a feature selection algorithm, it is difficult for the existing classification techniques to accurately identify patterns in the features. The...

• Fast prediction of protein methylation sites using a sequence-based feature selection technique

Publication Year: 2017, Page(s): 1
Protein methylation, an important post-translational modification, plays crucial roles in many cellular processes. The accurate prediction of protein methylation sites is fundamentally important for revealing the molecular mechanisms undergoing methylation. In recent years, computational prediction based on machine learning algorithms has emerged as a powerful and robust approach for identifying m...

• A Characterization of Minimum Spanning Tree-Like Metric Spaces

Publication Year: 2017, Page(s):468 - 471
Recent years have witnessed a surge of biological interest in the minimum spanning tree (MST) problem for its relevance to automatic model construction using the distances between data points. Despite the increasing use of MST algorithms for this purpose, the goodness-of-fit of an MST to the data is often elusive because no quantitative criteria have been developed to measure it. Motivated by this...

• CAVER: Algorithms for Analyzing Dynamics of Tunnels in Macromolecules

Publication Year: 2016, Page(s):505 - 517
The biological function of a macromolecule often requires that a small molecule or ion is transported through its structure. The transport pathway often leads through void spaces in the structure. The properties of transport pathways change significantly in time; therefore, the analysis of a trajectory from molecular dynamics rather than of a single static structure is needed for understanding the...

• Cancer Progression Prediction Using Gene Interaction Regularized Elastic Net

Publication Year: 2017, Page(s):145 - 154
Different types of genomic aberration may simultaneously contribute to tumorigenesis. To obtain a more accurate prognostic assessment to guide therapeutic regimen choice for cancer patients, the heterogeneous multi-omics data should be integrated harmoniously, which can often be difficult. For this purpose, we propose a Gene Interaction Regularized Elastic Net (GIREN) model that predicts clinical ...

• Regularized Non-negative Matrix Factorization for Identifying Differential Genes and Clustering Samples: a Survey

Publication Year: 2017, Page(s): 1
Non-negative Matrix Factorization (NMF), a classical method for dimensionality reduction, has been applied in many fields. It is based on the idea that negative numbers are physically meaningless in various data-processing tasks. Apart from its contribution to conventional data analysis, the recent overwhelming interest in NMF is due to its newly discovered ability to solve challenging data mining...

• Identifying bacterial essential genes based on a feature-integrated method

Publication Year: 2017, Page(s): 1
Essential genes are those genes of an organism that are considered to be crucial for its survival. Identification of essential genes is therefore of great significance to advance our understanding of the principles of cellular life. We have developed a novel computational method, which can effectively predict bacterial essential genes by extracting and integrating homologous features, protein doma...

• Construction of refined protein interaction network for predicting essential proteins

Publication Year: 2017, Page(s): 1
Identification of essential proteins based on protein interaction network (PIN) is a very important and hot topic in the post genome era. Up to now, a number of network-based essential protein discovery methods have been proposed. Generally, a static protein interaction network was constructed by using the protein-protein interactions obtained from different experiments or databases. Unfortunately...

• PerPAS: Topology-Based Single Sample Pathway Analysis Method

Publication Year: 2017, Page(s): 1
Identification of intracellular pathways that play key roles in cancer progression and drug resistance is a prerequisite for developing targeted cancer treatments. The era of personalized medicine calls for computational methods that can function with one sample or very small set of samples. Developing such methods is challenging because standard statistical approaches pose several limiting assump...

• A Mutation Model from First Principles of the Genetic Code

Publication Year: 2016, Page(s):878 - 886
The paper presents a neutral Codons Probability Mutations (CPM) model of molecular evolution and genetic decay of an organism. The CPM model uses a Markov process with a 20-dimensional state space of probability distributions over amino acids. The transition matrix of the Markov process includes the mutation rate and those single point mutations compatible with the genetic code. This is an alterna...

• Robust Dynamic Multi-objective Vehicle Routing Optimization Method

Publication Year: 2017, Page(s): 1
For dynamic multi-objective vehicle routing problems, the waiting time of vehicle, the number of serving vehicles, the total distance of routes were normally considered as the optimization objectives. Except for above objectives, fuel consumption that leads to the environmental pollution and energy consumption was focused on in this paper. Considering the vehicles' load and the driving dist...

• Security Assessment of Cyberphysical Digital Microfluidic Biochips

Publication Year: 2016, Page(s):445 - 458
A digital microfluidic biochip (DMFB) is an emerging technology that enables miniaturized analysis systems for point-of-care clinical diagnostics, DNA sequencing, and environmental monitoring. A DMFB reduces the rate of sample and reagent consumption, and automates the analysis of assays. In this paper, we provide the first assessment of the security vulnerabilities of DMFBs. We identify result-ma...

• Improve Glioblastoma Multiforme Prognosis Prediction by Using Feature Selection and Multiple Kernel Learning

Publication Year: 2016, Page(s):825 - 835
Glioblastoma multiforme (GBM) is a highly aggressive type of brain cancer with very low median survival. In order to predict the patient's prognosis, researchers have proposed rules to classify different glioma cancer cell subtypes. However, survival time of different subtypes of GBM is often various due to different individual basis. Recent development in gene testing has evolved classic subtype ...

• An Application of Invertibility of Boolean Control Networks to the Control of the Mammalian Cell Cycle

Publication Year: 2017, Page(s):225 - 229
In Fauré et al. (2006), the dynamics of the core network regulating the mammalian cell cycle is formulated as a Boolean control network (BCN) model consisting of nine proteins as state nodes and a tenth protein (protein CycD) as the control input node. In this model, one of the state nodes, protein Cdc20, plays a central role in the separation of sister chromatids. Hence, if any Cdc20 sequ...

• Fireworks Algorithm with Enhanced Fireworks Interaction

Publication Year: 2017, Page(s):42 - 55
As a relatively new metaheuristic in swarm intelligence, fireworks algorithm (FWA) has exhibited promising performance on a wide range of optimization problems. This paper aims to improve FWA by enhancing fireworks interaction in three aspects: 1) Developing a new Gaussian mutation operator to make sparks learn from more exemplars; 2) Integrating the regular explosion operator of FWA with the migr...

• Swarm Robots Search for Multiple Targets Based on an Improved Grouping Strategy

Publication Year: 2017, Page(s): 1
Swarm robots search for multiple targets in collaboration in unknown environments has been addressed in this paper. An improved grouping strategy based on constriction factors Particle Swarm Optimization is proposed. Robots are grouped under this strategy after several iterations of stochastic movements, which considers the influence range of targets and environmental information they have sensed....

• A Visual Interface for Querying Heterogeneous Phylogenetic Databases

Publication Year: 2017, Page(s):131 - 144
Despite the recent growth in the number of phylogenetic databases, access to this wealth of resources remains largely tool or form-based interface driven. It is our thesis that the flexibility afforded by declarative query languages may offer the opportunity to access these repositories in a better way, and to use such a language to pose truly powerful queries in unprecedented ways. In this paper,...

• Biomarker Identification and Cancer Classification Based on Microarray Data Using Laplace Naive Bayes Model with Mean Shrinkage

Publication Year: 2012, Page(s):1649 - 1662
Biomarker identification and cancer classification are two closely related problems. In gene expression data sets, the correlation between genes can be high when they share the same biological pathway. Moreover, the gene expression data sets may contain outliers due to either chemical or electrical reasons. A good gene selection method should take group effects into account and be robust to outlie...

• Indexing Graphs for Path Queries with Applications in Genome Research

Publication Year: 2014, Page(s):375 - 388
We propose a generic approach to replace the canonical sequence representation of genomes with graph representations, and study several applications of such extensions. We extend the Burrows-Wheeler transform (BWT) of strings to acyclic directed labeled graphs, to support path queries as an extension to substring searching. We develop, apply, and tailor this technique to a) read alignment on an ex...

• Predicting Metabolic Fluxes Using Gene Expression Differences As Constraints

Publication Year: 2011, Page(s):206 - 216
A standard approach to estimate intracellular fluxes on a genome-wide scale is flux-balance analysis (FBA), which optimizes an objective function subject to constraints on (relations between) fluxes. The performance of FBA models heavily depends on the relevance of the formulated objective function and the completeness of the defined constraints. Previous studies indicated that FBA predictions can...

