Bioinformatics and Biomedicine (BIBM), 2012 IEEE International Conference on

Date 4-7 Oct. 2012

Displaying Results 1 - 25 of 137
  • Author index

    Page(s): 1 - 6
    PDF (62 KB)
    Freely Available from IEEE
  • Preface

    Page(s): 1 - 2
    PDF (42 KB)
    Freely Available from IEEE
  • Conference committees

    Page(s): 1 - 2
    PDF (33 KB)
    Freely Available from IEEE
  • [Copyright notice]

    Page(s): 1
    PDF (31 KB)
    Freely Available from IEEE
  • [Title page]

    Page(s): 1
    PDF (74 KB)
    Freely Available from IEEE
  • Program committees

    Page(s): 1 - 12
    PDF (61 KB)
    Freely Available from IEEE
  • Robust RFCM algorithm for identification of co-expressed miRNAs

    Page(s): 1 - 4
    PDF (1459 KB) | HTML

    MicroRNAs (miRNAs) are short, endogenous RNAs with the ability to regulate gene expression at the post-transcriptional level. Various studies have revealed that miRNAs tend to cluster on chromosomes. Members of a cluster in close proximity on a chromosome are highly likely to be processed as co-transcribed units; therefore, a large proportion of miRNAs are co-expressed. Expression profiling of miRNAs generates a huge volume of data, and the complicated network of miRNA-mRNA interactions makes this expression data challenging to decipher. To extract meaningful information from expression data, this paper presents the application of the robust rough-fuzzy c-means (rRFCM) algorithm to discover co-expressed miRNA clusters. The rRFCM algorithm judiciously integrates rough sets, fuzzy sets, and the c-means algorithm. The effectiveness of the rRFCM algorithm and of different initialization methods, along with a comparison with related methods, is demonstrated on three miRNA microarray expression data sets using the Silhouette index, Davies-Bouldin index, Dunn index, β index, and gene ontology based analysis.
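The rough-fuzzy c-means algorithm above builds on the classical fuzzy c-means membership update. A minimal sketch of that core step (the standard fuzzy part only, not the rough-set extension the paper adds; the toy data is illustrative):

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0):
    """Fuzzy c-means membership update: u[i, k] is the degree to which
    point i belongs to cluster k, computed from relative distances."""
    # Euclidean distance from every point to every cluster center.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.fmax(d, 1e-12)                        # avoid division by zero
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1

# Toy expression profiles: two obvious groups.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
U = fcm_memberships(X, centers)
```

In the rough-fuzzy variant, points with a dominant membership fall into a cluster's lower approximation, while ambiguous points stay in the boundary region.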
  • Skeleton Timed Up and Go

    Page(s): 1 - 5
    PDF (1216 KB) | HTML

    This paper presents a novel approach to fully automating the Timed Up and Go assessment test (TUG) in professional environments. The approach, called Skeleton Timed Up and Go (sTUG), is based on two Kinect for Xbox 360 sensors. sTUG supports the execution and documentation of the traditional TUG assessment test and furthermore calculates nine events, which demarcate the five main components of a run. Over two days we conducted an experiment with five elderly participants aged 70-84 and four males aged 29-31 in the activity laboratory of the OFFIS Institute, Oldenburg, to prove the reliability and feasibility of the system. Results demonstrate that sTUG can precisely measure the total duration of the traditional TUG and can accurately detect the nine motion events that demarcate the components of a run.
  • The human imprintome v1.0: Over 120 imprinted genes in the human genome impose a major review on previous censuses

    Page(s): 1 - 5
    PDF (2063 KB) | HTML

    A relatively small number of human genes are marked with their parental origin and undergo a process termed "genomic imprinting". The field has grown rapidly in the last 20 years: around 100 imprinted genes are now known in the mouse and approximately 120 in the human, comprising the updated whole set of human imprinted genes, the imprintome. Here, the human imprintome is analyzed through the application of Semantic Web and Linked Data approaches to a few structured datasets, in order to provide a comprehensive collection of all known and predicted imprinted genes in the human genome. We have examined, compiled, structured and linked data to serve as a shared resource for genome and epigenome studies of imprinted genes. Moreover, we offer our datasets of structured, linked data as the actual research outcome of this human imprintome analysis: as genomics becomes more and more data intensive, owing to the huge amounts of biological data, so does our need for more structured data that can be more easily mined and shared. The resulting version of the Linked Human Imprintome is a project within the Linked Open Data (LOD) initiative (http://lod-cloud.net/) through Data Hub (http://thedatahub.org/en/dataset/a-draft-version-of-the-linked-human-imprintome).
  • Improved biomarker performance for the detection of hepatocellular carcinoma by inclusion of clinical parameters

    Page(s): 1 - 5
    PDF (11989 KB) | HTML

    We have previously identified several biomarkers of hepatocellular carcinoma (HCC). The levels of three of these biomarkers were analyzed individually and in combination with the currently used marker, alpha fetoprotein (AFP), for their ability to distinguish between a diagnosis of cirrhosis (n=113) and HCC (n=164). We utilized several novel biostatistical tools, along with clinical factors such as age and gender, to determine whether improved algorithms could increase the probability of detecting cancer. Using several of these methods, we are able to detect HCC against a background of cirrhosis with an AUC of at least 0.95. The use of clinical factors in combination with biomarker values to detect HCC is discussed.
  • Extracting BI-RADS features from Portuguese clinical texts

    Page(s): 1 - 4
    PDF (483 KB) | HTML

    In this work we build the first BI-RADS parser for Portuguese free texts, modeled after existing approaches for extracting BI-RADS features from English medical records. Our concept finder uses a semantic grammar based on the BI-RADS lexicon and on iteratively transferred expert knowledge. We compare the performance of our algorithm to manual annotation by a specialist in mammography. Our results show that our parser's performance is comparable to the manual method.
  • A model to predict and analyze protein-protein interaction types using electrostatic energies

    Page(s): 1 - 5
    PDF (932 KB) | HTML

    Identification and analysis of the types of protein-protein interactions (PPI) is an important problem in molecular biology because of their key role in many biological processes in living cells. We propose a model to predict and analyze protein interaction types using electrostatic energies as properties to distinguish between obligate and non-obligate interactions. Our prediction approach uses electrostatic energies for pairs of atoms and amino acids present in the interfaces where the interaction occurs. Using support vector machines and linear dimensionality reduction as classifiers, our results confirm that electrostatic energy is an important property for predicting obligate and non-obligate protein interaction types, achieving accuracies of over 96% on two well-known datasets.
  • Building a classifier for identifying sentences pertaining to disease-drug relationships in tardive dyskinesia

    Page(s): 1 - 4
    PDF (1581 KB) | HTML

    In this paper, we attempt to build a pipeline that identifies and extracts disease-drug relationships via sentence classification, and demonstrate the feasibility and utility of our approach using tardive dyskinesia as a case study. We manually developed and annotated a biomedical training corpus for tardive dyskinesia. Using 10-fold cross validation, we trained and tested a naïve Bayes classifier to identify sentences pertaining to disease-drug relationships. Our precision, recall, and F-measure were all approximately 66%, and the area under the ROC curve was over 80%. Our method helps to elucidate various drug effects on tardive dyskinesia and constitutes an initial effort toward the task of disease-drug relationship extraction.
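The sentence-classification step above can be illustrated with a tiny multinomial naïve Bayes classifier over bag-of-words features. This is a generic sketch with made-up example sentences, not the authors' corpus, feature set, or evaluation:

```python
import math
from collections import Counter, defaultdict

class NaiveBayesSentenceClassifier:
    """Multinomial naive Bayes over bag-of-words features,
    with add-one (Laplace) smoothing."""

    def fit(self, sentences, labels):
        self.word_counts = defaultdict(Counter)   # label -> word counts
        self.label_counts = Counter(labels)
        for sent, label in zip(sentences, labels):
            self.word_counts[label].update(sent.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, sentence):
        scores = {}
        for label, n_docs in self.label_counts.items():
            total = sum(self.word_counts[label].values())
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / sum(self.label_counts.values()))
            for word in sentence.lower().split():
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)

# Hypothetical training sentences (not from the paper's corpus).
train = [("haloperidol induced tardive dyskinesia in patients", "relation"),
         ("clozapine improved tardive dyskinesia symptoms", "relation"),
         ("the study enrolled forty patients overall", "other"),
         ("data were collected at baseline visits", "other")]
clf = NaiveBayesSentenceClassifier().fit([s for s, _ in train],
                                         [y for _, y in train])
```

In practice such a classifier would be wrapped in 10-fold cross validation, as the abstract describes, rather than trained on four sentences.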
  • Using gene sets to identify putative drugs for breast cancer

    Page(s): 1 - 4
    PDF (2271 KB) | HTML

    The number of current anti-cancer drugs is limited, and their response rates are not high. To "reposition" known drugs as anti-cancer drugs and thereby increase therapeutic efficiency, we present a novel analysis framework for identifying putative drugs for cancer. Using breast cancer as an example, a "cancer - gene sets - drugs" network was constructed through two procedures. First, the "gene sets - drugs" network was built by applying the expression patterns of drugs to gene set enrichment analysis. Second, the gene sets associated with breast cancer progression were identified by survival analysis of patient cohorts. By integrating the two results, 25 tumor-progression-associated gene sets and 360 putative anti-cancer drugs were identified. Our method can identify "repositioned" drugs and the potentially affected mechanisms of tumor progression concurrently. It should help speed the development of anti-cancer drugs from bench to clinical application.
  • On the design of advanced filters for biological networks using graph theoretic properties

    Page(s): 1 - 5
    PDF (2439 KB) | HTML

    Network modeling of biological systems is a powerful tool for the analysis of high-throughput datasets by computational systems biologists. Integrating networks into a heterogeneous model requires that each network be as noise-free as possible while still containing relevant biological information. In earlier work, we showed that the graph theoretic properties of gene correlation networks can be used to highlight and maintain important structures such as high-degree nodes, clusters, and critical links between sparse network branches while reducing noise. In this paper, we propose the design of advanced network filters using structurally related graph theoretic properties. While spanning trees and chordal subgraphs provide filters with special advantages, we hypothesize that a hybrid subgraph sampling method allows the design of a more effective filter that preserves key properties in biological networks. We show that the proposed approach allows us to optimize a number of parameters associated with the filtering process, which in turn improves the identification of essential genes in mouse aging networks.
  • De novo co-assembly of bacterial genomes from multiple single cells

    Page(s): 1 - 5
    PDF (1091 KB) | HTML

    Recent progress in DNA amplification techniques, particularly multiple displacement amplification (MDA), has made it possible to sequence and assemble bacterial genomes from a single cell. However, the quality of single cell genome assembly has not yet reached that of normal multicell genome assembly, due to the coverage bias and errors caused by MDA. Using a template of more than one cell for MDA, or combining separate MDA products, has been shown to improve genome assembly from a few single cells, but providing identical single cells, a necessary step for these approaches, is a challenge. As a solution to this problem, we give an algorithm for de novo co-assembly of bacterial genomes from multiple single cells. Our novel method not only detects the outlier cells in a pool, it also identifies and eliminates their genomic sequences from the final assembly. Our proposed co-assembly algorithm is based on the colored de Bruijn graph, which was recently proposed for de novo structural variation detection. Our results show that de novo co-assembly of bacterial genomes from multiple single cells outperforms single cell assembly of each individual cell in all standard metrics. Moreover, co-assembly outperforms mixed assembly, in which the input datasets are simply concatenated. We implemented our algorithm in a software tool called HyDA, which is available from http://compbio.cs.wayne.edu/software/hyda.
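The colored de Bruijn graph underlying the co-assembly can be pictured as a k-mer table in which every k-mer remembers which cell ("color") it came from; an outlier cell then shows up as one whose k-mers are rarely shared with other colors. An illustrative toy, not the HyDA implementation:

```python
from collections import defaultdict

def colored_kmer_table(reads_by_color, k):
    """Map each k-mer to the set of colors (cells) whose reads contain it."""
    table = defaultdict(set)
    for color, reads in reads_by_color.items():
        for read in reads:
            for i in range(len(read) - k + 1):
                table[read[i:i + k]].add(color)
    return table

def sharing_fraction(table, color):
    """Fraction of a color's k-mers that also occur in another color.
    A low value flags the color as a potential outlier cell."""
    mine = [colors for colors in table.values() if color in colors]
    return sum(len(colors) > 1 for colors in mine) / len(mine)

# Toy reads: cells A and B carry the same genome; C is an outlier.
reads = {"A": ["ACGTACGT"], "B": ["ACGTACGT"], "C": ["TTTTTTTT"]}
table = colored_kmer_table(reads, k=4)
```

A real assembler would walk unitigs in this graph and use per-color coverage, not just presence, to separate outlier sequence from the consensus.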
  • A Log-Linear Graphical Model for inferring genetic networks from high-throughput sequencing data

    Page(s): 1 - 6
    PDF (1924 KB) | HTML

    Gaussian graphical models are often used to infer gene networks from microarray expression data. Many scientists, however, have begun using high-throughput sequencing technologies to measure gene expression. As the resulting high-dimensional data consist of counts of sequencing reads for each gene, Gaussian graphical models are not optimal for modeling gene networks based on this discrete data. We develop a novel method for estimating high-dimensional Poisson graphical models, the Log-Linear Graphical Model, allowing us to infer networks from high-throughput sequencing data. Our model assumes a pair-wise Markov property: conditional on all other variables, each variable is Poisson. We estimate our model locally via neighborhood selection, by fitting 1-norm penalized log-linear models. Additionally, we develop a fast parallel algorithm permitting us to fit our graphical model to high-dimensional genomic data sets. We illustrate the effectiveness of our methods for recovering network structure from count data through simulations and a case study on breast cancer microRNA networks.
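The neighborhood-selection step — regressing each gene's counts on all other genes with a 1-norm penalized Poisson log-linear model — can be sketched with a plain proximal-gradient (ISTA) loop. The data, step size, and penalty below are illustrative choices, not the paper's algorithm or tuning:

```python
import numpy as np

def l1_poisson_fit(X, y, lam, step=0.01, iters=500):
    """ISTA for the l1-penalized Poisson log-linear model
    y_i ~ Poisson(exp(x_i . beta)). Returns the coefficient vector;
    its nonzero entries form the estimated neighborhood."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(iters):
        grad = X.T @ (np.exp(X @ beta) - y) / n   # gradient of the NLL
        z = beta - step * grad                    # gradient step
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold
    return beta

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_beta = np.array([1.0, 0.0, 0.0, 0.0, 0.0])  # one true neighbor
y = rng.poisson(np.exp(X @ true_beta))
beta = l1_poisson_fit(X, y, lam=0.5)
```

Running one such regression per gene, in parallel, and joining the resulting neighborhoods yields the network estimate the abstract describes.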
  • Maps, rates, and fuzzy mountains: Generating meaningful risk maps

    Page(s): 1 - 4
    PDF (7346 KB) | HTML

    Creating meaningful maps that represent rates and risks in the population is a challenge. Risk rates are often computed for small area units, such as census entities, that may contain small population counts. Due to the unstable nature of such estimates, maps produced from such data are likely to misrepresent the risk of an event's occurrence over geographic space. This paper introduces two systems based on distinct approaches to generating risk maps that are not biased by the underlying population distribution of a given region: the adaptive kernel density estimation procedure implemented in WebDMAP and the population-uniform partitioning method included in UPAS. A comparison of the two systems shows that qualitatively similar results can be obtained by both approaches.
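Kernel density estimation of the kind WebDMAP applies can be sketched in a few lines. A fixed-bandwidth version is shown for brevity, whereas the adaptive variant varies the bandwidth with local population density; the event locations and bandwidth are illustrative:

```python
import numpy as np

def gaussian_kde_2d(points, grid, bandwidth=1.0):
    """Fixed-bandwidth 2D Gaussian kernel density estimate,
    evaluated at each grid location."""
    diff = grid[:, None, :] - points[None, :, :]          # (G, N, 2)
    sq = (diff ** 2).sum(axis=2) / (2 * bandwidth ** 2)
    kernel = np.exp(-sq) / (2 * np.pi * bandwidth ** 2)
    return kernel.mean(axis=1)                            # average over events

# Hypothetical event locations clustered near the origin.
events = np.array([[0.0, 0.0], [0.2, 0.1], [-0.1, 0.2], [0.1, -0.2]])
grid = np.array([[0.0, 0.0], [5.0, 5.0]])
density = gaussian_kde_2d(events, grid)
```

A risk surface is typically the ratio of two such estimates — events over population — so that dense populations alone do not appear as high risk.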
  • Composition of bioinformatics model federations using communication aspects

    Page(s): 1 - 5
    PDF (1536 KB) | HTML

    Scientists in bioinformatics research use multiple software tools and models for their analysis of genomic data, involving a mixture of stand-alone and custom software. Stand-alone application tasks include computational data analysis and data visualization. The genomic data links the software together; the output of one tool may be suitable as input for another. Scientists often create custom tools and scripts to perform additional computational and processing tasks, such as filtering, before passing data from one tool to the next. This interoperating collection of tools and models becomes a "federation" of collaborative software. We propose an approach that uses aspect-oriented programming (AOP) and a Linda-style tuple space coordination language to connect the individual components of the federation: we assemble the separate components into the larger complex system, ideally without altering the original components. This paper describes the concepts behind our approach of using AOP and tuple spaces to intercept, filter and transform data, as well as to manage tool execution and coordination.
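A Linda-style tuple space of the kind the paper builds on can be sketched in a few lines of Python, with `None` acting as a wildcard in templates. This is a generic illustration, not the authors' coordination layer; real Linda `in`/`rd` operations block until a match appears, while this non-blocking sketch raises `StopIteration` instead:

```python
class TupleSpace:
    """Minimal Linda-style tuple space: out() publishes a tuple,
    rd() reads a matching tuple, in_() reads and removes it."""

    def __init__(self):
        self._tuples = []

    def out(self, tup):
        self._tuples.append(tup)

    def _match(self, template, tup):
        # None in a template matches any value (wildcard).
        return len(template) == len(tup) and all(
            t is None or t == v for t, v in zip(template, tup))

    def rd(self, template):
        return next(t for t in self._tuples if self._match(template, t))

    def in_(self, template):
        tup = self.rd(template)
        self._tuples.remove(tup)
        return tup

# One tool publishes filtered data; another picks it up by pattern,
# without either tool knowing about the other (hypothetical tuple layout).
space = TupleSpace()
space.out(("blast_output", "sample1", "/tmp/hits.tsv"))
record = space.in_(("blast_output", "sample1", None))
```

The AOP side of the approach would then intercept a tool's I/O calls and redirect them through `out`/`in_`, leaving the original tool unmodified.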
  • MDAsim: A multiple displacement amplification simulator

    Page(s): 1 - 4
    PDF (978 KB) | HTML

    Multiple displacement amplification (MDA) is a fast, non-PCR-based, isothermal DNA amplification method that can amplify small amounts of DNA to a quantity sufficient for genomic analysis. This technique is suitable for metagenomics, single cell genome sequencing, and related analyses. Unlike in multicell amplification, the coverage distribution of the output amplicons for single cell MDA is not uniform. This distribution is unknown, and the parameters that affect this amplification bias have not been studied thoroughly. To better understand the MDA reaction, we have developed a simplified mathematical model and the corresponding simulation algorithm, called MDAsim, to obtain a generative model for the output amplicons. In this paper we show that the output coverage of MDAsim matches the experimental coverage very well. The combination of MDAsim and a sequencing simulator can therefore be used to test and evaluate single cell assemblers without the burden of experimental sequencing. Our results suggest that modeling the MDA mechanism and simulating such increasingly complex models may provide valuable insight into the MDA process, which in turn can be used to design more efficient MDA reactions.
  • A Bayesian-based prediction model for personalized medical health care

    Page(s): 1 - 4
    PDF (523 KB) | HTML

    In this paper, we present a Bayesian-based Personalized Laboratory Tests prediction (BPLT) model to solve a real-world medical problem: how to recommend laboratory tests to a group of patients. Given a patient who has undergone several laboratory tests, the BPLT model recommends the further laboratory tests that are most relevant to this patient. We regard this laboratory test prediction problem as a special classification problem, in which a new laboratory test belongs to either a "taken" or a "not-taken" class. Our goal is to find the laboratory tests with a high probability of "taken" and a low probability of "not-taken". Using a Bayesian method, the BPLT model builds a weighting function to capture the correlations among laboratory tests and generate a ranking of laboratory tests. To evaluate the proposed BPLT model, we further propose a novel evaluation metric to measure its accuracy. Experimental results show that the BPLT model achieves good performance on real data sets and provides a good solution to our real-world application.
  • ENISI Visual, an agent-based simulator for modeling gut immunity

    Page(s): 1 - 5
    PDF (1136 KB) | HTML

    This paper presents ENISI Visual, an agent-based simulator for modeling gut immunity to enteric pathogens. The gastrointestinal system is important for the intake of food and other nutrients, and gut immunity is an important part of the human immune system. ENISI Visual provides quality visualizations; users can control initial cell concentrations and the simulation speed, take snapshots, and record videos. The cells are represented by different icons, and the icons change color as their states change. Users can observe real-time immune responses, including cell recruitment, cytokine and chemokine secretion and dissipation, random or chemotactic movement, cell-cell interactions, and state changes. The case study clearly shows that users can use ENISI Visual to develop models and run novel and insightful in silico experiments.
  • A neural network approach to the identification of b-/y-ions in MS/MS spectra

    Page(s): 1 - 5
    PDF (720 KB) | HTML

    The effectiveness of de novo peptide sequencing algorithms depends on the quality of MS/MS spectra. Since most of the peaks in a spectrum are uninterpretable `noise' peaks, it is necessary to carefully pre-filter the spectra to identify the `signal' peaks that likely correspond to b-/y-ions. Selecting the optimal set of peaks for candidate peptide generation is essential for obtaining accurate results: a careful balance must be maintained between the precision and recall of the peaks selected for further processing and candidate peptide generation. If too many peaks are selected, the search space becomes too large and the problem becomes intractable. If too few peaks are selected, cleavage sites are missed, the resulting candidate peptides have large gaps, and sequencing results are poor. For this reason, pre-filtering of MS/MS spectra and accurate selection of peaks for peptide candidate generation are essential to any de novo peptide sequencing algorithm. We present a novel neural network approach to the selection of b-/y-ions that uses known fragmentation characteristics and leverages neural network probability estimates of flanking and complementary ions. We show a significant improvement in the precision and recall of peaks corresponding to b-/y-ions, and a reduction in search space, over the approaches used by other de novo peptide sequencing algorithms.
  • Stress induces biphasic-rewiring and modularization patterns in the metabolomic networks of Escherichia coli

    Page(s): 1 - 5
    PDF (2165 KB) | HTML

    Metabolomic networks describe correlated changes in metabolite levels that crucially link the transcriptome and proteome with the complex matter and energy dynamics of small molecule metabolism. These networks are atypical: they do not directly portray regulatory and pathway information, yet they embed both. Here we study how stress rewires the metabolomic networks of Escherichia coli. Networks with vertices describing metabolites and edges representing correlated changes in metabolite concentrations were used to study time-resolved bacterial responses to four non-lethal stress perturbations: cold, heat, lactose diauxie, and oxidative stress. We find notable patterns that are common to all stress responses examined: (1) networks are random rather than scale-free, i.e. metabolite connectivity is dictated by large network components rather than `hubs'; (2) networks rewire quickly even in the absence of stress and are therefore highly dynamic; (3) rewiring occurs minutes after exposure to the stressor and results in significant decreases in network connectivity; and (4) at longer time frames connectivity is regained. The common biphasic-rewiring pattern revealed in our time-resolved exploration of metabolite connectivity also uncovers unique structural and functional features. We find that stress-induced decreases in connectivity were always counterbalanced by increases in network modularity. Remarkably, rewiring begins with the energetics and carbon metabolism needed for growth and then focuses on the lipids, hubs and metabolic centrality needed for membrane restructuring. While these patterns may simply represent the need of the cell to stop growing and prepare for uncertainty, the biphasic modularization of the network is an unanticipated result that links the effects of environmental perturbations to the generation of modules in biology.
  • Inferring Fuzzy Cognitive Map models for Gene Regulatory Networks from gene expression data

    Page(s): 1 - 4
    PDF (1703 KB) | HTML

    Gene Regulatory Networks (GRNs) represent the causal relations among genes and provide insight into cellular functions and the mechanisms of disease. GRNs can be inferred from gene expression data by a number of algorithms, e.g. Boolean networks, Bayesian networks, and differential equations. Because reliable inference of GRNs is still an open problem, new algorithms need to be developed. In this paper, Fuzzy Cognitive Maps (FCMs) are used to represent GRNs. Most FCM learning algorithms can only learn FCMs with fewer than 40 nodes; we propose a new algorithm, based on Ant Colony Optimization (ACO), that can learn FCMs with more than 100 nodes. A decomposed approach reduces the dimension of the problem, making the FCM learning algorithm more scalable (the dimension of the problem solved in one ACO run equals the number of nodes, or genes). The proposed approach is tested on data from the DREAM project. The experimental results suggest that the proposed approach outperforms several other algorithms.
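Once a weight matrix has been learned, simulating an FCM is a simple iteration: each gene's next activation is a squashed weighted sum of the others' activations. A minimal sketch with a sigmoid transfer function (a common FCM choice, though variants differ) and an illustrative two-gene weight matrix:

```python
import numpy as np

def simulate_fcm(W, state, steps=50):
    """Iterate the FCM update x(t+1) = sigmoid(W @ x(t));
    concept activations stay in (0, 1)."""
    for _ in range(steps):
        state = 1.0 / (1.0 + np.exp(-(W @ state)))
    return state

# Illustrative weights: gene 1 activates gene 0's repressor, i.e.
# W[i, j] is the influence of gene j on gene i (made-up values).
W = np.array([[0.0, -0.8],
              [0.9,  0.0]])
state = simulate_fcm(W, np.array([0.5, 0.5]))
```

A learning algorithm such as the ACO approach above searches for the `W` whose simulated trajectories best reproduce the observed expression time series.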