Computational Biology and Bioinformatics, IEEE/ACM Transactions on

Issue 4 • Date Oct.-Dec. 2004

  • [Front cover]

    Page(s): c1
    Freely Available from IEEE
  • [Inside front cover]

    Page(s): c2
    Freely Available from IEEE
  • Maximum-scoring segment sets

    Page(s): 139 - 150

    We examine the problem of finding maximum-scoring sets of disjoint segments in a sequence of scores. The problem arises in DNA and protein segmentation and in postprocessing of sequence alignments. Our key result states a simple recursive relationship between maximum-scoring segment sets. The statement leads to fast algorithms for finding such segment sets. We apply our methods to the identification of noncoding RNA genes in thermophiles.
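    To make the problem statement concrete, the sketch below finds a maximum-scoring set of at most k disjoint segments with a textbook O(nk) dynamic program. It is not the recursive method the abstract refers to; the function name `max_scoring_segments` and the half-open (start, end) segment convention are assumptions made for this illustration.

```python
from typing import List, Tuple


def max_scoring_segments(scores: List[float], k: int) -> Tuple[float, List[Tuple[int, int]]]:
    """Best total of at most k disjoint segments, plus the segments themselves.

    Illustrative dynamic program (an assumption of this sketch, not the
    paper's algorithm). Segments are half-open index ranges (start, end).
    """
    n = len(scores)
    neg_inf = float("-inf")
    # best[j][i]: best total using at most j segments within scores[:i]
    # end[j][i]:  same, but the last segment must end exactly at index i-1
    best = [[0.0] * (n + 1) for _ in range(k + 1)]
    end = [[neg_inf] * (n + 1) for _ in range(k + 1)]
    for j in range(1, k + 1):
        for i in range(1, n + 1):
            # Either extend the segment ending at i-2, or open a new one at i-1.
            end[j][i] = scores[i - 1] + max(end[j][i - 1], best[j - 1][i - 1])
            best[j][i] = max(best[j][i - 1], end[j][i])

    # Walk the tables backwards to recover one optimal segment set.
    segments = []
    i, j, seg_end = n, k, None
    while i > 0 and j > 0:
        if seg_end is None:
            if best[j][i] == best[j][i - 1]:
                i -= 1  # position i-1 is not needed
                continue
            seg_end = i  # a segment ends at position i-1
        if i > 1 and end[j][i] == scores[i - 1] + end[j][i - 1]:
            i -= 1  # the segment extends further left
        else:
            segments.append((i - 1, seg_end))  # the segment starts at i-1
            seg_end = None
            i -= 1
            j -= 1
    segments.reverse()
    return best[k][n], segments
```

    For example, `max_scoring_segments([3, -5, 4, 2, -1, 6], 2)` selects the segments covering indices 0 and 2 through 5, skipping only the -5.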
  • Phylogenetic super-networks from partial trees

    Page(s): 151 - 158

    In practice, one is often faced with incomplete phylogenetic data, such as a collection of partial trees or partial splits. This paper poses the problem of inferring a phylogenetic super-network from such data and provides an efficient algorithm for doing so, called the Z-closure method. Additionally, the questions of assigning lengths to the edges of the network and how to restrict the "dimensionality" of the network are addressed. Applications to a set of five published partial gene trees relating different fungal species and to six published partial gene trees relating different grasses illustrate the usefulness of the method and an experimental study confirms its potential. The method is implemented as a plug-in for the program SplitsTree4.
  • An O(N^2) algorithm for discovering optimal Boolean pattern pairs

    Page(s): 159 - 170

    We consider the problem of finding the optimal combination of string patterns, which characterizes a given set of strings that have a numeric attribute value assigned to each string. Pattern combinations are scored based on the correlation between their occurrences in the strings and the numeric attribute values. The aim is to find the combination of patterns which is best with respect to an appropriate scoring function. We present an O(N^2) time algorithm for finding the optimal pair of substring patterns combined with Boolean functions, where N is the total length of the sequences. The algorithm looks for all possible Boolean combinations of the patterns, e.g., patterns of the form p ∧ ¬q, which indicates that the pattern pair is considered to occur in a given string s, if p occurs in s, and q does not occur in s. An efficient implementation using suffix arrays is presented, and we further show that the algorithm can be adapted to find the best k-pattern Boolean combination in O(N^k) time. The algorithm is applied to mRNA sequence data sets of moderate size combined with their turnover rates for the purpose of finding regulatory elements that cooperate, complement, or compete with each other in enhancing and/or silencing mRNA decay.
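    The "p and not q" occurrence rule described above can be illustrated with a brute-force search over all substring pairs. This sketch is far slower than the suffix-array algorithm the abstract describes and does not reproduce it; the scoring function (absolute difference between the mean attribute values of matching and non-matching strings) and all function names are assumptions chosen for illustration.

```python
from statistics import mean
from typing import List, Tuple


def occurs_p_and_not_q(s: str, p: str, q: str) -> bool:
    """The pair occurs in s when p is a substring of s and q is not."""
    return p in s and q not in s


def all_substrings(strings: List[str]) -> List[str]:
    """Every distinct non-empty substring of the input strings."""
    return sorted({s[i:j] for s in strings
                   for i in range(len(s))
                   for j in range(i + 1, len(s) + 1)})


def score(strings: List[str], values: List[float], p: str, q: str) -> float:
    """Hypothetical score: gap between mean values of matching and other strings."""
    hit = [v for s, v in zip(strings, values) if occurs_p_and_not_q(s, p, q)]
    miss = [v for s, v in zip(strings, values) if not occurs_p_and_not_q(s, p, q)]
    if not hit or not miss:
        return 0.0  # a pair that splits nothing carries no information
    return abs(mean(hit) - mean(miss))


def best_pair(strings: List[str], values: List[float]) -> Tuple[float, str, str]:
    """Exhaustive search over all (p, q) substring pairs; for tiny inputs only."""
    patterns = all_substrings(strings)
    return max(((score(strings, values, p, q), p, q)
                for p in patterns for q in patterns),
               key=lambda t: t[0])
```

    With strings `["AUUUA", "AUUUAGG", "CCGG", "CCA"]` and values `[1.0, 9.0, 9.0, 9.0]`, the pair p = "AUUUA", q = "GG" isolates the first string, which no single substring can do because every substring of "AUUUA" also occurs in "AUUUAGG".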
  • A polynomial-time algorithm for the matching of crossing contact-map patterns

    Page(s): 171 - 180

    Contact maps are a model to capture the core information in the structure of biological molecules, e.g., proteins. A contact map consists of an ordered set S of elements (representing a protein's sequence of amino acids), and a set A of element pairs of S, called arcs (representing amino acids which are closely neighbored in the structure). Given two contact maps (S, A) and (S_p, A_p) with |A| ≥ |A_p|, the contact map pattern matching (CMPM) problem asks whether the "pattern" (S_p, A_p) "occurs" in (S, A), i.e., informally stated, whether there is a subset of |A_p| arcs in A whose arc structure coincides with A_p. CMPM captures the biological question of finding structural motifs in protein structures. In general, CMPM is NP-hard. In this paper, we show that CMPM is solvable in O(|A|^6 |A_p|) time when the pattern is {<, ≬}-structured, i.e., when each two arcs in the pattern are disjoint or crossing. Our algorithm extends to other closely related models. In particular, it answers an open question raised by Vialette that, rephrased in terms of contact maps, asked whether CMPM for {<, ≬}-structured patterns is NP-hard or solvable in polynomial time. Our result stands in sharp contrast to the NP-hardness of closely related problems. We provide experimental results which show that contact maps derived from real protein structures can be processed efficiently.
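    The restriction on patterns (every two arcs are disjoint or crossing, never nested) can be checked directly. The sketch below classifies the relation between two arcs and tests whether a pattern meets the restriction. It illustrates the definition only, not the matching algorithm; the function names and the assumption that no two arcs share an endpoint are choices made for this example.

```python
from itertools import combinations
from typing import Iterable, Tuple

Arc = Tuple[int, int]


def relation(a: Arc, b: Arc) -> str:
    """Classify two arcs over an ordered set as disjoint, crossing, or nested.

    Assumption of this sketch: the four endpoints are pairwise distinct.
    """
    # Normalize each arc to (left, right) and order the arcs by left endpoint.
    (i, j), (k, l) = sorted((tuple(sorted(a)), tuple(sorted(b))))
    if len({i, j, k, l}) < 4:
        raise ValueError("arcs sharing an endpoint are not handled here")
    if j < k:
        return "disjoint"   # i < j < k < l
    if l < j:
        return "nested"     # i < k < l < j
    return "crossing"       # i < k < j < l


def is_disjoint_or_crossing(arcs: Iterable[Arc]) -> bool:
    """True when no pair of arcs in the pattern is nested."""
    return all(relation(a, b) != "nested" for a, b in combinations(arcs, 2))
```

    For instance, the pattern `[(1, 4), (2, 6), (7, 9)]` satisfies the restriction, while `[(1, 6), (2, 4)]` does not because the second arc is nested inside the first.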
  • Using uncorrelated discriminant analysis for tissue classification with gene expression data

    Page(s): 181 - 190

    The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high (in the thousands) compared to the number of data samples (in the tens or low hundreds); that is, the data dimension is large compared to the number of data points (such data is said to be undersampled). To cope with performance and accuracy problems associated with high dimensionality, it is commonplace to apply a preprocessing step that transforms the data to a space of significantly lower dimension with limited loss of the information present in the original data. Linear discriminant analysis (LDA) is a well-known technique for dimension reduction and feature extraction, but it is not applicable for undersampled data due to singularity problems associated with the matrices in the underlying representation. This paper presents a dimension reduction and feature extraction scheme, called uncorrelated linear discriminant analysis (ULDA), for undersampled problems and illustrates its utility on gene expression data. ULDA employs the generalized singular value decomposition method to handle undersampled data and the features that it produces in the transformed space are uncorrelated, which makes it attractive for gene expression data. The properties of ULDA are established rigorously and extensive experimental results on gene expression data are presented to illustrate its effectiveness in classifying tissue samples. These results provide a comparative study of various state-of-the-art classification methods on well-known gene expression data sets.
  • Annual Index

    Page(s): 191 - 192
    Freely Available from IEEE
  • IEEE/ACM TCBB: Information for authors

    Page(s): c3
    Freely Available from IEEE
  • [Back cover]

    Page(s): c4
    Freely Available from IEEE

Aims & Scope

This bimonthly publishes archival research results related to the algorithmic, mathematical, statistical, and computational methods that are central in bioinformatics and computational biology.

Meet Our Editors

Editor-in-Chief
Ying Xu
University of Georgia
xyn@bmb.uga.edu

Associate Editor-in-Chief
Dong Xu
University of Missouri
xudong@missouri.edu