By Topic

Towards a DNA sequencing theory (learning a string)

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

1 Author(s)
Ming Li ; Waterloo Univ., Ont., Canada

Mathematical frameworks suitable for massive automated DNA sequencing and for analyzing DNA sequencing algorithms are studied under plausible assumptions. The DNA sequencing problem is modeled as learning a superstring from its randomly drawn substrings. Under certain restrictions, this may be viewed as learning a superstring in L.G. Valiant's (1984) learning model, and in this case the author gives an efficient algorithm for learning a superstring and a quantitative bound on how many samples suffice. A major obstacle to the approach turns out to be a quite well-known open question on how to approximate the shortest common superstring of a set of strings. The author presents the first provably good algorithm that approximates the shortest superstring of length n by a superstring of length O(n log n)

Published in:

Foundations of Computer Science, 1990. Proceedings., 31st Annual Symposium on

Date of Conference:

22-24 Oct 1990