Mathematics of Genome Analysis

FEB 01, 2003

DOI: 10.1063/1.1564356

Ralf Bundschuh

Mathematics of Genome Analysis, Jerome K. Percus, Cambridge U. Press, New York, 2002. $59.95, $19.95 paper (139 pp.). ISBN 0-521-58517-1, ISBN 0-521-58526-0 paper

The sequencing of the human genome alerted researchers to the importance of sequence data for modern molecular biology. Acquiring and interpreting that data requires powerful quantitative methods, and the rapidly growing field of computational biology develops such methods. Computational biology draws heavily on several disciplines (such as computer science, mathematics, statistics, and statistical physics), and in turn stimulates new research in those areas by posing new kinds of problems.

Mathematicians have led the way in computational biology. Many of their contributions are summarized in the textbook, Introduction to Computational Biology: Maps, Sequences, and Genomes (Chapman and Hall, 1995) by Michael S. Waterman, a leader in the field. Waterman’s book introduces many of the fundamental techniques of computational biology and focuses on real-world applications while maintaining mathematical rigor.

In Mathematics of Genome Analysis (in the Cambridge Studies in Mathematical Biology series), Jerome K. Percus takes a very different approach. As the book’s title suggests, Percus’s focus is mathematics rather than biological or computational application. His theme is the DNA molecule and its sequence, and indeed the book discusses many aspects of DNA, including sequencing and statistical properties of genomes, comparison of DNA sequences, and such physical properties of the DNA molecule as its melting behavior. Percus uses such practical questions about DNA and its sequence to showcase a variety of mathematical problems triggered by the biological questions, and to offer techniques for solving them. Many of those techniques (including stochastic processes described by the Fokker-Planck equation, correlation functions, power spectra, transfer matrices, and the WKB approximation) are rooted in physics. Others are more mathematical in character, reflecting the breadth of Percus’s own research.

The book, based on a mathematics course that Percus taught at New York University, features a variety of assignments that exemplify the techniques and can be used for problem sets. Its moderate length suits a one-semester course, and its witty language makes it easy for the mathematically inclined reader to join the author in his obvious excitement. However, the dense technical detail and mathematical symbols demand very careful reading and at times obscure the bigger picture. One should definitely work through the text; it is not bedtime reading.

Because of the book’s focus on mathematics, I would not recommend it as a source to learn biology. Although the book gives biological background, it does so only to the extent needed to understand the mathematical problems. That limitation often leaves the reader with wrong impressions. One example is the chapter on determining DNA sequences. That chapter elaborates the statistics needed to sequence a randomly generated genome, but fails to mention that the main challenge in determining real-life genomes is the repetition of subsequences that are far longer than would be expected at random. Another example is the chapter on sequence comparison. By artificially restricting himself to DNA sequences, the author implies that they are the topic’s main application. However, most real applications compare sequences of protein rather than of DNA. Reducing the important protein case to a side remark is especially puzzling, since it can be treated in the same way as the comparison of DNA sequences.

The biggest downside of the book is its references. The author admits the reference list is “very incomplete,” and I can confirm that at least for my own area of expertise. Such a subjective choice of references may be adequate in a book written for experts, but for a textbook, I would prefer a bit more diligence. Another reference-related problem is that it is sometimes difficult to tell which parts of the book present other people’s results and which are the author’s own ideas.

In summary, despite its shortcomings in biology, Mathematics of Genome Analysis is a suitable textbook for a mathematics course aimed at raising awareness of the challenges that are posed by computational biology. It is also good first reading for mathematics students and professionals who want to get an idea of the exciting mathematical problems in the analysis of biological sequences.

More about the Authors

Ralf Bundschuh. Ohio State University, Columbus, US.

This Content Appeared In

Volume 56, Number 2
