After Adleman successfully solved a directed Hamiltonian path problem using the tools of biomolecular engineering, others followed, applying similar algorithms to other hard computation problems, as well as devising more efficient computational schemes. A number of theoretical models for creating a universal DNA computer have been developed, and mathematical proofs have shown that some models of DNA computing are at least equivalent to a classical Turing machine. Other work has concentrated on improving error rates and identifying specific types of bio-operations and molecular engineering techniques that could be harnessed for computational and storage-related purposes. A limited amount of work has been directed at real-life applications and the practical feasibility of DNA computers.
While the practical benefits of DNA based computational schemes are still questionable and the vast majority of work to date has been theoretical, there have been many allusions to potential uses of this emerging computational paradigm. The usefulness of developing techniques of DNA computing and ultimately developing working DNA computers can be described as falling into one of the following three general categories:
Proposed "classical" models of DNA computers derive their potential advantage over conventional computers from their ability to:
Within the paradigm of "classic" DNA computation there exists many different models, each with different advantages and degrees of applicability to classes of problems. An example of one model differing from Adleman's original search algorithm is the surface-based DNA algorithm used to solve the minimal set cover problem in [8]. This technique proposes to decrease the potential for error from lost DNA strands by affixing them to a surface of glass or silicon. The efficiency of their algorithm is dependent on the method of encoding quantity by strand length so the authors have speculated that such a model might be inappropriate for application to problems where the validity of a solution is not proportional to its size. A surface-based approach to DNA computation was also considered in [13], which suggests the products of bit operations could be identified using optical readers scanning for the relative surface position of hybrid double strands consisting of a previously unknown bitstring and a value being held by that bit. The ability to read output in this manner may drastically increase the feasibility of implementing a DNA computer outside the conventional laboratory setting. It might also encourage the development of DNA/Electronic computer hybrids, with different problem types being handled by different components, and the electronic portion conveying the output to the user. Another model [10] makes use of 3-Dimensional DNA structures to reduce errors and necessary steps. This method of using the structral properties of DNA may also lead to more efficient storage mechanisms.
Classical DNA computing techniques have already been theoretically applied to a real life problem: breaking the Data Encryption Standard (DES). Although this problem has already been solved using conventional techniques in a much shorter time than proposed by the DNA methods, the DNA models are much more flexible, potent, and cost effective.
DES is a method of encrypting 64-bit messages with a 56-bit key, used extensively in the United States. Electronic keys are normally a string of data used to code and/or decode sensitive messages. By finding the appropriate key to a set of encrypted messages, one can either read encoded messages or pose as the sender of such messages. Using a special purpose electronic computer and differential cryptanalysis, it has been shown that the key to DES can be found in several days. However, to do so would require 2^43 examples of corresponding encrypted and unencrypted messages (known as plain-text/cipher-text pairs) and would slow down by a factor of 256 if the strength of the encrypting key was increased to 64-bits. In [6] it is proposed that DES could be broken using a DNA based computer and a search algorithm similar to Adleman's original technique. This procedure would be expected to take 4 months, but would only need a single plain-text/cipher-text pair or an example of cipher text with several plain text candidates to be successful. The feasibility of applying DNA computation to this problem was also addressed in [3] using a more refined algorithm (the sticker model approach) which enabled the researchers to suggest that they could solve the problem using less than a gram of DNA, an amount that could presumably be handled by a desk top sized machine. Both models would likely be more cost and energy effective than the expensive electronic processors required by conventional means, but are entirely theoretical. The first model ignores error rates incurred though laboratory techniques and the inherent properties of the DNA being used. The second model requires an error rate approaching .0001, with higher rates substantially affecting the volume of DNA required. Despite these assumptions, these models show that existing methods of DNA computation could be used to solve a real life problem in a way that is both practical and superior to methods used by conventional computers. In [3] it is also demonstrated that such benefits can be obtained despite error rates that would be unacceptable in an electronic computer and that may be unavoidable in a molecular one.
The ability to obtain tractable solutions to NP-complete and other hard computational problems has many implications to real life problems, particularly in business planning and management science. Many of the cost optimization problems faced by managers are in NP-complete are currently solved using heuristic methods and other approximations. These problems include scheduling, routing, and optimal use of raw materials and correspond to problems already solved, either theoretically or experimentally, using DNA computation.
Current and near-term advances in laboratory technology preclude the use of DNA computation as a method of solving problems in real time. This restricts the possible application of classical DNA computing to areas where the calculation of optimal solutions could be performed over a number of days, weeks, or months. This would preclude many otherwise obvious uses in expert systems, but would be able to be applied to long term production planning where initial cost commitments are very high such as chip design and manufacturing. DNA computing could also be applied extensively to optimizing airline and bus routes for planning purposes.
To address these practical instances of hard computational problems, DNA computers could be implemented in a variety of different capacities. In terms of hardware, the most effective solution might come from concentrating efforts on problem specific processors rather than universal DNA computers. Such problem specific processors could focus on problems whose solution will have the greatest cost benefit while also allowing for the time necessary to conduct the bio-operations.
Different models might also be used that more closely mirror the current iterative approximation techniques used to currently solve these problems. An efficient model of DNA computing has been developed to make use of dynamic programming, solving successively harder instances of a problem until the required problem has been solved [5]. In addition to using massive parallelism this method makes use of the large size of memory available to DNA computers. In this way, a solution to an instance of the knapsack problem was proposed, an NP-complete problem with implications to optimal use of materials and packaging. In [9] a general technique of constructing approximations to NP-optimization problems using DNA computing was proposed. Such methods of approximation may be relevant to instances where an exact solution is not absolutely necessary but the problem is sufficiently large enough to preclude conventional electronic techniques or other, more exact models of DNA computing.
If the use of DNA is found to be a clearly superior method of solving large instances of hard computational methods but bio-molecular techniques continue to dictate solutions times much longer than electronic based approximation methods, DNA computers could be used for off-line benchmarking of existing heuristic algorithms, which would continue to used on a more frequent basis. In such a scheme, a DNA computer could be used to initially optimize the production scheduling for a proposed product. This step may take several weeks or months to complete before production is ready to begin. The results of this initial optimization could then be compared to various approximation methods. The method found to most closely approximate the optimal solution could then be used on a more active basis to account for changes in supply and demand parameters so that the production will continue to run at near optimal levels.
Baum's models for a DNA associative memory are quite simple, and build from the techniques used in other areas of DNA computation. Storing a word could be done by assigning a specific DNA subsequence to each component value pair and building a fixed length word from these subsequences. To then retrieve the word closest to the input, one would introduce marked complementary subsequences into the storage medium and chose the molecule that has the most matches to this input. This technique could be further refined to more closely approximate the brain by appending words to only store attributes that an object has, rather than wasting space using '0's to represent attributes that an object does not have.
Baum has further speculated that a memory could be constructed where only portions of the data are content-addressable and associative, with other information on an object compactly stored in addresses relative to the associative portion of the entry. To save on operating costs and reduce error frequency, this portion of the memory could be kept in double-stranded form.
Considering the brain's limit of about 10^15 synapses and Feynman's low-end estimate that the brain can distinguish about 10^6 concepts, such a DNA based associative memory could have certain advantages over the brain. Without accounting for redundant molecules, Baum estimates that a large bath tub of DNA, about 50g in 1000L, could hold over 10^20 words. In attempting to come up with practical uses for this memory scheme one will have to weigh the massive size and ability to retrieve in an associative manner against the slow retrieval times necessitated by current biomolecular engineering techniques. And, although Baum's simplistic approach has accounted for some error rates, his initial paper remains quite speculative.
The potential applications of re-coding natural DNA into a computable form are many and include:
There are other models of DNA computation that suggest that DNA might be used to detect and evaluate other chemical and bio-chemical substances. In [16] it is suggested that nucleic acid structures, in addition to nucleic acid sequences, could play an important role in molecular computation. Various shapes of folded nucleic acids can be used to detect the presence of drugs, proteins, or other molecules. It is also suggested that selected or engineered ribozymes could be used as operators to effect re-write rules and to detect the presence of such non-nucleic acid molecules. Using these structures and operators to sense levels of substances, it would then be possible to compute an output readable using proposed biosensors that detect fluorescence or polarization. These biosensors could potentially allow communication between molecular sensory computers and conventional electronic computers.
A particular area within the natural and applied sciences that may benefit from advances in DNA computation is combinatorial chemistry. Combinatorial chemistry involves the construction of enzymes, sequences of RNA, and other molecules, particularly for use in biomolecular engineering or medicine. Adleman [2] describes this process as being similar to "classic" models of DNA computation, as combinatorial chemistry involves generating large sets of random RNA sequences and searches for molecules with the desired properties. Advances in either area could easily benefit the other field or even pave a way to combining the two fields, producing both products and related computational results in parallel.
Several papers also extend the use of biomolecular computing into applications in the emerging science of nanotechnology, specifically nano-fabrication, making use of both the small scale computational abilities of DNA and the manufacturing abilities of RNA [15, 17]. Since both fields are still very embryonic, the practical or even experimental implementation of this use is still highly speculative but promising.
Applying the techniques of DNA2DNA computing could result in improved laboratory interfaces capable of performing computations prior to input into conventional computers and may lead to improved methods of sequencing DNA by motivating the use of magnetic beads, optical scanners, and other emerging techniques that may allow DNA to be read directly into an electronic interface.
With the advent of DNA computational paradigms, especially those directed towards improved methods of solving NP-hard problems, will come an increased understanding about the limitations of computation. DNA based computers may postpone certain expected thermodynamic obstacles to computation as well as exploring the limitations of Turing machines and questioning theories of computation based on electronic and mechanical models.
Developments in parallel processing algorithms to take advantage of the massive parallelism of "classic" DNA based computation may serve as a foundation for future developments in other parallel processing architectures using more conventional electronic computers.
DNA based computing may also be able to contribute to other unconventional areas of computation. Through the use of its massive parallelism and potentially non-deterministic mechanisms, DNA based models of computation might be useful for simulating or modeling other emerging computational paradigms, such as quantum computing, which may not be feasible until much later. As well, the trials and tribulations experienced by DNA computing researchers may very well be related to the future obstacles expected to face other revolutionary ideas within computer science and information systems.
A "killer app" is yet to be found for DNA computation, but might exist outside the bulk of current research, in the domain of DNA2DNA applications and other more natural models and applications of manipulated DNA. This direction is particularly interesting because it is an area in which DNA based solutions are not only an improvement over existing techniques, but may prove to be the only feasible way of directly solving such problems that involve the direct interaction with biological matter.
On the "classical" front, problem specific computers may prove to be the first practical use of DNA computation for several reasons. First, a problem specific computer will be easier to design and implement, with less need for functional complexity and flexibility. Secondly, DNA computing may prove to be entirely inefficient for a wide range of problems, and directing efforts on universal models may be diverting energy away from its true calling. Thirdly, the types of hard computational problems that DNA based computers may be able to effectively solve are of sufficient economic importance that a dedicated processor would be financially reasonable. As well, these problems will be likely to require extensive time the would preclude the need for a more versatile and interactive system that may be able to be implemented with a universal computing machine.
Even if the current difficulties found in translating theoretical DNA computing models into real life are never sufficiently overcome, there is still potential for other areas of development. Future applications might make use of the error rates and instability of DNA based computation methods as a means of simulating and predicting the emergent behavior of complex systems. This could pertain to weather forecasting, economics, and lead to more a scientific analysis of social science and the humanities. Such a system might rely on inducing increased error rates and mutation through exposure to radiation and deliberately inefficient encoding schemes. Similarly, methods of DNA computing might serve as the most obvious medium for use of evolutionary programming for applications in design or expert systems. DNA computing might also serve as a medium to implement a true fuzzy logic system.
With so many possible advantages over conventional techniques, DNA computing has great potential for practical use. Future work in this field should begin to incorporate cost-benefit analysis so that comparisons can be more appropriately made with existing techniques and so that increased funding can be obtained for this research that has the potential to benefit many circles of science and industry.