Biosensing of pathogens is a research area of high consequence. An accurate and rapid biosensing paradigm has the potential to impact several fields, including healthcare, defense and environmental monitoring. In this module we address the concept of biosensing based on compressive sensing (CS) via the Compressive Sensing Microarray (CSM), a DNA microarray adapted to take CS-style measurements.
DNA microarrays are a frequently applied solution for microbe sensing; they have a significant edge over competitors due to their ability to sense many organisms in parallel [link], [link]. A DNA microarray consists of genetic sensors or spots, each containing DNA sequences termed probes. From the perspective of a microarray, each DNA sequence can be viewed as a sequence of four DNA bases {A, T, G, C} that tend to bind with one another in complementary pairs: A with T and G with C. Therefore, a DNA subsequence in a target organism's genetic sample will tend to bind or “hybridize” with its complementary subsequence on a microarray to form a stable structure. The target DNA sample to be identified is fluorescently tagged before it is flushed over the microarray. The extraneous DNA is washed away so that only the bound DNA is left on the array. The array is then scanned using laser light of a wavelength designed to trigger fluorescence in the spots where binding has occurred. A specific pattern of array spots will fluoresce, which is then used to infer the genetic makeup of the test sample.
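The complementary pairing rule can be illustrated with a toy sketch. The function names `complement` and `match_fraction` are hypothetical, and the model is deliberately simplified: it compares a probe and a same-length target segment position by position, ignoring strand orientation and the thermodynamics that govern real hybridization.

```python
# Complementary base pairs: A binds with T, G binds with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the base-by-base complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in seq)


def match_fraction(probe, segment):
    """Fraction of positions where the segment base pairs with the probe base.

    Toy stand-in for hybridization strength (an assumption for illustration,
    not a physical model). Both sequences must have the same length.
    """
    if len(probe) != len(segment):
        raise ValueError("probe and segment must have equal length")
    paired = sum(COMPLEMENT[p] == s for p, s in zip(probe, segment))
    return paired / len(probe)


probe = "ATGC"
intended_target = complement(probe)   # pairs at every position
similar_target = "TACC"               # differs from the intended target in one base

full = match_fraction(probe, intended_target)
partial = match_fraction(probe, similar_target)
```

Here `full` is 1.0 while `partial` is still 0.75, which hints at why a target with a similar base sequence can partially bind to a probe designed for a different organism.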
There are three issues with the traditional microarray design. Each spot consists of probes that can uniquely identify only one target of interest (each spot contains multiple copies of a probe for robustness). The first concern with this design is that very often the targets in a test sample have similar base sequences, causing them to hybridize with the wrong probe (see [link]). These cross-hybridization events lead to errors in the array readout. Current microarray design methods do not address cross-matches between similar DNA sequences.
The second concern with probes based on unique identifiers is the restriction they place on the number of organisms that can be identified. In typical biosensing applications multiple organisms must be identified; therefore a large number of DNA targets requires a microarray with a large number of spots. In fact, there are over 1000 known harmful microbes, many with more than 100 strains. The implementation cost of a microarray and the speed at which its data can be processed are directly related to its number of spots, which represents a significant problem for commercial deployment of microarray-based biosensors. As a consequence, readout systems for traditional DNA arrays cannot be miniaturized or implemented using electronic components, and they require complicated fluorescent tagging.
The third concern is the inefficient utilization of the large number of array spots in traditional microarrays. Although the number of potential agents in a sample is very large, not all agents are expected to be present in a significant concentration at a given time and location, or in an air/water/soil sample to be tested. Therefore, in a traditionally designed microarray only a small fraction of spots will be active at a given time, corresponding to the few targets present.
To combat these problems, a Compressive Sensing DNA Microarray (CSM) uses “combinatorial testing sensors” in order to reduce the number of sensor spots [link], [link], [link]. Each spot in the CSM identifies a group of target organisms, and several spots together generate a unique pattern identifier for a single target. (See also "Group testing and data stream algorithms".) Designing the probes that perform this combinatorial sensing is the essence of the microarray design process, and what we aim to describe in this module.
To obtain a CS-type measurement scheme, we can choose each probe in a CSM to be a group identifier such that the readout of each probe is a probabilistic combination of all the targets in its group. The probabilities are representative of each probe's hybridization affinity (or stickiness) to those targets in its group; the targets that are not in its group have low affinity to the probe. The readout signal at each spot of the microarray is a linear combination of hybridization affinities between its probe sequence and each of the target agents.
[link] illustrates the sensing process. To formalize, we assume there are $M$ spots on the CSM and $N$ targets; we have far fewer spots than target agents. For $1 \le i \le M$ and $1 \le j \le N$, the probe at spot $i$ hybridizes with target $j$ with affinity $\phi_{i,j}$. The target $j$ occurs in the tested DNA sample with concentration $x_j$, so that the total hybridization of spot $i$ is $y_i = \sum_{j=1}^{N} \phi_{i,j} x_j = \phi_i x$, where $\phi_i$ and $x$ are a row and column vector, respectively. The resulting measured microarray signal intensity vector $y = \{y_i\}$, $i = 1, \ldots, M$, fits the CS measurement model $y = \Phi x$.
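As a minimal sketch of this measurement model, the snippet below computes the readout at each spot as the affinity-weighted sum of target concentrations. The affinity values, the array size (3 spots, 6 targets), and the function name `csm_readout` are made up for illustration; real affinities would come from the probe design process.

```python
def csm_readout(affinities, concentrations):
    """Return one readout per spot: the sum over targets of affinity * concentration.

    affinities: list of rows, one row per spot, one entry per target.
    concentrations: list with one concentration per target.
    """
    return [
        sum(a * c for a, c in zip(row, concentrations))
        for row in affinities
    ]


# Hypothetical affinity matrix: 3 spots (rows) sensing 6 targets (columns).
# Each spot binds strongly to a group of targets and weakly to the rest.
affinities = [
    [0.9, 0.0, 0.1, 0.8, 0.0, 0.1],
    [0.0, 0.9, 0.8, 0.1, 0.1, 0.0],
    [0.1, 0.1, 0.0, 0.0, 0.9, 0.8],
]

# Sparse sample: only the third target (index 2) is present.
concentrations = [0.0, 0.0, 2.0, 0.0, 0.0, 0.0]

readout = csm_readout(affinities, concentrations)
```

With a single target present, the readout is that target's column of affinities scaled by its concentration, so the pattern across spots identifies the target while the scale carries its concentration.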
While group testing has previously been proposed for microarrays [link], the sparsity in the target signal is key in applying CS. The chief advantage of a CS-based approach over regular group testing is in its information scalability. We are able not just to detect the target signal but to estimate it, using a reduced number of measurements similar to that of group testing [link]. This is important since there are always minute quantities of certain pathogens in the environment, but it is only their large concentrations that may be harmful to us. Furthermore, we are able to use CS recovery methods such as Belief Propagation that decode while accounting for experimental noise and measurement nonlinearities due to excessive target molecules [link].
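To illustrate the difference between detecting and estimating, the sketch below recovers both the identity and the concentration of a target from a noisy readout. It is not Belief Propagation: it is a much simpler exhaustive search that assumes exactly one target is present, fitting a nonnegative concentration to each candidate target by least squares and keeping the best fit. The affinity matrix, the noisy readout values, and the function name `recover_single_target` are hypothetical.

```python
def recover_single_target(affinities, readout):
    """Return (target_index, concentration) that best explains the readout,
    assuming a single target is present (a 1-sparse sample)."""
    best_index, best_conc, best_residual = None, 0.0, float("inf")
    num_targets = len(affinities[0])
    for j in range(num_targets):
        column = [row[j] for row in affinities]
        energy = sum(a * a for a in column)
        if energy == 0.0:
            continue  # this target is invisible to every spot
        # Least-squares concentration for this target, clipped at zero
        # because a concentration cannot be negative.
        conc = max(0.0, sum(a * y for a, y in zip(column, readout)) / energy)
        residual = sum((y - conc * a) ** 2 for a, y in zip(column, readout))
        if residual < best_residual:
            best_index, best_conc, best_residual = j, conc, residual
    return best_index, best_conc


# Hypothetical affinity matrix: 3 spots (rows) sensing 6 targets (columns).
affinities = [
    [0.9, 0.0, 0.1, 0.8, 0.0, 0.1],
    [0.0, 0.9, 0.8, 0.1, 0.1, 0.0],
    [0.1, 0.1, 0.0, 0.0, 0.9, 0.8],
]

# Noisy readout produced by the target at index 2 with concentration near 2.0.
noisy_readout = [0.21, 1.58, 0.02]

target_index, estimated_conc = recover_single_target(affinities, noisy_readout)
```

The search returns a concentration estimate alongside the target index, which is the extra information that distinguishes a CS-style estimate from a yes/no group test. Samples containing several targets would need a genuine sparse recovery algorithm.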