<< Chapter < Page | Chapter >> Page > |
In order to achieve the goals set out in the project introduction, we intend to exploit compressive sensing. The reasoning behind our choice of this technique and the technique itself are described in detail in the following paragraphs.
A conventional digital camera works by capturing light from an object and focusing it through a lens onto an array of sensors. Cameras used by the general public operate in the visible range, 400 to 750 nm, with CMOS sensors. They can be modified to image in a slightly wider range, from 280 to 1200 nm, if their infrared (IR) filters are removed .
Current IR cameras can operate in much higher ranges, from the near infrared (NIR) at 0.75 – 1.4 µm, to the far infrared (FIR) at 15 – 1,000 µm . The average person, however, does not have the purchasing power to buy an IR camera, as one generally costs between $3,000 and $50,000 . Even for companies or governmental institutions, this is a substantial sum of money.
The bulk of the cost of an IR camera is in its sensor array and the cooling system that regulates the array's temperature. The array alone accounts for at least one-third of any IR camera's price. The cooling system that helps reduce background noise and the effects of pixel bleeding is similarly expensive. Unfortunately, there is no Moore's law for these sort of components. Unlike transistors, IR sensors and their cooling systems are not likely to decrease in size or cost anytime in the near future. Thus, while the camera's image processing chips will probably become cheaper in the next few years, no such reduction in cost can be predicted for the sensor components. In fact, as the general trend in the camera industry is toward more pixels, it is likely that soon the cost of an IR camera will be almost solely determined by the price of its sensor array and this array's cooling system.
Therefore, to create an economical IR camera, the number of sensors and cost of their cooling system would have to be dramatically reduced, without severely damaging the camera’s resolution. The technology to do this already exists; it is called compressive sensing. In principle, a compressive sensing device needs only a single sensor—a single pixel. Not only does the use of a single pixel remove the price of a sensor array from the picture, but also it has the potential to reduce the cost of the sensor cooling system. Since a single pixel device does not suffer from the pixel-bleeding problem that necessitates a high-quality cooling system in regular IR cameras, in a single-pixel camera an inexpensive cooling alternative just for removing background noise could be used.
The basic operation of a compressive sensing camera is relatively simple. Light from an object is focused by a lens onto a digital micromirror device (DMD). A DMD is an array of tiny mirrors that can be tilted ±10-12º, from “on” to “off” positions . In the “on” state, these mirrors reflect light from the object to a secondary lens, which focuses this light to the sensor. Data from this sensor is sent to a signal processing unit, which manipulates data points to construct an image of the original object. To generate a set of useful data points for image reconstruction, the mirrors of the DMD are flipped into a series of random configurations, in each of which approximately half of the mirrors are in their on position.
For an N-pixel image, N samples are taken. Then, since this is usually too many samples too store or transmit, the data is compressed. Transform coding is used to perform the compression. This takes a sample vector x and expresses it in basis Ψ, as shown in the figure below. Then, x can be specified just by the K-sparse matrix s, which has K muchlessthan N non-zero values. One example of a typical Ψ matrix is the discrete cosine transform (DCT) used for JPEG. This works well for natural images, as they tend to be sparse in the DCT.
The standard approach is lossy. N - K samples are thrown away, and only the K values from s are stored or transmitted. This means that removed data cannot be recovered afterwards, so the quality of the image cannot be improved later.
The standard sample-then-compress process can be simplified. Consider directly acquiring a signal in a compressed representation. Say we obtain M lessthan N measurements and place them in the vector y. Then y is equivalent to the inner product of x and M other vectors (where x is the N-sample vector from the standard sampling example). If we stack these M other vectors into the matrix Φ, then the overall equation becomes . This is illustrated in the figure below.
Gaussian noise actually works quite well for creating the matrix. So, the only really difficult math to be done here is in designing a reconstruction algorithm to obtain x from y. This is a complex process, but faster methods are developing rapidly.
Notification Switch
Would you like to follow the 'Nir single pixel camera' conversation and receive update notifications?