Choosing a Computer Vision Camera

Know thy requirements

The most important part of picking a camera for a computer vision application is knowing your requirements. That sounds like common sense, but people rarely feel the weight of that decision until the selection process forces it on them. It also helps to be able to explain to other people what you need. Then, once you know what you need, you still must weigh competing requirements against each other (cost vs. performance, for example). For us, the requirements look something like this, listed in order of importance:

  • Low-light sensitivity
  • Accessible API
  • Cost
  • Frames per second
  • USB3 interface
  • Other size/weight/power/operating temperature/field of view requirements

Low-light camera considerations

From here, I took my primary requirement and learned a bit (a lot) about how digital cameras work these days. To be perfectly honest, I haven't ever been into photography, and I know little about how any of it works, technically or stylistically, apart from a handful of fun facts. After a day of studying, I learned about a subset of digital camera properties that can affect its ability to operate in a low-light environment, namely the semi-deep sea. This shabby-looking graph will give you an idea of which wavelengths of the visible light spectrum penetrate deepest:



Interestingly, you can see from this graph why deeper ocean water looks blue, while coastal waters occasionally look green-- blue penetrates deepest in most of the ocean, but certain microbes/particulates present in shallow coastal waters allow green to penetrate deepest.


CCD and CMOS are the two most widely adopted technologies for the digital camera sensors which absorb light's photons for conversion into an electrical charge later interpreted by your camera. Without delving far too deep into the differences, know that CCD:

  • Used to be king because CMOS required more, and more complicated, electronics
  • Dedicates more of its surface area to photon absorption (potentially better light sensitivity)
  • Sends every pixel's charge out through a limited number of output nodes

and CMOS:

  • Has a smaller footprint
  • Allows for more frames per second
  • Creates less noise
  • Costs less
  • Lets every pixel transmit its charge in parallel

I've undoubtedly conveyed a piece of information improperly here, but that's the gist of it. And at first glance, you'd think "better light sensitivity, problem solved." But wait, there's more...

Pixel Size

Pixel size is just that-- the size of each pixel in µm. It has a non-linear effect on the photon absorption ability of the sensor: a pixel's light-gathering area grows with the square of its size.
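To make the non-linear effect concrete, here is a minimal sketch that compares light-gathering area across pixel sizes. It assumes square pixels and that everything else (quantum efficiency, optics) is equal. The function name and the pixel sizes are made up for illustration and don't come from any particular sensor.

```python
def relative_light_area(pixel_size_um: float, reference_um: float) -> float:
    """Ratio of a pixel's light-gathering area to that of a reference pixel.

    Assumes square pixels, so area scales with the square of the side length.
    """
    return (pixel_size_um / reference_um) ** 2


# Illustrative pixel sizes in micrometers (not taken from real sensors).
for size in (2.0, 4.0, 6.0):
    ratio = relative_light_area(size, reference_um=2.0)
    print(f"{size} um pixel gathers {ratio:.1f}x the light of a 2.0 um pixel")
```

Doubling the pixel size quadruples the area, and tripling it gives nine times the area, which is why a seemingly small difference in µm matters so much.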

Quantum efficiency

Quantum efficiency is somewhat related to pixel size, in that it represents the percentage of photons, which fall onto the sensor, that the sensor converts to an electric charge. Essentially, this attribute is the full light detection rate for an individual sensor. 

Temporal dark noise & Dynamic range

After a sensor absorbs a photon and converts it to an electrical charge, the charge is stored in a well until digitization, or measurement, of the charge begins. The error in that measurement is called temporal dark noise. Noise in the popular signal-to-noise ratio measurement comprises temporal dark noise as well as shot noise, which comes solely from the nature of the light.

Absolute sensitivity threshold

This is simply the number of photons needed to obtain a signal equivalent to the noise observed by the camera. It represents most directly the minimum amount of light needed to observe any meaningful signal in the resultant image.
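As a rough sketch of that definition, the threshold can be estimated by asking how many photons it takes for the converted charge to equal the noise. This simplified version treats noise as temporal dark noise only (measured in electrons) and ignores shot noise, so it is an approximation and not what a datasheet would report. The function name and the example numbers are hypothetical.

```python
def absolute_sensitivity_threshold(dark_noise_e: float, quantum_efficiency: float) -> float:
    """Photons per pixel needed for the signal (in electrons) to equal the noise.

    Simplification: noise is temporal dark noise only; shot noise is ignored.
    quantum_efficiency is a fraction between 0 and 1.
    """
    if not 0.0 < quantum_efficiency <= 1.0:
        raise ValueError("quantum_efficiency must be in (0, 1]")
    # photons * QE = electrons of signal, so photons = noise / QE
    return dark_noise_e / quantum_efficiency


# Hypothetical sensor: 6 electrons of dark noise, 50% quantum efficiency.
print(absolute_sensitivity_threshold(6.0, 0.5))
```

Under this simplification, lowering the noise or raising the quantum efficiency both push the threshold down, which is what we want for low light.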

Signal & Noise

Signal, then, may be calculated with the equation:

Signal = Light Density x (Pixel Size)^2 x Quantum Efficiency

If we take Light Density as the variable, then we can represent a sensor's signal as a line whose slope, (Pixel Size)^2 x Quantum Efficiency, determines how quickly the signal grows as the light level increases. Following that, noise may be calculated with the equation:

Noise = SQRT[ (Temporal Dark Noise)^2 + (Shot Noise)^2 ]

Given that we can't control the amount of shot noise, for the purpose of evaluating a sensor, take Noise = Temporal Dark Noise.
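The two equations above can be put together in a few lines of Python. This is only a sketch: the function names are mine, the units (photons per µm² for light density, electrons for noise) are an assumption, and the example values are hypothetical and not measurements from any real sensor.

```python
import math


def signal_electrons(light_density: float, pixel_size_um: float, quantum_efficiency: float) -> float:
    """Signal = Light Density x (Pixel Size)^2 x Quantum Efficiency."""
    return light_density * pixel_size_um ** 2 * quantum_efficiency


def total_noise(temporal_dark_noise: float, shot_noise: float = 0.0) -> float:
    """Noise = sqrt(dark^2 + shot^2).

    shot_noise defaults to 0, matching the simplification of taking
    Noise = Temporal Dark Noise when evaluating a sensor.
    """
    return math.sqrt(temporal_dark_noise ** 2 + shot_noise ** 2)


def signal_to_noise(signal: float, noise: float) -> float:
    """Plain ratio of signal to noise."""
    return signal / noise


# Hypothetical sensor: 5.0 um pixels, 50% QE, 5 electrons of dark noise,
# under an assumed light density of 1 photon per square micrometer.
sig = signal_electrons(light_density=1.0, pixel_size_um=5.0, quantum_efficiency=0.5)
noise = total_noise(temporal_dark_noise=5.0)
print(sig, noise, signal_to_noise(sig, noise))
```

Running the same numbers for each candidate sensor, with its real pixel size, quantum efficiency, and dark noise filled in from a datasheet, gives a quick way to compare them at a given light level.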

Why can't I just use a GoPro?

Funny thing about that-- we are, but not for our computer vision camera. GoPro has been extremely successful in bringing durable, high-performance recording equipment to the masses. From what I read, their cameras perform well in lower-light conditions as well (we'll see). Unfortunately, I can't find many specs about their sensors, and they have no API or wired streaming capability. Still, since it is reliable down to 40m, I conjecture that our new GoPro Hero4 Black shall perform well as a cinematic camera.

Where we're left

Considering the above characteristics affecting low-light performance, we can see the ultimate importance of pixel size, and also of a clean digitization process (for low noise). However, there might also exist a sensor with a fantastic quantum efficiency that offsets a smaller pixel. There might also exist other digital camera sensor attributes, important to low-light performance, which I've failed to cover here.

After this research, I'm left with four sensors from which to choose: Sony IMX174/249/250/252. Looking at finished products from three separate manufacturers that use these sensors, I briefly profiled each:

  • IMX174 - $1000+, very high FPS, 1920x1200, good QE, standard noise
  • IMX249 - ~$500, low FPS, 1920x1200, good QE, standard noise
  • IMX250 - $1000+, standard FPS, 2448x2048, standard QE, low noise
  • IMX252 - $1000+, high FPS, 2048x1536, standard QE, low noise

The IMX174 and IMX249 look almost exactly the same, save FPS and price. I'm leaning toward those two more than the others. They also have a fortunate QE quirk where they absorb 74% of photons in the green wavelength (525nm). Given that green penetrates deepest in coastal waters, and maybe even with the help of some green LEDs of the correct wavelength, they could pleasantly surprise us with their performance in our application!