Smart camera basics: Matching image sensor, analytics and processor performance

Intelligent network cameras are an exciting new technology. This article will help you understand the basics of intelligent network cameras (or smart cameras), their enabling technologies, and available design tools.

Intelligent network cameras — the basics
Thanks to advances in digital technology, video security systems today are far more efficient than those of just a few years ago. Modern digital video surveillance systems record video directly to rewriteable media such as a hard disk or flash drive. Cameras can be connected to standard data networks and the Internet for monitoring and control via secure communications from anywhere in the world.

Adding intelligence to the surveillance camera makes the entire system more economical. For example, recording video when nothing of interest is happening in the scene is pointless. Modern electronics can easily discern movement in a scene and trigger recording only during those moments, reducing the amount of storage capacity needed.
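Motion-triggered recording can be sketched with a simple frame-differencing check. The flat grayscale frame format and the threshold value below are illustrative assumptions, not a production algorithm:

```python
def motion_detected(prev_frame, curr_frame, threshold=12.0):
    """Return True if the mean absolute pixel difference between two
    frames exceeds a threshold, signaling that recording should start.

    Frames are flat lists of 8-bit grayscale pixel values; both the
    frame format and the threshold are illustrative assumptions.
    """
    diff = sum(abs(a - b) for a, b in zip(prev_frame, curr_frame))
    return diff / len(curr_frame) > threshold

# A static scene (identical frames) triggers no recording...
assert not motion_detected([10] * 64, [10] * 64)
# ...while a large change across many pixels does.
assert motion_detected([10] * 64, [200] * 64)
```

Real cameras use more robust techniques (background modeling, noise filtering), but the principle of recording only on scene change is the same.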

A more significant cost reduction comes from automatic monitoring, which reduces the need for human video scanning and, by reducing human error, also makes the system more reliable.

Figure 1: Intelligent network security system

I offer the following table to illustrate the benefits of modern technology in intelligent network cameras.

Table 1: Technologies in intelligent network cameras

Next: Essential network camera ingredients — digital image capture, image sensors

Essential ingredients
All network cameras have basically the same fundamental components. These are digital image capture (either using an image sensor or an analog video decoder), digital video compression, video analytics, and a network connection.

Figure 2: Fundamental components of network cameras

Digital image capture
One of the best applications of existing technology for the security market has been the use of the digital image sensor. Compared with analog video signals from camera modules, image sensors allow more design flexibility and lower total system cost.

Figure 3: Analog and digital camera comparison

Image sensor basics
An image sensor is an integrated circuit that converts a visual image into an electrical signal. These are used in digital still cameras, video cameras, and other imaging devices. There are two basic image sensor types: the CCD, or charge-coupled device, and CMOS.

CCD imagers have been around longer and are analog devices that convert light to a charge that is stored in a capacitor. This charge is converted into a digital signal using an external device containing an A/D converter.

CMOS imagers are a newer mixed-signal technology in which the analog-to-digital conversion is performed on the chip itself, providing a digital output with no need for external devices to perform the conversion. CMOS sensors are available with integrated on-chip image preprocessing logic, which can reduce total system cost. While CCD image sensors remain attractive for some applications, we will concentrate on CMOS image sensors in this article.

Next: Image sensor resolution and tradeoffs

Image sensor resolution and tradeoffs
There are several parameters to think about when selecting a CMOS image sensor. First, what sensor resolution does your application require? In security applications, higher image sensor resolution has the advantage of finer image detail, and can enable a higher degree of digital “zoom” while maintaining acceptable levels of image quality.

There are several tradeoffs when choosing a higher-resolution sensor. As sensor resolution goes up for a given imager size, the area of each pixel element goes down. This directly reduces the amount of light each sensor element “sees” and can require a longer exposure time under equivalent lighting conditions, which in turn limits the frame update rate.

Additionally, with larger image sizes, the clock rate required to move the video data to the processor is higher (for a given frame rate), and this has power consequences. Furthermore, processing more data requires more processor throughput.
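The tradeoffs above are easy to quantify. The sketch below compares a VGA and a 2-megapixel sensor in the same optical format; the 1/3" sensor dimensions and the 10% blanking overhead are illustrative assumptions:

```python
def pixel_pitch_um(sensor_width_mm, sensor_height_mm, h_pixels, v_pixels):
    """Approximate pixel pitch in microns, ignoring inter-pixel gaps."""
    return min(sensor_width_mm * 1000 / h_pixels,
               sensor_height_mm * 1000 / v_pixels)

def pixel_clock_mhz(h_pixels, v_pixels, fps, blanking_overhead=1.1):
    """Approximate pixel clock (MHz) needed for a given frame rate;
    the 10% blanking overhead is an illustrative assumption."""
    return h_pixels * v_pixels * fps * blanking_overhead / 1e6

# Same assumed 1/3" optical format (~4.8 x 3.6 mm active area):
print(round(pixel_pitch_um(4.8, 3.6, 640, 480), 2))    # VGA: 7.5 um pixels
print(round(pixel_pitch_um(4.8, 3.6, 1600, 1200), 2))  # 2 MP: 3.0 um pixels
print(round(pixel_clock_mhz(640, 480, 30), 1))         # VGA@30fps: ~10.1 MHz
print(round(pixel_clock_mhz(1600, 1200, 30), 1))       # 2MP@30fps: ~63.4 MHz
```

Quadrupling the pixel count shrinks each pixel's light-gathering area by roughly 4x and raises the required pixel clock (and hence power and processor load) by the same factor.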

Here is a partial list of sensor parameters to consider:

Table 2: Image sensor parameters

Next: The image pipe and video capture

The image pipe
SOC-type CMOS image sensors typically include a set of features that effectively turns them into a camera-on-a-chip. This collection of features is called an image pipe: a set of video processing functions such as white balance, gamma correction, de-mosaic, and formatting. These are essential functions that render the raw output of the sensor into a video image suitable for processing.

Figure 4: The image pipe

Each system that uses an image sensor will need an image pipe, but it can be located in one of three places: the processor, the image sensor itself, or an external component such as an ASIC or FPGA. The location of the image pipe is a function of the system performance needed and of the processor and image sensor selected. CMOS image sensors with an integrated image pipe are called SOC sensors. A select few processors include an image pipe.
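As a rough illustration of what one image-pipe stage does, here is a minimal gamma-correction lookup table; the gamma value of 2.2 and the 8-bit depth are common choices, used here as illustrative assumptions:

```python
def gamma_lut(gamma=2.2, bits=8):
    """Build a gamma-correction lookup table, one typical image-pipe stage.

    Maps linear sensor codes to display-referred codes. The gamma value
    and bit depth are illustrative assumptions, not from any datasheet.
    """
    max_code = (1 << bits) - 1
    return [round(max_code * (code / max_code) ** (1 / gamma))
            for code in range(max_code + 1)]

lut = gamma_lut()
# Applying the stage is then a single table lookup per pixel:
corrected = [lut[pixel] for pixel in [0, 64, 128, 255]]
```

In hardware, a lookup table like this costs one small memory per channel, which is why such stages are cheap to put on the sensor die itself.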

Figure 5: Image pipe location options

Video capture
A function which directly affects the processor selection is the need for a video input interface.

CMOS image sensors and analog video decoders typically employ a unidirectional synchronous parallel or serial bus to transfer image data to the processor. This usually consists of 8 to 12 bits of pixel data, horizontal and vertical data-valid signals, and a clock. This kind of interface will not connect directly to the general-purpose interfaces most processors have, such as an asynchronous or synchronous memory interface. A dedicated hardware block is needed to capture the data from this bus and move it into processor memory.

In most processors, this hardware block is called a camera sensor interface (CSI) or video input interface. In Analog Devices' Blackfin processors for example, this interface, along with a special DMA capability called two-dimensional (2D) DMA, completely automates the capture of video data into frames in memory. The application can define a multitude of frame buffers in a chained fashion and the video input interface with 2D DMA will load them sequentially.

A well designed video input interface and DMA capability are essential to minimize the required processor load for video capture.
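The chained frame-buffer scheme described above can be modeled as a ring of buffers that the capture hardware fills in sequence. This is an illustrative software model, not Blackfin driver code:

```python
class FrameBufferRing:
    """Illustrative model of chained frame buffers filled by a
    video-input DMA: the hardware writes each new frame into the
    next buffer in the chain, wrapping around at the end, so the
    application always has the most recent frames available."""

    def __init__(self, num_buffers):
        self.buffers = [None] * num_buffers
        self.write_index = 0

    def dma_complete(self, frame):
        """Called once per captured frame (models the DMA-done interrupt)."""
        self.buffers[self.write_index] = frame
        self.write_index = (self.write_index + 1) % len(self.buffers)

ring = FrameBufferRing(3)
for n in range(5):              # capture five frames into three buffers
    ring.dma_complete(f"frame{n}")
# The ring now holds the three most recent frames.
```

Because the DMA engine advances through the chain on its own, the CPU only touches a frame when it is ready to process it, which is exactly the load reduction the paragraph above calls for.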

Next: Digital video compression and Multistreaming

Digital video compression
Video, in raw digital form, requires a great deal of bandwidth to transmit. For example, standard-definition video with a resolution of 640×480 pixels at 30 frames/second and 16 bits/pixel has a raw digital bit rate of roughly 147 megabits/second. This is far too high to be cost-effective given the bandwidth of existing LANs and telecommunications networks. An even greater problem is HDTV, which can require up to 1.5 gigabits/second of bandwidth, exceeding the limits of gigabit Ethernet. To transmit this video, it must be compressed, and this is done using a codec. There are many standard codecs a designer can select from. Each has its own tradeoffs in features, video quality, and computational requirements.
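The figures above follow directly from the frame geometry; the HD numbers below assume 1080p at 30 frames/second and 24 bits/pixel:

```python
def raw_bitrate_mbps(width, height, fps, bits_per_pixel):
    """Uncompressed video bit rate in megabits per second."""
    return width * height * fps * bits_per_pixel / 1e6

sd = raw_bitrate_mbps(640, 480, 30, 16)    # ~147 Mbit/s, as in the text
hd = raw_bitrate_mbps(1920, 1080, 30, 24)  # ~1,493 Mbit/s (~1.5 Gbit/s)
```

A codec such as H.264 can bring the SD stream down to a few megabits per second, which is what makes network transport practical.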

Figure 6: Video data reduction

Other considerations when choosing a video codec are the costs associated with it. These are highly specialized algorithms and require extensive optimization to run efficiently on a given processor platform. Some processor vendors offer free codec modules under an agreement that they run only on that vendor's hardware. Other codecs are available from software IP vendors for an up-front fee and/or royalty.

Table 3: Codecs
*Spatial compression is also called intraframe compression.
**Temporal compression is also called interframe compression.

Midrange and high-end smart cameras might provide two or more compressed video streams, each at different resolutions and frame rates. This is often called multistreaming. A low bandwidth video stream allows video monitoring via low-speed telephone lines or wirelessly using a cellular modem. At the same time, a full resolution video stream is stored locally and is sent when an event occurs.
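A multistreaming design typically checks each stream's bit rate against its target link; the link rates, codec rates, and 20% protocol overhead below are illustrative assumptions:

```python
def stream_fits(link_kbps, codec_kbps, overhead=1.2):
    """Check whether a compressed stream fits on a link, allowing 20%
    for protocol overhead (the overhead figure is an illustrative
    assumption; real budgets depend on the transport protocol)."""
    return codec_kbps * overhead <= link_kbps

# A low-bandwidth monitoring stream fits a slow cellular link...
assert stream_fits(link_kbps=384, codec_kbps=128)
# ...while the full-resolution recording stream clearly does not.
assert not stream_fits(link_kbps=384, codec_kbps=2000)
```

This is why the full-resolution stream is stored locally and forwarded only on events, while the low-rate stream runs continuously.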

Figure 7: Multistreaming

Multistreaming: advantages and tradeoffs

Advantages:

  • Allows use of low-speed communications networks for monitoring
  • Saves network bandwidth
  • Allows full-resolution event recording and capture

Tradeoffs:

  • High processing throughput needed as two codecs run simultaneously, leading to increased camera cost

Next: On-board video analytics and design tools

On-board video analytics
Video analytics is a generic term for a group of video processing algorithms that process the live video stream to detect particular events. These events can be as simple as a person or vehicle crossing a “virtual trip wire” or as advanced as face recognition.

On-board video analytics means that these algorithms run in the video camera itself, either in software on a processor or DSP, or as a special IP core in an FPGA. The processing requirements of each of these analysis methods are highly dependent on the algorithm type, the processing platform, and the amount of optimization performed.
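The simplest analytic mentioned above, a virtual trip wire, reduces to a sign test on a tracked object's position between frames. This 1-D sketch (wire position and tracking model are illustrative assumptions, not a production tracker):

```python
def crossed_tripwire(prev_x, curr_x, wire_x=100):
    """Return True if a tracked object's x-position crossed the virtual
    trip wire between two consecutive frames. The object was on one
    side before and the other side after exactly when the signed
    distances to the wire have opposite signs."""
    return (prev_x - wire_x) * (curr_x - wire_x) < 0

assert crossed_tripwire(90, 110)     # crossed from left to right
assert not crossed_tripwire(90, 95)  # approached but did not cross
```

The hard part in practice is not this test but reliably segmenting and tracking the object that feeds it, which is where most of the processing throughput goes.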

On-board video analytics: advantages and tradeoffs

Advantages:

  • Reduces false alarms
  • Reduces network traffic
  • Distributed processing reduces the cost of the central monitoring system

Tradeoffs:

  • Increases camera cost
  • Highly complex; third-party IP required in most circumstances
  • High processing throughput needed for most algorithms

Table 4: Descriptions of video analytic events

Figure 8: A more extensive list of video analytics events

Design tools
Clearly, there are a number of aspects to consider when designing a smart camera, and there are specialized tools to help the engineer get a head start in development.

Figure 9: Video Surveillance Development Kit

Avnet's Video Surveillance Development Kit (VSK), for example, provides a simple evaluation and development platform for smart camera applications that allows customers to investigate the performance features of a CMOS image sensor and to create and test software applications before designing their own circuit boards. The kit includes a choice of image sensors, a 500MHz dual-core Blackfin processor, and an Ethernet interface.

This article has discussed the different considerations when designing an intelligent network camera. CMOS image sensor selection is quite important and determines a major portion of the camera feature set. Selecting the processor involves making sure it has the right interfaces and performance to accomplish the design objectives. Video capture, compression, and analysis represent significant processing loads.

About the author
John Weber is a Technology Specialist at Avnet Electronics Marketing with a focus on processors and DSP. Prior to this role, John worked as an FAE and before that as a designer of customized radar and robotics systems for government use. He has a BSEE from the UA Huntsville. He can be reached at .
