Limits...
Integral Images: Efficient Algorithms for Their Computation and Storage in Resource-Constrained Embedded Vision Systems.

Ehsan S, Clark AF - Sensors (Basel) (2015)

Bottom Line: The integral image, an intermediate image representation, has found extensive use in multi-scale local feature detection algorithms, such as Speeded-Up Robust Features (SURF), allowing fast computation of rectangular features at constant speed, independent of filter size.An efficient design strategy is also proposed for a parallel integral image computation unit to reduce the size of the required internal memory (nearly 35% for common HD video).Addressing the storage problem of integral image in embedded vision systems, the paper presents two algorithms which allow substantial decrease (at least 44.44%) in the memory requirements.

View Article: PubMed Central - PubMed

Affiliation: School of Computer Science and Electronic Engineering, University of Essex, Colchester CO4 3SQ, UK. sehsan@essex.ac.uk.

ABSTRACT
The integral image, an intermediate image representation, has found extensive use in multi-scale local feature detection algorithms, such as Speeded-Up Robust Features (SURF), allowing fast computation of rectangular features at constant speed, independent of filter size. For resource-constrained real-time embedded vision systems, computation and storage of integral image presents several design challenges due to strict timing and hardware limitations. Although calculation of the integral image only consists of simple addition operations, the total number of operations is large owing to the generally large size of image data. Recursive equations allow substantial decrease in the number of operations but require calculation in a serial fashion. This paper presents two new hardware algorithms that are based on the decomposition of these recursive equations, allowing calculation of up to four integral image values in a row-parallel way without significantly increasing the number of operations. An efficient design strategy is also proposed for a parallel integral image computation unit to reduce the size of the required internal memory (nearly 35% for common HD video). Addressing the storage problem of integral image in embedded vision systems, the paper presents two algorithms which allow substantial decrease (at least 44.44%) in the memory requirements. Finally, the paper provides a case study that highlights the utility of the proposed architectures in embedded vision systems.

No MeSH data available.


Block diagram of the proposed architecture. andare the image pixel value and the integral image value at locationin the image.is the row sum at that particular location.
© Copyright Policy
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC4541907&req=5

sensors-15-16804-f009: Block diagram of the proposed architecture. andare the image pixel value and the integral image value at locationin the image.is the row sum at that particular location.

Mentions: Figure 9 shows a block diagram of the proposed architecture for an embedded integral image computation engine. This pipelined architecture computes two integral image values in a single clock cycle. Unlike the parallel methods presented in Section 3 and Section 4 which store a complete row of integral image values in internal memory for computing the next row, this design strategy saves only the difference values of the adjacent columns in a row for calculating the next row. Only the integral image value for the first column in that row is saved in a separate register to allow computation of the integral image values from the stored difference values. Although the depth of the internal memory remains the same as mentioned above, the proposed design approach requires the width to be(number of rows × maximum image pixel value) rounded to the upper integer value. Table 3 provides the timing results and the results for internal memory reduction when prototyped on an FPGA, a Virtex-6 XC6VLX240T device, for some common image sizes with 8-bit pixels. The architecture is implemented using Verilog. The maximum frequency of the design is 146.71 MHz. Please note that the values in parenthesis indicate the percentage of Virtex-6 resources consumed. It is evident from Table 3 that the architecture is capable of achieving significant memory reduction over other recursion-based methods, even for small image sizes while providing high throughput.


Integral Images: Efficient Algorithms for Their Computation and Storage in Resource-Constrained Embedded Vision Systems.

Ehsan S, Clark AF - Sensors (Basel) (2015)

Block diagram of the proposed architecture. andare the image pixel value and the integral image value at locationin the image.is the row sum at that particular location.
© Copyright Policy
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC4541907&req=5

sensors-15-16804-f009: Block diagram of the proposed architecture. andare the image pixel value and the integral image value at locationin the image.is the row sum at that particular location.
Mentions: Figure 9 shows a block diagram of the proposed architecture for an embedded integral image computation engine. This pipelined architecture computes two integral image values in a single clock cycle. Unlike the parallel methods presented in Section 3 and Section 4 which store a complete row of integral image values in internal memory for computing the next row, this design strategy saves only the difference values of the adjacent columns in a row for calculating the next row. Only the integral image value for the first column in that row is saved in a separate register to allow computation of the integral image values from the stored difference values. Although the depth of the internal memory remains the same as mentioned above, the proposed design approach requires the width to be(number of rows × maximum image pixel value) rounded to the upper integer value. Table 3 provides the timing results and the results for internal memory reduction when prototyped on an FPGA, a Virtex-6 XC6VLX240T device, for some common image sizes with 8-bit pixels. The architecture is implemented using Verilog. The maximum frequency of the design is 146.71 MHz. Please note that the values in parenthesis indicate the percentage of Virtex-6 resources consumed. It is evident from Table 3 that the architecture is capable of achieving significant memory reduction over other recursion-based methods, even for small image sizes while providing high throughput.

Bottom Line: The integral image, an intermediate image representation, has found extensive use in multi-scale local feature detection algorithms, such as Speeded-Up Robust Features (SURF), allowing fast computation of rectangular features at constant speed, independent of filter size.An efficient design strategy is also proposed for a parallel integral image computation unit to reduce the size of the required internal memory (nearly 35% for common HD video).Addressing the storage problem of integral image in embedded vision systems, the paper presents two algorithms which allow substantial decrease (at least 44.44%) in the memory requirements.

View Article: PubMed Central - PubMed

Affiliation: School of Computer Science and Electronic Engineering, University of Essex, Colchester CO4 3SQ, UK. sehsan@essex.ac.uk.

ABSTRACT
The integral image, an intermediate image representation, has found extensive use in multi-scale local feature detection algorithms, such as Speeded-Up Robust Features (SURF), allowing fast computation of rectangular features at constant speed, independent of filter size. For resource-constrained real-time embedded vision systems, computation and storage of integral image presents several design challenges due to strict timing and hardware limitations. Although calculation of the integral image only consists of simple addition operations, the total number of operations is large owing to the generally large size of image data. Recursive equations allow substantial decrease in the number of operations but require calculation in a serial fashion. This paper presents two new hardware algorithms that are based on the decomposition of these recursive equations, allowing calculation of up to four integral image values in a row-parallel way without significantly increasing the number of operations. An efficient design strategy is also proposed for a parallel integral image computation unit to reduce the size of the required internal memory (nearly 35% for common HD video). Addressing the storage problem of integral image in embedded vision systems, the paper presents two algorithms which allow substantial decrease (at least 44.44%) in the memory requirements. Finally, the paper provides a case study that highlights the utility of the proposed architectures in embedded vision systems.

No MeSH data available.