Note:
The chapter describes functions for image processing and analysis.
Most of the functions work with 2D arrays of pixels. We refer to these arrays
as "images"; however, they do not necessarily have to be IplImage's, they may
be CvMat's or CvMatND's as well.
Calculates first, second, third or mixed image derivatives using extended Sobel operator
void cvSobel( const CvArr* src, CvArr* dst, int xorder, int yorder, int aperture_size=3 );
aperture_size
=1 3x1 or 1x3 kernel is used (Gaussian smoothing is not done).
There is also the special value CV_SCHARR
(=-1) that corresponds to the 3x3 Scharr filter, which may
give more accurate results than the 3x3 Sobel. The Scharr aperture is:

| -3  0  3 |
| -10 0 10 |
| -3  0  3 |

for the x-derivative, or transposed for the y-derivative.
The function cvSobel
calculates the image derivative by convolving the image
with the appropriate kernel:
dst(x,y) = d^{xorder+yorder}src/dx^{xorder}•dy^{yorder} evaluated at (x,y)

The Sobel operators combine Gaussian smoothing and differentiation, so the result is more or less robust to noise. Most often, the function is called with (xorder=1, yorder=0, aperture_size=3) or (xorder=0, yorder=1, aperture_size=3) to calculate the first x or y image derivative. The first case corresponds to the kernel:

| -1 0 1 |
| -2 0 2 |
| -1 0 1 |

and the second one corresponds to the kernel:

| -1 -2 -1 |      |  1  2  1 |
|  0  0  0 |  or  |  0  0  0 |
|  1  2  1 |      | -1 -2 -1 |

depending on the image origin (the origin field of the IplImage structure).
No scaling is done, so the destination image usually contains larger absolute values than
the source image. To avoid overflow, the function requires a 16-bit destination image if
the source image is 8-bit. The result can be converted back to 8-bit using the cvConvertScale or
cvConvertScaleAbs functions. Besides 8-bit images, the function
can process 32-bit floating-point images.
Both the source and destination must be single-channel images of equal size or ROI size.
Calculates Laplacian of the image
void cvLaplace( const CvArr* src, CvArr* dst, int aperture_size=3 );
The function cvLaplace
calculates Laplacian of the source image by summing
second x and y derivatives calculated using Sobel operator:
dst(x,y) = d^{2}src/dx^{2} + d^{2}src/dy^{2}
Specifying aperture_size
=1 gives the fastest variant that is equal to
convolving the image with the following kernel:
| 0  1  0 |
| 1 -4  1 |
| 0  1  0 |
Similar to cvSobel function, no scaling is done and the same combinations of input and output formats are supported.
Implements Canny algorithm for edge detection
void cvCanny( const CvArr* image, CvArr* edges, double threshold1, double threshold2, int aperture_size=3 );
The function cvCanny
finds the edges on the input image image
and marks them in the
output image edges
using the Canny algorithm. The smaller of threshold1 and threshold2
is used for edge linking; the larger is used to find the initial segments of strong edges.
Calculates feature map for corner detection
void cvPreCornerDetect( const CvArr* image, CvArr* corners, int aperture_size=3 );
The function cvPreCornerDetect
calculates the function
D_{x}^{2}•D_{yy} + D_{y}^{2}•D_{xx} - 2•D_{x}•D_{y}•D_{xy}
where D_{?} denotes one of the first image derivatives and D_{??} denotes a second image
derivative. The corners can be found as local maximums of the function:
// assume that the image is floating-point
IplImage* corners = cvCloneImage(image);
IplImage* dilated_corners = cvCloneImage(image);
IplImage* corner_mask = cvCreateImage( cvGetSize(image), 8, 1 );
cvPreCornerDetect( image, corners, 3 );
cvDilate( corners, dilated_corners, 0, 1 );
cvSub( corners, dilated_corners, corners );
cvCmpS( corners, 0, corner_mask, CV_CMP_GE );
cvReleaseImage( &corners );
cvReleaseImage( &dilated_corners );
Calculates eigenvalues and eigenvectors of image blocks for corner detection
void cvCornerEigenValsAndVecs( const CvArr* image, CvArr* eigenvv, int block_size, int aperture_size=3 );
For every pixel, the function cvCornerEigenValsAndVecs considers a
block_size × block_size neighborhood S(p). It calculates the
covariation matrix of derivatives over the neighborhood as:
    | sum_{S(p)}(dI/dx)^{2}      sum_{S(p)}(dI/dx•dI/dy) |
M = |                                                    |
    | sum_{S(p)}(dI/dx•dI/dy)   sum_{S(p)}(dI/dy)^{2}    |
After that it finds eigenvectors and eigenvalues of the matrix and stores
them into destination image in form
(λ_{1}, λ_{2}, x_{1}, y_{1}, x_{2}, y_{2}),
where
λ_{1}, λ_{2} are the eigenvalues of M (not sorted);
(x_{1}, y_{1}) is the eigenvector corresponding to λ_{1};
(x_{2}, y_{2}) is the eigenvector corresponding to λ_{2}.
Calculates minimal eigenvalue of gradient matrices for corner detection
void cvCornerMinEigenVal( const CvArr* image, CvArr* eigenval, int block_size, int aperture_size=3 );
The function cvCornerMinEigenVal
is similar to cvCornerEigenValsAndVecs but
it calculates and stores only the minimal eigenvalue of derivative covariation matrix for every pixel,
i.e. min(λ_{1}, λ_{2}) in terms of the previous function.
Harris edge detector
void cvCornerHarris( const CvArr* image, CvArr* harris_dst, int block_size, int aperture_size=3, double k=0.04 );
The function cvCornerHarris
finds feature points (corners) in the image
using Harris' method. Similarly to
cvCornerMinEigenVal and
cvCornerEigenValsAndVecs,
for each pixel it calculates 2x2 gradient covariation matrix
M
over block_size×block_size
neighborhood.
Then, it stores

det(M) - k*trace(M)^{2}

to the corresponding pixel of the destination image. The corners can be found as local maxima in the destination image.
Refines corner locations
void cvFindCornerSubPix( const CvArr* image, CvPoint2D32f* corners, int count, CvSize win, CvSize zero_zone, CvTermCriteria criteria );
win
=(5,5) means that a 5*2+1 × 5*2+1 = 11 × 11 search window is used.
criteria
may specify either of or both the maximum
number of iteration and the required accuracy.
The function cvFindCornerSubPix
iterates to find the sub-pixel accurate location
of corners, or radial saddle points, as shown in the picture below.
The sub-pixel accurate corner locator is based on the observation that every vector
from the center q to a point p located within a neighborhood of q is orthogonal
to the image gradient at p, subject to image and measurement noise. Consider the expression:
ε_{i} = DI_{pi}^{T}•(q - p_{i})

where
DI_{pi}
is the image gradient
at one of the points p_{i}
in a neighborhood of q
.
The value of q
is to be found such that ε_{i}
is minimized.
A system of equations may be set up with each ε_{i} set to zero:
sum_{i}(DI_{pi}•DI_{pi}^{T})•q - sum_{i}(DI_{pi}•DI_{pi}^{T}•p_{i}) = 0
where the gradients are summed within a neighborhood ("search window") of q
.
Calling the first gradient term G
and the second gradient term b
gives:
q = G^{-1}•b
The algorithm sets the center of the neighborhood window at this new center q
and then iterates until the center moves by less than a set threshold.
Determines strong corners on image
void cvGoodFeaturesToTrack( const CvArr* image, CvArr* eig_image, CvArr* temp_image, CvPoint2D32f* corners, int* corner_count, double quality_level, double min_distance, const CvArr* mask=NULL, int block_size=3, int use_harris=0, double k=0.04 );
The function cvGoodFeaturesToTrack finds corners with big eigenvalues in the
image. The function first calculates the minimal eigenvalue for every source image pixel
using the cvCornerMinEigenVal function and stores them in eig_image.
Then it performs non-maxima suppression (only local maxima in a 3x3 neighborhood remain).
The next step rejects the corners whose minimal eigenvalue is less than
quality_level•max(eig_image(x,y)). Finally,
the function ensures that all the corners found are sufficiently distant from one
another by considering the corners (the strongest corners are considered first)
and checking that the distance between each newly considered feature and the features considered earlier
is larger than min_distance. Thus, the function removes the features that are too close
to stronger features.
Extracts Speeded Up Robust Features from image
void cvExtractSURF( const CvArr* image, const CvArr* mask, CvSeq** keypoints, CvSeq** descriptors, CvMemStorage* storage, CvSURFParams params );
CvSURFPoint
structures:
typedef struct CvSURFPoint
{
    CvPoint2D32f pt; // position of the feature within the image
    int laplacian;   // -1, 0 or +1. sign of the laplacian at the point.
                     // can be used to speed up feature comparison
                     // (normally features with laplacians of different signs can not match)
    int size;        // size of the feature
    float dir;       // orientation of the feature: 0..360 degrees
    float hessian;   // value of the hessian (can be used to approximately estimate
                     // the feature strengths; see also params.hessianThreshold)
} CvSURFPoint;
Depending on the params.extended value, each element of the sequence
will be either a 64-element or a 128-element floating-point (CV_32F) vector.
If the parameter is NULL, the descriptors are not computed.
CvSURFParams
:
typedef struct CvSURFParams
{
    int extended;            // 0 means basic descriptors (64 elements each),
                             // 1 means extended descriptors (128 elements each)
    double hessianThreshold; // only features with keypoint.hessian larger than that are extracted.
                             // a good default value is ~300-500 (it can depend on the average
                             // local contrast and sharpness of the image).
                             // the user can further filter out some features based on
                             // their hessian values and other characteristics
    int nOctaves;            // the number of octaves to be used for extraction.
                             // With each next octave the feature size is doubled (3 by default)
    int nOctaveLayers;       // the number of layers within each octave (4 by default)
} CvSURFParams;

CvSURFParams cvSURFParams(double hessianThreshold, int extended=0); // returns default parameters
The function cvExtractSURF
finds robust features in the image, as described in
[Bay06]. For each feature it returns its location, size, orientation
and optionally the descriptor, basic or extended. The function can be used for object tracking
and localization, image stitching, etc. See the find_obj.cpp demo in the OpenCV samples directory.
Reads raster line to buffer
int cvSampleLine( const CvArr* image, CvPoint pt1, CvPoint pt2, void* buffer, int connectivity=8 );
The buffer must be large enough to store at least
max( |pt2.x - pt1.x| + 1, |pt2.y - pt1.y| + 1 ) points in the case
of an 8-connected line, and |pt2.x - pt1.x| + |pt2.y - pt1.y| + 1 points in the case
of a 4-connected line.
The function cvSampleLine
implements a particular case of application of line
iterators. The function reads all the image points lying on the line between pt1
and pt2
, including the ending points, and stores them into the buffer.
Retrieves pixel rectangle from image with subpixel accuracy
void cvGetRectSubPix( const CvArr* src, CvArr* dst, CvPoint2D32f center );
The function cvGetRectSubPix
extracts pixels from src
:
dst(x, y) = src(x + center.x - (width(dst)-1)*0.5, y + center.y - (height(dst)-1)*0.5)
where the values of pixels at non-integer coordinates are retrieved using bilinear interpolation. Every channel of multiple-channel images is processed independently. Whereas the rectangle center must be inside the image, the whole rectangle may be partially occluded. In that case, the replication border mode is used to get pixel values beyond the image boundaries.
Retrieves pixel quadrangle from image with subpixel accuracy
void cvGetQuadrangleSubPix( const CvArr* src, CvArr* dst, const CvMat* map_matrix );
map_matrix: the transformation 2×3 matrix [A|b] (see the discussion).
The function cvGetQuadrangleSubPix
extracts pixels from src
at subpixel accuracy
and stores them to dst
as follows:
dst(x, y) = src( A_{11}x' + A_{12}y' + b_{1}, A_{21}x' + A_{22}y' + b_{2} ),

where A and b are taken from map_matrix:

             | A_{11} A_{12} b_{1} |
map_matrix = |                     |
             | A_{21} A_{22} b_{2} |

x' = x - (width(dst)-1)*0.5, y' = y - (height(dst)-1)*0.5
where the values of pixels at non-integer coordinates A•(x,y)^{T}+b are retrieved using bilinear interpolation. When the function needs pixels outside of the image, it uses the replication border mode to reconstruct the values. Every channel of multiple-channel images is processed independently.
Resizes image
void cvResize( const CvArr* src, CvArr* dst, int interpolation=CV_INTER_LINEAR );
CV_INTER_NN
method.
The function cvResize
resizes image src
(or its ROI)
so that it fits exactly to dst
(or its ROI).
Applies affine transformation to the image
void cvWarpAffine( const CvArr* src, CvArr* dst, const CvMat* map_matrix, int flags=CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS, CvScalar fillval=cvScalarAll(0) );
CV_WARP_FILL_OUTLIERS fills all the destination image pixels; if some of them correspond to outliers in the source image, they are set to fillval.
CV_WARP_INVERSE_MAP indicates that map_matrix is the inverse transform from the destination image
to the source and, thus, can be used directly for pixel interpolation. Otherwise,
the function finds the inverse transform from map_matrix.
The function cvWarpAffine
transforms source image using the specified
matrix:
dst(x', y') <- src(x, y)

(x', y')^{T} = map_matrix•(x, y, 1)^{T}    if CV_WARP_INVERSE_MAP is not set,
(x, y)^{T} = map_matrix•(x', y', 1)^{T}    otherwise
The function is similar to cvGetQuadrangleSubPix, but they are not exactly the same. cvWarpAffine requires the input and output images to have the same data type, has larger overhead (so it is not quite suitable for small images) and can leave part of the destination image unchanged, while cvGetQuadrangleSubPix may extract quadrangles from 8-bit images into a floating-point buffer, has smaller overhead and always changes the whole destination image content.
To transform a sparse set of points, use cvTransform function from cxcore.
Calculates affine transform from 3 corresponding points
CvMat* cvGetAffineTransform( const CvPoint2D32f* src, const CvPoint2D32f* dst, CvMat* map_matrix );
The function cvGetAffineTransform
calculates the
matrix of an affine transform such that:
(x'_{i},y'_{i})^{T}=map_matrix•(x_{i},y_{i},1)^{T}
where dst(i)=(x'_{i},y'_{i}), src(i)=(x_{i},y_{i}), i=0..2.
Calculates affine matrix of 2d rotation
CvMat* cv2DRotationMatrix( CvPoint2D32f center, double angle, double scale, CvMat* map_matrix );
The function cv2DRotationMatrix
calculates matrix:
[  α   β   (1-α)*center.x - β*center.y ]
[ -β   α   β*center.x + (1-α)*center.y ]

where α = scale*cos(angle), β = scale*sin(angle)
The transformation maps the rotation center to itself. If this is not the purpose, the shift should be adjusted.
Applies perspective transformation to the image
void cvWarpPerspective( const CvArr* src, CvArr* dst, const CvMat* map_matrix, int flags=CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS, CvScalar fillval=cvScalarAll(0) );
CV_WARP_FILL_OUTLIERS fills all the destination image pixels; if some of them correspond to outliers in the source image, they are set to fillval.
CV_WARP_INVERSE_MAP indicates that map_matrix is the inverse transform from the destination image
to the source and, thus, can be used directly for pixel interpolation. Otherwise,
the function finds the inverse transform from map_matrix.
The function cvWarpPerspective
transforms source image using
the specified matrix:
dst(x', y') <- src(x, y)

(t•x', t•y', t)^{T} = map_matrix•(x, y, 1)^{T}    if CV_WARP_INVERSE_MAP is not set,
(t•x, t•y, t)^{T} = map_matrix•(x', y', 1)^{T}    otherwise
For a sparse set of points use cvPerspectiveTransform function from cxcore.
Calculates perspective transform from 4 corresponding points
CvMat* cvGetPerspectiveTransform( const CvPoint2D32f* src, const CvPoint2D32f* dst, CvMat* map_matrix );

#define cvWarpPerspectiveQMatrix cvGetPerspectiveTransform
The function cvGetPerspectiveTransform
calculates
matrix of perspective transform such that:
(t_{i}•x'_{i},t_{i}•y'_{i},t_{i})^{T}=map_matrix•(x_{i},y_{i},1)^{T}
where dst(i)=(x'_{i},y'_{i}), src(i)=(x_{i},y_{i}), i=0..3
.
Applies generic geometrical transformation to the image
void cvRemap( const CvArr* src, CvArr* dst, const CvArr* mapx, const CvArr* mapy, int flags=CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS, CvScalar fillval=cvScalarAll(0) );
CV_WARP_FILL_OUTLIERS fills all the destination image pixels; if some of them correspond to outliers in the source image, they are set to fillval.
The function cvRemap
transforms source image using
the specified map:
dst(x,y)<src(mapx(x,y),mapy(x,y))
Similar to other geometrical transformations, some interpolation method (specified by user) is used to extract pixels with noninteger coordinates.
Remaps image to logpolar space
void cvLogPolar( const CvArr* src, CvArr* dst, CvPoint2D32f center, double M, int flags=CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS );
CV_WARP_INVERSE_MAP indicates that the inverse transformation, from the destination image
to the source, should be performed (see below).
The function cvLogPolar
transforms source image using
the following transformation:
Forward transformation (CV_WARP_INVERSE_MAP is not set):
    dst(phi, rho) <- src(x, y)
Inverse transformation (CV_WARP_INVERSE_MAP is set):
    dst(x, y) <- src(phi, rho),
where
    rho = M*log(sqrt(x^{2}+y^{2})),
    phi = atan(y/x)
The function emulates the human "foveal" vision and can be used for fast scale- and rotation-invariant template matching, object tracking, etc.
#include <cv.h>
#include <highgui.h>

int main(int argc, char** argv)
{
    IplImage* src;

    if( argc == 2 && (src = cvLoadImage(argv[1],1)) != 0 )
    {
        IplImage* dst = cvCreateImage( cvSize(256,256), 8, 3 );
        IplImage* src2 = cvCreateImage( cvGetSize(src), 8, 3 );
        cvLogPolar( src, dst, cvPoint2D32f(src->width/2,src->height/2), 40,
                    CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS );
        cvLogPolar( dst, src2, cvPoint2D32f(src->width/2,src->height/2), 40,
                    CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS+CV_WARP_INVERSE_MAP );
        cvNamedWindow( "log-polar", 1 );
        cvShowImage( "log-polar", dst );
        cvNamedWindow( "inverse log-polar", 1 );
        cvShowImage( "inverse log-polar", src2 );
        cvWaitKey();
    }
    return 0;
}
And this is what the program displays when opencv/samples/c/fruits.jpg is passed to it
Creates structuring element
IplConvKernel* cvCreateStructuringElementEx( int cols, int rows, int anchor_x, int anchor_y, int shape, int* values=NULL );
CV_SHAPE_RECT, a rectangular element;
CV_SHAPE_CROSS, a cross-shaped element;
CV_SHAPE_ELLIPSE, an elliptic element;
CV_SHAPE_CUSTOM, a user-defined element. In this case the parameter values
specifies the mask, that is, which neighbors of the pixel must be considered.
If values is NULL, then all values are considered
non-zero, that is, the element is of a rectangular shape. This parameter is
considered only if the shape is CV_SHAPE_CUSTOM.
The function cvCreateStructuringElementEx allocates and fills the
IplConvKernel structure, which can be used as a structuring element in the morphological
operations.
Deletes structuring element
void cvReleaseStructuringElement( IplConvKernel** element );
The function cvReleaseStructuringElement
releases the structure IplConvKernel
that is no longer needed. If *element
is NULL
, the function has no effect.
Erodes image by using arbitrary structuring element
void cvErode( const CvArr* src, CvArr* dst, IplConvKernel* element=NULL, int iterations=1 );
If element is NULL, a 3×3 rectangular
structuring element is used.
The function cvErode
erodes the source image using the specified structuring element
that determines the shape of a pixel neighborhood over which the minimum is taken:
dst = erode(src, element):  dst(x,y) = min_{(x',y') in element} src(x+x', y+y')

The function supports in-place operation. Erosion can be applied several times
(the iterations parameter). In the case of a color image, each channel is processed independently.
Dilates image by using arbitrary structuring element
void cvDilate( const CvArr* src, CvArr* dst, IplConvKernel* element=NULL, int iterations=1 );
If element is NULL, a 3×3 rectangular
structuring element is used.
The function cvDilate
dilates the source image using the specified structuring element
that determines the shape of a pixel neighborhood over which the maximum is taken:
dst = dilate(src, element):  dst(x,y) = max_{(x',y') in element} src(x+x', y+y')

The function supports in-place operation. Dilation can be applied several times
(the iterations parameter). In the case of a color image, each channel is processed independently.
Performs advanced morphological transformations
void cvMorphologyEx( const CvArr* src, CvArr* dst, CvArr* temp, IplConvKernel* element, int operation, int iterations=1 );
CV_MOP_OPEN - opening
CV_MOP_CLOSE - closing
CV_MOP_GRADIENT - morphological gradient
CV_MOP_TOPHAT - "top hat"
CV_MOP_BLACKHAT - "black hat"
The function cvMorphologyEx
can perform advanced morphological
transformations using erosion and dilation as basic operations.
Opening:
    dst = open(src, element) = dilate(erode(src, element), element)
Closing:
    dst = close(src, element) = erode(dilate(src, element), element)
Morphological gradient:
    dst = morph_grad(src, element) = dilate(src, element) - erode(src, element)
"Top hat":
    dst = tophat(src, element) = src - open(src, element)
"Black hat":
    dst = blackhat(src, element) = close(src, element) - src
The temporary image temp is required for the morphological gradient and, in the case of in-place
operation, for "top hat" and "black hat".
Smoothes the image in one of several ways
void cvSmooth( const CvArr* src, CvArr* dst, int smoothtype=CV_GAUSSIAN, int size1=3, int size2=0, double sigma1=0, double sigma2=0 );
size1
×size2
neighborhood of the pixel.
If the neighborhood size varies from pixel to pixel, compute the sums
using integral image (cvIntegral).
size1
×size2
neighborhood of the pixel.
size1
×size2
.
sigma1
and sigma2
may optionally be
used to specify shape of the kernel.
size1
×size1
(i.e. only square aperture can be used).
That is, for each pixel the result is the median computed
over size1
×size1
neighborhood.
In the case of the bilateral filter, color sigma = sigma1 and spatial sigma = sigma2.
If size1 != 0, a circular kernel with diameter size1 is used;
otherwise the diameter of the kernel is computed from sigma2.
Information about bilateral filtering
can be found at
http://www.dai.ed.ac.uk/CVonline/LOCAL_COPIES/MANDUCHI1/Bilateral_Filtering.html
If size2 is zero, it is set to size1.
When not zero, it must be odd too.
In case of Gaussian kernel this parameter may specify Gaussian sigma (standard deviation).
If it is zero, it is calculated from the kernel size:
sigma = (n/2 - 1)*0.3 + 0.8, where n=param1 for the horizontal kernel and n=param2 for the vertical kernel.

With the standard sigma for small kernels (3×3 to 7×7) the performance is better. If
param3
is not zero, while param1
and param2
are zeros, the kernel size is calculated from the sigma (to provide accurate enough operation).
In the case of the bilateral filter, the parameter specifies the color sigma; the larger the value, the stronger the posterization effect of the filter.
In case of nonsquare Gaussian kernel the parameter may be used to specify a different
(from param3
) sigma in the vertical direction.
In the case of the bilateral filter, the parameter specifies the spatial sigma; the larger the value,
the stronger the blurring effect of the filter. Note that with a large sigma2
the processing speed decreases substantially, so it is recommended to limit the kernel
size using the parameter size1.
The function cvSmooth smoothes the image using one of the predefined methods. Each of the methods
has some features and restrictions, listed below:
Applies linear filter to image
void cvFilter2D( const CvArr* src, CvArr* dst, const CvMat* kernel, CvPoint anchor=cvPoint(1,1));
The function cvFilter2D
applies the specified linear filter to the image.
In-place operation is supported. When the aperture is partially outside the image,
the function interpolates outlier pixel values from the nearest pixels at the
image boundary. If an ROI is set in the input
image, cvFilter2D treats it, similarly to many other OpenCV functions, as if
it were an isolated image, i.e. pixels inside the image but outside the ROI
are ignored. If this is undesirable, consider the new C++ filtering classes declared
in cv.hpp.
Copies image and makes border around it
void cvCopyMakeBorder( const CvArr* src, CvArr* dst, CvPoint offset, int bordertype, CvScalar value=cvScalarAll(0) );
IPL_BORDER_CONSTANT - the border is filled with the fixed value, passed as the last parameter of the function.
IPL_BORDER_REPLICATE - the pixels from the top and bottom rows, and the leftmost and rightmost columns, are replicated to fill the border.
(The other two border types, IPL_BORDER_REFLECT and IPL_BORDER_WRAP, are currently unsupported.)
The value parameter specifies the border value if bordertype=IPL_BORDER_CONSTANT.
The function cvCopyMakeBorder
copies the source 2D array into interior of destination array
and makes a border of the specified type around the copied area.
The function is useful when one needs to emulate border type that is different from the one embedded into a specific
algorithm implementation. For example, morphological functions, as well as most of other filtering functions in OpenCV,
internally use replication border type, while the user may need zero border or a border, filled with 1's or 255's.
Calculates integral images
void cvIntegral( const CvArr* image, CvArr* sum, CvArr* sqsum=NULL, CvArr* tilted_sum=NULL );
image: the source image, W×H, 8-bit or floating-point (32f or 64f).
sum: the integral image, (W+1)×(H+1), 32-bit integer or double precision floating-point (64f).
sqsum: the integral image for squared pixel values, (W+1)×(H+1), double precision floating-point (64f).
tilted_sum: the integral for the image rotated by 45 degrees, (W+1)×(H+1), of the same data type as sum.
The function cvIntegral calculates one or more integral images for the source image as follows:
sum(X,Y) = sum_{x<X, y<Y} image(x,y)
sqsum(X,Y) = sum_{x<X, y<Y} image(x,y)^{2}
tilted_sum(X,Y) = sum_{y<Y, abs(x-X)<y} image(x,y)
Using these integral images, one may calculate sum, mean, standard deviation over arbitrary upright or rotated rectangular region of the image in a constant time, for example:
sum_{x1<=x<x2, y1<=y<y2} image(x,y) = sum(x2,y2) - sum(x1,y2) - sum(x2,y1) + sum(x1,y1)
Using integral images it is possible to do variablesize image blurring, block correlation etc. In case of multichannel input images the integral images must have the same number of channels, and every channel is processed independently.
Converts image from one color space to another
void cvCvtColor( const CvArr* src, CvArr* dst, int code );
The function cvCvtColor
converts input image from one color space to another.
The function ignores the colorModel and channelSeq fields of the IplImage header,
so the source image color space should be specified correctly (including the order of the channels in the case
of RGB space; e.g. BGR means a 24-bit format with B_{0} G_{0} R_{0} B_{1} G_{1} R_{1} ... layout,
whereas RGB means a 24-bit format with R_{0} G_{0} B_{0} R_{1} G_{1} B_{1} ... layout).
The conventional range for R,G,B channel values is 0..255 for 8-bit images, 0..65535 for 16-bit images and 0..1 for floating-point images.
The function can do the following transformations:
RGB[A] -> Gray:  Y <- 0.299*R + 0.587*G + 0.114*B
Gray -> RGB[A]:  R <- Y, G <- Y, B <- Y, A <- 0
CV_BGR2XYZ, CV_RGB2XYZ, CV_XYZ2BGR, CV_XYZ2RGB
):
| X |    | 0.412453  0.357580  0.180423 |   | R |
| Y | <- | 0.212671  0.715160  0.072169 | * | G |
| Z |    | 0.019334  0.119193  0.950227 |   | B |

| R |    |  3.240479  -1.53715   -0.498535 |   | X |
| G | <- | -0.969256   1.875991   0.041556 | * | Y |
| B |    |  0.055648  -0.204043   1.057311 |   | Z |

X, Y and Z cover the whole value range (in the case of floating-point images Z may exceed 1).
CV_BGR2YCrCb, CV_RGB2YCrCb, CV_YCrCb2BGR, CV_YCrCb2RGB
)
Y  <- 0.299*R + 0.587*G + 0.114*B
Cr <- (R-Y)*0.713 + delta
Cb <- (B-Y)*0.564 + delta

R <- Y + 1.403*(Cr - delta)
G <- Y - 0.344*(Cr - delta) - 0.714*(Cb - delta)
B <- Y + 1.773*(Cb - delta),

        { 128    for 8-bit images,
delta = { 32768  for 16-bit images,
        { 0.5    for floating-point images

Y, Cr and Cb cover the whole value range.
CV_BGR2HSV, CV_RGB2HSV, CV_HSV2BGR, CV_HSV2RGB
)
// In the case of 8-bit and 16-bit images,
// R, G and B are converted to floating-point format and scaled to fit the 0..1 range

V <- max(R,G,B)
S <- (V - min(R,G,B))/V   if V≠0, 0 otherwise

     (G - B)*60/S,        if V=R
H <- 180 + (B - R)*60/S,  if V=G
     240 + (R - G)*60/S,  if V=B

if H<0 then H <- H+360

On output 0≤V≤1, 0≤S≤1, 0≤H≤360.
The values are then converted to the destination data type:
    8-bit images: V <- V*255, S <- S*255, H <- H/2 (to fit to 0..255)
    16-bit images (currently not supported): V <- V*65535, S <- S*65535, H <- H
    32-bit images: H, S, V are left as is
CV_BGR2HLS, CV_RGB2HLS, CV_HLS2BGR, CV_HLS2RGB
)
// In the case of 8-bit and 16-bit images,
// R, G and B are converted to floating-point format and scaled to fit the 0..1 range

V_{max} <- max(R,G,B)
V_{min} <- min(R,G,B)

L <- (V_{max} + V_{min})/2

S <- (V_{max} - V_{min})/(V_{max} + V_{min})        if L < 0.5
     (V_{max} - V_{min})/(2 - (V_{max} + V_{min}))  if L ≥ 0.5

     (G - B)*60/S,        if V_{max}=R
H <- 180 + (B - R)*60/S,  if V_{max}=G
     240 + (R - G)*60/S,  if V_{max}=B

if H<0 then H <- H+360

On output 0≤L≤1, 0≤S≤1, 0≤H≤360.
The values are then converted to the destination data type:
    8-bit images: L <- L*255, S <- S*255, H <- H/2
    16-bit images (currently not supported): L <- L*65535, S <- S*65535, H <- H
    32-bit images: H, L, S are left as is
CV_BGR2Lab, CV_RGB2Lab, CV_Lab2BGR, CV_Lab2RGB
)
// In the case of 8-bit and 16-bit images,
// R, G and B are converted to floating-point format and scaled to fit the 0..1 range

// convert R,G,B to CIE XYZ
| X |    | 0.412453  0.357580  0.180423 |   | R |
| Y | <- | 0.212671  0.715160  0.072169 | * | G |
| Z |    | 0.019334  0.119193  0.950227 |   | B |

X <- X/Xn, where Xn = 0.950456
Z <- Z/Zn, where Zn = 1.088754

L <- 116*Y^{1/3}  for Y>0.008856
L <- 903.3*Y      for Y<=0.008856

a <- 500*(f(X)-f(Y)) + delta
b <- 200*(f(Y)-f(Z)) + delta

where f(t) = t^{1/3}           for t>0.008856
      f(t) = 7.787*t + 16/116  for t<=0.008856

and delta = 128 for 8-bit images, 0 for floating-point images

On output 0≤L≤100, -127≤a≤127, -127≤b≤127.
The values are then converted to the destination data type:
    8-bit images: L <- L*255/100, a <- a + 128, b <- b + 128
    16-bit images are currently not supported
    32-bit images: L, a, b are left as is
CV_BGR2Luv, CV_RGB2Luv, CV_Luv2BGR, CV_Luv2RGB
)
// In the case of 8-bit and 16-bit images,
// R, G and B are converted to floating-point format and scaled to fit the 0..1 range

// convert R,G,B to CIE XYZ
| X |    | 0.412453  0.357580  0.180423 |   | R |
| Y | <- | 0.212671  0.715160  0.072169 | * | G |
| Z |    | 0.019334  0.119193  0.950227 |   | B |

L <- 116*Y^{1/3} - 16  for Y>0.008856
L <- 903.3*Y           for Y<=0.008856

u' <- 4*X/(X + 15*Y + 3*Z)
v' <- 9*Y/(X + 15*Y + 3*Z)

u <- 13*L*(u' - u_{n}), where u_{n}=0.19793943
v <- 13*L*(v' - v_{n}), where v_{n}=0.46831096

On output 0≤L≤100, -134≤u≤220, -140≤v≤122.
The values are then converted to the destination data type:
    8-bit images: L <- L*255/100, u <- (u + 134)*255/354, v <- (v + 140)*255/256
    16-bit images are currently not supported
    32-bit images: L, u, v are left as is

The above formulae for converting RGB to/from various color spaces have been taken from multiple sources on the Web, primarily from the Color Space Conversions document ([Ford98]) at Charles Poynton's site.
CV_BayerBG2BGR, CV_BayerGB2BGR, CV_BayerRG2BGR, CV_BayerGR2BGR,
CV_BayerBG2RGB, CV_BayerGB2RGB, CV_BayerRG2RGB, CV_BayerGR2RGB
)
The Bayer pattern is widely used in CCD and CMOS cameras. It allows one to get a color picture out of a single plane where R, G and B pixels (sensors of a particular component) are interleaved like this:
R G R G R
G B G B G
R G R G R
G B G B G
R G R G R
G B G B G
The output RGB components of a pixel are interpolated from 1, 2 or 4 neighbors of the pixel having the same color. There are several modifications of the above pattern that can be achieved by shifting the pattern one pixel left and/or one pixel up. The two letters C_{1} and C_{2} in the conversion constants CV_BayerC_{1}C_{2}2{BGR,RGB} indicate the particular pattern type: these are the components from the second row, second and third columns, respectively. For example, the above pattern has the very popular "BG" type.
Applies fixedlevel threshold to array elements
double cvThreshold( const CvArr* src, CvArr* dst, double threshold, double max_value, int threshold_type );
dst: the destination array; must be either the same type as src or 8-bit.
max_value: the maximum value to use with the CV_THRESH_BINARY and
CV_THRESH_BINARY_INV thresholding types.
The function cvThreshold applies fixed-level thresholding to a single-channel array.
The function is typically used to get a bi-level (binary) image out of a grayscale image
(cvCmpS could also be used for this purpose) or for removing noise, i.e. filtering out pixels with too small or too large values.
The function supports several types of thresholding, determined by threshold_type:
threshold_type=CV_THRESH_BINARY:
    dst(x,y) = max_value, if src(x,y)>threshold
               0, otherwise

threshold_type=CV_THRESH_BINARY_INV:
    dst(x,y) = 0, if src(x,y)>threshold
               max_value, otherwise

threshold_type=CV_THRESH_TRUNC:
    dst(x,y) = threshold, if src(x,y)>threshold
               src(x,y), otherwise

threshold_type=CV_THRESH_TOZERO:
    dst(x,y) = src(x,y), if src(x,y)>threshold
               0, otherwise

threshold_type=CV_THRESH_TOZERO_INV:
    dst(x,y) = 0, if src(x,y)>threshold
               src(x,y), otherwise
And this is the visual description of thresholding types:
Also, the special value CV_THRESH_OTSU may be combined with
one of the above values. In this case the function determines the optimal threshold
value using the Otsu algorithm and uses it instead of the specified threshold.
The function returns the computed threshold value.
Currently, the Otsu method is implemented only for 8-bit images.
Applies adaptive threshold to array
void cvAdaptiveThreshold( const CvArr* src, CvArr* dst, double max_value, int adaptive_method=CV_ADAPTIVE_THRESH_MEAN_C, int threshold_type=CV_THRESH_BINARY, int block_size=3, double param1=5 );
CV_THRESH_BINARY
and CV_THRESH_BINARY_INV
.
CV_ADAPTIVE_THRESH_MEAN_C
or CV_ADAPTIVE_THRESH_GAUSSIAN_C
(see the discussion).
CV_THRESH_BINARY,
CV_THRESH_BINARY_INV
For the methods CV_ADAPTIVE_THRESH_MEAN_C and CV_ADAPTIVE_THRESH_GAUSSIAN_C it is a constant subtracted from the mean or weighted mean (see the discussion); it may be negative.
The function cvAdaptiveThreshold transforms a grayscale image to a binary image according to the formulas:
threshold_type=CV_THRESH_BINARY:
    dst(x,y) = max_value, if src(x,y) > T(x,y)
               0, otherwise
threshold_type=CV_THRESH_BINARY_INV:
    dst(x,y) = 0, if src(x,y) > T(x,y)
               max_value, otherwise
where T(x,y) is a threshold calculated individually for each pixel.
For the method CV_ADAPTIVE_THRESH_MEAN_C it is the mean of a block_size × block_size pixel neighborhood, minus param1. For the method CV_ADAPTIVE_THRESH_GAUSSIAN_C it is a weighted sum (with Gaussian weights) over a block_size × block_size pixel neighborhood, minus param1.
Downsamples image
void cvPyrDown( const CvArr* src, CvArr* dst, int filter=CV_GAUSSIAN_5x5 );
CV_GAUSSIAN_5x5
is
currently supported.
The function cvPyrDown performs the downsampling step of the Gaussian pyramid decomposition. First it convolves the source image with the specified filter and then downsamples the image by rejecting even rows and columns.
Upsamples image
void cvPyrUp( const CvArr* src, CvArr* dst, int filter=CV_GAUSSIAN_5x5 );
CV_GAUSSIAN_5x5
is
currently supported.
The function cvPyrUp performs the upsampling step of the Gaussian pyramid decomposition. First it upsamples the source image by injecting even zero rows and columns and then convolves the result with the specified filter multiplied by 4 for interpolation. So the destination image is four times larger than the source image.
Connected component
typedef struct CvConnectedComp
{
    double area;  /* area of the segmented component */
    float value;  /* gray scale value of the segmented component */
    CvRect rect;  /* ROI of the segmented component */
} CvConnectedComp;
Fills a connected component with given color
void cvFloodFill( CvArr* image, CvPoint seed_point, CvScalar new_val, CvScalar lo_diff=cvScalarAll(0), CvScalar up_diff=cvScalarAll(0), CvConnectedComp* comp=NULL, int flags=4, CvArr* mask=NULL );
#define CV_FLOODFILL_FIXED_RANGE (1 << 16)
#define CV_FLOODFILL_MASK_ONLY   (1 << 17)
new_val is ignored), but fills the mask (which must be non-NULL in this case).
image. If not NULL, the function uses and updates the mask, so the user takes responsibility for initializing the mask content. Flood-filling can't go across non-zero pixels in the mask; for example, an edge detector output can be used as a mask to stop filling at edges. It is also possible to use the same mask in multiple calls to the function to make sure the filled areas do not overlap. Note: because the mask is larger than the filled image, a pixel in mask that corresponds to pixel (x,y) in image will have coordinates (x+1,y+1).
The function cvFloodFill fills a connected component starting from the seed point with the specified color. The connectivity is determined by the closeness of pixel values. The pixel at (x, y) is considered to belong to the repainted domain if:

src(x',y')-lo_diff <= src(x,y) <= src(x',y')+up_diff, grayscale image, floating range
src(seed.x,seed.y)-lo_diff <= src(x,y) <= src(seed.x,seed.y)+up_diff, grayscale image, fixed range
src(x',y')_{c}-lo_diff_{c} <= src(x,y)_{c} <= src(x',y')_{c}+up_diff_{c} for every color component c in {r,g,b}, color image, floating range
src(seed.x,seed.y)_{c}-lo_diff_{c} <= src(x,y)_{c} <= src(seed.x,seed.y)_{c}+up_diff_{c} for every color component c in {r,g,b}, color image, fixed range

where src(x',y') is the value of one of the pixel's neighbors. That is, to be added to the connected component, a pixel's color/brightness should be close enough to the color/brightness of one of its already-filled neighbors (floating range) or to that of the seed point (fixed range).
Finds contours in binary image
int cvFindContours( CvArr* image, CvMemStorage* storage, CvSeq** first_contour, int header_size=sizeof(CvContour), int mode=CV_RETR_LIST, int method=CV_CHAIN_APPROX_SIMPLE, CvPoint offset=cvPoint(0,0) );
binary
. To get such a binary image
from grayscale, one may use cvThreshold, cvAdaptiveThreshold or cvCanny.
The function modifies the source image content.
method
=CV_CHAIN_CODE,
and >=sizeof(CvContour) otherwise.
CV_RETR_EXTERNAL - retrieves only the extreme outer contours
CV_RETR_LIST - retrieves all the contours and puts them in the list
CV_RETR_CCOMP - retrieves all the contours and organizes them into a two-level hierarchy: the top level contains the external boundaries of the components, the second level contains the boundaries of the holes
CV_RETR_TREE - retrieves all the contours and reconstructs the full hierarchy of nested contours
(all the modes, except CV_RETR_RUNS, which uses a built-in approximation).
CV_CHAIN_CODE - outputs contours in the Freeman chain code. All other methods output polygons (sequences of vertices);
CV_CHAIN_APPROX_NONE - translates all the points from the chain code into points;
CV_CHAIN_APPROX_SIMPLE - compresses horizontal, vertical, and diagonal segments, that is, the function leaves only their end points;
CV_CHAIN_APPROX_TC89_L1, CV_CHAIN_APPROX_TC89_KCOS - apply one of the flavors of the Teh-Chin chain approximation algorithm;
CV_LINK_RUNS - uses a completely different contour retrieval algorithm, via linking of horizontal segments of 1's. Only the CV_RETR_LIST retrieval mode can be used with this method.
The function cvFindContours retrieves contours from the binary image and returns the number of retrieved contours. The pointer first_contour is filled in by the function. It will contain a pointer to the first outermost contour, or NULL if no contours are detected (if the image is completely black). Other contours may be reached from first_contour using the h_next and v_next links. The sample in the cvDrawContours discussion shows how to use contours for connected component detection. Contours can also be used for shape analysis and object recognition - see squares.c in the OpenCV sample directory.
Initializes contour scanning process
CvContourScanner cvStartFindContours( CvArr* image, CvMemStorage* storage, int header_size=sizeof(CvContour), int mode=CV_RETR_LIST, int method=CV_CHAIN_APPROX_SIMPLE, CvPoint offset=cvPoint(0,0) );
method
=CV_CHAIN_CODE,
and >=sizeof(CvContour) otherwise.
The function cvStartFindContours initializes and returns a pointer to the contour scanner. The scanner is then used in cvFindNextContour to retrieve the rest of the contours.
Finds next contour in the image
CvSeq* cvFindNextContour( CvContourScanner scanner );
cvStartFindContours
.
The function cvFindNextContour locates and retrieves the next contour in the image and returns a pointer to it. The function returns NULL if there are no more contours.
Replaces retrieved contour
void cvSubstituteContour( CvContourScanner scanner, CvSeq* new_contour );
The function cvSubstituteContour replaces the retrieved contour, which was returned from the preceding call to cvFindNextContour and stored inside the contour scanner state, with the user-specified contour. The contour is inserted into the resulting structure, list, two-level hierarchy, or tree, depending on the retrieval mode. If the parameter new_contour=NULL, the retrieved contour is not included in the resulting structure, nor are any of its children that might be added to this structure later.
Finishes scanning process
CvSeq* cvEndFindContours( CvContourScanner* scanner );
The function cvEndFindContours finishes the scanning process and returns a pointer to the first contour on the highest level.
Does image segmentation by pyramids
void cvPyrSegmentation( IplImage* src, IplImage* dst, CvMemStorage* storage, CvSeq** comp, int level, double threshold1, double threshold2 );
The function cvPyrSegmentation implements image segmentation by pyramids. The pyramid is built up to the level level. The links between any pixel a on level i and its candidate father pixel b on the adjacent level are established if p(c(a),c(b)) < threshold1. After the connected components are defined, they are joined into several clusters. Any two segments A and B belong to the same cluster if p(c(A),c(B)) < threshold2. If the input image has only one channel, then p(c¹,c²)=|c¹-c²|. If the input image has three channels (red, green and blue), then
p(c¹,c²)=0.3·|c¹_{r}-c²_{r}|+0.59·|c¹_{g}-c²_{g}|+0.11·|c¹_{b}-c²_{b}|.
There may be more than one connected component per cluster. The images src and dst should be 8-bit single-channel or 3-channel images of equal size.
Does MeanShift image segmentation
void cvPyrMeanShiftFiltering( const CvArr* src, CvArr* dst, double sp, double sr, int max_level=1, CvTermCriteria termcrit=cvTermCriteria(CV_TERMCRIT_ITER+CV_TERMCRIT_EPS,5,1));
The function cvPyrMeanShiftFiltering implements the filtering stage of mean-shift segmentation, that is, the output of the function is the filtered "posterized" image with color gradients and fine-grain texture flattened. At every pixel (X,Y) of the input image (or downsized input image, see below) the function executes mean-shift iterations, that is, the pixel (X,Y) neighborhood in the joint space-color hyperspace is considered:

{(x,y): X-sp≤x≤X+sp && Y-sp≤y≤Y+sp && ||(R,G,B)-(r,g,b)|| ≤ sr},

where (R,G,B) and (r,g,b) are the vectors of color components at (X,Y) and (x,y), respectively (though the algorithm does not depend on the color space used, so any 3-component color space can be used instead). Over the neighborhood the average spatial value (X',Y') and average color vector (R',G',B') are found and they act as the neighborhood center on the next iteration:

(X,Y)~(X',Y'), (R,G,B)~(R',G',B').

After the iterations are over, the color components of the initial pixel (that is, the pixel from where the iterations started) are set to the final value (average color at the last iteration):

I(X,Y) <- (R*,G*,B*).

When max_level > 0, a Gaussian pyramid of max_level+1 levels is built, and the above procedure is run on the smallest layer. After that, the results are propagated to the larger layer and the iterations are run again only on those pixels where the layer colors differ much (> sr) from the lower-resolution layer, that is, the boundaries of the color regions are clarified. Note that the results will actually be different from the ones obtained by running the mean-shift procedure on the whole original image (i.e. when max_level==0).
Does watershed segmentation
void cvWatershed( const CvArr* image, CvArr* markers );
The function cvWatershed implements one of the variants of the watershed, non-parametric, marker-based segmentation algorithm described in [Meyer92]. Before passing the image to the function, the user has to outline roughly the desired regions in the image markers with positive (>0) indices, i.e. every region is represented as one or more connected components with the pixel values 1, 2, 3, etc. Those components will be the "seeds" of the future image regions. All the other pixels in markers, whose relation to the outlined regions is not known and should be determined by the algorithm, should be set to 0's. On the output of the function, each pixel in markers is set to one of the values of the "seed" components, or to -1 at boundaries between the regions. Note that it is not necessary that every two neighbor connected components be separated by a watershed boundary (-1 pixels), for example, when such tangent components exist in the initial marker image. A visual demonstration and usage example of the function can be found in the OpenCV samples directory; see the watershed.cpp demo.
Calculates all moments up to third order of a polygon or rasterized shape
void cvMoments( const CvArr* arr, CvMoments* moments, int binary=0 );
The function cvMoments calculates spatial and central moments up to the third order and writes them to moments. The moments may then be used to calculate the gravity center of the shape, its area, principal axes, and various shape characteristics, including the 7 Hu invariants.
Retrieves spatial moment from moment state structure
double cvGetSpatialMoment( CvMoments* moments, int x_order, int y_order );
x_order
>= 0.
y_order
>= 0 and x_order
+ y_order
<= 3.
The function cvGetSpatialMoment
retrieves the spatial moment, which in case of
image moments is defined as:
M_{x_order,y_order}=sum_{x,y}(I(x,y)•x^{x_order}•y^{y_order})
where I(x,y)
is the intensity of the pixel (x, y)
.
Retrieves central moment from moment state structure
double cvGetCentralMoment( CvMoments* moments, int x_order, int y_order );
x_order
>= 0.
y_order
>= 0 and x_order
+ y_order
<= 3.
The function cvGetCentralMoment
retrieves the central moment, which in case of
image moments is defined as:
μ_{x_order,y_order}=sum_{x,y}(I(x,y)•(x-x_{c})^{x_order}•(y-y_{c})^{y_order}),
where x_{c}=M_{10}/M_{00}, y_{c}=M_{01}/M_{00} are the coordinates of the gravity center.
Retrieves normalized central moment from moment state structure
double cvGetNormalizedCentralMoment( CvMoments* moments, int x_order, int y_order );
x_order
>= 0.
y_order
>= 0 and x_order
+ y_order
<= 3.
The function cvGetNormalizedCentralMoment
retrieves the normalized central moment:
η_{x_order,y_order}= μ_{x_order,y_order}/M_{00}^{((y_order+x_order)/2+1)}
Calculates seven Hu invariants
void cvGetHuMoments( CvMoments* moments, CvHuMoments* hu_moments );
The function cvGetHuMoments
calculates seven Hu invariants that are defined as:
h_{1}=η_{20}+η_{02}
h_{2}=(η_{20}-η_{02})²+4η_{11}²
h_{3}=(η_{30}-3η_{12})²+(3η_{21}-η_{03})²
h_{4}=(η_{30}+η_{12})²+(η_{21}+η_{03})²
h_{5}=(η_{30}-3η_{12})(η_{30}+η_{12})[(η_{30}+η_{12})²-3(η_{21}+η_{03})²]+(3η_{21}-η_{03})(η_{21}+η_{03})[3(η_{30}+η_{12})²-(η_{21}+η_{03})²]
h_{6}=(η_{20}-η_{02})[(η_{30}+η_{12})²-(η_{21}+η_{03})²]+4η_{11}(η_{30}+η_{12})(η_{21}+η_{03})
h_{7}=(3η_{21}-η_{03})(η_{30}+η_{12})[(η_{30}+η_{12})²-3(η_{21}+η_{03})²]-(η_{30}-3η_{12})(η_{21}+η_{03})[3(η_{30}+η_{12})²-(η_{21}+η_{03})²]
where η_{i,j} are normalized central moments of the 2nd and 3rd orders. The computed values are proved to be invariant to image scaling, rotation, and reflection, except for the seventh one, whose sign is changed by reflection.
Finds lines in binary image using Hough transform
CvSeq* cvHoughLines2( CvArr* image, void* line_storage, int method, double rho, double theta, int threshold, double param1=0, double param2=0 );
cols or rows will contain the number of lines detected. If line_storage is a matrix and the actual number of lines exceeds the matrix size, the maximum possible number of lines is returned (in the case of the standard Hough transform the lines are sorted by accumulator value).
CV_HOUGH_STANDARD - classical or standard Hough transform. Every line is represented by two floating-point numbers (ρ, θ), where ρ is the distance between the point (0,0) and the line, and θ is the angle between the x-axis and the normal to the line. Thus, the matrix must be (the created sequence will be) of CV_32FC2 type.
CV_HOUGH_PROBABILISTIC - probabilistic Hough transform (more efficient when the picture contains a few long linear segments). It returns line segments rather than whole lines. Every segment is represented by its starting and ending points, and the matrix must be (the created sequence will be) of CV_32SC4 type.
CV_HOUGH_MULTI_SCALE - multi-scale variant of the classical Hough transform. The lines are encoded the same way as in CV_HOUGH_STANDARD.
threshold
.
rho
.
(The coarse distance resolution will be rho
and the accurate resolution will be (rho
/ param1
)).
theta
.
(The coarse angle resolution will be theta
and the accurate resolution will be (theta
/ param2
)).
The function cvHoughLines2
implements a few variants of Hough transform for line detection.
/* This is a standalone program. Pass an image name as a first parameter
   of the program. Switch between standard and probabilistic Hough transform
   by changing "#if 1" to "#if 0" and back */
#include <cv.h>
#include <highgui.h>
#include <math.h>

int main(int argc, char** argv)
{
    IplImage* src;
    if( argc == 2 && (src=cvLoadImage(argv[1], 0))!= 0)
    {
        IplImage* dst = cvCreateImage( cvGetSize(src), 8, 1 );
        IplImage* color_dst = cvCreateImage( cvGetSize(src), 8, 3 );
        CvMemStorage* storage = cvCreateMemStorage(0);
        CvSeq* lines = 0;
        int i;
        cvCanny( src, dst, 50, 200, 3 );
        cvCvtColor( dst, color_dst, CV_GRAY2BGR );
#if 1
        lines = cvHoughLines2( dst, storage, CV_HOUGH_STANDARD, 1, CV_PI/180, 100, 0, 0 );
        for( i = 0; i < MIN(lines->total,100); i++ )
        {
            float* line = (float*)cvGetSeqElem(lines,i);
            float rho = line[0];
            float theta = line[1];
            CvPoint pt1, pt2;
            double a = cos(theta), b = sin(theta);
            double x0 = a*rho, y0 = b*rho;
            pt1.x = cvRound(x0 + 1000*(-b));
            pt1.y = cvRound(y0 + 1000*(a));
            pt2.x = cvRound(x0 - 1000*(-b));
            pt2.y = cvRound(y0 - 1000*(a));
            cvLine( color_dst, pt1, pt2, CV_RGB(255,0,0), 3, 8 );
        }
#else
        lines = cvHoughLines2( dst, storage, CV_HOUGH_PROBABILISTIC, 1, CV_PI/180, 50, 50, 10 );
        for( i = 0; i < lines->total; i++ )
        {
            CvPoint* line = (CvPoint*)cvGetSeqElem(lines,i);
            cvLine( color_dst, line[0], line[1], CV_RGB(255,0,0), 3, 8 );
        }
#endif
        cvNamedWindow( "Source", 1 );
        cvShowImage( "Source", src );
        cvNamedWindow( "Hough", 1 );
        cvShowImage( "Hough", color_dst );
        cvWaitKey(0);
    }
}
This is the sample picture the function parameters have been tuned for:
And this is the output of the above program in case of probabilistic Hough transform ("#if 0" case):
Finds circles in grayscale image using Hough transform
CvSeq* cvHoughCircles( CvArr* image, void* circle_storage, int method, double dp, double min_dist, double param1=100, double param2=100, int min_radius=0, int max_radius=0 );
cols or rows will contain the number of circles detected. If circle_storage is a matrix and the actual number of circles exceeds the matrix size, the maximum possible number of circles is returned. Every circle is encoded as 3 floating-point numbers: the center coordinates (x,y) and the radius.
CV_HOUGH_GRADIENT
, which is basically 21HT, described in
[Yuen03].
For CV_HOUGH_GRADIENT it is the higher threshold of the two passed to the Canny edge detector (the lower one is half as large).
For CV_HOUGH_GRADIENT it is the accumulator threshold at the center detection stage. The smaller it is, the more false circles may be detected. Circles corresponding to larger accumulator values will be returned first.
max(image_width, image_height)
.
The function cvHoughCircles
finds circles in grayscale image using some modification of Hough transform.
#include <cv.h>
#include <highgui.h>
#include <math.h>

int main(int argc, char** argv)
{
    IplImage* img;
    if( argc == 2 && (img=cvLoadImage(argv[1], 1))!= 0)
    {
        IplImage* gray = cvCreateImage( cvGetSize(img), 8, 1 );
        CvMemStorage* storage = cvCreateMemStorage(0);
        CvSeq* circles;
        int i;
        cvCvtColor( img, gray, CV_BGR2GRAY );
        cvSmooth( gray, gray, CV_GAUSSIAN, 9, 9 ); // smooth it, otherwise a lot of false circles may be detected
        circles = cvHoughCircles( gray, storage, CV_HOUGH_GRADIENT, 2, gray->height/4, 200, 100 );
        for( i = 0; i < circles->total; i++ )
        {
            float* p = (float*)cvGetSeqElem( circles, i );
            cvCircle( img, cvPoint(cvRound(p[0]),cvRound(p[1])), 3, CV_RGB(0,255,0), -1, 8, 0 );
            cvCircle( img, cvPoint(cvRound(p[0]),cvRound(p[1])), cvRound(p[2]), CV_RGB(255,0,0), 3, 8, 0 );
        }
        cvNamedWindow( "circles", 1 );
        cvShowImage( "circles", img );
        cvWaitKey(0);
    }
    return 0;
}
Calculates distance to closest zero pixel for all nonzero pixels of source image
void cvDistTransform( const CvArr* src, CvArr* dst, int distance_type=CV_DIST_L2, int mask_size=3, const float* mask=NULL, CvArr* labels=NULL );
distance_type==CV_DIST_L1, an 8-bit, single-channel destination array may also be used (in-place operation is supported in this case).
CV_DIST_L1, CV_DIST_L2, CV_DIST_C
or
CV_DIST_USER
.
For CV_DIST_L1 or CV_DIST_C the parameter is forced to 3, because a 3×3 mask gives the same result as 5×5 yet it is faster. When mask_size==0, a different, non-approximate algorithm is used to calculate distances.
src
and dst
, can now be used only with
mask_size
==3 or 5.
The function cvDistTransform calculates the approximate or exact distance from every binary image pixel to the nearest zero pixel. When mask_size==0, the function uses the accurate algorithm [Felzenszwalb04]. When mask_size==3 or 5, the function uses the approximate algorithm [Borgefors86].
Here is how the approximate algorithm works. For zero pixels the function sets the zero distance. For others it finds the shortest path to a zero pixel, consisting of basic shifts: horizontal, vertical, diagonal, or knight's move (the latter is available for the 5×5 mask). The overall distance is calculated as a sum of these basic distances. Because the distance function should be symmetric, all the horizontal and vertical shifts must have the same cost (denoted a), all the diagonal shifts must have the same cost (denoted b), and all knight's moves must have the same cost (denoted c).
For the CV_DIST_C and CV_DIST_L1 types the distance is calculated precisely, whereas for CV_DIST_L2 (Euclidean distance) the distance can be calculated only with some relative error (the 5×5 mask gives more accurate results). OpenCV uses the values suggested in [Borgefors86]:
CV_DIST_C  (3×3): a=1,     b=1
CV_DIST_L1 (3×3): a=1,     b=2
CV_DIST_L2 (3×3): a=0.955, b=1.3693
CV_DIST_L2 (5×5): a=1,     b=1.4,  c=2.1969
And below are samples of the distance field (the black (0) pixel is in the middle of a white square) in the case of a user-defined distance.

User-defined distance, 3×3 mask (a=1, b=1.5):
4.5 | 4   | 3.5 | 3 | 3.5 | 4   | 4.5
4   | 3   | 2.5 | 2 | 2.5 | 3   | 4
3.5 | 2.5 | 1.5 | 1 | 1.5 | 2.5 | 3.5
3   | 2   | 1   | 0 | 1   | 2   | 3
3.5 | 2.5 | 1.5 | 1 | 1.5 | 2.5 | 3.5
4   | 3   | 2.5 | 2 | 2.5 | 3   | 4
4.5 | 4   | 3.5 | 3 | 3.5 | 4   | 4.5

User-defined distance, 5×5 mask (a=1, b=1.5, c=2):
4.5 | 3.5 | 3   | 3 | 3   | 3.5 | 4.5
3.5 | 3   | 2   | 2 | 2   | 3   | 3.5
3   | 2   | 1.5 | 1 | 1.5 | 2   | 3
3   | 2   | 1   | 0 | 1   | 2   | 3
3   | 2   | 1.5 | 1 | 1.5 | 2   | 3
3.5 | 3   | 2   | 2 | 2   | 3   | 3.5
4   | 3.5 | 3   | 3 | 3   | 3.5 | 4
Typically, for fast coarse distance estimation CV_DIST_L2, 3×3 mask is used, and for more accurate distance estimation CV_DIST_L2, 5×5 mask is used.
When the output parameter labels is not NULL, then for every non-zero pixel the function also finds the nearest connected component consisting of zero pixels. The connected components themselves are found as contours at the beginning of the function.
In this mode the processing time is still O(N), where N is the number of pixels. Thus, the function provides a very fast way to compute an approximate Voronoi diagram for a binary image.
Inpaints the selected region in the image
void cvInpaint( const CvArr* src, const CvArr* mask, CvArr* dst, int flags, double inpaintRadius );
CV_INPAINT_NS - Navier-Stokes based method.
CV_INPAINT_TELEA - the method by Alexandru Telea [Telea04].
The function cvInpaint reconstructs the selected image area from the pixels near the area boundary. The function may be used to remove dust and scratches from a scanned photo, or to remove undesirable objects from still images or video.
Multidimensional histogram
typedef struct CvHistogram
{
    int type;
    CvArr* bins;
    float thresh[CV_MAX_DIM][2]; /* for uniform histograms */
    float** thresh2;             /* for non-uniform histograms */
    CvMatND mat;                 /* embedded matrix header for array histograms */
} CvHistogram;
Creates histogram
CvHistogram* cvCreateHist( int dims, int* sizes, int type, float** ranges=NULL, int uniform=1 );
CV_HIST_ARRAY means that the histogram data is represented as a multi-dimensional dense array CvMatND; CV_HIST_SPARSE means that the histogram data is represented as a multi-dimensional sparse array CvSparseMat.
The meaning of ranges depends on the uniform parameter value. The ranges are used when the histogram is calculated or backprojected to determine which histogram bin corresponds to which value/tuple of values from the input image(s).
If uniform!=0, then for every 0<=i<cDims, ranges[i] is an array of two numbers: the lower and upper boundaries for the i-th histogram dimension. The whole range [lower,upper] is then split into dims[i] equal parts to determine the i-th input tuple value range for every histogram bin. And if uniform=0, then the i-th element of the ranges array contains dims[i]+1 elements:
lower_{0}, upper_{0}, lower_{1}, upper_{1} == lower_{2}, ..., upper_{dims[i]-1},
where lower_{j} and upper_{j} are the lower and upper boundaries of the i-th input tuple value for the j-th bin, respectively.
In either case, input values that are beyond the specified range for a histogram bin are not counted by cvCalcHist and are filled with 0 by cvCalcBackProject.
The function cvCreateHist creates a histogram of the specified size and returns a pointer to the created histogram. If the array ranges is 0, the histogram bin ranges must be specified later via cvSetHistBinRanges; however, cvCalcHist and cvCalcBackProject may process 8-bit images without the bin ranges set, in which case they assume bins equally spaced in 0..255.
Sets bounds of histogram bins
void cvSetHistBinRanges( CvHistogram* hist, float** ranges, int uniform=1 );
The function cvSetHistBinRanges is a standalone function for setting bin ranges in the histogram. For a more detailed description of the parameters ranges and uniform, see the cvCalcHist function, which can initialize the ranges as well. Ranges for histogram bins must be set before the histogram is calculated or the back projection of the histogram is calculated.
Releases histogram
void cvReleaseHist( CvHistogram** hist );
The function cvReleaseHist releases the histogram (both the header and the data). The pointer to the histogram is cleared by the function. If the *hist pointer is already NULL, the function does nothing.
Clears histogram
void cvClearHist( CvHistogram* hist );
The function cvClearHist
sets all histogram bins to 0 in case of dense histogram and
removes all histogram bins in case of sparse array.
Makes a histogram out of array
CvHistogram* cvMakeHistHeaderForArray( int dims, int* sizes, CvHistogram* hist, float* data, float** ranges=NULL, int uniform=1 );
The function cvMakeHistHeaderForArray initializes the histogram, whose header and bins are allocated by the user. No call to cvReleaseHist is needed afterwards. Only dense histograms can be initialized this way. The function returns hist.
Queries value of histogram bin
#define cvQueryHistValue_1D( hist, idx0 ) \
    cvGetReal1D( (hist)->bins, (idx0) )
#define cvQueryHistValue_2D( hist, idx0, idx1 ) \
    cvGetReal2D( (hist)->bins, (idx0), (idx1) )
#define cvQueryHistValue_3D( hist, idx0, idx1, idx2 ) \
    cvGetReal3D( (hist)->bins, (idx0), (idx1), (idx2) )
#define cvQueryHistValue_nD( hist, idx ) \
    cvGetRealND( (hist)->bins, (idx) )
The macros cvQueryHistValue_*D return the value of the specified bin of 1D, 2D, 3D or ND histogram. In case of sparse histogram the function returns 0, if the bin is not present in the histogram, and no new bin is created.
Returns pointer to histogram bin
#define cvGetHistValue_1D( hist, idx0 ) \
    ((float*)cvPtr1D( (hist)->bins, (idx0), 0 ))
#define cvGetHistValue_2D( hist, idx0, idx1 ) \
    ((float*)cvPtr2D( (hist)->bins, (idx0), (idx1), 0 ))
#define cvGetHistValue_3D( hist, idx0, idx1, idx2 ) \
    ((float*)cvPtr3D( (hist)->bins, (idx0), (idx1), (idx2), 0 ))
#define cvGetHistValue_nD( hist, idx ) \
    ((float*)cvPtrND( (hist)->bins, (idx), 0 ))
The macros cvGetHistValue_*D return pointer to the specified bin of 1D, 2D, 3D or ND histogram. In case of sparse histogram the function creates a new bin and sets it to 0, unless it exists already.
Finds minimum and maximum histogram bins
void cvGetMinMaxHistValue( const CvHistogram* hist, float* min_value, float* max_value, int* min_idx=NULL, int* max_idx=NULL );
The function cvGetMinMaxHistValue finds the minimum and maximum histogram bins and their positions. All of the output arguments are optional. Among several extrema with the same value, the ones with the minimum index (in lexicographical order) are returned.
Normalizes histogram
void cvNormalizeHist( CvHistogram* hist, double factor );
The function cvNormalizeHist
normalizes the histogram bins by scaling them,
such that the sum of the bins becomes equal to factor
.
Thresholds histogram
void cvThreshHist( CvHistogram* hist, double threshold );
The function cvThreshHist
clears histogram bins
that are below the specified threshold.
Compares two dense histograms
double cvCompareHist( const CvHistogram* hist1, const CvHistogram* hist2, int method );
The function cvCompareHist compares two dense histograms using the specified method, as follows (H_{1} denotes the first histogram, H_{2} the second):

Correlation (method=CV_COMP_CORREL):
d(H_{1},H_{2}) = sum_{I}(H'_{1}(I)•H'_{2}(I)) / sqrt(sum_{I}[H'_{1}(I)²]•sum_{I}[H'_{2}(I)²])
where H'_{k}(I) = H_{k}(I) - 1/N•sum_{J}H_{k}(J) (N is the number of histogram bins)

Chi-square (method=CV_COMP_CHISQR):
d(H_{1},H_{2}) = sum_{I}[(H_{1}(I)-H_{2}(I))²/(H_{1}(I)+H_{2}(I))]

Intersection (method=CV_COMP_INTERSECT):
d(H_{1},H_{2}) = sum_{I}min(H_{1}(I),H_{2}(I))

Bhattacharyya distance (method=CV_COMP_BHATTACHARYYA):
d(H_{1},H_{2}) = sqrt(1 - sum_{I}(sqrt(H_{1}(I)•H_{2}(I))))

The function returns the d(H_{1},H_{2}) value.
Note: the method CV_COMP_BHATTACHARYYA
only works with normalized histograms.
To compare sparse histogram or more general sparse configurations of weighted points, consider using cvCalcEMD2 function.
Copies histogram
void cvCopyHist( const CvHistogram* src, CvHistogram** dst );
The function cvCopyHist
makes a copy of the histogram. If the second histogram
pointer *dst
is NULL, a new histogram of the same size as src
is created.
Otherwise, both histograms must have equal types and sizes.
Then the function copies the source histogram bins values to destination histogram and
sets the same bin values ranges as in src
.
Calculates histogram of image(s)
void cvCalcHist( IplImage** image, CvHistogram* hist, int accumulate=0, const CvArr* mask=NULL );
The function cvCalcHist
calculates the histogram of one or more singlechannel images.
The elements of a tuple that is used to increment a histogram bin are taken at the same
location from the corresponding input images.
#include <cv.h>
#include <highgui.h>

int main( int argc, char** argv )
{
    IplImage* src;
    if( argc == 2 && (src=cvLoadImage(argv[1], 1))!= 0)
    {
        IplImage* h_plane = cvCreateImage( cvGetSize(src), 8, 1 );
        IplImage* s_plane = cvCreateImage( cvGetSize(src), 8, 1 );
        IplImage* v_plane = cvCreateImage( cvGetSize(src), 8, 1 );
        IplImage* planes[] = { h_plane, s_plane };
        IplImage* hsv = cvCreateImage( cvGetSize(src), 8, 3 );
        int h_bins = 30, s_bins = 32;
        int hist_size[] = {h_bins, s_bins};
        float h_ranges[] = { 0, 180 }; /* hue varies from 0 (~0° red) to 180 (~360° red again) */
        float s_ranges[] = { 0, 255 }; /* saturation varies from 0 (black-gray-white) to 255 (pure spectrum color) */
        float* ranges[] = { h_ranges, s_ranges };
        int scale = 10;
        IplImage* hist_img = cvCreateImage( cvSize(h_bins*scale,s_bins*scale), 8, 3 );
        CvHistogram* hist;
        float max_value = 0;
        int h, s;
        cvCvtColor( src, hsv, CV_BGR2HSV );
        cvCvtPixToPlane( hsv, h_plane, s_plane, v_plane, 0 );
        hist = cvCreateHist( 2, hist_size, CV_HIST_ARRAY, ranges, 1 );
        cvCalcHist( planes, hist, 0, 0 );
        cvGetMinMaxHistValue( hist, 0, &max_value, 0, 0 );
        cvZero( hist_img );
        for( h = 0; h < h_bins; h++ )
        {
            for( s = 0; s < s_bins; s++ )
            {
                float bin_val = cvQueryHistValue_2D( hist, h, s );
                int intensity = cvRound(bin_val*255/max_value);
                cvRectangle( hist_img, cvPoint( h*scale, s*scale ),
                             cvPoint( (h+1)*scale - 1, (s+1)*scale - 1),
                             CV_RGB(intensity,intensity,intensity),
                             /* draw a grayscale histogram.
                                if you have an idea how to do it nicer, let us know */
                             CV_FILLED );
            }
        }
        cvNamedWindow( "Source", 1 );
        cvShowImage( "Source", src );
        cvNamedWindow( "H-S Histogram", 1 );
        cvShowImage( "H-S Histogram", hist_img );
        cvWaitKey(0);
    }
}
Calculates back projection
void cvCalcBackProject( IplImage** image, CvArr* back_project, const CvHistogram* hist );
The function cvCalcBackProject calculates the back projection of the histogram. For each tuple of pixels at the same position in all the input single-channel images, the function puts the value of the histogram bin corresponding to the tuple into the destination image. In terms of statistics, the value of each output image pixel is the probability of the observed tuple given the distribution (histogram).
For example, to find a red object in the picture, one may do the following:
1. Calculate a hue histogram for the red object, assuming the image contains only this object. The histogram is likely to have a strong maximum, corresponding to red color.
2. Calculate the back projection of the hue plane of the input image where the object is searched, using the histogram, and threshold the result.
3. Find connected components in the resulting picture and choose the right component using some additional criteria, for example, the largest connected component.
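The per-pixel operation can be sketched in plain C. This is an illustrative single-channel re-implementation only, not the OpenCV API; the uniform bin layout over 0..255 is an assumption of the sketch:

```c
#include <assert.h>

/* Back-project a 1D histogram over a single-channel 8-bit image:
   every output pixel receives the value of the histogram bin that
   its input pixel falls into. */
void back_project_1d( const unsigned char* img, float* dst,
                      int npixels, const float* hist, int bins )
{
    int i;
    for( i = 0; i < npixels; i++ )
    {
        int bin = img[i] * bins / 256;   /* uniform bins over 0..255 */
        dst[i] = hist[bin];
    }
}
```

With a normalized histogram, the output is directly interpretable as the per-pixel probability described above.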
Locates a template within image by histogram comparison
void cvCalcBackProjectPatch( IplImage** images, CvArr* dst, CvSize patch_size, CvHistogram* hist, int method, float factor );
The function cvCalcBackProjectPatch compares the histogram, computed over
each possible rectangular patch of the specified size in the input images,
with the input histogram and stores the results in the output map dst.
In pseudo-code the operation may be written as:
for (x,y) in images (until (x+patch_size.width-1,y+patch_size.height-1) is inside the images) do
    compute histogram over the ROI (x,y,x+patch_size.width,y+patch_size.height)
        in images (see cvCalcHist)
    normalize the histogram using the factor (see cvNormalizeHist)
    compare the normalized histogram with the input histogram hist
        using the specified method (see cvCompareHist)
    store the result to dst(x,y)
end for

See also a similar function cvMatchTemplate.
Divides one histogram by another
void cvCalcProbDensity( const CvHistogram* hist1, const CvHistogram* hist2, CvHistogram* dst_hist, double scale=255 );
The function cvCalcProbDensity calculates the object probability density from
the two histograms as:

dst_hist(I) = 0                          if hist1(I)==0
              scale                      if hist1(I)!=0 && hist2(I)>hist1(I)
              hist2(I)*scale/hist1(I)    if hist1(I)!=0 && hist2(I)<=hist1(I)

So the destination histogram bin values do not exceed scale.
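The per-bin rule above can be written out for flat bin arrays (an illustrative sketch only; the real function operates on CvHistogram structures):

```c
#include <assert.h>

/* Per-bin rule of cvCalcProbDensity for flat bin arrays. */
void calc_prob_density( const float* hist1, const float* hist2,
                        float* dst, int nbins, double scale )
{
    int i;
    for( i = 0; i < nbins; i++ )
    {
        if( hist1[i] == 0 )
            dst[i] = 0;                              /* undefined ratio -> 0   */
        else if( hist2[i] > hist1[i] )
            dst[i] = (float)scale;                   /* ratio > 1 -> clipped   */
        else
            dst[i] = (float)(hist2[i]*scale/hist1[i]); /* scaled ratio         */
    }
}
```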
Equalizes histogram of grayscale image
void cvEqualizeHist( const CvArr* src, CvArr* dst );
dst - destination image of the same size and type as src.
The function cvEqualizeHist equalizes the histogram of the input image
using the following algorithm:

1. calculate the histogram H for src.
2. normalize the histogram, so that the sum of histogram bins is 255.
3. compute the integral of the histogram: H'(i) = sum_{0≤j≤i}H(j)
4. transform the image using H' as a lookup table: dst(x,y)=H'(src(x,y))
The algorithm normalizes brightness and increases contrast of the image.
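The four steps above can be sketched in plain C for an 8-bit image stored as a flat array (an illustrative re-implementation, not cvEqualizeHist itself):

```c
#include <assert.h>

/* Histogram equalization of an 8-bit image, following the 4-step
   algorithm: histogram, normalization to sum 255, integral, lookup. */
void equalize_hist8( const unsigned char* src, unsigned char* dst, int n )
{
    int hist[256] = {0};
    float lut[256];
    float sum = 0;
    int i;

    for( i = 0; i < n; i++ )        /* 1. histogram */
        hist[src[i]]++;
    for( i = 0; i < 256; i++ )      /* 2+3. normalized running integral */
    {
        sum += hist[i]*255.f/n;
        lut[i] = sum;
    }
    for( i = 0; i < n; i++ )        /* 4. apply the lookup table */
        dst[i] = (unsigned char)(lut[src[i]] + 0.5f);
}
```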
Compares template against overlapped image regions
void cvMatchTemplate( const CvArr* image, const CvArr* templ, CvArr* result, int method );
result - map of comparison results; single-channel 32-bit floating-point.
If image is W×H and templ is w×h, then result must be (W-w+1)×(H-h+1).
The function cvMatchTemplate is similar to cvCalcBackProjectPatch.
It slides through image, compares overlapped patches of size w×h
against templ using the specified method and stores the comparison results
in result. Here are the formulae for the different comparison methods one may use
(I denotes the image, T - the template, R - the result.
The summation is done over the template and/or the image patch: x'=0..w-1, y'=0..h-1):
method=CV_TM_SQDIFF:
R(x,y)=sum_{x',y'}[T(x',y')-I(x+x',y+y')]^{2}

method=CV_TM_SQDIFF_NORMED:
R(x,y)=sum_{x',y'}[T(x',y')-I(x+x',y+y')]^{2}/sqrt[sum_{x',y'}T(x',y')^{2}•sum_{x',y'}I(x+x',y+y')^{2}]

method=CV_TM_CCORR:
R(x,y)=sum_{x',y'}[T(x',y')•I(x+x',y+y')]

method=CV_TM_CCORR_NORMED:
R(x,y)=sum_{x',y'}[T(x',y')•I(x+x',y+y')]/sqrt[sum_{x',y'}T(x',y')^{2}•sum_{x',y'}I(x+x',y+y')^{2}]

method=CV_TM_CCOEFF:
R(x,y)=sum_{x',y'}[T'(x',y')•I'(x+x',y+y')], where
    T'(x',y')=T(x',y') - 1/(w•h)•sum_{x",y"}T(x",y")
    I'(x+x',y+y')=I(x+x',y+y') - 1/(w•h)•sum_{x",y"}I(x+x",y+y")

method=CV_TM_CCOEFF_NORMED:
R(x,y)=sum_{x',y'}[T'(x',y')•I'(x+x',y+y')]/sqrt[sum_{x',y'}T'(x',y')^{2}•sum_{x',y'}I'(x+x',y+y')^{2}]

After the function finishes the comparison, the best matches can be found as global minimums (CV_TM_SQDIFF*) or maximums (CV_TM_CCORR* and CV_TM_CCOEFF*) using the cvMinMaxLoc function. In case of a color image and template, the summation in the numerator and each sum in the denominator are done over all the channels (and separate mean values are used for each channel).
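The simplest of the methods, CV_TM_SQDIFF, can be written out directly for single-channel float images stored as flat arrays (an illustrative sketch of the formula, not the optimized OpenCV implementation):

```c
#include <assert.h>

/* Squared-difference template matching (CV_TM_SQDIFF formula):
   image is W x H, template is w x h, result is (W-w+1) x (H-h+1). */
void match_template_sqdiff( const float* img, int W, int H,
                            const float* templ, int w, int h,
                            float* result )
{
    int x, y, xp, yp;
    for( y = 0; y <= H - h; y++ )
        for( x = 0; x <= W - w; x++ )
        {
            float s = 0;
            for( yp = 0; yp < h; yp++ )
                for( xp = 0; xp < w; xp++ )
                {
                    float d = templ[yp*w + xp] - img[(y+yp)*W + (x+xp)];
                    s += d*d;   /* [T(x',y')-I(x+x',y+y')]^2 */
                }
            result[y*(W - w + 1) + x] = s;
        }
}
```

A perfect match yields 0 in the result map, which is why the best match is found as the global minimum for the SQDIFF methods.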
Compares two shapes
double cvMatchShapes( const void* object1, const void* object2, int method, double parameter=0 );
The function cvMatchShapes compares two shapes. The 3 implemented methods all
use Hu moments (see cvGetHuMoments)
(A denotes object1, B - object2):

method=CV_CONTOUR_MATCH_I1:
I_{1}(A,B)=sum_{i=1..7}abs(1/m^{A}_{i} - 1/m^{B}_{i})

method=CV_CONTOUR_MATCH_I2:
I_{2}(A,B)=sum_{i=1..7}abs(m^{A}_{i} - m^{B}_{i})

method=CV_CONTOUR_MATCH_I3:
I_{3}(A,B)=sum_{i=1..7}abs(m^{A}_{i} - m^{B}_{i})/abs(m^{A}_{i})

where
m^{A}_{i}=sign(h^{A}_{i})•log(h^{A}_{i}), m^{B}_{i}=sign(h^{B}_{i})•log(h^{B}_{i}),
h^{A}_{i}, h^{B}_{i} - Hu moments of A and B, respectively.
Computes "minimal work" distance between two weighted point configurations
float cvCalcEMD2( const CvArr* signature1, const CvArr* signature2, int distance_type, CvDistanceFunction distance_func=NULL, const CvArr* cost_matrix=NULL, CvArr* flow=NULL, float* lower_bound=NULL, void* userdata=NULL ); typedef float (*CvDistanceFunction)(const float* f1, const float* f2, void* userdata);
signature1 - first signature, a size1×(dims+1) floating-point matrix.
Each row stores the point weight followed by the point coordinates. The matrix is allowed to
have a single column (weights only) if the user-defined cost matrix is used.
signature2 - second signature of the same format as signature1, though the number
of rows may be different. The total weights may be different, in which case an extra "dummy" point
is added to either signature1 or signature2.
distance_type - metric used; CV_DIST_L1, CV_DIST_L2, and CV_DIST_C stand for one of
the standard metrics; CV_DIST_USER means that a user-defined function distance_func or
pre-calculated cost_matrix is used.
distance_func - user-defined distance function; it takes the coordinates of two points and
returns the distance between them.
cost_matrix - the user-defined size1×size2 cost matrix.
At least one of cost_matrix and distance_func must be NULL.
Also, if a cost matrix is used, the lower boundary (see below) can not be calculated,
because it needs a metric function.
flow - the resultant size1×size2 flow matrix: flow_{ij} is a flow
from the i-th point of signature1 to the j-th point of signature2.
lower_bound - optional input/output parameter: lower boundary of the distance between
the two signatures, which is the distance between their mass centers.
The user must initialize *lower_bound.
If the calculated distance between mass centers is greater than or equal to *lower_bound
(meaning that the signatures are far enough apart) the function does not calculate EMD.
In any case *lower_bound is set to the calculated distance between mass centers
on return. Thus, if the user wants to calculate both the distance between mass centers and EMD,
*lower_bound should be set to 0.
userdata - pointer to optional data that is passed into the user-defined distance function.
The function cvCalcEMD2 computes the earth mover distance and/or a lower boundary of
the distance between the two weighted point configurations.
One of the applications described in [RubnerSept98] is multi-dimensional
histogram comparison for image retrieval.
EMD is a transportation problem that is solved using a modification of the simplex algorithm,
thus the complexity is exponential in the worst case, though it is much faster on average.
In case of a real metric the lower boundary can be calculated even faster (using a linear-time algorithm),
and it can be used to determine roughly whether the two
signatures are far enough apart that they cannot relate to the same object.
Approximates Freeman chain(s) with polygonal curve
CvSeq* cvApproxChains( CvSeq* src_seq, CvMemStorage* storage, int method=CV_CHAIN_APPROX_SIMPLE, double parameter=0, int minimal_perimeter=0, int recursive=0 );
minimal_perimeter - the function approximates only those contours whose perimeters are not
less than minimal_perimeter. Other chains are removed from the resulting structure.
recursive - if not 0, the function approximates all chains that are accessible from
src_seq by h_next or v_next links. If 0, the single chain is
approximated.
This is a stand-alone approximation routine. The function cvApproxChains works
exactly the same way as cvFindContours with the corresponding approximation flag.
The function returns a pointer to the first resultant contour.
Other approximated contours, if any, can be accessed via the v_next or
h_next fields of the returned structure.
Initializes chain reader
void cvStartReadChainPoints( CvChain* chain, CvChainPtReader* reader );
The function cvStartReadChainPoints
initializes a special reader
(see Dynamic Data Structures
for more information on sets and sequences).
Gets next chain point
CvPoint cvReadChainPoint( CvChainPtReader* reader );
The function cvReadChainPoint
returns the current chain point and updates the reader position.
Approximates polygonal curve(s) with desired precision
CvSeq* cvApproxPoly( const void* src_seq, int header_size, CvMemStorage* storage, int method, double parameter, int parameter2=0 );
method - approximation method; only CV_POLY_APPROX_DP is supported, which
corresponds to the Douglas-Peucker algorithm.
parameter - method-specific parameter; in case of CV_POLY_APPROX_DP
it is the desired approximation accuracy.
parameter2 - if src_seq is a sequence, it indicates whether the single sequence should
be approximated or all sequences on the same level or below src_seq
(see cvFindContours for a description of hierarchical contour structures). And if src_seq
is an array (CvMat*) of points, the parameter specifies whether the curve is closed
(parameter2!=0) or not (parameter2=0).
The function cvApproxPoly approximates one or more curves and returns the approximation
result[s]. When multiple curves are approximated, the resultant tree has the same structure as
the input one (1:1 correspondence).
Calculates upright bounding rectangle of point set
CvRect cvBoundingRect( CvArr* points, int update=0 );
points - either a sequence (CvSeq*, CvContour*) or a vector (CvMat*) of points,
or an 8-bit single-channel mask image (CvMat*, IplImage*),
in which non-zero pixels are considered.
update - the update flag; its interaction with the type of points is as follows:
- points is CvContour*, update=0: the bounding rectangle is not calculated, but is read from the rect field of the contour header.
- points is CvContour*, update=1: the bounding rectangle is calculated and written to the rect field of the contour header. For example, this mode is used by cvFindContours.
- points is CvSeq* or CvMat*: update is ignored; the bounding rectangle is calculated and returned.
The function cvBoundingRect returns the upright bounding rectangle for a 2D point set.
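The computation itself is a simple min/max scan, sketched here in plain C for integer points (illustrative only; the width/height use the inclusive-pixel convention of CvRect for integer point sets):

```c
#include <assert.h>

typedef struct { int x, y; } Pt;
typedef struct { int x, y, width, height; } Rect;

/* Upright bounding rectangle of a point set: track the min/max of
   each coordinate in a single pass. */
Rect bounding_rect( const Pt* pts, int n )
{
    Rect r;
    int i, minx = pts[0].x, maxx = pts[0].x;
    int miny = pts[0].y, maxy = pts[0].y;
    for( i = 1; i < n; i++ )
    {
        if( pts[i].x < minx ) minx = pts[i].x;
        if( pts[i].x > maxx ) maxx = pts[i].x;
        if( pts[i].y < miny ) miny = pts[i].y;
        if( pts[i].y > maxy ) maxy = pts[i].y;
    }
    r.x = minx; r.y = miny;
    r.width = maxx - minx + 1;
    r.height = maxy - miny + 1;
    return r;
}
```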
Calculates area of the whole contour or contour section
double cvContourArea( const CvArr* contour, CvSlice slice=CV_WHOLE_SEQ );
The function cvContourArea calculates the area of the whole contour or a contour section. In the latter
case the total area bounded by the contour arc and the chord connecting the 2 selected points is calculated as
shown in the picture below:
NOTE: Orientation of the contour affects the area sign, thus the function may return a
negative result. Use the fabs() function from the C runtime to get the absolute value of the
area.
Calculates contour perimeter or curve length
double cvArcLength( const void* curve, CvSlice slice=CV_WHOLE_SEQ, int is_closed=1 );
The function cvArcLength calculates the length of a curve as a sum of lengths of segments
between subsequent points.
Creates hierarchical representation of contour
CvContourTree* cvCreateContourTree( const CvSeq* contour, CvMemStorage* storage, double threshold );
The function cvCreateContourTree creates a binary tree representation for the input
contour and returns the pointer to its root. If the parameter threshold
is less than or equal to 0, the function creates a full binary tree
representation. If the threshold is greater than 0, the function creates a
representation with the precision threshold: if the vertices with the
interceptive area of its base line are less than threshold, the tree is not
built any further. The function returns the created tree.
Restores contour from tree
CvSeq* cvContourFromContourTree( const CvContourTree* tree, CvMemStorage* storage, CvTermCriteria criteria );
The function cvContourFromContourTree restores the contour from its binary tree
representation. The parameter criteria determines the accuracy and/or the
number of tree levels used for reconstruction, so it is possible to build an approximated contour.
The function returns the reconstructed contour.
Compares two contours using their tree representations
double cvMatchContourTrees( const CvContourTree* tree1, const CvContourTree* tree2, int method, double threshold );
method - similarity measure; only CV_CONTOUR_TREES_MATCH_I1 is supported.
The function cvMatchContourTrees calculates the value of the matching measure for
two contour trees. The similarity measure is calculated level by level from the
binary tree roots. If at a certain level the difference between contours becomes less than threshold,
the reconstruction process is interrupted and the current difference is returned.
Finds bounding rectangle for two given rectangles
CvRect cvMaxRect( const CvRect* rect1, const CvRect* rect2 );
The function cvMaxRect finds the minimum-area rectangle that contains both input rectangles.
Rotated 2D box
typedef struct CvBox2D
{
    CvPoint2D32f center;  /* center of the box */
    CvSize2D32f  size;    /* box width and length */
    float angle;          /* angle between the horizontal axis
                             and the first side (i.e. length) in degrees */
}
CvBox2D;
Initializes point sequence header from a point vector
CvSeq* cvPointSeqFromMat( int seq_kind, const CvArr* mat, CvContour* contour_header, CvSeqBlock* block );
seq_kind - type of the point sequence: point set (0), a curve (CV_SEQ_KIND_CURVE),
a closed curve (CV_SEQ_KIND_CURVE+CV_SEQ_FLAG_CLOSED) etc.
mat - input matrix; it should be a continuous 1-dimensional vector of points, that is,
it should have type CV_32SC2 or CV_32FC2.
The function cvPointSeqFromMat initializes a sequence header to create a "virtual" sequence whose
elements reside in the specified matrix. No data is copied. The initialized sequence header may be passed to
any function that takes a point sequence on input. No extra elements can be added to the sequence,
but some may be removed. The function is a specialized variant of
cvMakeSeqHeaderForArray and uses the latter internally.
It returns a pointer to the initialized contour header. Note that the bounding rectangle (the rect
field of the CvContour structure) is not initialized by the function. If you need one, use
cvBoundingRect.
Here is a simple usage example.
CvContour header;
CvSeqBlock block;
CvMat* vector = cvCreateMat( 1, 3, CV_32SC2 );

CV_MAT_ELEM( *vector, CvPoint, 0, 0 ) = cvPoint(100,100);
CV_MAT_ELEM( *vector, CvPoint, 0, 1 ) = cvPoint(100,200);
CV_MAT_ELEM( *vector, CvPoint, 0, 2 ) = cvPoint(200,100);

IplImage* img = cvCreateImage( cvSize(300,300), 8, 3 );
cvZero(img);

cvDrawContours( img,
    cvPointSeqFromMat(CV_SEQ_KIND_CURVE+CV_SEQ_FLAG_CLOSED,
                      vector, &header, &block),
    CV_RGB(255,0,0), CV_RGB(255,0,0), 0, 3, 8, cvPoint(0,0));
Finds box vertices
void cvBoxPoints( CvBox2D box, CvPoint2D32f pt[4] );
The function cvBoxPoints
calculates vertices of the input 2d box.
Here is the function code:
void cvBoxPoints( CvBox2D box, CvPoint2D32f pt[4] )
{
    double angle = box.angle*CV_PI/180.;
    float a = (float)cos(angle)*0.5f;
    float b = (float)sin(angle)*0.5f;

    pt[0].x = box.center.x - a*box.size.height - b*box.size.width;
    pt[0].y = box.center.y + b*box.size.height - a*box.size.width;
    pt[1].x = box.center.x + a*box.size.height - b*box.size.width;
    pt[1].y = box.center.y - b*box.size.height - a*box.size.width;
    pt[2].x = 2*box.center.x - pt[0].x;
    pt[2].y = 2*box.center.y - pt[0].y;
    pt[3].x = 2*box.center.x - pt[1].x;
    pt[3].y = 2*box.center.y - pt[1].y;
}
Fits ellipse to set of 2D points
CvBox2D cvFitEllipse2( const CvArr* points );
The function cvFitEllipse2 calculates the ellipse that fits best (in a least-squares sense)
a set of 2D points. The meaning of the returned structure fields is similar to those
in cvEllipse except that size stores the full lengths of the ellipse axes,
not half-lengths.
Fits line to 2D or 3D point set
void cvFitLine( const CvArr* points, int dist_type, double param, double reps, double aeps, float* line );
param - numerical parameter (C) for some types of distances; if 0 then some optimal value is chosen.
reps, aeps - sufficient accuracy for the radius (distance between the coordinate origin and the line) and the angle, respectively; 0.01 is a good default value for both.
line - the output line parameters.
In case of 2D fitting it is an array of 4 floats (vx, vy, x0, y0)
where (vx, vy) is a normalized vector collinear to the line and (x0, y0)
is some point on the line.
In case of 3D fitting it is an array of 6 floats (vx, vy, vz, x0, y0, z0)
where (vx, vy, vz) is a normalized vector collinear to the line and (x0, y0, z0)
is some point on the line.
The function cvFitLine
fits line to 2D or 3D point set by minimizing sum_{i}ρ(r_{i}),
where r_{i} is distance between ith point and the line and ρ(r) is a distance function, one of:
dist_type=CV_DIST_L2 (L_{2}):
ρ(r)=r^{2}/2 (the simplest and the fastest least-squares method)

dist_type=CV_DIST_L1 (L_{1}):
ρ(r)=r

dist_type=CV_DIST_L12 (L_{1}-L_{2}):
ρ(r)=2•[sqrt(1+r^{2}/2) - 1]

dist_type=CV_DIST_FAIR (Fair):
ρ(r)=C^{2}•[r/C - log(1 + r/C)], C=1.3998

dist_type=CV_DIST_WELSCH (Welsch):
ρ(r)=C^{2}/2•[1 - exp(-(r/C)^{2})], C=2.9846

dist_type=CV_DIST_HUBER (Huber):
ρ(r)=r^{2}/2 if r < C, C•(r-C/2) otherwise; C=1.345
Finds convex hull of point set
CvSeq* cvConvexHull2( const CvArr* input, void* hull_storage=NULL, int orientation=CV_CLOCKWISE, int return_points=0 );
orientation - desired orientation of the convex hull: CV_CLOCKWISE or CV_COUNTER_CLOCKWISE.
return_points - if non-zero, the points themselves are stored in the hull instead of indices
if hull_storage is an array, or instead of pointers if hull_storage is a memory storage.
The function cvConvexHull2 finds the convex hull of a 2D point set using Sklansky's algorithm.
If hull_storage is a memory storage, the function creates a sequence containing the hull points or
pointers to them, depending on the return_points value, and returns the sequence on output.
#include "cv.h"
#include "highgui.h"
#include <stdlib.h>

#define ARRAY 0 /* switch between array/sequence method by replacing 0<=>1 */

void main( int argc, char** argv )
{
    IplImage* img = cvCreateImage( cvSize( 500, 500 ), 8, 3 );
    cvNamedWindow( "hull", 1 );

#if !ARRAY
    CvMemStorage* storage = cvCreateMemStorage();
#endif

    for(;;)
    {
        int i, count = rand()%100 + 1, hullcount;
        CvPoint pt0;
#if !ARRAY
        CvSeq* ptseq = cvCreateSeq( CV_SEQ_KIND_GENERIC|CV_32SC2,
                                    sizeof(CvContour),
                                    sizeof(CvPoint), storage );
        CvSeq* hull;

        for( i = 0; i < count; i++ )
        {
            pt0.x = rand() % (img->width/2) + img->width/4;
            pt0.y = rand() % (img->height/2) + img->height/4;
            cvSeqPush( ptseq, &pt0 );
        }
        hull = cvConvexHull2( ptseq, 0, CV_CLOCKWISE, 0 );
        hullcount = hull->total;
#else
        CvPoint* points = (CvPoint*)malloc( count * sizeof(points[0]));
        int* hull = (int*)malloc( count * sizeof(hull[0]));
        CvMat point_mat = cvMat( 1, count, CV_32SC2, points );
        CvMat hull_mat = cvMat( 1, count, CV_32SC1, hull );

        for( i = 0; i < count; i++ )
        {
            pt0.x = rand() % (img->width/2) + img->width/4;
            pt0.y = rand() % (img->height/2) + img->height/4;
            points[i] = pt0;
        }
        cvConvexHull2( &point_mat, &hull_mat, CV_CLOCKWISE, 0 );
        hullcount = hull_mat.cols;
#endif
        cvZero( img );
        for( i = 0; i < count; i++ )
        {
#if !ARRAY
            pt0 = *CV_GET_SEQ_ELEM( CvPoint, ptseq, i );
#else
            pt0 = points[i];
#endif
            cvCircle( img, pt0, 2, CV_RGB( 255, 0, 0 ), CV_FILLED );
        }

#if !ARRAY
        pt0 = **CV_GET_SEQ_ELEM( CvPoint*, hull, hullcount - 1 );
#else
        pt0 = points[hull[hullcount-1]];
#endif

        for( i = 0; i < hullcount; i++ )
        {
#if !ARRAY
            CvPoint pt = **CV_GET_SEQ_ELEM( CvPoint*, hull, i );
#else
            CvPoint pt = points[hull[i]];
#endif
            cvLine( img, pt0, pt, CV_RGB( 0, 255, 0 ));
            pt0 = pt;
        }

        cvShowImage( "hull", img );

        int key = cvWaitKey(0);
        if( key == 27 ) // 'ESC'
            break;

#if !ARRAY
        cvClearMemStorage( storage );
#else
        free( points );
        free( hull );
#endif
    }
}
Tests contour convexity
int cvCheckContourConvexity( const CvArr* contour );
The function cvCheckContourConvexity
tests whether the input contour is convex or not.
The contour must be simple, i.e. without selfintersections.
Structure describing a single contour convexity defect

typedef struct CvConvexityDefect
{
    CvPoint* start;       /* point of the contour where the defect begins */
    CvPoint* end;         /* point of the contour where the defect ends */
    CvPoint* depth_point; /* the farthest from the convex hull point within the defect */
    float depth;          /* distance between the farthest point and the convex hull */
} CvConvexityDefect;
Finds convexity defects of contour
CvSeq* cvConvexityDefects( const CvArr* contour, const CvArr* convexhull, CvMemStorage* storage=NULL );
convexhull - convex hull obtained using cvConvexHull2; it should contain pointers or
indices to the contour points, not the hull points themselves, i.e. the return_points
parameter in cvConvexHull2 should be 0.
The function cvConvexityDefects
finds all convexity defects of the input contour
and returns a sequence of the CvConvexityDefect structures.
Point in contour test
double cvPointPolygonTest( const CvArr* contour, CvPoint2D32f pt, int measure_dist );
The function cvPointPolygonTest determines whether the point is inside a contour, outside, or lies
on an edge (or coincides with a vertex). It returns a positive, negative or zero value, correspondingly.
When measure_dist=0, the return value is +1, -1 and 0, respectively.
When measure_dist≠0, it is a signed distance between the point and the nearest contour edge.
Here is the sample output of the function, where each image pixel is tested against the contour.
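The sign-only case (measure_dist=0) can be sketched with the classic ray-casting test, shown here in plain C. This is an illustrative sketch only: it returns +1/-1 and does not treat the on-edge/on-vertex (zero) cases specially, unlike the real function:

```c
#include <assert.h>

typedef struct { float x, y; } P2f;

/* Ray-casting point-in-polygon test: count how many polygon edges a
   horizontal ray from the point crosses; odd means inside. */
int point_polygon_sign( const P2f* poly, int n, P2f pt )
{
    int i, j, inside = 0;
    for( i = 0, j = n - 1; i < n; j = i++ )
    {
        /* edge (poly[j], poly[i]) straddles the ray's y level, and the
           crossing lies to the right of the point */
        if( (poly[i].y > pt.y) != (poly[j].y > pt.y) &&
            pt.x < (poly[j].x - poly[i].x)*(pt.y - poly[i].y)/
                   (poly[j].y - poly[i].y) + poly[i].x )
            inside = !inside;
    }
    return inside ? 1 : -1;
}
```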
Finds circumscribed rectangle of minimal area for given 2D point set
CvBox2D cvMinAreaRect2( const CvArr* points, CvMemStorage* storage=NULL );
The function cvMinAreaRect2
finds a circumscribed rectangle of the minimal area for 2D point set
by building convex hull for the set and applying rotating calipers technique to the hull.
Finds circumscribed circle of minimal area for given 2D point set
int cvMinEnclosingCircle( const CvArr* points, CvPoint2D32f* center, float* radius );
The function cvMinEnclosingCircle
finds the minimal circumscribed circle for
2D point set using iterative algorithm. It returns nonzero if the resultant circle contains all the
input points and zero otherwise (i.e. algorithm failed).
Calculates pairwise geometrical histogram for contour
void cvCalcPGH( const CvSeq* contour, CvHistogram* hist );
The function cvCalcPGH
calculates 2D pairwise geometrical histogram (PGH), described in
[Iivarinen97], for the contour.
The algorithm considers every pair of the contour edges. The angle
between the edges and the minimum/maximum distances are determined for every
pair. To do this each of the edges in turn is taken as the base, while the
function loops through all the other edges. When the base edge and any other
edge are considered, the minimum and maximum distances from the points on the
non-base edge and the line of the base edge are selected. The angle between the
edges defines the row of the histogram in which all the bins that correspond to
the distance between the calculated minimum and maximum distances are
incremented (that is, the histogram is transposed relative to the [Iivarinen97] definition).
The histogram can be used for contour matching.
Planar subdivision
#define CV_SUBDIV2D_FIELDS()    \
    CV_GRAPH_FIELDS()           \
    int quad_edges;             \
    int is_geometry_valid;      \
    CvSubdiv2DEdge recent_edge; \
    CvPoint2D32f topleft;       \
    CvPoint2D32f bottomright;

typedef struct CvSubdiv2D
{
    CV_SUBDIV2D_FIELDS()
}
CvSubdiv2D;
Planar subdivision is a subdivision of a plane into a set of non-overlapping regions (facets) that cover the whole plane. The above structure describes a subdivision built on a 2D point set, where the points are linked together and form a planar graph, which, together with a few edges connecting exterior subdivision points (namely, convex hull points) with infinity, subdivides the plane into facets by its edges.
For every subdivision there exists a dual subdivision where facets and points (subdivision vertices) swap their roles, that is, a facet is treated as a vertex (called a virtual point below) of the dual subdivision and the original subdivision vertices become facets. In the picture below the original subdivision is marked with solid lines and the dual subdivision with dotted lines.
OpenCV subdivides a plane into triangles using Delaunay's algorithm. A subdivision is built iteratively, starting from a dummy triangle that includes all the subdivision points. In this case the dual subdivision is the Voronoi diagram of the input 2D point set. The subdivisions can be used for 3D piece-wise transformation of a plane, morphing, fast location of points on the plane, building special graphs (such as NNG, RNG) etc.
Quadedge of planar subdivision
/* one of edges within quad-edge, lower 2 bits is index (0..3)
   and upper bits are quad-edge pointer */
typedef long CvSubdiv2DEdge;

/* quad-edge structure fields */
#define CV_QUADEDGE2D_FIELDS()      \
    int flags;                      \
    struct CvSubdiv2DPoint* pt[4];  \
    CvSubdiv2DEdge next[4];

typedef struct CvQuadEdge2D
{
    CV_QUADEDGE2D_FIELDS()
}
CvQuadEdge2D;
Quad-edge is the basic element of a subdivision; it contains four edges (e, eRot (in red) and reversed e & eRot (in green)):
Point of original or dual subdivision
#define CV_SUBDIV2D_POINT_FIELDS() \
    int flags;                     \
    CvSubdiv2DEdge first;          \
    CvPoint2D32f pt;

#define CV_SUBDIV2D_VIRTUAL_POINT_FLAG (1 << 30)

typedef struct CvSubdiv2DPoint
{
    CV_SUBDIV2D_POINT_FIELDS()
} CvSubdiv2DPoint;
Returns one of edges related to given
CvSubdiv2DEdge cvSubdiv2DGetEdge( CvSubdiv2DEdge edge, CvNextEdgeType type ); #define cvSubdiv2DNextEdge( edge ) cvSubdiv2DGetEdge( edge, CV_NEXT_AROUND_ORG )
type - specifies which of the related edges to return, one of:
- CV_NEXT_AROUND_ORG - next around the edge origin (eOnext on the picture above if e is the input edge)
- CV_NEXT_AROUND_DST - next around the edge destination (eDnext)
- CV_PREV_AROUND_ORG - previous around the edge origin (reversed eRnext)
- CV_PREV_AROUND_DST - previous around the edge destination (reversed eLnext)
- CV_NEXT_AROUND_LEFT - next around the left facet (eLnext)
- CV_NEXT_AROUND_RIGHT - next around the right facet (eRnext)
- CV_PREV_AROUND_LEFT - previous around the left facet (reversed eOnext)
- CV_PREV_AROUND_RIGHT - previous around the right facet (reversed eDnext)
The function cvSubdiv2DGetEdge returns one of the edges related to the input edge.
Returns another edge of the same quadedge
CvSubdiv2DEdge cvSubdiv2DRotateEdge( CvSubdiv2DEdge edge, int rotate );
rotate - specifies which of the edges of the same quad-edge to return, one of:
- 0 - the input edge (e on the picture above if e is the input edge)
- 1 - the rotated edge (eRot)
- 2 - the reversed edge (reversed e (in green))
- 3 - the reversed rotated edge (reversed eRot (in green))
The function cvSubdiv2DRotateEdge returns one of the edges of the same quad-edge as the input edge.
Returns edge origin
CvSubdiv2DPoint* cvSubdiv2DEdgeOrg( CvSubdiv2DEdge edge );
The function cvSubdiv2DEdgeOrg
returns the edge origin. The returned pointer may be NULL if
the edge is from dual subdivision and the virtual point coordinates are not calculated yet.
The virtual points can be calculated using function cvCalcSubdivVoronoi2D.
Returns edge destination
CvSubdiv2DPoint* cvSubdiv2DEdgeDst( CvSubdiv2DEdge edge );
The function cvSubdiv2DEdgeDst
returns the edge destination. The returned pointer may be NULL if
the edge is from dual subdivision and the virtual point coordinates are not calculated yet.
The virtual points can be calculated using function cvCalcSubdivVoronoi2D.
Creates empty Delaunay triangulation
CvSubdiv2D* cvCreateSubdivDelaunay2D( CvRect rect, CvMemStorage* storage );
The function cvCreateSubdivDelaunay2D creates an empty Delaunay subdivision,
to which 2D points can be added later using the function cvSubdivDelaunay2DInsert.
All the points to be added must be within the specified rectangle, otherwise a runtime error will be
raised.
Inserts a single point to Delaunay triangulation
CvSubdiv2DPoint* cvSubdivDelaunay2DInsert( CvSubdiv2D* subdiv, CvPoint2D32f pt);
The function cvSubdivDelaunay2DInsert inserts a single point into the subdivision and
modifies the subdivision topology appropriately.
If a point with the same coordinates already exists, no new point is added.
The function returns a pointer to the allocated point.
No virtual point coordinates are calculated at this stage.
Locates a point within Delaunay triangulation
CvSubdiv2DPointLocation cvSubdiv2DLocate( CvSubdiv2D* subdiv, CvPoint2D32f pt, CvSubdiv2DEdge* edge, CvSubdiv2DPoint** vertex=NULL );
The function cvSubdiv2DLocate locates the input point within the subdivision.
There are 5 cases:
- The point falls into some facet (CV_PTLOC_INSIDE is returned); *edge will contain one of edges of the facet.
- The point falls onto the edge (CV_PTLOC_ON_EDGE is returned); *edge will contain this edge.
- The point coincides with one of subdivision vertices (CV_PTLOC_VERTEX is returned); *vertex will contain a pointer to the vertex.
- The point is outside the subdivision reference rectangle (CV_PTLOC_OUTSIDE_RECT is returned); the function has no effect on output parameters.
- One of input arguments is invalid (CV_PTLOC_ERROR is returned).
Finds the closest subdivision vertex to given point
CvSubdiv2DPoint* cvFindNearestPoint2D( CvSubdiv2D* subdiv, CvPoint2D32f pt );
The function cvFindNearestPoint2D is another function that locates the input point within the subdivision.
It finds the subdivision vertex that is closest to the input point. It is not necessarily one of the
vertices of the facet containing the input point, though the facet (located using cvSubdiv2DLocate)
is used as a starting point. The function returns a pointer to the found subdivision vertex.
Calculates coordinates of Voronoi diagram cells
void cvCalcSubdivVoronoi2D( CvSubdiv2D* subdiv );
The function cvCalcSubdivVoronoi2D
calculates coordinates of virtual points.
All virtual points corresponding to some vertex of original subdivision form (when connected together)
a boundary of Voronoi cell of that point.
Removes all virtual points
void cvClearSubdivVoronoi2D( CvSubdiv2D* subdiv );
The function cvClearSubdivVoronoi2D
removes all virtual points.
It is called internally in cvCalcSubdivVoronoi2D if the subdivision was modified
after previous call to the function.
There are a few other lower-level functions that work with planar subdivisions; see cv.h and the sources. The demo script delaunay.c, which builds the Delaunay triangulation and Voronoi diagram of a random 2D point set, can be found at opencv/samples/c.
Adds frame to accumulator
void cvAcc( const CvArr* image, CvArr* sum, const CvArr* mask=NULL );
The function cvAcc
adds the whole image image
or its selected region to accumulator sum
:
sum(x,y)=sum(x,y)+image(x,y) if mask(x,y)!=0
Adds the square of source image to accumulator
void cvSquareAcc( const CvArr* image, CvArr* sqsum, const CvArr* mask=NULL );
The function cvSquareAcc
adds the input image image
or its selected region,
raised to power 2, to the accumulator sqsum
:
sqsum(x,y)=sqsum(x,y)+image(x,y)^{2} if mask(x,y)!=0
Adds product of two input images to accumulator
void cvMultiplyAcc( const CvArr* image1, const CvArr* image2, CvArr* acc, const CvArr* mask=NULL );
The function cvMultiplyAcc
adds product of 2 images
or their selected regions to accumulator acc
:
acc(x,y)=acc(x,y) + image1(x,y)•image2(x,y) if mask(x,y)!=0
Updates running average
void cvRunningAvg( const CvArr* image, CvArr* acc, double alpha, const CvArr* mask=NULL );
The function cvRunningAvg
calculates weighted sum of input image image
and
the accumulator acc
so that acc
becomes a running average of frame sequence:
acc(x,y)=(1-α)•acc(x,y) + α•image(x,y) if mask(x,y)!=0
where α (alpha) regulates update speed (how fast accumulator forgets about previous frames).
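The update rule is a simple exponential moving average; a plain-C sketch for a flat float accumulator (illustrative only; the mask is omitted here, unlike cvRunningAvg):

```c
#include <assert.h>

/* Running-average update: blend each new frame into the accumulator
   with weight alpha, so older frames decay geometrically. */
void running_avg( const float* image, float* acc, int n, double alpha )
{
    int i;
    for( i = 0; i < n; i++ )
        acc[i] = (float)((1 - alpha)*acc[i] + alpha*image[i]);
}
```

A larger alpha makes the accumulator track the latest frames more closely; a smaller alpha keeps a longer memory, which is the usual way to maintain a background model of a scene.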
Updates motion history image by moving silhouette
void cvUpdateMotionHistory( const CvArr* silhouette, CvArr* mhi, double timestamp, double duration );
timestamp - current time, in milliseconds or other units.
The function cvUpdateMotionHistory
updates the motion history image as following:
mhi(x,y)=timestamp  if silhouette(x,y)!=0
         0          if silhouette(x,y)=0 and mhi(x,y)<timestamp-duration
         mhi(x,y)   otherwise
That is, MHI pixels where motion occurs are set to the current timestamp, while the pixels where motion happened long ago are cleared.
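The rule above can be re-implemented directly for flat arrays (an illustrative sketch, without the OpenCV image types):

```c
#include <assert.h>

/* Motion-history update: stamp moving pixels with the current time,
   clear pixels whose last motion is older than the duration window. */
void update_mhi( const unsigned char* silhouette, float* mhi, int n,
                 double timestamp, double duration )
{
    int i;
    for( i = 0; i < n; i++ )
    {
        if( silhouette[i] != 0 )
            mhi[i] = (float)timestamp;           /* motion happens now   */
        else if( mhi[i] < timestamp - duration )
            mhi[i] = 0;                          /* motion too old       */
        /* otherwise keep the previous timestamp */
    }
}
```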
Calculates gradient orientation of motion history image
void cvCalcMotionGradient( const CvArr* mhi, CvArr* mask, CvArr* orientation, double delta1, double delta2, int aperture_size=3 );
delta1, delta2 - the function finds the minimum (m(x,y)) and maximum (M(x,y)) mhi values over the neighborhood of each pixel and assumes the gradient is valid only if
min(delta1,delta2) <= M(x,y)-m(x,y) <= max(delta1,delta2).
The function cvCalcMotionGradient calculates the derivatives Dx and Dy of
mhi and then calculates the gradient orientation as:
orientation(x,y)=arctan(Dy(x,y)/Dx(x,y))
where the signs of both Dx(x,y) and Dy(x,y) are taken into account
(as in the cvCartToPolar function).
After that mask is filled to indicate
where the orientation is valid (see the delta1 and delta2 description).
Calculates global motion orientation of some selected region
double cvCalcGlobalOrientation( const CvArr* orientation, const CvArr* mask, const CvArr* mhi, double timestamp, double duration );
The function cvCalcGlobalOrientation
calculates the general motion direction in
the selected region and returns the angle between 0° and 360°.
At first the function builds the orientation histogram and finds the basic
orientation as a coordinate of the histogram maximum. After that the function
calculates the shift relative to the basic orientation as a weighted sum of all
orientation vectors: the more recent is the motion, the greater is the weight.
The resultant angle is a circular sum of the basic orientation and the shift.
Segments whole motion into separate moving parts
CvSeq* cvSegmentMotion( const CvArr* mhi, CvArr* seg_mask, CvMemStorage* storage, double timestamp, double seg_thresh );
The function cvSegmentMotion finds all the motion segments and marks them in seg_mask
with individual values (1,2,...). It also returns a sequence of CvConnectedComp structures,
one per motion component. After that the motion direction for every component can be calculated
with cvCalcGlobalOrientation using the extracted mask of the particular component
(using cvCmp).
Finds object center on back projection
int cvMeanShift( const CvArr* prob_image, CvRect window, CvTermCriteria criteria, CvConnectedComp* comp );
comp - resultant structure that contains the converged search window coordinates
(comp->rect field) and the sum of all pixels inside the window (comp->area field).
The function cvMeanShift
iterates to find the object center given its back projection and
initial position of search window. The iterations are made until the search window
center moves by less than the given value and/or until the function has done the
maximum number of iterations. The function returns the number of iterations
made.
Finds object center, size, and orientation
int cvCamShift( const CvArr* prob_image, CvRect window, CvTermCriteria criteria, CvConnectedComp* comp, CvBox2D* box=NULL );
comp - resultant structure that contains the converged search window coordinates
(comp->rect field) and the sum of all pixels inside the window (comp->area field).
box - circumscribed box for the object; if not NULL, contains object size and
orientation.
The function cvCamShift implements the CAMSHIFT object tracking
algorithm ([Bradski98]).
First, it finds an object center using cvMeanShift and,
after that, calculates the object size and orientation. The function returns the
number of iterations made within cvMeanShift.
The CvCamShiftTracker class declared in cv.hpp implements a color object tracker that uses the function.
Changes contour position to minimize its energy
void cvSnakeImage( const IplImage* image, CvPoint* points, int length, float* alpha, float* beta, float* gamma, int coeff_usage, CvSize win, CvTermCriteria criteria, int calc_gradient=1 );
alpha: Weight[s] of continuity energy, single float or array of length floats,
one per contour point.
beta: Weight[s] of curvature energy, similar to alpha.
gamma: Weight[s] of image energy, similar to alpha.
coeff_usage: Variant of usage of the previous three parameters:
CV_VALUE
indicates that each of alpha, beta, gamma
is a pointer to a single
value to be used for all points;
CV_ARRAY
indicates that each of alpha, beta, gamma
is a pointer to an array
of coefficients, different for each point of the snake. All the arrays must
have a size equal to the contour size.
win: Size of the neighborhood of every point, used to search the minimum; both
win.width
and
win.height
must be odd.
The function cvSnakeImage
updates the snake in order to minimize its total energy, which is a sum
of internal energy that depends on the contour shape (the smoother the contour is, the smaller the internal energy)
and external energy that depends on the energy field and reaches its minimum at the local energy extremums
that correspond to the image edges in the case of image gradient.
The parameter criteria.epsilon
defines the minimal number of points
that must be moved during an iteration to keep the iteration process running.
If at some iteration the number of moved points is less than criteria.epsilon,
or the function has
performed criteria.max_iter
iterations, the function terminates.
Calculates optical flow for two images
void cvCalcOpticalFlowHS( const CvArr* prev, const CvArr* curr, int use_previous, CvArr* velx, CvArr* vely, double lambda, CvTermCriteria criteria );
The function cvCalcOpticalFlowHS
computes flow for every pixel of the first input image using
Horn & Schunck algorithm [Horn81].
Calculates optical flow for two images
void cvCalcOpticalFlowLK( const CvArr* prev, const CvArr* curr, CvSize win_size, CvArr* velx, CvArr* vely );
The function cvCalcOpticalFlowLK
computes flow for every pixel of the first input image using
Lucas & Kanade algorithm [Lucas81].
Calculates optical flow for two images by block matching method
void cvCalcOpticalFlowBM( const CvArr* prev, const CvArr* curr, CvSize block_size, CvSize shift_size, CvSize max_range, int use_previous, CvArr* velx, CvArr* vely );
velx, vely: Horizontal and vertical components of the optical flow;
32-bit floating-point, single-channel.
The function cvCalcOpticalFlowBM
calculates the optical flow for
overlapping blocks of block_size.width×block_size.height
pixels each,
so the velocity fields are smaller than the original images. For every block in prev
the function tries to find a similar block in curr
in some neighborhood of the original
block, or of the block shifted by (velx(x0,y0), vely(x0,y0)) as computed
by a previous function call (if use_previous=1).
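The block-matching search can be sketched for a single block in plain C. This is a hypothetical sketch, not the library code: the best shift is found by an exhaustive sum-of-absolute-differences (SAD) search over all displacements within max_range, one common way to implement block matching.

```c
#include <float.h>
#include <math.h>

/* Hypothetical sketch: find the displacement of one block_size x block_size
   block at (x0, y0) in prev by exhaustive SAD search over shifts in
   [-max_range, max_range] in curr. Images are dense w x h 8-bit arrays.
   Writes the best shift to (*dx, *dy). */
void block_match_sketch( const unsigned char* prev, const unsigned char* curr,
                         int w, int h, int x0, int y0, int block_size,
                         int max_range, int* dx, int* dy )
{
    double best = DBL_MAX;
    int sx, sy;
    *dx = *dy = 0;
    for( sy = -max_range; sy <= max_range; sy++ )
        for( sx = -max_range; sx <= max_range; sx++ )
        {
            double sad = 0;
            int x, y, ok = 1;
            for( y = 0; y < block_size && ok; y++ )
                for( x = 0; x < block_size; x++ )
                {
                    int px = x0 + x, py = y0 + y;       /* pixel in prev */
                    int qx = px + sx, qy = py + sy;     /* candidate in curr */
                    if( qx < 0 || qx >= w || qy < 0 || qy >= h )
                        { ok = 0; break; }              /* shifted block leaves the image */
                    sad += fabs( (double)prev[py*w + px] - curr[qy*w + qx] );
                }
            if( ok && sad < best )
                { best = sad; *dx = sx; *dy = sy; }
        }
}
```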
Calculates optical flow for a sparse feature set using the iterative Lucas-Kanade method in pyramids
void cvCalcOpticalFlowPyrLK( const CvArr* prev, const CvArr* curr, CvArr* prev_pyr, CvArr* curr_pyr, const CvPoint2D32f* prev_features, CvPoint2D32f* curr_features, int count, CvSize win_size, int level, char* status, float* track_error, CvTermCriteria criteria, int flags );
prev: First frame, at time t.
curr: Second frame, at time t + dt.
prev_pyr: Buffer for the pyramid for the first frame. If the pointer is not NULL,
the buffer must have a sufficient size to store the pyramid from level 1
to
level #level
; a total size of (image_width+8)*image_height/3
bytes
is sufficient.
curr_pyr: Similar to prev_pyr
, used for the second frame.
level: Maximal pyramid level number. If 0
, pyramids are not used (single level);
if 1
, two levels are used, etc.
status: Array. Every element of the array is set to 1
if the flow for the
corresponding feature has been found, 0
otherwise.
track_error: Optional parameter; can be NULL
.
flags: Miscellaneous flags:
CV_LKFLOW_PYR_A_READY
, pyramid for the first frame is precalculated before
the call;
CV_LKFLOW_PYR_B_READY
, pyramid for the second frame is precalculated before
the call;
CV_LKFLOW_INITIAL_GUESSES
, array B contains initial coordinates of features
before the function call.
The function cvCalcOpticalFlowPyrLK
implements
a sparse iterative version of the Lucas-Kanade optical flow in pyramids ([Bouguet00]).
It calculates coordinates of the feature points on the current video frame given
their coordinates on the previous frame. The function finds the coordinates with subpixel accuracy.
Both parameters prev_pyr
and curr_pyr
comply with the following rules: if the image
pointer is 0, the function allocates the buffer internally, calculates the
pyramid, and releases the buffer after processing. Otherwise, the function
calculates the pyramid and stores it in the buffer unless the flag
CV_LKFLOW_PYR_A[B]_READY
is set. The image should be large enough to fit the
Gaussian pyramid data. After the function call both pyramids are calculated and
the readiness flag for the corresponding image can be set in the next call (i.e., typically,
for all the image pairs except the very first one CV_LKFLOW_PYR_A_READY
is set).
Constructs a tree of feature vectors
CvFeatureTree* cvCreateFeatureTree(CvMat* desc);
The function cvCreateFeatureTree
constructs a balanced kd-tree index of
the given feature vectors. The lifetime of the desc matrix must exceed that
of the returned tree; i.e., no copy is made of the vectors.
Destroys a tree of feature vectors
void cvReleaseFeatureTree(CvFeatureTree* tr);
The function cvReleaseFeatureTree
deallocates the given kd-tree.
Finds approximate k nearest neighbors of given vectors using best-bin-first search
void cvFindFeatures(CvFeatureTree* tr, CvMat* desc, CvMat* results, CvMat* dist, int k=2, int emax=20);
The function cvFindFeatures
finds (with high probability) the k
nearest
neighbors in tr
for each of the given (row) vectors in desc
, using
best-bin-first search ([Beis97]).
The complexity of the entire operation is at most O(m*emax*log2(n))
,
where n
is the number of vectors in the tree.
Orthogonal range search
int cvFindFeaturesBoxed(CvFeatureTree* tr, CvMat* bounds_min, CvMat* bounds_max, CvMat* results);
The function cvFindFeaturesBoxed
performs orthogonal range searching on the
given kd-tree. That is, it returns the set of vectors v
in tr
that satisfy
bounds_min[i] <= v[i] <= bounds_max[i], 0 <= i < d
, where d
is the dimension
of vectors in the tree.
The function returns the number of such vectors found.
Kalman filter state
typedef struct CvKalman
{
    int MP; /* number of measurement vector dimensions */
    int DP; /* number of state vector dimensions */
    int CP; /* number of control vector dimensions */

    /* backward compatibility fields */
#if 1
    float* PosterState;          /* =state_pre->data.fl */
    float* PriorState;           /* =state_post->data.fl */
    float* DynamMatr;            /* =transition_matrix->data.fl */
    float* MeasurementMatr;      /* =measurement_matrix->data.fl */
    float* MNCovariance;         /* =measurement_noise_cov->data.fl */
    float* PNCovariance;         /* =process_noise_cov->data.fl */
    float* KalmGainMatr;         /* =gain->data.fl */
    float* PriorErrorCovariance; /* =error_cov_pre->data.fl */
    float* PosterErrorCovariance;/* =error_cov_post->data.fl */
    float* Temp1;                /* temp1->data.fl */
    float* Temp2;                /* temp2->data.fl */
#endif

    CvMat* state_pre;            /* predicted state (x'(k)): x'(k)=A*x(k-1)+B*u(k) */
    CvMat* state_post;           /* corrected state (x(k)): x(k)=x'(k)+K(k)*(z(k)-H*x'(k)) */
    CvMat* transition_matrix;    /* state transition matrix (A) */
    CvMat* control_matrix;       /* control matrix (B) (not used if there is no control) */
    CvMat* measurement_matrix;   /* measurement matrix (H) */
    CvMat* process_noise_cov;    /* process noise covariance matrix (Q) */
    CvMat* measurement_noise_cov;/* measurement noise covariance matrix (R) */
    CvMat* error_cov_pre;        /* priori error estimate covariance matrix (P'(k)):
                                    P'(k)=A*P(k-1)*At + Q */
    CvMat* gain;                 /* Kalman gain matrix (K(k)):
                                    K(k)=P'(k)*Ht*inv(H*P'(k)*Ht+R) */
    CvMat* error_cov_post;       /* posteriori error estimate covariance matrix (P(k)):
                                    P(k)=(I-K(k)*H)*P'(k) */
    CvMat* temp1;                /* temporary matrices */
    CvMat* temp2;
    CvMat* temp3;
    CvMat* temp4;
    CvMat* temp5;
} CvKalman;
The structure CvKalman is used to keep the Kalman filter state. It is created by the cvCreateKalman function, updated by the cvKalmanPredict and cvKalmanCorrect functions, and released by the cvReleaseKalman function. Normally, the structure is used for the standard Kalman filter (the notation and formulae below are borrowed from the excellent Kalman tutorial [Welch95]):
x_{k}=A•x_{k-1}+B•u_{k}+w_{k}
z_{k}=H•x_{k}+v_{k},
where:
x_{k} (x_{k-1}) - state of the system at the moment k (k-1)
z_{k} - measurement of the system state at the moment k
u_{k} - external control applied at the moment k
w_{k} and v_{k} are normally-distributed process and measurement noise, respectively:
p(w) ~ N(0,Q)
p(v) ~ N(0,R),
that is,
Q - process noise covariance matrix, constant or variable,
R - measurement noise covariance matrix, constant or variable
In the case of the standard Kalman filter, all the matrices A, B, H, Q and R are initialized once, after the CvKalman structure is allocated via cvCreateKalman. However, the same structure and the same functions may be used to simulate the extended Kalman filter by linearizing the extended Kalman filter equations in the neighborhood of the current system state; in this case A, B, H (and, possibly, Q and R) should be updated at every step.
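The two model equations above can be written out for a scalar state to make the predict/correct cycle concrete. This is an illustrative sketch with hypothetical names, not the library implementation: the state x, its covariance p and all model quantities are plain doubles, so the matrix formulae degenerate to ordinary arithmetic.

```c
/* Hypothetical 1-D Kalman filter: x_k = a*x_{k-1} + w_k, z_k = h*x_k + v_k.
   The same two steps that cvKalmanPredict/cvKalmanCorrect perform with
   matrices, written out for scalars. */
typedef struct Kalman1D {
    double a, h, q, r;   /* model: transition, measurement, Q, R */
    double x, p;         /* corrected state and error covariance  */
} Kalman1D;

double kalman1d_predict( Kalman1D* k )
{
    k->x = k->a * k->x;                 /* x'_k = A*x_{k-1}         */
    k->p = k->a * k->p * k->a + k->q;   /* P'_k = A*P_{k-1}*A^T + Q */
    return k->x;
}

double kalman1d_correct( Kalman1D* k, double z )
{
    double gain = k->p * k->h / (k->h * k->p * k->h + k->r); /* K_k */
    k->x = k->x + gain * (z - k->h * k->x);  /* x_k = x'_k + K*(z - H*x'_k) */
    k->p = (1 - gain * k->h) * k->p;         /* P_k = (I - K*H)*P'_k        */
    return k->x;
}
```

With a = h = 1, q = 0 and r = 1, starting from x = 0 and p = 1, a measurement of 2 is split evenly between prediction and measurement, giving a corrected state of 1 and covariance 0.5.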
Allocates Kalman filter structure
CvKalman* cvCreateKalman( int dynam_params, int measure_params, int control_params=0 );
The function cvCreateKalman
allocates CvKalman and all its matrices
and initializes them somehow.
Deallocates Kalman filter structure
void cvReleaseKalman( CvKalman** kalman );
The function cvReleaseKalman
releases the structure CvKalman
and all underlying matrices.
Estimates subsequent model state
const CvMat* cvKalmanPredict( CvKalman* kalman, const CvMat* control=NULL );
#define cvKalmanUpdateByTime cvKalmanPredict
control: Control vector u_{k}; should be NULL iff there is no external control (control_params
=0).
The function cvKalmanPredict
estimates the subsequent stochastic model state
from its current state and stores it at kalman->state_pre
:
x'_{k}=A•x_{k-1}+B•u_{k}
P'_{k}=A•P_{k-1}•A^{T} + Q,
where
x'_{k} is the predicted state (kalman->state_pre),
x_{k-1} is the corrected state on the previous step (kalman->state_post)
(should be initialized somehow in the beginning, zero vector by default),
u_{k} is the external control (control
parameter),
P'_{k} is the priori error covariance matrix (kalman->error_cov_pre),
P_{k-1} is the posteriori error covariance matrix on the previous step (kalman->error_cov_post)
(should be initialized somehow in the beginning, identity matrix by default).
The function returns the estimated state.
Adjusts model state
const CvMat* cvKalmanCorrect( CvKalman* kalman, const CvMat* measurement );
#define cvKalmanUpdateByMeasurement cvKalmanCorrect
The function cvKalmanCorrect
adjusts the stochastic model state on the
basis of the given measurement of the model state:
K_{k}=P'_{k}•H^{T}•(H•P'_{k}•H^{T}+R)^{-1}
x_{k}=x'_{k}+K_{k}•(z_{k}-H•x'_{k})
P_{k}=(I-K_{k}•H)•P'_{k}
where
z_{k} is the given measurement (measurement
parameter),
K_{k} is the Kalman "gain" matrix.
The function stores the adjusted state at kalman->state_post
and returns it on output.
#include "cv.h"
#include "highgui.h"
#include <math.h>
#include <string.h>

int main(int argc, char** argv)
{
    /* A matrix data */
    const float A[] = { 1, 1, 0, 1 };

    IplImage* img = cvCreateImage( cvSize(500,500), 8, 3 );
    CvKalman* kalman = cvCreateKalman( 2, 1, 0 );
    /* state is (phi, delta_phi) - angle and angle increment */
    CvMat* state = cvCreateMat( 2, 1, CV_32FC1 );
    CvMat* process_noise = cvCreateMat( 2, 1, CV_32FC1 );
    /* only phi (angle) is measured */
    CvMat* measurement = cvCreateMat( 1, 1, CV_32FC1 );
    CvRandState rng;
    int code = -1;

    cvRandInit( &rng, 0, 1, -1, CV_RAND_UNI );

    cvZero( measurement );
    cvNamedWindow( "Kalman", 1 );

    for(;;)
    {
        cvRandSetRange( &rng, 0, 0.1, 0 );
        rng.disttype = CV_RAND_NORMAL;

        cvRand( &rng, state );

        memcpy( kalman->transition_matrix->data.fl, A, sizeof(A));
        cvSetIdentity( kalman->measurement_matrix, cvRealScalar(1) );
        cvSetIdentity( kalman->process_noise_cov, cvRealScalar(1e-5) );
        cvSetIdentity( kalman->measurement_noise_cov, cvRealScalar(1e-1) );
        cvSetIdentity( kalman->error_cov_post, cvRealScalar(1));
        /* choose random initial state */
        cvRand( &rng, kalman->state_post );

        rng.disttype = CV_RAND_NORMAL;

        for(;;)
        {
            #define calc_point(angle) \
                cvPoint( cvRound(img->width/2 + img->width/3*cos(angle)), \
                         cvRound(img->height/2 - img->width/3*sin(angle)))

            float state_angle = state->data.fl[0];
            CvPoint state_pt = calc_point(state_angle);

            /* predict point position */
            const CvMat* prediction = cvKalmanPredict( kalman, 0 );
            float predict_angle = prediction->data.fl[0];
            CvPoint predict_pt = calc_point(predict_angle);
            float measurement_angle;
            CvPoint measurement_pt;

            cvRandSetRange( &rng, 0, sqrt(kalman->measurement_noise_cov->data.fl[0]), 0 );
            cvRand( &rng, measurement );

            /* generate measurement */
            cvMatMulAdd( kalman->measurement_matrix, state, measurement, measurement );

            measurement_angle = measurement->data.fl[0];
            measurement_pt = calc_point(measurement_angle);

            /* plot points */
            #define draw_cross( center, color, d ) \
                cvLine( img, cvPoint( center.x - d, center.y - d ), \
                        cvPoint( center.x + d, center.y + d ), color, 1, 0 ); \
                cvLine( img, cvPoint( center.x + d, center.y - d ), \
                        cvPoint( center.x - d, center.y + d ), color, 1, 0 )

            cvZero( img );
            draw_cross( state_pt, CV_RGB(255,255,255), 3 );
            draw_cross( measurement_pt, CV_RGB(255,0,0), 3 );
            draw_cross( predict_pt, CV_RGB(0,255,0), 3 );
            cvLine( img, state_pt, predict_pt, CV_RGB(255,255,0), 3, 0 );

            /* adjust Kalman filter state */
            cvKalmanCorrect( kalman, measurement );

            cvRandSetRange( &rng, 0, sqrt(kalman->process_noise_cov->data.fl[0]), 0 );
            cvRand( &rng, process_noise );
            cvMatMulAdd( kalman->transition_matrix, state, process_noise, state );

            cvShowImage( "Kalman", img );
            code = cvWaitKey( 100 );

            if( code > 0 ) /* break current simulation by pressing a key */
                break;
        }
        if( code == 27 ) /* exit by ESCAPE */
            break;
    }

    return 0;
}
ConDensation state
typedef struct CvConDensation
{
    int MP;              // Dimension of measurement vector
    int DP;              // Dimension of state vector
    float* DynamMatr;    // Matrix of the linear Dynamics system
    float* State;        // Vector of State
    int SamplesNum;      // Number of the Samples
    float** flSamples;   // array of the Sample Vectors
    float** flNewSamples;// temporary array of the Sample Vectors
    float* flConfidence; // Confidence for each Sample
    float* flCumulative; // Cumulative confidence
    float* Temp;         // Temporary vector
    float* RandomSample; // RandomVector to update sample set
    CvRandState* RandS;  // Array of structures to generate random vectors
} CvConDensation;
The structure CvConDensation stores CONditional DENSity propagATION tracker state. The information about the algorithm can be found at http://www.dai.ed.ac.uk/CVonline/LOCAL_COPIES/ISARD1/condensation.html
Allocates ConDensation filter structure
CvConDensation* cvCreateConDensation( int dynam_params, int measure_params, int sample_count );
The function cvCreateConDensation
creates the CvConDensation
structure and returns a pointer to it.
Deallocates ConDensation filter structure
void cvReleaseConDensation( CvConDensation** condens );
The function cvReleaseConDensation
releases the structure CvConDensation (see
cvConDensation) and frees all memory previously allocated for the structure.
Initializes sample set for ConDensation algorithm
void cvConDensInitSampleSet( CvConDensation* condens, CvMat* lower_bound, CvMat* upper_bound );
The function cvConDensInitSampleSet
fills the sample arrays in the structure
CvConDensation with values within the specified ranges.
Estimates subsequent model state
void cvConDensUpdateByTime( CvConDensation* condens );
The function cvConDensUpdateByTime
estimates the subsequent stochastic model state from its current state.
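A central piece of the CONDENSATION update is resampling the particle set according to the per-sample confidences (the flConfidence and flCumulative fields above). The following sketch, with hypothetical names and deterministic inputs in place of random draws, shows that step: a cumulative-confidence array is scanned to pick, for each output slot, the sample whose cumulative interval contains u[i]*total, where u[i] in [0,1) plays the role of a uniform random number.

```c
/* Hypothetical sketch of the CONDENSATION resampling step.
   confidence: n per-sample weights; u: n values in [0,1);
   out: index of the sample selected for each slot. */
void resample_indices( const double* confidence, int n,
                       const double* u, int* out )
{
    double total = 0;
    int i;
    for( i = 0; i < n; i++ )
        total += confidence[i];         /* total confidence mass */
    for( i = 0; i < n; i++ )
    {
        double threshold = u[i] * total, cum = 0;
        int j = 0;
        /* walk the cumulative array until the threshold is reached */
        while( j < n - 1 && cum + confidence[j] <= threshold )
            cum += confidence[j++];
        out[i] = j;  /* high-confidence samples are picked more often */
    }
}
```

With confidences {1, 3}, a draw below 0.25 selects sample 0 and anything above selects sample 1, so sample 1 is reproduced roughly three times as often.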
The object detector described below was initially proposed by Paul Viola
[Viola01] and improved by Rainer Lienhart
[Lienhart02].
First, a classifier (namely a cascade of boosted classifiers working
with Haar-like features
) is trained with a few hundred sample
views of a particular object (e.g., a face or a car), called positive
examples, that are scaled to the same size (say, 20x20), and with negative examples:
arbitrary images of the same size.
After a classifier is trained, it can be applied to a region of interest (of the same size as used during the training) in an input image. The classifier outputs a "1" if the region is likely to show the object (i.e., face/car), and "0" otherwise. To search for the object in the whole image one can move the search window across the image and check every location using the classifier. The classifier is designed so that it can be easily "resized" in order to be able to find the objects of interest at different sizes, which is more efficient than resizing the image itself. So, to find an object of an unknown size in the image the scan procedure should be done several times at different scales.
The word "cascade" in the classifier name means that the resultant classifier
consists of several simpler classifiers (stages
) that are applied
subsequently to a region of interest until at some stage the candidate
is rejected or all the stages are passed. The word
"boosted" means that the classifiers at every stage of the cascade are complex
themselves and are built out of basic classifiers using one of four
different boosting
techniques (weighted voting). Currently
Discrete Adaboost, Real Adaboost, Gentle Adaboost and Logitboost are supported.
The basic classifiers are decision-tree classifiers with at least
2 leaves. Haar-like features are the input to the basic classifiers, and
are calculated as described below. The current algorithm uses the following
Haar-like features:
The feature used in a particular classifier is specified by its shape (1a, 2b etc.), position within the region of interest and the scale (this scale is not the same as the scale used at the detection stage, though these two scales are multiplied). For example, in the case of the third-line feature (2c) the response is calculated as the difference between the sum of image pixels under the rectangle covering the whole feature (including the two white stripes and the black stripe in the middle) and the sum of the image pixels under the black stripe multiplied by 3, in order to compensate for the differences in the size of the areas. The sums of pixel values over rectangular regions are calculated rapidly using integral images (see below and cvIntegral description).
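The rectangle sums described above reduce to four array references once the integral image is built. The following sketch, with hypothetical helper names, shows the standard construction: ii has (w+1) x (h+1) entries, with ii[y][x] holding the sum of the image over [0,x) x [0,y), so any rectangle sum costs four lookups regardless of its size.

```c
/* Hypothetical sketch of integral-image rectangle sums, the operation
   underlying Haar-like feature responses. img is a dense w x h 8-bit
   image; ii must hold (w+1)*(h+1) ints. */
void make_integral( const unsigned char* img, int w, int h, int* ii )
{
    int x, y;
    for( x = 0; x <= w; x++ ) ii[x] = 0;          /* top row of zeros */
    for( y = 0; y < h; y++ )
    {
        int rowsum = 0;
        ii[(y+1)*(w+1)] = 0;                      /* left column of zeros */
        for( x = 0; x < w; x++ )
        {
            rowsum += img[y*w + x];
            ii[(y+1)*(w+1) + (x+1)] = ii[y*(w+1) + (x+1)] + rowsum;
        }
    }
}

/* sum of pixels in the rectangle [x, x+rw) x [y, y+rh): 4 lookups */
int rect_sum( const int* ii, int w, int x, int y, int rw, int rh )
{
    return ii[(y+rh)*(w+1) + (x+rw)] - ii[y*(w+1) + (x+rw)]
         - ii[(y+rh)*(w+1) + x] + ii[y*(w+1) + x];
}
```

A feature response is then a weighted combination of two or three such rect_sum calls, matching the rect[].weight entries of CvHaarFeature below.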
To see the object detector at work, have a look at HaarFaceDetect demo.
The following reference is for the detection part only. There is a
separate application called haartraining
that can train a
cascade of boosted classifiers from a set of samples.
See opencv/apps/haartraining
for details.
Boosted Haar classifier structures
#define CV_HAAR_FEATURE_MAX 3

/* a haar feature consists of 2-3 rectangles with appropriate weights */
typedef struct CvHaarFeature
{
    int tilted; /* 0 means upright feature, 1 means 45°-rotated feature */
    /* 2-3 rectangles with weights of opposite signs and
       with absolute values inversely proportional to the areas of the rectangles.
       if rect[2].weight != 0, then
       the feature consists of 3 rectangles, otherwise it consists of 2 */
    struct
    {
        CvRect r;
        float weight;
    } rect[CV_HAAR_FEATURE_MAX];
} CvHaarFeature;

/* a single tree classifier (stump in the simplest case) that returns the
   response for the feature at the particular image location (i.e. pixel
   sum over subrectangles of the window) and gives out a value depending
   on the response */
typedef struct CvHaarClassifier
{
    int count; /* number of nodes in the decision tree */
    /* these are "parallel" arrays. Every index i corresponds to a node
       of the decision tree (root has 0-th index).
       left[i] - index of the left child (or negated index if the left child is a leaf)
       right[i] - index of the right child (or negated index if the right child is a leaf)
       threshold[i] - branch threshold. if feature response is <= threshold,
                      left branch is chosen, otherwise right branch is chosen.
       alpha[i] - output value corresponding to the leaf. */
    CvHaarFeature* haar_feature;
    float* threshold;
    int* left;
    int* right;
    float* alpha;
} CvHaarClassifier;

/* a boosted battery of classifiers (= stage classifier):
   the stage classifier returns 1 if the sum of the classifiers' responses
   is greater than threshold and 0 otherwise */
typedef struct CvHaarStageClassifier
{
    int count; /* number of classifiers in the battery */
    float threshold; /* threshold for the boosted classifier */
    CvHaarClassifier* classifier; /* array of classifiers */

    /* these fields are used for organizing trees of stage classifiers,
       rather than just straight cascades */
    int next;
    int child;
    int parent;
} CvHaarStageClassifier;

typedef struct CvHidHaarClassifierCascade CvHidHaarClassifierCascade;

/* cascade or tree of stage classifiers */
typedef struct CvHaarClassifierCascade
{
    int flags; /* signature */
    int count; /* number of stages */
    CvSize orig_window_size; /* original object size (the cascade is trained for) */

    /* these two parameters are set by cvSetImagesForHaarClassifierCascade */
    CvSize real_window_size; /* current object size */
    double scale; /* current scale */
    CvHaarStageClassifier* stage_classifier; /* array of stage classifiers */
    CvHidHaarClassifierCascade* hid_cascade; /* hidden optimized representation of
                                the cascade, created by cvSetImagesForHaarClassifierCascade */
} CvHaarClassifierCascade;
All the structures are used for representing a cascade of boosted Haar classifiers. The cascade has the following hierarchical structure:
Cascade: Stage_{1}: Classifier_{11}: Feature_{11} Classifier_{12}: Feature_{12} ... Stage_{2}: Classifier_{21}: Feature_{21} ... ...
The whole hierarchy can be constructed manually or loaded from a file using functions cvLoadHaarClassifierCascade or cvLoad.
Loads a trained cascade classifier from file or the classifier database embedded in OpenCV
CvHaarClassifierCascade* cvLoadHaarClassifierCascade( const char* directory, CvSize orig_window_size );
The function cvLoadHaarClassifierCascade
loads a trained cascade of Haar classifiers from a file or the classifier
database embedded in OpenCV. The base can be trained using the haartraining
application (see opencv/apps/haartraining for details).
The function is obsolete. Nowadays object detection classifiers are stored in XML or YAML files, rather than in directories. To load cascade from a file, use cvLoad function.
Releases haar classifier cascade
void cvReleaseHaarClassifierCascade( CvHaarClassifierCascade** cascade );
The function cvReleaseHaarClassifierCascade
deallocates the cascade that has been created manually or loaded using
cvLoadHaarClassifierCascade or
cvLoad.
Detects objects in the image
typedef struct CvAvgComp
{
    CvRect rect; /* bounding rectangle for the object (average rectangle of a group) */
    int neighbors; /* number of neighbor rectangles in the group */
} CvAvgComp;

CvSeq* cvHaarDetectObjects( const CvArr* image, CvHaarClassifierCascade* cascade, CvMemStorage* storage, double scale_factor=1.1, int min_neighbors=3, int flags=0, CvSize min_size=cvSize(0,0) );
min_neighbors: Minimum number (minus 1) of neighbor rectangles that makes up an object.
All the groups with fewer than min_neighbors-1 rectangles are rejected.
If min_neighbors
is 0, the function does no
grouping at all and returns all the detected candidate rectangles,
which may be useful if the user wants to apply a customized grouping procedure.
CV_HAAR_SCALE_IMAGE: for each scale factor used, the function will
downscale the image rather than "zoom" the feature coordinates in the classifier cascade.
Currently, the option can only be used alone, i.e. the flag can not be set together with the others.
CV_HAAR_DO_CANNY_PRUNING: if it is set, the function uses a Canny
edge detector to reject some image regions that contain too few or too many edges
and thus can not contain the searched object. The particular threshold values
are tuned for face detection, and in this case the pruning speeds up the processing.
CV_HAAR_FIND_BIGGEST_OBJECT: if it is set, the function finds
the largest object (if any) in the image. That is, the output sequence will
contain one (or zero) element(s).
CV_HAAR_DO_ROUGH_SEARCH: it should be used only when
CV_HAAR_FIND_BIGGEST_OBJECT
is set and min_neighbors
> 0.
If the flag is set, the function does not look for candidates of a smaller size
as soon as it has found the object (with enough neighbor candidates) at the current
scale. Typically, when min_neighbors
is fixed, this
mode yields a less accurate (a bit larger) object rectangle than
the regular single-object mode (flags
=CV_HAAR_FIND_BIGGEST_OBJECT
),
but it is much faster, up to an order of magnitude. A greater value of
min_neighbors
may be specified to improve the accuracy.
Note that in single-object mode CV_HAAR_DO_CANNY_PRUNING
does not improve performance much and can even slow down the processing.
The function cvHaarDetectObjects
finds
rectangular regions in the given image that are likely to contain objects
the cascade has been trained for and returns those regions as
a sequence of rectangles. The function scans the image several
times at different scales (see
cvSetImagesForHaarClassifierCascade). Each time it considers
overlapping regions in the image and applies the classifiers to the regions
using cvRunHaarClassifierCascade.
It may also apply some heuristics to reduce the number of analyzed regions, such as
Canny pruning. After it has proceeded and collected the candidate rectangles
(regions that passed the classifier cascade), it groups them and returns a
sequence of average rectangles for each large enough group. The default
parameters (scale_factor
=1.1, min_neighbors
=3, flags
=0)
are tuned for accurate yet slow object detection. For faster operation on
real video images, preferable settings are: scale_factor
=1.2, min_neighbors
=2,
flags
=CV_HAAR_DO_CANNY_PRUNING, min_size
=<minimum possible face size>
(for example, ~1/4 to 1/16 of the image area in case of video conferencing).
#include "cv.h"
#include "highgui.h"

CvHaarClassifierCascade* load_object_detector( const char* cascade_path )
{
    return (CvHaarClassifierCascade*)cvLoad( cascade_path );
}

void detect_and_draw_objects( IplImage* image, CvHaarClassifierCascade* cascade, int do_pyramids )
{
    IplImage* small_image = image;
    CvMemStorage* storage = cvCreateMemStorage(0);
    CvSeq* faces;
    int i, scale = 1;

    /* if the flag is specified, downscale the input image to get a
       performance boost w/o losing quality (perhaps) */
    if( do_pyramids )
    {
        small_image = cvCreateImage( cvSize(image->width/2,image->height/2), IPL_DEPTH_8U, 3 );
        cvPyrDown( image, small_image, CV_GAUSSIAN_5x5 );
        scale = 2;
    }

    /* use the fastest variant */
    faces = cvHaarDetectObjects( small_image, cascade, storage, 1.2, 2, CV_HAAR_DO_CANNY_PRUNING );

    /* draw all the rectangles */
    for( i = 0; i < faces->total; i++ )
    {
        /* extract the rectangles only */
        CvRect face_rect = *(CvRect*)cvGetSeqElem( faces, i, 0 );
        cvRectangle( image, cvPoint(face_rect.x*scale,face_rect.y*scale),
                     cvPoint((face_rect.x+face_rect.width)*scale,
                             (face_rect.y+face_rect.height)*scale),
                     CV_RGB(255,0,0), 3 );
    }

    if( small_image != image )
        cvReleaseImage( &small_image );
    cvReleaseMemStorage( &storage );
}

/* takes image filename and cascade path from the command line */
int main( int argc, char** argv )
{
    IplImage* image;
    if( argc==3 && (image = cvLoadImage( argv[1], 1 )) != 0 )
    {
        CvHaarClassifierCascade* cascade = load_object_detector(argv[2]);
        detect_and_draw_objects( image, cascade, 1 );
        cvNamedWindow( "test", 0 );
        cvShowImage( "test", image );
        cvWaitKey(0);
        cvReleaseHaarClassifierCascade( &cascade );
        cvReleaseImage( &image );
    }
    return 0;
}
Assigns images to the hidden cascade
void cvSetImagesForHaarClassifierCascade( CvHaarClassifierCascade* cascade, const CvArr* sum, const CvArr* sqsum, const CvArr* tilted_sum, double scale );
sum, sqsum, tilted_sum: Integral images computed by cvIntegral
.
scale: Window scale for the cascade. If scale
=1, the original window size is
used (objects of that size are searched), the same size as specified in
cvLoadHaarClassifierCascade
(24x24 in the case of "<default_face_cascade>"); if scale
=2,
a two times larger window is used (48x48 in the case of the default face cascade).
While this will speed up the search about four times,
faces smaller than 48x48 cannot be detected.
The function cvSetImagesForHaarClassifierCascade
assigns images and/or window scale to the hidden classifier cascade.
If the image pointers are NULL, the previously set images are used further
(i.e. NULLs mean "do not change images"). The scale parameter has no such "protection" value, but
the previous value can be retrieved with the
cvGetHaarClassifierCascadeScale function and reused. The function
is used to prepare the cascade for detecting an object of a particular size in a
particular image. The function is called internally by
cvHaarDetectObjects, but it can be called by the user when there is a need to
use the lower-level function cvRunHaarClassifierCascade.
Runs cascade of boosted classifier at given image location
int cvRunHaarClassifierCascade( CvHaarClassifierCascade* cascade, CvPoint pt, int start_stage=0 );
The function cvRunHaarClassifierCascade
runs a Haar classifier cascade at a single image location. Before using this
function the integral images and the appropriate scale (=> window size)
should be set using cvSetImagesForHaarClassifierCascade.
The function returns positive value if the analyzed rectangle passed all the classifier
stages (it is a candidate) and zero or negative value otherwise.
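The early-rejection logic of a cascade can be sketched in a few lines of C. This is an illustrative sketch with hypothetical types, not the CvHaarClassifierCascade machinery: each weak classifier is reduced to a stump over precomputed feature values, each stage sums its stumps' outputs, and the window is rejected as soon as a stage sum falls below the stage threshold.

```c
/* Hypothetical cascade sketch: stumps over precomputed feature values. */
typedef struct Stump { int feature; double threshold, left, right; } Stump;
typedef struct Stage { const Stump* stumps; int count; double threshold; } Stage;

/* Returns 1 if the window passes all stages (candidate), 0 otherwise. */
int run_cascade_sketch( const Stage* stages, int nstages, const double* features )
{
    int s, i;
    for( s = 0; s < nstages; s++ )
    {
        double sum = 0;
        for( i = 0; i < stages[s].count; i++ )
        {
            const Stump* st = &stages[s].stumps[i];
            /* stump: output left value if response <= threshold, else right */
            sum += features[st->feature] <= st->threshold ? st->left : st->right;
        }
        if( sum < stages[s].threshold )
            return 0;   /* rejected at stage s: later stages never run */
    }
    return 1;           /* passed all stages */
}
```

Because most windows are rejected by the first few cheap stages, the average cost per window stays far below the cost of evaluating the whole classifier.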
The functions in this section use the so-called pinhole camera model. That is, a scene view is formed by projecting 3D points into the image plane using a perspective transformation.
s*m' = A*[R|t]*M', or

    [u]   [fx  0 cx]   [r_{11} r_{12} r_{13} t_{1}]   [X]
  s*[v] = [ 0 fy cy] * [r_{21} r_{22} r_{23} t_{2}] * [Y]
    [1]   [ 0  0  1]   [r_{31} r_{32} r_{33} t_{3}]   [Z]
                                                      [1]
where
(X, Y, Z)
are coordinates of a 3D point in the world coordinate space,
(u, v)
are coordinates of point projection in pixels.
A
is called a camera matrix, or matrix of intrinsic parameters.
(cx, cy)
is a principal point (that is usually at the image center),
and fx, fy
are focal lengths expressed in pixelrelated units.
Thus, if an image from the camera is upsampled/downsampled by some factor,
all these parameters (fx, fy, cx
and cy
) should be scaled
(multiplied/divided, respectively) by the same factor.
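The role of the intrinsic parameters can be made concrete with a minimal projection sketch (a hypothetical helper, not a library function), assuming the point is already in camera coordinates, i.e. R = I and t = 0, and ignoring distortion:

```c
/* Hypothetical sketch: project a camera-space point (x, y, z), z != 0,
   using the intrinsic parameters fx, fy, cx, cy. */
void project_point( double x, double y, double z,
                    double fx, double fy, double cx, double cy,
                    double* u, double* v )
{
    *u = fx * x / z + cx;   /* u = fx*x' + cx, where x' = x/z */
    *v = fy * y / z + cy;   /* v = fy*y' + cy, where y' = y/z */
}
```

For instance, with fx = fy = 500 and a principal point of (320, 240), the point (1, 2, 2) projects to (570, 740); doubling the image size would require doubling all four intrinsics, as noted above.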
The matrix of intrinsic parameters does not depend on the scene viewed
and, once estimated, can be reused, as long as the focal length is fixed (which may not hold for a zoom lens).
The joint rotation-translation matrix [R|t]
is called a matrix of extrinsic parameters.
It is used to describe the camera motion around a static scene, or vice versa,
the rigid motion of an object in front of a still camera. That is, [R|t]
translates coordinates
of a point (X, Y, Z)
to some coordinate system, fixed with respect to the camera.
The transformation above is equivalent to the following (when z≠0):
[x]     [X]
[y] = R*[Y] + t
[z]     [Z]

x' = x/z
y' = y/z
u = fx*x' + cx
v = fy*y' + cy

Real lenses usually have some distortion, which is mainly radial distortion and slight tangential distortion. So, the above model is extended as:
[x]     [X]
[y] = R*[Y] + t
[z]     [Z]

x' = x/z
y' = y/z
x" = x'*(1 + k_{1}*r^{2} + k_{2}*r^{4} + k_{3}*r^{6}) + 2*p_{1}*x'*y' + p_{2}*(r^{2} + 2*x'^{2})
y" = y'*(1 + k_{1}*r^{2} + k_{2}*r^{4} + k_{3}*r^{6}) + p_{1}*(r^{2} + 2*y'^{2}) + 2*p_{2}*x'*y'
where r^{2} = x'^{2} + y'^{2}
u = fx*x" + cx
v = fy*y" + cy

k_{1}, k_{2}, k_{3} are radial distortion coefficients, p_{1}, p_{2} are tangential distortion coefficients. Higher-order coefficients are not considered in OpenCV. The distortion coefficients also do not depend on the scene viewed, thus they are intrinsic camera parameters. And they remain the same regardless of the captured image resolution. That is, if, for example, a camera has been calibrated on images of 320x240 resolution, absolutely the same distortion coefficients can be used for images of 640x480 resolution from the same camera (while fx, fy, cx and cy need to be scaled appropriately).
(k_{1}, k_{2}, p_{1}, p_{2}[, k_{3}]).
That is, the first 2 radial distortion coefficients are followed by the 2 tangential distortion coefficients and then, optionally, by the third radial distortion coefficient. Such ordering is used to keep backward compatibility with previous versions of OpenCV.
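The distortion model above can be applied directly to an ideal normalized point. The sketch below is a hypothetical helper (not a library function) that evaluates the x", y" equations for a point (x', y') and the coefficients in the (k1, k2, p1, p2, k3) order just described:

```c
/* Hypothetical sketch: apply the radial/tangential distortion model to an
   ideal normalized point (xp, yp) = (x', y'). Writes the distorted
   normalized point (x", y") to (*xd, *yd). */
void distort_point( double xp, double yp,
                    double k1, double k2, double p1, double p2, double k3,
                    double* xd, double* yd )
{
    double r2 = xp*xp + yp*yp;                          /* r^2 = x'^2 + y'^2 */
    double radial = 1 + k1*r2 + k2*r2*r2 + k3*r2*r2*r2; /* radial factor */
    *xd = xp*radial + 2*p1*xp*yp + p2*(r2 + 2*xp*xp);
    *yd = yp*radial + p1*(r2 + 2*yp*yp) + 2*p2*xp*yp;
}
```

With all five coefficients set to zero the mapping is the identity, which is a quick sanity check that the tangential terms are placed correctly.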
The functions below use the above model.
Projects 3D points to image plane
void cvProjectPoints2( const CvMat* object_points, const CvMat* rotation_vector, const CvMat* translation_vector, const CvMat* intrinsic_matrix, const CvMat* distortion_coeffs, CvMat* image_points, CvMat* dpdrot=NULL, CvMat* dpdt=NULL, CvMat* dpdf=NULL, CvMat* dpdc=NULL, CvMat* dpddist=NULL, double aspect_ratio=0 );
aspect_ratio
Optional "fixed aspect ratio" parameter. (When cvCalibrateCamera2 or cvStereoCalibrate
are called with the flag CV_CALIB_FIX_ASPECT_RATIO, only fy is estimated as an
independent parameter, and fx is computed as fy*aspect_ratio; this affects dpdf too).
If the parameter is 0, the aspect ratio is not fixed.
The function cvProjectPoints2
computes projections of 3D points onto the image plane given
intrinsic and extrinsic camera parameters. Optionally, the function computes Jacobians - matrices of
partial derivatives of the image point coordinates with respect to the particular parameters,
intrinsic and/or extrinsic. The Jacobians are used during the global optimization in
cvCalibrateCamera2 and
cvFindExtrinsicCameraParams2.
The function itself is also used to compute the reprojection error with the current
intrinsic and extrinsic parameters.
Note that by setting intrinsic and/or extrinsic parameters to special values, the function can be used to compute just the extrinsic transformation or just the intrinsic transformation (i.e. distortion of a sparse set of points).
Finds perspective transformation between two planes
void cvFindHomography( const CvMat* src_points, const CvMat* dst_points, CvMat* homography, int method=0, double ransacReprojThreshold=0, CvMat* mask=NULL );
method
The homography estimation method, one of:
0 - a regular method using all the point pairs
CV_RANSAC - a RANSAC-based robust method
CV_LMEDS - a Least-Median robust method
ransacReprojThreshold
The maximum allowed reprojection error to treat a point pair as an inlier (used in the RANSAC method only). If dst_points
coordinates are measured in pixels
with pixel-accurate precision, it makes sense to set this parameter somewhere in the range ~1..3.
mask
The optional output mask of inliers/outliers (used with CV_RANSAC
or CV_LMEDS
).
The function cvFindHomography
finds the perspective transformation H=||h_{ij}||
between the source
and the destination planes:

     [x'_{i}]     [x_{i}]
s_{i}[y'_{i}] ~ H*[y_{i}]
     [1     ]     [1    ]

so that the reprojection error is minimized:

sum_{i}((x'_{i} - (h11*x_{i} + h12*y_{i} + h13)/(h31*x_{i} + h32*y_{i} + h33))^{2} +
        (y'_{i} - (h21*x_{i} + h22*y_{i} + h23)/(h31*x_{i} + h32*y_{i} + h33))^{2}) -> min
If the parameter method
is set to the default value 0, the function uses all the point pairs to estimate
the best suitable homography matrix. However, if not all of the point pairs
(src_points
_{i}, dst_points
_{i}) fit the perspective transformation
(i.e. there can be outliers), it is still possible to estimate the correct transformation using one of the robust methods available.
Both methods, CV_RANSAC
and CV_LMEDS
, try many different random subsets of the corresponding point pairs
(of 4 pairs each), estimate the homography matrix using this subset and a simple least-squares algorithm, and then compute the
quality/goodness of the computed homography (which is the number of inliers for RANSAC or the median reprojection error for LMeDs).
The best subset is then used to produce the initial estimate of the homography matrix and the mask of inliers/outliers.
Regardless of the method, robust or not, the computed homography matrix is refined further (using inliers only in case of a robust method) with LevenbergMarquardt method in order to reduce the reprojection error even more.
The method CV_RANSAC
can handle practically any ratio of outliers, but it needs the threshold to distinguish inliers from outliers.
The method CV_LMEDS
does not need any threshold, but it works correctly only when there are more than 50% of inliers.
Finally, if you are confident in the computed features and there can be only some small noise, but no outliers,
the default method could be the best choice.
The function is used to find the initial intrinsic and extrinsic matrices. The homography matrix is determined up to a scale, thus it is normalized to make h33=1.
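The transformation and the error being minimized can be sketched in plain C (hypothetical helper names, not OpenCV functions; H is row-major and, because of the scale ambiguity, h33 need not be 1):

```c
#include <math.h>

/* Apply a 3x3 homography H to the point (x, y), performing the
   perspective division by the third homogeneous coordinate. */
void apply_homography( const double H[9], double x, double y,
                       double* xp, double* yp )
{
    double w = H[6]*x + H[7]*y + H[8];
    *xp = (H[0]*x + H[1]*y + H[2])/w;
    *yp = (H[3]*x + H[4]*y + H[5])/w;
}

/* Sum of squared reprojection errors over n point pairs - the quantity
   the estimation minimizes (over inliers, for the robust methods).
   src and dst are interleaved (x0,y0,x1,y1,...) arrays. */
double reproj_error( const double H[9], const double* src,
                     const double* dst, int n )
{
    double err = 0;
    for( int i = 0; i < n; i++ )
    {
        double xp, yp;
        apply_homography( H, src[2*i], src[2*i+1], &xp, &yp );
        err += (dst[2*i]-xp)*(dst[2*i]-xp) + (dst[2*i+1]-yp)*(dst[2*i+1]-yp);
    }
    return err;
}
```

Note that H and 2*H map points identically, which is why the scale normalization h33=1 is harmless.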
Finds intrinsic and extrinsic camera parameters using calibration pattern
void cvCalibrateCamera2( const CvMat* object_points, const CvMat* image_points, const CvMat* point_counts, CvSize image_size, CvMat* intrinsic_matrix, CvMat* distortion_coeffs, CvMat* rotation_vectors=NULL, CvMat* translation_vectors=NULL, int flags=0 );
CV_CALIB_USE_INTRINSIC_GUESS
and/or
CV_CALIB_FIX_ASPECT_RATIO
are specified, some or all
of fx, fy, cx, cy
must be initialized.
CV_CALIB_USE_INTRINSIC_GUESS
 intrinsic_matrix
contains
valid initial values of fx, fy, cx, cy
that are optimized further.
Otherwise, (cx, cy)
is initially set to the image center
(image_size
is used here),
and focal distances are computed in some leastsquares fashion.
Note, that if intrinsic parameters are known, there is no need to use this function.
Use cvFindExtrinsicCameraParams2 instead.
CV_CALIB_FIX_PRINCIPAL_POINT
- The principal point is not changed during the global
optimization; it stays at the center or at another location specified (when
CV_CALIB_USE_INTRINSIC_GUESS
is set as well).
CV_CALIB_FIX_FOCAL_LENGTH
- Both fx and fy are fixed.
CV_CALIB_FIX_ASPECT_RATIO
- The optimization procedure considers only
one of fx
and fy
as independent variable and keeps the aspect ratio
fx/fy
the same as it was set initially in intrinsic_matrix
.
In this case the actual initial values of (fx, fy)
are either taken from the matrix
(when CV_CALIB_USE_INTRINSIC_GUESS
is set) or estimated somehow (in the latter case
fx, fy
may be set to arbitrary values, only their ratio is used).
CV_CALIB_ZERO_TANGENT_DIST
- Tangential distortion coefficients are set to
zeros and do not change during the optimization.
CV_CALIB_FIX_K1
- The 0th distortion coefficient (k1) is fixed (to 0 or to the initial passed value if CV_CALIB_USE_INTRINSIC_GUESS
is passed).
CV_CALIB_FIX_K2
- The 1st distortion coefficient (k2) is fixed (see above).
CV_CALIB_FIX_K3
- The 4th distortion coefficient (k3) is fixed (see above).
The function cvCalibrateCamera2
estimates intrinsic camera parameters and, optionally, the extrinsic parameters
for each view of the calibration pattern.
The coordinates of 3D object points and their corresponding 2D projections in each view
must be specified. That may be achieved by using an object with known geometry and easily detectable
feature points. Such an object is called a calibration rig or calibration pattern, and OpenCV has builtin
support for a chess board as a calibration rig
(see cvFindChessboardCorners).
Currently, initialization of intrinsic parameters (when CV_CALIB_USE_INTRINSIC_GUESS
is not set) is only implemented for planar calibration rigs (zcoordinates of object points
must be all 0's or all 1's). 3D rigs can still be used as long as the initial intrinsic_matrix
is provided. After the initial values of intrinsic and extrinsic parameters are obtained by the function, they are
optimized further to minimize the total reprojection error - the sum of squared differences between the
actual coordinates of image points and the ones computed using
cvProjectPoints2 with current intrinsic and extrinsic parameters.
Computes various useful camera characteristics from the calibration matrix
void cvCalibrationMatrixValues( const CvMat *calibMatr, int imgWidth, int imgHeight, double apertureWidth=0, double apertureHeight=0, double *fovx=NULL, double *fovy=NULL, double *focalLength=NULL, CvPoint2D64f *principalPoint=NULL, double *pixelAspectRatio=NULL );
The function cvCalibrationMatrixValues
computes various useful camera (sensor/lens)
characteristics using the computed camera calibration matrix, image frame resolution in pixels
and the physical aperture size.
Finds extrinsic camera parameters for particular view
void cvFindExtrinsicCameraParams2( const CvMat* object_points, const CvMat* image_points, const CvMat* intrinsic_matrix, const CvMat* distortion_coeffs, CvMat* rotation_vector, CvMat* translation_vector );
The function cvFindExtrinsicCameraParams2
estimates the object pose
using the intrinsic camera parameters and a few (>=4) 2D<->3D point correspondences.
Calibrates stereo camera
void cvStereoCalibrate( const CvMat* object_points, const CvMat* image_points1, const CvMat* image_points2, const CvMat* point_counts, CvMat* camera_matrix1, CvMat* dist_coeffs1, CvMat* camera_matrix2, CvMat* dist_coeffs2, CvSize image_size, CvMat* R, CvMat* T, CvMat* E=0, CvMat* F=0, CvTermCriteria term_crit=cvTermCriteria( CV_TERMCRIT_ITER+CV_TERMCRIT_EPS,30,1e6), int flags=CV_CALIB_FIX_INTRINSIC );
CV_CALIB_USE_INTRINSIC_GUESS
or
CV_CALIB_FIX_ASPECT_RATIO
are specified, some or all
of the elements of the matrices must be initialized.
CV_CALIB_FIX_INTRINSIC
- If it is set, camera_matrix1,2
, as well as dist_coeffs1,2
, are fixed,
so that only the extrinsic parameters are optimized.
CV_CALIB_USE_INTRINSIC_GUESS
- The flag allows the function to optimize some or all of the intrinsic parameters,
depending on the other flags, but the initial values are provided by the user.
CV_CALIB_FIX_PRINCIPAL_POINT
- The principal points are fixed during the optimization.
CV_CALIB_FIX_FOCAL_LENGTH
- fx_{k} and fy_{k} are fixed.
CV_CALIB_FIX_ASPECT_RATIO
- fy_{k} is optimized, but the ratio fx_{k}/fy_{k} is fixed.
CV_CALIB_SAME_FOCAL_LENGTH
- Enforces fx_{0}=fx_{1} and fy_{0}=fy_{1}.
CV_CALIB_ZERO_TANGENT_DIST
- Tangential distortion coefficients for each camera are set to zeros and fixed there.
CV_CALIB_FIX_K1
- The 0th distortion coefficients (k1) are fixed.
CV_CALIB_FIX_K2
- The 1st distortion coefficients (k2) are fixed.
CV_CALIB_FIX_K3
- The 4th distortion coefficients (k3) are fixed.
The function cvStereoCalibrate
estimates the transformation between the 2 cameras making a stereo pair. If we have
a stereo camera, where the relative position and orientation of the 2 cameras is fixed, and if we computed poses of an object relative to
the first camera and to the second camera, (R_{1}, T_{1}) and (R_{2}, T_{2}), respectively (that can be done with
cvFindExtrinsicCameraParams2), obviously, those poses will relate to each other, i.e. given (R_{1}, T_{1})
it should be possible to compute (R_{2}, T_{2}) - we only need to know the position and orientation of the 2nd camera
relative to the 1st camera. That is what the described function does. It computes (R, T) such that:
R_{2}=R*R_{1}
T_{2}=R*T_{1} + T

Optionally, it computes the essential matrix E:

    [ 0     -T_{2}  T_{1}]
E = [ T_{2}  0     -T_{0}]*R,
    [-T_{1}  T_{0}  0    ]

where T_{i} are the components of the translation vector T: T=[T_{0}, T_{1}, T_{2}]^{T}. And also the function can compute the fundamental matrix F:

F = inv(camera_matrix2)^{T}*E*inv(camera_matrix1)

Besides the stereo-related information, the function can also perform full calibration of each of the 2 cameras. However, because of the high dimensionality of the parameter space and noise in the input data, the function can diverge from the correct solution. Thus, if the intrinsic parameters can be estimated with high accuracy for each of the cameras individually (e.g. using cvCalibrateCamera2), it is recommended to do so and then pass the
CV_CALIB_FIX_INTRINSIC
flag to the function along with the computed intrinsic parameters. Otherwise, if all the parameters
are estimated at once, it makes sense to restrict some parameters, e.g. pass CV_CALIB_SAME_FOCAL_LENGTH
and
CV_CALIB_ZERO_TANGENT_DIST
flags, which are usually reasonable assumptions.
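The pose relation R_{2}=R*R_{1}, T_{2}=R*T_{1}+T can be sketched in plain C (hypothetical helpers for illustration, not the OpenCV API; all matrices are 3x3 row-major):

```c
#include <math.h>

/* C = A*B for 3x3 row-major matrices. */
void matmul3( const double A[9], const double B[9], double C[9] )
{
    for( int i = 0; i < 3; i++ )
        for( int j = 0; j < 3; j++ )
            C[3*i+j] = A[3*i]*B[j] + A[3*i+1]*B[3+j] + A[3*i+2]*B[6+j];
}

/* Given the pose (R1, T1) of an object w.r.t. the first camera and the
   inter-camera transform (R, T) estimated by stereo calibration, compute
   the pose (R2, T2) of the same object w.r.t. the second camera. */
void compose_pose( const double R[9], const double T[3],
                   const double R1[9], const double T1[3],
                   double R2[9], double T2[3] )
{
    matmul3( R, R1, R2 );                        /* R2 = R*R1 */
    for( int i = 0; i < 3; i++ )                 /* T2 = R*T1 + T */
        T2[i] = R[3*i]*T1[0] + R[3*i+1]*T1[1] + R[3*i+2]*T1[2] + T[i];
}
```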
Computes rectification transform for stereo camera
void cvStereoRectify( const CvMat* camera_matrix1, const CvMat* camera_matrix2, const CvMat* dist_coeffs1, const CvMat* dist_coeffs2, CvSize image_size, const CvMat* R, const CvMat* T, CvMat* R1, CvMat* R2, CvMat* P1, CvMat* P2, CvMat* Q=0, int flags=CV_CALIB_ZERO_DISPARITY );
CV_CALIB_ZERO_DISPARITY
. If the flag is set, the function makes
the principal points of each camera have the same pixel coordinates in the rectified views. If the flag is not set, the function can
shift one of the images in the horizontal or vertical direction (depending on the orientation of the epipolar lines) in order to maximize
the useful image area.
The function cvStereoRectify
computes the rotation matrices for each camera that (virtually) make both
camera image planes the same plane. Consequently, that makes all the epipolar lines parallel and thus simplifies
the dense stereo correspondence problem. On input the function takes the matrices computed by cvStereoCalibrate
and on output it gives 2 rotation matrices and also 2 projection matrices in the new coordinates.
The function is normally called after cvStereoCalibrate that computes
both camera matrices, the distortion coefficients, R
and T
.
The function distinguishes the following 2 cases:
1. Horizontal stereo, when the epipolar lines in the rectified images are horizontal:

   [f 0 cx1 0]
P1=[0 f cy  0]
   [0 0 1   0]

   [f 0 cx2 Tx*f]
P2=[0 f cy  0   ],
   [0 0 1   0   ]

where Tx is the horizontal shift between the cameras, and cx1=cx2 if
CV_CALIB_ZERO_DISPARITY
is set.
2. Vertical stereo, when the epipolar lines are vertical:

   [f 0 cx  0]
P1=[0 f cy1 0]
   [0 0 1   0]

   [f 0 cx  0   ]
P2=[0 f cy2 Ty*f],
   [0 0 1   0   ]

where Ty is the vertical shift between the cameras, and cy1=cy2 if
CV_CALIB_ZERO_DISPARITY
is set.
Computes rectification transform for uncalibrated stereo camera
void cvStereoRectifyUncalibrated( const CvMat* points1, const CvMat* points2, const CvMat* F, CvSize image_size, CvMat* H1, CvMat* H2, double threshold=5 );
points1
and points2
using cvFindFundamentalMat.
threshold
The optional threshold used to filter out outliers. If it is greater than zero, all the point pairs that do not comply with the epipolar geometry well enough (that is, the points for which
fabs(points2[i]^{T}*F*points1[i]) > threshold
)
are rejected prior to computing the homographies.
The function cvStereoRectifyUncalibrated
computes the rectification transformations
without knowing intrinsic parameters of the cameras and their relative position in space,
hence the suffix "Uncalibrated". Another related difference from cvStereoRectify
is that the function outputs not the rectification transformations in the object (3D) space, but
the planar perspective transformations, encoded by the homography matrices H1
and H2
.
The function implements the following algorithm [Hartley99].
Note that while the algorithm does not need to know the intrinsic parameters of the cameras, it heavily depends on the epipolar geometry. Therefore, if the camera lenses have significant distortion, it is better to correct it before computing the fundamental matrix and calling this function. For example, the distortion coefficients can be estimated for each head of the stereo camera separately by using cvCalibrateCamera2, and then the images can be corrected using cvUndistort2.
Converts rotation matrix to rotation vector or vice versa
int cvRodrigues2( const CvMat* src, CvMat* dst, CvMat* jacobian=0 );
The function cvRodrigues2
converts a rotation vector to a rotation matrix or
vice versa. A rotation vector is a compact representation of a rotation matrix:
the direction of the rotation vector is the rotation axis, and the length of the vector is the rotation
angle around the axis.
The rotation matrix R
, corresponding to the rotation vector r
,
is computed as follows:

theta <- norm(r)
r <- r/theta

                                                       [ 0    -r_{z}  r_{y}]
R = cos(theta)*I + (1-cos(theta))*r*r^{T} + sin(theta)*[ r_{z}  0    -r_{x}]
                                                       [-r_{y}  r_{x}  0   ]

The inverse transformation can also be done easily, since

           [ 0    -r_{z}  r_{y}]
sin(theta)*[ r_{z}  0    -r_{x}] = (R - R^{T})/2
           [-r_{y}  r_{x}  0   ]

A rotation vector is a convenient representation of a rotation matrix, since it has only 3 degrees of freedom. The representation is used in the global optimization procedures inside cvFindExtrinsicCameraParams2 and cvCalibrateCamera2.
Transforms image to compensate lens distortion
void cvUndistort2( const CvArr* src, CvArr* dst, const CvMat* intrinsic_matrix, const CvMat* distortion_coeffs );
The function cvUndistort2
transforms the image to compensate radial and tangential lens distortion.
The camera matrix and distortion parameters can be determined using
cvCalibrateCamera2.
For every pixel in the output image the function computes coordinates of the corresponding location in the input
image using the formulae from the beginning of this section. Then, the pixel value is computed using bilinear interpolation.
If the resolution of images is different from what was used at the calibration stage,
fx, fy, cx
and cy
need to be adjusted appropriately, while
the distortion coefficients remain the same.
In the undistorted image the principal point will be at the image center.
Computes undistortion map
void cvInitUndistortMap( const CvMat* camera_matrix, const CvMat* distortion_coeffs, CvArr* mapx, CvArr* mapy );
The function cvInitUndistortMap
precomputes the undistortion map
- coordinates of the corresponding pixel in the distorted image for every pixel in the corrected image.
Then, the map (together with input and output images) can be passed
to cvRemap function.
In the undistorted image the principal point will be at the image center.
Computes the undistortion+rectification transformation map for a head of stereo camera
void cvInitUndistortRectifyMap( const CvMat* camera_matrix, const CvMat* dist_coeffs, const CvMat* R, const CvMat* new_camera_matrix, CvArr* mapx, CvArr* mapy );
R1
or R2
, computed
by cvStereoRectify can be passed here.
If the parameter is NULL, the identity matrix is used.
The function cvInitUndistortRectifyMap
is an extended version of
cvInitUndistortMap. That is, in addition to the correction of lens distortion,
the function can also apply arbitrary perspective transformation R and finally it
can scale and shift the image according to the new camera matrix. That is, in pseudo code the
transformation can be represented as:
// (u,v) is the input point
// camera_matrix=[fx 0 cx; 0 fy cy; 0 0 1]
// new_camera_matrix=[fx' 0 cx'; 0 fy' cy'; 0 0 1]
x = (u - cx')/fx'
y = (v - cy')/fy'
[X,Y,W]^{T} = R^{-1}*[x y 1]^{T}
x' = X/W, y' = Y/W
x" = x'*(1 + k_{1}r^{2} + k_{2}r^{4} + k_{3}r^{6}) + 2*p_{1}x'*y' + p_{2}(r^{2}+2*x'^{2})
y" = y'*(1 + k_{1}r^{2} + k_{2}r^{4} + k_{3}r^{6}) + p_{1}(r^{2}+2*y'^{2}) + 2*p_{2}*x'*y'
where r^{2} = x'^{2}+y'^{2}
mapx(u,v) = x"*fx + cx
mapy(u,v) = y"*fy + cy

Note that the code above does the reverse transformation from the target image (i.e. the ideal one, after undistortion and rectification) to the original "raw" image straight from the camera. That is done for bilinear interpolation purposes and in order to fill the whole destination image without gaps using cvRemap.
Normally, this function is called [twice, once for each head of stereo camera] after cvStereoRectify.
But it is also possible to compute the rectification transformations directly from the fundamental matrix, e.g.
by using cvStereoRectifyUncalibrated. Such functions
work with pixels and produce homographies as rectification transformations, not rotation matrices
R
in 3D space. In this case, the R
can be computed from the homography matrix
H
as
R = inv(camera_matrix)*H*camera_matrix
Computes the ideal point coordinates from the observed point coordinates
void cvUndistortPoints( const CvMat* src, CvMat* dst, const CvMat* camera_matrix, const CvMat* dist_coeffs, const CvMat* R=NULL, const CvMat* P=NULL);
R1
or R2
, computed
by cvStereoRectify can be passed here.
If the parameter is NULL, the identity matrix is used.
P1
or P2
, computed
by cvStereoRectify can be passed here.
If the parameter is NULL, the identity matrix is used.
The function cvUndistortPoints
is similar to
cvInitUndistortRectifyMap and is opposite to it at the same time.
The functions are similar in that they both are used to correct lens distortion and to perform
the optional perspective (rectification) transformation. They are opposite because the
function cvInitUndistortRectifyMap does actually perform
the reverse transformation in order to initialize the maps properly, while this function
does the forward transformation. That is, in pseudocode it can be expressed as:
// (u,v) is the input point, (u', v') is the output point
// camera_matrix=[fx 0 cx; 0 fy cy; 0 0 1]
// P=[fx' 0 cx' tx; 0 fy' cy' ty; 0 0 1 tz]
x" = (u - cx)/fx
y" = (v - cy)/fy
(x',y') = undistort(x",y",dist_coeffs)
[X,Y,W]^{T} = R*[x' y' 1]^{T}
x = X/W, y = Y/W
u' = x*fx' + cx'
v' = y*fy' + cy',

where undistort() is an approximate iterative algorithm that estimates the normalized original point coordinates from the normalized distorted point coordinates ("normalized" means that the coordinates do not depend on the camera matrix).
The function can be used both for stereo cameras and for individual cameras (when R=NULL).
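The iterative undistort() step mentioned above can be sketched in plain C. This is an assumption about one reasonable implementation, not the exact OpenCV code: starting from the distorted coordinates, it repeatedly re-evaluates the distortion at the current estimate and inverts the model:

```c
#include <math.h>

/* Given normalized distorted coordinates (xd, yd), estimate the ideal
   normalized coordinates (x, y) by fixed-point iteration of the
   distortion model from the beginning of the section. */
void undistort_normalized( double xd, double yd,
                           double k1, double k2, double k3,
                           double p1, double p2,
                           double* x, double* y )
{
    double xu = xd, yu = yd;          /* initial guess: no distortion */
    for( int iter = 0; iter < 20; iter++ )
    {
        double r2 = xu*xu + yu*yu;
        double radial = 1 + k1*r2 + k2*r2*r2 + k3*r2*r2*r2;
        double dx = 2*p1*xu*yu + p2*(r2 + 2*xu*xu);  /* tangential part */
        double dy = p1*(r2 + 2*yu*yu) + 2*p2*xu*yu;
        xu = (xd - dx)/radial;        /* invert x" = x'*radial + dx */
        yu = (yd - dy)/radial;
    }
    *x = xu; *y = yu;
}
```

For typical (small) distortion coefficients the iteration contracts quickly: distorting a point forward and then running this routine recovers the original coordinates to high precision.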
Finds positions of internal corners of the chessboard
int cvFindChessboardCorners( const void* image, CvSize pattern_size, CvPoint2D32f* corners, int* corner_count=NULL, int flags=CV_CALIB_CB_ADAPTIVE_THRESH );
CV_CALIB_CB_ADAPTIVE_THRESH
- use adaptive thresholding to convert the
image to black-and-white, rather than a fixed threshold level (computed from the average image brightness).
CV_CALIB_CB_NORMALIZE_IMAGE
- normalize the image using
cvNormalizeHist before applying fixed or adaptive thresholding.
CV_CALIB_CB_FILTER_QUADS
- use additional criteria (like contour area, perimeter,
square-like shape) to filter out false quads that are extracted at the contour retrieval stage.
The function cvFindChessboardCorners
attempts to determine whether the input
image is a view of the chessboard pattern and locate internal chessboard
corners. The function returns a non-zero value if all the corners have been found
and they have been placed in a certain order (row by row, left to right in every
row); otherwise, if the function fails to find all the corners or to reorder them,
it returns 0. For example, a regular chessboard has 8x8 squares and 7x7
internal corners, that is, points where the black squares touch each other.
The coordinates detected are approximate, and to determine their position more accurately,
the user may use the function cvFindCornerSubPix.
Renders the detected chessboard corners
void cvDrawChessboardCorners( CvArr* image, CvSize pattern_size, CvPoint2D32f* corners, int count, int pattern_was_found );
The function cvDrawChessboardCorners
draws the individual chessboard corners detected (as red circles)
if the board was not found (pattern_was_found
=0), or the colored corners connected with lines
if the board was found (pattern_was_found
≠0).
Initializes structure containing object information
CvPOSITObject* cvCreatePOSITObject( CvPoint3D32f* points, int point_count );
The function cvCreatePOSITObject
allocates memory for the object structure and
computes the object inverse matrix.
The preprocessed object data is stored in the structure CvPOSITObject, internal for OpenCV, which means that the user cannot directly access the structure data. The user may only create this structure and pass its pointer to the function.
The object is defined as a set of points given in a coordinate system. The function
cvPOSIT computes a vector that begins at the camera-related coordinate system center
and ends at points[0]
of the object.
Implements POSIT algorithm
void cvPOSIT( CvPOSITObject* posit_object, CvPoint2D32f* image_points, double focal_length, CvTermCriteria criteria, CvMatr32f rotation_matrix, CvVect32f translation_vector );
The function cvPOSIT
implements the POSIT algorithm. Image coordinates are given in a
camera-related coordinate system. The focal length may be retrieved using the camera
calibration functions. At every iteration of the algorithm a new perspective
projection of the estimated pose is computed.
The difference norm between two projections is the maximal distance between
corresponding points. The parameter criteria.epsilon
serves to stop the
algorithm if the difference is small.
Deallocates 3D object structure
void cvReleasePOSITObject( CvPOSITObject** posit_object );
Double pointer to the released CvPOSITObject
structure.
The function cvReleasePOSITObject
releases memory previously allocated by the
function cvCreatePOSITObject.
Calculates homography matrix for oblong planar object (e.g. arm)
void cvCalcImageHomography( float* line, CvPoint3D32f* center, float* intrinsic, float* homography );
The function cvCalcImageHomography
calculates the homography matrix for the initial
image transformation from image plane to the plane, defined by 3D oblong object line (See
Figure 610 in OpenCV Guide 3D Reconstruction Chapter).
Calculates fundamental matrix from corresponding points in two images
int cvFindFundamentalMat( const CvMat* points1, const CvMat* points2, CvMat* fundamental_matrix, int method=CV_FM_RANSAC, double param1=3., double param2=0.99, CvMat* status=NULL);
2xN, Nx2, 3xN
or Nx3
size
(where N
is the number of points).
A multi-channel 1xN
or Nx1
array is also acceptable.
The point coordinates should be floating-point (single or double precision).
The second array has the same size and format as points1.
The epipolar geometry is described by the following equation:
p_{2}^{T}*F*p_{1}=0,
where F
is fundamental matrix, p_{1}
and p_{2}
are corresponding
points in the first and the second images, respectively.
The function cvFindFundamentalMat
calculates the fundamental matrix using one of the four
methods listed above and returns the number of fundamental matrices found (1 or 3),
or 0 if no matrix is found.
The calculated fundamental matrix may be passed further to cvComputeCorrespondEpilines
that finds epipolar lines corresponding to the specified points.
int point_count = 100;
CvMat* points1;
CvMat* points2;
CvMat* status;
CvMat* fundamental_matrix;
int i;

points1 = cvCreateMat(1,point_count,CV_32FC2);
points2 = cvCreateMat(1,point_count,CV_32FC2);
status = cvCreateMat(1,point_count,CV_8UC1);

/* Fill the points here ... */
for( i = 0; i < point_count; i++ )
{
    points1->data.fl[i*2] = <x_{1,i}>;
    points1->data.fl[i*2+1] = <y_{1,i}>;
    points2->data.fl[i*2] = <x_{2,i}>;
    points2->data.fl[i*2+1] = <y_{2,i}>;
}

fundamental_matrix = cvCreateMat(3,3,CV_32FC1);
int fm_count = cvFindFundamentalMat( points1, points2, fundamental_matrix,
                                     CV_FM_RANSAC, 3, 0.99, status );
For points in one image of stereo pair computes the corresponding epilines in the other image
void cvComputeCorrespondEpilines( const CvMat* points, int which_image, const CvMat* fundamental_matrix, CvMat* correspondent_lines);
A 2xN, Nx2, 3xN
or Nx3
array (where N
is the number of points).
A multi-channel 1xN
or Nx1
array is also acceptable.
which_image
Index of the image (1 or 2) that contains the points.
correspondent_lines
The computed epilines, a 3xN
or Nx3
array.
For every point in one of the two images of a stereo pair, the function
cvComputeCorrespondEpilines
finds the equation of the line that contains
the corresponding point (i.e. the projection of the same 3D point) in the other image.
Each line is encoded by a vector of 3 elements l=[a,b,c]^{T}
, so that:
l^{T}*[x, y, 1]^{T}=0, or
a*x + b*y + c = 0

From the fundamental matrix definition (see the cvFindFundamentalMat discussion), the line
l_{2}
for a point p_{1}
in the first image (which_image
=1) can be computed as:
l_{2}=F*p_{1}

and the line
l_{1}
for a point p_{2}
in the second image (which_image
=2) can be computed as:

l_{1}=F^{T}*p_{2}
Line coefficients are defined up to a scale.
They are normalized so that a^{2}+b^{2}=1,
and are stored into correspondent_lines
.
Convert points to/from homogeneous coordinates
void cvConvertPointsHomogeneous( const CvMat* src, CvMat* dst );
2xN, Nx2, 3xN, Nx3, 4xN or Nx4
(where N
is the number of points).
A multi-channel 1xN
or Nx1
array is also acceptable.
The function cvConvertPointsHomogeneous
converts 2D or 3D points
from/to homogeneous coordinates, or simply copies or transposes the array.
If the input array dimensionality is larger than the output's,
each point's coordinates are divided by the last coordinate:

(x,y[,z],w) -> (x',y'[,z']):
x' = x/w
y' = y/w
z' = z/w (if the output is 3D)

If the output array dimensionality is larger, an extra 1 is appended to each point:

(x,y[,z]) -> (x,y[,z],1)

Otherwise, the input array is simply copied (with optional transposition) to the output. Note that, because the function accepts a large variety of array layouts, it may report an error when the input/output array dimensionality is ambiguous. It is always safe to use the function with the number of points
N
>=5, or
to use multichannel Nx1
or 1xN
arrays.
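The two dimensionality-changing branches can be sketched in plain C for the 2D case (hypothetical helper names, not the OpenCV API):

```c
/* (x, y, w) -> (x/w, y/w): divide by the last coordinate. */
void from_homogeneous2( const double src[3], double dst[2] )
{
    dst[0] = src[0]/src[2];
    dst[1] = src[1]/src[2];
}

/* (x, y) -> (x, y, 1): append an extra 1. */
void to_homogeneous2( const double src[2], double dst[3] )
{
    dst[0] = src[0];
    dst[1] = src[1];
    dst[2] = 1.0;
}
```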
The structure for block matching stereo correspondence algorithm
typedef struct CvStereoBMState
{
    // pre filters (normalize input images):
    int preFilterType; // 0 for now
    int preFilterSize; // ~5x5..21x21
    int preFilterCap;  // up to ~31
    // correspondence using Sum of Absolute Differences (SAD):
    int SADWindowSize; // could be 5x5..21x21
    int minDisparity;  // minimum disparity (=0)
    int numberOfDisparities; // maximum disparity - minimum disparity
    // post filters (knock out bad matches):
    int textureThreshold;  // areas with no texture are ignored
    float uniquenessRatio; // filter out pixels if there are other close matches
                           // with different disparity
    int speckleWindowSize; // disparity variation window (not used)
    int speckleRange;      // acceptable range of variation in window (not used)
    // internal buffers, do not modify (!)
    CvMat* preFilteredImg0;
    CvMat* preFilteredImg1;
    CvMat* slidingSumBuf;
} CvStereoBMState;
The block matching stereo correspondence algorithm, by Kurt Konolige, is a very fast
one-pass stereo matching algorithm that uses sliding sums of absolute differences
between pixels in the left image and the pixels in the right image,
shifted by some varying number of pixels (from minDisparity
to
minDisparity+numberOfDisparities
). On a pair of WxH images the algorithm
computes disparity in O(W*H*numberOfDisparities)
time. In order to improve
the quality and reliability of the disparity map, the algorithm includes pre-filtering and
post-filtering procedures.

Note that the algorithm searches for the corresponding blocks in the x direction only, which means that the supplied stereo pair should be rectified. Vertical stereo layout is not directly supported, but in such a case the images could be transposed by the user.
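The core SAD search along the x direction can be illustrated with a toy C sketch (no pre/post-filtering and no sliding-sum optimization, so this is the idea, not the actual OpenCV implementation):

```c
#include <stdlib.h>

/* For one block at (x0, y0) in the left image, return the disparity d in
   [minDisparity, minDisparity+numberOfDisparities) that minimizes the sum
   of absolute differences with the right-image block shifted by d.
   Both images are 8-bit single-channel with the given row width; the
   caller must keep x0+dx-d within the image. */
int sad_disparity( const unsigned char* left, const unsigned char* right,
                   int width, int x0, int y0, int blockSize,
                   int minDisparity, int numberOfDisparities )
{
    int bestD = minDisparity;
    long bestSAD = -1;
    for( int d = minDisparity; d < minDisparity + numberOfDisparities; d++ )
    {
        long sad = 0;
        for( int dy = 0; dy < blockSize; dy++ )
            for( int dx = 0; dx < blockSize; dx++ )
            {
                int l = left[(y0+dy)*width + x0+dx];
                int r = right[(y0+dy)*width + x0+dx - d];
                sad += abs(l - r);
            }
        if( bestSAD < 0 || sad < bestSAD ) { bestSAD = sad; bestD = d; }
    }
    return bestD;
}
```

On a synthetic pair where the right image is the left image shifted by a known amount, the minimum-SAD disparity recovers that shift.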
Creates block matching stereo correspondence structure
#define CV_STEREO_BM_BASIC    0
#define CV_STEREO_BM_FISH_EYE 1
#define CV_STEREO_BM_NARROW   2

CvStereoBMState* cvCreateStereoBMState( int preset=CV_STEREO_BM_BASIC,
                                        int numberOfDisparities=0 );
The function cvCreateStereoBMState creates the stereo correspondence structure and initializes it. It is possible to override any of the parameters at any time between the calls to cvFindStereoCorrespondenceBM.
Releases block matching stereo correspondence structure
void cvReleaseStereoBMState( CvStereoBMState** state );
The function cvReleaseStereoBMState releases the stereo correspondence structure and all the associated internal buffers.
Computes the disparity map using block matching algorithm
void cvFindStereoCorrespondenceBM( const CvArr* left, const CvArr* right, CvArr* disparity, CvStereoBMState* state );
The function cvFindStereoCorrespondenceBM computes disparity map for the input rectified stereo pair.
The structure for graph cutsbased stereo correspondence algorithm
typedef struct CvStereoGCState
{
    int Ithreshold; // threshold for piecewise-linear data cost function (5 by default)
    int interactionRadius; // radius for smoothness cost function (1 by default; means Potts model)
    float K, lambda, lambda1, lambda2; // parameters for the cost function
                                       // (usually computed adaptively from the input data)
    int occlusionCost; // 10000 by default
    int minDisparity; // 0 by default; see CvStereoBMState
    int numberOfDisparities; // defined by user; see CvStereoBMState
    int maxIters; // number of iterations; defined by user.
    // internal buffers
    CvMat* left; CvMat* right;
    CvMat* dispLeft; CvMat* dispRight;
    CvMat* ptrLeft; CvMat* ptrRight;
    CvMat* vtxBuf; CvMat* edgeBuf;
} CvStereoGCState;
The graph cuts stereo correspondence algorithm, described in [Kolmogorov03] (as KZ1), is a non-realtime stereo correspondence algorithm that usually gives a very accurate depth map with well-defined object boundaries. The algorithm represents the stereo problem as a sequence of binary optimization problems, each of which is solved using a max-flow algorithm on a graph. The state structure above should not be allocated and initialized manually; instead, use cvCreateStereoGCState and then override necessary parameters if needed.
Creates the state of graph cutbased stereo correspondence algorithm
CvStereoGCState* cvCreateStereoGCState( int numberOfDisparities, int maxIters );
state->minDisparity ≤ disparity < state->minDisparity + state->numberOfDisparities
The function cvCreateStereoGCState creates the stereo correspondence structure and initializes it. It is possible to override any of the parameters at any time between the calls to cvFindStereoCorrespondenceGC.
Releases the state structure of the graph cutbased stereo correspondence algorithm
void cvReleaseStereoGCState( CvStereoGCState** state );
The function cvReleaseStereoGCState releases the stereo correspondence structure and all the associated internal buffers.
Computes the disparity map using the graph cuts-based algorithm
void cvFindStereoCorrespondenceGC( const CvArr* left, const CvArr* right, CvArr* dispLeft, CvArr* dispRight, CvStereoGCState* state, int useDisparityGuess=0 );
The function cvFindStereoCorrespondenceGC computes disparity maps for the input rectified stereo pair. Note that the left disparity image will contain values in the following range:
-state->numberOfDisparities - state->minDisparity < dispLeft(x,y) ≤ -state->minDisparity, or dispLeft(x,y) == CV_STEREO_GC_OCCLUSION,
whereas for the right disparity image the following will be true:
state->minDisparity ≤ dispRight(x,y) < state->minDisparity + state->numberOfDisparities, or dispRight(x,y) == CV_STEREO_GC_OCCLUSION.
That is, the range for the left disparity image is inverted, and the pixels for which no good match has been found are marked as occlusions.
Here is how the function can be called:
// image_left and image_right are the input 8-bit single-channel images
// from the left and the right cameras, respectively
CvSize size = cvGetSize(image_left);
CvMat* disparity_left = cvCreateMat( size.height, size.width, CV_16S );
CvMat* disparity_right = cvCreateMat( size.height, size.width, CV_16S );
CvStereoGCState* state = cvCreateStereoGCState( 16, 2 );
cvFindStereoCorrespondenceGC( image_left, image_right,
    disparity_left, disparity_right, state, 0 );
cvReleaseStereoGCState( &state );
// now process the computed disparity images as you want ...
and this is the output left disparity image computed from the well-known Tsukuba stereo pair and multiplied by -16 (because the values in the left disparity images are usually negative):
CvMat* disparity_left_visual = cvCreateMat( size.height, size.width, CV_8U );
cvConvertScale( disparity_left, disparity_left_visual, -16 );
cvSave( "disparity.png", disparity_left_visual );
Reprojects disparity image to 3D space
void cvReprojectImageTo3D( const CvArr* disparity, CvArr* _3dImage, const CvMat* Q );
The function cvReprojectImageTo3D transforms a single-channel disparity map into a 3-channel image representing a 3D surface.
That is, for each pixel (x,y) and the corresponding disparity d=disparity(x,y), it computes:
[X Y Z W]^T = Q * [x y d 1]^T
_3dImage(x,y) = (X/W, Y/W, Z/W)
The matrix Q can be arbitrary, e.g. the one, computed by cvStereoRectify. To reproject a sparse set of points {(x,y,d),...} to 3D space, use cvPerspectiveTransform.
This bibliography provides a list of publications that might be useful to OpenCV users. The list is not complete; it serves only as a starting point.