Miscellaneous Image Transformations

cv::adaptiveThreshold


void adaptiveThreshold(const Mat& src, Mat& dst, double maxValue, int adaptiveMethod, int thresholdType, int blockSize, double C)

Applies an adaptive threshold to an array.

Parameters:
  • src – Source 8-bit single-channel image
  • dst – Destination image; will have the same size and the same type as src
  • maxValue – The non-zero value assigned to the pixels for which the condition is satisfied. See the discussion
  • adaptiveMethod – Adaptive thresholding algorithm to use, ADAPTIVE_THRESH_MEAN_C or ADAPTIVE_THRESH_GAUSSIAN_C (see the discussion)
  • thresholdType – Thresholding type; must be one of THRESH_BINARY or THRESH_BINARY_INV
  • blockSize – The size of a pixel neighborhood that is used to calculate a threshold value for the pixel: 3, 5, 7, and so on
  • C – The constant subtracted from the mean or weighted mean (see the discussion); normally, it’s positive, but may be zero or negative as well

The function transforms a grayscale image to a binary image according to the formulas:

  • THRESH_BINARY

    dst(x,y) =  \fork{\texttt{maxValue}}{if $src(x,y) > T(x,y)$}{0}{otherwise}

  • THRESH_BINARY_INV

    dst(x,y) =  \fork{0}{if $src(x,y) > T(x,y)$}{\texttt{maxValue}}{otherwise}

where T(x,y) is a threshold calculated individually for each pixel.

  1. For the method ADAPTIVE_THRESH_MEAN_C the threshold value T(x,y) is the mean of a \texttt{blockSize} \times \texttt{blockSize} neighborhood of (x, y) , minus C .
  2. For the method ADAPTIVE_THRESH_GAUSSIAN_C the threshold value T(x, y) is the weighted sum (i.e. cross-correlation with a Gaussian window) of a \texttt{blockSize} \times \texttt{blockSize} neighborhood of (x, y) , minus C . The default sigma (standard deviation) is used for the specified blockSize , see getGaussianKernel() .

The function can process the image in-place.
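The mean-based variant described above can be sketched in plain C++ without OpenCV. This is only an illustration under stated assumptions: the function name adaptiveMeanBinary and the row-major std::vector layout are hypothetical, only THRESH_BINARY is covered, and pixels outside the image are replaced by the nearest edge pixel, which may differ from what the library does at borders.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Sketch of ADAPTIVE_THRESH_MEAN_C combined with THRESH_BINARY on a
// row-major 8-bit single-channel image.
// Assumption: neighbors outside the image are replaced by the nearest edge pixel.
std::vector<std::uint8_t> adaptiveMeanBinary(const std::vector<std::uint8_t>& src,
                                             int width, int height,
                                             std::uint8_t maxValue,
                                             int blockSize, double C)
{
    std::vector<std::uint8_t> dst(src.size(), 0);
    const int radius = blockSize / 2;  // blockSize is expected to be odd: 3, 5, 7, ...
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            double sum = 0.0;
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    // Clamp the neighbor coordinates to the image area.
                    const int yy = std::min(std::max(y + dy, 0), height - 1);
                    const int xx = std::min(std::max(x + dx, 0), width - 1);
                    sum += src[yy * width + xx];
                }
            }
            // T(x,y): mean of the blockSize x blockSize neighborhood, minus C.
            const double T = sum / (blockSize * blockSize) - C;
            dst[y * width + x] = (src[y * width + x] > T) ? maxValue : 0;
        }
    }
    return dst;
}
```

On a perfectly uniform image the mean equals every pixel value, so with C = 0 no pixel is strictly greater than its threshold and the output is all zeros; a positive C lowers the threshold and turns the whole image to maxValue.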

See also: threshold() , blur() , GaussianBlur()

cv::cvtColor


void cvtColor(const Mat& src, Mat& dst, int code, int dstCn=0)

Converts an image from one color space to another.

Parameters:
  • src – The source image, 8-bit unsigned, 16-bit unsigned ( CV_16UC... ) or single-precision floating-point
  • dst – The destination image; will have the same size and the same depth as src
  • code – The color space conversion code; see the discussion
  • dstCn – The number of channels in the destination image; if the parameter is 0, the number of the channels will be derived automatically from src and the code

The function converts the input image from one color space to another. In the case of a transformation to or from the RGB color space, the ordering of the channels should be specified explicitly (RGB or BGR).

The conventional ranges for R, G and B channel values are:

  • 0 to 255 for CV_8U images
  • 0 to 65535 for CV_16U images and
  • 0 to 1 for CV_32F images.

For linear transformations the range does not matter, but for non-linear ones the input RGB image should be normalized to the proper value range to get correct results, e.g. for the RGB \rightarrow L*u*v* transformation. For example, a 32-bit floating-point image converted directly from an 8-bit image without any scaling has the value range 0..255 instead of the 0..1 range that the function assumes. So, before calling cvtColor , first scale the image down:

img *= 1./255;
cvtColor(img, img, CV_BGR2Luv);

The function can do the following transformations:

  • Transformations within RGB space like adding/removing the alpha channel, reversing the channel order, conversion to/from 16-bit RGB color (R5:G6:B5 or R5:G5:B5), as well as conversion to/from grayscale using:

    \text{RGB[A] to Gray:} \quad Y  \leftarrow 0.299  \cdot R + 0.587  \cdot G + 0.114  \cdot B

    and

    \text{Gray to RGB[A]:} \quad R  \leftarrow Y, G  \leftarrow Y, B  \leftarrow Y, A  \leftarrow 0

    The conversion from an RGB image to gray is done with:

    cvtColor(src, bwsrc, CV_RGB2GRAY);
    

    Some more advanced channel reordering can also be done with mixChannels() .

  • RGB \leftrightarrow CIE XYZ.Rec 709 with D65 white point ( CV_BGR2XYZ, CV_RGB2XYZ, CV_XYZ2BGR, CV_XYZ2RGB ):

    \begin{bmatrix} X  \\ Y  \\ Z
  \end{bmatrix} \leftarrow \begin{bmatrix} 0.412453 & 0.357580 & 0.180423 \\ 0.212671 & 0.715160 & 0.072169 \\ 0.019334 & 0.119193 & 0.950227
  \end{bmatrix} \cdot \begin{bmatrix} R  \\ G  \\ B
  \end{bmatrix}

    \begin{bmatrix} R  \\ G  \\ B
  \end{bmatrix} \leftarrow \begin{bmatrix} 3.240479 & -1.53715 & -0.498535 \\ -0.969256 &  1.875991 & 0.041556 \\ 0.055648 & -0.204043 & 1.057311
  \end{bmatrix} \cdot \begin{bmatrix} X  \\ Y  \\ Z
  \end{bmatrix}

    X , Y and Z cover the whole value range (in the case of floating-point images Z may exceed 1).

  • RGB \leftrightarrow YCrCb JPEG (a.k.a. YCC) ( CV_BGR2YCrCb, CV_RGB2YCrCb, CV_YCrCb2BGR, CV_YCrCb2RGB )

    Y  \leftarrow 0.299  \cdot R + 0.587  \cdot G + 0.114  \cdot B

    Cr  \leftarrow (R-Y)  \cdot 0.713 + delta

    Cb  \leftarrow (B-Y)  \cdot 0.564 + delta

    R  \leftarrow Y + 1.403  \cdot (Cr - delta)

    G  \leftarrow Y - 0.714  \cdot (Cr - delta) - 0.344  \cdot (Cb - delta)

    B  \leftarrow Y + 1.773  \cdot (Cb - delta)

    where

    delta =  \left \{ \begin{array}{l l} 128 &  \mbox{for 8-bit images} \\ 32768 &  \mbox{for 16-bit images} \\ 0.5 &  \mbox{for floating-point images} \end{array} \right .

    Y, Cr and Cb cover the whole value range.

  • RGB \leftrightarrow HSV ( CV_BGR2HSV, CV_RGB2HSV, CV_HSV2BGR, CV_HSV2RGB )

    in the case of 8-bit and 16-bit images R, G and B are converted to floating-point format and scaled to fit the 0 to 1 range

    V  \leftarrow max(R,G,B)

    S  \leftarrow \fork{\frac{V-min(R,G,B)}{V}}{if $V \neq 0$}{0}{otherwise}

    H  \leftarrow \forkthree{{60(G - B)}/{(V-min(R,G,B))}}{if $V=R$}{{120+60(B - R)}/{(V-min(R,G,B))}}{if $V=G$}{{240+60(R - G)}/{(V-min(R,G,B))}}{if $V=B$}

    If H<0 then H \leftarrow H+360 .

    On output 0 \leq V \leq 1 , 0 \leq S \leq 1 , 0 \leq H \leq 360 .

    The values are then converted to the destination data type:

    • 8-bit images

      V  \leftarrow 255 V, S  \leftarrow 255 S, H  \leftarrow H/2  \text{(to fit to 0 to 255)}

    • 16-bit images (currently not supported)

      V <- 65535 V, S <- 65535 S, H <- H

    • 32-bit images

      H, S, V are left as is

  • RGB \leftrightarrow HLS ( CV_BGR2HLS, CV_RGB2HLS, CV_HLS2BGR, CV_HLS2RGB ).

    in the case of 8-bit and 16-bit images R, G and B are converted to floating-point format and scaled to fit the 0 to 1 range.

    V_{max}  \leftarrow {max}(R,G,B)

    V_{min}  \leftarrow {min}(R,G,B)

    L  \leftarrow \frac{V_{max} + V_{min}}{2}

    S  \leftarrow \fork{\frac{V_{max} - V_{min}}{V_{max} + V_{min}}}{if $L < 0.5$}{\frac{V_{max} - V_{min}}{2 - (V_{max} + V_{min})}}{if $L \ge 0.5$}

    H  \leftarrow \forkthree{{60(G - B)}/{(V_{max} - V_{min})}}{if $V_{max}=R$}{{120+60(B - R)}/{(V_{max} - V_{min})}}{if $V_{max}=G$}{{240+60(R - G)}/{(V_{max} - V_{min})}}{if $V_{max}=B$}

    If H<0 then H \leftarrow H+360 .

    On output 0 \leq L \leq 1 , 0 \leq S \leq 1 , 0 \leq H \leq 360 .

    The values are then converted to the destination data type:

    • 8-bit images

      L  \leftarrow 255 \cdot L, S  \leftarrow 255 \cdot S, H  \leftarrow H/2 \; \text{(to fit to 0 to 255)}

    • 16-bit images (currently not supported)

      L <- 65535 \cdot L, S <- 65535 \cdot S, H <- H

    • 32-bit images

      H, L, S are left as is

  • RGB \leftrightarrow CIE L*a*b* ( CV_BGR2Lab, CV_RGB2Lab, CV_Lab2BGR, CV_Lab2RGB )

    in the case of 8-bit and 16-bit images R, G and B are converted to floating-point format and scaled to fit the 0 to 1 range

    \vecthree{X}{Y}{Z} \leftarrow \vecthreethree{0.412453}{0.357580}{0.180423}{0.212671}{0.715160}{0.072169}{0.019334}{0.119193}{0.950227} \cdot \vecthree{R}{G}{B}

    X  \leftarrow X/X_n,  \text{where} X_n = 0.950456

    Z  \leftarrow Z/Z_n,  \text{where} Z_n = 1.088754

    L  \leftarrow \fork{116*Y^{1/3}-16}{for $Y>0.008856$}{903.3*Y}{for $Y \le 0.008856$}

    a  \leftarrow 500 (f(X)-f(Y)) + delta

    b  \leftarrow 200 (f(Y)-f(Z)) + delta

    where

    f(t)= \fork{t^{1/3}}{for $t>0.008856$}{7.787 t+16/116}{for $t\leq 0.008856$}

    and

    delta =  \fork{128}{for 8-bit images}{0}{for floating-point images}

    On output 0 \leq L \leq 100 , -127 \leq a \leq 127 , -127 \leq b \leq 127 .

    The values are then converted to the destination data type:

    • 8-bit images

      L  \leftarrow L*255/100, \; a  \leftarrow a + 128, \; b  \leftarrow b + 128

    • 16-bit images

      currently not supported

    • 32-bit images

      L, a, b are left as is

  • RGB \leftrightarrow CIE L*u*v* ( CV_BGR2Luv, CV_RGB2Luv, CV_Luv2BGR, CV_Luv2RGB )

    in the case of 8-bit and 16-bit images R, G and B are converted to floating-point format and scaled to fit the 0 to 1 range

    \vecthree{X}{Y}{Z} \leftarrow \vecthreethree{0.412453}{0.357580}{0.180423}{0.212671}{0.715160}{0.072169}{0.019334}{0.119193}{0.950227} \cdot \vecthree{R}{G}{B}

    L  \leftarrow \fork{116 Y^{1/3} - 16}{for $Y>0.008856$}{903.3 Y}{for $Y\leq 0.008856$}

    u'  \leftarrow 4*X/(X + 15*Y + 3 Z)

    v'  \leftarrow 9*Y/(X + 15*Y + 3 Z)

    u  \leftarrow 13*L*(u' - u_n)  \quad \text{where} \quad u_n=0.19793943

    v  \leftarrow 13*L*(v' - v_n)  \quad \text{where} \quad v_n=0.46831096

    On output 0 \leq L \leq 100 , -134 \leq u \leq 220 , -140 \leq v \leq 122 .

    The values are then converted to the destination data type:

    • 8-bit images

      L  \leftarrow 255/100 L, \; u  \leftarrow 255/354 (u + 134), \; v  \leftarrow 255/262 (v + 140)

    • 16-bit images

      currently not supported

    • 32-bit images

      L, u, v are left as is

    The above formulas for converting RGB to/from various color spaces have been taken from multiple sources on the Web, primarily from the Charles Poynton site http://www.poynton.com/ColorFAQ.html

  • Bayer \rightarrow RGB ( CV_BayerBG2BGR, CV_BayerGB2BGR, CV_BayerRG2BGR, CV_BayerGR2BGR, CV_BayerBG2RGB, CV_BayerGB2RGB, CV_BayerRG2RGB, CV_BayerGR2RGB )

    The Bayer pattern is widely used in CCD and CMOS cameras. It allows one to get color pictures from a single plane where R, G and B pixels (sensors of a particular component) are interleaved like this:

    \newcommand{\Rcell}{\color{red}R} \newcommand{\Gcell}{\color{green}G} \newcommand{\Bcell}{\color{blue}B} \definecolor{BackGray}{rgb}{0.8,0.8,0.8} \begin{array}{ c c c c c } \Rcell & \Gcell & \Rcell & \Gcell & \Rcell \\ \Gcell & \colorbox{BackGray}{\Bcell} & \colorbox{BackGray}{\Gcell} & \Bcell & \Gcell \\ \Rcell & \Gcell & \Rcell & \Gcell & \Rcell \\ \Gcell & \Bcell & \Gcell & \Bcell & \Gcell \\ \Rcell & \Gcell & \Rcell & \Gcell & \Rcell \end{array}

    The output RGB components of a pixel are interpolated from 1, 2 or 4 neighbors of the pixel having the same color. There are several modifications of the above pattern that can be achieved by shifting the pattern one pixel left and/or one pixel up. The two letters C_1 and C_2 in the conversion constants CV_Bayer C_1 C_2 2BGR and CV_Bayer C_1 C_2 2RGB indicate the particular pattern type: these are the components from the second row, second and third columns, respectively. For example, the above pattern is of the very popular “BG” type.
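As a rough illustration of the RGB \rightarrow YCrCb formulas listed above, the forward conversion of a single pixel could be written in plain C++ as follows. The function name rgbToYCrCb and its signature are hypothetical and not part of the library; the sketch works on double values and leaves rounding and saturation for integer images to the caller.

```cpp
#include <array>

// Forward RGB -> YCrCb conversion for one pixel, following the formulas above.
// delta is 128 for 8-bit data, 32768 for 16-bit data and 0.5 for floating-point data.
// The result is returned unrounded and unclamped; an 8-bit or 16-bit
// implementation would still have to round and saturate it.
std::array<double, 3> rgbToYCrCb(double R, double G, double B, double delta)
{
    const double Y  = 0.299 * R + 0.587 * G + 0.114 * B;
    const double Cr = (R - Y) * 0.713 + delta;
    const double Cb = (B - Y) * 0.564 + delta;
    std::array<double, 3> out = {{Y, Cr, Cb}};
    return out;
}
```

For a gray pixel (R = G = B) both chroma components come out equal to delta, which is a quick sanity check for any implementation of these formulas.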

cv::distanceTransform


void distanceTransform(const Mat& src, Mat& dst, int distanceType, int maskSize)
void distanceTransform(const Mat& src, Mat& dst, Mat& labels, int distanceType, int maskSize)

Calculates the distance to the closest zero pixel for each pixel of the source image.

Parameters:
  • src – 8-bit, single-channel (binary) source image
  • dst – Output image with calculated distances; will be 32-bit floating-point, single-channel image of the same size as src
  • distanceType – Type of distance; can be CV_DIST_L1, CV_DIST_L2 or CV_DIST_C
  • maskSize – Size of the distance transform mask; can be 3, 5 or CV_DIST_MASK_PRECISE (the latter option is only supported by the first of the functions). In the case of CV_DIST_L1 or CV_DIST_C distance type the parameter is forced to 3, because a 3\times 3 mask gives the same result as a 5\times 5 or any larger aperture.
  • labels – The optional output 2d array of labels - the discrete Voronoi diagram; will have type CV_32SC1 and the same size as src . See the discussion

The functions distanceTransform calculate the approximate or precise distance from every binary image pixel to the nearest zero pixel (for zero image pixels the distance is zero).

When maskSize == CV_DIST_MASK_PRECISE and distanceType == CV_DIST_L2 , the function runs the algorithm described in Felzenszwalb04 .

In other cases the algorithm Borgefors86 is used, that is, for each pixel the function finds the shortest path to the nearest zero pixel consisting of basic shifts: horizontal, vertical, diagonal or knight’s move (the latter is available for a 5\times 5 mask). The overall distance is calculated as a sum of these basic distances. Because the distance function should be symmetric, all of the horizontal and vertical shifts must have the same cost (denoted a ), all the diagonal shifts must have the same cost (denoted b ), and all knight’s moves must have the same cost (denoted c ). For the CV_DIST_C and CV_DIST_L1 types the distance is calculated precisely, whereas for CV_DIST_L2 (Euclidean distance) the distance can be calculated only with some relative error (a 5\times 5 mask gives more accurate results). For a , b and c OpenCV uses the values suggested in the original paper:

CV_DIST_C (3\times 3) a = 1, b = 1
CV_DIST_L1 (3\times 3) a = 1, b = 2
CV_DIST_L2 (3\times 3) a=0.955, b=1.3693
CV_DIST_L2 (5\times 5) a=1, b=1.4, c=2.1969

Typically, for a fast, coarse CV_DIST_L2 distance estimation, a 3\times 3 mask is used, and for a more accurate CV_DIST_L2 distance estimation, a 5\times 5 mask or the precise algorithm is used. Note that both the precise and the approximate algorithms are linear in the number of pixels.

The second variant of the function not only computes the minimum distance for each pixel (x, y) , but also identifies the nearest connected component consisting of zero pixels. The index of the component is stored in \texttt{labels}(x, y) . The connected components of zero pixels are also found and marked by the function.

In this mode the complexity is still linear. That is, the function provides a very fast way to compute the Voronoi diagram for the binary image. Currently, this second variant can only use the approximate distance transform algorithm.
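The approximate scheme with a 3\times 3 mask can be illustrated with a classic two-pass chamfer sweep. The following is a minimal sketch, not the library's implementation: the function name chamferDistance3x3 and the row-major std::vector layout are hypothetical, the costs a and b are passed in by the caller (for example a = 1, b = 2 for CV_DIST_L1), and pixels that cannot reach any zero pixel keep a large sentinel value.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Two-pass chamfer distance transform with a 3x3 mask.
// a is the cost of a horizontal or vertical step, b the cost of a diagonal step.
// src is a row-major 8-bit image; zero pixels are the distance sources.
std::vector<float> chamferDistance3x3(const std::vector<std::uint8_t>& src,
                                      int width, int height, float a, float b)
{
    const float kUnreached = 1e9f;  // sentinel for "no zero pixel found yet"
    std::vector<float> dist(src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
        dist[i] = (src[i] == 0) ? 0.0f : kUnreached;

    // Forward pass: propagate distances from the left and from the row above.
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float& d = dist[y * width + x];
            if (x > 0) d = std::min(d, dist[y * width + x - 1] + a);
            if (y > 0) {
                d = std::min(d, dist[(y - 1) * width + x] + a);
                if (x > 0) d = std::min(d, dist[(y - 1) * width + x - 1] + b);
                if (x < width - 1) d = std::min(d, dist[(y - 1) * width + x + 1] + b);
            }
        }
    }

    // Backward pass: propagate distances from the right and from the row below.
    for (int y = height - 1; y >= 0; --y) {
        for (int x = width - 1; x >= 0; --x) {
            float& d = dist[y * width + x];
            if (x < width - 1) d = std::min(d, dist[y * width + x + 1] + a);
            if (y < height - 1) {
                d = std::min(d, dist[(y + 1) * width + x] + a);
                if (x < width - 1) d = std::min(d, dist[(y + 1) * width + x + 1] + b);
                if (x > 0) d = std::min(d, dist[(y + 1) * width + x - 1] + b);
            }
        }
    }
    return dist;
}
```

Each pixel is visited twice regardless of the image content, which is why the approximate algorithm is linear in the number of pixels.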

cv::floodFill


int floodFill(Mat& image, Point seed, Scalar newVal, Rect* rect=0, Scalar loDiff=Scalar(), Scalar upDiff=Scalar(), int flags=4)
int floodFill(Mat& image, Mat& mask, Point seed, Scalar newVal, Rect* rect=0, Scalar loDiff=Scalar(), Scalar upDiff=Scalar(), int flags=4)

Fills a connected component with the given color.

Parameters:
  • image – Input/output 1- or 3-channel, 8-bit or floating-point image. It is modified by the function unless the FLOODFILL_MASK_ONLY flag is set (in the second variant of the function; see below)
  • mask – (For the second function only) Operation mask, should be a single-channel 8-bit image, 2 pixels wider and 2 pixels taller than image. The function uses and updates the mask, so the user is responsible for initializing the mask content. Flood-filling can’t go across non-zero pixels in the mask; for example, an edge detector output can be used as a mask to stop filling at edges. It is possible to use the same mask in multiple calls to the function to make sure the filled areas do not overlap. Note: because the mask is larger than the filled image, a pixel (x, y) in image corresponds to the pixel (x+1, y+1) in the mask
  • seed – The starting point
  • newVal – New value of the repainted domain pixels
  • loDiff – Maximal lower brightness/color difference between the currently observed pixel and one of its neighbors belonging to the component, or a seed pixel being added to the component
  • upDiff – Maximal upper brightness/color difference between the currently observed pixel and one of its neighbors belonging to the component, or a seed pixel being added to the component
  • rect – The optional output parameter that the function sets to the minimum bounding rectangle of the repainted domain
  • flags

    The operation flags. Lower bits contain connectivity value, 4 (by default) or 8, used within the function. Connectivity determines which neighbors of a pixel are considered. Upper bits can be 0 or a combination of the following flags:

    • FLOODFILL_FIXED_RANGE if set, the difference between the current pixel and seed pixel is considered, otherwise the difference between neighbor pixels is considered (i.e. the range is floating)
    • FLOODFILL_MASK_ONLY (for the second variant only) if set, the function does not change the image ( newVal is ignored), but fills the mask

The functions floodFill fill a connected component starting from the seed point with the specified color. The connectivity is determined by the color/brightness closeness of the neighbor pixels. The pixel at (x,y) is considered to belong to the repainted domain if:

  • grayscale image, floating range

    \texttt{src} (x',y')- \texttt{loDiff} \leq \texttt{src} (x,y)  \leq \texttt{src} (x',y')+ \texttt{upDiff}

  • grayscale image, fixed range

    \texttt{src} ( \texttt{seed} .x, \texttt{seed} .y)- \texttt{loDiff} \leq \texttt{src} (x,y)  \leq \texttt{src} ( \texttt{seed} .x, \texttt{seed} .y)+ \texttt{upDiff}

  • color image, floating range

    \texttt{src} (x',y')_r- \texttt{loDiff} _r \leq \texttt{src} (x,y)_r \leq \texttt{src} (x',y')_r+ \texttt{upDiff} _r

    \texttt{src} (x',y')_g- \texttt{loDiff} _g \leq \texttt{src} (x,y)_g \leq \texttt{src} (x',y')_g+ \texttt{upDiff} _g

    \texttt{src} (x',y')_b- \texttt{loDiff} _b \leq \texttt{src} (x,y)_b \leq \texttt{src} (x',y')_b+ \texttt{upDiff} _b

  • color image, fixed range

    \texttt{src} ( \texttt{seed} .x, \texttt{seed} .y)_r- \texttt{loDiff} _r \leq \texttt{src} (x,y)_r \leq \texttt{src} ( \texttt{seed} .x, \texttt{seed} .y)_r+ \texttt{upDiff} _r

    \texttt{src} ( \texttt{seed} .x, \texttt{seed} .y)_g- \texttt{loDiff} _g \leq \texttt{src} (x,y)_g \leq \texttt{src} ( \texttt{seed} .x, \texttt{seed} .y)_g+ \texttt{upDiff} _g

    \texttt{src} ( \texttt{seed} .x, \texttt{seed} .y)_b- \texttt{loDiff} _b \leq \texttt{src} (x,y)_b \leq \texttt{src} ( \texttt{seed} .x, \texttt{seed} .y)_b+ \texttt{upDiff} _b

where src(x',y') is the value of one of the pixel’s neighbors that is already known to belong to the component. That is, to be added to the connected component, a pixel’s color/brightness should be close enough to the:

  • color/brightness of one of its neighbors that already belong to the connected component in the case of floating range
  • color/brightness of the seed point in the case of fixed range.

By using these functions you can either mark a connected component with the specified color in-place, or build a mask and then extract the contour, copy the region to another image, etc. Various modes of the function are demonstrated in the floodfill.c sample.
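The fixed-range rule for a grayscale image can be sketched in plain C++ as follows. This is an illustration only: the function name floodFillFixedRange, the row-major std::vector layout, the integer loDiff and upDiff parameters and the choice to return the number of repainted pixels are assumptions of the sketch, and only 4-connectivity without a mask is covered.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Fixed-range, 4-connected flood fill on a row-major 8-bit grayscale image.
// A pixel joins the component when its value lies within
// [seedValue - loDiff, seedValue + upDiff] and it touches the component.
// Returns the number of repainted pixels (a choice made for this sketch).
int floodFillFixedRange(std::vector<std::uint8_t>& image, int width, int height,
                        int seedX, int seedY, std::uint8_t newVal,
                        int loDiff, int upDiff)
{
    const int seedVal = image[seedY * width + seedX];
    std::vector<char> visited(image.size(), 0);
    std::vector<std::pair<int, int> > stack;
    stack.push_back(std::make_pair(seedX, seedY));
    visited[seedY * width + seedX] = 1;

    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    int area = 0;

    while (!stack.empty()) {
        const std::pair<int, int> p = stack.back();
        stack.pop_back();
        image[p.second * width + p.first] = newVal;
        ++area;
        for (int k = 0; k < 4; ++k) {
            const int nx = p.first + dx[k];
            const int ny = p.second + dy[k];
            if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
            const int idx = ny * width + nx;
            if (visited[idx]) continue;
            // Unvisited pixels still hold their original value here.
            const int v = image[idx];
            // Fixed range: compare against the seed value, not the neighbor.
            if (v >= seedVal - loDiff && v <= seedVal + upDiff) {
                visited[idx] = 1;
                stack.push_back(std::make_pair(nx, ny));
            }
        }
    }
    return area;
}
```

The separate visited array keeps the sketch correct even when newVal itself falls inside the accepted range. A floating-range variant would compare each candidate with the neighbor it was reached from instead of with seedVal.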

See also: findContours()

cv::inpaint


void inpaint(const Mat& src, const Mat& inpaintMask, Mat& dst, double inpaintRadius, int flags)

Inpaints the selected region in the image.

Parameters:
  • src – The input 8-bit 1-channel or 3-channel image.
  • inpaintMask – The inpainting mask, 8-bit 1-channel image. Non-zero pixels indicate the area that needs to be inpainted.
  • dst – The output image; will have the same size and the same type as src
  • inpaintRadius – The radius of a circular neighborhood of each inpainted point that is considered by the algorithm.
  • flags

    The inpainting method, one of the following:

    • INPAINT_NS Navier-Stokes based method.
    • INPAINT_TELEA The method by Alexandru Telea Telea04

The function reconstructs the selected image area from the pixels near the area boundary. The function may be used to remove dust and scratches from a scanned photo, or to remove undesirable objects from still images or video. See http://en.wikipedia.org/wiki/Inpainting for more details.

cv::integral


void integral(const Mat& image, Mat& sum, int sdepth=-1)
void integral(const Mat& image, Mat& sum, Mat& sqsum, int sdepth=-1)
void integral(const Mat& image, Mat& sum, Mat& sqsum, Mat& tilted, int sdepth=-1)

Calculates the integral of an image.

Parameters:
  • image – The source image, W \times H , 8-bit or floating-point (32f or 64f)
  • sum – The integral image, (W+1)\times (H+1) , 32-bit integer or floating-point (32f or 64f)
  • sqsum – The integral image for squared pixel values, (W+1)\times (H+1) , double precision floating-point (64f)
  • tilted – The integral for the image rotated by 45 degrees, (W+1)\times (H+1) , the same data type as sum
  • sdepth – The desired depth of the integral and the tilted integral images, CV_32S , CV_32F or CV_64F

The functions integral calculate one or more integral images for the source image as follows:

\texttt{sum} (X,Y) =  \sum _{x<X,y<Y}  \texttt{image} (x,y)

\texttt{sqsum} (X,Y) =  \sum _{x<X,y<Y}  \texttt{image} (x,y)^2

\texttt{tilted} (X,Y) =  \sum _{y<Y,abs(x-X+1) \leq Y-y-1}  \texttt{image} (x,y)

Using these integral images, one may calculate the sum, mean and standard deviation over a specific upright or rotated rectangular region of the image in constant time, for example:

\sum _{x_1 \leq x < x_2,  \, y_1  \leq y < y_2}  \texttt{image} (x,y) =  \texttt{sum} (x_2,y_2)- \texttt{sum} (x_1,y_2)- \texttt{sum} (x_2,y_1)+ \texttt{sum} (x_1,y_1)

This makes it possible to do fast blurring or fast block correlation with a variable window size, for example. In the case of multi-channel images, sums for each channel are accumulated independently.
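The definition of sum and the constant-time rectangle query can be sketched in plain C++ as follows. The names integralImage and rectSum, the row-major std::vector layout and the 64-bit accumulator type are assumptions of this sketch and do not reflect the library's data types.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds the (width + 1) x (height + 1) integral image of a row-major 8-bit
// image: sum(X, Y) is the total of all image(x, y) with x < X and y < Y.
// The first row and the first column of the result are therefore zero.
std::vector<long long> integralImage(const std::vector<std::uint8_t>& image,
                                     int width, int height)
{
    const int stride = width + 1;
    std::vector<long long> sum(static_cast<std::size_t>(stride) * (height + 1), 0);
    for (int y = 0; y < height; ++y) {
        long long rowSum = 0;  // running sum of the current image row
        for (int x = 0; x < width; ++x) {
            rowSum += image[y * width + x];
            sum[(y + 1) * stride + (x + 1)] = sum[y * stride + (x + 1)] + rowSum;
        }
    }
    return sum;
}

// Sum of image(x, y) over x1 <= x < x2, y1 <= y < y2 using four lookups.
long long rectSum(const std::vector<long long>& sum, int width,
                  int x1, int y1, int x2, int y2)
{
    const int stride = width + 1;
    return sum[y2 * stride + x2] - sum[y2 * stride + x1]
         - sum[y1 * stride + x2] + sum[y1 * stride + x1];
}
```

Once the integral image is built, every rectangle query costs four reads and three additions, independent of the rectangle size.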

As a practical example, the next figure shows the calculation of the integral of a straight rectangle Rect(3,3,3,2) and of a tilted rectangle Rect(5,1,2,3) . The selected pixels in the original image are shown, as well as the relative pixels in the integral images sum and tilted .

_images/integral.png

cv::threshold


double threshold(const Mat& src, Mat& dst, double thresh, double maxVal, int thresholdType)

Applies a fixed-level threshold to each array element.

Parameters:
  • src – Source array (single-channel, 8-bit or 32-bit floating point)
  • dst – Destination array; will have the same size and the same type as src
  • thresh – Threshold value
  • maxVal – Maximum value to use with THRESH_BINARY and THRESH_BINARY_INV thresholding types
  • thresholdType – Thresholding type (see the discussion)

The function applies fixed-level thresholding to a single-channel array. The function is typically used to get a bi-level (binary) image out of a grayscale image ( compare() could also be used for this purpose) or for removing noise, i.e. filtering out pixels with too small or too large values. The function supports several types of thresholding, selected by thresholdType :

  • THRESH_BINARY

    \texttt{dst} (x,y) =  \fork{\texttt{maxVal}}{if $\texttt{src}(x,y) > \texttt{thresh}$}{0}{otherwise}

  • THRESH_BINARY_INV

    \texttt{dst} (x,y) =  \fork{0}{if $\texttt{src}(x,y) > \texttt{thresh}$}{\texttt{maxVal}}{otherwise}

  • THRESH_TRUNC

    \texttt{dst} (x,y) =  \fork{\texttt{thresh}}{if $\texttt{src}(x,y) > \texttt{thresh}$}{\texttt{src}(x,y)}{otherwise}

  • THRESH_TOZERO

    \texttt{dst} (x,y) =  \fork{\texttt{src}(x,y)}{if $\texttt{src}(x,y) > \texttt{thresh}$}{0}{otherwise}

  • THRESH_TOZERO_INV

    \texttt{dst} (x,y) =  \fork{0}{if $\texttt{src}(x,y) > \texttt{thresh}$}{\texttt{src}(x,y)}{otherwise}

Also, the special value THRESH_OTSU may be combined with one of the above values. In this case the function determines the optimal threshold value using Otsu’s algorithm and uses it instead of the specified thresh . The function returns the computed threshold value. Currently, Otsu’s method is implemented only for 8-bit images.
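The five thresholding rules above can be written down per pixel in plain C++. In this sketch the enum ThresholdKind and the function thresholdPixel are hypothetical names that only mirror the THRESH_* types; the sketch handles 8-bit values and does not cover THRESH_OTSU.

```cpp
#include <cstdint>

// Hypothetical stand-ins for THRESH_BINARY, THRESH_BINARY_INV, THRESH_TRUNC,
// THRESH_TOZERO and THRESH_TOZERO_INV.
enum ThresholdKind { Binary, BinaryInv, Trunc, ToZero, ToZeroInv };

// Applies one of the five fixed-level thresholding rules to a single 8-bit value.
std::uint8_t thresholdPixel(std::uint8_t src, std::uint8_t thresh,
                            std::uint8_t maxVal, ThresholdKind kind)
{
    const std::uint8_t zero = 0;
    const bool above = src > thresh;  // strict comparison, as in the formulas
    switch (kind) {
    case Binary:    return above ? maxVal : zero;
    case BinaryInv: return above ? zero : maxVal;
    case Trunc:     return above ? thresh : src;
    case ToZero:    return above ? src : zero;
    case ToZeroInv: return above ? zero : src;
    }
    return src;  // not reached for valid kinds
}
```

Because the comparison is strict, a pixel exactly equal to thresh falls into the "otherwise" branch of every rule.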

_images/threshold.png

See also: adaptiveThreshold() , findContours() , compare() , min() , max()

cv::watershed


void watershed(const Mat& image, Mat& markers)

Does marker-based image segmentation using the watershed algorithm.

Parameters:
  • image – The input 8-bit 3-channel image.
  • markers – The input/output 32-bit single-channel image (map) of markers. It should have the same size as image

The function implements one of the variants of the watershed, non-parametric marker-based segmentation algorithm described in Meyer92 . Before passing the image to the function, the user has to roughly outline the desired regions in the image markers with positive ( >0 ) indices, i.e. every region is represented as one or more connected components with the pixel values 1, 2, 3 etc. (such markers can be retrieved from a binary mask using findContours() and drawContours() , see the watershed.cpp demo). The markers will be “seeds” of the future image regions. All the other pixels in markers , whose relation to the outlined regions is not known and should be defined by the algorithm, should be set to 0. On output, each pixel in markers is set to the value of one of the “seed” components, or to -1 at boundaries between the regions.

Note that it is not necessary that every two neighboring connected components are separated by a watershed boundary (pixels with value -1), for example when such tangent components exist in the initial marker image. A visual demonstration and usage example of the function can be found in the OpenCV samples directory; see the watershed.cpp demo.

See also: findContours()

cv::grabCut


void grabCut(const Mat& image, Mat& mask, Rect rect, Mat& bgdModel, Mat& fgdModel, int iterCount, int mode)

Runs the GrabCut algorithm.

Parameters:
  • image – The input 8-bit 3-channel image.
  • mask

    The input/output 8-bit single-channel mask. Its elements may have one of the four values listed below. The mask is initialized by the function when mode==GC_INIT_WITH_RECT

    • GC_BGD Certainly a background pixel
    • GC_FGD Certainly a foreground (object) pixel
    • GC_PR_BGD Likely a background pixel
    • GC_PR_FGD Likely a foreground pixel
  • rect – The ROI containing the segmented object. The pixels outside of the ROI are marked as “certainly a background”. The parameter is only used when mode==GC_INIT_WITH_RECT
  • bgdModel, fgdModel – Temporary arrays used for segmentation. Do not modify them while you are processing the same image
  • iterCount – The number of iterations the algorithm should do before returning the result. Note that the result can be refined with further calls with the mode==GC_INIT_WITH_MASK or mode==GC_EVAL
  • mode

    The operation mode

    • GC_INIT_WITH_RECT The function initializes the state and the mask using the provided rectangle. After that it runs iterCount iterations of the algorithm
    • GC_INIT_WITH_MASK The function initializes the state using the provided mask. Note that GC_INIT_WITH_RECT and GC_INIT_WITH_MASK can be combined; then all the pixels outside of the ROI are automatically initialized with GC_BGD.
    • GC_EVAL The value means that the algorithm should just resume.

The function implements the GrabCut image segmentation algorithm. See the sample grabcut.cpp on how to use the function.