mirror of https://github.com/opencv/opencv.git synced 2026-01-18 17:21:42 +01:00

Merge pull request #23190 from JonasPerolini:pr-output-marker-score

Include pixel-based confidence in ArUco marker detection #23190

The aim of this pull request is to compute a **pixel-based confidence** of the marker detection. The confidence, in the range [0;1], is defined as the percentage of correctly detected pixels, with 1 describing a pixel-perfect detection. Currently it is possible to get the normalized Hamming distance between the detected marker and the dictionary ground truth with [Dictionary::getDistanceToId()](https://github.com/opencv/opencv/blob/4.x/modules/objdetect/src/aruco/aruco_dictionary.cpp#L114). However, this distance is based on the extracted bits, so information is lost in the [majority count step](https://github.com/opencv/opencv/blob/4.x/modules/objdetect/src/aruco/aruco_detector.cpp#L487). For example, even if each cell has 49% incorrect pixels, we still obtain a perfect Hamming distance.
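
To make the information loss concrete, here is a minimal, library-free sketch comparing the majority-vote bit with a per-cell pixel ratio. It is not OpenCV code: `cellBit` and `cellConfidence` are hypothetical helper names, and the only thing borrowed from the detector is the "more than half of the pixels" rule.

```cpp
#include <cassert>
#include <cstddef>

// Majority vote: a cell reads as 1 (white) when more than half of its pixels are white.
// This mirrors the rule `nZ > total / 2` applied during bit extraction.
inline int cellBit(std::size_t whitePixels, std::size_t totalPixels) {
    assert(totalPixels > 0 && whitePixels <= totalPixels);
    return whitePixels > totalPixels / 2 ? 1 : 0;
}

// Pixel-based confidence of one cell: fraction of pixels that agree with the expected bit.
inline float cellConfidence(std::size_t whitePixels, std::size_t totalPixels, int expectedBit) {
    assert(totalPixels > 0 && whitePixels <= totalPixels);
    const float whiteRatio = static_cast<float>(whitePixels) / static_cast<float>(totalPixels);
    return expectedBit == 1 ? whiteRatio : 1.f - whiteRatio;
}
```

For a 100-pixel cell that should be white but has only 51 white pixels, `cellBit` still yields the correct bit, so the Hamming distance is unaffected, while `cellConfidence` reports roughly 0.51.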

**Implementation tests**: Generate 36 synthetic images containing 4 markers each (with different ids), for a total of 144 markers. Invert a given percentage of pixels in each cell of the marker to simulate uncertain detection. Assuming a perfect detection, define the ground truth uncertainty as the percentage of inverted pixels. The test passes if `abs(computedConfidence - groundTruthConfidence) < 0.05`, where `0.05` accounts for minor detection inaccuracies.

- Performed for both regular and inverted markers
- Included perspective-distorted markers
- Markers in all 4 possible rotations [0, 90, 180, 270]
- Different sets of detection parameters:
    - `perspectiveRemovePixelPerCell`
    - `perspectiveRemoveIgnoredMarginPerCell`
    - `markerBorderBits`
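
The pass criterion described above can be sketched in a few lines of plain C++. This is an illustration only, assuming the ground truth is one minus the inverted share of the area used for identification; `groundTruthConfidence` and `confidenceMatches` are hypothetical names, not functions from the test suite.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Ground-truth confidence: one minus the inverted share of the identification area.
inline double groundTruthConfidence(double invertedArea, double totalDetectionArea) {
    assert(totalDetectionArea > 0.0 && invertedArea >= 0.0);
    return std::max(0.0, 1.0 - invertedArea / totalDetectionArea);
}

// Acceptance rule from the description: absolute error strictly below the tolerance.
inline bool confidenceMatches(double computed, double groundTruth, double tolerance = 0.05) {
    return std::abs(computed - groundTruth) < tolerance;
}
```

As a worked case, take a 480-pixel marker on an 8x8 grid without margins and invert an 18x18 square in each of the 64 cells: the inverted area is 64 * 18 * 18 = 20736 pixels out of 480 * 480 = 230400, so the ground truth is 1 - 0.09 = 0.91.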

![TestCases](https://github.com/user-attachments/assets/1113abd3-ff7a-45c8-8b4b-a9d2182eda82)


The code builds locally, and `opencv_test_objdetect` and `opencv_test_core` pass. Please let me know if any further modifications are needed.

Thanks!


I've also pushed a minor unrelated improvement (let me know if you want a separate PR) in the [bit extraction method](https://github.com/opencv/opencv/blob/4.x/modules/objdetect/src/aruco/aruco_detector.cpp#L435). `CV_Assert(perspectiveRemoveIgnoredMarginPerCell <=1)` should be `< 0.5`. Since there are margins on both sides of the cell, each margin must be smaller than half of the cell. When setting `perspectiveRemoveIgnoredMarginPerCell >= 0.5`, `opencv_test_objdetect` fails. Note: 0.499 is ok because `int()` will floor the result, thus `cellMarginPixels = int(cellMarginRate * cellSize)` will be smaller than `cellSize / 2`.
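
The margin arithmetic can be checked with a small standalone sketch. `sampledCellSide` is a hypothetical helper written for this note; only the expression `int(cellMarginRate * cellSize)` is taken from the description above.

```cpp
#include <cassert>

// Side length (in pixels) of the region actually sampled inside one cell,
// after removing the ignored margin on both sides.
inline int sampledCellSide(int cellSize, double cellMarginRate) {
    assert(cellSize > 0 && cellMarginRate >= 0.0);
    // The conversion to int truncates, which equals floor for non-negative values.
    const int cellMarginPixels = static_cast<int>(cellMarginRate * cellSize);
    return cellSize - 2 * cellMarginPixels;
}
```

With `cellSize = 8`, a rate of 0.499 gives a margin of 3 pixels and leaves a 2-pixel sampled region, a rate of 0.5 leaves nothing to sample, and the previously allowed rate of 1 would yield a negative size.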



### Pull Request Readiness Checklist

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The PR is proposed to the proper branch
- [x] The feature is well documented and sample code can be built with the project CMake
Jonas Perolini
2025-12-22 18:54:40 +01:00
committed by GitHub
parent ea9b183d9b
commit 2bca09a191
5 changed files with 573 additions and 24 deletions


@@ -318,6 +318,33 @@ public:
CV_WRAP void detectMarkers(InputArray image, OutputArrayOfArrays corners, OutputArray ids,
OutputArrayOfArrays rejectedImgPoints = noArray()) const;
/** @brief Marker detection with confidence computation
*
* @param image input image
* @param corners vector of detected marker corners. For each marker, its four corners
* are provided, (e.g. std::vector<std::vector<cv::Point2f> > ). For N detected markers,
* the dimensions of this array are Nx4. The order of the corners is clockwise.
* @param ids vector of identifiers of the detected markers. The identifier is of type int
* (e.g. std::vector<int>). For N detected markers, the size of ids is also N.
* The identifiers have the same order as the markers in the corners array.
* @param markersConfidence contains the normalized confidence [0;1] of the markers' detection,
* defined as 1 minus the normalized uncertainty (percentage of incorrect pixel detections),
* with 1 describing a pixel perfect detection. The confidence values are of type float
* (e.g. std::vector<float>)
* @param rejectedImgPoints contains the imgPoints of those squares whose inner code does not have a
* correct codification. Useful for debugging purposes.
*
* Performs marker detection in the input image. Only markers included in the first specified dictionary
* are searched. For each detected marker, it returns the 2D position of its corners in the image
* and its corresponding identifier.
* Note that this function does not perform pose estimation.
* @note The function does not correct lens distortion or take it into account. It's recommended to undistort
* the input image with the corresponding camera model, if camera parameters are known.
* @sa undistort, estimatePoseSingleMarkers, estimatePoseBoard
*/
CV_WRAP void detectMarkersWithConfidence(InputArray image, OutputArrayOfArrays corners, OutputArray ids, OutputArray markersConfidence,
OutputArrayOfArrays rejectedImgPoints = noArray()) const;
/** @brief Refine not detected markers based on the already detected and the board layout
*
* @param image input image
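
A caller will typically want to discard low-confidence detections. The sketch below is a minimal, library-free illustration: the helper `selectConfident` and any threshold passed to it are hypothetical and not part of OpenCV. It assumes `markersConfidence` is index-aligned with `ids` and `corners`, which is what the implementation in this commit appears to produce.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the indices of detections whose confidence reaches the threshold.
// Assumes `ids` and `confidences` are parallel arrays of the same length.
inline std::vector<std::size_t> selectConfident(const std::vector<int>& ids,
                                                const std::vector<float>& confidences,
                                                float minConfidence) {
    assert(ids.size() == confidences.size());
    std::vector<std::size_t> kept;
    for (std::size_t i = 0; i < ids.size(); ++i) {
        if (confidences[i] >= minConfidence)
            kept.push_back(i);
    }
    return kept;
}
```

Returning indices rather than copies lets the caller pick the matching entries from `corners` as well without reordering anything.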


@@ -71,7 +71,6 @@ class CV_EXPORTS_W_SIMPLE Dictionary {
*/
CV_WRAP int getDistanceToId(InputArray bits, int id, bool allRotations = true) const;
/** @brief Generate a canonical marker image
*/
CV_WRAP void generateImageMarker(int id, int sidePixels, OutputArray _img, int borderBits = 1) const;
@@ -84,7 +83,7 @@ class CV_EXPORTS_W_SIMPLE Dictionary {
/** @brief Transform list of bytes to matrix of bits
*/
-CV_WRAP static Mat getBitsFromByteList(const Mat &byteList, int markerSize);
+CV_WRAP static Mat getBitsFromByteList(const Mat &byteList, int markerSize, int rotationId = 0);
};


@@ -9,6 +9,7 @@
#include "opencv2/objdetect/aruco_board.hpp"
#include "apriltag/apriltag_quad_thresh.hpp"
#include "aruco_utils.hpp"
#include <algorithm>
#include <cmath>
#include <map>
@@ -313,10 +314,11 @@ static void _detectInitialCandidates(const Mat &grey, vector<vector<Point2f> > &
* the border bits
*/
static Mat _extractBits(InputArray _image, const vector<Point2f>& corners, int markerSize,
-int markerBorderBits, int cellSize, double cellMarginRate, double minStdDevOtsu) {
+int markerBorderBits, int cellSize, double cellMarginRate, double minStdDevOtsu,
+OutputArray _cellPixelRatio = noArray()) {
CV_Assert(_image.getMat().channels() == 1);
CV_Assert(corners.size() == 4ull);
-CV_Assert(markerBorderBits > 0 && cellSize > 0 && cellMarginRate >= 0 && cellMarginRate <= 1);
+CV_Assert(markerBorderBits > 0 && cellSize > 0 && cellMarginRate >= 0 && cellMarginRate <= 0.5);
CV_Assert(minStdDevOtsu >= 0);
// number of bits in the marker
@@ -353,9 +355,16 @@ static Mat _extractBits(InputArray _image, const vector<Point2f>& corners, int m
bits.setTo(1);
else
bits.setTo(0);
if(_cellPixelRatio.needed()) bits.convertTo(_cellPixelRatio, CV_32F);
return bits;
}
Mat cellPixelRatio;
if (_cellPixelRatio.needed()) {
_cellPixelRatio.create(markerSizeWithBorders, markerSizeWithBorders, CV_32FC1);
cellPixelRatio = _cellPixelRatio.getMatRef();
}
// now extract code, first threshold using Otsu
threshold(resultImg, resultImg, 125, 255, THRESH_BINARY | THRESH_OTSU);
@@ -369,6 +378,9 @@ static Mat _extractBits(InputArray _image, const vector<Point2f>& corners, int m
// count white pixels on each cell to assign its value
size_t nZ = (size_t) countNonZero(square);
if(nZ > square.total() / 2) bits.at<unsigned char>(y, x) = 1;
// define the cell pixel ratio as the ratio of the white pixels. For inverted markers, the ratio will be inverted.
if(_cellPixelRatio.needed()) cellPixelRatio.at<float>(y, x) = (nZ / (float)square.total());
}
}
@@ -403,6 +415,52 @@ static int _getBorderErrors(const Mat &bits, int markerSize, int borderSize) {
}
/** @brief Given a matrix containing the percentage of white pixels in each marker cell, returns the normalized marker confidence [0;1].
* The confidence is defined as 1 - normalized uncertainty, where 1 describes a pixel perfect detection.
* The rotation is set to 0,1,2,3 for [0, 90, 180, 270] deg CCW rotations.
*/
static float _getMarkerConfidence(const Mat& groundTruthbits, const Mat &cellPixelRatio, const int markerSize, const int borderSize) {
CV_Assert(markerSize == groundTruthbits.cols && markerSize == groundTruthbits.rows);
const int sizeWithBorders = markerSize + 2 * borderSize;
CV_Assert(markerSize > 0 && cellPixelRatio.cols == sizeWithBorders && cellPixelRatio.rows == sizeWithBorders);
// Get border uncertainty. cellPixelRatio has the opposite color as the borders --> it is the uncertainty.
float tempBorderUnc = 0.f;
for(int y = 0; y < sizeWithBorders; y++) {
for(int k = 0; k < borderSize; k++) {
// Left and right vertical sides
tempBorderUnc += cellPixelRatio.ptr<float>(y)[k];
tempBorderUnc += cellPixelRatio.ptr<float>(y)[sizeWithBorders - 1 - k];
}
}
for(int x = borderSize; x < sizeWithBorders - borderSize; x++) {
for(int k = 0; k < borderSize; k++) {
// Top and bottom horizontal sides
tempBorderUnc += cellPixelRatio.ptr<float>(k)[x];
tempBorderUnc += cellPixelRatio.ptr<float>(sizeWithBorders - 1 - k)[x];
}
}
// Get the inner marker uncertainty. For a white or black cell, the uncertainty is the ratio of black or white pixels respectively.
float tempInnerUnc = 0.f;
for(int y = borderSize; y < markerSize + borderSize; y++) {
for(int x = borderSize; x < markerSize + borderSize; x++) {
tempInnerUnc += std::abs(groundTruthbits.ptr<float>(y - borderSize)[x - borderSize] - cellPixelRatio.ptr<float>(y)[x]);
}
}
// Compute the overall normalized marker uncertainty and convert it to confidence
const float area = static_cast<float>(sizeWithBorders) * sizeWithBorders;
const float normalizedMarkerUnc = (tempInnerUnc + tempBorderUnc) / area;
const float normalizedMarkerConfidence = 1.f - normalizedMarkerUnc;
return std::max(0.f, std::min(1.f, normalizedMarkerConfidence));
}
/**
* @brief Tries to identify one candidate given the dictionary
* @return candidate typ. zero if the candidate is not valid,
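
The computation in `_getMarkerConfidence` can be restated without OpenCV types. The sketch below is an equivalent single-loop formulation, assuming border cells are expected to be black once any marker inversion has been undone; `Grid` and `markerConfidence` are illustrative names, not OpenCV API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

using Grid = std::vector<std::vector<float>>;

// expectedBits: markerSize x markerSize grid of 0/1 values (ground truth, already rotated).
// whiteRatio:   (markerSize + 2 * borderSize)^2 grid with the fraction of white pixels per cell.
inline float markerConfidence(const Grid& expectedBits, const Grid& whiteRatio, int borderSize) {
    const int markerSize = static_cast<int>(expectedBits.size());
    const int sizeWithBorders = markerSize + 2 * borderSize;
    assert(markerSize > 0 && borderSize >= 0);
    assert(static_cast<int>(whiteRatio.size()) == sizeWithBorders);

    float uncertainty = 0.f;
    for (int y = 0; y < sizeWithBorders; ++y) {
        for (int x = 0; x < sizeWithBorders; ++x) {
            const bool inner = y >= borderSize && y < markerSize + borderSize &&
                               x >= borderSize && x < markerSize + borderSize;
            // Border cells are expected to be black (0); inner cells follow the dictionary bits.
            const float expected = inner ? expectedBits[y - borderSize][x - borderSize] : 0.f;
            uncertainty += std::abs(expected - whiteRatio[y][x]);
        }
    }
    // Normalize by the number of cells and clamp the confidence to [0, 1].
    const float area = static_cast<float>(sizeWithBorders) * sizeWithBorders;
    return std::max(0.f, std::min(1.f, 1.f - uncertainty / area));
}
```

For a 2x2 marker with a one-cell border (16 cells in total), one inner cell that is half wrong and one border cell that is 30% white add up to an uncertainty of 0.8, giving a confidence of 1 - 0.8 / 16 = 0.95.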
@@ -412,6 +470,7 @@ static int _getBorderErrors(const Mat &bits, int markerSize, int borderSize) {
static uint8_t _identifyOneCandidate(const Dictionary& dictionary, const Mat& _image,
const vector<Point2f>& _corners, int& idx,
const DetectorParameters& params, int& rotation,
float &markerConfidence, bool confidenceNeeded,
const float scale = 1.f) {
CV_DbgAssert(params.markerBorderBits > 0);
uint8_t typ=1;
@@ -423,10 +482,12 @@ static uint8_t _identifyOneCandidate(const Dictionary& dictionary, const Mat& _i
scaled_corners[i].y = _corners[i].y * scale;
}
Mat cellPixelRatio;
Mat candidateBits =
_extractBits(_image, scaled_corners, dictionary.markerSize, params.markerBorderBits,
params.perspectiveRemovePixelPerCell,
-params.perspectiveRemoveIgnoredMarginPerCell, params.minOtsuStdDev);
+params.perspectiveRemoveIgnoredMarginPerCell, params.minOtsuStdDev,
+cellPixelRatio);
// analyze border bits
int maximumErrorsInBorder =
@@ -441,6 +502,7 @@ static uint8_t _identifyOneCandidate(const Dictionary& dictionary, const Mat& _i
int invBError = _getBorderErrors(invertedImg, dictionary.markerSize, params.markerBorderBits);
// white marker
if(invBError<borderErrors){
cellPixelRatio = 1.f - cellPixelRatio;
borderErrors = invBError;
invertedImg.copyTo(candidateBits);
typ=2;
@@ -458,6 +520,14 @@ static uint8_t _identifyOneCandidate(const Dictionary& dictionary, const Mat& _i
if(!dictionary.identify(onlyBits, idx, rotation, params.errorCorrectionRate))
return 0;
// compute the candidate's confidence
if(confidenceNeeded) {
Mat groundTruthbits;
Mat bitsUints = dictionary.getBitsFromByteList(dictionary.bytesList.rowRange(idx, idx + 1), dictionary.markerSize, rotation);
bitsUints.convertTo(groundTruthbits, CV_32F);
markerConfidence = _getMarkerConfidence(groundTruthbits, cellPixelRatio, dictionary.markerSize, params.markerBorderBits);
}
return typ;
}
@@ -657,7 +727,7 @@ struct ArucoDetector::ArucoDetectorImpl {
* @brief Detect markers either using multiple or just first dictionary
*/
void detectMarkers(InputArray _image, OutputArrayOfArrays _corners, OutputArray _ids,
-OutputArrayOfArrays _rejectedImgPoints, OutputArray _dictIndices, DictionaryMode dictMode) {
+OutputArrayOfArrays _rejectedImgPoints, OutputArray _dictIndices, OutputArray _markersConfidence, DictionaryMode dictMode) {
CV_Assert(!_image.empty());
CV_Assert(detectorParams.markerBorderBits > 0);
@@ -717,6 +787,7 @@ struct ArucoDetector::ArucoDetectorImpl {
vector<vector<Point2f> > candidates;
vector<vector<Point> > contours;
vector<int> ids;
vector<float> markersConfidence;
/// STEP 2.a Detect marker candidates :: using AprilTag
if(detectorParams.cornerRefinementMethod == (int)CORNER_REFINE_APRILTAG){
@@ -738,7 +809,7 @@ struct ArucoDetector::ArucoDetectorImpl {
/// STEP 2: Check candidate codification (identify markers)
identifyCandidates(grey, grey_pyramid, selectedCandidates, candidates, contours,
-ids, dictionary, rejectedImgPoints);
+ids, dictionary, rejectedImgPoints, markersConfidence, _markersConfidence.needed());
/// STEP 3: Corner refinement :: use corner subpix
if (detectorParams.cornerRefinementMethod == (int)CORNER_REFINE_SUBPIX) {
@@ -766,7 +837,7 @@ struct ArucoDetector::ArucoDetectorImpl {
// temporary variable to store the current candidates
vector<vector<Point2f>> currentCandidates;
identifyCandidates(grey, grey_pyramid, candidatesPerDictionarySize.at(currentDictionary.markerSize), currentCandidates, contours,
-ids, currentDictionary, rejectedImgPoints);
+ids, currentDictionary, rejectedImgPoints, markersConfidence, _markersConfidence.needed());
if (_dictIndices.needed()) {
dictIndices.insert(dictIndices.end(), currentCandidates.size(), dictIndex);
}
@@ -849,6 +920,9 @@ struct ArucoDetector::ArucoDetectorImpl {
if (_dictIndices.needed()) {
Mat(dictIndices).copyTo(_dictIndices);
}
if (_markersConfidence.needed()) {
Mat(markersConfidence).copyTo(_markersConfidence);
}
}
/**
@@ -982,9 +1056,10 @@ struct ArucoDetector::ArucoDetectorImpl {
*/
void identifyCandidates(const Mat& grey, const vector<Mat>& image_pyr, vector<MarkerCandidateTree>& selectedContours,
vector<vector<Point2f> >& accepted, vector<vector<Point> >& contours,
-vector<int>& ids, const Dictionary& currentDictionary, vector<vector<Point2f>>& rejected) const {
+vector<int>& ids, const Dictionary& currentDictionary, vector<vector<Point2f>>& rejected, vector<float>& markersConfidence, bool confidenceNeeded) const {
size_t ncandidates = selectedContours.size();
vector<float> markersConfidenceTmp(ncandidates, 0.f);
vector<int> idsTmp(ncandidates, -1);
vector<int> rotated(ncandidates, 0);
vector<uint8_t> validCandidates(ncandidates, 0);
@@ -1018,11 +1093,11 @@ struct ArucoDetector::ArucoDetectorImpl {
}
const float scale = detectorParams.useAruco3Detection ? img.cols / static_cast<float>(grey.cols) : 1.f;
-validCandidates[v] = _identifyOneCandidate(currentDictionary, img, selectedContours[v].corners, idsTmp[v], detectorParams, rotated[v], scale);
+validCandidates[v] = _identifyOneCandidate(currentDictionary, img, selectedContours[v].corners, idsTmp[v], detectorParams, rotated[v], markersConfidenceTmp[v], confidenceNeeded, scale);
if (validCandidates[v] == 0 && checkCloseContours) {
for (const MarkerCandidate& closeMarkerCandidate: selectedContours[v].closeContours) {
-validCandidates[v] = _identifyOneCandidate(currentDictionary, img, closeMarkerCandidate.corners, idsTmp[v], detectorParams, rotated[v], scale);
+validCandidates[v] = _identifyOneCandidate(currentDictionary, img, closeMarkerCandidate.corners, idsTmp[v], detectorParams, rotated[v], markersConfidenceTmp[v], confidenceNeeded, scale);
if (validCandidates[v] > 0) {
selectedContours[v].corners = closeMarkerCandidate.corners;
selectedContours[v].contour = closeMarkerCandidate.contour;
@@ -1052,17 +1127,24 @@ struct ArucoDetector::ArucoDetectorImpl {
for (size_t i = 0ull; i < selectedContours.size(); i++) {
if (validCandidates[i] > 0) {
-// shift corner positions to the correct rotation
-correctCornerPosition(selectedContours[i].corners, rotated[i]);
+// shift corner positions to the correct rotation
+correctCornerPosition(selectedContours[i].corners, rotated[i]);
-accepted.push_back(selectedContours[i].corners);
-contours.push_back(selectedContours[i].contour);
-ids.push_back(idsTmp[i]);
-}
-else {
+accepted.push_back(selectedContours[i].corners);
+contours.push_back(selectedContours[i].contour);
+ids.push_back(idsTmp[i]);
+} else {
rejected.push_back(selectedContours[i].corners);
}
}
if(confidenceNeeded) {
for (size_t i = 0ull; i < selectedContours.size(); i++) {
if (validCandidates[i] > 0) {
markersConfidence.push_back(markersConfidenceTmp[i]);
}
}
}
}
void performCornerSubpixRefinement(const Mat& grey, const vector<Mat>& grey_pyramid, int closest_pyr_image_idx, const vector<vector<Point2f>>& candidates, const Dictionary& dictionary) const {
@@ -1103,14 +1185,19 @@ ArucoDetector::ArucoDetector(const vector<Dictionary> &_dictionaries,
arucoDetectorImpl = makePtr<ArucoDetectorImpl>(_dictionaries, _detectorParams, _refineParams);
}
void ArucoDetector::detectMarkersWithConfidence(InputArray _image, OutputArrayOfArrays _corners, OutputArray _ids, OutputArray _markersConfidence,
OutputArrayOfArrays _rejectedImgPoints) const {
arucoDetectorImpl->detectMarkers(_image, _corners, _ids, _rejectedImgPoints, noArray(), _markersConfidence, DictionaryMode::Single);
}
void ArucoDetector::detectMarkers(InputArray _image, OutputArrayOfArrays _corners, OutputArray _ids,
OutputArrayOfArrays _rejectedImgPoints) const {
-arucoDetectorImpl->detectMarkers(_image, _corners, _ids, _rejectedImgPoints, noArray(), DictionaryMode::Single);
+arucoDetectorImpl->detectMarkers(_image, _corners, _ids, _rejectedImgPoints, noArray(), noArray(), DictionaryMode::Single);
}
void ArucoDetector::detectMarkersMultiDict(InputArray _image, OutputArrayOfArrays _corners, OutputArray _ids,
OutputArrayOfArrays _rejectedImgPoints, OutputArray _dictIndices) const {
-arucoDetectorImpl->detectMarkers(_image, _corners, _ids, _rejectedImgPoints, _dictIndices, DictionaryMode::Multi);
+arucoDetectorImpl->detectMarkers(_image, _corners, _ids, _rejectedImgPoints, _dictIndices, noArray(), DictionaryMode::Multi);
}
/**


@@ -194,17 +194,24 @@ Mat Dictionary::getByteListFromBits(const Mat &bits) {
}
-Mat Dictionary::getBitsFromByteList(const Mat &byteList, int markerSize) {
+Mat Dictionary::getBitsFromByteList(const Mat &byteList, int markerSize, int rotationId) {
CV_Assert(byteList.total() > 0 &&
byteList.total() >= (unsigned int)markerSize * markerSize / 8 &&
byteList.total() <= (unsigned int)markerSize * markerSize / 8 + 1);
CV_Assert(rotationId >=0 && rotationId < 4);
Mat bits(markerSize, markerSize, CV_8UC1, Scalar::all(0));
unsigned char base2List[] = { 128, 64, 32, 16, 8, 4, 2, 1 };
// Use a base offset for the selected rotation
int nbytes = (bits.cols * bits.rows + 8 - 1) / 8; // integer ceil
int base = rotationId * nbytes;
int currentByteIdx = 0;
// we only need the bytes in normal rotation
-unsigned char currentByte = byteList.ptr()[0];
+unsigned char currentByte = byteList.ptr()[base + currentByteIdx];
int currentBit = 0;
for(int row = 0; row < bits.rows; row++) {
for(int col = 0; col < bits.cols; col++) {
if(currentByte >= base2List[currentBit]) {
@@ -214,7 +221,7 @@ Mat Dictionary::getBitsFromByteList(const Mat &byteList, int markerSize) {
currentBit++;
if(currentBit == 8) {
currentByteIdx++;
-currentByte = byteList.ptr()[currentByteIdx];
+currentByte = byteList.ptr()[base + currentByteIdx];
// if not enough bits for one more byte, we are in the end
// update bit position accordingly
if(8 * (currentByteIdx + 1) > (int)bits.total())
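
The rotation offset added to `getBitsFromByteList` can be illustrated with a standalone decoder. This is a sketch under stated assumptions, not the OpenCV implementation: `decodeBits` is a hypothetical name, the four rotations are assumed to be stored back to back (as `base = rotationId * nbytes` in the diff suggests), and a partially filled final byte is assumed to hold its bits in the low-order positions, which is how the end-of-list adjustment in the original code reads.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Decode a markerSize x markerSize bit grid (row-major, most significant bit first)
// for one of the four rotations stored consecutively in `byteList`.
inline std::vector<std::vector<int>> decodeBits(const std::vector<std::uint8_t>& byteList,
                                                int markerSize, int rotationId = 0) {
    assert(markerSize > 0 && rotationId >= 0 && rotationId < 4);
    const int totalBits = markerSize * markerSize;
    const int nbytes = (totalBits + 7) / 8;      // integer ceil
    const int base = rotationId * nbytes;        // offset of the selected rotation
    const int padding = 8 * nbytes - totalBits;  // unused high bits of the final byte
    assert(static_cast<int>(byteList.size()) >= base + nbytes);

    std::vector<std::vector<int>> bits(markerSize, std::vector<int>(markerSize, 0));
    for (int i = 0; i < totalBits; ++i) {
        const int byteIdx = i / 8;
        int bitInByte = i % 8;                   // 0 is the most significant bit
        if (byteIdx == nbytes - 1)
            bitInByte += padding;                // assumed: final byte is right-aligned
        const std::uint8_t byte = byteList[base + byteIdx];
        bits[i / markerSize][i % markerSize] = (byte >> (7 - bitInByte)) & 1;
    }
    return bits;
}
```

With `rotationId = 0` the offset is zero, so the decoder reads the same leading bytes as the version before this change; the other values simply skip whole rotations.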


@@ -321,6 +321,358 @@ void CV_ArucoDetectionPerspective::run(int) {
}
}
// Helper struct and functions for CV_ArucoDetectionConfidence
// Inverts a square subregion inside selected cells of a marker to simulate a confidence drop
enum class MarkerRegionToTemper {
BORDER, // Only invert cells within the marker border bits
INNER, // Only invert cells in the inner part of the marker (excluding borders)
ALL // Invert any cells
};
// Define the characteristics of cell inversions
struct MarkerTemperingConfig {
float cellRatioToTemper; // [0,1] ratio of the cell to invert
int numCellsToTemper; // Number of cells to invert
MarkerRegionToTemper markerRegionToTemper; // Which cells to invert (BORDER, INNER, ALL)
};
// Test configs for CV_ArucoDetectionConfidence
struct ArucoConfidenceTestConfig {
MarkerTemperingConfig markerTemperingConfig; // Configuration of cells to invert (percentage, number and markerRegionToTemper)
float perspectiveRemoveIgnoredMarginPerCell; // Width of the margin of pixels on each cell not considered for the marker identification
int markerBorderBits; // Number of bits of the marker border
float distortionRatio; // Percentage of offset used for perspective distortion, bigger means more distorted
};
enum class markerRot
{
NONE = 0,
ROT_90,
ROT_180,
ROT_270
};
struct markerDetectionGT {
int id; // Marker identification
double confidence; // Pixel-based confidence defined as 1 - (inverted area / total area)
bool expectDetection; // True if we expect to detect the marker
};
struct MarkerCreationConfig {
int id; // Marker identification
int markerSidePixels; // Marker size (in pixels)
markerRot rotation; // Rotation of the marker in degrees (0, 90, 180, 270)
};
void rotateMarker(Mat &marker, const markerRot rotation)
{
if(rotation == markerRot::NONE)
return;
if (rotation == markerRot::ROT_90) {
cv::transpose(marker, marker);
cv::flip(marker, marker, 0);
} else if (rotation == markerRot::ROT_180) {
cv::flip(marker, marker, -1);
} else if (rotation == markerRot::ROT_270) {
cv::transpose(marker, marker);
cv::flip(marker, marker, 1);
}
}
void distortMarker(Mat &marker, const float distortionRatio)
{
if (distortionRatio < FLT_EPSILON)
return;
// apply a distortion (a perspective warp) to simulate a non-ideal capture
vector<Point2f> src = { {0, 0},
{static_cast<float>(marker.cols), 0},
{static_cast<float>(marker.cols), static_cast<float>(marker.rows)},
{0, static_cast<float>(marker.rows)} };
float offset = marker.cols * distortionRatio; // distortionRatio % offset for distortion
vector<Point2f> dst = { {offset, offset},
{marker.cols - offset, 0},
{marker.cols - offset, marker.rows - offset},
{0, marker.rows - offset} };
Mat M = getPerspectiveTransform(src, dst);
warpPerspective(marker, marker, M, marker.size(), INTER_LINEAR, BORDER_CONSTANT, Scalar(255));
}
/**
* @brief Inverts a square subregion inside selected cells of a marker image to simulate confidence degradation.
*
* The function computes the marker grid parameters and then applies a bitwise inversion
* on a square region inside the chosen cells. The number of cells to be inverted is determined by
* the parameter 'numCellsToTemper'. The candidate cells can be filtered to only include border cells,
* inner cells, or all cells according to the parameter 'markerRegionToTemper'.
*
* @param marker The marker image
* @param markerSidePixels The total size of the marker in pixels (inner and border).
* @param markerId The id of the marker
* @param params The Aruco detector configuration (provides border bits, margin ratios, etc.).
* @param dictionary The Aruco marker dictionary (used to determine marker grid size).
* @param cellTempConfig Cell tempering config as defined in MarkerTemperingConfig
* @return Cell tempering ground truth as defined in markerDetectionGT
*/
markerDetectionGT applyTemperingToMarkerCells(cv::Mat &marker,
const int markerSidePixels,
const int markerId,
const aruco::DetectorParameters &params,
const aruco::Dictionary &dictionary,
const MarkerTemperingConfig &cellTempConfig)
{
// nothing to invert
if(cellTempConfig.numCellsToTemper <= 0 || cellTempConfig.cellRatioToTemper <= FLT_EPSILON)
return {markerId, 1.0, true};
// compute the overall grid dimensions.
const int markerSizeWithBorders = dictionary.markerSize + 2 * params.markerBorderBits;
const int cellSidePixelsSize = markerSidePixels / markerSizeWithBorders;
// compute the margin within each cell used for identification.
const int cellMarginPixels = static_cast<int>(params.perspectiveRemoveIgnoredMarginPerCell * cellSidePixelsSize);
const int innerCellSizePixels = cellSidePixelsSize - 2 * cellMarginPixels;
// determine the size of the square that will be inverted in each cell.
// (cellSidePixelsInvert / innerCellSizePixels)^2 should equal cellRatioToTemper.
const int cellSidePixelsInvert = min(cellSidePixelsSize, static_cast<int>(innerCellSizePixels * std::sqrt(cellTempConfig.cellRatioToTemper)));
const int inversionOffsetPixels = (cellSidePixelsSize - cellSidePixelsInvert) / 2;
// nothing to invert
if(cellSidePixelsInvert <= 0)
return {markerId, 1.0, true};
int cellsTempered = 0;
int borderErrors = 0;
int innerCellsErrors = 0;
// iterate over each cell in the grid.
for (int row = 0; row < markerSizeWithBorders; row++) {
for (int col = 0; col < markerSizeWithBorders; col++) {
// decide if this cell falls in the region to temper.
const bool isBorder = (row < params.markerBorderBits ||
col < params.markerBorderBits ||
row >= markerSizeWithBorders - params.markerBorderBits ||
col >= markerSizeWithBorders - params.markerBorderBits);
const bool inRegion = (cellTempConfig.markerRegionToTemper == MarkerRegionToTemper::ALL ||
(isBorder && cellTempConfig.markerRegionToTemper == MarkerRegionToTemper::BORDER) ||
(!isBorder && cellTempConfig.markerRegionToTemper == MarkerRegionToTemper::INNER));
// apply the inversion to simulate tempering.
if (inRegion && (cellsTempered < cellTempConfig.numCellsToTemper)) {
const int xStart = col * cellSidePixelsSize + inversionOffsetPixels;
const int yStart = row * cellSidePixelsSize + inversionOffsetPixels;
cv::Rect cellRect(xStart, yStart, cellSidePixelsInvert, cellSidePixelsInvert);
cv::Mat cellROI = marker(cellRect);
cv::bitwise_not(cellROI, cellROI);
++cellsTempered;
// cell too tempered, no detection expected
if(cellTempConfig.cellRatioToTemper > 0.5f) {
if(isBorder){
++borderErrors;
} else {
++innerCellsErrors;
}
}
}
if(cellsTempered >= cellTempConfig.numCellsToTemper)
break;
}
if(cellsTempered >= cellTempConfig.numCellsToTemper)
break;
}
// compute the ground-truth confidence
const double invertedArea = cellsTempered * cellSidePixelsInvert * cellSidePixelsInvert;
const double totalDetectionArea = markerSizeWithBorders * innerCellSizePixels * markerSizeWithBorders * innerCellSizePixels;
const double groundTruthConfidence = std::max(0.0, 1.0 - invertedArea / totalDetectionArea);
// check if marker is expected to be detected
const int maximumErrorsInBorder = static_cast<int>(dictionary.markerSize * dictionary.markerSize * params.maxErroneousBitsInBorderRate);
const int maxCorrectionRecalculed = static_cast<int>(dictionary.maxCorrectionBits * params.errorCorrectionRate);
const bool expectDetection = static_cast<bool>(borderErrors <= maximumErrorsInBorder && innerCellsErrors <= maxCorrectionRecalculed);
return {markerId, groundTruthConfidence, expectDetection};
}
/**
* @brief Create an image of a marker with inverted (tempered) regions to simulate detection confidence
*
* Applies an optional rotation and an optional perspective warp to simulate a distorted marker.
* Inverts a square subregion inside selected cells of a marker image to simulate a drop in confidence.
* Computes the ground-truth confidence as one minus the ratio of inverted area to the total marker area used for identification.
*
*/
markerDetectionGT generateTemperedMarkerImage(Mat &marker, const MarkerCreationConfig &markerConfig, const MarkerTemperingConfig &markerTemperingConfig,
const aruco::DetectorParameters &params, const aruco::Dictionary &dictionary, const float distortionRatio = 0.f)
{
// generate the synthetic marker image
aruco::generateImageMarker(dictionary, markerConfig.id, markerConfig.markerSidePixels,
marker, params.markerBorderBits);
// rotate marker if necessary
rotateMarker(marker, markerConfig.rotation);
// temper with cells to simulate detection confidence drops
markerDetectionGT groundTruth = applyTemperingToMarkerCells(marker, markerConfig.markerSidePixels, markerConfig.id, params, dictionary, markerTemperingConfig);
// apply a distortion (a perspective warp) to simulate a non-ideal capture
distortMarker(marker, distortionRatio);
return groundTruth;
}
/**
* @brief Copies a marker image into a larger image at the given top-left position.
*/
void placeMarker(Mat &img, const Mat &marker, const Point2f &topLeft)
{
Rect roi(Point(static_cast<int>(topLeft.x), static_cast<int>(topLeft.y)), marker.size());
marker.copyTo(img(roi));
}
/**
* @brief Test the marker confidence computations
*
* Loops over a set of detector configurations (e.g. expected confidence, distortion, DetectorParameters)
* For each configuration, it creates a synthetic image containing four markers arranged in a 2x2 grid.
* Each marker is generated with its own configuration (id, size, rotation).
* Finally, it runs the detector and checks that each marker is detected and
* that its computed confidence is close to the ground truth value.
*
*/
static void runArucoDetectionConfidence(ArucoAlgParams arucoAlgParam) {
aruco::DetectorParameters params;
// make sure that no bit detection errors are tolerated
params.maxErroneousBitsInBorderRate = 0.0;
params.errorCorrectionRate = 0.0;
params.perspectiveRemovePixelPerCell = 8; // ensure that there is enough resolution to properly handle distortions
aruco::ArucoDetector detector(aruco::getPredefinedDictionary(aruco::DICT_6X6_250), params);
const bool detectInvertedMarker = (arucoAlgParam == ArucoAlgParams::DETECT_INVERTED_MARKER);
// define several detector configurations to test different settings
// {{MarkerTemperingConfig}, perspectiveRemoveIgnoredMarginPerCell, markerBorderBits, distortionRatio}
vector<ArucoConfidenceTestConfig> detectorConfigs = {
// No margins, No distortion
{{0.f, 64, MarkerRegionToTemper::ALL}, 0.0f, 1, 0.f},
{{0.01f, 64, MarkerRegionToTemper::ALL}, 0.0f, 1, 0.f},
{{0.05f, 100, MarkerRegionToTemper::ALL}, 0.0f, 2, 0.f},
{{0.1f, 64, MarkerRegionToTemper::ALL}, 0.0f, 1, 0.f},
{{0.15f, 30, MarkerRegionToTemper::ALL}, 0.0f, 1, 0.f},
{{0.20f, 55, MarkerRegionToTemper::ALL}, 0.0f, 2, 0.f},
// Margins, No distortion
{{0.f, 26, MarkerRegionToTemper::BORDER}, 0.05f, 1, 0.f},
{{0.01f, 56, MarkerRegionToTemper::BORDER}, 0.05f, 2, 0.f},
{{0.05f, 144, MarkerRegionToTemper::ALL}, 0.1f, 3, 0.f},
{{0.10f, 49, MarkerRegionToTemper::ALL}, 0.15f, 1, 0.f},
// No margins, distortion
{{0.f, 36, MarkerRegionToTemper::INNER}, 0.0f, 1, 0.01f},
{{0.01f, 36, MarkerRegionToTemper::INNER}, 0.0f, 1, 0.02f},
{{0.05f, 12, MarkerRegionToTemper::INNER}, 0.0f, 2, 0.05f},
{{0.1f, 64, MarkerRegionToTemper::ALL}, 0.0f, 1, 0.1f},
{{0.1f, 81, MarkerRegionToTemper::ALL}, 0.0f, 2, 0.2f},
// Margins, distortion
{{0.f, 81, MarkerRegionToTemper::ALL}, 0.05f, 2, 0.01f},
{{0.01f, 64, MarkerRegionToTemper::ALL}, 0.05f, 1, 0.02f},
{{0.05f, 81, MarkerRegionToTemper::ALL}, 0.1f, 2, 0.05f},
{{0.1f, 64, MarkerRegionToTemper::ALL}, 0.15f, 1, 0.1f},
{{0.1f, 64, MarkerRegionToTemper::ALL}, 0.0f, 1, 0.2f},
// no marker detection, too much tempering
{{0.9f, 1, MarkerRegionToTemper::ALL}, 0.05f, 2, 0.0f},
{{0.9f, 1, MarkerRegionToTemper::BORDER}, 0.05f, 2, 0.0f},
{{0.9f, 1, MarkerRegionToTemper::INNER}, 0.05f, 2, 0.0f},
};
// define marker configurations for the 4 markers in each image
const int markerSidePixels = 480; // To simplify the cell division, markerSidePixels is a multiple of 8. (6x6 dict + 2 border bits)
vector<MarkerCreationConfig> markerCreationConfig = {
{0, markerSidePixels, markerRot::ROT_90}, // {id, markerSidePixels, rotation}
{1, markerSidePixels, markerRot::ROT_270},
{2, markerSidePixels, markerRot::NONE},
{3, markerSidePixels, markerRot::ROT_180}
};
// loop over each detector configuration
for (size_t cfgIdx = 0; cfgIdx < detectorConfigs.size(); cfgIdx++) {
ArucoConfidenceTestConfig detCfg = detectorConfigs[cfgIdx];
SCOPED_TRACE(cv::format("detectorConfig=%zu", cfgIdx));
// update detector parameters
params.perspectiveRemoveIgnoredMarginPerCell = detCfg.perspectiveRemoveIgnoredMarginPerCell;
params.markerBorderBits = detCfg.markerBorderBits;
params.detectInvertedMarker = detectInvertedMarker;
detector.setDetectorParameters(params);
// create a blank image large enough to hold 4 markers in a 2x2 grid
const int margin = markerSidePixels / 2;
const int imageSize = (markerSidePixels * 2) + margin * 3;
Mat img(imageSize, imageSize, CV_8UC1, Scalar(255));
vector<markerDetectionGT> groundTruths;
const aruco::Dictionary &dictionary = detector.getDictionary();
// place each marker into the image
for (int row = 0; row < 2; row++) {
for (int col = 0; col < 2; col++) {
int index = row * 2 + col;
MarkerCreationConfig markerCfg = markerCreationConfig[index];
// adjust marker id to be unique for each detector configuration
markerCfg.id += static_cast<int>(cfgIdx * markerCreationConfig.size());
// generate the tempered marker image and its ground truth
Mat markerImg;
markerDetectionGT gt = generateTemperedMarkerImage(markerImg, markerCfg, detCfg.markerTemperingConfig, params, dictionary, detCfg.distortionRatio);
groundTruths.push_back(gt);
// place marker in the image
Point2f topLeft(static_cast<float>(margin + col * (markerSidePixels + margin)),
static_cast<float>(margin + row * (markerSidePixels + margin)));
placeMarker(img, markerImg, topLeft);
}
}
// if testing inverted markers globally, invert the whole image
if (detectInvertedMarker) {
bitwise_not(img, img);
}
// run detection.
vector<vector<Point2f>> corners, rejected;
vector<int> ids;
vector<float> markerConfidence;
detector.detectMarkersWithConfidence(img, corners, ids, markerConfidence, rejected);
ASSERT_EQ(ids.size(), corners.size());
ASSERT_EQ(ids.size(), markerConfidence.size());
std::map<int, float> confidenceById;
for (size_t i = 0; i < ids.size(); i++) {
confidenceById[ids[i]] = markerConfidence[i];
}
// verify that every marker is detected and its confidence is within tolerance
for (const auto& currentGT : groundTruths) {
const bool detected = confidenceById.find(currentGT.id) != confidenceById.end();
EXPECT_EQ(currentGT.expectDetection, detected) << "Marker id: " << currentGT.id;
if (currentGT.expectDetection && detected) {
EXPECT_NEAR(currentGT.confidence, confidenceById[currentGT.id], 0.05)
<< "Marker id: " << currentGT.id;
}
}
}
}
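
The test above compares the detector's confidence against a ground truth derived from the share of inverted pixels. As a minimal, self-contained sketch of why that differs from the bit-level Hamming distance, the snippet below models each cell as a flat list of binarized pixels. The names `makeTemperedCells`, `extractBits` and `pixelConfidence` are hypothetical and are not part of OpenCV; the detector's actual computation may differ in detail (for example, in how ignored margins are handled).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: cells[c] holds the binarized pixels of cell c (1 = white, 0 = black).
using Cells = std::vector<std::vector<int>>;

// Build cells that match expectedBits, then invert the first `flipped` pixels of every cell.
Cells makeTemperedCells(const std::vector<int>& expectedBits, int pixelsPerCell, int flipped) {
    assert(flipped >= 0 && flipped <= pixelsPerCell);
    Cells cells;
    for (int bit : expectedBits) {
        std::vector<int> cell(static_cast<std::size_t>(pixelsPerCell), bit);
        for (int i = 0; i < flipped; i++) {
            cell[static_cast<std::size_t>(i)] = 1 - bit;
        }
        cells.push_back(cell);
    }
    return cells;
}

// Majority count: a cell reads as 1 when more than half of its pixels are white.
std::vector<int> extractBits(const Cells& cells) {
    std::vector<int> bits;
    for (const auto& cell : cells) {
        int white = 0;
        for (int p : cell) {
            white += p;
        }
        bits.push_back(2 * white > static_cast<int>(cell.size()) ? 1 : 0);
    }
    return bits;
}

// Pixel-based confidence: fraction of pixels that agree with the expected bit of their cell.
double pixelConfidence(const Cells& cells, const std::vector<int>& expectedBits) {
    assert(cells.size() == expectedBits.size());
    std::size_t correct = 0;
    std::size_t total = 0;
    for (std::size_t c = 0; c < cells.size(); c++) {
        for (int p : cells[c]) {
            if (p == expectedBits[c]) {
                correct++;
            }
            total++;
        }
    }
    return total == 0 ? 0.0 : static_cast<double>(correct) / static_cast<double>(total);
}
```

With 40 of 100 pixels inverted in every cell, `extractBits` still returns the expected bits (a perfect Hamming distance), while `pixelConfidence` drops to 0.6.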
/**
* @brief Check max and min size in marker detection parameters
@@ -552,6 +904,83 @@ TEST(CV_ArucoBitCorrection, algorithmic) {
test.safe_run();
}
TEST(CV_ArucoDetectionConfidence, algorithmic) {
runArucoDetectionConfidence(ArucoAlgParams::USE_DEFAULT);
}
TEST(CV_InvertedArucoDetectionConfidence, algorithmic) {
runArucoDetectionConfidence(ArucoAlgParams::DETECT_INVERTED_MARKER);
}
TEST(CV_InvertedFlagArucoDetectionConfidence, algorithmic) {
aruco::DetectorParameters params;
params.maxErroneousBitsInBorderRate = 0.0;
params.errorCorrectionRate = 0.0;
params.perspectiveRemovePixelPerCell = 8;
params.detectInvertedMarker = false;
const aruco::Dictionary dictionary = aruco::getPredefinedDictionary(aruco::DICT_6X6_250);
// create a blank image large enough to hold 4 markers in a 2x2 grid
const int markerSidePixels = 480;
const int margin = markerSidePixels / 2;
const int imageSize = (markerSidePixels * 2) + margin * 3;
Mat img(imageSize, imageSize, CV_8UC1, Scalar(255));
// place 4 markers into the image
for (int row = 0; row < 2; row++) {
for (int col = 0; col < 2; col++) {
const int id = row * 2 + col;
Mat markerImg;
aruco::generateImageMarker(dictionary, id, markerSidePixels, markerImg, params.markerBorderBits);
Point2f topLeft(static_cast<float>(margin + col * (markerSidePixels + margin)),
static_cast<float>(margin + row * (markerSidePixels + margin)));
placeMarker(img, markerImg, topLeft);
}
}
// run detection with detectInvertedMarker = false (baseline)
aruco::ArucoDetector detector(dictionary, params);
vector<vector<Point2f>> corners, rejected;
vector<int> ids;
vector<float> confidenceDefault;
detector.detectMarkersWithConfidence(img, corners, ids, confidenceDefault, rejected);
ASSERT_EQ(ids.size(), corners.size());
ASSERT_EQ(ids.size(), confidenceDefault.size());
std::map<int, float> confidenceByIdDefault;
for (size_t i = 0; i < ids.size(); i++) {
confidenceByIdDefault[ids[i]] = confidenceDefault[i];
}
// run detection with detectInvertedMarker = true, without inverting the image
params.detectInvertedMarker = true;
aruco::ArucoDetector detectorInvertedFlag(dictionary, params);
vector<float> confidenceInvertedFlag;
detectorInvertedFlag.detectMarkersWithConfidence(img, corners, ids, confidenceInvertedFlag, rejected);
ASSERT_EQ(ids.size(), corners.size());
ASSERT_EQ(ids.size(), confidenceInvertedFlag.size());
std::map<int, float> confidenceByIdInvertedFlag;
for (size_t i = 0; i < ids.size(); i++) {
confidenceByIdInvertedFlag[ids[i]] = confidenceInvertedFlag[i];
}
// detectInvertedMarker should not invert/flip confidence for non-inverted markers.
for (int id = 0; id < 4; id++) {
ASSERT_NE(confidenceByIdDefault.find(id), confidenceByIdDefault.end()) << "Marker id: " << id;
ASSERT_NE(confidenceByIdInvertedFlag.find(id), confidenceByIdInvertedFlag.end()) << "Marker id: " << id;
const float confDefault = confidenceByIdDefault[id];
const float confInvertedFlag = confidenceByIdInvertedFlag[id];
EXPECT_GT(confDefault, 0.8f) << "Marker id: " << id;
EXPECT_GT(confInvertedFlag, 0.8f) << "Marker id: " << id;
EXPECT_NEAR(confDefault, confInvertedFlag, 0.2f) << "Marker id: " << id;
}
}
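
Both confidence tests place their markers on a 2x2 grid with a margin of half a marker side. The layout arithmetic can be sketched in isolation as follows; `GridLayout` and `computeGridLayout` are hypothetical helpers introduced only for illustration, and the generalization to `markersPerSide` is an assumption (the tests hard-code 2).

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper type: image side length and the top-left offset of each row/column.
struct GridLayout {
    int imageSize;
    std::vector<int> offsets;
};

// Same arithmetic as in the tests: margin = side / 2, one margin before, between and after markers.
GridLayout computeGridLayout(int markerSidePixels, int markersPerSide) {
    assert(markerSidePixels > 0 && markersPerSide > 0);
    const int margin = markerSidePixels / 2;
    GridLayout layout;
    layout.imageSize = markerSidePixels * markersPerSide + margin * (markersPerSide + 1);
    for (int i = 0; i < markersPerSide; i++) {
        layout.offsets.push_back(margin + i * (markerSidePixels + margin));
    }
    return layout;
}
```

For `markerSidePixels = 480` and a 2x2 grid this gives a 1680x1680 image with markers starting at offsets 240 and 960, leaving a 240-pixel margin on every side.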
TEST(CV_ArucoDetectMarkers, regression_3192)
{
aruco::ArucoDetector detector(aruco::getPredefinedDictionary(aruco::DICT_4X4_50));