In my last post we discussed blob extraction and event tracking. We will continue with that project by adding support for two-dimensional fiducial tracking. We will attempt to implement the fiducial detection algorithm used on the Topolo Surface1. We will first describe the fiducials and how their properties are encoded in their structure, and we will add a class to our project to support fiducial detection and rendering. When finished we will obtain the following renderings:
Below is a fiducial we will be attempting to detect in our project. We will be using the blob extractor developed in my previous post to generate a container hierarchy of the blobs in the fiducial. From this we will evaluate the properties of the fiducial including location, orientation, and identification. Go here to generate fiducials in SVG format.
Evaluation of the center black bob and its white child provide the necessary information to determine location and orientation. In the diagram below the center of the black blob yields the fiducial center and the center of the white blob provides the offset necessary to generate an orientation vector.
The remaining 16 child blobs (in fiducial number 12345 there are 6 white and 10 black) provide the 16-bit encoding to yield an identification number. We will encode the white blobs as ones and the black blobs as zeroes. These blobs will be evaluated counter-clockwise beginning at the orientation vector. For fiducial 12345 this encoding yields 00110000001110012 = 1234510.
We will first modify our cBlob
class to overload the strictly less than operator for sorting purposes. We will initially sort our blobs from smallest to largest area (smaller blobs must be contained in larger blobs). In addition we will add two properties to this class, a children vector and node height property. Lastly, the contains()
method determines whether a node is contained within another and, if so, appends itself to the children vector and updates the height property accordingly.
#ifndef BLOB_H #define BLOB_H #include <vector> #include <math.h> enum { BLOB_NULL, BLOB_DOWN, BLOB_MOVE, BLOB_UP }; // event types struct point { double x, y; }; class cBlob { private: protected: public: point location, origin; // current location and origin for defining a drag vector point min, max; // to define our axis-aligned bounding box int event; // event type: one of BLOB_NULL, BLOB_DOWN, BLOB_MOVE, BLOB_UP bool tracked; // a flag to indicate this blob has been processed std::vector<int> children; int height; bool operator<(const cBlob& b) const; bool contains(const cBlob& b, int i); }; #endif
The definitions of the two methods are fairly straightforward. We use the area of the axis-aligned bounding box to sort the blobs. We could use the area of a circle, but this would yield additional overhead (a smaller circle will yield a smaller bounding box). The contains()
method uses the property that a smaller circle is contained within a larger circle provided the distance between center points plus the smaller radius is less than the larger radius.
#include "blob.h" bool cBlob::operator<(const cBlob& b) const { return ( (max.x-min.x)*(max.y-min.y) < (b.max.x-b.min.x)*(b.max.y-b.min.y) ); } bool cBlob::contains(const cBlob& b, int i) { double radius0 = (max.x-min.x)/2 > (max.y-min.y)/2 ? (max.x-min.x)/2 : (max.y-min.y)/2; double radius1 = (b.max.x-b.min.x)/2 > (b.max.y-b.min.y)/2 ? (b.max.x-b.min.x)/2 : (b.max.y-b.min.y)/2; double dist = sqrt(pow(b.location.x-location.x,2.0) + pow(b.location.y-location.y,2.0)); if (dist < radius0 - radius1) { children.push_back(i); if (!(b.height < height)) height = b.height + 1; return true; } return false; }
Below we define a fiducial
data type to store a fiducial's properties. The cFiducialIdentifier
is a helper class to store the 16-bit encodings and sort them by orientation (relative to fiducial orientation). Lastly, the cFiducial
object declares two properties, a vector of detected fiducials and a vector of identifiers (for evaluating identification via the generateID()
method) and a method, extractFiducials()
, which accepts the blob vector generated by our tracker and returns a reference to the detected fiducials.
#ifndef FIDUCIAL_H #define FIDUCIAL_H #include <algorithm> #include <vector> #include <math.h> #include "blob.h" struct fiducial { point center; point dimensions; double orientation; int id; }; class cFiducialIdentifier { private: double orientation; bool value; protected: public: cFiducialIdentifier(double orientation, bool value); ~cFiducialIdentifier(); bool operator<(const cFiducialIdentifier& f) const; friend class cFiducial; }; class cFiducial { private: std::vector<fiducial> fiducials; std::vector<cFiducialIdentifier> identifier; protected: public: cFiducial(); ~cFiducial(); std::vector<fiducial>& extractFiducials(std::vector<cBlob>& blobs); int generateID(); }; #endif
In our extractFiducials()
method we first clear the fiducials from the previous pass. We need at least five blobs to detect a fiducial (no encoding bits yields an identification of zero). We then build the container hierarchy and process our blobs. In the loop we search for blobs with a node height of four (the white fiducial border used to isolate the fiducial from the background). The first child of this blob (the larger black circle) contains the white blobs (ones encoding) and the middle white circle. The middle white circle contains the black blobs (zeroes encoding) and the center black blob. At this point we evaluate the center of the blob and move on to the child white blob (the offset) to evaluate the orientation. Then we proceed to process the encoding by evaluating the orientation of the encoding blobs relative to the fiducial orientation. We push these onto the identifier for sorting, generate an id, and push the fiducial onto the vector.
#include "fiducial.h" cFiducialIdentifier::cFiducialIdentifier(double orientation, bool value) : orientation(orientation), value(value) { } cFiducialIdentifier::~cFiducialIdentifier() { } bool cFiducialIdentifier::operator<(const cFiducialIdentifier& f) const { return (orientation < f.orientation); } cFiducial::cFiducial() { } cFiducial::~cFiducial() { } std::vector<fiducial>& cFiducial::extractFiducials(std::vector<cBlob>& blobs) { fiducials.clear(); if (blobs.size() < 5) return fiducials; sort(blobs.begin(), blobs.end()); for (int i = 0; i < blobs.size() - 1; i++) { if (blobs[i].event == BLOB_UP) continue; for (int j = i + 1; j < blobs.size(); j++) { if (blobs[j].event == BLOB_UP) continue; if (blobs[j].contains(blobs[i], i)) break; } } fiducial temp; // center, dimensions, orientation, id double theta; double x, y; int child1, child2, child3, child4; identifier.clear(); for (int i = 0; i < blobs.size(); i++) { if (blobs[i].height == 4) { // possible fiducial child1 = blobs[i].children[0]; // first child (ones) if (blobs[child1].height != 3) break; child2 = blobs[child1].children[blobs[child1].children.size()-1]; // second child (zeroes) if (blobs[child2].height != 2) break; child3 = blobs[child2].children[blobs[child2].children.size()-1]; // third child if (blobs[child3].height != 1) break; temp.center.x = blobs[child3].location.x; // evaluate center temp.center.y = blobs[child3].location.y; child4 = blobs[child3].children[0]; x = blobs[child4].location.x - temp.center.x; y = blobs[child4].location.y - temp.center.y; temp.orientation = atan2(y, x); // evaluate orientation if (temp.orientation < 0) temp.orientation += 2 * M_PI; //process id for (int j = 0; j < blobs[child1].children.size() - 1; j++) { x = blobs[blobs[child1].children[j]].location.x - temp.center.x; y = blobs[blobs[child1].children[j]].location.y - temp.center.y; theta = atan2(y, x); if (theta < 0) theta += 2 * M_PI; theta -= temp.orientation; if (theta < 0) theta += 2 * M_PI; identifier.push_back(cFiducialIdentifier(theta, true)); // ones } for (int j = 0; j < blobs[child2].children.size() - 1; j++) { x = blobs[blobs[child2].children[j]].location.x - temp.center.x; y = blobs[blobs[child2].children[j]].location.y - temp.center.y; theta = atan2(y, x); if (theta < 0) theta += 2 * M_PI; theta -= temp.orientation; if (theta < 0) theta += 2 * M_PI; identifier.push_back(cFiducialIdentifier(theta, false)); // zeroes } sort(identifier.begin(), identifier.end()); temp.id = generateID(); fiducials.push_back(temp); } } return fiducials; } int cFiducial::generateID() { int id = 0; for (int i = 0; i < identifier.size(); i++) id += (identifier[i].value ? 1 : 0)*pow(2, i); return id; }
We will add an OpenGL helper function for rendering the identification number. We will use SDL_ttf for rendering text.
void renderText(const TTF_Font *font, const GLubyte& r, const GLubyte& g, const GLubyte& b, const double& x, const double& y, const double& z, const std::string& text) { SDL_Color color = {r, g, b}; //SDL_Surface *Message = TTF_RenderText_Solid(const_cast<TTF_Font*>(font), text.c_str(), color); //SDL_Surface *Message = TTF_RenderText_Shaded(const_cast<TTF_Font*>(font), text.c_str(), color); SDL_Surface *message = TTF_RenderText_Blended(const_cast<TTF_Font*>(font), text.c_str(), color); GLuint texture; setupTexture(texture); glTexImage2D(GL_TEXTURE_2D, 0, 4, message->w, message->h, 0, GL_BGRA, GL_UNSIGNED_BYTE, message->pixels); glEnable(GL_BLEND); glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA); glBegin(GL_QUADS); glTexCoord2f(0, 0); glVertex3f(x, y, z); glTexCoord2f(1, 0); glVertex3f(x + message->w, y, z); glTexCoord2f(1, 1); glVertex3f(x + message->w, y + message->h, z); glTexCoord2f(0, 1); glVertex3f(x, y + message->h, z); glEnd(); glDisable(GL_BLEND); deleteTexture(texture); SDL_FreeSurface(message); }
In our main function we create an instance of cFiducial
and a vector of fiducials (returned from extractFiducials()
). We have added the initializing of SDL_ttf and loading a font. Finally, we call our extractFiducials()
method near the end of the event loop, render circles about the fiducials' centers, draw a line identifying the orientation, and render the identification number.
#include <sstream> #include "src/filter.h" #include "src/glhelper.h" #include "src/tracker.h" #include "src/fiducial.h" // capture parameters static const int CDEVICE = 0; static const int CWIDTH = 640; static const int CHEIGHT = 480; // filter parameters static const int FKERNELSIZE = 7; // for gaussian blur static const double FSTDDEV = 1.5; static const int FBLOCKSIZE = 63; // for adaptive threshold static const int FC = -5; // display (native resolution of projector) static const int WIDTH = 640; //1280 static const int HEIGHT = 480; //800 static const int BPP = 32; // tracker parameters static const double TMINAREA = 16; // minimum area of blob to track static const double TMAXRADIUS = 24; // a blob is identified with a blob in the previous frame if it exists within this radius //////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// main int main(int argc, char *argv[]) { bool active = true, balance = true, fullscreen = false; int render = 0; cFilter filter(CDEVICE, CWIDTH, CHEIGHT, FKERNELSIZE, FSTDDEV, FBLOCKSIZE, FC); filter.balance(balance); cTracker tracker(TMINAREA, TMAXRADIUS); cFiducial fid; vector<fiducial> fiducials; SDL_Init(SDL_INIT_EVERYTHING); SDL_Surface *screen = SDL_SetVideoMode(WIDTH, HEIGHT, BPP, (fullscreen ? SDL_FULLSCREEN : 0) | SDL_HWSURFACE | SDL_OPENGL); SDL_Event event; TTF_Init(); TTF_Font *font = TTF_OpenFont("media/Coalition.ttf", 18); setupOrtho(WIDTH, HEIGHT); glClearColor(0.0f, 0.0f, 0.0f, 0.0f); GLuint texture; setupTexture(texture); while (active) { while (SDL_PollEvent(&event)) { switch (event.type) { case SDL_QUIT: active = false; break; case SDL_KEYDOWN: switch (event.key.keysym.sym) { case SDLK_f: // toggle fullscreen fullscreen ^= true; screen = SDL_SetVideoMode(WIDTH, HEIGHT, BPP, (fullscreen ? SDL_FULLSCREEN : 0) | SDL_HWSURFACE | SDL_OPENGL); break; case SDLK_b: // toggle balance balance ^= true; filter.balance(balance); break; case SDLK_1: // captured frame render = 0; break; case SDLK_2: // filtered frame render = 1; break; case SDLK_3: // gaussian blur render = 2; break; case SDLK_4: // adaptive threshold render = 3; break; } break; } } filter.filter(); // capture, filter, blur, (balance), threshold switch (render) { case 0: glClear(GL_COLOR_BUFFER_BIT); renderTexture(texture, filter.captureFrame(), false, WIDTH, HEIGHT); break; case 1: glClear(GL_COLOR_BUFFER_BIT); renderTexture(texture, filter.filterFrame(), true, WIDTH, HEIGHT); break; case 2: glClear(GL_COLOR_BUFFER_BIT); renderTexture(texture, filter.gaussianFrame(), true, WIDTH, HEIGHT); break; case 3: glClear(GL_COLOR_BUFFER_BIT); renderTexture(texture, filter.thresholdFrame(), true, WIDTH, HEIGHT); break; } tracker.trackBlobs(filter.thresholdFrame(), true); fiducials = fid.extractFiducials(tracker.getBlobs()); renderBlobs(tracker.getBlobs(), 0.0f, 1.0f, 1.0f, 1.0f, 1.0f, 0.0f); for (int i = 0; i < fiducials.size(); i++) { renderCircle(fiducials[i].center.x, fiducials[i].center.y, 16.0f, 30, 1.0f, 0.0f, 1.0f); glColor4f(0.0f, 1.0f, 0.0f, 1.0f); glBegin(GL_LINES); glVertex3f(fiducials[i].center.x, fiducials[i].center.y, 0.0f); glVertex3f(fiducials[i].center.x+128*cos(fiducials[i].orientation), fiducials[i].center.y+128*sin(fiducials[i].orientation), 0.0f); glEnd(); std::stringstream out; out << fiducials[i].id; std::string s = out.str(); glPushMatrix(); glTranslatef(fiducials[i].center.x, fiducials[i].center.y, 0.0f); glRotatef(fiducials[i].orientation*180/M_PI, 0.0f, 0.0f, 1.0f); renderText(font, 127, 255, 127, 0.0f, 0.0f, 0.0f, s); glPopMatrix(); } SDL_GL_SwapBuffers(); } deleteTexture(texture); TTF_CloseFont(font); TTF_Quit(); SDL_Quit(); return 0; }
The updated Makefile
follows:
all: main.cc lib/capture.o lib/filter.o lib/glhelper.o lib/blob.o lib/tracker.o lib/fiducial.o g++ main.cc lib/capture.o lib/filter.o lib/glhelper.o lib/blob.o lib/tracker.o lib/fiducial.o cvblobslib/libblob.a -o bin/main -L/usr/lib `sdl-config --cflags --libs` -lSDL_ttf `pkg-config opencv --cflags --libs` -lGL lib/capture.o: src/capture.h src/capture.cc g++ src/capture.cc -c -o lib/capture.o lib/filter.o: lib/capture.o src/filter.h src/filter.cc g++ src/filter.cc -c -o lib/filter.o lib/glhelper.o: lib/blob.o src/glhelper.h src/glhelper.cc g++ src/glhelper.cc -c -o lib/glhelper.o lib/blob.o: src/blob.h src/blob.cc g++ src/blob.cc -c -o lib/blob.o lib/tracker.o: lib/blob.o src/tracker.h src/tracker.cc g++ src/tracker.cc -c -o lib/tracker.o -I/usr/local/include/opencv/ lib/fiducial.o: lib/blob.o src/fiducial.h src/fiducial.cc g++ src/fiducial.cc -c -o lib/fiducial.o clean: @rm -f *~ src/*~ lib/* bin/*
In my next post I may revisit the algorithm used by cvBlobsLib, Connected Component Labeling, and offer an implementation using the Disjoint Set data structure. As always, comments are welcome.
Don't forget to run make
in the cvblobslib
folder to generate the libblob.a
file required to link this project.
Download this project: fiducial.tar.bz2
References:
1. Nishino, Hiroki (2010). Topolo Surface: A 2D Fiducial Tracking System Based on Topological Region Adjacency Information and Angle Information. Journal of Information Processing, 18:16-25.