Object Recognition in C#

John Keogh | March 29, 2013

This is related to the motion tracking blog post that I posted to my company's blog a few weeks ago. The problem I was trying to solve that led to this article was how to teach an autonomous machine what an object is called and have it find the object. I'll be posting something about the autonomous machine (a robot controller called EyesBot Driver) when it is available, but this post is about simple, quick machine learning and object recognition.


As I look around a room and identify objects, there are three different things that I am using.

  • Color palette: The colors associated with an object. For example, an orange, a person, or a specific chair will have a limited set of colors that are associated with them and help to visually key that object to an identity.
  • Visual texture: The colors associated with an object have a specific texture. For example, an orange is orange with small, lighter-colored dimples; glass is largely transparent and has specular (i.e., mirror-like) reflections; the floor in my office is light-colored wood with dark knots and diffuse reflections.
  • Morphology: The palette and texture associated with an object must appear in a specific morphology. So if a surface has the color and visual texture of an orange but the morphology of a chair, I'd probably go have a closer look, since it would likely be a very strange chair.
For what I am doing, real-time object recognition, I'm going to use just the color palette, since it is by far the easiest to process quickly.
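To make "color palette" concrete, here is a minimal sketch of palette extraction in C#. It quantizes each pixel's RGB channels into coarse buckets so that minor noise maps to the same palette entry; the bucket size of 32 and the pixel representation are my assumptions, not details taken from the original application.

```csharp
using System;
using System.Collections.Generic;

// Sketch of palette extraction: collapse each pixel's color into a coarse
// bucket, then collect the distinct buckets as the image's palette.
public static class PaletteExtractor
{
    public const int BucketSize = 32; // assumed quantization step

    // Collapse a 24-bit color into a coarse palette key.
    public static int Quantize(int r, int g, int b)
    {
        int qr = r / BucketSize;
        int qg = g / BucketSize;
        int qb = b / BucketSize;
        return (qr << 16) | (qg << 8) | qb;
    }

    // Build the palette (set of quantized colors) from raw pixel data,
    // given here as (R, G, B) tuples.
    public static HashSet<int> ExtractPalette(IEnumerable<(int R, int G, int B)> pixels)
    {
        var palette = new HashSet<int>();
        foreach (var p in pixels)
            palette.Add(Quantize(p.R, p.G, p.B));
        return palette;
    }
}
```

Quantizing before comparison is what makes palette matching tolerant of small lighting and sensor variations: two nearly identical shades land in the same bucket.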


I wrote an application to test whether it would be possible to easily train an application to recognize objects and then test how well it worked. The application was written in C# and the source is available.


The learning UI is simple to use: you load an image of a background, then the same background with the object to learn added. The application analyzes the palettes of the background and learning images, and the difference is the palette of the object to learn. It then serializes that palette out to an XML file, along with the image name.
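The learning step described above can be sketched as a set difference between palettes, assuming palettes are sets of quantized color keys (ints). The XML element names here are illustrative, not the ones used by the original application.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Sketch of the learning step: colors present in the training image but
// not in the background image are attributed to the object being learned,
// and the result is written out as XML alongside the object's name.
public static class ObjectLearner
{
    public static XElement Learn(string objectName,
                                 HashSet<int> backgroundPalette,
                                 HashSet<int> trainingPalette)
    {
        // Object palette = training palette minus background palette.
        var objectPalette = new HashSet<int>(trainingPalette);
        objectPalette.ExceptWith(backgroundPalette);

        return new XElement("LearnedObject",
            new XAttribute("name", objectName),
            objectPalette.Select(c => new XElement("Color", c)));
    }
}
```

A nice property of this approach is that training requires only two photos and no manual segmentation: the background image itself acts as the mask.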


The palette of the image to be analyzed is loaded, and then the palette of each image signature learned during training is compared against it. Any signature whose palette is mostly or wholly represented in the photo is deemed a match. This works well for objects that have an unusual color palette, but poorly for objects whose palette matches other common objects. Bananas, for example, work quite well, but anything that is primarily black or primarily white will tend to match many common objects.
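The matching step above can be sketched as a coverage test: compute what fraction of a learned signature's palette appears in the analyzed image, and declare a match above some threshold. The 0.8 threshold is an assumed value, not one taken from the original application.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch of the matching step: a learned signature matches when most
// (or all) of its palette is represented in the analyzed image's palette.
public static class PaletteMatcher
{
    // Fraction of the signature's colors found in the image palette.
    public static double Coverage(HashSet<int> signature, HashSet<int> imagePalette)
    {
        if (signature.Count == 0) return 0.0;
        int present = signature.Count(c => imagePalette.Contains(c));
        return (double)present / signature.Count;
    }

    public static bool IsMatch(HashSet<int> signature, HashSet<int> imagePalette,
                               double threshold = 0.8)
    {
        return Coverage(signature, imagePalette) >= threshold;
    }
}
```

This also makes the failure mode visible: a mostly black or mostly white signature has few distinctive buckets, so almost any photo will cover it and trigger a false positive.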

These screen captures show the object recognition correctly returning an identification of what is in the images, but also having false positives. False negatives were rare if the light level was similar to the learning light level, but false positives were common for certain objects (particularly black or white objects).

Problems with the simplistic approach

I'm somewhat surprised this works at all, but it actually works reasonably well. Some things in particular, for example a blue sky, are very well identified using this method. If the image were decomposed into blobs, and the color signature (not just the palette but the amount of each color) were analyzed against each blob, it would give much more accurate results. Additionally, the visual texture and morphology of the blobs would give better results, but would be much harder to learn and to analyze. Another issue is light level, but that can be addressed by modifying the color palette based on the light level of the learning image versus the light level of the image to analyze.
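The light-level correction suggested above could be sketched as a simple channel rescaling: scale each learned color by the ratio of the analysis image's mean brightness to the learning image's mean brightness before comparing palettes. The linear-scaling model is an assumption; real lighting changes are not purely multiplicative.

```csharp
using System;

// Sketch of light-level compensation: rescale a learned color's channels
// by the brightness ratio between the analysis and learning images,
// clamping to the valid 0-255 range.
public static class LightCompensation
{
    public static (int R, int G, int B) Adjust((int R, int G, int B) color,
                                               double learnBrightness,
                                               double analyzeBrightness)
    {
        double scale = analyzeBrightness / learnBrightness;
        int Clamp(double v) => Math.Min(255, Math.Max(0, (int)Math.Round(v)));
        return (Clamp(color.R * scale), Clamp(color.G * scale), Clamp(color.B * scale));
    }
}
```

Adjusted colors would then be re-quantized before matching, so a signature learned in bright light could still cover a dimly lit photo of the same object.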


The object recognition source code is available under a Creative Commons CC BY license. I intend to upload it to GitHub at some point in the near future, but for now it is just a zip file.

Tags: Eyesbot Company, Computer vision, Artificial intelligence, Effecting the physical world