aShademan: {discussion.vision} What can a machine see?

A general question is posed as the topic of this entry. There are different approaches to solving different problems in computer vision. I am specifically curious to know which problems each approach solves and how much a priori information each approach assumes. I would also like to know how far, in your opinion, we are from replicating a biological system. Are we heading in the right direction to do so in computer vision?

The classification of the different approaches is not quite clear (see SoloGen's* comment on the previous entry), but as a first cut we might categorize them into (a) biological and (b) non-biological (artificial) vision. You may wonder why I am considering biological vision when writing about computer vision. I have two reasons: (1) nature is man's best teacher, and (2) artificial gadgets can be integrated into biological visual systems to correct malfunctions or improve vision (e.g., see Progress in Artificial Vision). For these two reasons, many are investigating how biological visual systems work. Having said all this, I will leave biological vision for another discussion and focus on non-biological vision.

Based on how much we know about the sensor (camera), non-biological vision can be sub-categorized into (b-1) calibrated, where the parameters of the camera are known, and (b-2) uncalibrated, where they are not. We can categorize the approaches further by how much we know about the world: how much we know about the scene model, and what type of images we are dealing with.

In this sense, geometric approaches assume that the type of camera is known and that the scene model can be represented geometrically. For example, there is a calibrated projective camera and the scene consists of boxes with a solid texture.
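To make the geometric case a bit more concrete, here is a minimal sketch of what "known camera, geometric scene" can mean in practice: a pinhole (projective) camera with known intrinsics projecting the corners of a box onto the image. The focal lengths, principal point, box size, and the function names are all assumptions I made up for illustration; lens distortion and camera pose are ignored.

```python
# Hypothetical intrinsics of a calibrated pinhole camera (illustrative values only).
FX, FY = 800.0, 800.0   # focal lengths, in pixels
CX, CY = 320.0, 240.0   # principal point, in pixels


def project(point):
    """Project a 3D point given in camera coordinates (z > 0) to pixel coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (FX * x / z + CX, FY * y / z + CY)


def box_corners(center, size):
    """Return the 8 corners of an axis-aligned box with the given center and size."""
    cx, cy, cz = center
    hx, hy, hz = (s / 2.0 for s in size)
    return [
        (cx + sx * hx, cy + sy * hy, cz + sz * hz)
        for sx in (-1, 1)
        for sy in (-1, 1)
        for sz in (-1, 1)
    ]


# A unit cube placed 5 units in front of the camera.
corners = box_corners(center=(0.0, 0.0, 5.0), size=(1.0, 1.0, 1.0))
pixels = [project(p) for p in corners]
```

Because both the camera parameters and the scene model are known here, the image of the box is fully predictable. Geometric approaches typically run this reasoning in reverse, recovering the scene or the camera pose from observed image points.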

Learning-based approaches go a step further and assume that there is no a priori knowledge of the scene model, so everything has to be learned. It still makes sense to know something about the camera, since without any a priori knowledge we would be looking at a near-infinite space of plausible inferences for a scene.

I think this is enough to start off the discussion. Feel free to elaborate on these categories or propose others as we continue.

You can either write in the comments section or, if you prefer, write on your own blog and I will link to you in the next entry.

P.S. I don't believe much in discussion forums. A blog is a better place to write and exchange ideas, I believe. I intend to summarize our ideas and prepare an online version for future reference.

---
* SoloGen maintains two blogs: Thesilog and Anti Memoirs.

Labels: academic life, computer vision

aShademan

May 31, 2006
