February 6, 2012

This is a wonderful video [from Timo Arnall at Berg London], but I think seeing all of that computer vision at once makes it easy for us to project a bit too much facility onto the robots and their vision.

Before we get too overwhelmed by the abilities of modern computer vision, it’s worth remembering that everything without overlaid graphics is functionally invisible to those specific algorithms, and what is visible comes with zero higher perceptual understanding. A pattern has been matched, no more.

[I’m going to put a disclaimer here: I took CogSci 101 a long time ago and then jumped right into trying to read EEG, so my understanding is both shallow and past its shelf life, and in need of a refresh.]

There are a host of disorders in humans categorized as “agnosia”, or ‘not knowing’. Much of what we know about our own visual system, like much else of what we know about our own cognition, comes from examining cases like these where it has gone wrong. [This has a certain morbid fascination for me: somewhere out there are cognitive psychologists hoping against hope for someone in their vicinity to come down with a slightly novel stroke.]

One of the sub-disorders is “associative agnosia”, which is where someone who can see perfectly well has difficulty identifying objects visually (though allowing them to hold the object, or telling them the name of the object, might trigger instant recognition). There is no semantic understanding of what is viewed.

Another is “integrative agnosia”, where sufferers cannot recognize things at once, but must piece together an object by considering the recognizable parts.

Then there are cases of category-specific agnosia, where those afflicted have extreme difficulty recognizing something like animals, but might be quite good at recognizing man-made objects. This seems bizarre until we consider that we have additional mental models of many man-made objects due to our interactions with them. I can’t say I’ve had many interactions with most of the animal kingdom (honest, guv’ner).

The thing to remember is that robots suffer from all of these in particular, and agnosia in general: they don’t know what they’re looking at, and that makes it incredibly difficult to program them to look for it reliably.

In the grand scheme of things, we’re really just nibbling on the very edges of our own perceptual capabilities, from many fields all at once. I tend to think that solving any particular field of robotics like computer vision eventually expands until it’s an AI-complete task: really solving computer vision requires solving artificial intelligence. There’s a feedback loop between the knowing and the seeing that just isn’t there in the machines.

(Source: vimeo.com, via thewavingcat)
