“Suppose now that I point my forefinger toward a cat, saying: /This is a cat/. […] At this point the referent-cat is no longer a mere physical object. It has already been transformed into a semiotic entity.”
Encountering this timeless example from Eco’s treatise A Theory of Semiotics (1977, p. 165) yesterday, I felt somewhat redeemed. For I had been spending a seemingly inordinate amount of time during this residency studying the treatise, instead of tackling the hardships of software (neural networks and such) or, you know, being creative with objects on stage. The feeling of redemption stemmed from the fact that the cat quote, at last, concisely captured why I believe (for the moment) that when we are dealing artistically with computer vision, semiotics, old-fashioned as it may seem, can be an adequate theoretical framework to reflect on our practice.
(image from Wikimedia Commons)
Why semiotics? First, because it provides a theory of recognition: the process of making sense of perception. “Recognition occurs when a given object or event […] comes to be viewed […] as the expression of a given content” (221): This is a cat. Which is precisely what object recognition software like Google Vision gives us when it labels an image: a location within the image where it recognised something (“this”) and a label (“cat”). The image is transformed into a semiotic entity.
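To make the parallel concrete, here is a minimal sketch of what such a labelled detection might look like as data. The `Detection` class, its field names and the numbers are my own assumptions for illustration; this is not Google Vision's actual response format, only the general shape of "a location plus a label".

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Detection:
    """One recognised object: a region of the image plus a label (hypothetical structure)."""
    label: str    # the content, e.g. "Cat"
    score: float  # the software's confidence, between 0 and 1
    # The "this": left, top, right, bottom as fractions of the image size.
    box: Tuple[float, float, float, float]


def as_statement(d: Detection) -> str:
    """Spell out a detection as a pointing gesture plus a sentence."""
    left, top, right, bottom = d.box
    return (
        f"/This is a {d.label.lower()}/ "
        f"(pointing at x {left:.2f}-{right:.2f}, y {top:.2f}-{bottom:.2f}; "
        f"confidence {d.score:.0%})"
    )


# Made-up values for illustration, not real Google Vision output.
cat = Detection(label="Cat", score=0.93, box=(0.10, 0.25, 0.60, 0.90))
print(as_statement(cat))
```

The bounding box plays the part of the pointing forefinger, the label that of the uttered sentence.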
Second, semiotics as a theory of meaning doesn’t require the notion of a “subject” at all (314ff). Thus it should appear uniquely qualified to analyse processes of object recognition by present-day computer software, where the involvement of recognising subjects is, if at all, highly debatable.
Third, in experimenting with Google Vision (see previous posts), some of the labels we got for our photos had, I would argue, the effect of appearing “ridiculously or tragically meaningful” (64). Let’s have another look at the musician:
This image was taken of a spray bottle, a lighter, an umbrella, a lamp, a piece of aluminium foil and a cork (bottom to top), most of which were not physically touching, but were joined into one coherent shape in the picture by means of perspective. Google Vision labelled it (with a rather low confidence score, to be fair) as “Musician”, which gave us a great deal of amusement. Was it just amusement, though, or also a bit of a thrill that we got from toying with a system that blatantly contradicted how we ourselves saw the image, for reasons we didn’t even begin to understand? (We tried unsuccessfully to produce more musician images for a while.) That’s what I began asking myself after reading the following from Eco (discussing a nonsensical sentence):
“One laughs because even though one realizes that the situation is unthinkable, one understands the meaning of the sentence. One feels fear because, even though one realizes that the situation is possible, one does not like to accept such an alarming semantic organization of one’s experience.” (p. 64)
There is potential for humour in misrecognition. And for horror as well.
Carlos, July 28, 2020