The first time I wrote neural network code for a class assignment, I kept wondering what input the model would expect in order to maximize the probability of a certain class. To illustrate: when a photo of a penguin is not classified as a penguin, what would that photo have to look like to be classified as a penguin with 99.99% confidence?
I tried Keras, and it is easy to write this down in code. I used a pretrained VGG16 model. All the code needs to do is run gradient ascent on the input and visualize the point in input space that maximizes the probability of a chosen class. I picked two classes: penguin and space shuttle.
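The core of the technique is independent of the model: repeatedly nudge the input in the direction of the gradient of the target class's probability. Here is a minimal sketch in plain NumPy, where a toy linear softmax classifier stands in for VGG16 (the weights, target class, step size, and step count are all illustrative, not the actual experiment):

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))  # toy "model": 3 classes, 8 input features
target = 1                   # class whose probability we want to maximize

x = rng.normal(size=8)       # random starting input
p_before = softmax(W @ x)[target]

# Gradient ascent on log p(target | x). For a linear softmax model,
# d/dx log softmax(Wx)[t] = W[t] - sum_k softmax(Wx)[k] * W[k].
for _ in range(200):
    p = softmax(W @ x)
    grad = W[target] - p @ W
    x += 0.1 * grad          # ascent step

p_after = softmax(W @ x)[target]
print(p_before, p_after)     # the target-class probability climbs toward 1
```

With a real convnet, the only difference is that the gradient comes from backpropagation through the network instead of a closed form; the loop over "compute gradient of the class score with respect to the input, then step uphill" is the same.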
(Not so) surprisingly, the resulting inputs look like these,
A space shuttle
What we see and what convnets see are different, because the way we humans perceive and understand images differs from how a convnet does. A convnet works by decomposing the visual input space through a sequence of convolutional filters and learning a probabilistic mapping between those filtered responses and the labels.
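To make "a sequence of convolutional filters" concrete, here is a sketch of what a single filter actually computes. The filter and image below are invented for illustration: a 3x3 vertical-edge detector fires wherever brightness changes from left to right, no matter what object the pixels belong to.

```python
import numpy as np

# A hand-crafted vertical-edge filter: negative on the left, positive on
# the right, so it responds to left-dark / right-bright transitions.
edge_filter = np.array([[-1, 0, 1],
                        [-1, 0, 1],
                        [-1, 0, 1]], dtype=float)

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# Toy image: dark left half, bright right half -> one vertical edge.
image = np.zeros((8, 8))
image[:, 4:] = 1.0

response = conv2d_valid(image, edge_filter)
print(response.max())  # strongest response sits on the edge columns
```

A trained convnet stacks many learned filters like this one, so its notion of "penguin" is ultimately built from such local responses rather than from the whole object the way we see it.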
Hence, although an image is not a penguin in the human sense, a convnet will perceive it as a penguin if its local textures or low-level features are sufficiently penguin-like. That's (probably) how they see the world.