dog with RGB channels

A computer cannot directly "see" an image, but it can read the numbers that represent it, stored as a matrix of RGB values.

Each pixel in the image is represented by a three-component vector holding its red, green, and blue (RGB) values, so the whole image is a height × width × 3 array of numbers.
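To make the representation concrete, here is a minimal sketch using NumPy. The tiny 2×2 image and its pixel values are made up for illustration, and the 0-255 integer range is an assumption (it is the common 8-bit convention, but other ranges exist).

```python
import numpy as np

# A hypothetical 2x2 image: 2 rows, 2 columns, 3 channels (R, G, B).
# Values are assumed to be 8-bit integers in the range 0-255.
image = np.array(
    [
        [[255, 0, 0], [0, 255, 0]],      # row 0: a red pixel, a green pixel
        [[0, 0, 255], [255, 255, 255]],  # row 1: a blue pixel, a white pixel
    ],
    dtype=np.uint8,
)

print(image.shape)  # (2, 2, 3): height, width, channels

# One pixel is a vector of three numbers.
top_left = image[0, 0]

# One channel is a 2D matrix with the same height and width as the image.
red_channel = image[:, :, 0]
```

Indexing with `image[row, col]` gives the RGB vector of a single pixel, while slicing the last axis gives one color channel for the whole image.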

Given this matrix, different learning algorithms can be applied to classify the image. Just as humans tell objects apart by their shape, color, and texture, a computer can learn to classify an image from features found in these numbers.
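As a rough illustration of learning from the numbers, the sketch below classifies images using a single hand-picked feature, the average color, and a nearest-centroid rule. This is only one very simple choice, not a method prescribed by the text; the function names (`mean_color`, `fit_centroids`, `predict`), the labels, and the solid-color training images are all hypothetical. Practical systems usually learn much richer features that also capture shape and texture.

```python
import numpy as np

def solid(color, size=4):
    """Build a hypothetical size x size image filled with one RGB color."""
    img = np.zeros((size, size, 3), dtype=np.uint8)
    img[:, :] = color
    return img

def mean_color(image):
    """Reduce an H x W x 3 image to one feature vector: its average R, G, B."""
    return image.reshape(-1, 3).mean(axis=0)

def fit_centroids(images, labels):
    """Average the feature vectors of each class to get one centroid per label."""
    features = np.array([mean_color(img) for img in images])
    centroids = {}
    for label in set(labels):
        mask = np.array([l == label for l in labels])
        centroids[label] = features[mask].mean(axis=0)
    return centroids

def predict(image, centroids):
    """Return the label whose centroid is closest to the image's mean color."""
    feature = mean_color(image)
    return min(centroids, key=lambda label: np.linalg.norm(feature - centroids[label]))

# Made-up training data: two reddish and two bluish images.
train_images = [solid((200, 30, 30)), solid((220, 50, 40)),
                solid((30, 40, 210)), solid((50, 30, 190))]
train_labels = ["red", "red", "blue", "blue"]

centroids = fit_centroids(train_images, train_labels)
print(predict(solid((180, 60, 60)), centroids))  # red
print(predict(solid((40, 60, 170)), centroids))  # blue
```

Average color ignores shape and texture entirely, so it would not separate, say, a dog from a cat; it only shows the general pattern of turning pixel values into features and comparing them.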