Cropping. Perhaps It's Simply Array Slicing!

Images As Arrays

By now we've already established that, to a computer, an image is just a bunch of numbers that represent individual pixels. If that's a bit confounding, please check out my previous posts.

So essentially an image is an array! In Python (with NumPy), an array is simply a grid of numbers arranged in rows and columns. In fact, an image of 1080p resolution is an array of 1920 columns and 1080 rows (which yields an aspect ratio of 1920:1080 = 16:9, just a fun fact).
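To make the rows-and-columns idea concrete, here is a minimal sketch that builds a blank array with the dimensions of a 1080p frame and reads its shape back. The name `frame` and the all-zero pixel values are assumptions for illustration only. The third axis of size 3 is also an assumption: it reflects how colour (RGB) images are commonly stored, while a grayscale image would have just rows and columns.

```python
import math
import numpy as np

# Hypothetical stand-in for a 1080p colour frame:
# 1080 rows, 1920 columns, 3 colour channels (RGB), all pixels black.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)

# NumPy reports the shape as (rows, columns, channels).
rows, cols, channels = frame.shape

# Reduce columns:rows to its simplest form to get the aspect ratio.
divisor = math.gcd(cols, rows)
aspect_ratio = (cols // divisor, rows // divisor)

print(frame.shape)
print(aspect_ratio)
```

Note that the shape lists rows first and columns second, which is the opposite order to the usual "width x height" way of quoting a resolution.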

Cropping Images

Consider the image below. It measures about 375 x 500 pixels, meaning 375 rows and 500 columns (that's a total of 375 x 500 = 187,500 pixels). Say we would like to get rid of the background and keep the cat's face in focus. We would simply head to a photo editor, grab the cropping tool, draw a bounding box over the cat's face and voilà, a cropped image.

20220429_224436.png

Slicing Arrays

How could this be replicated at a lower level? In Python there's a concept called slicing (you already know all about it if you are familiar with Python). It's how you select specific regions of a matrix or array, a bit like the bounding box in a cropping tool but much more manual. For instance, if I want to select the first 3 rows and the first 2 columns of an array, I would do the following...

# displaying array
array

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

#  selecting the first 3 rows and the first 2 columns
array[0:3, 0:2]

array([[0, 1],
       [4, 5],
       [8, 9]])
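For anyone who wants to run the slicing example, here is a self-contained sketch. It assumes the 4 x 4 array shown above was built with `np.arange(16).reshape(4, 4)`; the original post does not show how it was created, so treat that line as a guess that happens to reproduce the same values. The names `first_block` and `same_block` are illustrative.

```python
import numpy as np

# Assumed construction of the 4 x 4 array shown above (values 0 to 15).
array = np.arange(16).reshape(4, 4)

# Rows come first, columns second; the stop index is excluded,
# so 0:3 selects rows 0, 1, 2 and 0:2 selects columns 0, 1.
first_block = array[0:3, 0:2]

# A start index of 0 can be omitted, giving the same selection.
same_block = array[:3, :2]

print(first_block)
print(np.array_equal(first_block, same_block))
```

The key detail is that the stop value in a slice is exclusive, which is why `0:3` yields three rows rather than four.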

If you understood that, let's take a look at the image once again. Using the 'number lines' (axes) on the left and bottom of the image, we can see that the cat's face sits approximately between rows 50 and 251, and between columns 170 and 401. Let's slice the image array along those values and see what happens.

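from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

image = Image.open('image.jpg')  # load the image file with Pillow
image = np.array(image)  # convert it to a NumPy array of pixels
cropped = image[50:251, 170:401]  # keep rows 50-250 and columns 170-400
plt.imshow(cropped)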

Just as you might have guessed, we end up with a cropped image, as seen below. It's quite interesting how much control one can have over images once they are broken down to their pixel representation.

20220429_225038.png
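The crop can also be wrapped in a small helper that plays the role of the bounding box. This is only a sketch: the `crop` function and its parameter names are hypothetical, and a blank 375 x 500 array stands in for the cat photo so the example runs without an image file.

```python
import numpy as np

def crop(image, top, bottom, left, right):
    """Return the region of image covering rows top..bottom-1 and columns left..right-1."""
    return image[top:bottom, left:right]

# Hypothetical stand-in for the cat photo: 375 rows, 500 columns, 3 colour channels.
image = np.zeros((375, 500, 3), dtype=np.uint8)

# Same bounds as the slice used on the real image.
cropped = crop(image, 50, 251, 170, 401)
print(cropped.shape)

# Basic slicing returns a view that shares memory with the original array,
# so take a copy if the crop will be modified independently.
independent = cropped.copy()
print(np.shares_memory(cropped, image), np.shares_memory(independent, image))
```

Because the slice is a view, writing into `cropped` would also change `image`; calling `.copy()` avoids that when the original needs to stay untouched.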