How to pad numpy arrays

Sometimes, e.g. when preparing input for a deep learning model, we need to pad a numpy array to a certain length. Suppose we have an array that looks like this:

a = [
    [1, 2, 3],
    [4, 5],
    [6]
]

We want to pad it to shape (3, 3), filling the empty positions with 0.

The intuitive approach is to iterate over the array and pad each row by hand. We don't have to, though, because numpy provides a function for exactly this, called (you bet) np.pad.

If the array to be padded is regular, meaning every row has the same length, np.pad can pad it in any direction along any dimension. For instance:

import numpy as np

b = [[1, 2], [3, 4]]
np.pad(b, ((3, 2), (2, 3)), 'constant')

produces

array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 2, 0, 0, 0],
       [0, 0, 3, 4, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]])
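
The second argument to np.pad holds one (before, after) pair per axis, so each output dimension is the original size plus both pads on that axis. The following is a small sketch that checks this on the same array. The variable names pad_width, padded and expected_shape are only for illustration.

```python
import numpy as np

b = np.array([[1, 2], [3, 4]])

# One (before, after) pair per axis:
# 3 rows above and 2 below, 2 columns on the left and 3 on the right.
pad_width = ((3, 2), (2, 3))
padded = np.pad(b, pad_width, mode="constant", constant_values=0)

# Each output dimension is the original size plus both pads on that axis.
expected_shape = tuple(
    n + before + after for n, (before, after) in zip(b.shape, pad_width)
)
print(padded.shape == expected_shape)

# The original values sit at the offset given by the "before" pads.
print(padded[3:5, 2:4])
```

Passing constant_values=0 is optional here, since 0 is the default fill for the 'constant' mode.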

However, our array is not a regular one: its rows have different lengths, so a single np.pad call won't do. We need to think outside the box. Instead of padding the whole array at once, we pad it row by row. In code:

[np.pad(row, (0, max_len - len(row)), 'constant', constant_values=0) for row in a]

Here max_len = 3, the length of the longest row.
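
Put together, the row-by-row approach might look like the sketch below. It assumes max_len should be the length of the longest row rather than a hard-coded value, and it stacks the padded rows with np.array at the end. The names padded_rows and result are illustrative.

```python
import numpy as np

a = [
    [1, 2, 3],
    [4, 5],
    [6],
]

# Length of the longest row; every shorter row is padded up to it.
max_len = max(len(row) for row in a)

# Pad only on the right: (0, n) means 0 elements before and n after.
padded_rows = [
    np.pad(row, (0, max_len - len(row)), "constant", constant_values=0)
    for row in a
]

# The rows now share one length, so they stack into a regular 2-D array.
result = np.array(padded_rows)
print(result.shape)
```

The list comprehension alone returns a list of 1-D arrays. The final np.array call is what gives the (3, 3) shape.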

After applying that, we finally get what we want:

a = [
    [1, 2, 3],
    [4, 5, 0],
    [6, 0, 0]
]
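
For comparison, the same result can be reached without np.pad by preallocating a filled array and copying each row into it. This is a different technique from the one above, sketched here under the assumption that the rows are non-empty lists of integers. The helper name pad_rows and its fill parameter are hypothetical.

```python
import numpy as np

def pad_rows(rows, fill=0):
    """Hypothetical helper: right-pad ragged rows into one 2-D array."""
    max_len = max(len(row) for row in rows)
    # Start from an array that already holds the fill value everywhere.
    out = np.full((len(rows), max_len), fill)
    for i, row in enumerate(rows):
        # Overwrite the left part of each output row with the real values.
        out[i, :len(row)] = row
    return out

padded = pad_rows([[1, 2, 3], [4, 5], [6]])
print(padded)
```

Note that np.full infers the dtype from fill, so an integer fill produces an integer array. Pass a float fill if the rows hold floats.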