NumPy is a library that provides tools for working with high-performance multidimensional arrays.
Characteristics
The first thing we do is to import the relevant library:
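A minimal sketch of the import step. The alias np is the usual convention and is the name used in the rest of these notes; printing the version is only an illustrative check.

```python
import numpy as np  # "np" is the conventional alias for NumPy

# Print the installed version to confirm the import worked.
print(np.__version__)
```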
An array is a structure for storing and retrieving data.
For instance, if each element of the data were a number, we might visualize a “one-dimensional” array like a list:
1 | 2 | 3 | 4 | 5 |
A two-dimensional array would be like a table:
1 | 2 | 3 | 4 | 5 |
6 | 7 | 8 | 9 | 10 |
11 | 12 | 13 | 14 | 15 |
A three-dimensional array would be like a set of tables, perhaps stacked as though they were printed on separate pages.
In NumPy, this idea is generalized to an arbitrary number of dimensions, and so the fundamental array class is called ndarray: it represents an “N-dimensional array”.
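Most NumPy arrays have some restrictions: all elements must be of the same data type, the total size cannot change once the array is created, and the shape must be rectangular (every row of a two-dimensional array has the same number of columns).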
The easiest way to create an array is to pass a Python list to np.array.
For example, we can create an array below:
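The lecture's own example is not shown here, so the following is a small sketch with assumed values: the names a and b and the numbers (taken from the tables above) are illustrative only.

```python
import numpy as np

# A one-dimensional array built from a flat list.
a = np.array([1, 2, 3, 4, 5])

# A two-dimensional array built from a list of equal-length lists.
b = np.array([[1, 2, 3, 4, 5],
              [6, 7, 8, 9, 10],
              [11, 12, 13, 14, 15]])

print(a)
print(b)
```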
Here are operations that can tell us more information about these arrays:
shape: tuple with the array dimensions
ndim: number of dimensions of the array
size: number of elements of the array
dtype: type of the data stored in the array
This was our original array
Let us see how they work in practice.
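A short sketch of the four attributes listed above, applied to an assumed 3 by 5 array (the original array from the lecture is not shown, so the values here are illustrative).

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5],
              [6, 7, 8, 9, 10],
              [11, 12, 13, 14, 15]])

print(a.shape)  # (3, 5): 3 rows, 5 columns
print(a.ndim)   # 2
print(a.size)   # 15
print(a.dtype)  # an integer type; the exact width depends on the platform
```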
One of the main advantages of NumPy over native Python lists is performance.
NumPy arrays are implemented in C and optimized for high-performance computation.
Python lists are more general and flexible but can be slower for numerical tasks.
Let’s compare the time taken to perform an element-wise addition on a large array using both Python lists and NumPy arrays.
import time

# Python list addition
# (python_list is assumed to be a large list of numbers defined beforehand)
start = time.time()
python_result = [x + x for x in python_list]
python_time = time.time() - start
python_time
0.023259878158569336
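Only the Python-list half of the comparison appears above. The following is a self-contained sketch of both halves; the size n, the names numpy_array and numpy_result, and the use of time.perf_counter (a higher-resolution timer than time.time) are assumptions, and the timings will differ from machine to machine.

```python
import time
import numpy as np

n = 1_000_000                      # assumed size; the lecture's value is not shown
python_list = list(range(n))
numpy_array = np.arange(n)

# Python list: an explicit loop over every element.
start = time.perf_counter()
python_result = [x + x for x in python_list]
python_time = time.perf_counter() - start

# NumPy array: one vectorized operation over the whole array.
start = time.perf_counter()
numpy_result = numpy_array + numpy_array
numpy_time = time.perf_counter() - start

print(f"Python list: {python_time:.6f} s")
print(f"NumPy array: {numpy_time:.6f} s")
```

On typical hardware the NumPy version is noticeably faster, but the exact ratio depends on the array size and the machine.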
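Contiguous Memory: the elements of a NumPy array are stored next to each other in a single block of memory.
Low-Level Optimization: the loops over elements run in compiled C code rather than in the Python interpreter.
Vectorization: an operation is applied to a whole array at once, without an explicit Python loop.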
Element-wise operations allow you to perform calculations on each corresponding element in two arrays or between an array and a scalar.
Key Examples:
Addition (+): Adds corresponding elements.
Subtraction (-): Subtracts corresponding elements.
Multiplication (*): Multiplies corresponding elements.
Division (/): Divides corresponding elements.
Here are examples of Addition and Multiplication
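A minimal sketch of element-wise addition and multiplication on two arrays of the same shape. The arrays a and b are assumed values chosen for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

total = a + b      # adds corresponding elements
product = a * b    # multiplies corresponding elements

print(total)    # [11 22 33 44]
print(product)  # [ 10  40  90 160]
```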
You can also apply operations between an array and a scalar, which applies the operation to each element of the array.
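The array-and-scalar case can be sketched as follows, again with assumed values. The scalar is combined with every element of the array.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

shifted = a + 10   # 10 is added to each element
doubled = a * 2    # each element is multiplied by 2
halved = a / 2     # true division always gives floats

print(shifted)  # [11 12 13 14]
print(doubled)  # [2 4 6 8]
print(halved)   # [0.5 1.  1.5 2. ]
```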
We learned about slicing and indexing in the case of lists.
We can use similar methods in the case of NumPy arrays.
The difference is that slicing a NumPy array returns a view of the same data rather than a copy, so any change made through the slice modifies the original array.
Let us create a new array:
We can easily replace elements within this array by assigning to an index or a slice.
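A sketch of creating an array, replacing elements in place, and changing the original through a slice. The values and the name view are assumptions for illustration.

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])

a[0] = 99      # replace a single element
a[1:3] = 0     # replace every element of a slice with one value
print(a)       # [99  0  0 40 50]

view = a[3:]   # a slice shares memory with the original array
view[0] = -1   # so this assignment also changes a
print(a)       # [99  0  0 -1 50]
```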
If we still want to keep the original array, we can make an explicit copy (for example with a.copy()).
We can now change b.
Because b is an independent copy, a will stay the same: see below
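A minimal sketch of the copy behaviour, assuming the copy is made with the array's copy() method and using illustrative values.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = a.copy()   # b owns its own data

b[0] = 100     # changing b does not touch a

print(b)  # [100   2   3   4   5]
print(a)  # [1 2 3 4 5]
```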
We can create multidimensional arrays in the following way:
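One possible sketch, assuming the array is built from nested lists (each inner list becomes one row). The name m and the values are illustrative.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(m.shape)   # (2, 3)
print(m[1, 2])   # 6: row index 1, column index 2
```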
We can create 8 blocks, each containing 2 rows and 2 columns, giving an array of shape (8, 2, 2).
We can extract the first column from the first block.
We can extract the second column in the first block.
What if we want to extract the first column in the eighth block?
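The three extractions can be sketched as follows. Filling the array with np.arange(32) is an assumption made so that each element is easy to identify; the lecture's actual values are not shown. Indices start at 0, so the eighth block has index 7.

```python
import numpy as np

# 8 blocks, each with 2 rows and 2 columns, filled with 0..31.
c = np.arange(32).reshape(8, 2, 2)

first_col_first_block = c[0, :, 0]    # block 0, all rows, column 0
second_col_first_block = c[0, :, 1]   # block 0, all rows, column 1
first_col_eighth_block = c[7, :, 0]   # block 7, all rows, column 0

print(first_col_first_block)    # [0 2]
print(second_col_first_block)   # [1 3]
print(first_col_eighth_block)   # [28 30]
```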
NumPy has specific functions to create arrays filled with default values:
Let us look at some examples:
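The lecture's own examples are not shown here. The sketch below uses a few standard NumPy constructors that may or may not be the ones originally presented; the shapes and fill values are assumptions.

```python
import numpy as np

zeros = np.zeros((2, 3))       # 2x3 array of 0.0
ones = np.ones(4)              # four 1.0 values
sevens = np.full((2, 2), 7)    # 2x2 array filled with 7
identity = np.eye(3)           # 3x3 identity matrix
evens = np.arange(0, 10, 2)    # 0, 2, 4, 6, 8
grid = np.linspace(0, 1, 5)    # 5 evenly spaced values from 0 to 1

for name, arr in [("zeros", zeros), ("ones", ones), ("sevens", sevens),
                  ("identity", identity), ("evens", evens), ("grid", grid)]:
    print(name, arr, sep="\n")
```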
Universal functions (ufuncs) are functions that perform element-wise operations on data in arrays: the function is applied to each element of the array in turn.
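A small sketch using three standard ufuncs, np.sqrt, np.log and np.minimum. The choice of functions and the input values are assumptions for illustration.

```python
import numpy as np

x = np.array([1.0, 4.0, 9.0, 16.0])

roots = np.sqrt(x)           # square root of each element
logs = np.log(x)             # natural logarithm of each element
capped = np.minimum(x, 5.0)  # element-wise minimum against a scalar

print(roots)   # [1. 2. 3. 4.]
print(logs)
print(capped)  # [1. 4. 5. 5.]
```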
There is a set of mathematical functions that compute statistics on an entire array.
These include sum, mean, and std (standard deviation).
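A sketch of the three statistics computed over a whole array. The 3 by 3 matrix holding 0 to 8 is an assumed example.

```python
import numpy as np

m = np.arange(9).reshape(3, 3)   # values 0..8 in a 3x3 matrix

total = m.sum()      # sum of all nine elements
average = m.mean()   # arithmetic mean of all elements
spread = m.std()     # population standard deviation (ddof=0 by default)

print(total)    # 36
print(average)  # 4.0
print(spread)   # about 2.582
```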
The previous operations have been performed on the entire matrix. It is possible to specify the axis instead. The outputs below are consistent with summing a 3×3 matrix holding the values 0 to 8. With axis=0, the sum runs down each column:
array([ 9, 12, 15])
With axis=1, the sum runs along each row:
array([ 3, 12, 21])
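The two outputs above can be reproduced with the sketch below. The input matrix is an assumption (the original is not shown), chosen because a 3 by 3 matrix of 0 to 8 gives exactly these sums.

```python
import numpy as np

m = np.arange(9).reshape(3, 3)
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]

column_sums = m.sum(axis=0)   # collapse the rows: one sum per column
row_sums = m.sum(axis=1)      # collapse the columns: one sum per row

print(column_sums)  # [ 9 12 15]
print(row_sums)     # [ 3 12 21]
```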
Let us imagine that we have a small dataset representing student scores in different subjects.
Tasks:
# Task 2: Compute the mean score for each subject
mean_scores = scores.mean(axis=0)
print("Mean scores for each subject:", mean_scores)
Mean scores for each subject: [82.4 84.6 85.8 89.4]
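The dataset itself is not shown, so the sketch below uses hypothetical scores (three students, four subjects) purely to illustrate how the axis argument applies to this kind of table. It will not reproduce the means printed above, and the per-student mean and per-subject maximum are illustrative extras rather than the lecture's stated tasks.

```python
import numpy as np

# Hypothetical data: each row is a student, each column is a subject.
scores = np.array([[85, 90, 78, 92],
                   [70, 88, 95, 80],
                   [92, 75, 84, 89]])

mean_per_subject = scores.mean(axis=0)   # average down each column
mean_per_student = scores.mean(axis=1)   # average along each row
best_per_subject = scores.max(axis=0)    # highest score in each subject

print("Mean per subject:", mean_per_subject)
print("Mean per student:", mean_per_student)
print("Best per subject:", best_per_subject)
```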
While working with NumPy, you may encounter some common errors.
A frequent issue is that the shapes of arrays do not match in operations such as addition or multiplication.
NumPy requires that the arrays have the same shape, or shapes that are compatible under the broadcasting rules.
Example:
Fix: Ensure that arrays are of compatible shapes, or reshape them using .reshape() or np.expand_dims() so that they can be broadcast.
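A sketch of a shape mismatch and the two fixes named above. The arrays are assumed values; the point is that shapes (2, 3) and (2,) cannot be broadcast, while (2, 3) and (2, 1) can.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])   # shape (2, 3)
b = np.array([10, 20])      # shape (2,)

try:
    a + b                   # trailing dimensions 3 and 2 do not match
except ValueError as err:
    print("ValueError:", err)

# Fix: give b shape (2, 1) so it is broadcast across the columns of a.
fixed = a + b.reshape(2, 1)
same = a + np.expand_dims(b, axis=1)   # equivalent to the reshape above

print(fixed)
```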
NumPy is strict about data types (dtype).
Operations can fail if the types are incompatible (e.g., trying to multiply a string array by an integer), with an error such as:
ufunc 'multiply' did not contain a loop with signature matching types (dtype('<U1'), dtype('int64')) -> None
Fix: Convert the array to the correct type using .astype().
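A sketch of the type error and the .astype() fix. The array contents are assumed, and the exact wording of the error message can vary between NumPy versions, so the code only relies on it being a TypeError.

```python
import numpy as np

s = np.array(["1", "2", "3"])   # strings, dtype '<U1'

try:
    s * 2                       # no multiply loop for a string array and an integer
except TypeError as err:
    print("TypeError:", err)

# Fix: convert the strings to integers before doing arithmetic.
nums = s.astype(int)
doubled = nums * 2
print(doubled)  # [2 4 6]
```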
NumPy raises an IndexError if you try to access an element outside the bounds of the array, with a message such as:
index 5 is out of bounds for axis 0 with size 3
Fix: Ensure that you access elements within the valid range of indices.
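A sketch of the out-of-bounds access and one way to guard against it. The array and the index value are assumptions matching the message above (index 5, size 3).

```python
import numpy as np

arr = np.array([1, 2, 3])
index = 5

try:
    arr[index]                 # valid indices are 0, 1, 2 (or -1, -2, -3)
except IndexError as err:
    print("IndexError:", err)

# Fix: check the index against the array size before using it.
if 0 <= index < arr.size:
    value = arr[index]
else:
    value = None
    print(f"index {index} is not valid for an array of size {arr.size}")

last = arr[-1]                 # negative indices count from the end
print(last)  # 3
```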
Popescu (JCU): Lecture 8