The Basics of Matrix Multiplication
Whether you're processing RNA-seq count matrices, analyzing ChIP-seq peak intensities, or performing dimensionality reduction through PCA, matrices are the mathematical foundation of modern bioinformatics. From simple genomic data transformations to the layers of a deep learning model, most computational steps can be expressed as matrix operations.
Understanding how matrices work, and how to multiply them efficiently, is essential for anyone building or interpreting machine learning and deep learning pipelines in genomics and computational biology. In this post, we will explore matrix multiplication from first principles, then perform it in R, implement it in plain Python, and finally use the optimized NumPy library for practical bioinformatics applications.
Matrix Multiplication
To multiply two matrices, take row i of the first matrix and column j of the second, multiply them element by element, and sum the products. That sum is the entry in row i and column j of the result. The first matrix must therefore have as many columns as the second matrix has rows, and the result has the number of rows of the first matrix and the number of columns of the second.
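In symbols, the rule can be written compactly. The names A, B, C and the dimensions m, n, p are notation introduced here for illustration (m, n and p match the variable names used in the Python function later in the post):

```latex
C = AB, \qquad
A \in \mathbb{R}^{m \times n},\;
B \in \mathbb{R}^{n \times p},\;
C \in \mathbb{R}^{m \times p},
\qquad
c_{ij} = \sum_{k=1}^{n} a_{ik}\, b_{kj}
```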
And with an example:
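One possible worked case, using the same 2×3 and 3×2 matrices that appear in the R section of this post, so the result can be checked against the R output:

```latex
\begin{pmatrix} 4 & 2 & 3 \\ 6 & 5 & 9 \end{pmatrix}
\begin{pmatrix} 11 & 7 \\ 24 & 5 \\ 8 & 1 \end{pmatrix}
=
\begin{pmatrix}
4\cdot 11 + 2\cdot 24 + 3\cdot 8 & 4\cdot 7 + 2\cdot 5 + 3\cdot 1 \\
6\cdot 11 + 5\cdot 24 + 9\cdot 8 & 6\cdot 7 + 5\cdot 5 + 9\cdot 1
\end{pmatrix}
=
\begin{pmatrix} 116 & 41 \\ 258 & 76 \end{pmatrix}
```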
Therefore, we can represent a system of equations as a matrix of coefficients multiplied by a vector of variables:
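As an illustrative sketch (the coefficients a, unknowns x and right-hand sides y are generic symbols, not taken from the original post), a system of two equations in three unknowns corresponds to a 2×3 coefficient matrix multiplied by a vector of three variables:

```latex
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= y_1 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= y_2
\end{aligned}
\quad\Longleftrightarrow\quad
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
```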
Multiply Matrices in R
In R, matrix multiplication is performed using the %*% operator, which leverages optimized linear algebra libraries (BLAS/LAPACK) compiled in Fortran or C. This makes it significantly faster than implementing the algorithm in plain R.
A matrix can be created using the matrix() function:
mat_1 <- matrix(c(4, 2, 3,
                  6, 5, 9), nrow = 2, ncol = 3, byrow = TRUE)

mat_1

     [,1] [,2] [,3]
[1,]    4    2    3
[2,]    6    5    9

mat_2 <- matrix(c(11, 7,
                  24, 5,
                   8, 1), nrow = 3, ncol = 2, byrow = TRUE)

mat_2

     [,1] [,2]
[1,]   11    7
[2,]   24    5
[3,]    8    1
Now we can multiply the matrices using the %*% operator:
resultado <- mat_1 %*% mat_2
resultado

     [,1] [,2]
[1,]  116   41
[2,]  258   76
Multiply Matrices in Plain Python
Python does not include a linear algebra module in its standard library. One option is to implement matrix multiplication ourselves, as follows.
A matrix can be defined as a list of rows (lists):
mat_1 = [[4, 2, 3],
         [6, 5, 9]]

print(mat_1)

[[4, 2, 3], [6, 5, 9]]

mat_2 = [[11, 7],
         [24, 5],
         [8, 1]]
Now let's define a function to multiply matrices:
def mat_mult(m1, m2):
    # Get number of columns from first matrix and rows from second matrix
    cols = len(m1[0])
    rows = len(m2)

    # Check if dimensions are compatible
    if rows != cols:
        raise ValueError('The number of columns of the first matrix must be '
                         'equal to the number of rows of the second matrix')

    # Define matrix dimensions: the result is m x p, n is the shared dimension
    m = len(m1)
    n = rows
    p = len(m2[0])

    # Perform the multiplication
    res = []
    for i in range(m):
        # Calculate each row
        new_row = []
        for j in range(p):
            # Calculate each element
            new_value = 0
            for k in range(n):
                # Multiply elements and sum
                new_value += m1[i][k] * m2[k][j]
            # Add new element to the row
            new_row.append(new_value)
        # Add row to final matrix
        res.append(new_row)
    return res
mat_1 = [[3, 45, 67, 89],
         [2, 0, 45, 62],
         [1, 34, 4, 6]]

mat_2 = [[30, 5, 2],
         [-9, 2, 4],
         [14, 8, -46],
         [-4, 0, 2]]

mat_mult(mat_1, mat_2)

[[267, 641, -2718], [442, 370, -1942], [-244, 105, -34]]
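The same row-by-column rule can also be written more compactly in plain Python. The sketch below is an alternative to the triple loop, not a replacement for it; the name mat_mult_zip is hypothetical and introduced only for this example. It relies on zip(*m2) to iterate over the columns of the second matrix.

```python
def mat_mult_zip(m1, m2):
    """Multiply two matrices stored as lists of rows (hypothetical helper)."""
    # Columns of the first matrix must match rows of the second
    if len(m1[0]) != len(m2):
        raise ValueError("columns of m1 must equal rows of m2")
    # zip(*m2) transposes m2, so each item is one column of m2
    cols = list(zip(*m2))
    # Each output entry is the sum of element-wise products of a row and a column
    return [[sum(a * b for a, b in zip(row, col)) for col in cols]
            for row in m1]


mat_1 = [[3, 45, 67, 89],
         [2, 0, 45, 62],
         [1, 34, 4, 6]]

mat_2 = [[30, 5, 2],
         [-9, 2, 4],
         [14, 8, -46],
         [-4, 0, 2]]

result = mat_mult_zip(mat_1, mat_2)
print(result)
```

This is shorter but still runs entirely in the Python interpreter, so it should not be expected to compete with a compiled linear algebra library on large matrices.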
Multiply Matrices Using NumPy
However, it is much easier and more efficient to use the NumPy package. Remember to install it using uv as described in blog post
import numpy as np

np_1 = np.array([[3, 45, 67, 89],
                 [2, 0, 45, 62],
                 [1, 34, 4, 6]])

np_1

array([[ 3, 45, 67, 89],
       [ 2,  0, 45, 62],
       [ 1, 34,  4,  6]])

np_2 = np.array([[30, 5, 2],
                 [-9, 2, 4],
                 [14, 8, -46],
                 [-4, 0, 2]])

np_1.dot(np_2)

array([[  267,   641, -2718],
       [  442,   370, -1942],
       [ -244,   105,   -34]])
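For 2-D arrays, NumPy offers two other spellings of the same product: the @ operator and np.matmul. The following sketch reuses the arrays above to check that all three agree, and shows what happens when the shapes are incompatible. The variable names via_dot, via_operator and via_matmul are chosen here for illustration.

```python
import numpy as np

np_1 = np.array([[3, 45, 67, 89],
                 [2, 0, 45, 62],
                 [1, 34, 4, 6]])

np_2 = np.array([[30, 5, 2],
                 [-9, 2, 4],
                 [14, 8, -46],
                 [-4, 0, 2]])

# Three equivalent ways to multiply 2-D arrays
via_dot = np_1.dot(np_2)
via_operator = np_1 @ np_2
via_matmul = np.matmul(np_1, np_2)

print(np.array_equal(via_dot, via_operator))
print(np.array_equal(via_dot, via_matmul))

# The order matters: (3x4) @ (4x3) is 3x3, while (4x3) @ (3x4) is 4x4
print(via_operator.shape)
print((np_2 @ np_1).shape)

# Incompatible inner dimensions raise a ValueError
try:
    np_1 @ np_1  # (3x4) @ (3x4): 4 columns vs 3 rows
except ValueError as err:
    print("shape mismatch:", err)
```

The @ operator tends to read closest to the mathematical notation, which is one reason to prefer it in analysis code.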
Take home messages
- Autoencoders can efficiently compress single-cell expression data even with this quick and non-optimal configuration
- Autoencoders are fast and scalable, making them well suited to the constant increase in dataset size in the single-cell field
- Learning how autoencoders work helps you understand the internal mechanisms of many deep-learning tools, enabling you to better evaluate and interpret their outputs on your own datasets