Linear Algebra for AI – ALL You Need To Know

Monith Ishanka
25 09 2023

Linear algebra is fundamental mathematical knowledge that underpins many other mathematical and AI concepts. In this article, I will give you a solid understanding of linear algebra and its mathematical operations, with examples.

What is Linear Algebra?

Linear algebra is a branch of mathematics that deals with vectors, vector spaces, linear transformations, and systems of linear equations. In linear algebra, the fundamental objects are vectors and matrices.

Vectors are quantities that have both magnitude and direction. They are represented as arrays of numbers arranged in columns. Matrices, on the other hand, are rectangular arrays of numbers organized into rows and columns. Matrices are used to represent linear transformations and relationships between different sets of data.

Key concepts in Linear Algebra:

Vector Spaces
Linear Transformations
Systems of Linear Equations
Matrices and Matrix Operations
Eigenvalues and Eigenvectors

Linear algebra provides the foundation for other branches of mathematics and for many advanced topics, and it has broad applications across fields such as computer science, physics, engineering, economics, and data science. In computer science and data science, linear algebra is fundamental to machine learning, data analysis, computer graphics, and optimization algorithms.

Why does AI need Linear Algebra?

In AI, linear algebra helps us represent and manipulate large amounts of data efficiently. It is also the key tool behind feature extraction, optimization, and dimensionality reduction.

Data Representation: When we work in AI, we have to deal with vast amounts of data. We can use vectors and matrices to store that data, which makes storage, manipulation, and computation efficient.
Optimization: Optimization algorithms are used to solve systems of equations and minimize objective functions. Algorithms such as gradient descent and matrix factorization, which improve an AI model's performance and accuracy by adjusting its parameters, rely heavily on linear algebra techniques.
Dimensionality Reduction: AI algorithms often struggle with high-dimensional data due to the curse of dimensionality. We therefore use linear algebra techniques such as singular value decomposition (SVD) and eigenvalue decomposition to reduce the number of dimensions while preserving important information, which improves computational efficiency and model performance.
Model Representation: AI models such as neural networks apply mathematical operations to the input data, using neurons to capture patterns and relationships. Because the neurons' weights and biases are stored in vectors and matrices, matrix calculations are involved throughout this process.

How do we use Linear Algebra in AI?

We rarely have to do these calculations by hand in AI and data science; computers do them for us. We just need to know what they are doing. That only requires basic knowledge of linear algebra: how to represent scalars, vectors, and matrices in a computer, and how to perform mathematical operations on them in code.

Let's look at the main objects of linear algebra, basic arithmetic operations, and other mathematical operations we need.

Tensors

A tensor is the fundamental tool in linear algebra that we use to encode and store our data. Generally, a tensor is like a container of numbers that can have various arrangements. Scalar, vector, and matrix are the names that we use to describe specific arrangements. In other words, scalars, vectors, and matrices are distinct categories of tensors.

For better understanding, let's look at each tensor category and its qualities.

Throughout this article, I use Python together with PyTorch for programming. You can follow our installation guide to set up your PC properly!

Scalar

A scalar is a tensor with just one element/a single number.

Here, I assign two scalars in PyTorch and perform the addition, multiplication, division, and exponentiation (arithmetic) operations on them.

torch.tensor() – This is used to create tensors (scalars, vectors, matrices). Given a single value, it creates a scalar.
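
A minimal sketch of these scalar operations follows. The values 3.0 and 2.0 are arbitrary choices for illustration.

```python
import torch

# Two scalars (tensors with zero axes); the values are assumed for illustration.
x = torch.tensor(3.0)
y = torch.tensor(2.0)

print(x + y)   # addition        -> tensor(5.)
print(x * y)   # multiplication  -> tensor(6.)
print(x / y)   # division        -> tensor(1.5000)
print(x ** y)  # exponentiation  -> tensor(9.)
```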

Vector

A vector is a tensor that holds a fixed-length array of numbers. Vectors have only one axis and are conventionally written as columns.

In data science, each element of a vector typically represents a feature in the dataset. For example, if you are training a model to predict heart attack risk, each vector might represent a patient, and its components might correspond to their most recent vital signs, cholesterol levels, minutes of exercise per day, etc.

As in most programming languages, Python's indices start at 0. Here, torch.arange(3) creates a tensor (vector) with values that start at 0 and stop before 3: [0, 1, 2].
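
As a short sketch, creating this vector and indexing into it might look like this:

```python
import torch

# arange(3) starts at 0 and stops before 3.
x = torch.arange(3)

print(x)     # tensor([0, 1, 2])
print(x[0])  # first element, index 0 -> tensor(0)
print(x[2])  # last element, index 2  -> tensor(2)
```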

The number of elements in a vector is the dimensionality of that vector. You can get the number of elements in a vector with Python's built-in len() function.
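
For instance, with the same three-element vector as before:

```python
import torch

x = torch.arange(3)

# len() returns the size of the first (here: only) axis.
print(len(x))  # 3
```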

A tensor's shape attribute (an attribute, not a function call) gives a tuple-like torch.Size object that lists the number of elements along each axis; its length is the number of axes. As vectors have only one axis, the shape contains a single number: the number of elements in that axis.
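
A small sketch of reading the shape of the same vector:

```python
import torch

x = torch.arange(3)

print(x.shape)       # torch.Size([3])
print(len(x.shape))  # number of axes -> 1
print(x.ndim)        # same information -> 1
```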

Though vectors created by torch.arange() are printed as rows, we treat them as columns.

Matrix

Matrices are tensors characterized by having two axes.

$A \in \mathbb{R}^{m \times n}$ indicates that a matrix $A$ contains $m \times n$ real-valued scalars, arranged as $m$ rows and $n$ columns. When $m = n$ we say that a matrix is square. Visually, we can illustrate any matrix as a table.

Here, I convert a vector x with six elements into a matrix with 3 rows and 2 columns, and into another matrix with 2 rows and 3 columns.

.reshape(number of rows, number of columns) – Changes a tensor's shape without changing its elements. The new shape must hold the same number of elements as the original.
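
A minimal sketch of both reshapes, assuming x is the six-element vector torch.arange(6) (a 3×2 or 2×3 matrix needs exactly six elements):

```python
import torch

# Assumed input: six elements, so that 3 * 2 = 2 * 3 = 6.
x = torch.arange(6)

A = x.reshape(3, 2)  # 3 rows, 2 columns
B = x.reshape(2, 3)  # 2 rows, 3 columns

print(A)
# tensor([[0, 1],
#         [2, 3],
#         [4, 5]])
print(B)
# tensor([[0, 1, 2],
#         [3, 4, 5]])
```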

Higher-Rank Tensors

Tensors with more than two axes, that is, tensors beyond scalars, vectors, and matrices, are called higher-rank tensors.

For example, we can create a tensor with 3 axes like this:

.reshape(num of blocks, num of rows, num of columns)
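
As a sketch, the sizes below (2 blocks, 3 rows, 4 columns) are assumed for illustration only:

```python
import torch

# 24 values arranged as 2 blocks, each with 3 rows and 4 columns.
X = torch.arange(24).reshape(2, 3, 4)

print(X.shape)     # torch.Size([2, 3, 4])
print(X.ndim)      # number of axes -> 3
print(X[1, 2, 3])  # last block, last row, last column -> tensor(23)
```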

We can also create a tensor with exactly the values that we want by passing nested lists to torch.tensor().
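
For example, a matrix can be built from a nested Python list. The values here are arbitrary.

```python
import torch

# Each inner list becomes one row of the matrix.
A = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])

print(A.shape)  # torch.Size([2, 3])
print(A[1, 0])  # second row, first column -> tensor(4)
```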

Tensor Operations

We can apply various tensor manipulation and transformation methods to a tensor and these operations are very important in data science and AI. They include:

Matrix’s Transpose

Here we flip the axes of a matrix, turning rows into columns and vice versa. We denote the transpose of X by $X^{T}$.

In PyTorch, we use .T to get the transpose of a tensor.
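
A short sketch with an assumed 3×2 example matrix:

```python
import torch

A = torch.arange(6).reshape(3, 2)  # 3 rows, 2 columns

print(A.T)
# tensor([[0, 2, 4],
#         [1, 3, 5]])
print(A.T.shape)  # torch.Size([2, 3])
```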

Hadamard Product(⊙)

This gives us the elementwise product of two tensors. To perform this operation, both tensors should have the same shape. In PyTorch, we use * to calculate the elementwise product of two tensors.
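
A minimal sketch with two assumed 2×3 matrices:

```python
import torch

A = torch.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
B = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])

# Each element of A is multiplied by the element of B at the same position.
print(A * B)
# tensor([[ 0,  2,  6],
#         [12, 20, 30]])
```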

Dot Product

The dot product is the sum of the products of the elements at the same positions in two vectors. If X and Y are vectors ($X, Y \in \mathbb{R}^{d}$), their dot product is written $X^{T} Y$:

$X^{T} Y = \sum_{i = 1}^{d} x_{i} y_{i}$

We use the torch.dot() function in PyTorch to get the dot product of two vectors. Remember that this function only works for vectors (tensors with one axis), and both vectors must have the same number of elements and the same data type (int, float, ...). PyTorch provides other functions for products between other tensor types.

torch.ones() – Creates a tensor that only contains ones.
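
A small sketch of the dot product, using assumed example vectors. Both are created as float32 so that their data types match.

```python
import torch

x = torch.arange(3, dtype=torch.float32)  # [0., 1., 2.]
y = torch.ones(3)                         # [1., 1., 1.], float32 by default

print(torch.dot(x, y))   # 0*1 + 1*1 + 2*1 -> tensor(3.)
print(torch.sum(x * y))  # same result via elementwise product and sum -> tensor(3.)
```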

Matrix-Vector Multiplication

Here we calculate the product of an $m \times n$ matrix X and an $n$-dimensional vector Y. For that, the number of columns ($n$) in the matrix must equal the number of elements ($n$) in the vector. The result is a vector with $m$ elements, each of which is the dot product of one row of X with Y. In PyTorch, we use torch.mv(X, Y) or X @ Y to calculate the matrix-vector product.
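
A minimal sketch, assuming a 2×3 matrix and a 3-element vector chosen for illustration:

```python
import torch

X = torch.arange(6, dtype=torch.float32).reshape(2, 3)  # [[0., 1., 2.], [3., 4., 5.]]
y = torch.arange(3, dtype=torch.float32)                # [0., 1., 2.]

# Row 0: 0*0 + 1*1 + 2*2 = 5; row 1: 3*0 + 4*1 + 5*2 = 14.
print(torch.mv(X, y))  # tensor([ 5., 14.])
print(X @ y)           # tensor([ 5., 14.])
```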

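Though we write vectors as rows in code, mathematically we treat them as columns!

For example, the vector [0, 1, 2] stands for the column vector $y = \begin{bmatrix} 0 & 1 & 2 \end{bmatrix}^{T}$.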

Matrix-Matrix Multiplication

Here we calculate the product of matrix $A \in \mathbb{R}^{n \times k}$ and matrix $B \in \mathbb{R}^{k \times m}$. For that, the number of columns ($k$) in matrix A must equal the number of rows ($k$) in matrix B; otherwise, PyTorch raises an error. The result is a matrix with $n$ rows and $m$ columns. In PyTorch, we use torch.mm(A, B) or A @ B to perform matrix multiplication.
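
As a sketch, the shapes below (2×3 times 3×4) are assumed for illustration:

```python
import torch

A = torch.arange(6, dtype=torch.float32).reshape(2, 3)  # n=2 rows, k=3 columns
B = torch.ones(3, 4)                                    # k=3 rows, m=4 columns

C = torch.mm(A, B)  # equivalently: A @ B
print(C.shape)      # torch.Size([2, 4])
print(C)
# tensor([[ 3.,  3.,  3.,  3.],
#         [12., 12., 12., 12.]])
```

Because B contains only ones, every entry in a row of C is simply the sum of the corresponding row of A.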

Dimensionality reduction

In a tensor, the number of elements along each axis is the dimensionality of that axis. In machine learning and data science, those elements are often the features in the dataset that we use to train our AI model. When a dataset has a lot of features (dimensions), the model becomes prone to overfitting and requires a lot of computational power, and sometimes we are unable to extract meaningful information from the dataset at all.

So we have to reduce the dimensionality or number of elements in the dataset while preserving as much relevant information as possible. We call this process “Dimensionality Reduction”. Here are some Linear algebra methods we can use for dimensionality reduction.

sum()

We use .sum() to calculate the sum of a tensor's elements. Summing along an axis collapses that axis, so the result has fewer axes than the original tensor while still summarizing its values.

You can sum along a specific axis of the tensor by passing axis=. With axis=0, the sum runs down the rows and returns one total per column; with axis=1, it runs across the columns and returns one total per row.
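
A minimal sketch with an assumed 2×3 matrix. The code uses dim=, which is PyTorch's native name for this argument; axis= is accepted as an alias.

```python
import torch

A = torch.arange(6, dtype=torch.float32).reshape(2, 3)  # [[0., 1., 2.], [3., 4., 5.]]

print(A.sum())       # all elements    -> tensor(15.)
print(A.sum(dim=0))  # one per column  -> tensor([3., 5., 7.])
print(A.sum(dim=1))  # one per row     -> tensor([ 3., 12.])
```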

If you want to compute a sum or mean while keeping the number of axes unchanged, you can use the keepdim=True option. The reduced axis is then kept with length 1.
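
The following sketch compares the shapes with and without keepdim=True, using an assumed 2×3 matrix. Keeping the axis is convenient when the result should be combined with the original tensor, for example to divide each row by its own sum.

```python
import torch

A = torch.arange(6, dtype=torch.float32).reshape(2, 3)

print(A.sum(dim=1).shape)                # axis dropped -> torch.Size([2])
row_sums = A.sum(dim=1, keepdim=True)
print(row_sums.shape)                    # axis kept    -> torch.Size([2, 1])

# The (2, 1) result lines up with A's rows, so each row is divided by its own sum.
normalized = A / row_sums
print(normalized.sum(dim=1))             # each row now sums to (approximately) 1
```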

mean()

We use .mean() to calculate the mean of a tensor's elements. Like .sum(), it can collapse an axis and so reduce the number of axes. mean() only accepts tensors with floating-point values, to avoid information loss. Imagine that the mean of matrix A is 2.5 and A only contains int values: an int result would be 2 (the int of 2.5), so information would be lost. Therefore, calling mean() on a tensor of int values gives an error.

You can convert a tensor's elements to float values with the .float() method, or pass a dtype, as in mean(dtype=torch.float). You can also request floats when the tensor is created, as in torch.arange(6, dtype=torch.float32).

A.mean() = A.sum() / A.numel()

Here, the numel() method gives us the number of elements in a tensor.
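
A short sketch of these points, assuming a 2×3 example matrix:

```python
import torch

A = torch.arange(6, dtype=torch.float32).reshape(2, 3)  # floats from the start

print(A.mean())             # tensor(2.5000)
print(A.sum() / A.numel())  # same value, computed by hand -> tensor(2.5000)
print(A.mean(dim=0))        # one mean per column -> tensor([1.5000, 2.5000, 3.5000])

# An int tensor must be converted before taking the mean.
B = torch.arange(6).reshape(2, 3)
print(B.float().mean())     # tensor(2.5000)
```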

These are the essential linear algebra concepts you should know for data handling in AI and data science. Mathematical operations like the dot product, Hadamard product, matrix transpose, and matrix multiplication play a big role in data handling. You should keep practicing these linear algebra techniques to master them.

In this guide, we've covered essential linear algebra concepts that empower data science and machine learning, and we've explored the practical uses of linear algebra through examples and code. Understanding major linear algebra elements like vectors, matrices, and transformations gives insight into algorithms, neural networks, and data processing. As our data-driven world advances, mastering linear algebra becomes crucial for those entering the AI and data science field.
