3. Linear Algebra

By now, we can load datasets into Arrays and manipulate these Arrays with basic mathematical operations. To start building sophisticated models, we will also need a few tools from linear algebra. This section offers a gentle introduction to the most essential concepts, starting from scalar arithmetic and ramping up to matrix multiplication.

3.1. Scalars

Scalars are implemented as numbers. Below, we assign two scalars and perform the familiar addition, multiplication, division, and exponentiation operations.

x = 3.0
y = 2.0

x + y, x * y, x / y, x^y
(5.0, 6.0, 1.5, 9.0)

3.2. Vectors

For our purposes, you can think of vectors as fixed-length arrays of scalars.

x = collect(1:3)
3-element Vector{Int64}:
 1
 2
 3

Recall that we access a vector’s elements via indexing.

x[3]
3

The number of elements in a vector is called its dimensionality. In code, this corresponds to the vector’s length, accessible via Julia’s built-in length function.

length(x)
3

We can also access the length via the size function. The size is a tuple that indicates an array’s length along each axis. Arrays with just one axis have sizes with just one element.

size(x)
(3,)
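
As a small sketch of the indexing and shape queries above (the name v is illustrative and does not appear in the original example), note that Julia indexes from 1 and that ndims reports the number of axes:

```julia
# Illustrative vector holding the same values as x above.
v = collect(1:3)

first_elem = v[1]       # Julia indexing is 1-based
last_elem = v[end]      # `end` refers to the last valid index
n_axes = ndims(v)       # number of axes: 1 for a vector
len_axis1 = size(v, 1)  # length along axis 1, the same as length(v)

(first_elem, last_elem, n_axes, len_axis1)
```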

3.3. Matrices

We can construct any appropriately sized m x n matrix by passing the desired shape to reshape:

A = reshape((1:6),3,2)
3×2 reshape(::UnitRange{Int64}, 3, 2) with eltype Int64:
 1  4
 2  5
 3  6

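In code, we can access any matrix’s transpose with the postfix ' operator. Strictly speaking, ' computes the adjoint (conjugate transpose), which coincides with the transpose for real-valued matrices such as A: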

A'
2×3 adjoint(reshape(::UnitRange{Int64}, 3, 2)) with eltype Int64:
 1  2  3
 4  5  6
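
The following sketch compares ' with transpose on two illustrative matrices, M and Z, which are assumptions made for this example and not part of the text. For a real matrix the two agree, for a complex matrix they differ by a conjugation. Both return lazy wrappers around the original data, whereas permutedims returns a plain Matrix.

```julia
# Illustrative matrices (not from the text): one real, one complex.
M = [1 2; 3 4]
Z = [complex(1, 2) complex(3, 0); complex(4, 0) complex(5, -1)]

# For real entries, the adjoint and the transpose hold the same values.
real_same = (M' == transpose(M))

# For complex entries, the adjoint additionally conjugates each element.
complex_differs = (Z' != transpose(Z))
conj_matches = (Z' == conj.(transpose(Z)))

# permutedims materializes the transposed data as an ordinary Matrix.
materialized = permutedims(M)

(real_same, complex_differs, conj_matches, materialized)
```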

A symmetric matrix is a square matrix that equals its own transpose. The following matrix is symmetric:

A = [1 2 3;2 0 4;3 4 5]
A == A'
true

Matrices are useful for representing datasets. Typically, rows correspond to individual records and columns correspond to distinct attributes.

3.4. Arrays

Multi-dimensional arrays are constructed analogously to vectors and matrices, by growing the number of shape components.

reshape(1:24,2,3,4)
2×3×4 reshape(::UnitRange{Int64}, 2, 3, 4) with eltype Int64:
[:, :, 1] =
 1  3  5
 2  4  6

[:, :, 2] =
 7   9  11
 8  10  12

[:, :, 3] =
 13  15  17
 14  16  18

[:, :, 4] =
 19  21  23
 20  22  24

3.5. Basic Properties of Array Arithmetic

Scalars, vectors, matrices, and multi-dimensional arrays all have some handy properties. For example, elementwise operations produce outputs that have the same shape as their operands.

A = reshape(Float32[1:6...],2,3)
B = copy(A)
display(A),display(A + B);
2×3 Matrix{Float32}:
 1.0  3.0  5.0
 2.0  4.0  6.0
2×3 Matrix{Float32}:
 2.0  6.0  10.0
 4.0  8.0  12.0

The elementwise product of two matrices is called their Hadamard product:

A .* B
2×3 Matrix{Float32}:
 1.0   9.0  25.0
 4.0  16.0  36.0
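
The same dot syntax applies to functions and to other operators. The sketch below uses an illustrative matrix P (an assumption for this example, with the same values as A) to show that each dotted operation acts elementwise and preserves the shape of its operand:

```julia
# Illustrative matrix with the same values as A above.
P = Float32[1 3 5; 2 4 6]

squared = P .^ 2   # elementwise square, same values as P .* P
roots = sqrt.(P)   # a dot after a function name applies it to every element
shifted = P .- 1   # a scalar operand is combined with every element

# All results keep the 2×3 shape of P.
shapes_match = size(squared) == size(roots) == size(shifted) == size(P)
```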

Adding or multiplying a scalar and an array produces a result with the same shape as the original array. Here, each element of the array is added to (or multiplied by) the scalar.

a = 2
X = reshape(1:24,2,3,4)
display(a .+ X)
size(a .* X)
2×3×4 Array{Int64, 3}:
[:, :, 1] =
 3  5  7
 4  6  8

[:, :, 2] =
  9  11  13
 10  12  14

[:, :, 3] =
 15  17  19
 16  18  20

[:, :, 4] =
 21  23  25
 22  24  26
(2, 3, 4)

3.6. Reduction

Often, we wish to calculate the sum of an array’s elements.

x = Float32[1:3...]
x, sum(x)
(Float32[1.0, 2.0, 3.0], 6.0f0)

To express sums over the elements of arrays of arbitrary shape, we simply sum over all of their dimensions.

size(A),sum(A)
((2, 3), 21.0f0)

By default, calling sum without dims reduces an array along all of its dimensions, producing a scalar. Julia also allows us to specify the dimensions along which the array should be reduced. To sum over the elements of each row (that is, along dimension 2), we specify dims=2 in sum. The reduced dimension is not removed; its size becomes 1, so the 2 x 3 input yields a 2 x 1 matrix. We can use dropdims to drop such singleton dimensions.

size(A), size(sum(A,dims=2)),size(dropdims(sum(A,dims=2),dims=2))
((2, 3), (2, 1), (2,))

Specifying dims=1 in the sum function sums over the elements of each column and generates a 1 x 3 matrix.

size(A), size(sum(A,dims=1)),size(dropdims(sum(A,dims=1),dims=1))
((2, 3), (1, 3), (3,))
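
There are several ways to turn a reduced matrix into a plain vector. The sketch below, which uses an illustrative matrix R (an assumption for this example, with the same values as A), shows three that give the same row sums; which one reads best is a matter of taste.

```julia
# Illustrative matrix with the same values as A above.
R = Float32[1 3 5; 2 4 6]

kept = sum(R, dims=2)                      # 2×1 Matrix: the reduced axis is kept
dropped = dropdims(kept, dims=2)           # 2-element Vector: singleton axis removed
flattened = vec(kept)                      # 2-element Vector: reshaped to one axis
per_row = [sum(r) for r in eachrow(R)]     # 2-element Vector: sum each row separately

(size(kept), size(dropped), dropped == flattened == per_row)
```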

Summing a matrix along both rows and columns is equivalent to summing up all the elements of the matrix, although the result is then a matrix of shape (1, 1) holding the single value 21 rather than a scalar.

first(sum(A,dims=(1,2))) == sum(A)
true

A related quantity is the mean, also called the average. We calculate the mean by dividing the sum by the total number of elements. The mean function is part of the Statistics package.

using Statistics
mean(A),sum(A)/length(A)
(3.5f0, 3.5f0)

Likewise, the function for calculating the mean can reduce an array along specific dimensions. With dims=1, it produces a 1 x 3 matrix.

mean(A,dims=1),sum(A,dims=1)/size(A,1)
(Float32[1.5 3.5 5.5], Float32[1.5 3.5 5.5])

3.7. Non-Reduction Sum

Sometimes it can be useful to keep the number of axes unchanged when invoking the function for calculating the sum or mean. This matters when we want to use the broadcast mechanism. In Julia, sum and mean behave this way whenever dims is specified: the reduced dimension is kept with size 1.

sum_A = sum(A,dims=2)
display(sum_A)
size(sum_A)
2×1 Matrix{Float32}:
  9.0
 12.0
(2, 1)

For instance, since sum_A keeps its two dimensions after summing each row, we can divide A by sum_A with broadcasting to create a matrix where each row sums up to 1.

A ./ sum_A
2×3 Matrix{Float32}:
 0.111111  0.333333  0.555556
 0.166667  0.333333  0.5
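
As a quick check of this claim, the sketch below normalizes an illustrative matrix N (an assumption for this example, with the same values as A) by rows and by columns, then verifies that the corresponding sums are all approximately 1. Approximate comparison is used because the divisions are carried out in Float32.

```julia
# Illustrative matrix with the same values as A above.
N = Float32[1 3 5; 2 4 6]

# (2, 3) ./ (2, 1): each row is divided by its own row sum.
row_normalized = N ./ sum(N, dims=2)

# (2, 3) ./ (1, 3): each column is divided by its own column sum.
col_normalized = N ./ sum(N, dims=1)

rows_ok = all(sum(row_normalized, dims=2) .≈ 1)
cols_ok = all(sum(col_normalized, dims=1) .≈ 1)

(rows_ok, cols_ok)
```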

If we want to calculate the cumulative sum of elements of A along some dimension, say dims=1 (accumulating down the rows of each column), we can call the cumsum function. By design, this function does not reduce the input array along any dimension.

cumsum(A,dims=1)
2×3 Matrix{Float32}:
 1.0  3.0   5.0
 3.0  7.0  11.0

3.8. Dot Products

One of the most fundamental operations is the dot product. We can use the ⋅ operator from the LinearAlgebra package (where ⋅ can be typed by tab-completing \cdot in the REPL).

using LinearAlgebra
y = ones(Float32,3)
x, y, x⋅y
(Float32[1.0, 2.0, 3.0], Float32[1.0, 1.0, 1.0], 6.0f0)

Equivalently, we can calculate the dot product of two vectors by performing an elementwise multiplication followed by a sum:

sum(x.*y)
6.0f0
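
To make the definition concrete, here is a minimal sketch of the dot product written as an explicit loop. The helper manual_dot and the vectors vals and w are hypothetical names introduced for this example; in practice, the ⋅ operator or the dot function from LinearAlgebra is the usual choice. When the weights are nonnegative and sum to 1, as assumed here, the dot product is a weighted average.

```julia
using LinearAlgebra

# Hypothetical helper (not a library function): dot product via an explicit loop.
function manual_dot(a::AbstractVector, b::AbstractVector)
    length(a) == length(b) || throw(DimensionMismatch("vectors must have equal length"))
    total = zero(promote_type(eltype(a), eltype(b)))
    for i in eachindex(a, b)
        total += a[i] * b[i]  # accumulate the product of corresponding elements
    end
    return total
end

vals = Float32[1, 2, 3]
w = Float32[0.2, 0.3, 0.5]   # illustrative weights: nonnegative, summing to 1

weighted_avg = manual_dot(vals, w)
agrees_with_library = weighted_avg ≈ vals ⋅ w
```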

3.9. Matrix-Vector Products

To express a matrix-vector product in code, we use the * operator. Note that the column dimension of A (its length along dimension 2) must be the same as the dimension of x (its length).

size(A),size(x),A*x
((2, 3), (3,), Float32[22.0, 28.0])
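
One way to read this result is that each entry of the product is the dot product of one row of the matrix with the vector. The sketch below checks this on illustrative values M and v (assumptions for this example, matching A and x above) and also shows that a vector of the wrong length raises a DimensionMismatch.

```julia
using LinearAlgebra

# Illustrative values matching A and x above.
M = Float32[1 3 5; 2 4 6]
v = Float32[1, 2, 3]

# Entry i of M * v is the dot product of row i of M with v.
by_rows = [dot(M[i, :], v) for i in 1:size(M, 1)]
matches = (by_rows == M * v)

# A vector whose length differs from the number of columns is rejected.
mismatch_caught = try
    M * Float32[1, 2]
    false
catch err
    err isa DimensionMismatch
end

(by_rows, matches, mismatch_caught)
```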

3.10. Matrix-Matrix Multiplication

In the following snippet, we perform matrix multiplication on A and B. Here, A is a matrix with 2 rows and 3 columns, and B is a matrix with 3 rows and 4 columns. After multiplication, we obtain a matrix with 2 rows and 4 columns.

B = ones(3,4)
A*B
2×4 Matrix{Float64}:
  9.0   9.0   9.0   9.0
 12.0  12.0  12.0  12.0

The term matrix-matrix multiplication is often simplified to matrix multiplication, and should not be confused with the Hadamard product.
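
The difference can be sketched as follows, using illustrative matrices P and Q (assumptions for this example, with the same shapes as A and B above). Entry (i, j) of the matrix product is the dot product of row i of P with column j of Q, whereas the Hadamard product requires operands whose shapes can be broadcast together, which 2×3 and 3×4 cannot.

```julia
using LinearAlgebra

# Illustrative matrices with the same shapes as A and B above.
P = Float32[1 3 5; 2 4 6]   # 2×3
Q = ones(Float32, 3, 4)     # 3×4

# Build the 2×4 product entry by entry from row-column dot products.
C = [dot(P[i, :], Q[:, j]) for i in 1:size(P, 1), j in 1:size(Q, 2)]
matches = (C == P * Q)

# The elementwise product of a 2×3 and a 3×4 matrix is not defined.
hadamard_fails = try
    P .* Q
    false
catch err
    err isa DimensionMismatch
end

(size(C), matches, hadamard_fails)
```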

3.11. Norms

The norm function from the LinearAlgebra package calculates the \( \ell_2 \) norm by default.

u = [3.0, -4.0]
norm(u)
5.0

To compute the \( \ell_1 \) norm, we pass 1 as the second argument to the norm function.

norm(u,1)
7.0

Calling norm on a matrix calculates its Frobenius norm, because norm treats the matrix as a vector of its entries.

norm(ones(4,9))
6.0
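
These values can be reproduced directly from the definitions, as in the sketch below. The names ending in _manual and the matrix E are illustrative assumptions for this example. The last lines contrast norm with opnorm, which computes the operator (spectral) norm of a matrix and generally gives a different value.

```julia
using LinearAlgebra

u = [3.0, -4.0]
l2_manual = sqrt(sum(abs2, u))    # square root of the sum of squares
l1_manual = sum(abs, u)           # sum of absolute values

F = ones(4, 9)
frob_manual = sqrt(sum(abs2, F))  # Frobenius norm: l2 norm of all 36 entries

# Illustrative 2×2 identity matrix: the two matrix norms differ here.
E = [1.0 0.0; 0.0 1.0]
frobenius_E = norm(E)             # sqrt(2), from the two nonzero entries
spectral_E = opnorm(E)            # largest singular value, approximately 1

(l2_manual, l1_manual, frob_manual)
```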