point

Remember me

## Statistics with Julia: Linear Algebra with LAPACK

Thu, 31 May 2012 05:38:38 GMT

Linear algebra lies at the heart of many modern statistical methods. As such, continuing on my short series on using the Julia language for basic statistical analysis, I want to give a short review of some basic matrix and vector procedures which we will use subsequently when constructing some simple optimization routines.

[Note: The relevant Julia manual section lists all the relevant functions and should be considered the primary source.]

Linear algebra provides a mechanism for efficiently solving systems of equations. A single equation may be expressed as a row vector, such as:

$5 = 9x + 13y + 5z$

As:

$\begin{bmatrix} 9 & 13 & 5 \end{bmatrix}$

Similarly, systems of equations can be combined into a matrix:

$\begin{bmatrix} 9 & 13 & 5 \\ 1 & 11 & 7 \\ 3 & 9 & 2 \\ 6 & 0 & 7 \end{bmatrix}$

Which can then be solved more efficiently using various different theorems.

I recommend Gilbert Strang's MIT course on Linear Algebra as a reference (as I did in an extremely short introduction to Linear Algebra with R).

Julia, like R, uses LAPACK (which makes use of BLAS) for all the related linear algebra functionality. LAPACK (Linear Algebra PACKage) is a software library for numerical linear algebra, written in Fortran 90. Dirk Eddelbuettel wrote an excellent paper (highly recommended) and a related R package (gcbd)benchmarking different BLAS implementations. Dirk shows four benchmarks in his paper: matrix crossproducts, SVD decomposition, QR decomposition, and LU decomposition. I give examples of the latter three of these below as these are core linear algebra algorithms.

Before starting, I wanted point out some recent news with the language:

## Vector/Matrix Creation

I showed how to construct an Array in the last post on basic Julia syntax.

A matrix in Julia is nothing more than a 2-D array. There are many ways of constructing these. Some examples:

# Construction of arrays A = randn(10) # an array of size 10 A = [1:10]

 # Construction of matrices M = [1 2 3; 4 5 6] # a 2x3 matrix M1 = randn(3, 3) # a random 3x3 matrix M2 = reshape(fill(1, 9), 3, 3) # a 3x3 matrix of all 1's 

# Some special matrices ID = eye(10) # The identity matrix

We can get various different parts of matrices with simple commands:

triu(M) # Upper triangle of matrix M tril(M) # Lower triangle of matrix M diag(M) # The diagonal vector from matrix M

And do basic matrix math as expected

M1 + M2 M1 - M2 M1 * M2 M1 \ M2 # Matrix division using a polyalgorithm (i.e. optimized for the type of matrix). inv(M) # Inverse of matrix M, equivalent to 1/M

## Vector/Matrix Maths

There are several important matrix decomposition (or factorization) methods which are worth exploring further and are available through LAPACK. These all seek to break a matrix into other important canonical forms.

### LU Decomposition

The first method is LU decomposition, which was introduced by Alan Turing in 1948 in his paper "Rounding-off errors in matrix processes".

LU decomposition is a key step in several fundamental numerical algorithms, including as one method for solving a system of linear equations (as we shall see later), inverting a matrix (an extremely expensive operation), or computing the determinant of a matrix. This can be expressed in matrix notation as:

$A = L U$

In Julia, we can compute the simple example on wikipedia:

$A = \begin{bmatrix} 4 & 3 \\ 6 & 3 \\ \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0.67 & 1 \\ \end{bmatrix} \begin{bmatrix} 6 & 3 \\ 0 & 1 \\ \end{bmatrix}.$

With the lu function.
A = [4 3; 6 3] A = [1 -2 3; 2 -5 12; 0 2 -10] (L, U, p) = lu(A) L U

You will see that this matches the same output from R using the lu() function from Matrix:

library(Matrix) M = matrix(c(4, 6, 3, 3), nrow=2) expand(lu(M))

### QR Decomposition

QR decomposition breaks a matrix into an orthogonal matrix and an upper triangular matrix can also be used to solve linear least squares. This is usually represented in matrix notation as:

$A = Q R$

Wikipedia shows several methods for solving for this simple matrix:

$A = \begin{bmatrix} 12 & -51 & 4 \\ 6 & 167 & -68 \\ -4 & 24 & -41 \end{bmatrix}$

Which we can solve in Julia using the qr() function:

A = [12 -51 4; 6 167 -68; -4 24 -41] (Q, R, p) = qr(A) Q R

Similarly, in R we can use the base function qr() (note that R uses LINPACK by default):

x = t(matrix(c(12, -51, 4, 6, 167, -68, -4, 24, -41), nrow=3)) qr.R(qr(x, LAPACK=TRUE)) qr.Q(qr(x, LAPACK=TRUE))

### Singular Value Decomposition (SVD)

One of the most common routines is a singular value decomposition; this is also used in fitting a least squares model.

$M = U\Sigma V^*$

where U is a unitary matrix, the matrix Σ is a diagonal matrix with nonnegative real numbers on the diagonal, and the unitary matrix V* denotes the conjugate transpose of V.

Once again, we can solve for the example given on the wikipedia article as:

$M = \begin{bmatrix} 1 & 0 & 0 & 0 & 2\\ 0 & 0 & 3 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 4 & 0 & 0 & 0\end{bmatrix}$

Which solves into:

$U = \begin{bmatrix} 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1\\ 1 & 0 & 0 & 0\end{bmatrix} ,\; \Sigma = \begin{bmatrix} 4 & 0 & 0 & 0 & 0\\ 0 & 3 & 0 & 0 & 0\\ 0 & 0 & \sqrt{5} & 0 & 0\\ 0 & 0 & 0 & 0 & 0\end{bmatrix} ,\; V^* = \begin{bmatrix} 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ \sqrt{0.2} & 0 & 0 & 0 & \sqrt{0.8}\\ 0 & 0 & 0 & 1 & 0\\ \sqrt{0.8} & 0 & 0 & 0 & -\sqrt{0.2}\end{bmatrix}.$

This can be solved using the svd() function in Julia:

A = [1 0 0 0 2; 0 0 3 0 0; 0 0 0 0 0; 0 4 0 0 0] (U, S, V) = svd(A) U S V

### Cholesky Decomposition

Lastly, we will need to use Cholesky factorization (or Choleski, depending on your background). This is a special case of LU factorization (symmetric), but it requires half the memory and half the number operations of an LU decomposition. This can be expressed in matrix notation as:

$A = L L*$

And can be solve in Julia with the chol() function:

A = [5 1.2 0.3 -0.6; 1.2 6 -0.4 0.9; 0.3 -0.4 8 1.7; -0.6 0.9 1.7 10]; R = chol(A) R

It should be noted that I am showing these functions because they are important to solving statistical methods. But these functions are not relevant in so far as we are interested in Julia for the performance of the language, since they rely on LAPACK rather than native methods.

Now that we have briefly covered the notation for several matrix decompositions, we are ready to look at a least squares problem in the next post.