How to Calculate Eigenvalue Decomposition in Python
Eigenvalue decomposition (EVD) or eigendecomposition for short is an important matrix factorization method, which is extensively used in various science and engineering applications.
In this post, we will go through the most popular Python packages for calculating the eigenvalue decomposition. These packages are
- NumPy: Anyone doing scientific computation and engineering with Python should be aware of NumPy. It has a specific module for linear algebra, namely,
numpy.linalg
. This also contains convenient means to calculate eigenvalue decompositionnumpy.linalg.eig
. - SciPy: SciPy is usually the first place to check for more advanced scientific computation topics. It also provides good support for a variety of linear algebra algorithms via
scipy.linalg
. For the eigenvalue decomposition, we can use the methodscipy.linalg.eig
.
In addition to the external package, we also have a look at calculating the eigenvalue decomposition without NumPy or SciPy. We will have a look at the power iteration algorithm, which provides a fairly low complexity algorithm for calculating the eigenvalues and matrices.
Table of contents
- A quick introduction to eigenvalue decomposition
- Applications of eigenvalue decomposition
- Calculating Eigenvalue Decomposition in Python
- Eigenvalue Decomposition using NumPy
- Eigenvalue Decomposition using SciPy
- Pure Python Implementation for Eigenvalue Decomposition
- Summary
A quick introduction to eigenvalue decomposition
The eigenvalue decomposition (EVD) is used to decompose or factorize a square matrix into the canonical form. By canonical form, we mean
\[
\mathbf{A} = \mathbf{U}\mathbf{D}\mathbf{U}^\text{-1},
\]
where
- \(\mathbf{A}\) is a real square matrix with dimensions \(n \times n\). We use real numbers only for the sake of simplicity. Extension to complex numbers is straightforward.
- \(\mathbf{U}\) is \(n \times n\) square matrix with the eigenvectors of \(\mathbf{A}\) as column vectors.
- \(\mathbf{D}\) is a \(n \times n\) diagonal matrix. The diagonal values in \(\mathbf{D}\) are the eigenvalues of \(\mathbf{A}\). These are non-negative numbers that are usually presented in decreasing order.
When we multiply matrix \(\mathbf{A}\) with \(i^{th}\) eigenvector \(\mathbf{u}_i\), we get \(\mathbf{A}\mathbf{u}_i = d_i\mathbf{u}_i\). Here, \(d_i\) is the \(i^{th}\) eigenvalue.
Applications of eigenvalue decomposition
Eigenvalue decomposition is extensively used in various fields of science and engineering. These use cases vary from theoretical work and analysis all the way to real-life algorithms.
- Matrix factorization: Decomposition is the obvious application of eigenvalues decomposition, where we rewrite a matrix as a product of matrices with very specific structures. This provides means to do more efficient matrix inversion. It's also helpful, for instance, in sparse matrix analysis.
- Principal Component Analysis (PCA): Eigenvalue decomposition is often used in the principal component analysis. PCA is a versatile algorithm for data visualization, dimensionality reduction, and compression.
- Cluster analysis: many types of clustering problems can be solved by using eigenvalue decomposition.
- System matrix analysis: eigenvalue decomposition is popular in control systems analysis and design. It is one of the key tools to extract information from the system matrices.
- Natural language processing (NLP): eigenvalue decomposition is often associated with NLP, where it is used as one of the tools to analyze text structures.
- Machine vision, image process & signal processing: eigenvalue decomposition-based feature extraction can be used for edge and corner detection from images and machine vision applications. Similar techniques can be generalized for other signal-processing applications as well.
Calculating Eigenvalue Decomposition in Python
Now that we have the quick introduction out of the way, we can dig into actually calculating the eigenvalue decomposition in Python. We will have a look at NumPy and SciPy libraries for "production" ready interfaces to EVD calculation. In addition, we will implement our own algorithm using the power iteration method. Just like we did with the singular value decomposition (SVD).
Before we go into the details of the calculation, we will create a common data set. This makes it easy to compare the outputs and verify that we are indeed getting the same results.
We use the same simple input matrix as we did with the SVD.
Our test input matrix is defined as
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
\]
NumPy
Calculating the eigenvalue decomposition with NumPy is very convenient. The interface for EVD exists within the linear algebra module np.linalg
. The EVD calculation is actually very simple, we call np.linalg.eig
with the input matrix as the parameter, and it provides us the eigenvector matrix and eigenvalues as returned values.
The next step is to check that the results are as expected. Based on the definition, we should be able to reconstruct the input matrix by calculation \[\mathbf{U}\mathbf{D}\mathbf{U}^\text{-1}.\]
One thing that we need to notice first is that the returned eigenvalues are within a vector so we need to build the diagonal matrix \(\mathbf{D}\) first.
The reconstructed matrix matches with the input matrix as expected.
Next, we can check that multiplying the input matrix with the first eigenvector produces \(d_1 u_1\).
This perfectly matches \(d_1 u_1\)
The results look good. Calculating the eigenvalue decomposition using NumPy turned out to be really simple.
SciPy
Whenever we need something a bit more advanced than NumPy, we usually look at SciPy. SciPy offers a very similar interface to NumPy's eigenvalue decomposition. The EVD method is provided by the linear algebra library scipy.linalg
. The actual method is called scipy.linalg.eig
.
Let's verify the results the same we did for the NumPy. That is, by reconstructing the input matrix from the factorization
This verifies that we have successfully calculated the eigenvalue decomposition using SciPy!
Pure Python Implementation for Eigenvalue Decomposition
At times you may face an environment, where you just cannot install external libraries such as NumPy or SciPy for whatever reason. Usually, it comes down to the overhead or you are running a that cannot support NumPy or SciPy.
Luckily there are algorithms that can be fairly straightforwardly used to calculate the eigenvalue decomposition. One such algorithm is the Power iteration, can be used to iterative calculate the eigenvectors and the corresponding eigenvalues. The process is iterative in the sense that we first calculate the largest one, remove that from the input matrix, and again calculate the largest of the remaining eigenvalues, etc...
This is the same algorithm that we have previously used for singular value decomposition (SVD).
Power iteration method basics
The power iteration method finds the dominant eigenvalue and vector for square matrix \(\mathbf{A}\).
Following this, what the power iteration method provides us is iteratively the matrix \(\mathbf{U}\) and \(\mathbf{D}\).
- Let \(n = 1\), inialize search space to \(\mathbf{B}_{base}^{n} = \mathbf{A}\), and randomly initialize vector \(\mathbf{b}^n_1\).
- Iterative until convergence or a fixed number of iterations \( k = 1,\ldots \)
\[\mathbf{b}_{k + 1} = {\mathbf{B}_{base}^{n}\mathbf{b}_{k} \over \|\mathbf{B}_{base}^{n}\mathbf{b}_{k} \|}.\] - Remove the last eigenvector direction from the search base: \[ \mathbf{B}_{base}^{n +} = \mathbf{B}_{base}^{n} - d^n \mathbf{b}^n_k (\mathbf{b}^n_k)^\text{T}. \]
- Get the values of the eigenvalues from the Rayleigh quotient
\[ d_n = \sqrt{{(\mathbf{b}^n_k)^\text{T} \mathbf{B}_{base}^n (\mathbf{b}^n_k) \over \|(\mathbf{b}^n_k)\|}} \] - The \(n^\text{th}\) eigenvector is \(\mathbf{b}_k^n.\)
Now, we have successfully recovered \(\mathbf{U}\) and \(\mathbf{D}\). This process is simple enough to be implemented in Python
Preliminaries
Before getting into the actual power iteration algorithm, we need to introduce a few simple utility functions. We will focus on code simplicity rather than optimal performance.
One thing that we need is matrix multiplication by a vector and vector multiplication by another vector. We call this dot
, short for dot-product
Another one, which we will need is a matrix transpose. This will simply flip the order of rows & columns for a given input array
Power iteration algorithm
Now, we are ready to move on to implementing the power iteration algorithm. Our implementation pretty much follows the steps listed above and should be quite self-explanatory.
This should print out similar results to the ones that, we got from the external packages
There is quite a bit of room for optimization, but it works and that's the main thing. Extension to complex values is also fairly straightforward. From the algorithm point-of-view nothing changes. Just the arithmetics need to be translated to the complex domain.
Summary
In this post, we check the most important libraries for calculating the eigenvalue decomposition, namely, NumPy and SciPy. We verified that the result we got made sense, which is always important when dealing with complete algorithms. In case you ever find yourself in a situation, where it is not possible to utilize external packages, we implemented a custom algorithm for eigenvalue decomposition using the power iteration method.
Further resources
- NumPy eigenvalue decomposition reference manual.
- SciPy eigenvalue decomposition reference manual.
- Eigenvalue decomposition Wikipedia page.
- Power iteration method Wikipedia page.