Transition Matrices#

Here, we will continue to discuss some of the foundations of HMMs. We will cover…

  • Transition matrices

  • Manipulating transition matrices to make inferences


What is a transition matrix?#

As seen in the weather example found within the Introduction, the visual used to present the weather example is called a transition diagram. However this is inefficent to read when working with more states and more transitions. To do this, we can use a transition matrix which I will explain further below.

If we are trying to demonstrate the possible transitional probabilites from three states \(A, B, C\), how can we do this without having to draw out the diagram? We can use a transition matrix.

\(A\)

\(B\)

\(C\)

\(A\)

0.50

0.25

0.25

\(B\)

0.33

0.33

0.33

\(C\)

0.9

0.05

0.05

This is what a transition matrix may look like. What is the probability of transitioning to \(B\) if the current state is \(A\)? Well, we first look at the left-most column, which represents all the current states. Stop at \(A\). Then we continue into \(A\)’s row until we hit \(B\). The probability in that cell is 0.25. So there is a 0.25 probability that we will transition to state \(B\) if we are currently on state \(A\).


But how can we take this further?#

Let \(S_0\) represent the initial state. This can be taken further such that

\[\begin{split} S_0 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \end{split}\]

Where \(1\) in the vector is the initial state \(A\), i.e., where you are currently at. The subsequent \(0\)s are where you are not. This equation is mostly formalism for encoding the initial state of the system.

Now let’s transition to \(S_1\) such that

\[\begin{split} S_1 = \begin{pmatrix} 0.50 \\ 0.25 \\ 0.25 \end{pmatrix} \end{split}\]

But how do you go from \(S_1\) to \(S_2\)? What about \(S_{500}\)? Let’s go back to our matrix. As seen in the matrix, if you begin at \(A\) there is 0.50 likelihood that you will remain at \(A\) and so on for the remaining states. This is essentially multiplying \(S_0\) by the \(P(S_1)\). Let’s break it down in the equation below.

\[\begin{split} \begin{pmatrix} 0.50 \\ 0.25 \\ 0.25 \end{pmatrix} = \begin{pmatrix} 0.50 & 0.25 & 0.25 \\ 0.33 & 0.33 & 0.33 \\ 0.90 & 0.05 & 0.05 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \end{split}\]

Now what about moving from \(S_1\) to \(S_2\)? It’s simple! \(S_2 = P(S_1)\) Let’s see this below.

\[\begin{split} \begin{pmatrix} 0.375 \\ 0.33 \\ 0.475 \end{pmatrix} = \begin{pmatrix} 0.50 & 0.25 & 0.25 \\ 0.33 & 0.33 & 0.33 \\ 0.90 & 0.05 & 0.05 \end{pmatrix} \begin{pmatrix} 0.50 \\ 0.25 \\ 0.25 \end{pmatrix} \end{split}\]

This can also be completed in Python as well.

Implementing in python#

# import numpy
import numpy as np
from IPython.display import display, Math

# create our transition matrix
X = np.array([[.5, .25, .25], [.33, .33, .33], [.9, .05, .05 ]])
X

# our S1 vector
s1 = np.array([[0.50], [0.25], [0.25]])
s1

# calculate answer
ans = X @ s1
#or
#ans = np.dot(s1, X)

# display
display(Math(r'S_2 =\begin{pmatrix} %g \\ %g \\ %g \end{pmatrix}' %(ans[0][0],ans[1][0], ans[2][0])))
\[\begin{split}\displaystyle S_2 =\begin{pmatrix} 0.375 \\ 0.33 \\ 0.475 \end{pmatrix}\end{split}\]

But how do we continue this to, say, \(P_{500}\)? Well if \(S_2 = P(S_1)\) and \(S_1 = P(S_0)\) then getting to the \(n^{th}\) state would be…

\[ S_n = P^n(S_0) \]

where \(n\) is the state you wish to identify and \(P\) is the transition matrix.

Let’s try this in Python.

Implementing in python#

import numpy as np
from numpy.linalg import matrix_power
from IPython.display import display, Math

# create our transition matrix
X = np.array([[.5, .25, .25], [.33, .33, .33], [.9, .05, .05 ]])
X

# create S_0
s0 = np.array([[1], [0], [0]])

# calculate 500th state
s500 = matrix_power(X, 500)@s0

# display
display(Math(r'S_{500} =\begin{pmatrix} %g \\ %g \\ %g \end{pmatrix}' %(s500[0][0], s500[1][0], s500[2][0])))
\[\begin{split}\displaystyle S_{500} =\begin{pmatrix} 0.180769 \\ 0.178828 \\ 0.181093 \end{pmatrix}\end{split}\]