Lecture 16 (17.11.2015)

Chapter 4: Algebraic Invariants

In this chapter, we will study another method of investigating knots/links. Our methods will be of an algebraic nature, based on matrix manipulations. To begin with, we introduce a graphical calculus for matrices, and recall some properties of and operations on matrices.

As you know, an $(n\times m)$-matrix $R$ is a rectangular array with $n$ rows and $m$ columns, $$\begin{align*}R=\left(\begin{array}{cccc}R_{11} & R_{12} & … & R_{1m}\\ R_{21} & R_{22} & … & R_{2m}\\ \vdots & \vdots & \ddots & \vdots\\ R_{n1} & R_{n2} & … & R_{nm} \end{array}\right)\end{align*}$$

The entries of $R$ are often denoted $R_{ab}\in\mathbb C$, $a=1,…,n$, $b=1,…,m$. For our purposes, however, it will be more convenient to put the first index, i.e. the row index, on top, and the second index, i.e. the column index, at the bottom, and write $R^a_b$ instead of $R_{ab}$.

Moreover, it is not necessary to use the index set $I=\{1,…,n\}$ for the rows and $J=\{1,…,m\}$ for the columns – any finite sets $I,J$ will do. For example, $I=\{$red, green, blue$\}$ is a perfectly acceptable index set to label the rows (or columns) of a matrix, when three labels are sufficient. For a general finite set $I$, we write $|I|$ to denote the number of elements of $I$.

We will denote by $M_{I,J}$ the family of all matrices $R$ whose rows are labeled by $I$ and whose columns are labeled by $J$. In the case of square matrices, we also use the shorthand notation $M_I:=M_{I,I}$.
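As a small illustration of matrices over general index sets (a sketch in Python, our own choice of language here, not part of the lecture), one can store such a matrix as a dictionary keyed by (row label, column label) pairs:

```python
# A matrix R in M_{I,J}, stored as a dictionary keyed by
# (row label, column label). The index sets need not be {1,...,n}:
# here the rows are labeled by colours.
I = ["red", "green", "blue"]   # row labels
J = [1, 2]                     # column labels

R = {(i, j): 0.0 for i in I for j in J}  # the zero matrix in M_{I,J}
R[("red", 1)] = 2.5                      # set the entry R^red_1

print(R[("red", 1)])  # -> 2.5
print(len(I))         # |I| = 3
```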

After these remarks, we begin to introduce our graphical notation for matrices. As a first step, we agree to denote the  matrix elements $R^a_b$, $a\in I, b\in J$, of a matrix $R\in M_{I,J}$, by
[Diagram R1]

As in our notation $R^a_b$, the top line corresponds to the top (row) index, and the bottom line corresponds to the bottom (column) index of $R$.

When we are not interested in a specific matrix element of $R$, but rather the whole matrix $R$, we also write
[Diagram R2]

without the labels $a,b$ for the row and column of $R$. Nonetheless, the upper line corresponds to the rows, and the lower line to the columns of $R$, as will become clear soon.

There are several operations that you can do with matrices: You can add matrices, multiply them with numbers, multiply two matrices (when their sizes fit), take the transpose or adjoint, take the trace, form tensor products, and so on. We will now recall these notions and translate them into graphical notation.

The sum of matrices. Let $R,S\in M_{I,J}$ for two arbitrary finite sets $I,J$. Then $R+S\in M_{I,J}$ is defined by $$(R+S)^i_j=R^i_j+S^i_j\,,\qquad i\in I, j\in J\,.$$ Graphically, we simply write
[Diagram R3]

which corresponds exactly to the formula $(R+S)^i_j=R^i_j+S^i_j$.

The product of a matrix with a number. For a matrix $R\in M_{I,J}$ and some number $\lambda\in\mathbb C$, we have the product of $\lambda$ and $R$, $\lambda\cdot R\in M_{I,J}$, defined as $(\lambda\cdot R)^i_j=\lambda\cdot R^i_j$ for all $i,j$. Graphically, we write
[Diagram R4]

The matrix product. If we have three index sets $I,J,K$, and matrices $R\in M_{I,J}$, $S\in M_{J,K}$, then we can form the product $R\cdot S$, which is a matrix in $M_{I,K}$, defined by $(R\cdot S)^i_k=\sum_{j\in J}R^i_j S^j_k$, for all $i\in I, k\in K$. We might write this graphically as
[Diagram R5]

However, we will use an abbreviated version of this: Since the index $j$ appears twice, once (as a column index) on $R$, and once (as a row index) on $S$, it is a good idea to connect these two lines — lines represent indices in our graphical notation, so if we have a single line between the boxes representing $R$ and $S$, this corresponds to a single index, as it should be.

Furthermore, the result of the multiplication $R\cdot S$ is again a matrix, and should therefore also be represented by a box with two lines, one at the top (row index) and one at the bottom (column index). We therefore might consider the picture
[Diagram R6]

This is basically the notation we will use, with one more change to be made: We drop the sum symbol and agree that all “internal lines”, i.e. lines with no free ends, represent indices that are summed over. We thus write
[Diagram R7]

for the matrix elements $(RS)^i_k$ of the product $RS$, and
[Diagram R8]

for the product matrix itself. Let us make some remarks about this:

  • Lines with one open end represent indices that are not fixed — given a matrix, you still have to fix a row and a column to determine a matrix element.
  • Lines with no open end represent indices that are summed over. The overall result therefore does not depend on the index represented by such an inner line, as in $(RS)^i_k=\sum_j R^i_j S^j_k$, which does not depend on $j$ (see also the code sketch after this list).
  • The precise shape and length of the lines are irrelevant.
  • The order of the factors in a matrix product is important. In general, $RS$ is different from $SR$ (if the latter is defined at all). Correspondingly, in our last picture above, it is important that the first factor is represented by the box at the top, and the second factor by the box at the bottom.
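The summation convention for internal lines is precisely the convention behind numpy's einsum, which makes for a convenient numerical illustration (a sketch in Python/numpy, our own choice here, not part of the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
R = rng.standard_normal((3, 4))   # R in M_{I,J} with |I| = 3, |J| = 4
S = rng.standard_normal((4, 5))   # S in M_{J,K} with |K| = 5

# Entrywise operations: (R+R)^i_j = 2 R^i_j = (2R)^i_j
assert np.allclose(R + R, 2.0 * R)

# The matrix product (RS)^i_k = sum_j R^i_j S^j_k: the repeated
# index j is an "internal line" and is summed over automatically.
RS = np.einsum("ij,jk->ik", R, S)
assert np.allclose(RS, R @ S)
```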

A few further conventions will clarify this graphical notation. For any index set $I$, we have the square identity matrix $1\in M_I$, with entries $$1^a_b=\delta^a_b=\begin{cases} 1 & a=b\\ 0 & a\neq b\end{cases},\qquad a,b\in I.$$ As you know, the identity matrix satisfies $$1\cdot S=S\,,\qquad R\cdot 1=R$$ for all matrices $S\in M_{I,J}$, $R\in M_{K,I}$.

We therefore drop the box around the identity matrix, and simply write
[Diagram R9]

for the identity matrix, and

[Diagram R10]

for its entries. This fits very well with our graphical notation for the matrix product, because with $S\in M_{I,J}$, $R\in M_{K,I}$, we then have
[Diagram R11]

and
[Diagram R12]

which directly expresses $R\cdot 1=R$ and $1\cdot S=S$.
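In the same spirit, one can check numerically that the identity matrix acts like a plain line (again a numpy sketch, not part of the lecture):

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.standard_normal((3, 4))   # S in M_{I,J}
one = np.eye(3)                   # the identity matrix 1 in M_I

# Since 1^a_b = delta^a_b, multiplying by 1 changes nothing:
assert np.allclose(np.einsum("ab,bj->aj", one, S), S)  # 1*S = S
assert np.allclose(S @ np.eye(4), S)                   # S*1 = S, with 1 in M_J
```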

It is also instructive to consider vectors as well, i.e. matrices with either a single column or a single row only. A column vector $v$ is an $(n\times 1)$-matrix, $$v=\left(\begin{array}{c}v^1_1\\ v^2_1\\ \vdots \\ v^n_1\end{array}\right)=\left(\begin{array}{c}v^1\\ v^2\\ \vdots \\ v^n\end{array}\right),$$ where we first used the general matrix notation $v^i_j$ for the entries, but then dropped the lower (column) index $j$, since there is only one column here, and hence $j=1$ for all entries. In our graphical notation, we follow the same convention, and drop the line representing the column index, which carries no information in the case of a single column. We therefore write
[Diagram R13]

for a column vector $v$. The line on top represents the row index $i$ of $v^i$.

The situation for row vectors $w$ is analogous. These are $(1\times m)$-matrices, consisting of a single row, but several columns, $$w=(w^1_1,w^1_2,…,w^1_m)=(w_1,w_2,…,w_m),$$where we now dropped the row index $i=1$ from $w^i_j$. Graphically, we write
[Diagram R14]

where the lower line represents the column index $j=1,…,m$, and the top line has been dropped because there is only a single value for the row index in this case.

Let $R\in M_{I,J}$ be a matrix, $v$ a column vector with $|J|$ components $v^1,…,v^{|J|}$, and $w$ a row vector with $|I|$ components $w_1,…,w_{|I|}$. Then the matrix products $Rv$ and $wR$ are defined and yield a column vector $Rv$ and row vector $wR$, respectively. This is also clear from our graphical notation:
[Diagram R15]  [Diagram R16]

Note how the resulting matrices have no lower line (no column index, hence a single column, i.e. a column vector) and no upper line (no row index, hence a single row, i.e. a row vector), respectively.

It is also worth considering the products of a row vector $w$ and a column vector $v$. If $v$ and $w$ have the same number of components, then their product $wv=\sum_i w_iv^i$ is defined, and is a number (without any index left). This again fits well with our graphical notation,
[Diagram R17]

Here the absence of any external lines on the right hand side indicates that $wv$ has no indices, and represents a number (“$1\times 1$ matrix”).

If, on the other hand, we consider the product $vw$, the situation is different. Since $v$ has one column, and $w$ one row, this product is also well-defined, and yields a matrix — as is also clear from the graphical notation
[Diagram R18]

because in this order, two external lines appear, indicating a row as well as a column index.
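In einsum notation, the free indices of the result are exactly the external lines of these pictures, so all four products can be illustrated as follows (a numpy sketch, our own choice, not part of the lecture):

```python
import numpy as np

rng = np.random.default_rng(2)
R = rng.standard_normal((3, 3))  # a square matrix, for simplicity
v = rng.standard_normal(3)       # column vector with components v^i
w = rng.standard_normal(3)       # row vector with components w_j

Rv = np.einsum("ij,j->i", R, v)  # one free upper index: a column vector
wR = np.einsum("i,ij->j", w, R)  # one free lower index: a row vector
wv = np.einsum("i,i->", w, v)    # no free index left: a number
vw = np.einsum("i,j->ij", v, w)  # two free indices: a matrix

assert np.isclose(wv, w @ v)
assert np.allclose(vw, np.outer(v, w))
```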

 

We will also consider some other matrix operations.

The transpose. The transpose $R^T$ of a matrix $R\in M_{I,J}$ is a matrix $R^T\in M_{J,I}$, defined by $(R^T)^j_i:=R^i_j$. Since this corresponds to swapping rows and columns, our graphical notation for this is
[Diagram R19]

Note that the transpose of a row vector is a column vector; as an exercise, draw the corresponding graphical expressions.

The adjoint. Closely related to the transpose of a matrix is the adjoint. For $R\in M_{I,J}$, its adjoint $R^*\in M_{J,I}$ is defined as $(R^*)^j_i=\overline{R^i_j}$, where the bar means complex conjugation. Graphically,

[Diagram R20]

where the bar also means complex conjugation, and is coloured red in order not to confuse it with the lines of the diagram.
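Both operations are immediate in numpy (a sketch, not part of the lecture):

```python
import numpy as np

R = np.array([[1 + 2j, 3], [0, 4 - 1j], [5j, 6]])  # a (3 x 2)-matrix

R_T = R.T             # transpose: (R^T)^j_i = R^i_j, a (2 x 3)-matrix
R_star = R.conj().T   # adjoint: (R^*)^j_i = conjugate of R^i_j

assert R_T[1, 0] == R[0, 1]
assert R_star[1, 0] == R[0, 1].conjugate()
```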

The trace. With the conventions introduced so far, what could
[Diagram R21]

mean for a matrix $R$? Since the upper and lower line are connected, they only represent a single index (“one line = one index”). This only makes sense if the index sets for the upper and lower indices coincide, i.e. we must have a “square” matrix $R\in M_I=M_{I,I}$. Then, according to our conventions, we should sum over the index represented by the line, because the line has no open ends. Thus the above diagram represents $$\sum_{i\in I}R^i_i,$$ the sum of the diagonal entries of $R$, which is precisely the trace of $R$, $\mathrm{Tr}(R)=\sum_{i\in I}R^i_i$.
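In einsum notation, closing the line corresponds to repeating the index letter with no output index left (again a numpy sketch):

```python
import numpy as np

rng = np.random.default_rng(3)
R = rng.standard_normal((4, 4))  # the trace requires a square matrix

# "ii->": upper and lower index are identified and summed over,
# leaving no free index -- a number.
tr = np.einsum("ii->", R)
assert np.isclose(tr, np.trace(R))
```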

 

The tensor product. There is one more operation that we will make heavy use of – the tensor product of matrices. Given matrices $R\in M_{I_1,J_1}$ and $S\in M_{I_2,J_2}$, with arbitrary index sets $I_1$, $I_2$, $J_1$, $J_2$, the tensor product of $R$ and $S$, written $R\otimes S$, is defined as a matrix in $M_{I_1\times I_2, J_1\times J_2}$. That means that the “rows” of $R\otimes S$ are indexed by ordered pairs $(i_1,i_2)$, where $i_1\in I_1$, $i_2\in I_2$, and the “columns” of $R\otimes S$ are indexed by ordered pairs $(j_1,j_2)$, where $j_1\in J_1$, $j_2\in J_2$. The definition of the tensor product is $$(R\otimes S)^{i_1 i_2}_{j_1 j_2} = R^{i_1}_{j_1}\cdot S^{i_2}_{j_2}.$$ Here $i_1i_2$ means the ordered pair $(i_1,i_2)$, and analogously, $j_1j_2$ means $(j_1,j_2)$.

Since the set of ordered pairs $I_1\times I_2=\{(i_1,i_2)\,:\,i_1\in I_1,\;i_2\in I_2\}$ has $|I_1\times I_2|=|I_1|\cdot|I_2|$ elements, the size of the matrices changes under the tensor product — for example, the tensor product of an $(n_1\times m_1)$-matrix with an $(n_2\times m_2)$-matrix can be understood as an $(n_1\cdot n_2)\times (m_1\cdot m_2)$-matrix.

As an example, consider the $(2\times 2)$-matrices $$\begin{align*}R=\left(\begin{array}{cc} a&b\\ c&d\end{array}\right),\qquad S=\left(\begin{array}{cc} a'&b'\\ c'&d'\end{array}\right),\end{align*}$$ which are both elements of $M_I$ with $I=\{1,2\}$. Hence $I\times I=\{(1,1),(1,2),(2,1),(2,2)\}$, and we find for the tensor product (which is a $(4\times 4)$-matrix in this case) $$\begin{align*}R\otimes S=\left(\begin{array}{cc} aS&bS\\ cS&dS\end{array}\right)=\left(\begin{array}{cccc} aa'&ab'&ba'&bb'\\ ac'&ad'&bc'&bd'\\ca'&cb'&da'&db'\\cc'&cd'&dc'&dd'\end{array}\right),\end{align*}$$ when we agree to index the rows and columns of $R\otimes S$ by $I\times I$ in the order $(1,1),(1,2),(2,1),(2,2)$. In most cases, however, it will not be necessary to represent tensor products as such two-dimensional arrays again — rather, we think of them as objects with two upper indices $i_1,i_2$ and two lower indices $j_1,j_2$, i.e. one has to specify the values of these four indices to extract a number, $(R\otimes S)^{i_1i_2}_{j_1j_2}$, from $R\otimes S$. From this point of view, $R\otimes S$ behaves as a “four-dimensional table”.
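In numpy, the tensor product with this ordering of pairs is np.kron, and the “four-dimensional table” point of view corresponds to a 4-index array (a sketch, not part of the lecture):

```python
import numpy as np

R = np.array([[1, 2], [3, 4]])
S = np.array([[5, 6], [7, 8]])

# As a (4 x 4)-matrix, rows/columns ordered (1,1),(1,2),(2,1),(2,2):
T = np.kron(R, S)  # the block matrix (aS bS; cS dS)

# As a "four-dimensional table" T4[i1, i2, j1, j2] = R[i1, j1] * S[i2, j2]:
T4 = np.einsum("ij,kl->ikjl", R, S)
assert np.allclose(T4.reshape(4, 4), T)
```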

Graphically, we represent tensor products as
[Diagram R22]

where the two upper lines represent the two upper (“row”) indices of $R\otimes S$, and the two lower lines represent the two lower (“column”) indices of $R\otimes S$.

We can also consider sets of all matrices with two upper and two lower indices, say, $M_{I_1\times I_2,J_1\times J_2}$. These can be added and multiplied by numbers like any other matrices. Note, however, that not all matrices $X\in M_{I_1\times I_2,J_1\times J_2}$ are of the form $X=R\otimes S$ for some $R\in M_{I_1,J_1}$ and $S\in M_{I_2,J_2}$. (Exercise: Find a counterexample.) But any $X\in M_{I_1\times I_2,J_1\times J_2}$ can be written as a finite sum of the form $X=\sum_{a=1}^N R_a\otimes S_a$ for suitable $R_a\in M_{I_1,J_1}$ and $S_a\in M_{I_2,J_2}$. (Exercise: Prove this.)

Graphically, any such tensor product matrix (or “tensor”, for short) $X\in M_{I_1\times I_2,J_1\times J_2}$ is written as
[Diagram R23]

in complete analogy to what we did with $R\otimes S$ above.

As you can imagine, it does not stop here: Instead of pairs of indices, we can also consider ordered $n$-tuples $(i_1,…,i_n)$ of indices, and consider objects (tensors) $T$ with $n$ upper and $m$ lower indices, i.e. $T^{i_1…i_n}_{j_1…j_m}$. Graphically, these would be represented by boxes with $n$ upper and $m$ lower lines. If $T$ has as many lower indices as $U$ has upper indices, running over the same index sets, then we can consider the product $TU$ in complete analogy to the matrix product: “sum over all joint indices, and take products”: $$(TU)^{i_1…i_n}_{k_1…k_r}=\sum_{j_1,…,j_m}T^{i_1…i_n}_{j_1…j_m}U^{j_1…j_m}_{k_1…k_r}.$$
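This generalized product is again a single einsum call: repeat the letters of the joint indices and omit them from the output (a numpy sketch with arbitrarily chosen sizes):

```python
import numpy as np

rng = np.random.default_rng(4)
# T with n = 2 upper and m = 3 lower indices,
# U with 3 upper and r = 2 lower indices:
T = rng.standard_normal((2, 2, 3, 3, 3))  # axes: i1 i2 | j1 j2 j3
U = rng.standard_normal((3, 3, 3, 2, 2))  # axes: j1 j2 j3 | k1 k2

# "Sum over all joint indices, and take products":
TU = np.einsum("abjkl,jklcd->abcd", T, U)
print(TU.shape)  # (2, 2, 2, 2): two upper and two lower indices remain
```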

There are a couple of rules for calculating with tensors. For example, for $R\in M_{I_1,J_1}$, $S\in M_{I_2,J_2}$, $\tilde{R}\in M_{J_1,K_1}$, $\tilde{S}\in M_{J_2,K_2}$, we have $$(R\otimes S)(\tilde{R}\otimes \tilde{S})=R\tilde{R}\otimes S\tilde{S}.$$ Here and in the following, we do not indicate the matrix product with a symbol, i.e. $R\tilde{R}=R\cdot \tilde{R}$. The above equation can be proven in two different ways: 1) One considers arbitrary matrix elements of the left and right hand sides and checks, using the definitions, that the equality holds (which I recommend as an exercise; a solution is given below). Or: 2) One uses the graphical calculus:
[Diagram R24]

In this graphical calculation, we first used the matrix product, then the tensor product, and then again the matrix product. In formulas, this amounts to $$\begin{align*}((R\otimes S)(\tilde{R}\otimes\tilde{S}))^{i_1i_2}_{k_1k_2}&=\sum_{j_1,j_2}(R\otimes S)^{i_1i_2}_{j_1j_2}(\tilde{R}\otimes\tilde{S})^{j_1j_2}_{k_1k_2}\\ &= \sum_{j_1,j_2}R^{i_1}_{j_1}S^{i_2}_{j_2}\tilde{R}^{j_1}_{k_1}\tilde{S}^{j_2}_{k_2}\\ &= \sum_{j_1}R^{i_1}_{j_1}\tilde{R}^{j_1}_{k_1}\sum_{j_2}S^{i_2}_{j_2}\tilde{S}^{j_2}_{k_2}\\ &= (R\tilde{R})^{i_1}_{k_1}\,(S\tilde{S})^{i_2}_{k_2}\\ &= ((R\tilde{R})\otimes(S\tilde{S}))^{i_1i_2}_{k_1k_2}.\end{align*}$$
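One can also verify this identity numerically for randomly chosen matrices (a numpy sketch; np.kron plays the role of $\otimes$):

```python
import numpy as np

rng = np.random.default_rng(5)
R, Rt = rng.standard_normal((2, 3)), rng.standard_normal((3, 4))
S, St = rng.standard_normal((5, 2)), rng.standard_normal((2, 6))

# (R tensor S)(R~ tensor S~) = (R R~) tensor (S S~):
lhs = np.kron(R, S) @ np.kron(Rt, St)
rhs = np.kron(R @ Rt, S @ St)
assert np.allclose(lhs, rhs)
```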

Another property of the tensor product is that $R\otimes S$ is linear in both $R$ and $S$, as you can also check as an exercise.

We conclude this lecture with another exercise.

Exercise 4.1: Let $I$ be a finite index set and define for $a,b\in I$ the elementary matrices $E_{a,b}\in M_I$ by $$(E_{a,b})^i_j:=\delta^i_a\cdot\delta_{bj}.$$ That is, $E_{a,b}$ is the matrix that has a single 1 in row $a$, column $b$, and zeros in all other entries.

Now define a matrix $U\in M_{I^2}$, where $I^2=I\times I$, by $$U^{ab}_{cd}=\delta^{ab}\delta_{cd},\qquad a,b,c,d\in I.$$

Show

  • $E_{r,s} E_{i,j}=\delta_{is}\cdot E_{r,j}$.
  • $U=\sum_{i,j} E_{i,j}\otimes E_{i,j}$.
  • $(U\otimes 1)(1\otimes U)(U\otimes 1)=(U\otimes 1)$, where “1” denotes the identity matrix in $M_{I}$. Note that $U\otimes 1, 1\otimes U\in M_{I^3}$ have three upper and three lower indices. It is helpful to make use of the first two parts of the exercise to complete the third. (For a numerical sanity check, see the sketch after this exercise.)
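The following numpy sketch checks all three claims numerically for $|I|=3$ (a sanity check only; it does not replace a proof):

```python
import numpy as np

n = 3
I3 = np.eye(n)

def E(a, b):
    """Elementary matrix E_{a,b}: a single 1 in row a, column b."""
    M = np.zeros((n, n))
    M[a, b] = 1.0
    return M

# Part 1 (spot check): E_{r,s} E_{i,j} = delta_{is} E_{r,j}
assert np.allclose(E(0, 1) @ E(1, 2), E(0, 2))   # here delta_{is} = 1
assert np.allclose(E(0, 1) @ E(2, 2), 0 * I3)    # here delta_{is} = 0

# U^{ab}_{cd} = delta^{ab} delta_{cd}, flattened to an (n^2 x n^2)-matrix
# with the pairs ordered lexicographically, as in the lecture:
U = np.einsum("ab,cd->abcd", I3, I3).reshape(n * n, n * n)

# Part 2: U = sum_{i,j} E_{i,j} tensor E_{i,j}
assert np.allclose(U, sum(np.kron(E(i, j), E(i, j))
                          for i in range(n) for j in range(n)))

# Part 3: (U tensor 1)(1 tensor U)(U tensor 1) = (U tensor 1)
A, B = np.kron(U, I3), np.kron(I3, U)
assert np.allclose(A @ B @ A, A)
```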