14.2 Linear codes

Hamming codes are special cases of a larger family of codes called linear codes: those in which the codewords form a vector space. These are constructed by judicious choices of generator matrices or parity-check matrices (since one determines the other), and can offer different trade-offs between code rate and distance. We have already seen the example of the Hamming [7,4,3] code, but let’s state the general framework a bit more abstractly.

An [n,k,d] linear code C is described by two matrices:

  • The generator matrix G, which is an (n\times k) matrix G = \begin{bmatrix} \mathbf{1}_{k\times k} \\ P \end{bmatrix} for some ((n-k)\times k) matrix P. The columns of G are codewords, and form a basis for the codespace \operatorname{range}(G).
  • The parity-check matrix H, which is an ((n-k)\times n) matrix H = \begin{bmatrix} Q & \mathbf{1}_{(n-k)\times(n-k)} \end{bmatrix} for some ((n-k)\times k) matrix Q. The columns of H are error syndromes: the syndrome of a single-bit error in position i is exactly the i-th column of H.

The matrices G and H have to satisfy one of the two equivalent conditions \operatorname{range}(G) = \ker(H) \qquad\text{or}\qquad \operatorname{range}(H^T) = \ker(G^T). We can ensure this by taking Q=-P (see Exercise 14.11.4); for binary codes, where arithmetic is over \mathbb{F}_2, this is simply Q=P.
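As a quick sanity check, we can build the Hamming [7,4,3] code in this standard form and verify the defining condition numerically. The sketch below uses NumPy with arithmetic taken mod 2; the particular matrix P is one conventional labelling of the parity checks, and any equivalent choice works just as well.

```python
import numpy as np

# Hamming [7,4,3] code in standard form; this P is one conventional
# choice of parity checks (other labellings give an equivalent code).
P = np.array([[1, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 1, 1, 1]])                      # (n-k) x k = 3 x 4

n, k = 7, 4
G = np.vstack([np.eye(k, dtype=int), P])          # n x k generator matrix
H = np.hstack([P, np.eye(n - k, dtype=int)])      # (n-k) x n parity-check matrix

# Over GF(2) we have Q = -P = P, so H.G = P + P = 0 (mod 2),
# i.e. range(G) ⊆ ker(H):
assert np.all(H @ G % 2 == 0)

# The syndrome of a single-bit error in position i is the i-th column of H:
d = np.array([1, 0, 1, 1])                        # some data word
c = G @ d % 2                                     # encode it
e = np.zeros(n, dtype=int)
e[2] = 1                                          # flip bit 2
syndrome = H @ ((c + e) % 2) % 2
assert np.all(syndrome == H[:, 2])
```

The last check is just linearity: H(c+e) = Hc + He = 0 + He, and He picks out the column of H indexed by the flipped bit.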

One last piece of the puzzle that we need to understand is the notion of dual codes. We will write \operatorname{range}(G) to mean the vector space spanned by the columns of the matrix G. Given a code[1] C=(G,H) expressed in terms of its generator matrix and parity-check matrix, we know that the columns of G span the kernel of H, i.e. that \operatorname{range}(G)=\ker(H). Why is this? Well, because this is equivalent to saying that a vector is in the span of the columns of G exactly when it is of the form G\mathbf{d} for some data vector \mathbf{d}, i.e. exactly when it is a codeword, and we have already shown this above. In particular, H\cdot G = 0 (which merely says that \operatorname{range}(G)\subseteq\ker(H)).

But, taking the transpose of the above equality, we see that G^T\cdot H^T = 0 and so[2] we can define a code C^\perp=(H^T,G^T), whose codewords are exactly the columns of H^T, i.e. the rows of H. This is known as the dual code of C. Since G has n rows and k columns, and H has n-k rows and n columns, we see that the dimension of the codespace of C (i.e. the span of the columns of G) and the dimension of the codespace of C^\perp (i.e. the span of the rows of H) must sum to n. In fact, if C is an [n,k,d] code, then C^\perp is an [n,n-k,d'] code, where d' is not usually related to d in any obvious way. A nice example is the Hamming [7,4,3] code, whose dual is a [7,3,4] code known as the shortened Hadamard code.

Given a code C=(G,H), its dual code is C^\perp=(H^T,G^T).
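To see the [7,3,4] claim concretely, we can enumerate all 2^{n-k} codewords of the dual of the Hamming [7,4,3] code (the span of the rows of H) and inspect their weights. This sketch reuses the same standard-form P as before, which is one conventional choice.

```python
import numpy as np
from itertools import product

# Parity-check matrix of the Hamming [7,4,3] code, in standard form
# (same conventional choice of P as before).
P = np.array([[1, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 1, 1, 1]])
n, k = 7, 4
H = np.hstack([P, np.eye(n - k, dtype=int)])   # 3 x 7

# The codewords of the dual code are the span of the rows of H
# (equivalently, the columns of H^T); enumerate all 2^(n-k) of them.
weights = []
for coeffs in product([0, 1], repeat=n - k):
    word = np.array(coeffs) @ H % 2
    if word.any():
        weights.append(int(word.sum()))

# 2^3 - 1 = 7 nonzero codewords, every one of weight exactly 4,
# so the dual is indeed a [7,3,4] code.
assert len(weights) == 7
assert set(weights) == {4}
```

That every nonzero codeword has weight exactly 4 (not merely at least 4) is a special feature of this dual code, which is why it also goes by the name simplex code.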

It is immediate that (C^\perp)^\perp=C, but it is interesting to ask about the relationship between C and C^\perp. We say that a code is weakly self-dual (or self-orthogonal) if \operatorname{range}(G)\subseteq\operatorname{range}(H^T), and self-dual if \operatorname{range}(G)=\operatorname{range}(H^T). In other words, a code is weakly self-dual if its codespace is a subspace of the codespace of its dual, and self-dual if these two codespaces are actually equal. We said above that \dim\operatorname{range}(G)+\dim\operatorname{range}(H^T)=n, so for a code to be self-dual it must be the case that k=n-k, i.e. that n=2k is even, but this is only a necessary condition, not a sufficient one!

  1. Usually, for linear codes, people talk of the code C as being equal to the codespace, i.e. the span of the columns of G. For now, however, it is notationally simpler to denote a code by these two key matrices.

  2. This tells us that \operatorname{range}(H^T)\subseteq\ker(G^T), but we can show that this is an equality using the fact that \operatorname{range}(G)=\ker(H) is an equality.