Linear codes
Hamming codes are special cases of a larger family of codes called linear codes: those in which the codewords form a vector space.
These are constructed by judicious choices of generator matrices or parity-check matrices (since one determines the other), and can offer different trade-offs between code rate and distance.
We have already seen the example of the Hamming [7,4,3] code, but let’s state the general framework a bit more abstractly.
An [n,k,d] linear code C is described by two matrices:
- The generator matrix G, which is an (n\times k) matrix
G =
\begin{bmatrix}
\mathbf{1}_{k\times k}
\\ P
\end{bmatrix}
for some ((n-k)\times k) matrix P.
The columns of G are codewords, and form a basis for the codespace \operatorname{range}(G).
- The parity-check matrix H, which is an ((n-k)\times n) matrix
H =
\begin{bmatrix}
Q & \mathbf{1}_{(n-k)\times(n-k)}
\end{bmatrix}
for some ((n-k)\times k) matrix Q.
The columns of H are error syndromes: the syndrome of a single-bit error on the i-th bit is exactly the i-th column of H.
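For concreteness, one standard (though not unique: any permutation of the bits works equally well) choice of these matrices for the Hamming [7,4,3] code is
G =
\begin{bmatrix}
\mathbf{1}_{4\times 4}
\\ P
\end{bmatrix}
\qquad\text{and}\qquad
H =
\begin{bmatrix}
P & \mathbf{1}_{3\times 3}
\end{bmatrix}
\qquad\text{where}\qquad
P =
\begin{bmatrix}
0 & 1 & 1 & 1
\\ 1 & 0 & 1 & 1
\\ 1 & 1 & 0 & 1
\end{bmatrix}
so that the seven columns of H run through all the non-zero binary vectors of length 3, which is exactly the defining property of a Hamming code.
(Here we have taken Q=P; we will see below why this is the right choice for binary codes.)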
The matrices G and H have to satisfy one of the two equivalent conditions
\operatorname{range}(G) = \ker(H)
\qquad\text{or}\qquad
\operatorname{range}(H^T) = \ker(G^T).
We can ensure this by taking Q=-P (see Exercise 14.11.4).
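As a quick sanity check, here is a minimal sketch in Python (using NumPy, and the explicit matrices suggested above for the Hamming [7,4,3] code) verifying that H\cdot G=0 modulo 2, and that an encoded data vector indeed has zero syndrome:

```python
import numpy as np

# One possible systematic choice for the Hamming [7,4,3] code
# (an assumption: any permutation of the bits works equally well).
P = np.array([[0, 1, 1, 1],
              [1, 0, 1, 1],
              [1, 1, 0, 1]])
G = np.vstack([np.eye(4, dtype=int), P])  # the (7x4) matrix [1 ; P]
H = np.hstack([P, np.eye(3, dtype=int)])  # the (3x7) matrix [P | 1]

# The defining condition range(G) = ker(H) implies H.G = 0 (mod 2),
# since over GF(2) we have Q = -P = P.
assert np.all((H @ G) % 2 == 0)

# Encoding a data vector with G yields a codeword with zero syndrome.
d = np.array([1, 0, 1, 1])
codeword = (G @ d) % 2
print(codeword)            # [1 0 1 1 0 1 0]
print((H @ codeword) % 2)  # [0 0 0]: the zero syndrome
```

Running d over all 16 binary data vectors produces all 16 codewords, each with zero syndrome.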
One last piece of the puzzle that we need to understand is the notion of dual codes.
Recall that we write \operatorname{range}(G) to mean the vector space spanned by the columns of the matrix G.
Given a code C=(G,H) expressed in terms of its generator matrix and parity-check matrix, we know that the columns of G span the kernel of H, i.e. that \operatorname{range}(G)=\ker(H).
Why is this?
Well, because saying that \operatorname{range}(G)=\ker(H) is the same as saying that a vector is in the kernel of H exactly when it is of the form G\mathbf{d} for some data vector \mathbf{d}, i.e. exactly when it is a codeword, and we have already shown this above.
In particular then,
H\cdot G = 0
(which merely says that \operatorname{range}(G)\subseteq\ker(H)).
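To see why this inclusion is in fact an equality, we can also simply count dimensions: both G and H in the systematic forms above have full rank, so
\dim\ker(H)
= n - \operatorname{rank}(H)
= n - (n-k)
= k
= \dim\operatorname{range}(G),
and a subspace whose dimension equals that of the space containing it must be the whole space.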
But, taking the transpose of this equality, we see that
G^T\cdot H^T = 0,
and so we can define a code C^\perp=(H^T,G^T), whose codewords are exactly those generated by the columns of H^T, i.e. by the rows of H.
This is known as the dual code of C.
Since G has n rows and k columns, and H has n-k rows and n columns, we see that the dimension of the codespace of C (i.e. the span of the columns of G) and the dimension of the codespace of C^\perp (i.e. the span of the rows of H) must sum to n.
In fact, if C is an [n,k,d] code, then C^\perp is an [n,n-k,d'] code, where d' is not usually related to d in any obvious way.
A nice example is the Hamming [7,4,3] code, whose dual is a [7,3,4] code known as the shortened Hadamard code.
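We can verify this with another small sketch (again in Python with NumPy, and again assuming the explicit choice of H from before): enumerate all 2^3=8 codewords of the dual code, i.e. the span of the rows of H, and check their weights.

```python
import numpy as np
from itertools import product

# Parity-check matrix of the Hamming [7,4,3] code (same assumed choice
# as before); its rows span the codespace of the dual code.
P = np.array([[0, 1, 1, 1],
              [1, 0, 1, 1],
              [1, 1, 0, 1]])
H = np.hstack([P, np.eye(3, dtype=int)])

# All 2^3 linear combinations of the rows of H over GF(2).
dual = {tuple((np.array(c) @ H) % 2) for c in product([0, 1], repeat=3)}

print(len(dual))                     # 8, so the dual code has dimension 3
print(sorted(sum(w) for w in dual))  # [0, 4, 4, 4, 4, 4, 4, 4]
```

Since the distance of a linear code is the minimum weight of a non-zero codeword, this confirms that the dual is indeed a [7,3,4] code.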
Recall that, given a code C=(G,H), its dual code is C^\perp=(H^T,G^T).
It is immediate that (C^\perp)^\perp=C, but it is interesting to ask about the relationship between C and C^\perp.
We say that a code is weakly self-dual (or self-orthogonal) if \operatorname{range}(G)\subseteq\operatorname{range}(H^T), and self-dual if \operatorname{range}(G)=\operatorname{range}(H^T).
In other words, a code is weakly self-dual if its codespace is a subspace of the codespace of its dual, and self-dual if these two codespaces are actually equal.
We said above that \dim\operatorname{range}(G)+\dim\operatorname{range}(H^T)=n, so for a code to be self-dual we need k=n-k, i.e. n=2k must be even, but this is only a necessary condition, not a sufficient one!
For example, the binary code with codewords \{00,10\} has even length, yet its dual is \{00,01\}.
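Since \operatorname{range}(H^T)=\ker(G^T), weak self-duality is equivalent to the condition G^T\cdot G=0, i.e. to every pair of codewords being orthogonal.
As a final sketch (with the same assumed matrices as before), we can check that the [7,3,4] dual of the Hamming code is weakly self-dual, while the Hamming [7,4,3] code itself is not:

```python
import numpy as np

P = np.array([[0, 1, 1, 1],
              [1, 0, 1, 1],
              [1, 1, 0, 1]])
G = np.vstack([np.eye(4, dtype=int), P])         # generator of the [7,4,3] code
G_dual = np.hstack([P, np.eye(3, dtype=int)]).T  # generator of its dual: H^T

# A code is weakly self-dual exactly when G^T.G = 0 over GF(2).
print(bool(np.all((G_dual.T @ G_dual) % 2 == 0)))  # True:  weakly self-dual
print(bool(np.all((G.T @ G) % 2 == 0)))            # False: not weakly self-dual
```

(Neither code can be self-dual, of course, since here n=7 is odd.)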