## Basic quantum coding and decoding

Suppose Alice randomly chooses one of N pre-agreed (normalised) signal states |s_k\rangle and sends it to Bob, who tries to identify which state was sent by performing a measurement defined by the projectors P_k, with \sum_k P_k = \mathbf{1}.
Let P be the projector onto the subspace spanned by the signal states |s_k\rangle, so that P|s_k\rangle = |s_k\rangle for every k.
The dimension d of this subspace is given by d = \operatorname{tr}P.
We shall assume, without any loss of generality, that Bob designed his measurement in such a way that, whenever he gets outcome P_k, he concludes that Alice sent state |s_k\rangle.
His probability of success is given by
\Pr(\text{success})
= \frac{1}{N} \sum_k \langle s_k|P_k|s_k\rangle
that is, for each signal state |s_k\rangle we multiply the probability that it is selected (here equal to 1/N, since all the signal states are equally likely) by the probability that Bob correctly identifies it (which is \langle s_k|P_k|s_k\rangle), and then sum over all signal states.

We have the following **trace identities**:

- \operatorname{tr}(ABC) = \operatorname{tr}(BCA) = \operatorname{tr}(CAB)
- \operatorname{tr}|a\rangle\langle b| = \langle b|a\rangle
- \operatorname{tr}A|a\rangle\langle b| = \langle b|A|a\rangle
- \operatorname{tr}BP \leqslant\operatorname{tr}B for any positive semi-definite B and projector P.

(To prove this last inequality, consider the projector Q=\mathbf{1}-P, and note that
\begin{aligned}
\operatorname{tr}B
&= \operatorname{tr}B(P+Q)
\\&= \operatorname{tr}BP + \operatorname{tr}BQ
\end{aligned}
and that \operatorname{tr}BQ = \operatorname{tr}QBQ is non-negative, since QBQ is positive semi-definite.)
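
The identities above can be sanity-checked numerically. The following is a minimal sketch using NumPy, not part of the original argument: the matrix size `dim`, the projector rank `rank`, and the helper `random_complex` are illustrative assumptions, and random matrices stand in for the operators A, B, C.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, rank = 4, 2  # illustrative sizes (assumptions, not from the text)

def random_complex(*shape):
    """Array with independent complex Gaussian entries (hypothetical helper)."""
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

A, B, C = (random_complex(dim, dim) for _ in range(3))
a, b = random_complex(dim), random_complex(dim)

# Cyclicity: tr(ABC) = tr(BCA) = tr(CAB).
cyclic_ok = bool(
    np.isclose(np.trace(A @ B @ C), np.trace(B @ C @ A))
    and np.isclose(np.trace(B @ C @ A), np.trace(C @ A @ B))
)

# tr|a><b| = <b|a>; np.vdot conjugates its first argument.
outer_ok = bool(np.isclose(np.trace(np.outer(a, b.conj())), np.vdot(b, a)))

# tr(A|a><b|) = <b|A|a>.
sandwich_ok = bool(
    np.isclose(np.trace(A @ np.outer(a, b.conj())), np.vdot(b, A @ a))
)

# tr(BP) <= tr(B) for a positive semi-definite B and a projector P.
M = random_complex(dim, dim)
B_psd = M @ M.conj().T                      # M M^dagger is positive semi-definite
cols, _ = np.linalg.qr(random_complex(dim, rank))
P = cols @ cols.conj().T                    # projector onto a rank-dimensional subspace
bound_ok = bool(np.trace(B_psd @ P).real <= np.trace(B_psd).real + 1e-12)

print(cyclic_ok, outer_ok, sandwich_ok, bound_ok)
```

A numerical check of this kind only tests particular random instances; it illustrates the identities but does not replace the proofs.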

Let us use this case to practice some of the trace identities.
It is often convenient to write expressions such as \langle\psi|A|\psi\rangle in terms of the trace: for any vector |\psi\rangle and operator A we have
\begin{aligned}
\langle\psi|A|\psi\rangle
&= \operatorname{tr}(A|\psi\rangle\langle\psi|)
\\&= \operatorname{tr}(|\psi\rangle\langle\psi| A).
\end{aligned}
In our case,
\begin{aligned}
\Pr(\text{success})
&= \frac{1}{N} \sum_k \langle s_k|P_k|s_k\rangle
\\&= \frac{1}{N} \sum_k \langle s_k|PP_kP|s_k\rangle
\\&= \frac{1}{N} \sum_k \operatorname{tr}(PP_kP|s_k\rangle\langle s_k|)
\end{aligned}
where we have also used that P|s_k\rangle=|s_k\rangle.
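Let us bound this expression from above using the trace identities listed earlier: the inequality step applies the last of them with B = PP_kP and the projector |s_k\rangle\langle s_k|, and the steps after it use linearity of the trace together with \sum_k P_k = \mathbf{1}.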
\begin{aligned}
\sum_k\frac{1}{N} \langle s_k|P_k|s_k\rangle
&= \frac{1}{N} \sum_k \operatorname{tr}(PP_kP|s_k\rangle\langle s_k|)
\\&\leqslant\frac{1}{N} \sum_k \operatorname{tr}(PP_kP)
\\&= \frac{1}{N} \operatorname{tr}\left(P\left(\sum_k P_k\right)P\right)
\\&= \frac{1}{N} \operatorname{tr}(P\mathbf{1}P)
\\&= \frac{1}{N} \operatorname{tr}(P)
\\&= \frac{d}{N}.
\end{aligned}
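
The bound just derived can also be probed numerically. The sketch below is an illustration under stated assumptions rather than part of the derivation: it draws N random signal states inside a random d-dimensional subspace of a D-dimensional space, builds a random projective measurement with N outcomes, and compares the resulting success probability with d/N. The function names `success_probability` and `random_instance`, the ambient dimension `D`, and the particular values of `D`, `d`, `N` are all hypothetical choices made for the example.

```python
import numpy as np

def success_probability(states, projectors):
    """(1/N) * sum_k <s_k|P_k|s_k> for equally likely signal states."""
    n = len(states)
    return sum(np.vdot(s, P @ s).real for s, P in zip(states, projectors)) / n

def random_instance(D, d, N, rng):
    """N unit vectors in a random d-dim subspace of C^D, plus N projectors summing to 1."""
    # Orthonormal basis (columns of V) of a random d-dimensional subspace.
    V, _ = np.linalg.qr(rng.normal(size=(D, d)) + 1j * rng.normal(size=(D, d)))
    states = []
    for _ in range(N):
        s = V @ (rng.normal(size=d) + 1j * rng.normal(size=d))
        states.append(s / np.linalg.norm(s))
    # Random orthonormal basis of C^D, with its columns split into N groups;
    # each group defines one projector (an empty group gives the zero projector).
    U, _ = np.linalg.qr(rng.normal(size=(D, D)) + 1j * rng.normal(size=(D, D)))
    groups = np.array_split(np.arange(D), N)
    projectors = [U[:, g] @ U[:, g].conj().T for g in groups]
    return states, projectors

rng = np.random.default_rng(1)
D, d, N = 5, 2, 4  # illustrative values (assumptions)
results = []
for _ in range(200):
    states, projectors = random_instance(D, d, N, rng)
    results.append(success_probability(states, projectors))

max_success = max(results)
print(f"largest success probability found: {max_success:.4f}; bound d/N = {d / N:.4f}")
```

Random measurements are generally far from optimal, so the values found this way typically sit well below d/N; the point of the check is only that none of them exceeds it.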

So if Alice encodes N equally likely messages as states of a quantum system that, mathematically speaking, lives in a Hilbert space of dimension d, and if Bob decodes by performing a measurement and inferring the message from the result, then Bob’s probability of success is bounded above by \frac{d}{N}.
If the number N of possible signals exceeds the dimension d, then Bob will not be able to reliably distinguish between the signals by any measurement.
In particular:

In this setting, one qubit can store *at most* one bit of information that can *reliably* be read by a measurement.
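
As a worked illustration (an example chosen here, not taken from the text above), suppose Alice tries to encode two bits, so N=4, in a single qubit, so d=2, using the signal states |s_1\rangle=|0\rangle, |s_2\rangle=|1\rangle, |s_3\rangle=|+\rangle, |s_4\rangle=|-\rangle, where |\pm\rangle = (|0\rangle\pm|1\rangle)/\sqrt{2}.
If Bob measures in the computational basis, taking P_1=|0\rangle\langle 0|, P_2=|1\rangle\langle 1|, and P_3=P_4=0, then
\begin{aligned}
\Pr(\text{success})
&= \frac{1}{4} \sum_{k=1}^{4} \langle s_k|P_k|s_k\rangle
\\&= \frac{1}{4} (1+1+0+0)
\\&= \frac{1}{2}
\\&= \frac{d}{N},
\end{aligned}
so this particular choice already saturates the bound: Bob identifies the message correctly only half of the time, and no other choice of projectors P_k can improve on this.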