Per Wikipedia, in Grover's algorithm the probability of observing search string $\omega$ after $r$ iterations is: $$\left|\begin{bmatrix}\langle \omega | \omega \rangle &\langle \omega | s \rangle\end{bmatrix} (U_sU_\omega )^r \begin{pmatrix} 0 \\ 1\end{pmatrix}\right|^2$$ How does one derive this expression? Shouldn't it just have been $|\langle \omega |(U_sU_\omega)^r|s\rangle|^2$?
2 Answers
The mapping scheme set up by the author in this section is different from that of normal unitary operations. In the context of this equation, the author is referring to the (non-unitary) action of $U_s U_\omega$ on a particular two dimensional subspace of the full Hilbert space, specifically on the plane defined by the (non-orthogonal, non-complete) basis vectors $\lbrace \vert \omega \rangle, \vert s \rangle \rbrace$.
The action of $U_s U_\omega$ in this basis is shown to be $$U_s U_\omega = \begin{bmatrix} 1 & \frac{2}{\sqrt{N}} \\ \frac{-2}{\sqrt{N}} & 1- \frac{4}{N} \end{bmatrix},$$ which is clearly not unitary. $U_s U_\omega$ refers to this specific action in both the referenced equation and the following discussion. In this context, $U_s U_\omega$ is an element of $SL(2, \mathbb{R})$, and the system is set up geometrically such that $U_s U_\omega$ effects rotations within the plane spanned by $\lbrace \vert \omega \rangle, \vert s \rangle \rbrace$.
The mapping for the action of $U_s U_\omega$ is given by $$U_s U_\omega : a \vert \omega \rangle + b \vert s \rangle \mapsto \begin{bmatrix} \vert \omega \rangle & \vert s \rangle \end{bmatrix} U_s U_\omega \begin{bmatrix} a \\ b \end{bmatrix},$$ where $\begin{bmatrix} \vert \omega \rangle & \vert s \rangle \end{bmatrix}$ is an $N \times 2$ matrix. The author uses this mapping to rotate $\vert s \rangle$ into $\vert \tilde s \rangle$ (the latter designation introduced here for clarity) according to $$\vert s \rangle = 0 \vert \omega \rangle + 1 \vert s \rangle \xrightarrow{(U_s U_\omega)^r} \begin{bmatrix} \vert \omega \rangle & \vert s \rangle \end{bmatrix} (U_s U_\omega)^r \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \vert \tilde s \rangle, $$ where the transformed vector, $\vert \tilde s \rangle$, is $\vert s \rangle$ rotated by $(U_s U_\omega)^r$ within the desired plane. The inner product of the rotated vector $\vert \tilde s \rangle$ with the dual vector of the marked state $\vert \omega \rangle$ is evaluated, using the Jordan normal form provided in the article, as $$\langle \omega \vert \tilde s \rangle = \begin{bmatrix} \langle \omega \vert \omega \rangle & \langle \omega \vert s \rangle \end{bmatrix} U_s U_\omega \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$ $$ = \begin{bmatrix} 1 & \frac{1}{\sqrt{N}} \end{bmatrix} \begin{bmatrix} -i & i \\ e^{i t} & e^{-i t} \end{bmatrix} \begin{bmatrix} e^{i \, 2rt} & 0 \\ 0 & e^{-i \, 2rt} \end{bmatrix} \left( \tfrac{i}{2 \cos{t}} \right) \begin{bmatrix} e^{-i t} & -i \\ -e^{i t} & -i \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix},$$ where $\langle \omega \vert s \rangle = \tfrac{1}{\sqrt{N}}$ from earlier in the article and $t \equiv \arcsin \tfrac{1}{\sqrt{N}}$. Carrying through the multiplication leads to $$\langle \omega \vert \tilde s \rangle = \tfrac{1}{2\cos t} \left(\tfrac{1}{\sqrt{N}}(e^{i(2rt+t)}+e^{-i(2rt+t)}) - i (e^{i\,2rt}-e^{-i\,2rt}) \right)$$ $$=\tfrac{1}{\cos t} \left(\sin t \cos(2rt+t) + \sin 2rt \right).$$ Using the angle sum identity for $\cos(2rt+t)$ gives $$\langle \omega \vert \tilde s \rangle = \tfrac{1}{\cos t} \left( \sin t (\cos t \cos 2rt - \sin t \sin 2rt ) + \sin 2rt \right)$$ $$=\sin t \cos 2rt + \sin 2rt \left( \tfrac{1-\sin{}^2t}{\cos t} \right) = \sin ((2r+1)t),$$ which is the identity the author was working towards. The probability of observing $\omega$ after $r$ iterations is then simply $$\vert \langle \omega \vert \tilde s \rangle \vert^2 = \sin{}^2((2r+1)t),$$ as desired. Note, however, that in this case the norm on the LHS could be replaced by parentheses, since the derivation above shows that $\langle \omega \vert \tilde s \rangle$ is guaranteed to be real. This is not surprising since the operator in the relevant plane is in $SL(2,\mathbb{R})$.
While not directly responsive to the question, there is some interesting group/representation theory underpinning the math above. Without going into detail, it seems worth noting that $\vert \text{Tr}(U_s U_\omega) \vert <2$ implies that $U_s U_\omega$ is elliptic and therefore conjugate to a rotation, and that $SL(2,\mathbb{R}) \cong SU(1,1) \cong Sp(2,\mathbb{R})$.
- 3,452
- 9
- 20
Another answer already addressed how the specific expression in the OP means and how to derive it. Here I'll show another way to derive the same result (very similar to the derivation in this other answer on another question, though with different notation).
Using the notation $\mathbb P_\psi\equiv|\psi\rangle\!\langle\psi|$, we have $U_s\equiv 2\mathbb P_s-I$ and $U_\omega\equiv I-2\mathbb P_\omega$, and thus $$U_s U_\omega=-I+2(\mathbb P_s+\mathbb P_\omega)-4\mathbb P_s\mathbb P_\omega.\tag{AA}$$
Consider the decomposition of $|s\rangle$ in a basis containing $|\omega\rangle$: $$|s\rangle=\sin\alpha |\omega\rangle+\cos\alpha e^{i\phi}|\omega_\perp\rangle,\tag{A}$$ for $\alpha,\phi\in\mathbb R$, and with $|\omega_\perp\rangle$ a state orthogonal to $|\omega\rangle$ ${}^{(1)}$. The state $|\omega_\perp\rangle$ here is uniquely determined by $|s\rangle,|\omega\rangle$, and the orthogonality requirement. Expanding the various elements of (AA) we then have $$I=\mathbb P_\omega + \mathbb P_{\omega_\perp} + \mathbb P_\perp,$$ $$\mathbb P_s=\sin^2(\alpha)\,\mathbb P_\omega + \cos^2(\alpha)\mathbb P_{\omega_\perp} + \sin\alpha\cos\alpha(e^{-i\phi}|\omega\rangle\!\langle\omega_\perp|+e^{i\phi}|\omega_\perp\rangle\!\langle\omega|),$$ $$\mathbb P_s\mathbb P_\omega=\sin^2(\alpha) \mathbb P_\omega + \sin\alpha\cos\alpha e^{i\phi}|\omega_\perp\rangle\!\langle\omega|,$$ where we defined $\mathbb P_\perp$ as the projector onto the space orthogonal to $|\omega\rangle$ and $|\omega_\perp\rangle$. Putting everything together we get $$U_s U_\omega = -\mathbb P_\perp + \cos(2\alpha)(\mathbb P_\omega+\mathbb P_{\omega_\perp}) +\sin(2\alpha)(e^{-i\phi}|\omega\rangle\!\langle\omega_\perp|-e^{i\phi}|\omega_\perp\rangle\!\langle\omega|).\tag B$$ This is telling us that $U_s U_\omega$ is really a rotation matrix in $\mathrm{span}(|\omega\rangle,|\omega_\perp\rangle\}$ (which we should have expected from it being a product of two reflections), and acts trivially (modulo minus sign) on the rest of the space.
If we assume to be working on states in $\mathrm{span}(|\omega\rangle,|\omega_\perp\rangle\}$ (which contains $|s\rangle$ by definition of $|\omega_\perp\rangle$), we can then ignore the $\mathbb P_\perp$ term. We can furthermore ease the notation by relabeling $|\omega\rangle\to|0\rangle$ and $|\omega_\perp\rangle\to|1\rangle$. Then, $$U_s U_\omega = \begin{pmatrix}\cos(2\alpha) & e^{-i\phi}\sin(2\alpha) \\ -e^{i\phi}\sin(2\alpha) & \cos(2\alpha)\end{pmatrix}\in\mathrm{SU}(2)\tag C$$ Note that all of this, going from (AA) to (C), doesn't have that much to do with Grover's algorithm or anything quantum: we are simply showing how the product of two reflections with respect to vectors with internal angle $\alpha$, amounts to a rotation by an angle of $2\alpha$.
The conclusion is now at hand: $U_s U_\omega$ is a rotation matrix in this space. From this observation, it is not hard to see that the matrix powers can be written as: $$(U_s U_\omega)^r = \begin{pmatrix}\cos(2r\alpha) & e^{-i\phi}\sin(2r\alpha) \\ -e^{i\phi}\sin(2r\alpha) & \cos(2r\alpha)\end{pmatrix}.$$ Finally, $$(U_s U_\omega)^r \begin{pmatrix}\sin\alpha \\ e^{i\phi}\cos\alpha\end{pmatrix} = \begin{pmatrix}\sin((2r+1)\alpha) \\ e^{i\phi}\cos((2r+1)\alpha)\end{pmatrix}.$$ The final probability after $r$ steps is therefore $\sin^2((2r+1)\alpha)$.
An interesting observation is that at no point in all of this we are using the fact that the dynamic happens in a high-dimensional space. Indeed, we could have identified $|\omega\rangle\sim|0\rangle$ and $|\omega_\perp\rangle\sim|1\rangle$ from the beginning and thought in terms of the evolution of a single qubit, and no generality would have been lost. In other words, Grover's algorithm really only involves an evolution in an effective two-dimensional space. You can then visualise what's going on in the Bloch sphere, and if you do that you find the state evolving along the geodesic path connecting initial and target point (which can also be understood as due to the fact that Grover's algorithm is indeed nothing but a discretisation of such geodesic path).
See also this answer on math.SE for a different way to prove this result.
${}^{(1)}$ I could write this decomposition more explicitly, being $|s\rangle$ in the OP a specific state (the balanced superposition over all states). However, I don't want to use properties of this particular $|s\rangle$, in order to make the result hold equally for any choice of $|s\rangle$.
- 27,510
- 7
- 37
- 125