11

I am trying to understand Grover's algorithm. I read that this algorithm is able to find an entry in an unsorted list in just $\sqrt N$ steps, and needs only $\log N$ space.

I understand entanglement and superposition, and I also understand most parts of Shor's Algorithm. When it comes to Grover's algorithm, I often read that this algorithm uses an oracle. But as far as I understand, oracles do not really exist. They are used to describe hypothetical nondeterministic machines. But quantum computers do exist. The algorithm running on a real quantum computer will not really use a clairvoyant ghost who knows things nobody can know at this time.

So, how does it work?

I do not want to read scientific papers that deal with all the details. I just want a brief but correct overview.

Let's use an example: All 7 books about Harry Potter contain approximately 1 million words ($N = 1.000.000$). I want to know if the word "teapot" exists in one of the Harry Potter books, and if so, I want to get the position of at least one occurrence within the text.

If the word doesn't exist in the text, it will take me 1 million comparisons on a normal computer, i.e. 1 million steps. Grover will need only $\sqrt N = 1000$ steps to search through a list of $N = 1.000.000$ unsorted words. How?


Addendum

(reaction to comments)

I am not interested in an explanation about how the oracle works. I do not believe in fairies, elves or oracles. I believe in mathematics and physics. Oracles came into computer science when someone wanted to explain how nondeterministic machines like nondeterministic pushdown automatons will work. But nondeterministic machines do not exist in the real world. Even quantum computers aren't nondeterministic. So, there is no need to use fantasy beings with superpowers to explain quantum computers, since quantum computers do exist while wizards, magicians and oracles don't.

What I want is an explanation of grover's algorithm that uses only things that really exist in reality, i.e. quantum bits, superposition, entanglement, unitary operators etc. plus all the classic parts that still will be needed to sum up to the full algorithm.

And please use my Harry Potter example. If it's easier for you, let's assume that the word "teapot" exists exactly one times in the whole story, so that we know that there is exactly 1 needle in the haystack.

One thing that I do not understand, is this: How can the algorithm read all seven harry potter books (the complete haystack) and tell where exactly the word "teapot" is written using a number of steps that is proportional to the square root of the input length? How can the algorithm even read an input of $N$ items in a time proportional to $\sqrt N$?


Addendum II

I have learned form Norbert's Answer that Grover's Algorithm is unable to search for items in a database (like words in a list of words). So, what can it be used for? Norbert suggested, that Grover's Algorithm might give a square-root speed up for brute-force algorithms used to solve NP-Problems.

Traveling Salesmen is a problem in NP. How can you use Grover's Algorithm to find the shortest path?

If Traveling Salesman is not feasible, please use any other NP problem as an example, and please explain, how Grover's Algorithm can be used to solve it.

I still want to have a brief explanation of Grover's Algorithm (i.e. Grover's algorithm in a nutshell) but didn't get a sufficient answer yet.

4 Answers4

5

I will attempt here to address the question of how Grover's algorithm would work, for the specific task of searching a target entry (let's say a word) in a database of such entries (shall we call it a book?). There have already been plenty of discussions about the general structure of the algorithm (e.g. Is there a layman's explanation for why Grover's algorithm works?, How does the Grover diffusion operator work and why is it optimal?, Why does Grover's search invert about the mean?), as well as how to use it to speed-up finding solutions to NP problems (e.g. here you can find an explicit circuit to apply Grover to a specific 3SAT problem, and here is a breakdown of the steps to apply Grover to a generic SAT problem, and here is a discussion on how to make a SAT problem into an oracle), so I won't focus on these points here.

The specific application of Grover to database searching has also been discussed e.g. in (Grover's algorithm: where is the list?), (How is the Grover-Algorithm applied to a database?), (Does the oracle in Grover's algorithm need to contain information about the entirety of the database?) and (Grover's algorithm: what to input to Oracle?). This answer will thus inevitably overlap with bits and pieces scattered around between those various posts.


Setup: the classical problem

Consider the following classical problem. We have an indexed list of $N$ words. It is thus easy to retrieve a target word $w_x$ given its index $x=1,...,N$. We want to find the index $x_0$ such that $w_0\equiv w_{x_0}$ is some fixed target word. Classically, being the database unsorted, there isn't much to do other than just try each $x$ until $w_x$ is found to be equal to $w_0$.

NOTE: Here, $x$ represents the index of an entry $w_x$. This means that what is usually denoted with $x$ when discussing Grover's algorithm, becomes here $w_x$ instead.

Querying the database

Going quantum, the first thing we need to do is figure out how to handle the database. Grover's algorithm requires as input a superposition state of the form $$|D\rangle=\frac{1}{\sqrt N}\sum_x |x,w_x\rangle\in\mathcal H_{\text{index}}\otimes \mathcal H_{\text{entry}},\tag A$$ where $|x\rangle$ is the index of the corresponding "word" entry $|w_x\rangle$.

Note that generating (A) from a corresponding classical database is highly nontrivial, as it amounts to being able to query the (usually classical) database as if it was a quantum state. One possibility that has been put forward is to use a "QRAM" with input $|+\rangle\equiv\sum_x |x\rangle$ to "load" the database into a superposition (see e.g. this answer), but I don't think the actual feasibility of this is fully understood, as of yet.

Anyhow, for this answer, let's assume that this problem has somehow been solved. We can thus use a query operator $\mathcal Q$ which implements the evolution $$\mathcal Q|x,0\rangle\mapsto|x,w_x\rangle,$$ so that $\mathcal Q|+,0\rangle=|D\rangle$.

We can equivalently write $|D\rangle$ as $$|D\rangle=\sin\alpha|x_0,w_0\rangle+\cos\alpha|X_\perp\rangle,$$ for some $\alpha\in\mathbb R$ (where $\alpha\equiv\arcsin(1/\sqrt N)$), where $|x_0,w_0\rangle$ is the target index/word pair, and $|X_\perp\rangle$ is everything else. We don't need to worry about what $|X_\perp\rangle$ actually looks like, it is enough to know that it is orthogonal to $|x_0,w_0\rangle$.

Apply the oracle

The next step is to apply the oracle operation. In this case this is pretty straightforward: we want to implement the mapping $\mathrm{Orac}_{w_0}:|w\rangle\mapsto(-1)^{\delta_{w=w_0}}|w\rangle$ on the $\mathcal H_{\text{entry}}$ register. We can obtain this operation, for example, by using a pair of additional ancilla registers, and implementing $\mathrm{Orac}_{w_0}$ as the reversible version of the classical circuit implementing the mapping $$\mathrm{Orac}_{w_0}(|w\rangle\otimes|w_0\rangle\otimes |s\rangle) \mapsto |w\rangle\otimes|w_0\rangle\otimes|s\oplus \delta_{w=w_0}\rangle.$$ Then, using as input $|s\rangle=|-\rangle\equiv\frac{1}{\sqrt2}(|0\rangle-|1\rangle)$ gives $$\mathrm{Orac}_{w_0}(|w\rangle\otimes|w_0\rangle\otimes |-\rangle) =(-1)^{\delta_{w=w_0}} (|w\rangle\otimes|w_0\rangle\otimes|-\rangle),$$ so that we can just write $\mathrm{Orac}_{w_0}|w\rangle=(-1)^{\delta_{w=w_0}}|w\rangle$ without worrying about the extra registers. Equivalently, we can write $\mathrm{Orac}_{w_0}=I-2\mathbb P(|w_0\rangle)$ (using the notation $\mathbb P(|\psi\rangle)\equiv|\psi\rangle\!\langle\psi\rvert$).

At this stage, the evolving state is thus $|\Psi_1\rangle\equiv N^{-1/2}\sum_x(-1)^{\delta_{w_x,w_0}}|x,w_x\rangle,$ or equivalently, $$|\Psi_1\rangle=-\sin\alpha|x_0,w_0\rangle+\cos\alpha|X_\perp\rangle.$$ Something that will be useful in the next step is that we can also write $$|\Psi_1\rangle=\cos(2\alpha)|D\rangle+\sin(2\alpha)|D_\perp\rangle,$$ where $|D_\perp\rangle\equiv -\cos(\alpha)|x_0,w_0\rangle + \sin(\alpha)|X_\perp\rangle$.

Diffusion step

The next step is to apply the "diffusion operator" $R_D\equiv 2|D\rangle\!\langle D|-I$ in the space $\mathcal H_{\text{index}}\otimes\mathcal H_{\text{entry}}$. This operator is such that $R_D|D\rangle=|D\rangle$, but $R_D=-1$ on the rest of the space. Equivalently, $R_D|X\rangle=(-1)^{1-\delta_{X,D}}|X\rangle$.

Doing this is actually a bit tricky, as it requires knowledge of $|D\rangle$, and therefore of the database. The only way I can think of to do it is using $\mathcal Q$ and $\mathcal Q^\dagger$. More specifically, we need $R_D=\mathcal QH^{(1)}\mathcal C^{(2)} H^{(1)}\mathcal Q^\dagger$, where $H^{(1)}$ is the Hadamard on the index register, and $\mathcal C$ denotes the reversible version of a classical circuit mantaining the sign of $|\boldsymbol 0,\boldsymbol 0\rangle$ and changing the sign of the rest of the basis states. Indeed, consider the action of such $R_D$ on an input state:

  1. If the input is $|D\rangle$, then we have $\mathcal C^{(2)} H^{(1)}\mathcal Q^\dagger|D\rangle=\mathcal C^{(2)} H^{(1)}|+,\boldsymbol 0\rangle=\mathcal C^{(2)} |\boldsymbol 0,\boldsymbol 0\rangle = |\boldsymbol 0,\boldsymbol 0\rangle$, and thus $R_D|D\rangle=|D\rangle$.
  2. If instead the input is some $|X\rangle$ orthogonal to $|D\rangle$, then $R_D|X\rangle=\mathcal Q H^{(2)}\mathcal C^{(1)}\sum_{Y\neq(0,0)}|Y\rangle=-|X\rangle$.

With this, we are pretty much done. Applying $R_D$ to $|\Psi_1\rangle$ gives $$|\Psi_2\rangle\equiv R_D|\Psi_1\rangle= \cos(2\alpha)|D\rangle-\sin(2\alpha)|D_\perp\rangle =\sin(3\alpha)|x_0,w_0\rangle+\cos(3\alpha)|X_\perp\rangle.$$

The rest of the algorithm then proceeds as usual: we go back and repeat oracle and diffusion steps $\mathcal O(\sqrt N)$ times before measuring. For further details about how the combined action of oracle and diffusion steps results in an effective rotation of $2\alpha$ in state space you can have a look at this other answer of mine.

glS
  • 27,510
  • 7
  • 37
  • 125
4

As this seems a recurring question, let me repeat my answer from Physics.SE:

This seems to be a common misunderstanding about Grover's algorithm. It is not about querying a magically encoded database. Rather, you have an efficiently computable function $f(x)\in\{0,1\}$ and you want to find some $x_0$ for which $f(x_0)=1$. Since you know how to realize $f(x)$ (i.e., you have a circuit), you can run $f$ on a quantum computer and use Grover to find such an $x_0$. This function can be seen as returning entries of a "database", which is encoded in a specific function, though I don't particularly like this picture.

The relevance is in the fact that a large number of interesting problems (namely, the class NP) are such that solutions might be hard to find, but they are easy to verify. Thus, Grover gives a square-root speed-up on any brute-force method to solve such a problem (i.e., any method which does not make use of any special structural property of $f$).

Differently speaking, Grover's algorithm is not, I repeat not, about searching Harry Potter books or the like. It is about speeding up finding solutions of unstructured NP problems (or problems where one does not know the structure), i.e. where the validity of a solution can be checked. This is often called a "search problem", but is has nothing to do with "databases" as we would usually think of them, and is thus not applicable to Harry Potter.

Norbert Schuch
  • 8,142
  • 1
  • 19
  • 30
2

From a software developer point of view, an Oracle is like a blackbox of function to be called. For the traveling salesman problem, if you have $n$ cities, there are up to $m=\frac{n^2-n}{2}$ direct routes between two cities, an Oracle can tell whether any subset of the $m$ routes is a valid path (visit each city exactly once) and the total path length is less than a given $x$. Even though classic method does call such naive Oracle directly, the best known algorithm requires $O(n^2 2^n)$ operations.

Now one can apply Grover‘s algorithm a few times. For the first time, classically find a path visiting each city at least once, let $x$ be the length of this path. If Grover‘s algorithm succeeds, use the length of the found path as $x$ for the next round. The total number of rounds shall be small (though I personally have not search diligently for a confirmation). Alternatively, one can at least use binary search of $x$. Since there are at most $m^n$ valid paths, we need at most $n\log{m}=O(n\log{n})$ rounds. Each round takes $O(2^\sqrt{n})$, hence total cost is $O(2^\sqrt{n} n\log{n})$, still much smaller than the classic one. Additional analysis probably will reduce the number of rounds.

czwang
  • 949
  • 1
  • 6
  • 17
1

Following is a manual step-wise computation of the 2 qubit Grover algorithm circuit for finding $|00\rangle$:

enter image description here

Step 1:

$$|00\rangle$$

Step 2:

$$\frac{1}{\sqrt2}(|0\rangle + |1\rangle)\frac{1}{\sqrt2}(|0\rangle + |1\rangle)$$ $$=\frac{1}{2}(|00\rangle + |01\rangle + |10\rangle + |11\rangle)$$

Step 3:

$$\frac{1}{2}(|11\rangle + |10\rangle + |01\rangle + |00\rangle)$$

Step 4:

$$\frac{1}{2}(|1\rangle\frac{1}{\sqrt2}(|0\rangle - |1\rangle) + |1\rangle\frac{1}{\sqrt2}(|0\rangle + |1\rangle) + |0\rangle\frac{1}{\sqrt2}(|0\rangle - |1\rangle) + |0\rangle\frac{1}{\sqrt2}(|0\rangle + |1\rangle)) $$ $$=\frac{1}{2\sqrt2}(|10\rangle - |11\rangle + |10\rangle + |11\rangle + |00\rangle - |01\rangle + |00\rangle + |01\rangle) $$

$$=\frac{1}{\sqrt2}(|00\rangle + |10\rangle)$$

Step 5:

$$\frac{1}{\sqrt2}(|00\rangle + |11\rangle)$$

Step 6:

$$\frac{1}{\sqrt2}(|0\rangle\frac{1}{\sqrt2}(|0\rangle + |1\rangle) + |1\rangle\frac{1}{\sqrt2}(|0\rangle - |1\rangle))$$

$$=\frac{1}{2}(|00\rangle + |01\rangle + |10\rangle - |11\rangle)$$

Step 7:

$$\frac{1}{2}(|11\rangle + |10\rangle + |01\rangle - |00\rangle)$$

Step 8:

$$\frac{1}{2}(\frac{1}{\sqrt2}(|0\rangle - |1\rangle)\frac{1}{\sqrt2}(|0\rangle - |1\rangle) - \frac{1}{\sqrt2}(|0\rangle - |1\rangle)\frac{1}{\sqrt2}(|0\rangle + |1\rangle) + \frac{1}{\sqrt2}(|0\rangle + |1\rangle)\frac{1}{\sqrt2}(|0\rangle - |1\rangle) - \frac{1}{\sqrt2}(|0\rangle + |1\rangle)\frac{1}{\sqrt2}(|0\rangle + |1\rangle))$$

$$=\frac{1}{4}((|0\rangle - |1\rangle)(|0\rangle - |1\rangle) + (|0\rangle - |1\rangle)(|0\rangle + |1\rangle) + (|0\rangle + |1\rangle)(|0\rangle - |1\rangle) - (|0\rangle + |1\rangle)(|0\rangle + |1\rangle))$$

$$=\frac{1}{4}(|00\rangle - |01\rangle - |10\rangle + |11\rangle + |00\rangle + |01\rangle - |10\rangle - |11\rangle + |00\rangle - |01\rangle + |10\rangle - |11\rangle - |00\rangle - |01\rangle - |10\rangle - |11\rangle)$$

$$ = \frac{1}{2}(|00\rangle - |01\rangle - |10\rangle -|11\rangle)$$

Step 9:

$$ \frac{1}{2}(|00\rangle + |01\rangle + |10\rangle -|11\rangle)$$

Step 10:

$$ \frac{1}{2}(|0\rangle\frac{1}{\sqrt2}(|0\rangle + |1\rangle) + |0\rangle\frac{1}{\sqrt2}(|0\rangle - |1\rangle) + |1\rangle\frac{1}{\sqrt2}(|0\rangle + |1\rangle) -|1\rangle\frac{1}{\sqrt2}(|0\rangle - |1\rangle))$$

$$ = \frac{1}{2\sqrt2}(|00\rangle + |01\rangle + |00\rangle - |01\rangle + |10\rangle + |11\rangle - |10\rangle + |11\rangle)$$

$$ = \frac{1}{\sqrt2}(|00\rangle + |11\rangle)$$

Step 11:

$$ \frac{1}{\sqrt2}(|00\rangle + |10\rangle)$$

Step 12:

$$ \frac{1}{\sqrt2}(|0\rangle\frac{1}{\sqrt2}(|0\rangle + |1\rangle) + |1\rangle\frac{1}{\sqrt2}(|0\rangle + |1\rangle))$$

$$=\frac{1}{2}(|00\rangle + |01\rangle + |10\rangle + |11\rangle)$$

Step 13:

$$\frac{1}{2}(\frac{1}{\sqrt2}(|0\rangle + |1\rangle)\frac{1}{\sqrt2}(|0\rangle + |1\rangle) + \frac{1}{\sqrt2}(|0\rangle + |1\rangle)\frac{1}{\sqrt2}(|0\rangle - |1\rangle) + \frac{1}{\sqrt2}(|0\rangle - |1\rangle)\frac{1}{\sqrt2}(|0\rangle + |1\rangle) + \frac{1}{\sqrt2}(|0\rangle - |1\rangle)\frac{1}{\sqrt2}(|0\rangle - |1\rangle))$$

$$=\frac{1}{4}((|0\rangle + |1\rangle)(|0\rangle + |1\rangle) + (|0\rangle + |1\rangle)(|0\rangle - |1\rangle) + (|0\rangle - |1\rangle)(|0\rangle + |1\rangle) + (|0\rangle - |1\rangle)(|0\rangle - |1\rangle))$$

$$=\frac{1}{4}(|00\rangle + |01\rangle + |10\rangle + |11\rangle + |00\rangle - |01\rangle + |10\rangle - |11\rangle + |00\rangle + |01\rangle - |10\rangle - |11\rangle + |00\rangle - |01\rangle - |10\rangle + |11\rangle)$$

$$ = |00\rangle$$

enter image description here


3-qubit single application of $|000\rangle$ inversion and amplification:

enter image description here

enter image description here

For multi-iterations, the $|000\rangle$ coefficient sign must be inverted again first each time, before applying the amplification.

enter image description here

from Wolfram demo.


4-qubit single application of $|0000\rangle$ inversion and amplification:

enter image description here

James
  • 551
  • 3
  • 11