
As also discussed in (How does the induction step in the Grover-Rudolph scheme to prepare superpositions from probabilities work? and How does the uncomputation step work in the Grover-Rudolph scheme to prepare $\sum_i\sqrt{p_i}|i\rangle$?), the Grover-Rudolph scheme (https://arxiv.org/abs/quant-ph/0208112) is used to prepare an $n$-qubit superposition $\sum_{i=0}^{2^n-1} \sqrt{p_i}|i\rangle$ via an iterative procedure.

Each step of the procedure involves taking a superposition (possibly obtained from the previous iteration) of the form $\sum_{i=0}^{2^m-1}\sqrt{p_i'}|i\rangle$ for some $0\le m<n$, with $p_i'$ denoting some "coarse-graining" of the final target distribution, and implementing an evolution such that each $\sqrt{p_i'} |i\rangle$ evolves into some $|i\rangle(\sqrt{p_i''}|0\rangle+\sqrt{1-p_i''}|1\rangle)$ for a suitable "refinement" $p_i''$. See the links above for an example of what this means in practice for $n=2$.
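
For instance (a sketch of my own, not from the paper), the sequence of coarse-grained distributions that the iteration walks through can be computed classically by summing dyadic blocks of the target $p_i$:

```python
import numpy as np

# Hypothetical target distribution on n = 3 qubits (any normalized p_i works).
p = np.array([0.05, 0.10, 0.15, 0.20, 0.20, 0.15, 0.10, 0.05])
n = 3

# Coarse-graining: at iteration m the register holds sqrt(p'_i), where
# p'_i is the sum of p over the i-th block of length 2^(n-m).
for m in range(n + 1):
    coarse = p.reshape(2**m, 2**(n - m)).sum(axis=1)
    print(f"m = {m}:", np.round(coarse, 3))
```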

Each refinement step works by adding an ancilla register containing an angle $\theta_i$, then performing a controlled rotation conditioned on this angle, and finally uncomputing the ancilla holding $\theta_i$.

The main claim of the paper is that such superpositions can be prepared efficiently as long as the underlying distribution is "efficiently integrable". This requirement comes, as far as I can tell, from the step where we write the angles $\theta_i$ into the ancilla. These angles are in practice built out of sums of probabilities, hence the concern with "integrability". More precisely, following the paper's notation, $\theta_i=\arccos\sqrt{f(i)}$ where $$f(i) = \frac{\int_{x^i_L}^{\frac{x^i_L+x^i_R}{2}} dx \,p(x)}{\int_{x^i_L}^{x^i_R} dx \,p(x)}.$$ Without delving into what the symbols in this last expression mean exactly, the gist (as I understand it) is that $f(i)$ is the ratio between the sum of a subset of probabilities and the sum of a larger set of probabilities. For example, if we want to go from $\sqrt{p_0+p_1}|0\rangle+\sqrt{p_2+p_3}|1\rangle$ to $\sum_{i=0}^3 \sqrt{p_i}|i\rangle$, we'll need $f(0)=p_0/(p_0+p_1)$ and $f(1)=p_2/(p_2+p_3)$. These are just the quantities that determine the angles needed to get the correct amplitudes in the next iteration.
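
For the fully discrete case, here is a minimal sketch of mine (with hypothetical numbers) that makes these ratios and the corresponding angles explicit:

```python
import numpy as np

# Hypothetical target probabilities p_0, ..., p_3.
p = np.array([0.1, 0.2, 0.3, 0.4])

# Going from sqrt(p0+p1)|0> + sqrt(p2+p3)|1> to the n = 2 superposition:
# f(i) is the probability of the left half of block i, conditioned on block i.
for i in range(2):
    block = p[2 * i: 2 * i + 2]
    f_i = block[0] / block.sum()
    theta_i = np.arccos(np.sqrt(f_i))
    print(f"f({i}) = {f_i:.4f}, theta_{i} = {theta_i:.4f}")
```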

What are example situations where we cannot efficiently prepare a given target distribution? I have a hard time thinking of examples of a "not efficiently integrable" distribution: here we're talking about discrete distributions, so "integrability" just amounts to summing subsets of probabilities, and given that one presumably knows the probabilities $p_i$, how could you fail to compute their sums efficiently?

From the paper's introduction, I'm guessing the authors might have been thinking about a different type of problem, where "efficient integrability" is an issue, or perhaps one where you don't know the probabilities $p_i$ a priori because they come from coarse-graining a continuous distribution. But I don't know enough about this context to come up with an explicit example of such a situation.

glS

1 Answer


In their abstract they write

We give a simple and efficient process for generating a quantum superposition of states which form a discrete approximation of any efficiently integrable (such as log concave) probability density functions.

They require the probability density function $p(x)$ to be efficiently integrable. If you have already discretized it, i.e., already integrated the probability density over each interval to obtain the $p_i$ with $i \in \{0, 1, \ldots, 2^n-1\}$, then you can use the Grover-Rudolph algorithm.
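
For concreteness, here is a rough NumPy sketch (my own, not code from the paper) of the iteration when the discretized $p_i$ are already available; every partial sum needed for the angles is then a cheap classical computation:

```python
import numpy as np

def grover_rudolph_amplitudes(p):
    """Iteratively build the amplitudes sqrt(p_i), assuming the block sums
    of p (the discrete 'integrals') can be computed on demand."""
    n = int(np.log2(len(p)))
    state = np.array([1.0])                       # m = 0: a single amplitude
    for m in range(n):
        new_state = np.empty(2 ** (m + 1))
        for i, amp in enumerate(state):
            block = p[i * 2 ** (n - m): (i + 1) * 2 ** (n - m)]
            left, total = block[: len(block) // 2].sum(), block.sum()
            f_i = left / total if total > 0 else 0.0
            theta_i = np.arccos(np.sqrt(f_i))
            # The controlled rotation sends amp|i> to
            # amp*(cos(theta_i)|i,0> + sin(theta_i)|i,1>).
            new_state[2 * i] = amp * np.cos(theta_i)
            new_state[2 * i + 1] = amp * np.sin(theta_i)
        state = new_state
    return state

p = np.array([0.05, 0.10, 0.15, 0.20, 0.20, 0.15, 0.10, 0.05])
print(np.allclose(grover_rudolph_amplitudes(p), np.sqrt(p)))   # True
```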

However, the problem lies in calculating the probabilities $p_i$, or in calculating the function

$$ f(i)= \frac{\int_{x^i_L}^{\frac{x^i_L+x^i_R}{2}}dx~ p(x)}{\int_{x^i_L}^{x^i_R}dx~ p(x)}, $$ which would be easy if you already had $p(x)$ discretized as the $p_i$.
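
When instead you only have the continuous density $p(x)$, each $f(i)$ requires two integrals; for a well-behaved (e.g. log-concave) density such as a Gaussian this is easy numerically, as in this sketch of mine using scipy.integrate.quad:

```python
import numpy as np
from scipy.integrate import quad

# Hypothetical example: standard normal density, which is efficiently integrable.
def p(x):
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def f(x_left, x_right):
    """Probability of the left half of [x_left, x_right], conditioned on the interval."""
    mid = (x_left + x_right) / 2
    num, _ = quad(p, x_left, mid)
    den, _ = quad(p, x_left, x_right)
    return num / den

# f for the interval [-1, 1] at some iteration of the scheme:
print(f(-1.0, 1.0))   # 0.5 by symmetry
```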

So here you need to integrate $p(x)$ over a given interval, and that can be classically difficult, expensive, or outright intractable. There are many densities whose integrals are not efficiently computable classically; otherwise there would be no need for methods such as Monte Carlo integration.

An example is the Boltzmann distribution of a system composed of $N$ particles, each of which can be spin up or spin down. This is given by

$$ p_B(x_1, x_2, \ldots, x_N) = \frac{e^{-E(x_1, x_2, \ldots, x_N)}}{Z}, $$

where $Z = \int e^{-E(x_1, x_2, \ldots, x_N)}\, dx_1\, dx_2 \cdots dx_N$ is the partition function. Computing the partition function $Z$ is in general intractable.

To prepare this distribution you could proceed by encoding one particle at a time (suppose each has state 0 or 1), but already at the first step you run into trouble: you need the marginal probability of that particle being in state 0 or 1, which means summing (integrating) over all the other particles, and that in general takes exponentially long.
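
To make the exponential cost explicit, here is a brute-force sketch (my own, with a hypothetical Ising-chain energy and $\pm 1$ spins for convenience) of the marginal probability of a single spin; it has to sum over all $2^{N-1}$ configurations of the remaining spins:

```python
import itertools
import numpy as np

# Hypothetical energy: nearest-neighbour Ising chain with unit couplings.
def energy(spins):
    return -sum(s1 * s2 for s1, s2 in zip(spins, spins[1:]))

def marginal_first_spin(N, value):
    """P(x_1 = value) for the Boltzmann distribution e^{-E}/Z.
    Brute force: sums over all 2^(N-1) configurations of the other spins."""
    weights = {+1: 0.0, -1: 0.0}
    for rest in itertools.product([+1, -1], repeat=N - 1):
        for first in (+1, -1):
            weights[first] += np.exp(-energy((first,) + rest))
    Z = weights[+1] + weights[-1]
    return weights[value] / Z

print(marginal_first_spin(N=10, value=+1))   # already 2^9 terms per spin value
```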

Oskar