I understand that there is a constructive proof that arbitrary gates can be approximated by a finite universal gate set, which is the Solovay–Kitaev Theorem.
However, the approximation introduces an error, which would spread and accumulate in a long computation. This would presumably scale badly with the length of the calculation? Possibly one might apply the approximation algorithm to the complete circuit as a whole, not to a single gate. But how does this scale with the length of the computation (i.e. how does the approximation scale with the dimension of the gates)? How does the gate approximation relate to gate synthesis? Because I could imagine that this affects the final length of the computation?
Even more disturbing to me: What happens if the length of the calculation is not known at the time when the gate sequence is compiled?
 
    
    - 17,945
- 8
- 50
- 112
 
    
    - 2,457
- 17
- 40
1 Answers
Throughout this answer, the norm of a matrix $A$, $\left\lVert A\right\rVert$ will be taken to be the spectral norm of $A$ (that is, the largest singular value of $A$). The solovay-Kitaev theorem states that approximating a gate to within an error $\epsilon$ requires $$\mathcal O\left(\log^c\frac 1\epsilon\right)$$ gates, for $c<4$ in any fixed number of dimensions.
For the first part:
the approximation introduces an error, which would spread and accumulate in a long computation
Well, it can be shown by induction that errors accumulating through using one matrix to approximate another are subadditive (see e.g. Andrew Child's lecture notes). That is, for unitary matrices $U_i$ and $V_i$, $\left\lVert U_i - V_i\right\rVert < \epsilon\,\forall\, i \in \left\lbrace1, 2, \ldots, t\right \rbrace\implies \left\lVert U_t\ldots U_2U_1 - V_t\ldots V_2V_1\right\rVert \leq t\epsilon$.
What this means in terms of implementation is that, for an overall error no more than $\epsilon$ to be achieved, each gate needs to be approximated to within $\epsilon/t$, or
applying the approximation to the circuit as a whole
is the same as applying the approximation to each individual gate, each with an individual error no more than that of the entire circuit divided by the number of gates that you're approximating.
In terms of gate synthesis, The algorithm is performed by taking products of the gate set $\Gamma$ to form a new gate set $\Gamma_0$ which forms an $\epsilon^2$ net for $\operatorname{SU}\left(d\right)$ (for any $A \in \operatorname{SU}\left(d\right),\, \exists U\in\Gamma_0\, s.t. \left\lVert A-U\right\rVert\leq\epsilon^2$). Starting from identity, a new unitary is recursively found from the new gate set in order to get a tighter net round the target unitary. Oddly enough, the time for a classical algorithm to perform this operation is also $\mathcal O\left(\mathit{poly} \log 1/\epsilon\right)$, which is sub-polynomial time. However, as per Harrow, Recht, Chuang, in $d$-dimensions, as a ball of radius $\epsilon$ around $\operatorname{SU}\left(d\right)$ has a volume $\propto \epsilon^{d^2-1}$, this scales exponentially in $d^2$ for a non-fixed number of dimensions.
This does have an affect on the final computation time. However, as the scaling in both number of gates and classical computational complexity is sub-polynomial, this doesn't change the complexity class of any algorithm, at least for the commonly considered classes.
For $t$ gates, the overall (time and gate) complexity is then $$\mathcal O\left(t\, \mathit{poly} \log \frac t\epsilon\right)$$.
When using the unitary circuit model without intermediary measurements, the number of gates to be implemented will always be known prior to the computation. However, it is feasible to assume this isn't the case when intermediary measurements are used, so when then number of gates that you want to approximate is unknown, this is saying that $t$ is unknown. and if you don't know what $t$ is, you obviously can't approximate each gate to an error $\epsilon/t$. If you know a bound on the number of gates (say, $t_{\text{max}}$), then you could approximate each gate to within $\epsilon/t_{\text{max}}$ to get an overall error $\leq\epsilon$ and complexity $$\mathcal O\left(t\, \mathit{poly} \log \frac {t_{\text{max}}}{\epsilon}\right),$$ although if no upper bound on the number of gates is known, then each gate would be approximated to some (smaller) $\epsilon'$, giving an overall error $\leq t'\epsilon$ for the resulting number of implemented gates (which is unknown at the start) $t'$, with an overall complexity of $$\mathcal O\left(t'\, \mathit{poly} \log \frac {1}{\epsilon'}\right).$$
Of course, the total error of this is still unbounded, so one simple1 way of keeping the error bounded would be to reduce the error each time by a factor of, say, $2$, so that the $n^{th}$ gate would be implemented with error $\epsilon/2^n$. The complexity would then be $$\mathcal O\left(\mathit{poly} \log \frac {2^n}{\epsilon'}\right) = \mathcal O\left(\mathit{poly}\, n\log \frac {1}{\epsilon'}\right),$$ giving an overall (now polynomial) complexity $$\mathcal O\left(\mathit{poly}\, t \log \frac {1}{\epsilon}\right),$$ although this does have the advantage of guaranteeing a bounded error.
This isn't too bad, so I would hope that (when the number of gates is unknown) classical computers would be able to keep coming up with the correct gates at least as fast as a quantum processor would need them. If not currently, then hopefully once quantum processors become good enough that this actually becomes a problem!
1 Although, likely not the most efficient
 
    
    - 3,796
- 3
- 24
- 45