Can I efficiently decompose this flexible unitary matrix into 1 and 2 qubit gates?

Question

Consider the unitary matrix $A \in \mathbb{R}^{n^2\times n^2}$ which has only the first $n$ rows explicitly defined, with the remaining rows having some flexibility. $A$ can be written in block form as $ A= \left( {\begin{array}{cccc} B\\ C\\ \end{array} } \right) $ , where $B\in\mathbb{R}^{n\times n^2}$ and $C\in\mathbb{R}^{(n^2-n)\times n^2}$. The elements of $B$ are given by,

$ B_{ij}= \begin{cases} 1 & , \;\; i<n-1, \space j=i(n+1)+1 \\ 1 & , \;\; i=n-1, \space j=n^2-n \\ 0 & , \;\; \text{else} \end{cases} $

There are two restrictions on $C$, that are: 1) $C$ is the unitary complement to $B$ so $A$ is always unitary, and 2) the nonzero elements of $C$ cannot share a column with the nonzero elements of $B$.

Since this is a bit of an unusual matrix, here is an example for $n=4$. Note that $A$ is not unique because even though $B$ is fixed, there are many choices for $C$.

$ A = \left( {\begin{array}{cccccccccccccccc} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\ \hline 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1\\ \end{array} } \right) $

Where $B$ and $C$ are separated by the horizontal line.

My question is, given the definition of $B$ is there a choice for $C$ such that $A$ can be decomposed into $\mathcal{O} (poly(log(n^2)))$ one and two qubit gates?

My attempt at a solution was to force $A$ to be a permutation matrix as in the $n=4$ example above. This would allow me to decompose my matrix using only $SWAP$ gates. However, it appears that this is not efficient in general following this SE thread.

Craig Gidney · Answer 1 · 2024-10-24T04:11:59.330

If you look at what happens inside a specified row k, this matrix is moving the 1 from column k to column k*n+1. Then it does something special in row n-1. So this is an inverse-permutation corresponding to a multiplication, then an increment, then a fixup exchanging two states.

All those operations can be implemented in O(log^2 n) gates. In fact you can get down to O(log(n) log(log(n)) log(log(log(n)))) by using Schönhage–Strassen multiplication, though that would have bad constant factors.

Basically: the hard part of your problem is multiplication. Look up how to do modular multiplication, pick a prime P larger than n^2, and then look at the permutation you get by performing an inplace modular multiplication by the multiplicative inverse of n mod P. You'll see it has this each-row-advances-n-steps property you're looking for, which you can then tweak into a full solution.

score 0 · Accepted Answer · answered Nov 08 '24 at 04:54

I am creating a new answer to add some detail to Craig Gidney's helpful response. As they mentioned, there are three basic steps:

advance each row $n+1$ steps,
fixup the $i=n-1$ row,
increment all rows by $1$.

Let $Q=\text{log}(n^2)$ be the number of qubits where $n=\{4,8,16...\}$ to make $Q$ an integer.

For the first step, we can use inline multiplication to advance each row by $n+1$ steps when $n$ is even. Given the constraints of this problem, I can do this in $Q-1$ CNOT gates. For example, here are the circuits for n=4 and for n=8.

For the second step, we fixup the $i=n-1$ row by exchanging the $n^2-1$ column with the $n^2-n$ column. This is a two-level unitary permutation matrix and can be performed in at most $\mathcal{O}(Q^2)$ single qubit and CNOT gates following Nielsen and Chuang 2010 edition page 193. I don't have a general construction for this permutation matrix.

Finally, I use the most basic construction to increment each row by $1$ which requires $\mathcal{O}(Q)$ multi-controlled not gates to implement. For example, here is the increment circuit for the $Q=4$ qubit case.

Bringing everything together we have this circuit for the $n=4$ case. The first $n$ rows of my circuit match the matrix in my original question. The following $n^2-n$ rows do not match, but as I mentioned in the question, they can have any form as long as the full matrix is unitary. Though not a fully complete complexity analysis, the final complexity for the decomposition is: $\; (Q-1$ CNOT gates) + ($\mathcal{O}(Q^2)$ single qubit and CNOT gates) + ($\mathcal{O}(Q)$ multi-controlled NOT gates).

Can I efficiently decompose this flexible unitary matrix into 1 and 2 qubit gates?

2 Answers2