2

I'm trying to implement a gaussian elimination solver for the linear equations that are the outcome from simon's algorithm. I implemented two solutions,

They both work fine for systems which can be reduced to rref (reduced row echelon form), where every non-zero coefficient in a row is the only non-zero.

But neither can handle simple cases like:
Given secret b = 11. Measurements in simon's algorithm: 00 and 11.

The augmented given matrix I put as a param for a solver would be:
|| 1 1 | 0 ||
|| 0 0 | 0 ||
or just (reducing to non trivial equations):
|| 1 1 | 0 ||
I would expect the non trivial solution would be: b0 = 1 and b1 = 1.

Does someone has an idea or an implementation how to handle this cases? I'm a bit confused because what we are solving here is a special version of a homogeneous system
Ax = 0
If A is non-singular, than it only has one (the trivial zero) solution. Otherwise it has an infinite number of non-trivial solutions and gaussian elimination doesn't work here. See: https://en.wikipedia.org/wiki/System_of_linear_equations#Homogeneous_systems

But the example I posted above has exactly two solutions (00 and 11). I can only attribute this to the fact that we have a special case here with modulo 2, right?

However, all articles and books I looked into don't show an implementation of how to solve this. And from my inner feeling there are two options left:

  • I'm thinking to complicated
  • Or it's not solvable with the suggested gaussian-elimination.

The latter is underlined by the fact that Google's Cirq uses svd (Singular Value Decomposition) that I don't understand yet.

enter image description here

Oli
  • 57
  • 5

1 Answers1

3

Note that the $00$ measurement is actually unusable: any string $s$ is orthogonal to $00$, so this doesn't give you a bit of informration on $s$.

Furthermore, as noted in this answer, for a secret of $n$ bits, only $n-1$ equations are to be used. Otherwise, the only solution is $s=0$ if the equations are linearly independent.

In your case, the equations are not linearly independent (because of the $00$ measurement), since $11\oplus11=00$, which explains why the true solution $11$ is still possible. Note also that the zero vector will always be a solution of the system, since it will always be orthogonal to all the measurements you've got.

Thus, after solving for $s\neq0$ using $n-1$ equations, you ought to verify the hypothesis that $s\neq0$. This is quite easy: query for $f(x)$ and $f(x\oplus s)$. If they are equal, then $s$ is necessarily different from $0$. Otherwise, it means that $s=0$.

That said, I happen to have coded an implementation in Python some time ago. Note that it is not guarantee to be the fastest possible methods, but it should work.

import numpy as np

class NotFullRankError(Exception): pass

def solve(bitstrings: list[str]) -> str: matrix = np.vstack([np.array([int(x) for x in s], dtype=bool) for s in bitstrings])

# Check if there are n - 1 equations and whether the nil vector is among them
if not matrix.any(axis=1).all() or matrix.shape[0] != matrix.shape[1] - 1:
    raise NotFullRankError()

# Reduce
for i in range(matrix.shape[0] - 1):
    mask = matrix[i:, i]
    matrix[i:] = np.vstack((matrix[i:][mask], matrix[i:][~mask]))
    matrix[i + 1:i + mask.sum()] ^= matrix[i]

    # Check if the equations were linearly independent
    if not matrix.any(axis=1).all():
        raise NotFullRankError()

# Transforming matrix from a ref to a rref
for i in range(1, matrix.shape[0]):
    index = np.where(matrix[i])[0][0]
    mask = matrix[:i, index]
    matrix[:i][mask] ^= matrix[i]

index = np.where(~np.diag(matrix))[0]

if index.shape[0]:
    index = index[0]
else:
    index = matrix.shape[1] - 1

return "".join(str(x) for x in np.hstack((matrix[:index, index], [1], matrix[index:, index])))

A quick explanation concerning the following lines:

    index = np.where(~np.diag(matrix))[0]
if index.shape[0]:
    index = index[0]
else:
    index = matrix.shape[1] - 1

return np.hstack((matrix[:index, index], [1], matrix[index:, index]))

Once the matrix is rref, two cases are possible:

  • the diagonal is full of $1$s, like: $$A=\begin{pmatrix}1&0&0&0&1\\0&1&0&0&0\\0&0&1&0&0\\0&0&0&1&1\end{pmatrix}$$
  • the diagonal "misses" a step, like: $$B=\begin{pmatrix}1&1&0&0&0\\0&0&1&0&0\\0&0&0&1&0\\0&0&0&0&1\end{pmatrix}$$

index will represent the index at which the diagonal misses a step, which is $n-1$ for the first case and the first zero on the diagonal for the second one.

Let us consider the second case for now. Note that the last row will always be $\begin{pmatrix}0&\cdots&0&1\end{pmatrix}$, which implies that the last bit of $s$ must be $0$. Indeed, we must have $Bs=\begin{pmatrix}0\\\vdots\\0\end{pmatrix}$, so the last line of the system translates to $1\cdot s_n=0$. Essentially, this means that you can remove the last row of the matrix, since each of these elements will me multiplied by the last bit of $s$, which is $0$. You can then perform the same reasoning, until you're left with a matrix of the first case. Thus, if we show that in the first case, the solution $s$ is given by the last column to which one appends $1$, this would explain the method (since matrix[index:, index] is a vector full of zeroes).

Now for the first case, since the matrix is rref it means that every row on the matrix contain at most exactly two $1$s: one on the diagonal and one on the last column. Thus, when writing down $As=\begin{pmatrix}0\\\vdots\\0\end{pmatrix}$, all of the equations can be written as $s_i\oplus A_{i,n}\cdot s_n=0$, which implies that $s_i=A_{i,n}\cdot s_n$. If $s_n=0$, then this will yield the zero vector, which we don't want. Thus, we impose $s_n=1$, which then yield $s_i=A_{i,n}$.

All in all, once the index at which the diagonal misses a step is found, we simply have to take the column corresponding to this index up to that row, add a $1$ and complete with $0$ to find $s$, which explains the code.

Now, if you want to use this code, here's how you can do it. If you measured $10010$, $11010$, $10110$ and $10001$, you can find the unique non-zero $s$ which is orthogonl to all of these like this:

res = solve(["10010", "11010", "10110", "10001"]) # Compute s
print(res)

Running it with the list containing only "11" returns "11" as expected.

Tristan Nemoz
  • 8,429
  • 3
  • 11
  • 39