
Has deep learning discovered any heretofore unknown algorithms?

Goodfellow et al. give an example of learning XOR, and the Universal Approximation Theorem seems to imply that deep learning might be able to represent any algorithm (or Turing machine?).

Geremia

1 Answer


Deep learning systems can be algorithm-like in their outcomes, but those outcomes often emerge through brute-force exploration of solution spaces rather than human-like conceptual insight. The line between "discovering a new algorithm" and "optimizing a known process" can be blurry, but examples like AlphaTensor show clear cases of algorithmic innovation.

In 2022, DeepMind introduced AlphaTensor, a neural network that recast matrix multiplication as a single-player game and invented thousands of matrix multiplication algorithms, including some previously discovered by humans and some that were not. Finding low-rank decompositions of the corresponding tensors (and beyond) is NP-hard; the optimal multiplication count even for 3x3 matrices remains unknown, even over commutative fields. For 4x4 matrices, AlphaTensor unexpectedly discovered a scheme with 47 multiplication steps, improving on the 49 required by Strassen's 1969 algorithm, albeit restricted to mod-2 arithmetic. Similarly, AlphaTensor multiplied 5x5 matrices in 96 rather than the previous 98 steps. Once it was known that such improvements exist, other researchers quickly found a similar independent 47-step 4x4 algorithm, and separately tweaked DeepMind's 96-step 5x5 algorithm down to 95 steps in mod-2 arithmetic and 97 in standard arithmetic. Some algorithms were completely new: for example, (4, 5, 5) was improved from a baseline of 80 steps to 76 in both standard and mod-2 arithmetic.

Fawzi, Alhussein, Matej Balog, Aja Huang, Thomas Hubert, Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, et al. “Discovering Faster Matrix Multiplication Algorithms with Reinforcement Learning.” Nature 610, no. 7930 (October 2022): 47–53.
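To see what "fewer multiplication steps" means concretely, here is the classical baseline AlphaTensor improves upon: Strassen's 1969 identity, which multiplies two 2x2 matrices with 7 scalar multiplications instead of the naive 8 (and, applied recursively to block matrices, gives an O(n^2.807) algorithm). A minimal sketch of the 2x2 base case:

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices (lists of rows) with 7 scalar multiplications.

    AlphaTensor searches for analogous low-rank decompositions of larger
    multiplication tensors, e.g. its 47-multiplication 4x4 scheme in mod-2
    arithmetic versus the 49 obtained by recursing on this identity.
    """
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    # Strassen's 7 products
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    # Recombine into the product matrix using only additions
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # → [[19, 22], [43, 50]]
```

The search space AlphaTensor explores is exactly the space of such recombination schemes, represented as rank decompositions of the matrix multiplication tensor.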

However, this remains niche compared to the broader landscape of algorithm design, which still relies heavily on human creativity. In practice, a network trained to solve a problem (like XOR) is typically just learning a parameterization that heuristically approximates a function you could also compute with a known algorithm or by hand. Deep learning models, whether general-purpose GPTs or specialized reasoning models, are ultimately function approximators built on known principles.
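The XOR case makes this concrete: Goodfellow et al.'s example (Deep Learning, ch. 6) exhibits a tiny ReLU network whose hand-picked weights compute XOR exactly — the same function a trained network only approximates after gradient descent:

```python
import numpy as np

def xor_net(x):
    """XOR via one hidden layer of two ReLU units, with the closed-form
    weights from Goodfellow et al.'s example (no training involved)."""
    W1 = np.array([[1.0, 1.0], [1.0, 1.0]])  # hidden-layer weights
    b1 = np.array([0.0, -1.0])               # hidden-layer biases
    w2 = np.array([1.0, -2.0])               # output weights
    h = np.maximum(0.0, x @ W1 + b1)         # ReLU activation
    return h @ w2

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, xor_net(np.array(x, dtype=float)))
# (0, 0) → 0.0, (0, 1) → 1.0, (1, 0) → 1.0, (1, 1) → 0.0
```

Nothing here was "discovered" by the network: the solution exists in closed form, and training merely finds some parameterization near it.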

cinch