
How can GPT-4 solve complex calculus and other math problems? I believe these problems require analytical reasoning and the ability to compute with numbers. Does it still use an LLM for this, or does it add something on top?

Here is the link to the official results published by OpenAI

desert_ranger

5 Answers


Large Language Models actually can do math. It's an "emergent" property, i.e. it appears only at larger scales. Understanding complex English does require some analytical ability, which can carry over to math tasks like calculus and even arithmetic. Numbers can be represented as words, so it's not unthinkable that an LLM could learn to add and subtract if it sees enough examples.
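As a small aside (my own illustration, not part of the original answer): numbers reach the model as ordinary subword tokens, not as digits or quantities. The sketch below uses OpenAI's tiktoken library to show how numbers are actually tokenised; any arithmetic the model does has to be learned over these chunks.

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the encoding used by the GPT-3.5/GPT-4 chat models.
enc = tiktoken.get_encoding("cl100k_base")

for text in ["7", "42", "123456789", "one plus one"]:
    token_ids = enc.encode(text)
    pieces = [enc.decode([t]) for t in token_ids]
    print(f"{text!r} -> {pieces}")

# Long numbers are split into multi-digit chunks, so "learning to add"
# means learning a statistical mapping between chunk sequences - one
# plausible reason the ability only emerges at large scale.
```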

The graph below, from the 2022 paper "Emergent Abilities of Large Language Models", shows that these properties "spontaneously" emerge as models get larger. We're interested in panel (A) here. Up to about 10^22 training FLOPs, the models studied (the largest available at the time) have essentially no arithmetic ability, but scaling further rapidly improves their capabilities. We don't know the internals of GPT-4, but it is presumably larger than these models, so it was expected to be better at arithmetic.

It also goes the other way around: Numeracy enhances the Literacy of Language Models.

[Figure: emergent abilities of large language models - panel (A) shows arithmetic accuracy against training compute (FLOPs)]

Harsh

ChatGPT can now use Wolfram|Alpha, via a plugin, to handle math as well as other factual information.

https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its-wolfram-superpowers/
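To make the division of labour concrete, here is a minimal sketch of the tool-delegation pattern the plugin is built on: the language model formulates a query in natural language, and a symbolic engine does the actual computation. This is my own illustration using Wolfram|Alpha's public Short Answers API, not the plugin's real wiring; the app ID is a placeholder you would obtain from Wolfram.

```python
# pip install requests
import requests

WOLFRAM_APPID = "YOUR-APPID"  # placeholder; register at developer.wolframalpha.com

def ask_wolfram(query: str) -> str:
    """Send a natural-language query to Wolfram|Alpha's Short Answers API."""
    resp = requests.get(
        "https://api.wolframalpha.com/v1/result",
        params={"appid": WOLFRAM_APPID, "i": query},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.text  # a short plain-text answer

# The LLM supplies the language understanding; the engine supplies the math.
print(ask_wolfram("derivative of x^3 * sin(x)"))
```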


As far as we know, GPT-4's core capabilities are still based mainly on a Large Language Model (LLM).

If so, then the apparent capability to reason is a somewhat surprising emergent phenomenon from a well-trained sequence prediction engine: one trained on large amounts of data, with the capacity to build highly complex rules that approximate a "next symbol oracle".

Again, assuming this assertion is correct, the maths and logic capabilities of ChatGPT divide into a few different possibilities (these are not formal classifications, just my amateur analysis):

  • Rote learning of symbol associations. This is likely to occur with commonly-occurring items in the training data - special values of trigonometric functions, for example.
  • Things that look like logical reasoning, but are simply well-formed sentences that are on-topic. This is something we can easily be fooled by. When ChatGPT gives an explanation for something, it may have no representation of it beyond being in an "explainy" state and generating text that fits.
  • Approximate rules and processes. The LLM is a complex neural network, and can in principle learn arbitrary internal functions in order to predict sequence continuations. It learns these statistically, and there will be limitations - for example, it is unlikely that it could learn to produce cryptographic hashes given only examples. But it may really learn to add two numbers across a wide range of numerical values, given thousands of examples.
  • Logic processes embedded in the form of output text. I have seen many examples where an LLM gets a correct answer when it is allowed to "think things through" by showing its working, whilst forcing a direct answer produces a wrong one (see the sketch below).
  • Accurate rules and processes. Some rules in maths are very language-like and could be learned very well by an LLM. That could include some mathematical symbol manipulations such as variable substitution.

I expect that all the above are occurring in some mix.
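To illustrate the "think things through" point, here is a small sketch contrasting a forced direct answer with a step-by-step prompt. It uses the 2023-era openai Python package's ChatCompletion interface; the prompts and model choice are my own assumptions - the point is the prompting pattern, not the exact numbers.

```python
# pip install openai==0.28  (the 2023-era interface used below)
import openai

openai.api_key = "sk-..."  # placeholder

QUESTION = "What is 847 * 362?"

def ask(system_prompt: str) -> str:
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": QUESTION},
        ],
        temperature=0,
    )
    return resp.choices[0].message.content

# Forcing a bare answer often fails on multi-digit arithmetic...
print(ask("Answer with the final number only. Show no working."))

# ...while letting the model spend tokens on intermediate steps
# (chain-of-thought) frequently recovers the correct result.
print(ask("Work through the problem step by step, then state the answer."))
```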

For example, you could conjecture that there is a reasonable chance that GPT can internally count accurately up to some moderate number, and re-use that ability in different contexts to predict numerical symbols and associated words (e.g. the word "one" also has the representation "1"). It may also contain more than one such semi-accurate counter, used in different contexts.

The sheer quantity of training material - more than any single person could consume in a lifetime - plus the learning capacity of the neural network means there are probably a lot of simple rote rules that are subject-dependent. However, in some areas the model will have a "deeper understanding", in that it will have learned reasonably complex manipulations and used them to predict a sequence of symbols as accurately as possible, using as little of its learning capacity as possible (because it is being asked to predict text in a huge range of contexts, it benefits when it compresses its rules).

GPT has not learned primarily by reasoning from first principles, though. Its inner representations and logical units of work are likely to be quite alien to humans, and may freely combine grammar, sentiment, mathematical building blocks and other symbolic context in ways that could seem very odd if they could even be explained. This heady mix, which occurs in most neural networks during training, is one reason why it is unlikely that OpenAI has wired in separate logic modules for highly structured processing such as mathematical symbols or calculations. Providing such modules is possible, but detecting when to use them, and how to wire them into the network, are both hard problems.

Neil Slater

OpenAI's CEO explicitly mentioned last month in the GPT-4 announcement video that GPT-4 isn't hooked up to a calculator.

One can, however, install plugins on top of ChatGPT, which can connect it to other resources such as Wolfram, as mentioned in Jaume Oliver Lafont's answer.

Franck Dernoncourt

There is a folk story about J.W. Gibbs that goes something like:

Being a famous scientist, Gibbs was a member of a number of scientific bodies. He was bored by them and never took the podium. Except for one time. The discussion was about redirecting some effort from teaching mathematics towards more effort at teaching foreign languages. Gibbs decided to give a speech that one time. He said: "Mathematics is a language."

I don't know whether this story is true or not, but I share the attitude.

Kostya