Rational numbers could help alleviate some gradient issues by not losing precision as the weights and the propagated values (the signal) reach extremely small or extremely large magnitudes.
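As a contrived illustration of the kind of precision loss I mean (the 10^-200 magnitude is just an arbitrary stand-in for a vanishing gradient):

```python
from fractions import Fraction

g = 1e-200
print(g * g)  # 0.0 -- the product underflows float64; the gradient vanishes outright

g_exact = Fraction(1, 10**200)
print(g_exact * g_exact == Fraction(1, 10**400))  # True -- the rational product stays exact
```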

I'm not aware of any hardware that is optimized for rationals. GPUs are all optimized for vector/matrix operations on floats. So, a couple of negatives are that it would likely double the storage and computation necessary for an ANN. But meh, hardware is constantly improving, and if there were value in creating a rational co-processor for AI, someone would build it.

I suspect the advantages would be greater precision and possibly some mitigation of gradient issues.

What are the other pros and cons of an ANN using rationals?

Are there any research papers investigating the use of rationals in ANNs?

Mark_Sagecy

1 Answer

Retrofitting (as you might intuit) is the hardest part. I've been working toward a solution. Fraction (from the fractions module) plugs and plays quite nicely with floats in many areas, and I was surprised how smoothly numpy handles matrix operations on Fraction arrays, as if you had used floats: for simple math on an ndarray composed of Fractions, you get ndarrays composed of Fractions back.

The first hurdle I found was sqrt(), which just fails: Fraction has no sqrt() method for numpy's object-dtype ufuncs to call. So if, for instance, you have a vector of 5 Fractions, np.var() returns a Fraction, but np.std() just fails because of the sqrt() issue. My current solution is to create a class that inherits from Fraction and make sure that class has a sqrt() method. To my elation, passing a vector of those objects to np.std() returns a rational answer of that object type. A sketch of that approach follows.
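Here is a minimal sketch of that approach, under a couple of assumptions of my own: the class name RFrac is arbitrary, and the arithmetic re-wrapping is there because Fraction's operators return plain Fraction, which would otherwise shed the sqrt() method partway through a reduction like np.std():

```python
from fractions import Fraction
import math
import numpy as np


class RFrac(Fraction):
    """Fraction subclass that survives numpy's object-dtype pipeline:
    it keeps its type under arithmetic and supplies the elementwise
    sqrt() that ufuncs like np.sqrt() look for on object arrays."""

    # Fraction's operators return plain Fraction, so re-wrap each result
    # to keep the subclass (and its sqrt() method) through reductions.
    def _rewrap(op):
        def method(self, *args):
            result = op(self, *args)
            return RFrac(result) if isinstance(result, Fraction) else result
        return method

    __add__ = _rewrap(Fraction.__add__)
    __radd__ = _rewrap(Fraction.__radd__)
    __sub__ = _rewrap(Fraction.__sub__)
    __rsub__ = _rewrap(Fraction.__rsub__)
    __mul__ = _rewrap(Fraction.__mul__)
    __rmul__ = _rewrap(Fraction.__rmul__)
    __truediv__ = _rewrap(Fraction.__truediv__)
    __rtruediv__ = _rewrap(Fraction.__rtruediv__)

    def sqrt(self):
        # The square root of a rational is usually irrational, so return
        # a rational approximation: a float seed refined by Newton steps,
        # with the denominator trimmed so the integers stay manageable.
        if self < 0:
            raise ValueError("sqrt of a negative rational")
        if self == 0:
            return RFrac(0)
        guess = Fraction(math.sqrt(self))  # float seed; assumes no overflow
        for _ in range(3):
            guess = ((guess + self / guess) / 2).limit_denominator(10**30)
        return RFrac(guess)


v = np.array([RFrac(1, 2), RFrac(3, 4), RFrac(5, 8)], dtype=object)
print(np.sqrt(v))  # works: the object-dtype loop finds and calls .sqrt()
print(np.var(v))   # exact rational variance
print(np.std(v))   # rational (approximate) standard deviation
```

The limit_denominator() call is a design choice: without it, each Newton step roughly squares the size of the numerator and denominator, and the exact representation balloons.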

As this is a work in progress, I have not tried this with Keras, TF, PyTorch, etc., yet. I highly suspect that there will be myriad hurdles there!

And the best part of Fraction (from fractions) is that both the numerator and the denominator are Python integers, so they easily scale to any size, much like GMP's arbitrary-precision integers.
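To make that concrete (a toy loop; the update rule is arbitrary), the exact representation grows without bound, which is also where the storage cost mentioned in the question comes from:

```python
from fractions import Fraction

w = Fraction(3, 7)
for _ in range(10):
    w = w * w + Fraction(1, 3)  # some arbitrary repeated update

# No overflow and no rounding, but the integers keep growing:
print(w.denominator.bit_length())  # a few thousand bits after 10 updates
```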