
I am training a neural network where the target data is a vector of angles in radians (between $0$ and $2\pi$).

I am looking for study material on how to encode this data.

Can you supply me with a book or research paper that covers this topic comprehensively?

nbro
user366312

2 Answers


The main problem with simply using the raw values $\alpha \in [0, 2\pi]$ is that semantically $0 = 2\pi$, but numerically $0$ and $2\pi$ are maximally far apart. A common way to encode such data is as a vector of $\sin$ and $\cos$ values. This perfectly conveys the fact that $0 = 2\pi$, because:

$$ \begin{bmatrix} \sin(0)\\ \cos(0) \end{bmatrix} = \begin{bmatrix} \sin(2\pi)\\ \cos(2\pi) \end{bmatrix} $$

This encoding essentially maps the angle values onto the 2D unit circle. To decode it, you can calculate $$\operatorname{atan2}(a_1, a_2) = \alpha,$$

where $a_1 = \sin(\alpha)$ and $a_2 = \cos(\alpha)$.
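As a minimal sketch in NumPy (the function names `encode_angle` and `decode_angle` are just illustrative), the encoding and its `atan2`-based decoding could look like this. Note that `np.arctan2` returns values in $(-\pi, \pi]$, so the result is wrapped back into $[0, 2\pi)$:

```python
import numpy as np

def encode_angle(alpha):
    """Map an angle (in radians) onto the 2D unit circle as (sin, cos)."""
    return np.array([np.sin(alpha), np.cos(alpha)])

def decode_angle(a1, a2):
    """Recover the angle from its (sin, cos) encoding.

    np.arctan2 returns values in (-pi, pi], so wrap back into [0, 2*pi).
    """
    return np.arctan2(a1, a2) % (2 * np.pi)
```

With this, the encodings of $0$ and $2\pi$ coincide (up to floating-point error), which is exactly the periodicity we want the network's targets to respect.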

Here is a nice detailed explanation, and here are two references where this is applied.

Edit: As noted in the comments, the values $\sin(\alpha)$ and $\cos(\alpha)$ are not independent: $\sqrt{\sin(\alpha)^2 + \cos(\alpha)^2} = 1$ always holds, i.e. the Euclidean norm is one. When your neural network predicts the sin and cos values, however, this condition isn't necessarily satisfied. You should therefore consider adding a regularization term to the loss that guides the network toward outputting valid values (with unit norm), which could look like this:

$$ r_\lambda\left(\hat{y}_1, \hat{y}_2\right)\; = \lambda \left(\; 1 - \sqrt{\hat{y}_1^2 + \hat{y}_2^2}\right), $$

where $\hat{y}_1$ and $\hat{y}_2$ are the sin and cos outputs of the network, respectively, and $\lambda$ is a scalar that weights the regularization term against the loss. I found this paper where such a regularization term is used (see Sec. 3.2) to obtain valid quaternions (quaternions must also have unit norm). The authors found that many values of $\lambda$ work and settled on $\lambda = 0.1$.
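A minimal sketch of this regularization term in NumPy (the function names, and the combination with a squared-error loss on the encoded target, are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def unit_norm_penalty(y1_hat, y2_hat, lam=0.1):
    """Regularization term r_lambda: penalizes deviation of the
    predicted (sin, cos) pair from the unit circle.

    lam = 0.1 is the value the cited paper settled on.
    """
    return lam * (1.0 - np.sqrt(y1_hat**2 + y2_hat**2))

def regularized_loss(y1_hat, y2_hat, alpha_true, lam=0.1):
    """Illustrative total loss: squared error on the encoded target
    plus the unit-norm regularization term."""
    mse = (y1_hat - np.sin(alpha_true))**2 + (y2_hat - np.cos(alpha_true))**2
    return mse + unit_norm_penalty(y1_hat, y2_hat, lam)
```

The penalty vanishes exactly when the prediction lies on the unit circle and grows as the prediction moves inside it.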

Chillston

You might want to look at the von Mises Distribution, it defines a probability distribution over angles.

See Pattern Recognition and Machine Learning, Christopher Bishop, Appendix B (pg 693), or alternatively Wikipedia has an article on this.

You could certainly use this as a loss function in a neural network. My only reservation is that the location parameter $\theta_0$ is itself periodic, which might not play well with standard neural network architectures, so the previous answer is also worth considering.

I mention this only because, if you are interested in distributions over angles, the von Mises distribution is something you should probably be aware of.
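As a sketch of how this could be used as a loss, here is the von Mises negative log-likelihood in the common parameterization with mean direction $\mu$ and concentration $\kappa$ (the function name is illustrative; `np.i0` is NumPy's modified Bessel function $I_0$, which appears in the normalizing constant):

```python
import numpy as np

def von_mises_nll(theta, mu, kappa):
    """Negative log-likelihood of angle theta under a von Mises
    distribution: p(theta) = exp(kappa * cos(theta - mu)) / (2*pi*I0(kappa)).
    """
    return -(kappa * np.cos(theta - mu) - np.log(2 * np.pi * np.i0(kappa)))
```

This loss is minimized when the predicted mean direction $\mu$ coincides with the observed angle $\theta$, and $\kappa$ controls how sharply the model is penalized for being off.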

Snehal Patel