Can the recurrent neural network's input come from a short-time Fourier transform?

Question

Can the input to a recurrent neural network come from a short-time Fourier transform? I mean, the input would not come from the time-series domain.

Neil Slater · Answer 1 · 2018-10-09T10:35:17.490

Yes, you can apply an RNN to any sequence of items of the same data type. The sequence can be ordered in space, in time, or as any arbitrary ordered list. The items can hold any data at all; the only requirement is that each one represents the same kind of thing. If you have multiple types of thing to process as a sequence, you just need to expand the definition so that the input features can represent all types unambiguously, essentially creating a "base class" that can represent them all.
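As a rough illustration of that "base class" idea, here is a minimal sketch that encodes two kinds of item into one fixed-length feature vector. The item types (`reading`, `event`), the event labels, and the `encode` helper are all hypothetical names made up for this example; the point is only that every item ends up with the same layout, so an RNN could consume them as one sequence.

```python
TYPES = ["reading", "event"]   # hypothetical item types
EVENTS = ["start", "stop"]     # hypothetical event labels


def encode(item):
    """Map a (type, payload) pair to one fixed-length feature vector.

    Layout: [is_reading, is_event, reading_value, is_start, is_stop]
    """
    kind, payload = item
    type_flags = [1.0 if kind == t else 0.0 for t in TYPES]
    # Numeric payload only applies to readings; other types get 0.0 here.
    value = [float(payload)] if kind == "reading" else [0.0]
    # One-hot event label only applies to events; readings get all zeros.
    event_flags = [1.0 if kind == "event" and payload == e else 0.0
                   for e in EVENTS]
    return type_flags + value + event_flags


sequence = [("event", "start"), ("reading", 0.7),
            ("reading", 1.2), ("event", "stop")]
encoded = [encode(item) for item in sequence]
for vec in encoded:
    print(vec)
```

The type flags keep the encoding unambiguous: a reading whose value happens to be 0.0 still differs from an event, because the first two features say which kind of item it is.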

The RNN will consume the sequence as a time-based sequence, one item per time step of the RNN. However, you can think of that as the same as a processor clock for a computer: an RNN is essentially a trainable Turing machine, and in principle it can learn to accumulate any data about the sequence it has seen and output any function of that accumulated data. In practice, though, this learning process might be too hard for our current systems, require immense amounts of data, and so on.

In your case, the STFT does create a time-based sequence. Each item in the sequence is a frequency analysis of a short period of time, and consecutive items are separated by a fixed time difference between STFT frames (the windows usually overlap), so the sequence captures how the frequencies in the signal change over time. Typically each STFT frame is a single time step input to an RNN. You could also feed the frequency-domain values into an RNN one at a time in a fixed order (e.g. low to high frequency), but that would be unusual and would make most learning tasks harder.
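To make the "one STFT frame per time step" arrangement concrete, here is a minimal sketch using only the Python standard library. The frame length, hop size, Hann window, hidden size, and the helper names `stft_frames` and `rnn_forward` are assumptions chosen for illustration, and the RNN is an untrained Elman-style cell with random weights. In practice you would more likely use an FFT routine and an RNN layer from a numerical or deep learning library.

```python
import cmath
import math
import random


def stft_frames(signal, frame_len=16, hop=8):
    """Return a list of frames; each frame holds the magnitudes of
    frequency bins 0..frame_len//2, ordered low to high."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    # hop < frame_len, so consecutive windows overlap.
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT for bin k (O(N^2), fine for a small demo).
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def rnn_forward(frames, hidden_size=4, seed=0):
    """Run an untrained Elman-style RNN over the frames, one frame per step."""
    rng = random.Random(seed)
    n_in = len(frames[0])
    w_in = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
            for _ in range(hidden_size)]
    w_rec = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
             for _ in range(hidden_size)]
    h = [0.0] * hidden_size
    states = []
    for x in frames:  # each STFT frame is one RNN time step
        h = [math.tanh(sum(w_in[i][j] * x[j] for j in range(n_in))
                       + sum(w_rec[i][j] * h[j] for j in range(hidden_size)))
             for i in range(hidden_size)]
        states.append(h)
    return states


# 128 samples of a sine with exactly 4 cycles per 16-sample frame.
signal = [math.sin(2 * math.pi * 4 * n / 16) for n in range(128)]
frames = stft_frames(signal)
states = rnn_forward(frames)
print(len(frames), len(frames[0]), len(states), len(states[0]))
# prints: 15 9 15 4
```

Here the RNN's input size is the number of frequency bins (9), and the sequence length is the number of frames (15). Feeding the bins one at a time instead would turn this into a sequence of 135 scalar steps, which is the unusual arrangement mentioned above.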
