I have a setup that connects to the OpenAI voice (text-to-speech) API: it takes my script, converts it to audio, and the audio is sent to the front end.
There is a real latency issue in this setup that I would like to overcome. I am currently sending the audio as a StreamingResponse (FastAPI) to the front end. Any ideas on a system setup that might reduce latency?
One idea I am considering is sending the stream in batches, so the front end can play the current audio chunk while the next text chunks are still being converted to speech.
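For reference, here is a minimal sketch of that pipelining idea using plain asyncio. The `synthesize()` function is a made-up placeholder standing in for the real TTS call (its name and the 0.1 s delay are assumptions for illustration, not the OpenAI API); the point is that synthesis of chunk N+1 is started before chunk N is yielded, so time-to-first-audio is one chunk's latency instead of the whole script's:

```python
import asyncio
import time

async def synthesize(chunk: str) -> bytes:
    # Placeholder for the real TTS call; the sleep simulates synthesis latency.
    await asyncio.sleep(0.1)
    return chunk.encode()

async def audio_stream(chunks):
    # Start synthesizing the first chunk immediately.
    task = asyncio.create_task(synthesize(chunks[0]))
    for i, _ in enumerate(chunks):
        audio = await task
        if i + 1 < len(chunks):
            # Kick off the next chunk *before* yielding the current one,
            # so synthesis overlaps with downstream streaming/playback.
            task = asyncio.create_task(synthesize(chunks[i + 1]))
        yield audio

async def main():
    sentences = ["First sentence.", "Second sentence.", "Third one."]
    start = time.monotonic()
    first_audio_at = None
    received = []
    async for audio in audio_stream(sentences):
        if first_audio_at is None:
            first_audio_at = time.monotonic() - start
        received.append(audio)
    print(f"first audio after {first_audio_at:.2f}s, {len(received)} chunks")
    return first_audio_at, received

asyncio.run(main())
```

An async generator like `audio_stream` can be passed straight to FastAPI's `StreamingResponse`, so the front end starts receiving bytes as soon as the first chunk is synthesized.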
In particular: would dedicated server resources help in this case? Would a Kafka stream be overkill? Or are there known services (e.g. https://getstream.io/video/audio-rooms/) that could solve this?