Just a quick note that v0.0.2 has been released on GitHub! It now has the beginnings of speech-to-text inference using Coqui STT. Literally all we’re doing is converting the audio from μ-law 8-bit 8 kHz to LPCM 16-bit 16 kHz so it can be fed into the generic “English huge vocabulary” model Coqui provide, but this is still a good achievement!
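For anyone curious what that conversion involves, here is a minimal sketch of the idea in Node.js. It is not the code that ships in Spin Doctor: the function names (`decodeMuLaw`, `upsample2x`, `muLawToLpcm16k`) are made up for illustration, the decoder is the standard G.711 μ-law expansion, and the 8 kHz to 16 kHz step uses simple linear interpolation where a proper resampler would apply a low-pass filter.

```javascript
// Illustrative only: hypothetical helper names, not Spin Doctor's actual code.

// Expand one 8-bit G.711 mu-law byte into a signed 16-bit linear PCM sample.
function decodeMuLaw(byte) {
  const u = ~byte & 0xff;           // mu-law bytes are stored bit-inverted
  const sign = u & 0x80;            // top bit: sign
  const exponent = (u >> 4) & 0x07; // next three bits: segment number
  const mantissa = u & 0x0f;        // low four bits: step within the segment
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return sign ? -magnitude : magnitude;
}

// Double the sample rate by inserting the midpoint between neighbouring samples.
// Assumption: linear interpolation is good enough for a sketch.
function upsample2x(samples) {
  const out = new Int16Array(samples.length * 2);
  for (let i = 0; i < samples.length; i++) {
    const current = samples[i];
    // The last sample has no neighbour, so it is simply repeated.
    const next = i + 1 < samples.length ? samples[i + 1] : current;
    out[2 * i] = current;
    out[2 * i + 1] = (current + next) >> 1;
  }
  return out;
}

// Convert a mu-law 8 kHz payload (Buffer or Uint8Array) to 16-bit 16 kHz PCM.
function muLawToLpcm16k(payload) {
  const decoded = new Int16Array(payload.length);
  for (let i = 0; i < payload.length; i++) {
    decoded[i] = decodeMuLaw(payload[i]);
  }
  return upsample2x(decoded);
}
```

Each input byte becomes two 16-bit samples, so 20 ms of badge audio (160 bytes at 8 kHz) turns into 320 samples at 16 kHz.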
When v0.0.1 was released, I’d managed to play audio from a file on the server to a Combadge over RTP. Now, we’ve gone both ways: Spin Doctor can play audio to the badge, and it can record audio from the badge. That’s a huge step from where the code was yesterday, let alone three days ago, when I hadn’t yet successfully initialised a call. It’s also been a bit of a refresher in audio coding.
There are also a few more “inert” changes in Communicator.js, where I’ve prototyped a few more untested badge commands based on packet captures. Additionally, as a holdover from an experiment in refactoring agentd into the main codebase, there’s the early stub of a slightly-generalised RTP library, which I plan to build on for the time being.
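To give a rough idea of what an RTP library has to do at minimum, here is a sketch of parsing the fixed RTP header as laid out in RFC 3550. This is not the stub in the repository; `parseRtpPacket` and the shape of the object it returns are assumptions made for illustration, and it expects a Node.js `Buffer`.

```javascript
// Illustrative only: a hypothetical helper, not the RTP stub in the codebase.
// Parses the RFC 3550 fixed header and returns the fields plus the payload.
function parseRtpPacket(packet) {
  if (packet.length < 12) {
    throw new Error("RTP packet is shorter than the 12-byte fixed header");
  }
  const version = packet[0] >> 6;
  if (version !== 2) {
    throw new Error("Unsupported RTP version: " + version);
  }
  const hasPadding = (packet[0] & 0x20) !== 0;
  const hasExtension = (packet[0] & 0x10) !== 0;
  const csrcCount = packet[0] & 0x0f;

  // Skip the fixed header and any CSRC identifiers (4 bytes each).
  let offset = 12 + 4 * csrcCount;
  if (hasExtension) {
    if (packet.length < offset + 4) {
      throw new Error("RTP header extension is truncated");
    }
    // Extension length is counted in 32-bit words, excluding its own 4-byte header.
    offset += 4 + 4 * packet.readUInt16BE(offset + 2);
  }

  // When the padding bit is set, the final byte says how many bytes to drop.
  let end = packet.length;
  if (hasPadding) {
    end -= packet[packet.length - 1];
  }
  if (offset > end) {
    throw new Error("RTP packet is truncated");
  }

  return {
    marker: (packet[1] & 0x80) !== 0,
    payloadType: packet[1] & 0x7f, // 0 is PCMU, i.e. mu-law at 8 kHz
    sequenceNumber: packet.readUInt16BE(2),
    timestamp: packet.readUInt32BE(4),
    ssrc: packet.readUInt32BE(8),
    payload: packet.subarray(offset, end),
  };
}
```

The sequence number and timestamp are what a receiver would use to reorder packets and spot gaps before handing the payload on for decoding.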
Small changes, but significant ones!