v0.0.2 – It does something. Not much, but something.

Just a quick note that v0.0.2 has been released on GitHub! It now has the beginnings of inferencing using Coqui STT. Literally all we’re doing is converting the audio from 8-bit, 8 kHz μ-law to 16-bit, 16 kHz LPCM so it can be fed into the generic “english huge vocabulary” model Coqui provide, but this is still a good achievement!
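For the curious, here is a rough sketch of what that conversion involves. It is not the code from the repository: the function names are made up for illustration, the decoder is the standard G.711 μ-law expansion, and the 8 kHz to 16 kHz step is plain linear interpolation, assumed here only because it is the simplest thing that works. A proper resampler would low-pass filter as well.

```javascript
// Hypothetical helper names; this is an illustrative sketch, not Spin Doctor's actual code.

// Expand one G.711 mu-law byte into a signed 16-bit linear PCM sample.
function muLawToLinear(byte) {
  const u = ~byte & 0xff;            // mu-law bytes are stored bit-inverted
  const sign = u & 0x80;
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return sign ? -magnitude : magnitude;
}

// Decode a buffer of mu-law bytes (8 kHz) into 16-bit samples (still 8 kHz).
function decodeMuLaw(bytes) {
  const pcm = new Int16Array(bytes.length);
  for (let i = 0; i < bytes.length; i++) {
    pcm[i] = muLawToLinear(bytes[i]);
  }
  return pcm;
}

// Double the sample rate by inserting the midpoint between neighbouring samples.
// Assumption: naive linear interpolation, with the final sample simply repeated.
function upsample2x(pcm) {
  const out = new Int16Array(pcm.length * 2);
  for (let i = 0; i < pcm.length; i++) {
    const current = pcm[i];
    const next = i + 1 < pcm.length ? pcm[i + 1] : current;
    out[2 * i] = current;
    out[2 * i + 1] = (current + next) >> 1;
  }
  return out;
}

// 8-bit 8 kHz mu-law in, 16-bit 16 kHz LPCM out.
function muLaw8kToPcm16k(bytes) {
  return upsample2x(decodeMuLaw(bytes));
}
```

Feeding it `Uint8Array.of(0xff, 0x80, 0x00)` should give silence, then the largest positive sample (32124), then the largest negative one (-32124), with interpolated values in between.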

When v0.0.1 was released, I’d managed to play audio from a file on the server to a Combadge over RTP. Now, we’ve gone both ways: Spin Doctor can play audio to the badge, and it can record audio from the badge. That’s a huge step from where the code was yesterday, let alone three days ago, when I hadn’t successfully initialised a call. It’s also been a bit of a refresher in audio coding.

There are also a few more “inert” changes in Communicator.js, where I’ve prototyped a few more untested badge commands based on packet captures. Additionally, as a holdover from an experiment in refactoring agentd into the main codebase, there’s the early stub of a slightly generalised RTP library, which I plan to build on for the time being.

Small changes, but significant ones!