Just a quick note that v0.0.2 has been released on GitHub! It now has the beginnings of inferencing using Coqui STT. Literally all we’re doing is converting the audio from 8-bit, 8 kHz μ-law to 16-bit, 16 kHz LPCM so it can be fed into the generic “English huge vocabulary” model Coqui provide, but this is still a good achievement!
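To make that conversion concrete, here is a minimal, dependency-free sketch of the two steps: G.711 μ-law decoding to 16-bit linear samples, then doubling the sample rate. It is not the code in the release. The function names `muLawDecodeSample` and `muLawToPcm16k` are made up for illustration, and the linear interpolation used for upsampling is an assumption, since a real resampler would normally apply a proper low-pass filter.

```javascript
const BIAS = 0x84; // 132, the offset the G.711 mu-law encoder adds before companding

// Decode one 8-bit mu-law byte into a signed 16-bit linear PCM value.
function muLawDecodeSample(byte) {
  const u = ~byte & 0xff;           // mu-law bytes are transmitted bit-inverted
  const exponent = (u & 0x70) >> 4; // 3-bit segment number
  const mantissa = u & 0x0f;        // 4-bit step within the segment
  const magnitude = ((mantissa << 3) + BIAS) << exponent;
  return (u & 0x80) ? BIAS - magnitude : magnitude - BIAS;
}

// Convert 8 kHz mu-law bytes into 16 kHz, 16-bit PCM samples.
// Assumption: plain linear interpolation, which is crude but easy to follow.
function muLawToPcm16k(muLawBytes) {
  const out = new Int16Array(muLawBytes.length * 2);
  for (let i = 0; i < muLawBytes.length; i++) {
    const current = muLawDecodeSample(muLawBytes[i]);
    // The last sample has no successor, so it is simply repeated.
    const next = i + 1 < muLawBytes.length
      ? muLawDecodeSample(muLawBytes[i + 1])
      : current;
    out[2 * i] = current;                   // original sample
    out[2 * i + 1] = (current + next) >> 1; // midpoint between neighbours
  }
  return out;
}
```

Calling `muLawToPcm16k(payload)` on the payload bytes of a received packet returns an `Int16Array`. Wrapping its underlying `buffer` with `Buffer.from` should give raw 16-bit samples in the host's byte order, which is little-endian on the usual x86 and ARM targets.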
When v0.0.1 was released, I’d managed to play audio from a file on the server to a Combadge over RTP. Now, we’ve gone both ways. Spin Doctor can play audio to the badge, and it can record audio from the badge. It’s also been a bit of a refresher in audio coding. That’s a huge step from where the code was yesterday, let alone three days ago when I hadn’t successfully initialised a call.
There are also a few more “inert” changes in Communicator.js where I’ve prototyped a few more untested badge commands based on packet captures. Additionally, as a holdover from an experiment in refactoring agentd into the main codebase, there’s the early stub of a slightly-generalised RTP library, which I plan to build on for the time being.
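For readers who haven't met RTP before, the sketch below shows the sort of thing a small RTP library has to do first: pick apart the fixed 12-byte header defined in RFC 3550. This is an illustration, not the stub in the repository. `parseRtpPacket` and its field names are hypothetical, and I haven't checked it against real badge traffic.

```javascript
// Parse an RTP packet held in a Node.js Buffer (header layout per RFC 3550, section 5.1).
function parseRtpPacket(packet) {
  if (packet.length < 12) {
    throw new RangeError("RTP packet is shorter than the 12-byte fixed header");
  }
  const version = packet[0] >> 6;             // should be 2
  const hasPadding = (packet[0] & 0x20) !== 0;
  const hasExtension = (packet[0] & 0x10) !== 0;
  const csrcCount = packet[0] & 0x0f;
  const marker = (packet[1] & 0x80) !== 0;
  const payloadType = packet[1] & 0x7f;       // 0 is PCMU (mu-law) in RFC 3551
  const sequenceNumber = packet.readUInt16BE(2);
  const timestamp = packet.readUInt32BE(4);
  const ssrc = packet.readUInt32BE(8);

  // Skip the optional CSRC list (4 bytes per entry).
  let offset = 12 + 4 * csrcCount;
  // Skip the optional header extension: 2-byte id, 2-byte length in 32-bit words.
  if (hasExtension) {
    if (offset + 4 > packet.length) throw new RangeError("Truncated RTP extension");
    offset += 4 + 4 * packet.readUInt16BE(offset + 2);
  }
  // When the padding flag is set, the last byte says how many trailing bytes to drop.
  const end = hasPadding ? packet.length - packet[packet.length - 1] : packet.length;
  if (offset > end) throw new RangeError("RTP header runs past the end of the packet");

  return {
    version, marker, payloadType, sequenceNumber, timestamp, ssrc,
    payload: packet.subarray(offset, end),
  };
}
```

The `payload` of a packet with `payloadType` 0 would then be raw μ-law bytes, one per sample.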
Spin Doctor v0.0.1 has now been released on GitHub. I won’t rehash the commentary again, but there is now a rudimentary version of this software in existence. Hooray! What I will run through here is a couple of the design decisions I made.
Why nodejs?
Simply put, I’m lazy. At the scale this is intended for, node is a perfectly adequate language. The server doesn’t need huge speed (anything that needs to be fast will hopefully be a compiled module), and there are some handy advantages of using a popular scripting language. First, it is very easy to write class-based network server code in node. Second, the alawmulaw module by Rafael S. Rocha has no dependencies and is MIT licensed, so it’s trivial to use in the project. There are also node bindings for Coqui STT (formerly Mozilla DeepSpeech), which is also compatible with AGPL – and this seems the best way to start using speech recognition.
Java and Python were also considered. Java was rejected on several grounds – it’s heavy, fragmented and has problematic licensing. Plus, the OEM server is written in it. Python was an option, but JavaScript is easier to write network and class code in, and Python’s varied bundling in OSes and awkward dependency packaging are also not ideal – Node is unheard of as a system library, making it a “clean” install.
The biggest disadvantage of writing in node is the lack of a universally accepted JavaScript style guide, which may become an issue once there are multiple contributors to this project.
What’s with the README.md’s obsession with coffee machines?
In the TV series whence the Combadge concept originated, there are devices called “Replicators” that dispense food, beverages, tools and other items on request through some kind of technomagical material transmutation or energy-to-matter conversion, depending on which episode you’re watching at the time. We obviously don’t have that technology, but since its most common use in the show is to dispense beverages, there’s no reason we couldn’t achieve the same practical effect by mundane means. Some brands (who shall remain nameless until they take up our offer) even have wall-integrated, network-connected bean-to-cup machines that look, to all the world, like a replicator.
So, I want to make it real. Voice controlled “Coffee, Black.” through my sci-fi themed voice agent.
How do I get started?
This one isn’t so easy. To even test this, you’ll need a handful of things:
A server running Ubuntu (I assume, haven’t tested it on anything else), git and nodejs.
A B3000 combadge running similar firmware to mine. I get mine from computer recyclers who sell them on a well-known eCommerce platform.
A battery, charger, and clip for the above.
A wireless network that matches one of the prebuilt profiles of the combadge.
Google.
There are instructions provided by the OEM themselves on how to enter the Badge Configuration Menu and set the Wi-Fi settings and IP addresses using the built-in profiles. If you can work this out, you should be able to set up your server on the right IP and start the software.
In theory, if you have the badge kit and a Pi 4, that should be everything you need (with isc-dhcp-server and hostapd on the Pi) to get started. When you inevitably have issues, get involved on the GitHub!