Categories
Releases

v0.0.8 – Well That Was a Gap

It’s certainly been a while since v0.0.4 was posted and, confession: nothing really changed for most of that time. By contrast, in the last month or two, four releases have been pushed in quick succession. We’re now at v0.0.8 and shucks howdy have things come on.

v0.0.5 – This is the cleanest house ever.

There have been …three complete “code cleanup” merges since v0.0.4? The agent (now called robin) has become a semi-separate library instanced from combadged, a lot of things have been renamed, refactored and generally tidied up. The code is a lot more readable than it was, quite literally.

There’s still more to come on that, but where the code is now is a pretty good place to build from. Which leads on to functional changes.

Most of those were not in this release. In some ways this was a step back – v0.0.5 did have simultaneous agent handling for multiple badges, but it didn’t (and as of v0.0.8 still doesn’t) support bidirectional RTP. That’s actually one of the official blockers for v0.1.0, which will officially be Spindoctor’s first semi-major release. However, it’s certainly easier to test the project now, without needing to launch multiple parallel scripts independently of each other.

v0.0.7 – Log in, Log out, Shake it all about.

The major change of v0.0.6 was a simple REST API in combadged that allowed you to change the currently logged-in user. Then v0.0.7 hit the repo, and the ability to log out a badge joined its sibling. That means that you can get your name onto an idle badge, as well as push the badge into the less power-hungry (due to slower pings) “LOGGED OUT” state. Bonus: it turns out that if a B2000 badge is in the dock when it’s logged out, it will shut down entirely. Given the energy crisis facing the world in 2022/23, that’s probably no bad thing.

v0.0.8 – This is now a Combadge server.

Please forgive my first person prose here. I started playing with my first eBay B1000A in about 2009. I moved onto B2000 badges around 2012, when I finally couldn’t justify running an AP in 802.11b mode anymore. The first time I was able to communicate with any of the badges was in May 2021 when, after a feverish three-day coding sprint with a stack of packet captures, I was finally able to respond to the initial ping between badge and server. It was a long 12 years. Now, 18 months later – anyone can download Spindoctor on GitHub, and if you manage to configure a pair of badges using the on-board menus, and have the right open network – you can make a call between them.

As of right now, it’s a bit of a pathetic process – once the badges have registered on the Spindoctor server, you can use the web API to trigger a call between them – and hang up with the buttons when you’re done. You can’t trigger them with speech recognition (the agent transcribes, badly – and nothing else). The displays don’t change to reflect the call state (there is code for that, but it’s not been used in the call process). Hanging up one badge doesn’t hang up the other; it sits there with its RTP port open, dumping audio packets onto the network that will never be heard. And it’s honestly a little unstable: while my all-B2000 test lab works, my mixed B3000/B3000N lab drops calls quite eagerly. Whether that’s the badges or the wifi in that room is yet to be determined.

Looking ahead to v0.1.0

It’s an exciting point to reach after 13 years, but there’s also a lot more to do. There are a few specific tasks in the short term, and some bigger long term goals.

Short term, believe it or not, speech recognition isn’t a priority. It’s complex, there isn’t developer expertise in the project for it, and while it’s the coolest feature on the badges, it’s not necessarily the most useful feature for anyone trying to recycle these. For now, the goals are:

  • 2-way audio between server and badge.
  • A more comprehensive and persistent concept of a “user”.

2-way audio and the chamber of scope creep.

To date, while trying to get 2-way audio working (it still doesn’t), two libraries have been written. One, a terrible fork of an old commonjs project, is a (way faster but buggy as all get-out) ECMAScript function generator to play simple audio tones and chirps. The other is a (woefully incomplete) pure-ECMAScript RTP library. This is meant to push all of the audio transcoding and UDP wrangling into a semi-independent library. Development of this is currently the main focus as part of #13. Once that’s “done”, robin should be able to just push audio into a stack in the RTP library, which will handle feeding it to the badge, and the library will also listen for, transcode and feed audio into robin in turn.
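
For anyone wondering what “UDP wrangling” involves, here is a minimal sketch of the packetising half of such a library. It is not the code from #13. The function names (buildRtpPacket, packetize) are made up for illustration, the header layout is the standard 12-byte RTP header, and payload type 0 (PCMU) with 160-byte frames (20 ms at 8 kHz) is an assumption about what a badge would accept, not something confirmed here.

```javascript
// Hypothetical sketch, not Spindoctor's RTP library.
// Builds one RTP packet: a fixed 12-byte header followed by the audio payload.
function buildRtpPacket({ sequence, timestamp, ssrc, payload, payloadType = 0, marker = false }) {
  const header = Buffer.alloc(12);
  header[0] = 0x80; // version 2, no padding, no extension, zero CSRC entries
  header[1] = (marker ? 0x80 : 0x00) | (payloadType & 0x7f);
  header.writeUInt16BE(sequence & 0xffff, 2); // sequence number wraps at 16 bits
  header.writeUInt32BE(timestamp >>> 0, 4);   // timestamp counts samples, wraps at 32 bits
  header.writeUInt32BE(ssrc >>> 0, 8);        // stream identifier
  return Buffer.concat([header, payload]);
}

// Splits a buffer of mu-law audio into frames and yields one RTP packet per frame.
// frameSize = 160 is an assumed default (20 ms of 8 kHz, 8-bit audio).
function* packetize(mulawAudio, { ssrc = 0, sequence = 0, timestamp = 0, frameSize = 160 } = {}) {
  for (let offset = 0; offset < mulawAudio.length; offset += frameSize) {
    const payload = mulawAudio.subarray(offset, offset + frameSize);
    yield buildRtpPacket({ sequence, timestamp, ssrc, payload });
    sequence = (sequence + 1) & 0xffff;
    timestamp = (timestamp + payload.length) >>> 0;
  }
}

// 400 bytes of mu-law silence become two full frames and one short one.
const packets = [...packetize(Buffer.alloc(400, 0xff), { ssrc: 1 })];
```

Each yielded Buffer could then be handed to a UDP socket. Pacing the sends at one frame every 20 ms, and the whole receive direction, are left out of this sketch.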

I am Worf, son of Mogh.

Worf has become the go-to demo account for the log-in code, and is used as the example in the README.md. And being able to define yourself as the user of a badge is one of the more important (and frankly, easier) things to implement. Plus, v0.0.6 and v0.0.7 put in most of the important code for this at a badge protocol level.

What’s lacking is a properly structured concept of what a “user” is, and it turns out that can be a little complicated to do, especially if you want people to be able to autogenerate this from some CardDAV or LDAP set-up down the line (which does seem sensible). So, that’s the next thing to do after the RTP service – a best-effort, good-faith first-pass attempt to properly design a User object, and make it persistent to disk (probably with a simple “write to JSON” for now). #12 if you’re curious.
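
To make the “write to JSON” idea concrete, here is a rough sketch of what a first pass could look like. None of this is the design from #12: the field names (id, displayName, spokenNames, badgeMac) and the helper functions are hypothetical, chosen only to show the round trip.

```javascript
// Hypothetical sketch of a persistable User; every field name is an assumption.
class User {
  constructor({ id, displayName, spokenNames = [], badgeMac = null }) {
    if (!id || !displayName) {
      throw new TypeError('id and displayName are required');
    }
    this.id = id;
    this.displayName = displayName;
    this.spokenNames = spokenNames; // alternative names a recogniser might hear
    this.badgeMac = badgeMac;       // badge the user is currently logged in to, if any
  }

  // JSON.stringify calls toJSON automatically, so only plain data is written out.
  toJSON() {
    return {
      id: this.id,
      displayName: this.displayName,
      spokenNames: this.spokenNames,
      badgeMac: this.badgeMac,
    };
  }

  static fromJSON(data) {
    return new User(data);
  }
}

// Turn a list of users into a JSON string, and back again.
function serialiseUsers(users) {
  return JSON.stringify(users, null, 2);
}

function parseUsers(text) {
  return JSON.parse(text).map((entry) => User.fromJSON(entry));
}

const worf = new User({ id: 'worf', displayName: 'Worf', spokenNames: ['Worf, son of Mogh'] });
const restored = parseUsers(serialiseUsers([worf]));
```

The string from serialiseUsers could be written out with fs.writeFileSync and read back with fs.readFileSync at start-up. Keeping the conversion separate from the file handling should also make it easier to swap in a CardDAV or LDAP source later.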

Seasonal Pumpkin Pie in the Sky~

Ok, it’s pretty obvious that there has to be at least some speech recognition and generation in place for whatever v1.0.0 looks like. There are some other important steps before that though, such as getting V5000 Smartbadges working, as well as (once they start ending up on eBay), the C1000 Minibadge if at all possible – especially since the Minibadge is the closest product to a TV Combadge yet.

I also think that a “1.0” release should handle all the basic functions of a badge. That might not mean every feature of the current gen, such as centrally-provisioned Bluetooth, but it will mean basic message handling and the ability to configure a blank badge for your wireless network without having to use an unencrypted network.

Beyond that, I’m more interested in implementing the cool things that the commercial server never will. For one, the “on-screen” command to transfer calls to an external display with video features. For another, links with home automation software – and especially coffee makers.

I don’t think it can be said enough times that Spindoctor is not meant to replace the commercial software. It’s a hobby project for hackers to run in places like Berlin’s C-Base (canonically, a crashed space station – functionally, it’s a hackerspace) or in their own homes. So, a lot of what makes sense for a hospital is meaningless for this software. But fingers crossed it gives some entertainment, and helps slow down the sand-waste lifecycle for this hardware. Given that most of my original badges still work 13 years later – it seems like that has a lot of scope.

For now:

For now, you can finally run your own Combadge server – limited as it is. You can get the code on the Combadge project GitHub, and follow the instructions in the README.md file to get started. After that, if you know your way around ECMAScript and NodeJS, feel free to start throwing up a few PRs, either against our Mantissa branch, or against my own incomplete PR branches. If that’s not you, bug reports are also welcome – the code is deliberately prioritising simplicity over resilience at this point, but where problems arise, I’d like to fix them all the same. Finally, use the same link if you have a genius concept you think would fit the project.

v0.0.4 – Like v0.0.3 but with the rest of my to-do list done.

Remember those “small changes” I said I wanted to do, when I dropped v0.0.3 yesterday? Well, they took less time than I expected them to. Three things changed today.

API Consistency

One thing I always hate about defining functions is the often arbitrarily pre-ordered list of parameters. It’s an awful way to create a syntax. So, I’ve replaced that in the Packet class constructors with a cleaner (MAC, {kwargs}) set-up. The kwargs will still vary between Packet types, but the order is now flexible – and default values can still be (and have been) set.

MAC stays separate, because it’s the only required value for every type of packet – and I wanted to make the importance of the value clear.
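
As a rough illustration of the (MAC, {kwargs}) shape, here is a minimal sketch using destructuring with defaults. The class and option names (Packet, PingPacket, sequence, bssid) are placeholders for this example rather than the real Spin Doctor classes.

```javascript
// Hypothetical sketch of a (MAC, {kwargs}) constructor; names are illustrative only.
class Packet {
  constructor(mac) {
    // MAC is the one value every packet type needs, so it stays positional and required.
    if (typeof mac !== 'string' || mac.length === 0) {
      throw new TypeError('mac is required');
    }
    this.mac = mac;
  }
}

class PingPacket extends Packet {
  // Everything else arrives in one object, so order is irrelevant and defaults apply.
  constructor(mac, { sequence = 0, bssid = null } = {}) {
    super(mac);
    this.sequence = sequence;
    this.bssid = bssid;
  }
}

// Options can be given in any order, or left out entirely.
const withOptions = new PingPacket('02:00:00:00:00:01', { bssid: '02:00:00:00:00:10', sequence: 7 });
const withDefaults = new PingPacket('02:00:00:00:00:01');
```

The `= {}` after the destructuring pattern is what lets the second call omit the options object without throwing.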

Now With More BSSID

One of the first features I wanted to include from the OEM Agent is the ability to locate a badge. In Star Trek, this is a pivotal plot device in many episodes; allowing crews to discover that people are missing, to transport to and from locations without complex co-ordinate explanation (or, the interesting-but-weird RFID bodymod used in Stargate) and to recognise when people are misbehaving or in danger. Of course in a typical home environment this is of absolutely no value – since the location will always be “somewhere near the only access point in the building”. But I can see use-cases at hackerspaces and conferences:

“Computer. Locate Gerda from the Club-Mate Distribution Team.”
“Gerda Hackerbrau is near ‘FPS Tournament Pavilion 3’.”

In Logfiles Near You

Of course, right now – there’s nowhere to actually use this data. We have no way to link BSSIDs with locations or APs, nor do we have an interface for querying this information if we did, other than the currently-stubbed “Info: Location” page available from the Combadge display.
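
If that link did exist, the lookup itself could be tiny. The following is a purely hypothetical sketch: the table contents, the function names and the fallback string are all invented for illustration, and nothing like this is in the codebase yet.

```javascript
// Hypothetical sketch of a BSSID-to-location table; all data here is made up.
const locations = new Map([
  ['02:00:00:00:00:10', 'FPS Tournament Pavilion 3'],
  ['02:00:00:00:00:11', 'Club-Mate Distribution Tent'],
]);

// Access points and badges may format BSSIDs differently, so normalise before lookup.
function normaliseBssid(bssid) {
  return bssid.trim().toLowerCase().replace(/-/g, ':');
}

// Returns a human-readable location, or a fallback when the access point is unknown.
function locate(bssid, table = locations) {
  return table.get(normaliseBssid(bssid)) ?? 'an unknown location';
}

const answer = `Gerda Hackerbrau is near ‘${locate('02-00-00-00-00-10')}’.`;
```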

What we do have is a simple console logging mechanism, and as of now that has an appended string produced by a PacketClass.summary() function – which returns the key data for each type of packet. Though now, writing this, I realise posting the “Packet not complete” message in five different subclasses was entirely unnecessary – change for next version, move that to the CombadgePacket superclass!
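
Here is one way that move to the superclass might look, as a sketch rather than the actual fix. CombadgePacket and summary() are the names used above. LocationPacket, requiredFields() and describe() are hypothetical, added only to show the completeness check living in one place.

```javascript
// Hypothetical sketch: the completeness check lives once, in the superclass.
class CombadgePacket {
  constructor(mac, fields = {}) {
    this.mac = mac;
    this.fields = fields;
  }

  // Subclasses list the fields they need before they count as complete.
  requiredFields() {
    return [];
  }

  // Subclasses return their own key data as a short string.
  describe() {
    return '';
  }

  isComplete() {
    return this.requiredFields().every((name) => this.fields[name] !== undefined);
  }

  // Shared by every packet type, so the "not complete" message is written only once.
  summary() {
    if (!this.isComplete()) {
      return 'Packet not complete';
    }
    return `${this.constructor.name} ${this.mac} ${this.describe()}`.trim();
  }
}

// Illustrative subclass; the real packet types and their fields will differ.
class LocationPacket extends CombadgePacket {
  requiredFields() {
    return ['bssid'];
  }

  describe() {
    return `bssid=${this.fields.bssid}`;
  }
}

const incomplete = new LocationPacket('02:00:00:00:00:01').summary();
const complete = new LocationPacket('02:00:00:00:00:01', { bssid: '02:00:00:00:00:10' }).summary();
```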

Still, without actually paying much attention to their exact output, this addition gives similar logging capability to the OEM voice server for the classes that have already been defined, and an easy method to add more later that can be applied to both directions of packet transmission.

Get Yours Today

You can probably guess, but you can pull this down from master as of today, and there’s a tagged release to boot.

Next Steps

Apart from my styling error mentioned above (which would take less time to fix than writing about it did, if not for the pain of editing commits and updating tags) I’m about out of “simple” tasks for the time being. Next up are some big ones:

  • Making the agent responsive to the end of a statement, and threading it so it doesn’t lock up the badge thread.
  • Creating a multi-badge agent system with an additional port opened and closed as needed.
  • Adding some NLP to the agent transcription in order to start interpreting speech into actionable commands.
  • Fixing the hang-up code, and adding badge-badge and badge-group calling.
  • Opening the Combadge and Agent to the outside world.
    • Adding support for Users.
    • Adding location to BSSID translation.
    • Creating hooks to allow external control of badges, and querying information.

As to which I’ll work on next? I’ll think about it in the morning. Picard out.

v0.0.3 – A week of work, 2000 lines of code.

The software does pretty much exactly what it did before. Cool, eh?

Sometimes you need to reset in order to move forward. In this case, the problem was a very rushed, hacked coding style by someone who made up the language as they went along, and wrote half of it like they were working in Python.

The new version of the codebase is completely refactored to push the Combadge functionality further away. It’s also been split up so that the nitty-gritty of packet-level interaction has been separated out from the logic model of the Combadge protocol. This makes the combadged itself cleaner and also means that we have one “big and simple” chunk of code to handle the packet bytes, and one “small and complex” chunk of code to structure the conversation. You can think of it as a huge dictionary full of words, and a small grammar manual.

As well, I’ve moved everything to be an ES6 module, which updates some of the design and also gives us stricter checking – which as a bit of a hack, I definitely benefit from.

I still have some small changes to make to the design of the packet classes which I want to do soon, but the major work of the refactor is done – which should mean I can get on with more “novel” changes soon.

Help is welcome; you can find v0.0.3 on GitHub as per usual.

v0.0.2 – It does something. Not much, but something.

Just a quick note that v0.0.2 has been released on GitHub! It now has the beginnings of inferencing using Coqui STT. Literally all we’re doing is converting the audio from 8-bit, 8 kHz μ-law to 16-bit, 16 kHz LPCM so it can be fed into the generic “english huge vocabulary” model Coqui provide, but this is still a good achievement!
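
For a feel of what that conversion involves, here is a self-contained sketch. It is not the code in the release: it uses a hand-written G.711 μ-law decoder in place of a library, and doubles the sample rate with plain linear interpolation, which is crude next to a proper resampling filter. The function names are made up for this example.

```javascript
// Hypothetical sketch of the 8-bit mu-law @ 8 kHz -> 16-bit LPCM @ 16 kHz conversion.

// Decode one G.711 mu-law byte into a signed 16-bit linear sample.
function mulawToLinear(byte) {
  const u = ~byte & 0xff;            // mu-law bytes are stored bit-inverted
  const exponent = (u >> 4) & 0x07;  // 3-bit segment number
  const mantissa = u & 0x0f;         // 4-bit step within the segment
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return (u & 0x80) ? -magnitude : magnitude;
}

// Decode a whole buffer of mu-law bytes.
function decodeMulaw(bytes) {
  const out = new Int16Array(bytes.length);
  for (let i = 0; i < bytes.length; i++) {
    out[i] = mulawToLinear(bytes[i]);
  }
  return out;
}

// Double the sample rate by inserting the midpoint between neighbouring samples.
function upsample2x(samples) {
  const out = new Int16Array(samples.length * 2);
  for (let i = 0; i < samples.length; i++) {
    const current = samples[i];
    const next = i + 1 < samples.length ? samples[i + 1] : current;
    out[2 * i] = current;
    out[2 * i + 1] = (current + next) >> 1;
  }
  return out;
}

// 0xff is mu-law silence, 0x80 is the loudest positive value.
const pcm16k = upsample2x(decodeMulaw(Uint8Array.from([0xff, 0x80])));
```

The resulting Int16Array holds twice as many samples as the input had bytes, which is the shape a 16 kHz, 16-bit speech model expects.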

When v0.0.1 was released, I’d managed to play audio from a file on the server to a Combadge over RTP. Now, we’ve gone both ways. Spin Doctor can play audio to the badge, and it can record audio from the badge. It’s also been a bit of a refresher in audio coding. That’s a huge step from where the code was yesterday, let alone three days ago when I hadn’t successfully initialised a call.

There are also a few more “inert” changes in Communicator.js where I’ve prototyped a few more untested badge commands based on packet captures. Additionally, as a holdover from an experiment in refactoring agentd into the main codebase, there’s the early stub of a slightly-generalised RTP library, which I plan to build on for the time being.

Small changes, but significant ones!

v0.0.1 – It does nothing, but it does it well.

[Image: A B3000 Combadge with "Computer" displayed on the OLED, connected to the Spin Doctor voice agent.]

Spin Doctor v0.0.1 has now been released on GitHub. I won’t rehash the commentary again, but there is now a rudimentary version of this software in existence. Hooray! What I will run through here is a couple of the design decisions I made.

Why nodejs?

Simply put, I’m lazy. At the scale this is intended for, node is a perfectly adequate language. The server doesn’t need huge speed (anything that needs to be fast will hopefully be a compiled module), and there are some handy advantages of using a popular scripting language. First, it’s very easy to write class-based network server code in node. Second, the alawmulaw module by Rafael S. Rocha has no dependencies and is MIT licenced, so it’s trivial to use it in the project. Third, there are node bindings for Coqui STT (formerly Mozilla DeepSpeech), which is also compatible with AGPL – and this seems the best way to start using speech recognition.

Java and Python were also considered. Java was rejected on several grounds – it’s heavy, fragmented and has problematic licencing. Plus, the OEM server is written in it. Python was an option, but JavaScript is easier to write network and class code in, and Python’s varied bundling in OSes and awkward dependency packaging are also not ideal – Node is unheard of as a system library, making it a “clean” install.

The biggest disadvantage of writing in node is the lack of a universally accepted JavaScript style guide, which may become an issue once there are multiple contributors to this project.

What’s with the README.md’s obsession with coffee machines?

In the TV series whence the Combadge concept originated, there are devices called “Replicators” that dispense food, beverages, tools and other items on request through some kind of technomagical material transmutation or energy-to-matter conversion, depending on which episode you’re watching at the time. We obviously don’t have that technology, but since the most common use of the technology in the show is to dispense beverages – there’s no reason we couldn’t achieve the same practical effect by mundane means. Some brands (who shall remain nameless until they take up our offer) even have wall-integrated, network-connected bean-to-cup machines that appear, to all the world, like a replicator.

So, I want to make it real. Voice controlled “Coffee, Black.” through my sci-fi themed voice agent.

How do I get started?

This one isn’t so easy. To even test this, you’ll need a handful of things:

  • A server running Ubuntu (I assume, haven’t tested it on anything else), git and nodejs.
  • A B3000 combadge running similar firmware to mine. I get mine from computer recyclers that sell them on a well-known eCommerce platform.
  • A battery, charger, and clip for the above.
  • A wireless network that matches one of the prebuilt profiles of the combadge.
  • Google.

There are instructions provided by the OEM themselves on how to enter the Badge Configuration Menu and set the wifi settings and IP addresses using the built in profiles. If you can work this out, you should be able to set up your server on the right IP and start the software.

In theory, if you have the badge kit and a Pi 4 – that should be everything you need (with isc-dhcp-server and hostapd on the Pi) to get started. When you inevitably have issues, get involved on the GitHub!

Behind the Scenes

Well, the menu won’t work properly without at least one post in the category, so I’d best stick something useful here.

Currently, the Spin Doctor code only works with the B3000N, as recent badge software releases are slightly more tolerant to “bad datagrams” from the server than older variants. As for what that code can do, it’s very advanced! It can:

  • Respond to the badge’s initial ping and make it send an ACK.
  • Crash the badge.
  • Force the badge into an “updater mode” that fails to connect to the update server.
  • Turn the badge off.

As you can see, it’s highly advanced and has close feature-parity with the OEM Voice Server. It’s naturally not ready to be made public. But, for a little more information –

The current generation of the code is written in NodeJS, following the traditional model of “scripted boilerplate, handing off to more comprehensive native code for difficult math”. The voice server is very rudimentary, but the model is currently based around a command and control thread which integrates user management, which will hand off to separate threads for RTP and audio conversion, inferencing and interpretation.

There is also a goal to try and write the badge control protocol into its own API – I’m hoping the badge isn’t completely tightly coupled to the genie model – and that calls and conferences could be instantiated by the server without a speech-to-text concept ever being applied. This could allow for other use-cases outside of the “Picard to Riker” default.