These fake commands are apparently called “Cocaine Noodles,” a great name invented by Georgetown researcher Tavish Vaidya that comes from the way Android devices can interpret “Cocaine Noodles” as “Okay, Google” (it can also be used as a fun Vanderpump Rules reference). The researchers working on these secret musical Cocaine Noodles were specifically trying to be as “stealthy” about it as possible, since a study in 2016 had already shown that even white noise playing in a YouTube video could trick a phone into turning on airplane mode or opening a webpage.
The newer approach can do much more, though, and while the researchers don’t think any “malicious people” are actually doing this in real life, it’ll probably just be a matter of time before they do. Basically, the whole thing works by taking advantage of the way speech recognition systems translate specific sounds into specific letters. By making “slight changes” to a sound, an audio attacker (Cocaine Noodler?) can change what the machine hears without changing what the human ear would hear. One system the researchers developed could even mute a phone before doing anything else, so the owner wouldn’t hear it responding to commands.
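For the curious, here’s a rough sense of what “slight changes” means. This is a minimal toy sketch in Python, assuming a made-up linear “keyword scorer” as a stand-in for a real speech recognizer; it is not the researchers’ actual attack, just an illustration of how a nudge far smaller than the audio itself can still flip what the machine “hears.”

```python
# A deliberately tiny, self-contained sketch of the *idea* behind these
# attacks -- not the researchers' actual method. The "recognizer" here is
# a made-up linear keyword scorer, not a real speech model.

import numpy as np

rng = np.random.default_rng(0)

SAMPLES = 16_000                 # one second of 16 kHz audio
THRESHOLD = 1.0                  # score above this = "wake word heard"

# Hypothetical weights of the toy keyword scorer.
weights = rng.normal(size=SAMPLES) * 0.01

def keyword_score(audio: np.ndarray) -> float:
    """Toy recognizer: a score above THRESHOLD means 'wake word heard'."""
    return float(weights @ audio)

# An innocuous clip (random noise standing in for music). Its score sits
# far below the threshold, so the "assistant" hears nothing.
clip = rng.normal(size=SAMPLES) * 0.1
print("original score:", keyword_score(clip))

# Because this toy scorer is linear, the most efficient nudge is a tiny
# step in the direction of its weights (real attacks do the analogous
# thing iteratively against a deep network's gradients). We pick the
# smallest per-sample change that just clears the threshold.
needed_gain = (THRESHOLD + 0.1) - keyword_score(clip)
epsilon = needed_gain / np.sum(np.abs(weights))
adversarial_clip = clip + epsilon * np.sign(weights)

print("perturbed score:", keyword_score(adversarial_clip))  # now above THRESHOLD
print("largest per-sample change:", float(np.max(np.abs(adversarial_clip - clip))))
print("typical clip amplitude:", float(np.std(clip)))
```

Running it shows the per-sample change is a small fraction of the clip’s own loudness, which is the whole trick: the machine’s decision moves while the audio, to a human ear, barely does.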
Naturally, Apple, Amazon, and Google all maintain that their systems are secure and that nobody can do anything nefarious with their technology, but anyone who has had to repeat the same simple command to one of these devices knows that they’re not as infallible as the companies want us to believe. Also, the Times story notes that devices with digital assistants like Alexa or Siri will “outnumber people by 2021,” and if that’s not scary, then you’re probably a Terminator already.