this post was submitted on 02 Sep 2024
156 points (99.4% liked)
technology
Not that I disagree with your conclusion, because there's an even simpler way to check if an app is listening: iOS and Android will tell you when the mic is being used... Anyway, we do have always-on NNs listening for keywords ("Siri", "Hey Google", "Alexa"), so while I agree that full-ass voice transcription like Whisper will run like dogshit on your phone, they can certainly run a much, much lighter model to pick up a handful of keywords.
To Camdat's point, general transcription is definitely not low-power, even if you have some kind of gating on when it transcribes. Obviously Apple, Google, Samsung, and whoever else makes the phone can turn on the mic without you knowing (otherwise how would their voice assistants work?), but Apple probably isn't letting Facebook have access to the mic without throwing something up on the status bar.
WhatsApp is sending your audio to the cloud to handle transcription. This is not an accurate test because it is not an on-device process.
Sure, this is definitely true. I should clarify that single-word NNs do run on-device all the time, but those require specialized models that are trained only on those keywords. Once one of those models triggers, everything after that has to be sent to the cloud.
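To make the gating idea concrete, here's a rough sketch of what I mean. It is not how any real assistant is implemented: `gate_audio`, `is_wake_word`, and `frames_after_trigger` are all made-up names, and the "frames" are plain strings standing in for audio chunks. The cheap detector sees every frame, and only the frames after a trigger would ever leave the device.

```python
from typing import Callable, Iterable, List


def gate_audio(frames: Iterable[str],
               is_wake_word: Callable[[str], bool],
               frames_after_trigger: int = 3) -> List[str]:
    """Return only the frames that would be sent off-device.

    `is_wake_word` stands in for the tiny always-on keyword model
    (hypothetical); everything else here is an illustrative assumption.
    """
    to_cloud: List[str] = []
    remaining = 0  # frames still to forward after the last trigger
    for frame in frames:
        if remaining > 0:
            # Gate is open: this frame would go to the cloud transcriber.
            to_cloud.append(frame)
            remaining -= 1
        elif is_wake_word(frame):
            # The cheap always-on model fired; open the gate.
            remaining = frames_after_trigger
        # Otherwise the frame is dropped on-device and never leaves.
    return to_cloud


frames = ["chatter", "alexa", "what's", "the", "weather", "more chatter"]
sent = gate_audio(frames, lambda f: f == "alexa")
print(sent)  # ["what's", 'the', 'weather']
```

Everything before the wake word and everything after the short window is discarded locally, which is why the always-on part can stay cheap.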
I agree. If I were going to do something like this for advertising, though, I wouldn't really care too much about what people were saying, so instead I'd just listen for some limited set of keywords (maybe for some of my top-paying advertisers) and serve ads for keywords that hit recently. Keep it all on-device until an ad actually needs to be served.
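Something like this is what I have in mind, as a purely hypothetical sketch. `RecentKeywordHits` and its methods are names I made up, the keyword detector itself is assumed to exist elsewhere and just hands over detected words, and hits are assumed to arrive in time order. Nothing is stored except timestamps for words on the advertiser list.

```python
import time
from collections import deque
from typing import Deque, Optional, Set, Tuple


class RecentKeywordHits:
    """Keep on-device hits for a fixed keyword list within a time window."""

    def __init__(self, keywords: Set[str], window_seconds: float = 3600.0):
        self.keywords = {k.lower() for k in keywords}
        self.window_seconds = window_seconds
        self._hits: Deque[Tuple[float, str]] = deque()  # (timestamp, keyword)

    def observe(self, detected_word: str, now: Optional[float] = None) -> None:
        """Record a detected word only if it is on the keyword list."""
        now = time.time() if now is None else now
        word = detected_word.lower()
        if word in self.keywords:
            self._hits.append((now, word))

    def recent(self, now: Optional[float] = None) -> Set[str]:
        """Return keywords that hit inside the window, dropping older hits."""
        now = time.time() if now is None else now
        while self._hits and now - self._hits[0][0] > self.window_seconds:
            self._hits.popleft()
        return {word for _, word in self._hits}


tracker = RecentKeywordHits({"pizza", "sneakers"}, window_seconds=60)
tracker.observe("pizza", now=0)
tracker.observe("weather", now=10)   # not on the list, ignored
tracker.observe("sneakers", now=50)
print(tracker.recent(now=70))  # {'sneakers'}
```

The only thing that would ever need to leave the device is the small set returned by `recent()`, and only at the moment an ad is requested.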