NextElephant9

joined 1 month ago

Knowledge distillation is training a smaller model to mimic the outputs of a larger model. You don't need the same training set that was used to train the larger model (the whole internet, or whatever they used for ChatGPT); a separate transfer set will do.

Here's a reference: Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the Knowledge in a Neural Network." arXiv preprint arXiv:1503.02531 (2015). https://arxiv.org/pdf/1503.02531
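
To make the idea concrete, here's a minimal sketch of the distillation loss in PyTorch. The toy teacher/student models, the random "transfer set", the temperature, and the training loop are all hypothetical stand-ins just to make it runnable, not anything from the paper's experiments:

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-ins: a "large" frozen teacher and a small student (sizes are illustrative).
teacher = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.ReLU(), torch.nn.Linear(64, 10))
student = torch.nn.Linear(16, 10)
optimizer = torch.optim.SGD(student.parameters(), lr=0.1)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both output distributions with a temperature, then match them via KL divergence.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # The T^2 factor (from Hinton et al.) keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature ** 2

# Transfer set: any inputs you can run the teacher on; random data here only so the sketch runs.
transfer_set = torch.randn(256, 16)

for epoch in range(5):
    for batch in transfer_set.split(32):
        with torch.no_grad():
            teacher_logits = teacher(batch)  # teacher is frozen; we only need its outputs
        loss = distillation_loss(student(batch), teacher_logits)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

The point is that the student never needs the teacher's original training data or labels: the teacher's softened outputs on the transfer set are the training signal.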

[–] NextElephant9@awful.systems 8 points 1 month ago

Just received a newsletter from Mystery AI Hype Theater 3000 announcing that their book The AI Con: How to Fight Big Tech's Hype and Create the Future We Want is available for preorder. I'm looking forward to it!

[–] NextElephant9@awful.systems 24 points 1 month ago (1 children)

Hi, I'm new here. I mean, I've been reading but I haven't commented before.

I'm sure you all know how cheap labour is used to label data for training "AI" systems, but I just came across this video and wanted to share. Apologies if it has already been posted: Training AI takes heavy toll on Kenyans working for $2 an hour.