TechTakes

1435 readers

188 users here now

Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community

founded 1 year ago

MODERATORS

dgerard@awful.systems

come see all the popular super-duper-autocomplete systems failing hard at really simple reasoning questions and babbling nonsense from latent space! (arxiv.org)

submitted 5 months ago by dgerard@awful.systems to c/techtakes@awful.systems

12 comments fedilink hide all child comments

you are viewing a single comment's thread
view the rest of the comments

[–] 200fifty@awful.systems 31 points 5 months ago* (last edited 5 months ago)

This is my favorite LLM response from the paper I think:

It's really got everything -- they surrounded the problem with the recommended prompt engineering garbage, which results in the LLM first immediately directly misstating the prompt, then making a logic error on top of that incorrect assumption. Then when it tries to consider alternate possibilities it devolves into some kind of corporate-speak nonsense about 'inclusive language', misinterprets the phrase 'inclusive language', gets distracted and starts talking about gender identity, then makes another reasoning error on top of that! (Three to five? What? Why?)

And then as the icing on the cake, it goes back to its initial faulty restatement of the problem and confidently plonks that down as the correct answer surrounded by a bunch of irrelevant waffle that doesn't even relate to the question but sounds superficially thoughtful. (It doesn't matter how many of her nb siblings might identify as sisters because we already know exactly how many sisters she has! Their precise gender identity completely doesn't matter!)

Truly a perfect storm of AI nonsense.