this post was submitted on 04 Oct 2024
19 points (85.2% liked)

technology


I know people here are very skeptical of AI in general, and there is definitely a lot of hype, but I think the progress in the last decade has been incredible.

Here are some quotes:

“In my field of quantum physics, it gives significantly more detailed and coherent responses” than did the company’s last model, GPT-4o, says Mario Krenn, leader of the Artificial Scientist Lab at the Max Planck Institute for the Science of Light in Erlangen, Germany.

Strikingly, o1 has become the first large language model to beat PhD-level scholars on the hardest series of questions — the ‘diamond’ set — in a test called the Graduate-Level Google-Proof Q&A Benchmark (GPQA). OpenAI says that its scholars scored just under 70% on GPQA Diamond, and o1 scored 78% overall, with a particularly high score of 93% in physics.

OpenAI also tested o1 on a qualifying exam for the International Mathematics Olympiad. Its previous best model, GPT-4o, correctly solved only 13% of the problems, whereas o1 scored 83%.

Kyle Kabasares, a data scientist at the Bay Area Environmental Research Institute in Moffett Field, California, used o1 to replicate some coding from his PhD project that calculated the mass of black holes. “I was just in awe,” he says, noting that it took o1 about an hour to accomplish what took him many months.

Catherine Brownstein, a geneticist at Boston Children’s Hospital in Massachusetts, says the hospital is currently testing several AI systems, including o1-preview, for applications such as connecting the dots between patient characteristics and genes for rare diseases. She says o1 “is more accurate and gives options I didn’t think were possible from a chatbot”.

[–] woodenghost@hexbear.net 6 points 1 month ago (1 children)

I get it, but code isn't usually included in publications, unless it was put on GitHub.

[–] Barx@hexbear.net 7 points 1 month ago

Physicist code tends to be pretty simple, particularly when it's just implementing some closed-form solution. It's also possible that a model focused on parsing the math in papers, like the equations in his thesis, would just reproduce that math in Python or whatever.
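
To give a sense of what "implementing a closed-form solution" can look like, here is a minimal sketch. It is not the code or the method from the thesis mentioned above; it just uses the textbook Keplerian estimate M = v² r / G for the mass enclosed by a circular orbit. The function name and the input numbers are made up for illustration.

```python
# Illustrative only: Keplerian enclosed-mass estimate, M = v^2 * r / G.
# This is an assumed stand-in example, not the method or code from the thesis.

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30     # solar mass, kg
PARSEC = 3.0857e16   # one parsec, m


def enclosed_mass_msun(v_kms: float, r_pc: float) -> float:
    """Mass in solar masses enclosed by a circular orbit of speed v at radius r."""
    v = v_kms * 1e3      # km/s -> m/s
    r = r_pc * PARSEC    # pc -> m
    return v**2 * r / G / M_SUN


if __name__ == "__main__":
    # Hypothetical inputs: gas orbiting at 500 km/s, 10 pc from the center.
    print(f"{enclosed_mass_msun(500.0, 10.0):.2e} solar masses")
```

The whole calculation is one formula plus unit conversions, which is the kind of thing that translates almost directly from an equation on the page into a few lines of Python.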