AI still can’t answer questions about history: study

theloadedgunn.com 21 January 2025

While artificial intelligence excels at tasks like coding and podcast generation, it struggles to accurately answer high-level history questions, according to a study.

Researchers tested OpenAI’s GPT-4, Meta’s Llama and Google’s Gemini using a newly developed benchmark called Hist-LLM.

The benchmark relies on the Seshat Global History Databank, a comprehensive database of historical knowledge.

The study, which was presented at the NeurIPS AI conference last month, found disappointing results, according to TechCrunch.

GPT-4 Turbo performed best but achieved only about 46% accuracy, barely above random guessing.

“LLMs, while impressive, still lack the depth required for advanced history,” said Maria del Rio-Chanona, a co-author of the paper and associate professor at University College London.

“They’re great for basic facts, but they fail at nuanced, PhD-level historical inquiries.”

Researchers found that LLMs often extrapolate from prominent historical data but struggle with more obscure details.

For instance, GPT-4 incorrectly stated that scale armor was present in ancient Egypt during a specific time period, when in fact the technology appeared there only 1,500 years later.

Similarly, the model falsely claimed ancient Egypt had a professional standing army during a particular period, likely due to the prevalence of information on standing armies in other ancient empires, such as Persia.

“If you get told A and B 100 times, and C only once, you’re more likely to recall A and B,” del Rio-Chanona explained.

Another concern was potential bias.

OpenAI’s GPT-4 and Meta’s Llama models performed worse when answering questions about regions such as sub-Saharan Africa, indicating training data limitations.

“These biases suggest LLMs reflect gaps in historical documentation rather than an unbiased representation of history,” said Peter Turchin, the study’s lead researcher.

Despite these limitations, researchers remain hopeful that AI can assist historians in the future.

They plan to refine the Hist-LLM benchmark by incorporating more diverse data sources and increasing the complexity of the questions.

“Our findings highlight areas where LLMs need improvement, but they also showcase their potential to support historical research,” the paper concluded.

As AI continues to evolve, experts say it is clear that human historians remain irreplaceable in interpreting complex historical narratives and ensuring accuracy in academic inquiry.
