New open-source platform allows users to evaluate performance of AI-powered chatbots

  • 📰 ScienceDaily

Researchers have developed a platform for the interactive evaluation of AI-powered chatbots such as ChatGPT.

A team of computer scientists, engineers, mathematicians and cognitive scientists developed an open-source evaluation platform called CheckMate, which allows human users to interact with and evaluate the performance of large language models (LLMs).

The researchers suggest that models which communicate uncertainty, respond well to user corrections, and provide a concise rationale for their recommendations make better assistants. Given the current shortcomings of LLMs, human users should verify their outputs carefully. The findings could be useful both in informing AI literacy training and in helping developers improve LLMs for a wider range of uses.

"When talking to mathematicians about LLMs, many of them fall into one of two main camps: either they think that LLMs can produce complex mathematical proofs on their own, or that LLMs are incapable of simple arithmetic," said co-first author Katie Collins from the Department of Engineering. "Of course, the truth is probably somewhere in between, but we wanted to find a way of evaluating which tasks LLMs are suitable for and which they aren't."

"One of the things we found is the surprising fallibility of these models," said Collins. "Sometimes, these LLMs will be really good at higher-level mathematics, and then they'll fail at something far simpler. It shows that it's vital to think carefully about how to use LLMs effectively and appropriately."

 

