OpenAI Wants AI to Help Humans Train AI

wired.com

Summary

OpenAI developed a new model by fine-tuning its most powerful offering, GPT-4, to assist human trainers tasked with assessing code. The company found that the new model, dubbed CriticGPT, could catch bugs that humans missed. It is also part of an effort to ensure that AI behaves in acceptable ways even as it becomes more capable.

Dylan Hadfield-Menell, a professor at MIT who researches ways to align AI, says the idea of having AI models help train more powerful ones has been kicking around for a while. He says it remains to be seen how generally applicable and powerful it is.

4 Comments

eldiscipulo

A little telling most people don't ever want to talk about RLHF since it detracts from AI being an intelligent system by itself.

getthatmoneyyo

Can't ruin the illusion

iareuniqueOP

This is at least their way of admitting that

practicalmagic

Makes sense. Models need intervention to ensure the right output. People need to be compensated appropriately for the time and data.

Enjoying SpeakBits?

OpenAI Wants AI to Help Humans Train AI

Summary

4 Comments

Enjoying SpeakBits?

4 Comments