
Tweet

2024-08-14T15:17:02+00:00

Thoughts on lmsys. Say a large cohort of human annotators gives feedback for RLHF and generates data, and you train a model on that. Then you direct the same cohort of people onto lmsys. Your model will score artificially well there, because it has already learned the preferences of those people.
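A toy sketch of that effect, not a claim about how lmsys or any real RLHF pipeline works. Everything here is an assumption made for illustration: tastes live on a single made-up axis, the "tuned" model simply adopts the cohort's average taste, the baseline sits at the general population's average, and a voter picks whichever model is closer to their own taste.

```python
import random


def win_rate(style_a, style_b, voter_tastes):
    """Fraction of voters whose taste is closer to style_a than to style_b."""
    wins = sum(abs(t - style_a) < abs(t - style_b) for t in voter_tastes)
    return wins / len(voter_tastes)


rng = random.Random(0)

# Hypothetical 1-D taste axis (think "how verbose should answers be").
# The means and spreads below are invented numbers, not measurements.
cohort = [rng.gauss(1.0, 0.5) for _ in range(10_000)]   # the annotator cohort
general = [rng.gauss(0.0, 1.0) for _ in range(10_000)]  # everyone else

# Stand-in for RLHF on cohort feedback: the model lands on the cohort's mean taste.
style_tuned = sum(cohort) / len(cohort)
# Baseline model of equal quality, centred on the general population.
style_baseline = 0.0

cohort_rate = win_rate(style_tuned, style_baseline, cohort)
general_rate = win_rate(style_tuned, style_baseline, general)

print(f"tuned model win rate, cohort voting:  {cohort_rate:.2f}")
print(f"tuned model win rate, general voting: {general_rate:.2f}")
```

Both models are equally "good" in this setup; the only thing that changes between the two printed numbers is who is voting.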

Where else should I get information from? Contact me if I'm missing something; I really appreciate new, interesting sources of information.