TAMING OVERCONFIDENCE IN LLMS: REWARD CALIBRATION IN RLHF
- Jixuan Leng
- Chengsong Huang
- Banghua Zhu
- Jiaxin Huang
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
Scopus citations: 2