Level 4 · Expert · 9 min
RLHF
Reinforcement learning from human feedback. The training stage that turns a base model into a helpful assistant.