[FEATURE] Add GRPO Support #900
Open
Labels: type/feature (Feature request)
🚀 Feature
Add GRPO Support
Motivation
With the release of DeepSeek's R1 model, GRPO (Group Relative Policy Optimization) has been shown to be a powerful way to instill reasoning capabilities in models when either labeled data or a verifier is available. This request is to add support for training a model with GRPO, perhaps with a focus on building reasoning abilities.