[FEATURE] Add GRPO Support #900

🚀 Feature

Add GRPO Support

Motivation

With the release of DeepSeek's R1 model, GRPO (Group Relative Policy Optimization) has been shown to be a powerful way to instill reasoning capabilities in models when either labeled data or a verifier is available. This request is to add support for training a model with GRPO, perhaps with a focus on building reasoning abilities.
