lblaoke/qwama-0.5b-hh-rlhf-sft-chosen-trl-v4
num_train_epochs: 1
learning_rate: 2e-4
total_batch_size: 32
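
The card gives only the total batch size, not how it was reached. As a rough sketch, a total of 32 is usually the product of a per-device batch size, a gradient accumulation factor, and a device count. The split of 4 x 2 x 4 below and the helper names `effective_batch_size` and `steps_per_epoch` are illustrative assumptions, not values or functions from this model's training setup.

```python
import math

# Hyperparameters as stated on the model card.
HPARAMS = {
    "num_train_epochs": 1,
    "learning_rate": 2e-4,
    "total_batch_size": 32,
}


def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int) -> int:
    """Number of examples consumed per optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


def steps_per_epoch(num_examples: int, total_batch_size: int) -> int:
    """Optimizer steps in one pass over the data, keeping the last partial batch."""
    return math.ceil(num_examples / total_batch_size)


# Hypothetical split (an assumption): 4 per device, 2 accumulation steps, 4 devices.
assumed_split = {
    "per_device_batch_size": 4,
    "gradient_accumulation_steps": 2,
    "num_devices": 4,
}

# The assumed split reproduces the stated total batch size.
assert effective_batch_size(**assumed_split) == HPARAMS["total_batch_size"]
```

Any other factorization of 32 (for example 8 x 4 x 1 on a single device) would give the same total. With one epoch, the number of optimizer steps is simply `steps_per_epoch(num_examples, 32)` for whatever number of training examples is used.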