pszemraj/simplepile-lite
How to use pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e")
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e")
model = AutoModelForCausalLM.from_pretrained("pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e")
How to use pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Or use Docker:
docker model run hf.co/pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e
How to use pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
How to use pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e with Docker Model Runner:
docker model run hf.co/pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e
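The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library, assuming a vLLM or SGLang server is already running locally and returns the usual OpenAI-style completions response (a `choices` list whose items carry a `text` field). The helper names `build_completion_request` and `complete` are hypothetical and exist only for this illustration.

```python
import json
import urllib.request

MODEL_ID = "pszemraj/pythia-31m-simplepile-lite-2048-scratch-2e"


def build_completion_request(base_url, prompt, max_tokens=512, temperature=0.5):
    """Build the same POST request as the curl examples (hypothetical helper)."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, **kwargs):
    """Send the request and return the generated text.

    Assumes an OpenAI-compatible response body: {"choices": [{"text": ...}]}.
    """
    request = build_completion_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Example usage (requires a running server):
#   vLLM:   complete("http://localhost:8000", "Once upon a time,")
#   SGLang: complete("http://localhost:30000", "Once upon a time,")
```

Only the base URL differs between the two servers (port 8000 for vLLM, port 30000 for SGLang in the commands above), so it is passed as a parameter rather than hard-coded.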
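Trained from scratch, using the config of EleutherAI/pythia-31m, on the pszemraj/simplepile-lite dataset. It achieves the following results on the evaluation set (values from the final logged step, 2900, in the training results table):
Loss: 3.9891
Accuracy: 0.3498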
More information needed
The following hyperparameters were used during training:

Training results:
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 7.4089 | 0.07 | 100 | 7.3885 | 0.1133 |
| 6.2774 | 0.13 | 200 | 6.2091 | 0.1621 |
| 5.7019 | 0.2 | 300 | 5.7450 | 0.1890 |
| 5.4922 | 0.27 | 400 | 5.4697 | 0.2080 |
| 5.233 | 0.33 | 500 | 5.2846 | 0.2195 |
| 5.0523 | 0.4 | 600 | 5.1479 | 0.2296 |
| 4.9396 | 0.47 | 700 | 5.0391 | 0.2376 |
| 4.7633 | 0.53 | 800 | 4.9366 | 0.2458 |
| 4.7516 | 0.6 | 900 | 4.8339 | 0.2559 |
| 4.5937 | 0.67 | 1000 | 4.7286 | 0.2676 |
| 4.5079 | 0.73 | 1100 | 4.6293 | 0.2798 |
| 4.4608 | 0.8 | 1200 | 4.5433 | 0.2903 |
| 4.3426 | 0.87 | 1300 | 4.4719 | 0.2988 |
| 4.1722 | 0.93 | 1400 | 4.4089 | 0.3057 |
| 4.1655 | 1.0 | 1500 | 4.3585 | 0.3107 |
| 4.0927 | 1.07 | 1600 | 4.3101 | 0.3161 |
| 4.1439 | 1.13 | 1700 | 4.2714 | 0.3206 |
| 4.0064 | 1.2 | 1800 | 4.2330 | 0.3249 |
| 4.0633 | 1.27 | 1900 | 4.2015 | 0.3281 |
| 3.9948 | 1.33 | 2000 | 4.1702 | 0.3311 |
| 3.9389 | 1.4 | 2100 | 4.1439 | 0.3338 |
| 3.8833 | 1.47 | 2200 | 4.1200 | 0.3367 |
| 3.8411 | 1.53 | 2300 | 4.0949 | 0.3395 |
| 3.8481 | 1.6 | 2400 | 4.0764 | 0.3408 |
| 3.8397 | 1.67 | 2500 | 4.0578 | 0.3420 |
| 3.8897 | 1.73 | 2600 | 4.0383 | 0.3440 |
| 3.8785 | 1.8 | 2700 | 4.0206 | 0.3459 |
| 3.8126 | 1.87 | 2800 | 4.0044 | 0.3478 |
| 3.783 | 1.93 | 2900 | 3.9891 | 0.3498 |
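To make the loss values in the table easier to interpret, they can be converted to perplexity. The sketch below assumes the reported validation loss is the mean per-token cross-entropy in nats (the usual convention for causal language model training scripts, but not stated in this card), in which case perplexity is simply `exp(loss)`.

```python
import math

# Validation losses copied from the first and last rows of the table above.
first_logged_loss = 7.3885   # step 100
final_logged_loss = 3.9891   # step 2900


def perplexity(mean_cross_entropy_nats):
    """Perplexity under the assumption that the loss is mean cross-entropy in nats."""
    return math.exp(mean_cross_entropy_nats)


first_ppl = perplexity(first_logged_loss)
final_ppl = perplexity(final_logged_loss)

print(f"step 100 perplexity:  {first_ppl:.1f}")
print(f"step 2900 perplexity: {final_ppl:.1f}")  # roughly 54
```

Under that assumption, the final validation loss of 3.9891 corresponds to a perplexity of about 54 on the evaluation set.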
Open LLM Leaderboard evaluation results:
| Metric | Value |
|---|---|
| Avg. | 24.7 |
| ARC (25-shot) | 21.59 |
| HellaSwag (10-shot) | 25.79 |
| MMLU (5-shot) | 24.99 |
| TruthfulQA (0-shot) | 50.62 |
| Winogrande (5-shot) | 48.62 |
| GSM8K (5-shot) | 0.0 |
| DROP (3-shot) | 1.32 |
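As a quick consistency check, the "Avg." row can be reproduced from the seven benchmark scores. This sketch assumes the average is the unweighted mean of the individual rows, which matches the reported value.

```python
# Benchmark scores copied from the table above.
scores = {
    "ARC (25-shot)": 21.59,
    "HellaSwag (10-shot)": 25.79,
    "MMLU (5-shot)": 24.99,
    "TruthfulQA (0-shot)": 50.62,
    "Winogrande (5-shot)": 48.62,
    "GSM8K (5-shot)": 0.0,
    "DROP (3-shot)": 1.32,
}

# Unweighted mean over all seven benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 24.7
```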