Releases: aidos-lab/grokking-via-lid
First public release of grokking-via-lid
First public release of the grokking experiments codebase for our paper "Less is More: Local Intrinsic Dimensions of Contextual Language Models" (to appear at NeurIPS 2025).
What's Changed
- Refactor/anonymize code by @ben300694 in #24
- Update/change python build backend by @ben300694 in #25
- Update/edit licensing information and update references by @ben300694 in #26
Full Changelog: v0.2.0...v0.3.0
Grokking - Supplemental material for NeurIPS 2025 submission (anonymized)
This release contains the first fully documented and cleaned-up version of the grokking experiments presented in the paper draft "Less is More: Local Intrinsic Dimensions of Contextual Language Models".
The code has been anonymized, but this release also includes the anonymization map that was used, so the authors of this version of the code remain identifiable.
What's Changed
- Feature/add hydra logging and return hidden states by @ben300694 in #1
- Feature/collect hidden states by @ben300694 in #2
- Feature/preprocess collected hidden states by @ben300694 in #3
- Feature/set seed for datasets and models by @ben300694 in #4
- Feature/local estimates computation by @ben300694 in #5
- Feature/local estimates logging in wandb by @ben300694 in #6
- Experiments/with weight decay and longer runs by @ben300694 in #7
- Feature/log example batches by @ben300694 in #8
- Feature/save model checkpoints by @ben300694 in #9
- Feature/load model checkpoints by @ben300694 in #10
- Feature/save random state in checkpoints by @ben300694 in #11
- Refactor/fix licensing information by @ben300694 in #12
- Adding the value 10000 to the number_of_samples_choices list, which f… by @ben300694 in #13
- Adding a ModMultiplyDataset by @ben300694 in #14
- Feature/visualize output projections by @ben300694 in #15
- Feature/make epsilon parameter in adam optimizer configurable by @ben300694 in #16
- Adding option to register model with wandb.watch so that parameters a… by @ben300694 in #17
- Feature/clip gradient norm by @ben300694 in #18
- Experiments/run different training data portions by @ben300694 in #19
- Feature/linear learning rate schedule by @ben300694 in #20
- Feature/sample from all tokens in input sequences by @ben300694 in #21
- Feature/grokking runs for larger groups by @ben300694 in #22
- Refactor/updated documentation by @ben300694 in #23
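For readers unfamiliar with the setup, the entries above on seeding, the modular multiplication dataset, and training data portions can be illustrated with a minimal sketch. This is not the repository's code: the function names `make_mod_multiply_examples` and `split_examples`, the modulus, the training fraction, and the seed are all hypothetical choices made for illustration.

```python
import random


def make_mod_multiply_examples(modulus: int) -> list[tuple[int, int, int]]:
    """Enumerate all triples (a, b, (a * b) % modulus) with a, b in [0, modulus)."""
    return [
        (a, b, (a * b) % modulus)
        for a in range(modulus)
        for b in range(modulus)
    ]


def split_examples(
    examples: list[tuple[int, int, int]],
    train_fraction: float,
    seed: int,
) -> tuple[list[tuple[int, int, int]], list[tuple[int, int, int]]]:
    """Shuffle with a fixed seed and split into training and validation parts."""
    rng = random.Random(seed)  # local generator, so the global random state is untouched
    shuffled = list(examples)  # copy, so the caller's list is not reordered
    rng.shuffle(shuffled)
    cutoff = int(len(shuffled) * train_fraction)
    return shuffled[:cutoff], shuffled[cutoff:]


# Hypothetical settings: modulus 7, half of the data for training, seed 42.
examples = make_mod_multiply_examples(modulus=7)
train_examples, validation_examples = split_examples(
    examples, train_fraction=0.5, seed=42
)
```

Fixing the seed makes the split reproducible across runs, and varying `train_fraction` changes how much of the full multiplication table the model sees during training.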
New Contributors
- @ben300694 made their first contribution in #1
Full Changelog: https://github.com/ben300694/grokking-private/commits/v0.2.0