Changes from 153 commits
Commits
164 commits
cd9d9aa
Add torchao installation to conversion
2015aroras Jul 8, 2025
ed0aeaf
Merge pull request #131 from allenai/shanea/olmo3-test
2015aroras Jul 15, 2025
41926be
Remove github cli from decon setup
no0p Jul 15, 2025
9e425bc
Merge pull request #136 from allenai/no-gh-cli-in-decon-setup
no0p Jul 15, 2025
cf3f4e6
placeholder swarm config
mayeechen Jul 16, 2025
fffb214
add new task config
davidheineman Jul 17, 2025
d8076b6
fix name
davidheineman Jul 18, 2025
02f14ca
move n32
davidheineman Jul 18, 2025
c7f7159
update table rendering logic
davidheineman Jul 18, 2025
031ab6e
dont need that col
davidheineman Jul 18, 2025
41fcfde
add back sorted
davidheineman Jul 18, 2025
1c8ce80
Merge pull request #140 from allenai/base-v1
davidheineman Jul 18, 2025
15c6c04
add swarm code (standard dirichlet, no staged/conditional swarming yet)
mayeechen Jul 18, 2025
3bab8ee
disable some styled adapt subtasks
davidheineman Jul 20, 2025
400de08
update basic swarm config
mayeechen Jul 20, 2025
1d72e90
add back in dirichlet sampling for topic x source, set prior = [1, ..…
mayeechen Jul 20, 2025
ea2a262
typo
mayeechen Jul 20, 2025
8676d14
change experimentconfig to ignore extras
mayeechen Jul 20, 2025
6da5583
allow for passing in sources to train.py (after being computed from s…
mayeechen Jul 20, 2025
bad1c6d
once mixes are computed, pass them in as an extra argument to train.py
mayeechen Jul 20, 2025
56c8803
cache mixes according to group_uuid
mayeechen Jul 21, 2025
1e76908
merge conflict
mayeechen Jul 21, 2025
18c8713
Merge remote-tracking branch 'origin/olmo3-anneals' into mayeec/annea…
mayeechen Jul 21, 2025
21962f4
Add initial impl of conversion from HF
2015aroras Jul 23, 2025
f04c91e
Fix remote command
2015aroras Jul 23, 2025
e8eefe4
Fix remote command
2015aroras Jul 23, 2025
a916253
More failure logging
2015aroras Jul 23, 2025
4556248
fix indexing errors, add code swarm
mayeechen Jul 23, 2025
ffe75ed
Revert "More failure logging"
2015aroras Jul 23, 2025
28dabbf
More failure logging
2015aroras Jul 23, 2025
1b467cf
Pass along all args
2015aroras Jul 23, 2025
142ab63
Merge pull request #143 from allenai/shanea/convert-from-hf
2015aroras Jul 23, 2025
923470d
fix empty source bug
mayeechen Jul 23, 2025
aa53e81
update swarm size
mayeechen Jul 23, 2025
5c1c4ac
add reasoning swarm
mayeechen Jul 23, 2025
4ebe102
adjust code swarm
mayeechen Jul 23, 2025
4e0b9fa
Adds support for AWS_PROFILES in pmr (#144)
undfined Jul 24, 2025
d737d3d
reddit swarm
mayeechen Jul 25, 2025
777dbc9
merge conflict
mayeechen Jul 25, 2025
b73b1c1
add math swarm
mayeechen Jul 31, 2025
0d95e8b
new v2 configs
davidheineman Aug 1, 2025
0b031d5
add fixed code-swarm
mayeechen Aug 1, 2025
7070c10
5b natural baseline
mayeechen Aug 1, 2025
a126ac8
fix naming
mayeechen Aug 1, 2025
ff13e6f
Merge remote-tracking branch 'origin' into mayeec/anneal-swarm
mayeechen Aug 1, 2025
e1236d4
natural distribution reasoning 5b
mayeechen Aug 1, 2025
3477221
error when oe-eval branch or commit is provided.
soldni Aug 1, 2025
76fc74c
allow for --source to be None in train.py (swarms were breaking compa…
mayeechen Aug 1, 2025
e6aa549
fix --source issue
mayeechen Aug 1, 2025
406ce3c
Merge pull request #147 from allenai/soldni/gantry-error
davidheineman Aug 2, 2025
a7c8e1b
Merge remote-tracking branch 'origin' into mayeec/anneal-swarm
mayeechen Aug 2, 2025
dd60835
fixed code swarm
mayeechen Aug 2, 2025
9beac86
code natural distr 5B
mayeechen Aug 2, 2025
3a418d0
fix source/topic structure
mayeechen Aug 2, 2025
86afe78
Fix target ratios for code natural
mayeechen Aug 3, 2025
21bf5ab
typo in olmo3:dev:7b:code_gen_mini:v2:n32:pass_at_16
mayeechen Aug 3, 2025
012bfbf
10B baselines
mayeechen Aug 3, 2025
0e54beb
priority
mayeechen Aug 3, 2025
db27ab8
fixed math natural baseline
mayeechen Aug 4, 2025
c85dfe2
add proposed reasoning mix
mayeechen Aug 4, 2025
7ebbcd6
save script
mayeechen Aug 5, 2025
6fcedb9
Merge remote-tracking branch 'origin' into mayeec/anneal-swarm
mayeechen Aug 6, 2025
3f52c96
Merge remote-tracking branch 'origin/olmo3-anneals' into mayeec/annea…
mayeechen Aug 6, 2025
8c0e802
round 2 swarm config plus slight fixes to conditional swarm setup
mayeechen Aug 6, 2025
7e7c4ca
rename swarm config
mayeechen Aug 6, 2025
738a9c1
update priority
mayeechen Aug 6, 2025
8fdefe9
Fixes conditional bug when loading ckpt (#149)
undfined Aug 6, 2025
5756cbe
check nonzero source (#150)
soldni Aug 6, 2025
57dfb57
code swarm
mayeechen Aug 8, 2025
1457f45
update priority
mayeechen Aug 8, 2025
20c23fd
Merge remote-tracking branch 'origin/olmo3-anneals' into mayeec/annea…
mayeechen Aug 8, 2025
9c4f142
merge
mayeechen Aug 8, 2025
327a9ae
add code-math swarm
mayeechen Aug 8, 2025
29d0008
Merge remote-tracking branch 'origin/olmo3-anneals' into mayeec/annea…
mayeechen Aug 11, 2025
9f6f730
Update default budget to ai2/oe-base (#154)
tyler-romero Aug 11, 2025
fd46463
small change to return data
davidheineman Aug 11, 2025
d7eb316
add code math with meta reasoning microanl config with format fixed
Aug 11, 2025
db1d7d5
revert "add code math with meta reasoning microanl config with format…
Aug 11, 2025
9d31464
ignoring workspace (#157)
soldni Aug 12, 2025
4c74ac4
add gen+mc vs code+math sweep
mayeechen Aug 12, 2025
5939e03
Add RULER eval suite (#142)
drschwenk Aug 13, 2025
cead2aa
add balanced gen+mc/code+math mix
mayeechen Aug 13, 2025
ced7a4b
fix RULER task names (#159)
soldni Aug 13, 2025
cada852
Improve APIs for Datalake add/remove from dashboard, cleanup code for…
soldni Aug 14, 2025
e3f3b2f
100% gen/mc mix
mayeechen Aug 14, 2025
4b5a26b
add fixed 50/50 config
mayeechen Aug 14, 2025
d85b77f
adjustments and minor fixes for more flexibly specifying swarms
mayeechen Aug 14, 2025
98db804
flat gen/mc/code/math swarm
mayeechen Aug 15, 2025
d79bae0
v2 of reasoning swarm
mayeechen Aug 17, 2025
d37c6ab
save bash scripts
mayeechen Aug 18, 2025
cbc6540
part 2 of superswarm
mayeechen Aug 23, 2025
ff71fe6
natural and round 5 5b
mayeechen Aug 26, 2025
4f03e20
add 5B proposed mixes
mayeechen Aug 26, 2025
5b59333
fix 0 weight paths
mayeechen Aug 26, 2025
63602b9
add cost ablation exps
mayeechen Aug 27, 2025
33117e5
update repetition factor for cost ablations
mayeechen Aug 27, 2025
c8ba751
fix naming
mayeechen Aug 27, 2025
90a353b
remove confirmation dialogues in Gantry (#163)
soldni Aug 27, 2025
5f623b0
Support OlmoCore eval backend (#148)
tyler-romero Aug 28, 2025
c25d3ce
add proposed flat mixes
mayeechen Aug 31, 2025
8360054
adjust priority
mayeechen Aug 31, 2025
7d223ff
more proposed mixes
mayeechen Aug 31, 2025
3e37e7f
natural distributions
mayeechen Sep 1, 2025
929af78
add patched baselines
mayeechen Sep 2, 2025
fe134ba
fix yaml
mayeechen Sep 2, 2025
9359595
fix removed paths
mayeechen Sep 2, 2025
e93a151
remove 0 paths
mayeechen Sep 2, 2025
f350a28
merge conflict
mayeechen Sep 2, 2025
e3d07ad
merge
mayeechen Sep 2, 2025
a091ef5
Handle HF Token for evals that require it; upgrade Gantry to support …
soldni Sep 4, 2025
f74a724
evil proposed mix
mayeechen Sep 5, 2025
f52c608
add evil s2pdf
mayeechen Sep 5, 2025
3bad686
Try to pin torchao for conversion
mayeechen Sep 6, 2025
3a2a200
add conditional dclm mixes
mayeechen Sep 7, 2025
7a4c311
1B runs for domain removal
mayeechen Sep 7, 2025
8caad94
1b mix that tests partitioning S2PDF
mayeechen Sep 8, 2025
71df7b3
remove empty sources
mayeechen Sep 8, 2025
6ee591c
1b dclm+s2pdf flat (32 runs)
Sep 10, 2025
8d8dcd9
128 sample dclm+s2pdf
Sep 10, 2025
1f6ac1e
Prefer new cluster names (#166)
soldni Sep 10, 2025
a272014
add 1T dclm+stackedu (flat and conditional)
mayeechen Sep 11, 2025
334cc55
adjust repetition factor
mayeechen Sep 11, 2025
c52d67d
s2pdf + dclm partitioned
mayeechen Sep 11, 2025
16e9f52
dclm stackedu 128
mayeechen Sep 12, 2025
cca5304
updated 128 run mixes
mayeechen Sep 12, 2025
a5411db
64 conditional
mayeechen Sep 12, 2025
e777617
flat s2pdf-all-dressed (for translation)
mayeechen Sep 13, 2025
e6bcf63
for 2T adding data one node (build up to independnece result)
mayeechen Sep 13, 2025
53af493
add tree
mayeechen Sep 14, 2025
582bb11
update repetition factor
mayeechen Sep 14, 2025
d7250bd
stackedu evil conditional dclm
mayeechen Sep 15, 2025
799b9b9
add conditional(64, 64) runs with seeds 1 and 2
mayeechen Sep 16, 2025
0ef7068
remove empty domains
mayeechen Sep 16, 2025
97100c4
dclm+s2pdf flat @ 1T
mayeechen Sep 16, 2025
0b8c551
making unique cluster set (#167)
soldni Sep 16, 2025
2f6a1b2
dclm+s2pdf conditional 1T
mayeechen Sep 16, 2025
2ec13c0
updated configs
mayeechen Sep 16, 2025
72a79b4
dclm finemath flat 6T
mayeechen Sep 17, 2025
fef1497
fix rep factor
mayeechen Sep 17, 2025
fbbf1c5
fix repetition factor
mayeechen Sep 17, 2025
2b8bf64
s2pdf all dressed
mayeechen Sep 17, 2025
a276b04
s2pdf all dressed naming
mayeechen Sep 17, 2025
b34465a
fix empty domains
mayeechen Sep 18, 2025
f3584ff
Make work_dir/dataset_cache not point to a url on Augusta (#168)
tyler-romero Sep 18, 2025
41713e6
Remove cookbook WSD in favor of OlmoCore WSD (#169)
tyler-romero Sep 18, 2025
0289a7e
dclm stackedu partitioned
mayeechen Sep 19, 2025
b69baca
dclm+finemath conditional
mayeechen Sep 20, 2025
ffd43d9
exact computation proposed mix
mayeechen Sep 21, 2025
acc7d0c
add dclm+s2pdf exact @ 6T
mayeechen Sep 23, 2025
6ed45d3
add dclm+stackedu exact @ 1T
mayeechen Sep 23, 2025
e6acc8f
add dclm+s2pdf exact @ 1T
mayeechen Sep 23, 2025
ac657a8
Remove legacy task configs (#165)
davidheineman Sep 27, 2025
f25a39d
merge conflict
mayeechen Sep 30, 2025
07b8453
Add `olmo3:dev:1b:qa:bpb` (#172)
davidheineman Oct 3, 2025
0311f0a
Add `olmo3:paper` (#173)
davidheineman Oct 13, 2025
21f56c6
dclm conditional stackedu exact
mayeechen Oct 16, 2025
f51e791
cost ablation baseline comparison
mayeechen Oct 22, 2025
16c4458
remove empty domains
mayeechen Oct 22, 2025
2815ac9
cost ablations
mayeechen Oct 23, 2025
1694324
adjust priority
mayeechen Oct 23, 2025
7645b58
add transformation experiments (DCLM + Pes2o -> S2PDF)
mayeechen Oct 28, 2025
ca90e72
Switching to v3 of gantry for oe eval (#174)
soldni Oct 28, 2025
1e65a10
adjust repetition factor
mayeechen Oct 28, 2025
746949c
Merge remote-tracking branch 'origin/main' into mayeec/anneal-swarm
mayeechen Oct 30, 2025
4 changes: 4 additions & 0 deletions .gitignore
@@ -181,3 +181,7 @@ cython_debug/
tmp/
temp/
uv.lock


# ignore vscode workspace settings
*.code-workspace
89 changes: 89 additions & 0 deletions convert.sh
@@ -0,0 +1,89 @@
#!/bin/bash

count_jobs() {
    jobs -r | wc -l
}

: 'for i in $(seq -w 0 15); do
    while [ $(count_jobs) -ge 12 ]; do
        sleep 5
    done

    echo "Starting experiment for index $i"
    {
        exp_id="olmo3_7b-12T-5B-round-2-code-conditional-gen-mcqa-swarm-515eaf2d-$i"
        echo $exp_id

        olmo-cookbook-eval convert "/oe-data-default/ai2-llm/checkpoints/mayeec/$exp_id/step2385" \
            -t olmo-core-v2 \
            --use-beaker \
            --olmo-core-v2-commit-hash 57a04d0b69047d797c96eede056a211e75b5914a \
            --huggingface-transformers-git-url https://github.com/2015aroras/transformers.git \
            --huggingface-transformers-commit-hash ae3889ced6ed7362e5883671fc6dc4cb4fece5fa \
            --beaker-allow-dirty

    } &
done

wait
echo "All experiments completed."
'

: 'for i in $(seq -w 0 47); do
    while [ $(count_jobs) -ge 12 ]; do
        sleep 5
    done

    echo "Starting experiment for index $i"
    {
        exp_id="olmo3_7b-12T-5B-round-2-code-math-conditional-gen-mcqa-swarm-a3e06472-$i"

        echo $exp_id

        olmo-cookbook-eval convert "/oe-data-default/ai2-llm/checkpoints/mayeec/$exp_id/step2385" \
            -t olmo-core-v2 \
            --use-beaker \
            --olmo-core-v2-commit-hash 57a04d0b69047d797c96eede056a211e75b5914a \
            --huggingface-transformers-git-url https://github.com/2015aroras/transformers.git \
            --huggingface-transformers-commit-hash ae3889ced6ed7362e5883671fc6dc4cb4fece5fa \
            --beaker-allow-dirty

    } &
done

wait
echo "All experiments completed."
'



experiments=(
olmo3_7b-12T-5B-round-3-sweep-gen-mc-code-math-30B-a0160e7d
olmo3_7b-12T-5B-round-3-sweep-gen-mc-code-math-35B-37f49e0b
olmo3_7b-12T-5B-round-3-sweep-gen-mc-code-math-40B-3b0ea5e3
olmo3_7b-12T-5B-round-3-sweep-gen-mc-code-math-45B-88bc0109
olmo3_7b-12T-5B-round-3-sweep-gen-mc-code-math-50B-de5ea711
olmo3_7b-12T-5B-round-3-sweep-gen-mc-code-math-55B-bb2e09e0
olmo3_7b-12T-5B-round-3-sweep-gen-mc-code-math-60B-931e11dc
)

for exp_id in "${experiments[@]}"; do
    while [ $(count_jobs) -ge 12 ]; do
        sleep 5
    done

    echo "Starting experiment $exp_id"
    {
        olmo-cookbook-eval convert "/oe-data-default/ai2-llm/checkpoints/mayeec/$exp_id/step2385" \
            -t olmo-core-v2 \
            --use-beaker \
            --olmo-core-v2-commit-hash 57a04d0b69047d797c96eede056a211e75b5914a \
            --huggingface-transformers-git-url https://github.com/2015aroras/transformers.git \
            --huggingface-transformers-commit-hash ae3889ced6ed7362e5883671fc6dc4cb4fece5fa \
            --beaker-allow-dirty
        echo "Completed experiment for $exp_id"
    } &
done

wait
echo "All experiments completed."
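
Note on the throttling pattern in convert.sh: the script caps concurrency by polling `jobs -r` before each launch. The sketch below isolates that pattern so it can be tried without Beaker access. It is only an illustration under stated assumptions: `run_one`, `MAX_JOBS`, and `out_dir` are hypothetical names that do not appear in the PR, and `run_one` stands in for the real `olmo-cookbook-eval convert` call.

```shell
#!/bin/bash
# Sketch only: cap the number of concurrent background jobs by polling.
# MAX_JOBS, run_one and out_dir are hypothetical names for illustration.

MAX_JOBS=3
out_dir=$(mktemp -d)

run_one() {
    # Placeholder for the real per-experiment command.
    sleep 0.1
    echo "done $1"
}

throttle() {
    # Block while the number of running background jobs is at the cap.
    while [ "$(jobs -r | wc -l)" -ge "$MAX_JOBS" ]; do
        sleep 0.05
    done
}

for exp_id in a b c d e; do
    throttle
    echo "Starting experiment $exp_id"
    # Each job writes to its own log so outputs do not interleave.
    { run_one "$exp_id" > "$out_dir/$exp_id.log"; } &
done

# Wait for every remaining background job before reporting.
wait
echo "All experiments completed."
```

Writing one log file per experiment is a design choice of this sketch, not of convert.sh, which prints job output straight to the terminal.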