Commit 3d72f34

Fixing prepare_olmocrmix
1 parent c93ac4a commit 3d72f34

1 file changed: +2 -2 lines changed

olmocr/train/prepare_olmocrmix.py

Lines changed: 2 additions & 2 deletions
@@ -59,9 +59,9 @@ def prepare_olmocr_mix(dataset_path: str, subset: str, split: str, destination:
         parquet_files = [dest_path / "hugging_face" / "train-s2pdf.parquet"]
     elif subset == "00_documents" and split == "eval_s2pdf":
         parquet_files = [dest_path / "hugging_face" / "eval-s2pdf.parquet"]
-    elif subset == "01_books" and split == "train_s2pdf":
+    elif subset == "01_books" and split == "train_iabooks":
         parquet_files = [dest_path / "hugging_face" / "train-iabooks.parquet"]
-    elif subset == "01_books" and split == "train_s2pdf":
+    elif subset == "01_books" and split == "eval_iabooks":
         parquet_files = [dest_path / "hugging_face" / "eval-iabooks.parquet"]
     else:
         raise NotImplementedError()
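
The bug fixed here came from two copy-pasted `elif` conditions that both tested `split == "train_s2pdf"`, which left the second branch unreachable. As a rough sketch only, and not what the repository does, the same mapping could be held in a lookup table so that each (subset, split) pair appears exactly once. The names `PARQUET_BY_SUBSET_SPLIT` and `resolve_parquet_files` are hypothetical, and the first key is an assumption because its condition lies above the hunk shown.

```python
from pathlib import Path
from typing import Dict, List, Tuple

# Hypothetical table; file names are copied from the diff above.
# The ("00_documents", "train_s2pdf") key is assumed from the naming pattern,
# since that branch's condition is outside the hunk.
PARQUET_BY_SUBSET_SPLIT: Dict[Tuple[str, str], str] = {
    ("00_documents", "train_s2pdf"): "train-s2pdf.parquet",
    ("00_documents", "eval_s2pdf"): "eval-s2pdf.parquet",
    ("01_books", "train_iabooks"): "train-iabooks.parquet",
    ("01_books", "eval_iabooks"): "eval-iabooks.parquet",
}


def resolve_parquet_files(dest_path: Path, subset: str, split: str) -> List[Path]:
    """Return the parquet file list for a (subset, split) pair."""
    try:
        file_name = PARQUET_BY_SUBSET_SPLIT[(subset, split)]
    except KeyError:
        # Mirrors the final `else` branch of the original chain.
        raise NotImplementedError(
            f"unsupported subset/split: {subset!r}/{split!r}"
        ) from None
    return [dest_path / "hugging_face" / file_name]
```

With a table like this, a mismatched pair such as `("01_books", "train_s2pdf")` raises `NotImplementedError` instead of silently selecting the wrong file.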
