Commit 6791719

Fix notebook links (#35)
* Add assets
* Update notebook
* Rename 2048 notebook
* Remove autoreload
* Allow package building
* Update name
* Publish new version
* Downgrade litellm
* Update README.md
* Revert python version change
* Update uv.lock
* Add contributing doc
* Remove section header
* Publish
* Update header
* Add launch post pill
* Add credits
* Add tags
* Update links
* Remove bottom divider
* Adjust position
1 parent 3361fe4 commit 6791719

1 file changed, +21 -8 lines changed

examples/2048/2048.ipynb

Lines changed: 21 additions & 8 deletions
@@ -18,8 +18,8 @@
  "This notebook shows how to train a Qwen 2.5 7B model to play 2048. It will demonstrate how to set up a multi-turn agent, how to train it, and how to evaluate it.\n",
  "\n",
  "Completions will be logged to OpenPipe, and metrics will be logged to Weights & Biases.\n",
- "\n",
- "You will learn how to construct an [agentic environment](#Agentic-Environment), how to define a [rollout](#Defining-a-Rollout), and how to run a [training loop](#Training-Loop)."
+ "\n ",
+ "You will learn how to construct an [agentic environment](#Environment), how to define a [rollout](#Rollout), and how to run a [training loop](#Loop)."
  ]
  },
  {
@@ -84,9 +84,14 @@
  },
  {
  "cell_type": "markdown",
- "metadata": {},
+ "metadata": {
+   "tags": [
+     "environment"
+   ]
+ },
  "source": [
  "### Agentic Environment\n",
+ "<a name=\"Environment\"></a>\n",
  "\n",
  "ART allows your agent to learn by interacting with its environment. In this example, we'll create an environment in which the agent can play 2048.\n",
  "\n",
@@ -313,9 +318,14 @@
  },
  {
  "cell_type": "markdown",
- "metadata": {},
+ "metadata": {
+   "tags": [
+     "rollout"
+   ]
+ },
  "source": [
  "### Defining a Rollout\n",
+ "<a name=\"Rollout\"></a>\n",
  "\n",
  "A rollout is a single episode of an agent performing its task. It is generates one or more trajectories, which are lists of messages and choices.\n",
  "\n",
@@ -459,8 +469,13 @@
  },
  {
  "cell_type": "markdown",
- "metadata": {},
+ "metadata": {
+   "tags": [
+     "loop"
+   ]
+ },
  "source": [
+ "<a name=\"Loop\"></a>\n",
  "### Training Loop\n",
  "\n",
  "The training loop is where the magic happens. For each of the 500 iterations defined below, the rollout function will be called 18 times in parallel. This means that 18 games will be played at once. Each game will produce a trajectory, which will be used to update the model.\n",
@@ -503,9 +518,7 @@
  "\n",
  "\n",
  "Questions? Join the Discord and ask away! For feature requests or to leave a star, visit our [Github](https://github.com/openpipe/art).\n",
- "</div>\n",
- "\n",
- "<a href=\"https://art.openpipe.ai/\"><img src=\"https://github.com/openpipe/art/raw/notebooks/assets/Header_separator.png\" height=\"5\"></a></a>"
+ "</div>\n"
  ]
  }
  ],
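
The link changes above pair each `(#Name)` target in the intro cell with an explicit `<a name="Name"></a>` anchor. A minimal sketch of how that pairing could be checked is shown below. It is not part of this commit: the helper names `load_notebook` and `find_broken_anchor_links` are hypothetical, and the check assumes links resolve only against explicit `<a name=...>` anchors, ignoring any heading anchors a notebook viewer may generate on its own.

```python
import json
import re

# Matches the target of an in-document markdown link such as [text](#Environment).
LINK_RE = re.compile(r"\]\(#([^)\s]+)\)")
# Matches an explicit HTML anchor such as <a name="Environment"></a>.
ANCHOR_RE = re.compile(r"<a\s+name=\"([^\"]+)\"")


def load_notebook(path: str) -> dict:
    """Load a .ipynb file as a plain dictionary (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def find_broken_anchor_links(notebook: dict) -> list[str]:
    """Return link targets that have no matching <a name=...> anchor."""
    targets: list[str] = []
    anchors: set[str] = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", [])
        # Notebook sources are stored either as a list of lines or one string.
        text = "".join(source) if isinstance(source, list) else source
        targets.extend(LINK_RE.findall(text))
        anchors.update(ANCHOR_RE.findall(text))
    return [t for t in targets if t not in anchors]


# Small in-memory notebook: one link has an anchor, the other does not.
demo = {
    "cells": [
        {
            "cell_type": "markdown",
            "source": ["See the [environment](#Environment) and the [loop](#Loop).\n"],
        },
        {
            "cell_type": "markdown",
            "source": ["### Agentic Environment\n", "<a name=\"Environment\"></a>\n"],
        },
    ]
}

broken = find_broken_anchor_links(demo)
print(broken)  # ['Loop']
```

To run the same check against the file touched here, one could call `find_broken_anchor_links(load_notebook("examples/2048/2048.ipynb"))` from the repository root.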
