[feat] Add benchmark tools #357
Conversation
Thanks for this @reciprocated! This mostly looks good to me, aside from a few suggestions; however, when I tried to run it I got:
File "/mnt/nvme/home/uwu/conda/nemo-113/lib/python3.8/site-packages/wandb/sdk/wandb_require_helpers.py", line 37, in __init__
self._check_if_requirements_met()
File "/mnt/nvme/home/uwu/conda/nemo-113/lib/python3.8/site-packages/wandb/sdk/wandb_require_helpers.py", line 48, in _check_if_requirements_met
raise Exception(
Exception: You must explicitly enable this feature with `wandb.require("report-editing:v0")`
@@ -0,0 +1,45 @@
From 7e7144868f71f437e10a15868b99d2cfcc571f3e Mon Sep 17 00:00:00 2001
maybe we should hide this file away in a subfolder
I plan to delete this altogether right before this pr is merged, since it just patches other branches with a subset of changes this pr introduces
got it, that makes sense
rm -rf ../benchmark_logs && mkdir ../benchmark_logs
CUDA_VISIBLE_DEVICES=0 accelerate launch --num_processes 1 --config_file configs/accelerate/zero2-bf16.yaml --main_process_port 8880 examples/ppo_sentiments.py "$args" > ../benchmark_logs/ppo_sentiments.log 2>&1 &
does it happen to change the timings at all if you run all of them together on the same machine vs one by one?
nothing noticeable: https://wandb.ai/sorry/trlx/reports/Timing-difference--VmlldzozODA4MTA4
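For reference, a minimal sketch of the two setups being compared, assuming a multi-GPU machine and example script names from the repo (an illustration, not the PR's actual benchmark script):

# one by one: each example runs alone on GPU 0
CUDA_VISIBLE_DEVICES=0 accelerate launch --num_processes 1 examples/ppo_sentiments.py > ppo_seq.log 2>&1
CUDA_VISIBLE_DEVICES=0 accelerate launch --num_processes 1 examples/sft_sentiments.py > sft_seq.log 2>&1

# together: each example pinned to its own GPU, launched concurrently
CUDA_VISIBLE_DEVICES=0 accelerate launch --num_processes 1 --main_process_port 8880 examples/ppo_sentiments.py > ppo_par.log 2>&1 &
CUDA_VISIBLE_DEVICES=1 accelerate launch --num_processes 1 --main_process_port 8881 examples/sft_sentiments.py > sft_par.log 2>&1 &
wait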
pip install torch --extra-index-url https://download.pytorch.org/whl/cu117
pip install -e .
args='{"train": {"project_name": "trlx-references", "entity_name": '$entity', "tags": ["'$hash'"]}}'
Maybe worth adding the GPU name to the tags (interconnect would be cool too if there's some easy way to get that)
It's already logged under "System Hardware" in W&B as GPU type: NVIDIA A100-SXM4-80GB, or do you want to see it specifically as a tag?
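If a tag is wanted anyway, a hedged sketch of how the GPU name could be appended next to $hash using nvidia-smi (the gpu_name variable is an illustration, not part of the PR):

gpu_name=$(nvidia-smi --query-gpu=name --format=csv,noheader | head -n 1 | tr ' ' '-')   # e.g. NVIDIA-A100-SXM4-80GB
args='{"train": {"project_name": "trlx-references", "entity_name": '$entity', "tags": ["'$hash'", "'$gpu_name'"]}}'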
scripts/benchmark.sh (Outdated)
python -m venv venv
. venv/bin/activate
pip install torch --extra-index-url https://download.pytorch.org/whl/cu117
is there some way to do this without mutating the environment, e.g. using conda create to make a fresh env? since this can break stuff you had installed that depends on one version of torch. Also it would probably be good to pin a torch version and tag it? Or otherwise save the pip freeze output somewhere in the run's metadata.
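A rough sketch of that suggestion, assuming conda is available; the env name, torch pin, and freeze path are placeholders rather than anything the PR actually does:

conda create -y -n trlx-bench python=3.10                     # fresh env instead of mutating the current one
conda run -n trlx-bench python -m pip install --upgrade pip
conda run -n trlx-bench pip install "torch==2.0.1" --extra-index-url https://download.pytorch.org/whl/cu117   # pinned version
conda run -n trlx-bench pip install -e .
conda run -n trlx-bench pip freeze > ../benchmark_logs/pip-freeze.txt   # save the resolved deps alongside the run's metadata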
It also failed to install for me, with:
Using cached https://files.pythonhosted.org/packages/bc/c3/f068337a370801f372f2f8f6bad74a5c140f6fda3d9de154052708dd3c65/Jinja2-3.1.2-py3-none-any.whl
Collecting triton==2.0.0; platform_system == "Linux" and platform_machine == "x86_64" (from torch)
ERROR: Could not find a version that satisfies the requirement triton==2.0.0; platform_system == "Linux" and platform_machine == "x86_64" (from torch) (from versions: 0.4.1, 0.4.2)
ERROR: No matching distribution found for triton==2.0.0; platform_system == "Linux" and platform_machine == "x86_64" (from torch)
This setup directly reflects the installation instructions from https://github.com/CarperAI/trlx#readme. However, I wouldn't mind pinning every dependency down, both here and there. Also this is odd, because I can install triton==2.0.0 on uname -i == x86_64; moreover, we probably use the same hardware here.
seems to work if i add a python -m pip install pip --upgrade right before installing torch in benchmark.sh; the pip in the created venv was too old.
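A minimal sketch of the resulting order in benchmark.sh with that fix applied (the exact placement in the merged script may differ):

python -m venv venv
. venv/bin/activate
python -m pip install --upgrade pip   # the venv's bundled pip was too old to resolve torch's triton==2.0.0 requirement
pip install torch --extra-index-url https://download.pytorch.org/whl/cu117
pip install -e .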
@@ -0,0 +1,63 @@
#!/bin/bash
maybe add set args to make the script die on error
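For example (the merged script appears to use set -ex; adding -u and pipefail is extra strictness beyond what the PR does):

set -euxo pipefail   # exit on errors and unset variables, echo commands, fail on any error inside a pipeline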
@cat-state you must be using some older wandb version
Thanks, i was able to run it after updating wandb. I think this would be good to merge after adding the pip upgrade and the $dir check. I also still get

wandb.errors.CommError: It appears that you do not have permission to access the requested resource. Please reach out to the project owner to grant you access. If you have the correct permissions, verify that there are no issues with your networking setup. (Error 403: Forbidden)

even though I have a trlx-references project in my wandb.
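For anyone hitting the same report-editing exception, the fix on the reviewer's side was simply updating wandb (no minimum version is pinned in the thread):

pip install --upgrade wandb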
set -ex
dir=`mktemp -d -p .`
cd $dir
trap "rm -rf ../$dir" EXIT
is there any case where $dir could be null/empty string? might be good to add a check, as otherwise you could accidentally rm -rf the containing directory
i don't think there is, but an additional check wouldn't hurt either
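A hedged sketch of what such a guard could look like (the PR itself keeps the trap shown above):

dir=$(mktemp -d -p .) || exit 1
[ -n "$dir" ] || exit 1          # refuse to continue with an empty $dir so the trap can't rm -rf the parent directory
cd "$dir" || exit 1
trap 'rm -rf "../$dir"' EXIT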
branch=main
entity=null
only_hash=false
only_tiny=false
there's no way to set this via references.py
Also would be good to add a link to the trlx-references wandb in the readme
Got it to work after changing my default entity from carperai to myself, thank you!
This PR adds tools to cross-reference example runs from different branches.

As an example, the following command will git clone the add-benchmark-tools branch and run a set of examples on it, storing them under the trlx-references project name under a private username. It will also run the same set of examples on main by default, unless they were already present in the trlx-references project, and make a W&B report comparing the two sets. Example of a report: https://wandb.ai/sorry/trlx-references/reports/add-benchmark-tools-v-main--VmlldzozOTAwMTQy

Presence of examples' runs is determined by a tags config field which contains a hash of all files (excluding markdown). If runs are made on top of some PR, then after the PR is merged into main the total hash of files will not change (unlike the git commit hash), so the runs won't have to be remade.

By default runs are referenced against CarperAI/trlx's main, but optionally they can be referenced against any branch.
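As a hedged illustration of the idea, a content hash over non-markdown files could be computed along these lines (the PR may derive it differently, e.g. only over git-tracked files):

hash=$(find . -type f ! -name '*.md' ! -path './.git/*' -print0 | sort -z | xargs -0 sha256sum | sha256sum | cut -d ' ' -f 1)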