Update link to "Learning to Summarize from Human Feedback" (#241)

jon-tow · web-flow · commit a02f806eb172 · 2023-01-31T14:12:40.000-05:00
diff --git a/examples/summarize_rlhf/README.md b/examples/summarize_rlhf/README.md
@@ -1,7 +1,7 @@
 ## Learning to summarize from Human Feedback using `trlx`
 
 This example shows how to use `trlx` to train a summarization model using human feedback
-following the fine-tuning procedures described in Stiennon et al.'s, "[Learning to Summarize from human feedback](https://arxiv.org/abs/2106.00987)".
+following the fine-tuning procedures described in Stiennon et al.'s, "[Learning to Summarize from human feedback](https://arxiv.org/abs/2009.01325)".
 
 
 Before running everything, we need some extra packages not included in the `trlx` dependency list. Specifically, we need HuggingFace's [`evaluate`](https://huggingface.co/docs/evaluate/index) package and Google's re-implementation of ROUGE, [`rouge-score`](https://github.com/google-research/google-research/tree/master/rouge). To install them, run `requirements.txt` in this example's root directory: