mnoukhov

Pinned

  1. async_rlhf Public

    Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models

    Python 64 10

  2. elastic-reset Public

    Code and Experiments for "Language Model Alignment with Elastic Reset" (NeurIPS 2023)

    Python 5

  3. vwxyzjn/summarize_from_feedback_details Public

    Python 152 20

  4. emergent-compete Public

    Code for Emergent Communication under Competition (AAMAS 2021)

    Jupyter Notebook 10 2

  5. huggingface/trl Public

    Train transformer language models with reinforcement learning.

    Python 16k 2.2k

  6. lecture-notes Public

    LaTeX lecture notes for CS/ML courses at the University of Waterloo and Université de Montréal

    TeX 12 8