Releases: minimaxir/gpt-2-simple
v0.8.1: TensorFlow 2 support
Thanks to https://github.com/YaleDHLab via #275, gpt-2-simple now supports TensorFlow 2 by default, and the minimum TensorFlow version is now 2.5.1! The Colab Notebook has also been update to no longer use TensorFlow 1.X.
Note: Development on gpt-2-simple has mostly been superseded by aitextgen, which has similar AI text generation capabilities with more efficient training time and resource usage. If you do not require TensorFlow, I recommend using aitextgen instead. Checkpoints trained with gpt-2-simple can also be loaded in aitextgen.
Fix model URL
Remove finetuning asserts
Some users have successfully finetuned the 774M and 1558M models, so the assert has been removed.
Multi-GPU support + TF 2.0 assert
Handle 774M (large)
- 774M is explicitly blocked from being finetuned and will trigger an assert if attempted. If a way to finetune it without being super-painful is added, the ability to finetune it will be restored.
- Allow generating text from the default pretrained models by passing `model_name` to `gpt2.load_gpt2()` and `gpt2.generate()` (this will work with 774M); see the sketch after this list.
- Add `sgd` as an `optimizer` parameter to `finetune` (default: `adam`).
- Support for the changed model names, with the changes made more prominent in the README.
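A minimal sketch of generating from a pretrained model by name (the prefix text is illustrative):

```python
import gpt_2_simple as gpt2

# Download the pretrained 774M model and generate from it directly,
# without any finetuning.
gpt2.download_gpt2(model_name="774M")

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, model_name="774M")
gpt2.generate(sess, model_name="774M", prefix="The meaning of life is")
```

The new `optimizer` parameter is passed the same way during training, e.g. `gpt2.finetune(sess, "shakespeare.txt", model_name="124M", optimizer="sgd")` in a fresh session (the file name is illustrative).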
Polish before TF 2.0
Remove assertion
The assertion was triggering false positives, so it has been removed.
Prevent OOB + Cap Gen Length
Minor fix to prevent an issue encountered with gpt-2-cloud-run.
A goal of this release was to allow resetting the graph without resetting the model parameters; that did not seem to work, so that feature is being held back for now.
Fixed prefix + miscellaneous bug fixes
Merged PRs, including a fix for the prefix issue (see the commits for more info).
A bunch of highly-requested features
Adapted a few functions from Neil Shepperd's fork:
- Nucleus Sampling (`top_p`) when generating text, which produces surprisingly different output (setting `top_p=0.9` works well). Supersedes `top_k` when used. (#51)
- An `encode_dataset()` function to pre-encode and compress a large dataset before loading it for finetuning. (#19, #54; both are sketched below)
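A sketch of both additions, assuming a plaintext training file named `dataset.txt`; the `out_path` argument name and the step count are illustrative and may differ slightly from the library's defaults:

```python
import gpt_2_simple as gpt2

# Pre-encode and compress a large plaintext dataset so it does not need
# to be re-tokenized on every finetuning run.
gpt2.encode_dataset("dataset.txt", out_path="dataset_encoded.npz")

sess = gpt2.start_tf_sess()
gpt2.finetune(sess, "dataset_encoded.npz", model_name="124M", steps=500)

# Nucleus sampling: sample only from the smallest set of tokens whose
# cumulative probability exceeds top_p; this supersedes top_k when set.
gpt2.generate(sess, top_p=0.9)
```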
Improvements to continuing model training:
- `overwrite` argument for `finetune`: with `restore_from="latest"`, this continues model training without creating a duplicate copy of the model, and is therefore good for transfer learning using multiple datasets (#20; example below).
- You can continue to `finetune` a model without having the original GPT-2 model present.
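For example, to continue training an existing run on a second dataset without duplicating the checkpoint folder (a sketch; `dataset2.txt` and the step count are placeholders):

```python
import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()

# Restore the latest checkpoint of the existing run and overwrite it in
# place rather than creating a duplicate copy of the model.
gpt2.finetune(sess,
              "dataset2.txt",
              model_name="124M",
              restore_from="latest",
              overwrite=True,
              steps=500)
```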
Improvements with I/O involving Colaboratory
- Checkpoint folders are now packaged into a `.tar` file when copying to Google Drive, and when copying from Google Drive, the `.tar` file is automatically unpackaged into the correct checkpoint format. (You can pass `copy_folder=True` to the `copy_checkpoint` function to revert to the old behavior.) (#37: thanks @woctezuma!)
- `copy_checkpoint_to_gdrive` and `copy_checkpoint_from_gdrive` now take a `run_name` argument instead of a `checkpoint_folder` argument (see the sketch below).
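In Colaboratory, the round trip to Google Drive is now keyed by run name (a sketch assuming the default run name `run1`):

```python
import gpt_2_simple as gpt2

# Mount Google Drive inside the Colaboratory notebook.
gpt2.mount_gdrive()

# Package checkpoint/run1 into a .tar file and copy it to Google Drive.
gpt2.copy_checkpoint_to_gdrive(run_name="run1")

# In a later session: copy the .tar file back from Google Drive and
# unpack it into checkpoint/run1.
gpt2.copy_checkpoint_from_gdrive(run_name="run1")
```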
Miscellaneous
- Added CLI arguments for `top_k`, `top_p`, and `overwrite`.
- Cleaned up redundant function parameters (#39)