
gpt-2

Code and samples from the paper "Language Models are Unsupervised Multitask Learners".

For now, we have only released a smaller (117M parameter) version of GPT-2.

See more details in our blog post.

Usage

This repository is meant to be a starting point for researchers and engineers to experiment with GPT-2-117M. While GPT-2-117M is less proficient than GPT-2-1.5B, it is useful for a wide range of research and applications which could also apply to larger models.

Some caveats

  • GPT-2-117M robustness and worst case behaviors are not well-understood. As with any machine-learned model, carefully evaluate GPT-2-117M for your use case, especially if used without fine-tuning or in safety-critical applications where reliability is important.
  • The dataset our GPT-2-117M was trained on contains many texts with biases and factual inaccuracies, and thus GPT-2-117M is likely to be biased and inaccurate as well.
  • To avoid having samples mistaken as human-written, we recommend clearly labeling samples as synthetic before wide dissemination. Our models are often incoherent or inaccurate in subtle ways that take more than a quick read for a human to notice.

Work with us

Please let us know if you're doing interesting research with or working on applications of GPT-2-117M! We're especially interested in hearing from and potentially working with those who are studying:

  • Potential malicious use cases and defenses against them (e.g. the detectability of synthetic text)
  • The extent of problematic content (e.g. bias) being baked into the models and effective mitigations

Installation

Clone this repository and cd into its directory for the remaining commands:

git clone https://github.com/openai/gpt-2.git && cd gpt-2

Then, follow instructions for either native or Docker installation.

Native Installation

All steps can optionally be done in a virtual environment using tools such as virtualenv or conda.
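
For example, one possible setup using Python's built-in venv module (a virtualenv or conda workflow is equivalent):

python3 -m venv venv
source venv/bin/activate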

Install tensorflow 1.12 (with GPU support, if you have a GPU and want everything to run faster)

pip3 install tensorflow==1.12.0

or

pip3 install tensorflow-gpu==1.12.0

Install other python packages:

pip3 install -r requirements.txt

Download the model data

python3 download_model.py 117M

Docker Installation

Build the Dockerfile and tag the created image as gpt-2:

docker build --tag gpt-2 -f Dockerfile.gpu . # or Dockerfile.cpu

Start an interactive bash session from the gpt-2 docker image.

You can opt to use the --runtime=nvidia flag if you have access to an NVIDIA GPU and a valid install of nvidia-docker 2.0.

docker run --runtime=nvidia -it gpt-2 bash
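
Without a GPU, simply omit the flag:

docker run -it gpt-2 bash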

Sampling scripts

WARNING: Samples are unfiltered and may contain offensive content.

Some of the examples below may include Unicode text characters. Set the environment variable:

export PYTHONIOENCODING=UTF-8

to override the standard stream settings in UTF-8 mode.

Unconditional sample generation

To generate unconditional samples from the small model:

python3 src/generate_unconditional_samples.py | tee /tmp/samples

There are various flags for controlling the samples:

python3 src/generate_unconditional_samples.py --top_k 40 --temperature 0.7 | tee /tmp/samples
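
Conceptually, temperature rescales the logits before sampling (lower values make output more conservative), and top_k restricts sampling to the k most likely tokens at each step. The following is a minimal illustrative sketch of that idea in numpy, not the code in src/sample.py:

# Illustrative sketch of temperature and top-k sampling (not the repository's implementation).
import numpy as np

def sample_token(logits, temperature=1.0, top_k=0):
    # Rescale logits: lower temperature sharpens the distribution.
    logits = np.asarray(logits, dtype=np.float64) / temperature
    if top_k > 0:
        # Mask out everything except the top_k highest-scoring tokens.
        cutoff = np.sort(logits)[-top_k]
        logits = np.where(logits < cutoff, -np.inf, logits)
    # Softmax over the (possibly truncated) logits, then sample a token id.
    probs = np.exp(logits - np.max(logits))
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))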

To check flag descriptions, use:

python3 src/generate_unconditional_samples.py -- --help

Conditional sample generation

To give the model custom prompts, you can use:

python3 src/interactive_conditional_samples.py --top_k 40

To check flag descriptions, use:

python3 src/interactive_conditional_samples.py -- --help

GPT-2 samples

WARNING: Samples are unfiltered and may contain offensive content.

While we have not yet released GPT-2 itself, you can see some samples from it in the gpt-2-samples folder. We show unconditional samples with default settings (temperature 1 and no truncation), with temperature 0.7, and with truncation with top_k 40. We show conditional samples, with contexts drawn from WebText's test set, with default settings (temperature 1 and no truncation), with temperature 0.7, and with truncation with top_k 40.

Citation

Please use the following bibtex entry:

@article{radford2019language,
  title={Language Models are Unsupervised Multitask Learners},
  author={Radford, Alec and Wu, Jeff and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya},
  year={2019}
}

Future work

We may release code for evaluating the models on various benchmarks.

We are still considering release of the larger models.

License

MIT
