
gpt-2

Code and samples from the paper "Language Models are Unsupervised Multitask Learners".

For now, we have only released a smaller (117M parameter) version of GPT-2.

See more details in our blog post.

Usage

This repository is meant to be a starting point for researchers and engineers to experiment with GPT-2-117M. While GPT-2-117M is less proficient than GPT-2-1.5B, it is useful for a wide range of research and applications which could also apply to larger models.

Some caveats

  • GPT-2-117M robustness and worst case behaviors are not well-understood. As with any machine-learned model, carefully evaluate GPT-2-117M for your use case, especially if used without fine-tuning or in safety-critical applications where reliability is important.
  • The dataset our GPT-2-117M was trained on contains many texts with biases and factual inaccuracies, and thus GPT-2-117M is likely to be biased and inaccurate as well.
  • To avoid having samples mistaken as human-written, we recommend clearly labeling samples as synthetic before wide dissemination. Our models are often incoherent or inaccurate in subtle ways that take more than a quick read for a human to notice.

Work with us

Please let us know if you're doing interesting research with or working on applications of GPT-2-117M! We're especially interested in hearing from and potentially working with those who are studying:

  • Potential malicious use cases and defenses against them (e.g. the detectability of synthetic text)
  • The extent of problematic content (e.g. bias) being baked into the models and effective mitigations

Installation

Clone this repository and cd into its directory for the remaining commands:

git clone https://github.com/openai/gpt-2.git && cd gpt-2

Then, follow instructions for either native or Docker installation.

Native Installation

All steps can optionally be done in a virtual environment using tools such as virtualenv or conda.
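
For example, one possible setup using Python's built-in venv module (a virtualenv or conda workflow is equivalent):

python3 -m venv venv
source venv/bin/activate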

Install tensorflow 1.12 (with GPU support, if you have a GPU and want everything to run faster)

pip3 install tensorflow==1.12.0

or

pip3 install tensorflow-gpu==1.12.0

Install other python packages:

pip3 install -r requirements.txt

Download the model data

python3 download_model.py 117M

Docker Installation

Build the Dockerfile and tag the created image as gpt-2:

docker build --tag gpt-2 -f Dockerfile.gpu . # or Dockerfile.cpu

Start an interactive bash session from the gpt-2 docker image.

You can opt to use the --runtime=nvidia flag if you have access to an NVIDIA GPU and a valid install of nvidia-docker 2.0.

docker run --runtime=nvidia -it gpt-2 bash
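
Without a GPU, simply omit the flag:

docker run -it gpt-2 bash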

Sampling scripts

WARNING: Samples are unfiltered and may contain offensive content.

Some of the examples below may include Unicode text characters. Set the environment variable:

export PYTHONIOENCODING=UTF-8

to override the standard stream settings in UTF-8 mode.

Unconditional sample generation

To generate unconditional samples from the small model:

python3 src/generate_unconditional_samples.py | tee /tmp/samples

There are various flags for controlling the samples:

python3 src/generate_unconditional_samples.py --top_k 40 --temperature 0.7 | tee /tmp/samples
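
Conceptually, temperature rescales the logits before sampling (lower values make output more conservative), and top_k restricts sampling to the k most likely tokens at each step. The following is a minimal illustrative sketch of that idea in numpy, not the code in src/sample.py:

# Illustrative sketch of temperature and top-k sampling (not the repository's implementation).
import numpy as np

def sample_token(logits, temperature=1.0, top_k=0):
    # Rescale logits: lower temperature sharpens the distribution.
    logits = np.asarray(logits, dtype=np.float64) / temperature
    if top_k > 0:
        # Mask out everything except the top_k highest-scoring tokens.
        cutoff = np.sort(logits)[-top_k]
        logits = np.where(logits < cutoff, -np.inf, logits)
    # Softmax over the (possibly truncated) logits, then sample a token id.
    probs = np.exp(logits - np.max(logits))
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))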

To check flag descriptions, use:

python3 src/generate_unconditional_samples.py -- --help

Conditional sample generation

To give the model custom prompts, you can use:

python3 src/interactive_conditional_samples.py --top_k 40

To check flag descriptions, use:

python3 src/interactive_conditional_samples.py -- --help

GPT-2 samples

WARNING: Samples are unfiltered and may contain offensive content.

While we have not yet released GPT-2 itself, you can see some samples from it in the gpt-2-samples folder. We show unconditional samples with default settings (temperature 1 and no truncation), with temperature 0.7, and with truncation with top_k 40. We show conditional samples, with contexts drawn from WebText's test set, with default settings (temperature 1 and no truncation), with temperature 0.7, and with truncation with top_k 40.

Citation

Please use the following bibtex entry:

@article{radford2019language,
  title={Language Models are Unsupervised Multitask Learners},
  author={Radford, Alec and Wu, Jeff and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya},
  year={2019}
}

Future work

We may release code for evaluating the models on various benchmarks.

We are still considering release of the larger models.

License

MIT
