**Status:** Archive (code is provided as-is, no updates expected)
# gpt-2
Code from the paper ["Language Models are Unsupervised Multitask Learners"](https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf).
We have currently released small (117M parameter) and medium (345M parameter) versions of GPT-2. While we have not released the larger models, we have [released a dataset](https://github.com/openai/gpt-2-output-dataset) for researchers to study their behaviors.
See more details in our [blog post](https://blog.openai.com/better-language-models/).
## Usage
This repository is meant to be a starting point for researchers and engineers to experiment with GPT-2.
### Some caveats
- GPT-2 models' robustness and worst-case behaviors are not well understood. As with any machine-learned model, carefully evaluate GPT-2 for your use case, especially if used without fine-tuning or in safety-critical applications where reliability is important.
- The dataset our GPT-2 models were trained on contains many texts with [biases](https://twitter.com/TomerUllman/status/1101485289720242177) and factual inaccuracies, and thus GPT-2 models are likely to be biased and inaccurate as well.
- To avoid having samples mistaken as human-written, we recommend clearly labeling samples as synthetic before wide dissemination. Our models are often incoherent or inaccurate in subtle ways, which takes more than a quick read for a human to notice.
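
As one possible way to follow the labeling recommendation above, the sketch below prepends a visible notice to each generated sample before it is shared. The function `label_synthetic` and the notice wording are hypothetical and are not part of this repository; adapt them to your own pipeline.

```python
# Hypothetical helper for illustration only; not part of this repository.
# The notice text is an assumption, choose wording that suits your audience.
SYNTHETIC_NOTICE = "[Synthetic text generated by GPT-2]"


def label_synthetic(sample: str, notice: str = SYNTHETIC_NOTICE) -> str:
    """Prepend a visible notice to a generated sample unless it is already labeled."""
    text = sample.strip()
    if text.startswith(notice):
        # Already labeled: return unchanged so repeated calls do not stack notices.
        return text
    return f"{notice}\n{text}"


if __name__ == "__main__":
    print(label_synthetic("The quick brown fox wrote a poem."))
```

Putting the notice at the start of the text, rather than in separate metadata, keeps the label attached when the sample is copied or excerpted from the top.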
### Work with us
Please [let us know](mailto:languagequestions@openai.com) if you're doing interesting research with or working on applications of GPT-2! We're especially interested in hearing from and potentially working with those who are studying:
- Potential malicious use cases and defenses against them (e.g. the detectability of synthetic text)
- The extent of problematic content (e.g. bias) being baked into the models and effective mitigations
## Development
See [DEVELOPERS.md](./DEVELOPERS.md)
## Contributors
See [CONTRIBUTORS.md](./CONTRIBUTORS.md)
## Citation
Please use the following bibtex entry:
```
@article{radford2019language,
  title={Language Models are Unsupervised Multitask Learners},
  author={Radford, Alec and Wu, Jeff and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya},
  year={2019}
}
```
## Future work
We may release code for evaluating the models on various benchmarks.
We are still considering release of the larger models.
## License
[MIT](./LICENSE)