gpt-2
Code and samples from the paper "Language Models are Unsupervised Multitask Learners".
For now, we have only released a smaller (117M parameter) version of GPT-2.
See more details in our blog post.
Installation
Download the model data (needs gsutil):
sh download_model.sh 117M
Install Python packages:
pip3 install -r requirements.txt
Unconditional sample generation
WARNING: Samples are unfiltered and may contain offensive content.
To generate unconditional samples from the small model:
python3 src/generate_unconditional_samples.py | tee samples
There are various flags for controlling the samples:
python3 src/generate_unconditional_samples.py --top_k 40 --temperature 0.7 | tee samples
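The two flags above control how the next token is drawn at each step: --top_k restricts sampling to the k most likely tokens, and --temperature rescales the logits before sampling (values below 1 make the output more conservative). As a rough illustration of what those flags do (a sketch only; names and shapes are illustrative, and the repo's actual sampler is implemented in TensorFlow):

```python
# Sketch of top-k truncation and temperature scaling over a logits vector.
import numpy as np

def sample_next_token(logits, temperature=0.7, top_k=40):
    logits = np.asarray(logits, dtype=np.float64) / temperature  # temperature < 1 sharpens the distribution
    if top_k > 0:
        cutoff = np.sort(logits)[-top_k]                      # k-th largest logit
        logits = np.where(logits < cutoff, -np.inf, logits)   # drop everything below it
    probs = np.exp(logits - logits.max())                     # softmax over the surviving tokens
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))
```

The flags only change how each next token is drawn from the model's output distribution; they do not change the model itself.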
While we have not yet released GPT-2 itself, you can see some unconditional samples from it (with default settings of temperature 1 and no truncation) in gpt2-samples.txt.
Conditional sample generation
To give the model custom prompts, you can use:
python3 src/interactive_conditional_samples.py
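Conceptually, the interactive script BPE-encodes your prompt into token ids, uses them as the model's starting context, and then samples one token at a time, appending each choice back onto the context. A minimal sketch of that loop under those assumptions (`dummy_logits` and the helper names below are placeholders, not the repo's API):

```python
# Conceptual sketch of conditional (prompted) generation. `dummy_logits` stands
# in for the real TensorFlow model; only the loop structure is meant to match.
import numpy as np

VOCAB_SIZE = 50257  # size of GPT-2's BPE vocabulary

def dummy_logits(context):
    # Placeholder: a real run would feed `context` through the model.
    return np.random.randn(VOCAB_SIZE)

def generate(prompt_tokens, length=50):
    context = list(prompt_tokens)  # the encoded prompt seeds the context
    for _ in range(length):
        logits = dummy_logits(context)
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        context.append(int(np.random.choice(VOCAB_SIZE, p=probs)))  # sample and extend the context
    return context  # decode with the BPE encoder to get text back
```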
Future work
We may release code for evaluating the models on various benchmarks.
We are still considering release of the larger models.