Reproducing result: WMT German-English BLEU score is less than half of the expected score #341

Thanks for sharing this great work!

Although I tried to follow the instructions in the README strictly, I am unable to reproduce the WMT German-English benchmark results on `newstest2015`.

Here are my details:
* Python 3.6.2, TensorFlow 1.5.1
* I used the provided `nmt/scripts/wmt16_en_de.sh` to download and pre-process the data files.
* I patched `nmt/standard_hparams/wmt16.json` by adding the two entries `"num_encoder_layers": 4, "num_decoder_layers": 4,` to avoid the problem described in #264 and #265.
* I used the following pre-trained models:
    * http://download.tensorflow.org/models/nmt/deen_model_1.zip
    * http://download.tensorflow.org/models/nmt/deen_model_2.zip
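
For anyone who wants to repeat the hparams patch described above, here is a minimal sketch in Python. The helper name `patch_hparams` is hypothetical (it is not part of the `nmt` package), and the sketch assumes the hparams file is a plain JSON object; I originally made the edit by hand.

```python
import json
from pathlib import Path


def patch_hparams(path, num_layers=4):
    """Add explicit encoder/decoder layer counts to an hparams JSON file.

    Hypothetical helper, not part of nmt: it only mirrors the manual edit
    of adding "num_encoder_layers" and "num_decoder_layers".
    """
    hparams_file = Path(path)
    hparams = json.loads(hparams_file.read_text())
    # setdefault leaves a key untouched if the file already defines it.
    hparams.setdefault("num_encoder_layers", num_layers)
    hparams.setdefault("num_decoder_layers", num_layers)
    hparams_file.write_text(json.dumps(hparams, indent=2, sort_keys=True) + "\n")
    return hparams
```

Calling `patch_hparams("nmt/standard_hparams/wmt16.json")` from the repository root should then produce the same two entries as the manual edit.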

I got the following inference results for `newstest2015`:
* deen_model_1: actual BLEU 11.7, expected BLEU 27.6 (inference command: `python -m nmt.nmt --src=de --tgt=en --ckpt=deen_model_1/translate.ckpt --hparams_path=nmt/standard_hparams/wmt16.json --out_dir=deen_model_1_output --vocab_prefix=wmt16/vocab.bpe.32000 --inference_input_file=wmt16/newstest2015.tok.bpe.32000.de --inference_output_file=deen_model_1_output/output_infer --inference_ref_file=wmt16/newstest2015.tok.bpe.32000.en`)
* deen_model_2: actual BLEU 11.8, expected BLEU 28.9 (inference command: `python -m nmt.nmt --src=de --tgt=en --ckpt=deen_model_2/translate.ckpt --hparams_path=nmt/standard_hparams/wmt16.json --out_dir=deen_model_2_output --vocab_prefix=wmt16/vocab.bpe.32000 --inference_input_file=wmt16/newstest2015.tok.bpe.32000.de --inference_output_file=deen_model_2_output/output_infer --inference_ref_file=wmt16/newstest2015.tok.bpe.32000.en`)
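
To make the size of the gap concrete, the short check below divides each score I measured by the expected score quoted above. The dictionary layout and the function name `bleu_ratio` are just for illustration; the numbers are the ones listed in this report.

```python
# (actual BLEU, expected BLEU) per model, as listed in this report.
results = {
    "deen_model_1": (11.7, 27.6),
    "deen_model_2": (11.8, 28.9),
}


def bleu_ratio(actual, expected):
    """Return the measured score as a fraction of the expected score."""
    return actual / expected


ratios = {name: bleu_ratio(a, e) for name, (a, e) in results.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1%} of expected")
# deen_model_1: 42.4% of expected
# deen_model_2: 40.8% of expected
```

Both models land below 50% of the expected score.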

Could you please give me a hint as to what I am doing wrong?

Thank you!
