Description
Hi, I find your code very useful and would like to reproduce your results, but I have run into a problem and hope you can help. When I start training, it fails on the first step of epoch 1 with the following error:
`Initialize the graph with random parameters.
bucket 0: (10, 23) (3463)
bucket 1: (14, 23) (3396)
bucket 2: (28, 23) (2954)
Epoch 1
0%| | 0/4000 [00:00<?, ?it/s]2020-04-02 09:59:41.480441: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.485770: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.498355: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.502850: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.507510: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.510871: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.514412: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.517675: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.520942: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.523802: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.527851: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.530873: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.534375: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.538748: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.543058: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.546131: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.549882: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.553088: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.556465: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.559658: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.563791: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.566951: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.570187: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.573230: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.576664: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.580121: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.583457: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.586287: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.589678: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.592683: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.596938: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.599845: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.604331: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.607419: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.610917: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.613959: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.617808: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.620678: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.624038: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.627022: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.630438: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.634057: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.638341: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.642178: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.645559: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-02 09:59:41.648711: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Traceback (most recent call last):
File "/home/cc/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/home/cc/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/home/cc/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node token_decoder_decoder_rnn_2/Attention_0/Conv2D}}]]
[[add_5/_849]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node token_decoder_decoder_rnn_2/Attention_0/Conv2D}}]]
0 successful operations.
0 derived errors ignored.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/cc/anaconda3/envs/a/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/cc/anaconda3/envs/a/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/cc/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/translate.py", line 378, in <module>
tf.compat.v1.app.run()
File "/home/cc/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/home/cc/anaconda3/envs/a/lib/python3.7/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/home/cc/anaconda3/envs/a/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "/home/cc/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/translate.py", line 353, in main
train(train_set, dataset)
File "/home/cc/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/translate.py", line 95, in train
sess, formatted_example, bucket_id, forward_only=False)
File "/home/cc/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/framework.py", line 631, in step
outputs = session.run(output_feed, input_feed)
File "/home/cc/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/home/cc/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/home/cc/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/home/cc/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node token_decoder_decoder_rnn_2/Attention_0/Conv2D (defined at /anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1751) ]]
[[add_5/_849]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node token_decoder_decoder_rnn_2/Attention_0/Conv2D (defined at /anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1751) ]]
0 successful operations.
0 derived errors ignored.
Original stack trace for 'token_decoder_decoder_rnn_2/Attention_0/Conv2D':
File "/anaconda3/envs/a/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/anaconda3/envs/a/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/translate.py", line 378, in <module>
tf.compat.v1.app.run()
File "/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/anaconda3/envs/a/lib/python3.7/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/anaconda3/envs/a/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/translate.py", line 353, in main
train(train_set, dataset)
File "/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/translate.py", line 64, in train
model = define_model(sess, forward_only=False, buckets=train_set.buckets)
File "/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/translate.py", line 53, in define_model
FLAGS, session, Seq2SeqModel, buckets, forward_only)
File "/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/graph_utils.py", line 142, in define_model
model = model_constructor(params, buckets)
File "/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/seq2seq/seq2seq_model.py", line 28, in __init__
super(Seq2SeqModel, self).__init__(hyperparams, buckets)
File "/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/framework.py", line 71, in __init__
self.define_graph()
File "/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/framework.py", line 140, in define_graph
encoder_copy_inputs=self.encoder_copy_inputs[:bucket[0]]
File "/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/framework.py", line 256, in encode_decode
encoder_copy_inputs=encoder_copy_inputs)
File "/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/seq2seq/rnn_decoder.py", line 199, in define_graph
decoder_cell(input_embedding, state)
File "/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/decoder.py", line 240, in __call__
attns, alignments = self.attention(cell_output)
File "/下载/2 shiyan/nl2bash-master/nl2bash-master/encoder_decoder/decoder.py", line 211, in attention
input_tensor=l * tf.tanh(tf.nn.conv2d(input=v, filters=k, strides=[1,1,1,1], padding="SAME")), axis=[2, 3])
File "/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/ops/nn_ops.py", line 1913, in conv2d_v2
name=name)
File "/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/ops/nn_ops.py", line 2010, in conv2d
name=name)
File "/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_nn_ops.py", line 1071, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 793, in _apply_op_helper
op_def=op_def)
File "/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3360, in create_op
attrs, op_def, compute_device)
File "/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3429, in _create_op_internal
op_def=op_def)
File "/anaconda3/envs/a/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1751, in __init__
self._traceback = tf_stack.extract_stack()
0%| | 0/4000 [00:25<?, ?it/s]
Makefile:41: recipe for target 'train' failed
make: *** [train] Error 1
`
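
One thing that may be worth trying is sketched below. It assumes the `Could not create cudnn handle` failure comes from TensorFlow reserving all GPU memory at startup, which the log above does not confirm; a CUDA/cuDNN version mismatch or another process holding the GPU would produce the same message. The helper name `enable_gpu_memory_growth` is hypothetical and not part of nl2bash. It only sets TensorFlow's `TF_FORCE_GPU_ALLOW_GROWTH` environment variable, so it has to run before TensorFlow initializes the GPU.

```python
import os


def enable_gpu_memory_growth(env=os.environ):
    """Ask TensorFlow to grow GPU memory on demand instead of reserving it all.

    Hypothetical helper (not from the nl2bash code base). It leaves an
    existing setting untouched and returns the value that is in effect.
    """
    env.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")
    return env["TF_FORCE_GPU_ALLOW_GROWTH"]


# Call this before TensorFlow creates its first session, e.g. at the very
# top of the training entry point.
enable_gpu_memory_growth()
```

If this makes no difference, the installed CUDA and cuDNN versions are probably the next thing to check against the ones this TensorFlow build expects.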