`if clip: np.sign(reward)` cannot change the value of `reward`, because the result of `np.sign` is discarded. It should be `if clip: reward = np.sign(reward)`. Maybe this depends on the NumPy version, but in my NumPy environment `np.sign` does not update its argument in place; it returns a new value unless an `out=` array is passed.
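A minimal sketch to reproduce the difference. The function names `clip_reward_buggy` and `clip_reward_fixed` are made up for illustration and are not from the repository; only `np.sign` and its `out=` argument are real NumPy API.

```python
import numpy as np

def clip_reward_buggy(reward, clip=True):
    # Mirrors the reported line: np.sign returns a new value,
    # which is thrown away here, so reward is left unchanged.
    if clip:
        np.sign(reward)
    return reward

def clip_reward_fixed(reward, clip=True):
    # Assign the result back so the clipped reward is actually used.
    if clip:
        reward = np.sign(reward)
    return reward

buggy = clip_reward_buggy(5.0)  # reward is not clipped
fixed = clip_reward_fixed(5.0)  # reward is clipped to its sign

# For an ndarray, an in-place update only happens when out= is given.
rewards = np.array([-3.0, 0.0, 7.5])
np.sign(rewards, out=rewards)

print(buggy, fixed, rewards)
```

If `reward` is a plain Python float, as it often is for a single environment step, there is no in-place option at all, so the assignment form seems like the safer fix.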