Looking at the implementation in built_ins/gan.py, I realized that two separate optimizer calls are made to update the discriminator: the first update uses the GAN loss and the second the gradient penalty.
This is because LossHandle will overwrite (s1) any existing value for a given network key, which is inconvenient. From s2 and s3 it looks like a more convenient behaviour was intended, but it does not seem to have been implemented yet.
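For context, the current two-step pattern looks roughly like this (a minimal self-contained PyTorch sketch, not the actual code in built_ins/gan.py; `discriminator`, `d_optimizer`, and the WGAN-GP-style `gradient_penalty` are placeholders I made up for illustration):

```python
import torch
import torch.nn as nn

# Placeholder discriminator and data; the real module lives in built_ins/gan.py.
discriminator = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))
d_optimizer = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
real = torch.randn(8, 2)
fake = torch.randn(8, 2)

def gradient_penalty(disc, real, fake):
    # WGAN-GP-style penalty on interpolated samples (illustrative only).
    alpha = torch.rand(real.size(0), 1)
    interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    out = disc(interp)
    grads = torch.autograd.grad(out, interp,
                                grad_outputs=torch.ones_like(out),
                                create_graph=True)[0]
    return ((grads.norm(2, dim=1) - 1) ** 2).mean()

# Update 1: GAN loss only.
d_optimizer.zero_grad()
d_loss = discriminator(fake).mean() - discriminator(real).mean()
d_loss.backward()
d_optimizer.step()

# Update 2: gradient penalty only, as a separate optimizer step,
# because LossHandle overwrites (rather than accumulates) the value
# stored under the discriminator's key.
d_optimizer.zero_grad()
gp = 10.0 * gradient_penalty(discriminator, real, fake)
gp.backward()
d_optimizer.step()
```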
I propose the following; let me know what you think: