Hey
I was wondering if you could shed some light on why you added learnable weights and biases to the sync and identity losses. It looks to me like you are trying to scale and shift the values, but I don't understand why the model doesn't train without them.
Also, if the weights and biases are learnable, what's stopping the weight from going to 0 and making the loss 0?
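To make the second question concrete, here is a minimal sketch of the setup I have in mind. It is only my assumption that the weight and bias are applied to the similarity scores before a softmax cross-entropy; I have not confirmed this is what the repo does. The function name and the similarity values are made up for illustration.

```python
import math

def scaled_softmax_ce(sims, target, w, b):
    """Cross-entropy over softmax(w * sims + b).

    w and b stand in for the learnable scale and shift (assumed placement:
    applied to the similarities, not to the final loss value).
    """
    logits = [w * s + b for s in sims]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

# Made-up cosine similarities; index 0 is the correct match.
sims = [0.9, 0.1, -0.2, 0.3]

loss_scaled = scaled_softmax_ce(sims, 0, w=10.0, b=-5.0)
loss_zero_w = scaled_softmax_ce(sims, 0, w=0.0, b=-5.0)

print(loss_scaled)
print(loss_zero_w)  # equals log(4): all logits are equal, so the softmax is uniform
```

If the setup is like this, then w = 0 gives a loss of log(N) rather than 0, so it would not be a shortcut. If the weight multiplies the loss value directly instead, I would expect w = 0 to make the loss trivially 0, which is what I am confused about.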
I am using the VoxCeleb dataset for training.
Below are the loss curves from training with the weights removed. The sync loss is stagnant while the identity loss is increasing.

