For good performance on sparse data, we shouldn't use any dense vector operations in a single coordinate update. Such an update should cost only O(number of non-zeros in the feature vector returned by the oracle), and that vector can be very sparse.
For example, we should remove the dense temp variables here:
https://github.com/dalab/dissolve-struct/blob/master/dissolve-struct-lib/src/main/scala/ch/ethz/dalab/dissolve/optimization/DBCFWSolverTuned.scala#L961
We can then check how that affects performance on a large sparse binarySVM dataset.
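
To illustrate the idea, here is a minimal sketch of an update that touches only the non-zero entries of the oracle's feature vector and allocates no dense temporary. It does not reproduce the linked solver code: `SparseVec`, `SparseUpdate`, `axpyInPlace` and `dot` are hypothetical names made up for this example, and the weight vector is assumed to be a plain dense `Array[Double]`.

```scala
// Hypothetical minimal sparse vector: parallel arrays of indices and values.
final case class SparseVec(indices: Array[Int], values: Array[Double]) {
  require(indices.length == values.length, "indices and values must have equal length")
  def nnz: Int = indices.length
}

object SparseUpdate {
  // In-place w += scale * x. Loops over the nnz entries of x only,
  // so the cost is O(nnz) and no dense temporary is created.
  def axpyInPlace(scale: Double, x: SparseVec, w: Array[Double]): Unit = {
    var k = 0
    while (k < x.nnz) {
      w(x.indices(k)) += scale * x.values(k)
      k += 1
    }
  }

  // Dot product of sparse x with dense w, also O(nnz).
  def dot(x: SparseVec, w: Array[Double]): Double = {
    var sum = 0.0
    var k = 0
    while (k < x.nnz) {
      sum += x.values(k) * w(x.indices(k))
      k += 1
    }
    sum
  }
}

object SparseUpdateDemo {
  def main(args: Array[String]): Unit = {
    val w = new Array[Double](1000000)                  // dense weights, allocated once
    val psi = SparseVec(Array(3, 42), Array(1.0, -2.0)) // stand-in for the oracle output
    SparseUpdate.axpyInPlace(0.5, psi, w)               // touches 2 entries, not 1,000,000
    println(SparseUpdate.dot(psi, w))
  }
}
```

By contrast, an expression that first builds something like `scale * x` as a dense vector and then adds it to `w` pays O(dimension) per update regardless of how sparse `x` is, which is the cost this issue is about.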