Hello,
I have a few questions regarding the 75.8% synchronization accuracy reported in https://ieeexplore.ieee.org/abstract/document/9067055/.
Perfect Match evaluation protocol: The task is to determine the correct synchronisation within a ±15 frame window, and the synchronisation is determined to be correct if the predicted offset is within 1 video frame of the ground truth. A random prediction would therefore yield 9.7% accuracy.
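For reference, this is how I read the 9.7% chance figure; it is my own sketch, not taken from the paper. I assume a uniformly random prediction over the 31 integer offsets in the ±15 frame window, and a ground-truth offset far enough from the window edge that all three offsets within 1 frame count as correct. The function name `chance_accuracy` and its parameters are made up for illustration.

```python
from fractions import Fraction


def chance_accuracy(window: int = 15, tolerance: int = 1) -> Fraction:
    """Chance-level accuracy for a uniformly random offset guess.

    Assumes integer frame offsets in [-window, +window] and a ground
    truth that is not close enough to the edge to lose any of the
    offsets that count as correct (hypothetical helper, not from the paper).
    """
    candidates = 2 * window + 1   # 31 possible offsets for window = 15
    hits = 2 * tolerance + 1      # 3 offsets lie within 1 frame of the truth
    return Fraction(hits, candidates)


acc = chance_accuracy()
print(acc)                   # 3/31
print(f"{float(acc):.1%}")   # 9.7%
```

If this reading is right, the evaluation is effectively a 31-way decision with a 3-offset tolerance, which is what my questions below are trying to pin down.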
- How does changing M affect the model?
- The training is a 46-way classification. How exactly do you go from the 46-way classification used in training to classification over the ±15 frame window at evaluation?
- Do you have the class-split for your evaluation data? Aren't all the test samples in sync? Where do you get out of sync ground truth frames from?
- The accuracy for N-way classification reported here is 49%. But your numbers are much higher. I'm wondering why there is a large discrepancy in the two numbers.
- The visual stream uses whole face pixels and not just mouth crops. Is that correct?
Thank you!