The Yume1.5 paper states, "we perform random frame sampling at a rate of 1 out of 32." I understand this to mean sampling one frame for every 32 frames along the temporal dimension. However, I haven't found any corresponding implementation in the current training and inference code. Could you clarify how this is specifically implemented? Is it used only during training, or can this statement be safely ignored?
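To make my reading concrete, here is a minimal sketch of what I would expect to find. It assumes the clip is split into consecutive 32-frame windows and one frame index is drawn uniformly at random from each window. The function name `sample_one_in_n` and its parameters are hypothetical and not taken from the Yume1.5 codebase; the paper's wording could also mean something else, such as keeping each frame independently with probability 1/32.

```python
import random


def sample_one_in_n(num_frames, window=32, seed=None):
    """Pick one random frame index from each consecutive `window`-frame span.

    Hypothetical illustration of "random frame sampling at a rate of 1 out
    of 32"; this is my interpretation, not the authors' implementation.
    """
    rng = random.Random(seed)
    indices = []
    for start in range(0, num_frames, window):
        # The last window may be shorter than `window` frames.
        end = min(start + window, num_frames)
        indices.append(rng.randrange(start, end))
    return indices


if __name__ == "__main__":
    # A 96-frame clip yields three indices, one per 32-frame window.
    print(sample_one_in_n(96, window=32, seed=0))
```

If the intended behavior is instead a fixed stride (for example `frames[::32]`), the "random" part of the sentence would not apply, which is part of why I am asking.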