Hi, I recently built an app that sends data to the MIC, processes it there, and returns the results.
I implemented the whole thing with just pthreads to get as much transparency as possible.
Problem is, I'm not sure I'm measuring the offload latency right.
I currently have it built so that it takes 4 timestamps:
1. offload begin (host)
   -- SCIF transfer host -> MIC --
2. remote processing begin (MIC)
   -- actual processing --
3. remote processing end (MIC)
   -- SCIF transfer MIC -> host --
4. offload end (host)
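For reference, the host side looks roughly like this. It's a simplified sketch: the scif_send()/scif_recv() calls and error handling are omitted, OFFLOAD_CLOCK and ts_to_us() are just names I made up here, and CLOCK_MONOTONIC is only a placeholder until I know which clock to actually use.

```c
#include <stdio.h>
#include <time.h>

/* placeholder clock ID -- this is exactly what my question is about */
#define OFFLOAD_CLOCK CLOCK_MONOTONIC

/* convert a timespec to microseconds */
static double ts_to_us(struct timespec ts)
{
    return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
}

int main(void)
{
    struct timespec offload_begin, offload_end;

    clock_gettime(OFFLOAD_CLOCK, &offload_begin);
    /* scif_send(...): payload goes to the MIC                        */
    /* (the MIC side records remote processing begin/end the same way) */
    /* scif_recv(...): results plus the two MIC timestamps come back   */
    clock_gettime(OFFLOAD_CLOCK, &offload_end);

    printf("total offload: %.3f us\n",
           ts_to_us(offload_end) - ts_to_us(offload_begin));
    return 0;
}
```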
Would using clock_gettime() with one of these clock IDs (CLOCK_MONOTONIC_RAW, CLOCK_MONOTONIC, CLOCK_REALTIME)
for every timestamp give me a correct, consistent measurement across host and device with microsecond precision?
If so, which of the three should I use?
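To make it concrete, this is the breakdown I want to compute from those four timestamps (the numbers below are made-up placeholder values; the point is that two of the intervals mix a host timestamp with a MIC timestamp, so they only mean anything if the two clocks are consistent with each other):

```c
#include <stdio.h>

int main(void)
{
    /* made-up example values in microseconds, already converted from timespec */
    double host_begin = 1000.0;   /* offload begin, host clock          */
    double mic_begin  = 1850.0;   /* remote processing begin, MIC clock */
    double mic_end    = 4850.0;   /* remote processing end, MIC clock   */
    double host_end   = 5700.0;   /* offload end, host clock            */

    printf("transfer host->MIC : %.1f us\n", mic_begin - host_begin); /* cross-device */
    printf("processing on MIC  : %.1f us\n", mic_end   - mic_begin);  /* MIC-only     */
    printf("transfer MIC->host : %.1f us\n", host_end  - mic_end);    /* cross-device */
    printf("total offload      : %.1f us\n", host_end  - host_begin); /* host-only    */
    return 0;
}
```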