MIC requires strict 64-byte data alignment in order to use the VPU, but why? I found that SPARC also has such a requirement, yet other multi-core CPUs can handle unaligned data.
Since the compiler can automatically vectorize a for loop over data on MIC (with optimization enabled), what happens if the data is unaligned in that case? Will the auto-vectorization still work? If yes, how?
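
To make the scenario concrete, here is a minimal sketch in C. It assumes a C11 toolchain (for `aligned_alloc`), and the helper names `is_aligned64`, `alloc_floats64`, and `scale` are made up for illustration; they are not part of any MIC API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define VEC_ALIGN 64 /* assumed alignment in bytes, matching a 512-bit vector */

/* Returns 1 if p sits on a 64-byte boundary, 0 otherwise. */
static int is_aligned64(const void *p) {
    return ((uintptr_t)p % VEC_ALIGN) == 0;
}

/* Allocates room for n floats on a 64-byte boundary.
 * C11 aligned_alloc wants the size to be a multiple of the alignment,
 * so the byte count is rounded up. Release the result with free(). */
static float *alloc_floats64(size_t n) {
    size_t bytes = n * sizeof(float);
    size_t rounded = (bytes + VEC_ALIGN - 1) / VEC_ALIGN * VEC_ALIGN;
    if (rounded == 0)
        rounded = VEC_ALIGN;
    return aligned_alloc(VEC_ALIGN, rounded);
}

/* Plain scalar loop: a candidate for compiler auto-vectorization.
 * Nothing here tells the compiler whether dst and src are aligned. */
static void scale(float *dst, const float *src, float k, size_t n) {
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = k * src[i];
}
```

Calling `scale(dst, src, 2.0f, n)` on buffers from `alloc_floats64` is the aligned case. Calling `scale(dst + 1, src + 1, 2.0f, n - 1)` shifts both pointers by 4 bytes, which is the unaligned case I am asking about.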