Unpredictable Mobile Inference Latency Can Significantly Impact Device Performance
Deep neural network inference on mobile devices can exhibit highly variable latency, particularly when CPU resources are contended. A model that runs quickly on one device may run much slower on another, or on the same device under heavier load. Average latency alone is therefore an insufficient target when optimizing models for mobile deployment: run-to-run variance and tail latency also matter, so models should be evaluated across different devices and levels of CPU utilization.
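One way to see why the average is insufficient is to summarize a latency distribution with tail percentiles alongside the mean. The sketch below is a minimal, hypothetical benchmark harness: `run_inference` is a stand-in for a real model call, with an assumed latency distribution that includes occasional contention spikes (not measured data), and `benchmark` reports mean vs. p50/p95/p99 latency.

```python
import random
import statistics
import time

def run_inference():
    # Hypothetical stand-in for a real on-device model call.
    # Mostly fast, with occasional slow runs to mimic CPU
    # contention; this distribution is assumed, not measured.
    base = 0.001
    if random.random() < 0.05:  # occasional contention spike
        base += 0.010
    time.sleep(base)

def benchmark(fn, runs=200):
    """Collect per-run latencies (ms) and summarize mean vs. tail."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000)
    latencies.sort()
    return {
        "mean_ms": statistics.fmean(latencies),
        "p50_ms": latencies[len(latencies) // 2],
        "p95_ms": latencies[int(len(latencies) * 0.95)],
        "p99_ms": latencies[int(len(latencies) * 0.99)],
    }

if __name__ == "__main__":
    random.seed(0)
    for name, value in benchmark(run_inference).items():
        print(f"{name}: {value:.2f}")
```

On a distribution like this, the mean sits well below p99, which is exactly the gap that average-latency reporting hides; repeating the same harness on different devices or under a synthetic background load would surface the cross-condition variability described above.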