While open-source software has made AI accessible to more people, two significant barriers to its widespread use remain: inference latency and cost. System optimizations have come a long way and can substantially reduce latency and cost for DL model in…