Auto embeddings in Manticore are now several times faster.
In this article: what ONNX Runtime changed, why we removed internal batch processing, and how to load data for maximum QPS.
14× faster embeddings: how we rebuilt the ONNX path in Manticore