Float8 (FP8) Quantized LightGlue in TensorRT with NVIDIA Model Optimizer: up to ~6× faster and ~68% smaller engines
FP8 quantization via NVIDIA Model Optimizer shrinks TensorRT engines for SuperPoint + LightGlue and can cut latency versus FP32, with a visible match-quality trade-off.
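A minimal sketch of the workflow the teaser describes, assuming a SuperPoint + LightGlue wrapper and a calibration loader that are not part of this post (`build_matcher()` and `make_calib_loader()` are hypothetical names); the `mtq.quantize` call with `FP8_DEFAULT_CFG` follows the documented NVIDIA Model Optimizer pattern, and the exact export/build flags may differ in your setup:

```python
# Hedged sketch: FP8 post-training quantization with NVIDIA Model Optimizer (modelopt),
# followed by ONNX export for a TensorRT engine build.
import torch
import modelopt.torch.quantization as mtq

model = build_matcher().eval().cuda()    # hypothetical: nn.Module wrapping SuperPoint + LightGlue
calib_loader = make_calib_loader()       # hypothetical: yields representative image-pair tensors

def forward_loop(m):
    # Run representative samples so modelopt can calibrate activation ranges.
    with torch.no_grad():
        for batch in calib_loader:
            m(*[t.cuda() for t in batch])

# Insert FP8 quantize/dequantize nodes and calibrate them.
model = mtq.quantize(model, mtq.FP8_DEFAULT_CFG, forward_loop)

# Export to ONNX with the Q/DQ nodes preserved (FP8 Q/DQ types need opset 19+).
dummy = tuple(t.cuda() for t in next(iter(calib_loader)))
torch.onnx.export(model, dummy, "lightglue_fp8.onnx", opset_version=19)

# Then build the engine with a TensorRT version that supports FP8, e.g.:
#   trtexec --onnx=lightglue_fp8.onnx --fp8 --saveEngine=lightglue_fp8.engine
```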