T13: Speeding up neural network inferencing with TensorRT and Triton Inference Server

This module covers optimizing and accelerating neural network inference with NVIDIA TensorRT and Triton Inference Server. Topics include accelerating inference, reducing latency, and deploying optimized AI models on GPU resources.
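
To give a flavor of the TensorRT workflow the module addresses, the following is a minimal sketch of building a serialized TensorRT engine from an ONNX model using the TensorRT 8.x Python API. The file names `model.onnx` and `model.plan` are hypothetical placeholders, not artifacts from this module.

```python
# Minimal sketch: build a TensorRT engine from an ONNX model (TensorRT 8.x API assumed).
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

# "model.onnx" is a hypothetical input model.
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # reduced precision for faster inference

# Serialize the optimized engine; the .plan file can later be served,
# e.g. by Triton Inference Server's TensorRT backend.
engine_bytes = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(engine_bytes)
```

A common deployment pattern, which this sketch only hints at, is to place the serialized engine into a Triton model repository so the server handles batching, scheduling, and GPU execution.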