AI/ML Engineer: Quantization & Optimization for CNN Models
Mercedes-Benz is seeking a highly skilled AI/ML Engineer to join our team in Bengaluru. This role focuses on the quantization, optimization, and deployment of Convolutional Neural Network (CNN) models for efficient inference on edge and embedded systems. The ideal candidate will have deep expertise in Post-Training Quantization (PTQ), Quantization-Aware Training (QAT), and mixed-precision inference. Practical experience deploying models with frameworks such as Qualcomm SNPE/QNN SDK and NVIDIA TensorRT is essential. You will be instrumental in developing scalable and efficient CNN solutions for applications including object detection, image classification, segmentation, and embedded vision systems, contributing to the future of automotive technology.
Key Responsibilities
- Design, implement, and fine-tune quantization pipelines for CNN architectures such as ResNet, MobileNet, YOLO, and EfficientNet.
- Apply PTQ and QAT techniques to minimize accuracy degradation while optimizing for low-latency and low-power inference.
- Conduct layer-wise sensitivity analysis, activation calibration, and mixed-precision tuning.
- Develop and maintain the toolchain for CNN deployment using frameworks such as PyTorch.
- Perform benchmarking and profiling to evaluate trade-offs between accuracy, latency, and power efficiency.
Required Skills & Experience
- Strong knowledge of CNNs and their optimization workflows.
- Proficiency in PyTorch and/or TensorFlow.
- Familiarity with hardware-aware deployment frameworks such as SNPE/QNN SDK, TensorRT, and OpenVINO.
- Solid programming skills in Python; working knowledge of C++ is preferred.
- Experience with model compression, pruning, and mixed-precision training.
Qualifications
Bachelor's or Master's degree in Computers & Technology or a related field.
