Senior ML Engineer

Mantra Softech

Primary Goal

Reduce the cost and latency of model outputs. This person ensures the model not only works but also runs fast enough for 100 people to chat simultaneously without lag.

Key Responsibilities

  • Inference Acceleration: Implement and tune high-performance backends like vLLM (PagedAttention) or NVIDIA TensorRT-LLM.
  • Model Optimization: Apply quantization (AWQ, GPTQ, or FP8) to fit larger VLMs into smaller GPU footprints without significant accuracy loss.
  • VLM Specifics: Optimize the "Vision Encoder" (e.g., CLIP or SigLIP) to handle high-resolution images efficiently.
  • Benchmarking: Build automated pipelines to measure Tokens Per Second (TPS) and Time to First Token (TTFT).
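
For candidates who want a concrete picture of the benchmarking work, here is a minimal sketch of how TTFT and TPS might be computed from a streamed response. The function name measure_stream, its clock parameter, and the keys of the returned dictionary are illustrative assumptions, not part of our internal tooling, and TPS is taken here as output tokens divided by total wall-clock time, which is only one of several common definitions.

```python
import time
from typing import Callable, Iterable, Optional


def measure_stream(tokens: Iterable[str],
                   clock: Callable[[], float] = time.perf_counter) -> dict:
    """Consume a token stream and report TTFT and overall TPS.

    Hypothetical helper: `tokens` is any iterable that yields tokens as
    the model produces them; `clock` is injectable so the measurement
    can be tested without real delays.
    """
    start = clock()                      # request is considered sent here
    first_token_at: Optional[float] = None
    last_token_at = start
    count = 0

    for _ in tokens:
        now = clock()                    # timestamp taken on token arrival
        if first_token_at is None:
            first_token_at = now         # first token marks the TTFT
        last_token_at = now
        count += 1

    if first_token_at is None:
        # Empty stream: no first token, so TTFT is undefined.
        return {"tokens": 0, "ttft_s": None, "tps": 0.0}

    total = last_token_at - start
    return {
        "tokens": count,
        "ttft_s": first_token_at - start,
        # Assumption: TPS = output tokens / total wall-clock time.
        "tps": count / total if total > 0 else 0.0,
    }
```

In an automated pipeline, a function like this would typically wrap the streaming client of the serving backend and be run across many concurrent requests, with results aggregated into percentiles rather than single values.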

Must-Have Skillset

  • Frameworks: PyTorch, DeepSpeed, vLLM, or Triton Inference Server.
  • Techniques: Quantization, Speculative Decoding, Continuous Batching.
  • Vision Expertise: Experience with Vision Transformers (ViT) and multimodal fusion layers.
  • Coding: Highly proficient in Python and CUDA (C++ is a plus).

Join Our Team

At Mantra, we understand each employee has a unique role and responsibility. Come explore the new opportunities across different locations.

At Mantra Softech, we work hand in hand toward one goal: making customers happy with the best service. Our objectives are clear, and we give our team the best tools to help them achieve that goal. Whatever role they play, we motivate employees to make a difference for our customers, our team, and ourselves.

1. Application Submission

Browse our open positions and submit your CV through our online portal. Ensure your profile highlights your technical expertise and passion for innovation.

2. Profile Shortlisting

Our talent acquisition team carefully reviews every application to match your skills and aspirations with our evolving business needs.

3. Technical Evaluation

Engage in deep technical discussions with our experts and leaders to showcase your problem-solving abilities and domain knowledge.

4. Joining the Tribe

Upon successful evaluation, receive an offer and embark on your journey to secure the digital identity of millions together with us.

How to apply

To apply for this job, you need to log in on our website. If you don't have an account yet, please register.