GenAI Lead Engineer

Persistent Systems

Role Overview

We are looking for a GenAI/LLM Engineer with strong Python engineering skills and proven experience building production-grade Retrieval-Augmented Generation (RAG) systems using LlamaIndex and/or LangChain and integrating vector databases (Pinecone preferred). The role focuses on designing scalable RAG pipelines, implementing advanced retrieval strategies, and building Model Context Protocol (MCP)/tool-calling connectors that expose enterprise APIs as agent tools for read/write operations.

Key Responsibilities

RAG Pipeline Design & Development
Design and develop end-to-end RAG pipelines, including:

Data ingestion
Document parsing and preprocessing
Chunking strategies
Embedding generation
Indexing into Pinecone (preferred)
Retrieval and response generation

Build production-ready semantic retrieval solutions and continuously improve relevance/grounding quality.
Implement and optimize advanced retrieval strategies, including semantic search and retrieval tuning.
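To give a sense of scope, the pipeline stages listed above (chunking, embedding, indexing, retrieval, prompt assembly) can be pictured with the minimal sketch below. It is illustrative only and uses the Python standard library: the token-count `embed` function stands in for a real embedding model, and the hypothetical `InMemoryIndex` class stands in for a vector database such as Pinecone. It does not show LlamaIndex, LangChain, or Pinecone APIs, and all names in it are assumptions made for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=12, overlap=4):
    """Split text into overlapping word windows (a simple fixed-size chunking strategy)."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]


def embed(text):
    """Toy embedding: lowercase token counts. Assumption: a real pipeline calls an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class InMemoryIndex:
    """Hypothetical stand-in for a vector database; keeps every record in a list."""

    def __init__(self):
        self.records = []

    def upsert(self, chunk_id, text):
        self.records.append((chunk_id, text, embed(text)))

    def query(self, question, top_k=2):
        q = embed(question)
        ranked = sorted(self.records, key=lambda r: cosine(q, r[2]), reverse=True)
        return [(cid, text) for cid, text, _ in ranked[:top_k]]


def build_prompt(question, hits):
    """Assemble a grounded prompt from the retrieved chunks."""
    context = "\n".join(f"[{cid}] {text}" for cid, text in hits)
    return f"Answer using only the context below.\n{context}\nQuestion: {question}"


def ingest(index, docs):
    """Chunk each document and index the chunks under ids like 'docid-0'."""
    for doc_id, text in docs.items():
        for n, chunk in enumerate(chunk_text(text)):
            index.upsert(f"{doc_id}-{n}", chunk)


if __name__ == "__main__":
    index = InMemoryIndex()
    ingest(index, {
        "hr": "Employees accrue two days of paid leave every month",
        "it": "Reset the VPN password from the self service portal",
    })
    question = "How do I reset my VPN password?"
    print(build_prompt(question, index.query(question, top_k=1)))
```

In production work, each stand-in would be replaced by the corresponding managed component, and retrieval quality would be measured rather than assumed.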

Agent Tooling & MCP Integrations
Build and integrate MCP connectors to expose internal/external system APIs as agent-callable tools (read/write).
Contribute to agent orchestration patterns including:

Intent routing (e.g., deciding between RAG vs MCP vs workflow)
Tool selection and execution sequencing
Agent reliability patterns (fallbacks, retries, observability)
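The orchestration patterns above can be pictured with a small sketch, shown here under stated assumptions. The keyword-based `route_intent` function is a deliberately naive placeholder for intent routing (a real system might use a classifier or an LLM), and `call_with_retries` illustrates retries with exponential backoff, a fallback, and basic logging for observability. All function names and routing labels are hypothetical.

```python
import logging
import time

logger = logging.getLogger("agent")


def route_intent(message):
    """Naive keyword router: 'tool' for write-style requests, 'rag' for questions, else 'workflow'."""
    text = message.lower()
    if any(word in text for word in ("create", "update", "delete")):
        return "tool"
    if text.rstrip().endswith("?"):
        return "rag"
    return "workflow"


def call_with_retries(fn, attempts=3, base_delay=0.1, fallback=None):
    """Call fn, retrying with exponential backoff; use fallback if every attempt fails."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:  # broad on purpose: this is a generic wrapper
            last_error = exc
            logger.warning("attempt %d of %d failed: %s", attempt + 1, attempts, exc)
            if attempt < attempts - 1:
                time.sleep(base_delay * (2 ** attempt))
    if fallback is not None:
        return fallback(last_error)
    raise last_error


if __name__ == "__main__":
    print(route_intent("Update the ticket status"))
    print(route_intent("What is our leave policy?"))
```

Keeping routing and reliability logic in small, separately testable functions like these makes it easier to swap in a stronger router or a different retry policy later.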

Security, Reliability & Performance
Apply security controls and handle authentication/authorization tokens, ensuring safe access to enterprise systems.
Optimize AI/ML workflows for performance, scalability, and reliability (latency, throughput, cost, robustness).
Ensure seamless deployment and integration across environments in collaboration with platform/DevOps teams.

Cross-functional Collaboration
Work closely with product, backend, data engineering, and platform teams to ensure successful integration and delivery.
Contribute to design discussions, technical documentation, and best practices for GenAI application engineering.

Required Skills & Qualifications

5 to 8 years of experience in software engineering and/or data engineering.
2+ years of hands-on experience building LLM/GenAI applications.
Strong programming expertise in Python.
Proven production experience with LlamaIndex and/or LangChain, especially for RAG systems.
Hands-on experience with vector databases; Pinecone preferred.
Strong understanding of retrieval concepts, embeddings, indexing, and semantic search.

Preferred / Good-to-Have

Knowledge of MCP/tool-calling patterns; FastMCP experience is a strong plus.
Experience with agent frameworks, tool routing, and workflow orchestration.
Familiarity with observability for GenAI apps (logging, tracing, evaluation, prompt versioning).

What Success Looks Like (KPIs/Outcomes)

High-quality RAG pipeline delivering accurate, grounded responses with measurable improvements in relevance.
Reliable MCP connectors enabling safe tool-based automation across enterprise systems.
Reduced latency and improved scalability with robust security and token management.
Smooth integration and deployment through strong collaboration and engineering discipline.
