About the Role:
The AI/ML Engineer will lead the design and development of next-generation AI systems built on modern LLM, RAG, and agentic architectures. This role requires building intelligent, multi-step, tool-using AI agents, optimizing retrieval systems, fine-tuning open-source LLMs, and implementing structured prompting strategies for reliable model behavior. You will work across the full lifecycle of LLM-driven applications, from data preparation and model training to evaluation, optimization, and deployment at production scale.
The ideal candidate brings strong experience in NLP, vector databases, LangGraph or LangChain agent frameworks, and transformer-based architectures. You will design inference APIs, orchestrate multi-agent workflows, and ensure systems meet stringent performance, safety, and reliability criteria. This is a highly technical, innovation-driven role focused on delivering enterprise-grade AI solutions optimized for real-world use cases and scale.
Key Responsibilities:
Core LLM + NLP Development:
• Build RAG pipelines using FAISS, Chroma, or Milvus with optimized chunking and reranking.
• Implement LangGraph for deterministic, multi-step agent workflows.
• Build agentic systems with tool-calling, memory, and multi-agent orchestration.
• Fine-tune and train open-source LLMs using LoRA / QLoRA.
• Work with Llama, Mistral, Qwen, Gemma, Phi, MPT, and Falcon models.
• Develop inference APIs using FastAPI, LangChain, LlamaIndex, or custom stacks.
Prompt Engineering + Model Control:
• Design structured prompts: system roles, instructions, constraints, evaluation loops.
• Tune model behavior using parameters such as:
o Temperature
o Top-k
o Top-p
o Frequency and presence penalties
• Build prompt templates for multi-turn reasoning, safety, and reliability.
• Optimize prompt-based agents (planning, routing, retrieval, tool use).
Evaluation + ML Ops:
• Evaluate hallucination rate, latency, and retrieval quality.
• Implement self-evaluation agents and grader-based pipelines.
• Track experiments with MLflow, W&B, or custom logging.
Required Skills:
• 2–3 years of hands-on experience in NLP, LLMs, or ML systems.
• Strong Python + PyTorch skills.
• Experience with:
o RAG
o LangGraph or LangChain agents
o Vector databases
o HuggingFace models & tokenizers
• Understanding of transformers, embeddings, attention, and text generation fundamentals.
• Experience deploying LLM services in production (Docker, FastAPI, GPUs).
Preferred / Bonus Skills:
• LangGraph advanced workflows (multi-branch, conditional routing, supervised graphs).
• Knowledge of inference optimization (vLLM, TGI, TensorRT-LLM).
• Experience with RLHF, DPO, or alignment methods.
• Ability to debug complex LLM behavior and implement safety constraints.
• Experience building production-grade agent systems (tools, memory, retrievers).
If you would like to apply for this position, send your CV to [email protected]