Lead Data Engineer with AI experience

Jobgether • India

No Relocation

Posted: June 15, 2026

Job Description

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Lead Data Engineer with AI experience based in India. This role sits at the core of modern AI and data transformation initiatives, building the foundational infrastructure that powers next-generation intelligent systems. You will design and operate scalable data pipelines, retrieval systems, and ML/LLMOps frameworks that enable advanced AI applications, including conversational agents, RAG systems, and predictive models. The work spans both classical data engineering and cutting-edge AI infrastructure, requiring strong architectural thinking and hands-on execution. You will collaborate with cross-functional engineering and AI teams to translate reference architectures into production-grade systems that are reliable, scalable, and efficient. Your contributions will directly influence the performance, accuracy, and scalability of AI-driven products used in real-world enterprise environments. The role offers exposure to agentic systems, semantic data layers, and advanced retrieval architectures at scale. It is a highly technical and impact-driven position where engineering excellence and AI innovation intersect.

Accountabilities:
Data Pipeline Engineering: Build, optimize, and maintain robust batch and streaming data pipelines using modern cloud-native tools such as Snowflake, PySpark, Delta Lake, and Kafka, ensuring reliability, scalability, and performance.
RAG & Retrieval Infrastructure: Design and implement end-to-end retrieval systems including embedding pipelines, vector databases, hybrid search, chunking strategies, and ranking mechanisms to optimize AI context relevance.
Semantic & Knowledge Layer Development: Develop ontologies, entity mappings, and knowledge graphs while maintaining semantic contracts, metadata systems, and lineage tracking for AI and ML use cases.
ML/LLMOps Enablement: Support ML and LLM lifecycle workflows including dataset curation, feature engineering, model evaluation, experiment tracking, and production monitoring.
Agentic Data Systems: Build APIs, context stores, and tool interfaces that enable autonomous agents, including observability for reasoning traces, tool calls, and contextual outputs.
Governance & Data Quality: Implement robust data governance frameworks including RBAC, PII handling, schema validation, data quality monitoring, and compliance-ready audit logging systems.

Requirements

This role requires a highly experienced data engineering professional with strong cloud, distributed systems, and AI infrastructure expertise. The ideal candidate combines deep technical execution with architectural thinking and hands-on experience building production-grade AI-enabled data systems.
7+ years of experience in data engineering with strong exposure to cloud-based data platforms.
2+ years of experience building production AI/ML or LLM-related data infrastructure at scale.
Strong expertise in Python, SQL, PySpark, Snowflake, Delta Lake, Kafka, and Spark Structured Streaming.
Hands-on experience with vector databases, embedding pipelines, and retrieval systems in production RAG environments.
Solid understanding of MLOps practices including MLflow, CI/CD for ML systems, and automated evaluation frameworks.
Strong knowledge of data governance, security, compliance, and data quality frameworks.
Experience working with cloud ecosystems such as AWS or Azure and containerized environments (Docker, Kubernetes).
Familiarity with AI/LLM tooling such as LangChain, LlamaIndex, OpenAI/Claude/Bedrock APIs, and FastAPI is a plus.
Strong problem-solving mindset with the ability to design scalable systems and operate in fast-moving AI environments.

Benefits

Competitive compensation package aligned with experience and market standards
Remote-friendly or hybrid work flexibility depending on team structure
Opportunity to work on cutting-edge AI, LLM, and agentic systems
Exposure to global engineering teams and enterprise-scale AI transformation projects
Health, insurance, and wellness benefits (as per policy and location)
Learning and development support for advanced AI and data engineering skills
Access to modern cloud-native and AI-first technology stacks
Collaborative, engineering-driven culture focused on innovation and impact

How Jobgether works: We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team. We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.