ML and Search Engineer, RAG and Re-ranking
5+ Years (IR/ML domain)
Information Retrieval, RAG Pipelines, Re-ranking, Vector Databases (pgvector, Pinecone, Qdrant), Python (PyTorch, Hugging Face), Search Evaluation
Full Time
Remote
Job Description
The ML and Search Engineer, RAG and Re-ranking will focus on designing and optimizing retrieval-augmented generation (RAG) pipelines, hybrid search strategies, and re-ranking models. This role requires expertise in information retrieval, vector databases, and ML frameworks to drive measurable improvements in ranking metrics, latency, and overall search quality. The engineer will collaborate with product and data teams to land production-grade improvements supported by rigorous offline and online experiments.
Responsibilities
- Design and optimize RAG pipelines, including chunking, embedding, and retrieval.
- Build and deploy re-ranking models and hybrid search strategies.
- Create evaluation datasets and metrics for offline and online testing.
- Collaborate with product and data teams to drive measurable quality improvements in production.
- Improve NDCG and MRR while meeting defined quality and latency SLOs.
- Develop robust evaluation harnesses and experiment frameworks.
Requirements
- Degree in Computer Science, ML, IR, or related discipline.
- Strong background in information retrieval and machine learning (embeddings, search ranking).
- Hands-on experience with vector databases (pgvector, Pinecone, Qdrant).
- Proficiency with the Python ML stack, including PyTorch and Hugging Face.
- Practical experience tuning retrieval pipelines for quality and latency.
- Experience shipping search or RAG features to production.
- Clear understanding of retrieval metrics (NDCG, MRR) and experiment design.
Nice to Haves
- Experience building re-rankers, hybrid search systems, or evaluation datasets.
- Familiarity with prompt engineering and feedback loops.
- Knowledge of observability-driven model adaptation.
- Experience in multilingual retrieval or domain adaptation.