ML Specialist
Mappa
Software Engineering, Data Science
Mexico City, Mexico
Posted on Sep 13, 2025
We're seeking an experienced LLM/ML Specialist with deep expertise in LLaMA (or other open-source) models and in Retrieval-Augmented Generation (RAG) systems. The ideal candidate will have strong skills in model fine-tuning, prompt engineering, and production deployment of language models.
You'll build and optimize RAG pipelines, implement vector databases, and develop efficient inference solutions. Requirements include 2+ years of LLM experience, Python proficiency with PyTorch/Hugging Face, and demonstrated projects involving LLaMA models.
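To give a concrete sense of the day-to-day work, here is a minimal sketch of the kind of RAG pipeline described above: document chunks are embedded, indexed with FAISS, and the retrieved context is passed to a LLaMA-family model for generation. The model names, documents, and prompt format are illustrative assumptions, not a prescribed stack.

```python
import faiss
from sentence_transformers import SentenceTransformer
from transformers import AutoModelForCausalLM, AutoTokenizer

# Toy document chunks; a real pipeline would chunk and clean source documents.
docs = [
    "RAG systems ground LLM answers in retrieved documents.",
    "Vector databases store embeddings for fast similarity search.",
]

# 1) Embed and index the chunks (cosine similarity via normalized inner product).
encoder = SentenceTransformer("all-MiniLM-L6-v2")   # example embedding model
embeddings = encoder.encode(docs, normalize_embeddings=True).astype("float32")
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(embeddings)

# 2) Retrieve the top-k chunks for a user query.
query = "How does RAG reduce hallucinations?"
q_emb = encoder.encode([query], normalize_embeddings=True).astype("float32")
_, ids = index.search(q_emb, 2)
context = "\n".join(docs[i] for i in ids[0])

# 3) Generate an answer conditioned on the retrieved context.
model_id = "meta-llama/Llama-2-7b-chat-hf"          # any LLaMA-family checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

In production, this role extends such a pipeline with document chunking strategies, metadata filtering, caching, evaluation, and optimized inference serving.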
Technical Skills:
- Deep understanding of the LLaMA architecture and its variants
- Experience fine-tuning and adapting LLaMA models for specific applications (a brief fine-tuning sketch follows this list)
- Proficiency in Python and ML frameworks (especially PyTorch and Hugging Face)
- Knowledge of prompt engineering specific to LLaMA models
- Experience with efficient inference and quantization techniques for LLaMA
- Understanding of model deployment and optimization for large language models
- Expertise in Retrieval-Augmented Generation (RAG) systems and architectures
- Experience implementing vector databases and similarity search techniques
- Knowledge of document chunking and embedding strategies for RAG
- Familiarity with evaluation metrics for RAG systems
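As a concrete illustration of the fine-tuning and quantization skills above, the following is a hedged sketch of QLoRA-style parameter-efficient fine-tuning: a LLaMA checkpoint loaded in 4-bit via bitsandbytes, with LoRA adapters attached through Hugging Face PEFT. The checkpoint name, target modules, and hyperparameters are examples only, not prescribed values.

```python
import torch
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-7b-hf"       # any LLaMA-family base model

# QLoRA: keep the base weights frozen in 4-bit NF4, compute in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

# Attach low-rank adapters to the attention projections.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                   # adapter rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()          # typically well under 1% trainable
```

The adapted model can then be trained with a standard Hugging Face training loop on task-specific data.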
Qualifications:
- Degree in Machine Learning, NLP, Computer Science, or a related field
- 2+ years of experience working with large language models, preferably LLaMA
- Strong mathematical background in statistics and probability
- Demonstrated projects involving LLaMA model adaptation or deployment
- Excellent communication skills to explain complex concepts
- Proven experience building and optimizing RAG pipelines in production environments
Preferred:
- Experience with parameter-efficient fine-tuning (PEFT) methods such as LoRA and QLoRA for LLaMA models
- Understanding of model limitations and ethical considerations
- Experience integrating LLaMA models into production systems
- Familiarity with open-source LLM ecosystems
- Experience with hybrid search methodologies (dense + sparse retrieval; see the sketch after this list)
- Knowledge of context window optimization techniques for RAG systems
- Experience with multi-stage retrieval architectures
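For the hybrid search item above, the sketch below combines dense retrieval (FAISS over sentence-transformer embeddings) with sparse BM25 scores using reciprocal rank fusion. The corpus, embedding model name, and fusion constant are illustrative assumptions.

```python
import numpy as np
import faiss
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

corpus = [
    "LLaMA is a family of open-weight language models.",
    "RAG augments generation with retrieved context.",
    "Vector databases store dense embeddings for similarity search.",
]

# Dense side: cosine-similarity FAISS index over normalized embeddings.
encoder = SentenceTransformer("all-MiniLM-L6-v2")   # example model
dense = encoder.encode(corpus, normalize_embeddings=True).astype("float32")
index = faiss.IndexFlatIP(dense.shape[1])
index.add(dense)

# Sparse side: BM25 over whitespace-tokenized documents.
bm25 = BM25Okapi([doc.lower().split() for doc in corpus])

def hybrid_search(query: str, k: int = 3, rrf_k: int = 60) -> list[int]:
    """Fuse dense and sparse rankings with reciprocal rank fusion (RRF)."""
    q_dense = encoder.encode([query], normalize_embeddings=True).astype("float32")
    _, dense_ids = index.search(q_dense, k)
    sparse_ids = np.argsort(bm25.get_scores(query.lower().split()))[::-1][:k]

    fused: dict[int, float] = {}
    for rank, doc_id in enumerate(dense_ids[0]):
        fused[int(doc_id)] = fused.get(int(doc_id), 0.0) + 1.0 / (rrf_k + rank + 1)
    for rank, doc_id in enumerate(sparse_ids):
        fused[int(doc_id)] = fused.get(int(doc_id), 0.0) + 1.0 / (rrf_k + rank + 1)
    return sorted(fused, key=fused.get, reverse=True)[:k]

print([corpus[i] for i in hybrid_search("open source LLaMA models")])
```

RRF is one simple fusion choice; weighted score combination or a cross-encoder reranking stage are common alternatives in multi-stage retrieval architectures.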
Responsibilities:
- Design and implement LLaMA-based solutions for real-world applications
- Develop, optimize, and deploy RAG systems using vector databases and embedding strategies
- Fine-tune LLaMA models using techniques like LoRA and QLoRA
- Create efficient prompt engineering strategies for specialized use cases
- Implement and optimize inference pipelines for production environments
- Design evaluation frameworks to measure model and RAG system performance (see the metrics sketch after this list)
- Collaborate with engineering teams to integrate LLMs into product infrastructure
- Research and implement the latest advancements in LLM and RAG technologies
- Document technical approaches, model architectures, and system designs
- Mentor junior team members on LLM and NLP best practices
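As an example of the evaluation work referenced above, the snippet below computes two standard retrieval metrics, recall@k and mean reciprocal rank (MRR), over ranked document IDs. The data is a toy example; a full framework would also score generation quality (for instance faithfulness and answer relevance).

```python
def recall_at_k(ranked_ids: list[list[str]], gold_ids: list[str], k: int) -> float:
    """Fraction of queries whose relevant document appears in the top-k results."""
    hits = sum(gold in ranked[:k] for ranked, gold in zip(ranked_ids, gold_ids))
    return hits / len(gold_ids)

def mean_reciprocal_rank(ranked_ids: list[list[str]], gold_ids: list[str]) -> float:
    """Average of 1/rank of the first relevant document per query."""
    total = 0.0
    for ranked, gold in zip(ranked_ids, gold_ids):
        if gold in ranked:
            total += 1.0 / (ranked.index(gold) + 1)
    return total / len(gold_ids)

# Toy example: two queries, each with one known relevant document.
ranked = [["d3", "d1", "d7"], ["d2", "d9", "d4"]]
gold = ["d1", "d4"]
print(recall_at_k(ranked, gold, k=2))        # 0.5
print(mean_reciprocal_rank(ranked, gold))    # (1/2 + 1/3) / 2 ≈ 0.417
```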
What We Offer:
- Opportunity to work on cutting-edge LLM and RAG applications
- Access to computational resources for model training and experimentation
- Collaborative environment with other ML/AI specialists
- Flexible work arrangements with remote options
- Competitive salary and comprehensive benefits package
- Professional development budget for conferences and courses
- Clear career progression path for AI specialists
- Chance to contribute to the open-source LLM ecosystem
- Balanced workload with dedicated research time
- Inclusive culture that values diverse perspectives and innovative thinking