ML Specialist

Mappa

Software Engineering, Data Science
Mexico City, Mexico
Posted on Sep 13, 2025

We're seeking an experienced LLM/ML Specialist with deep expertise in LLaMA or other open-source models and in Retrieval-Augmented Generation (RAG) systems. The ideal candidate will have strong skills in model fine-tuning, prompt engineering, and production deployment of language models.

You'll build and optimize RAG pipelines, implement vector databases, and develop efficient inference solutions. Requirements include 2+ years of LLM experience, Python proficiency with PyTorch/Hugging Face, and demonstrated projects involving LLaMA models.
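To give a concrete sense of the day-to-day work, here is a minimal, illustrative sketch of the retrieval step of a RAG pipeline: fixed-size chunking, embedding with a sentence-transformers model, and cosine-similarity search over an in-memory index. The model name, chunk size, and sample data are placeholder assumptions, not a prescribed stack.

```python
# Illustrative only: a minimal dense-retrieval step for a RAG pipeline.
# Assumes the sentence-transformers package; the model name and chunk size
# are placeholder choices, not a prescribed stack.
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk_text(text: str, chunk_size: int = 400, overlap: int = 50) -> list[str]:
    """Split a document into overlapping fixed-size character chunks."""
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose embedder

documents = [
    "LLaMA is a family of open-weight language models.",
    "Retrieval-Augmented Generation grounds answers in retrieved context.",
]
chunks = [c for doc in documents for c in chunk_text(doc)]

# Embed chunks once and L2-normalize so the dot product equals cosine similarity.
chunk_vecs = np.asarray(embedder.encode(chunks, normalize_embeddings=True))

def retrieve(query: str, k: int = 3) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = np.asarray(embedder.encode([query], normalize_embeddings=True))
    scores = (chunk_vecs @ q.T).ravel()          # cosine similarities
    top = np.argsort(-scores)[:k]
    return [chunks[i] for i in top]

print(retrieve("What does RAG do?"))
```

In production, the in-memory index above would typically be replaced by a vector database, and the retrieved chunks would be injected into a LLaMA prompt before generation.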

Technical Skills:

  • Deep understanding of LLaMA model architecture and its variants
  • Experience fine-tuning and adapting LLaMA models for specific applications
  • Proficiency in Python and ML frameworks (especially PyTorch and Hugging Face)
  • Knowledge of prompt engineering specific to LLaMA models
  • Experience with efficient inference and quantization techniques for LLaMA (illustrated in the sketch after this list)
  • Understanding of model deployment and optimization for large language models
  • Expertise in Retrieval-Augmented Generation (RAG) systems and architectures
  • Experience implementing vector databases and similarity search techniques
  • Knowledge of document chunking and embedding strategies for RAG
  • Familiarity with evaluation metrics for RAG systems
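As a flavor of the efficient-inference and quantization work listed above, the sketch below loads a LLaMA-family checkpoint in 4-bit precision using Hugging Face transformers with bitsandbytes. The checkpoint ID and quantization settings are placeholders for illustration; many LLaMA checkpoints are gated and require access approval.

```python
# Illustrative only: loading a LLaMA-family model in 4-bit for memory-efficient
# inference. Assumes transformers + bitsandbytes; the checkpoint ID is a
# placeholder (gated checkpoints need access approval on the Hub).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # higher-precision matmuls
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers on available devices automatically
)

inputs = tokenizer(
    "Briefly explain retrieval-augmented generation.", return_tensors="pt"
).to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```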

Qualifications:

  • Degree in Machine Learning, NLP, Computer Science, or a related field
  • 2+ years of experience working with large language models, preferably LLaMA
  • Strong mathematical background in statistics and probability
  • Demonstrated projects involving LLaMA model adaptation or deployment
  • Excellent communication skills to explain complex concepts
  • Proven experience building and optimizing RAG pipelines in production environments

Preferred:

  • Experience with parameter-efficient fine-tuning (PEFT) methods such as LoRA and QLoRA for LLaMA models
  • Understanding of model limitations and ethical considerations
  • Experience integrating LLaMA models into production systems
  • Familiarity with open-source LLM ecosystems
  • Experience with hybrid search methodologies combining dense and sparse retrieval (see the sketch after this list)
  • Knowledge of context window optimization techniques for RAG systems
  • Experience with multi-stage retrieval architectures
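As an illustration of the hybrid search methodology mentioned above, here is a minimal sketch that blends sparse BM25 scores with dense embedding scores through a weighted sum. The rank_bm25 package, the embedder, and the 50/50 weighting are assumptions chosen for brevity, not a recommended configuration.

```python
# Illustrative only: hybrid retrieval that blends sparse BM25 scores with dense
# cosine-similarity scores. Assumes the rank_bm25 and sentence-transformers
# packages; the alpha = 0.5 weighting is an arbitrary placeholder.
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

chunks = [
    "LoRA adds low-rank adapter matrices to frozen LLaMA weights.",
    "Vector databases store embeddings for similarity search.",
    "BM25 ranks documents by term frequency and inverse document frequency.",
]

# Sparse index: whitespace tokenization is a deliberate simplification.
bm25 = BM25Okapi([c.lower().split() for c in chunks])

# Dense index: normalized embeddings so dot product equals cosine similarity.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
dense_vecs = np.asarray(embedder.encode(chunks, normalize_embeddings=True))

def hybrid_search(query: str, k: int = 2, alpha: float = 0.5) -> list[str]:
    """Blend min-max-normalized sparse and dense scores with weight alpha."""
    sparse = np.array(bm25.get_scores(query.lower().split()))
    q_vec = np.asarray(embedder.encode([query], normalize_embeddings=True)).ravel()
    dense = dense_vecs @ q_vec

    def norm(x):  # min-max normalize so the two score scales are comparable
        return (x - x.min()) / (x.max() - x.min() + 1e-9)

    blended = alpha * norm(sparse) + (1 - alpha) * norm(dense)
    return [chunks[i] for i in np.argsort(-blended)[:k]]

print(hybrid_search("how does LoRA adapt LLaMA?"))
```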

Responsibilities:

  • Design and implement LLaMA-based solutions for real-world applications
  • Develop, optimize, and deploy RAG systems using vector databases and embedding strategies
  • Fine-tune LLaMA models using techniques like LoRA and QLoRA (see the sketch after this list)
  • Create efficient prompt engineering strategies for specialized use cases
  • Implement and optimize inference pipelines for production environments
  • Design evaluation frameworks to measure model and RAG system performance
  • Collaborate with engineering teams to integrate LLMs into product infrastructure
  • Research and implement the latest advancements in LLM and RAG technologies
  • Document technical approaches, model architectures, and system designs
  • Mentor junior team members on LLM and NLP best practices
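To illustrate the LoRA-style fine-tuning referenced in the responsibilities above, the sketch below attaches low-rank adapters to a causal-LM checkpoint with the Hugging Face peft library. The checkpoint ID, target modules, and hyperparameters are placeholder assumptions; a real run would also set up a dataset and a Trainer.

```python
# Illustrative only: attaching LoRA adapters to a LLaMA-family causal LM with
# the peft library. Checkpoint ID, target modules, and hyperparameters are
# placeholders; training data and a Trainer are omitted for brevity.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder, gated

lora_config = LoraConfig(
    r=8,                                   # rank of the low-rank update matrices
    lora_alpha=16,                         # scaling factor applied to the update
    lora_dropout=0.05,                     # dropout on the adapter inputs
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

For QLoRA, the base model would instead be loaded in 4-bit (as in the earlier quantization sketch) before the adapters are attached.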

What We Offer:

  • Opportunity to work on cutting-edge LLM and RAG applications
  • Access to computational resources for model training and experimentation
  • Collaborative environment with other ML/AI specialists
  • Flexible work arrangements with remote options
  • Competitive salary and comprehensive benefits package
  • Professional development budget for conferences and courses
  • Clear career progression path for AI specialists
  • Chance to contribute to the open-source LLM ecosystem
  • Balanced workload with dedicated research time
  • Inclusive culture that values diverse perspectives and innovative thinking