Lead Data Engineer
Mappa
Software Engineering, Data Science
USD 2,800-6,000 / month
Posted on Aug 1, 2025
Job description
About the Role
We are looking for a Senior or Lead Data Engineer to be the founding member of our new data team. In this high-impact role, you’ll report directly to the CTO and play a critical part in shaping the future of data at our company. You will work across three distinct product lines to standardize and unify data pipelines, processes, and governance. This is a unique opportunity to build from the ground up and influence data architecture and strategy at a multi-product SaaS company.
Responsibilities
- Design, build, and optimize scalable data pipelines using Apache Spark, Kafka, and related tools
- Develop and orchestrate data workflows using Apache Airflow
- Architect and manage infrastructure in AWS, including services like S3, Kinesis, Lambda, EMR, Glue, and Redshift
- Build and support real-time data pipelines and event-driven systems for streaming data ingestion and processing
- Establish standardized data ingestion, transformation, and storage processes across multiple product lines
- Work closely with product and engineering teams to unify and document data models and schemas
- Ensure data quality, lineage, and observability across all pipelines
- Lead data governance, compliance, and security practices from day one
- Mentor future hires and define engineering standards as the data team grows
Qualifications
- 5+ years of experience in data engineering, ideally in SaaS or multi-product environments
- Expertise in AWS services relevant to data infrastructure (S3, Glue, Redshift, EMR, Lambda, Kinesis, etc.)
- Strong proficiency in Python for building pipelines, ETL/ELT jobs, and infrastructure automation
- Deep hands-on experience with Apache Spark, Kafka, and the broader Apache ecosystem
- Proven success designing and supporting real-time data streams and batch workflows
- Solid experience with Airflow or equivalent orchestration tools
- Strong understanding of distributed systems, data architecture, and scalable design patterns
- Passion for clean, maintainable code and building robust, fault-tolerant systems
Nice to Have
- Experience with Delta Lake, Apache Iceberg, or Apache Hudi
- Familiarity with containerized workloads (Docker, Kubernetes, ECS/EKS)
- Background in building internal data platforms or centralized data lakes
- Experience supporting data science and BI/analytics teams
- Knowledge of data privacy, compliance, and security practices (e.g., SOC 2, GDPR)
Details
- Payment in USD: 2,800-6,000 / month
- Full time