Lead Data Engineer

Mappa

Software Engineering, Data Science
USD 2,800-6,000 / month
Posted on Aug 1, 2025

Job description

About the Role

We are looking for a Senior or Lead Data Engineer to be the founding member of our new data team. In this high-impact role, you will report directly to the CTO and play a critical part in shaping the future of data at our company. You will work across three distinct product lines to standardize and unify data pipelines, processes, and governance. This is a unique opportunity to build from the ground up and to influence data architecture and strategy at a multi-product SaaS company.

Responsibilities

  • Design, build, and optimize scalable data pipelines using Apache Spark, Kafka, and related tools
  • Develop and orchestrate data workflows using Apache Airflow
  • Architect and manage infrastructure in AWS, including services like S3, Kinesis, Lambda, EMR, Glue, and Redshift
  • Build and support real-time data pipelines and event-driven systems for streaming data ingestion and processing
  • Establish standardized data ingestion, transformation, and storage processes across multiple product lines
  • Work closely with product and engineering teams to unify and document data models and schemas
  • Ensure data quality, lineage, and observability across all pipelines
  • Lead data governance, compliance, and security practices from day one
  • Mentor future hires and define engineering standards as the data team grows

Qualifications

  • 5+ years of experience in data engineering, ideally in SaaS or multi-product environments
  • Expertise in AWS services relevant to data infrastructure (S3, Glue, Redshift, EMR, Lambda, Kinesis, etc.)
  • Strong proficiency in Python for building pipelines, ETL/ELT jobs, and infrastructure automation
  • Deep hands-on experience with Apache Spark, Kafka, and the broader Apache ecosystem
  • Proven success designing and supporting real-time data streams and batch workflows
  • Solid experience with Airflow or equivalent orchestration tools
  • Strong understanding of distributed systems, data architecture, and scalable design patterns
  • Passion for clean, maintainable code and building robust, fault-tolerant systems

Nice to Have

  • Experience with Delta Lake, Apache Iceberg, or Apache Hudi
  • Familiarity with containerized workloads (Docker, Kubernetes, ECS/EKS)
  • Background in building internal data platforms or centralized data lakes
  • Experience supporting data science and BI/analytics teams
  • Knowledge of data privacy, compliance, and security practices (e.g., SOC 2, GDPR)

Details

  • Payment in USD (2,800-6,000 per month)
  • Full time