
Staff Data Engineer
| Company | GoFundMe |
| --- | --- |
| Location | San Francisco, CA, USA |
| Salary | $181,000 – $271,000 |
| Type | Full-Time |
| Degrees | |
| Experience Level | Senior, Expert or higher |
Requirements
- 6+ years of experience in data engineering, with a strong focus on data ingestion, ETL/ELT pipeline design, and large-scale data processing.
- Proven experience in designing and managing data ingestion frameworks for structured and unstructured data.
- Expertise in data observability and monitoring tools (Monte Carlo, Databand, Bigeye, or similar).
- Strong experience with batch and real-time data ingestion (Kafka, Kinesis, Spark Streaming, or equivalent).
- Proficiency in orchestration tools like Apache Airflow, Prefect, or Dagster.
- Strong understanding of data lineage, anomaly detection, and proactive issue resolution in data pipelines.
- Proficiency in SQL and Python for data processing and automation.
- Strong knowledge of API-based data integration and experience working with third-party data sources.
- Hands-on experience with Snowflake and best practices for data warehouse ingestion and management.
- Experience working with data governance, security best practices, and compliance standards.
- Ability to collaborate cross-functionally and communicate technical concepts to non-technical stakeholders.
Responsibilities
- Lead the design, development, and optimization of data ingestion pipelines, ensuring timely, scalable, and reliable data flows into the enterprise data warehouse (Snowflake).
- Define and implement best practices for data ingestion, transformation, governance, and observability, ensuring consistency, data quality, and compliance across multiple data sources.
- Develop and maintain data ingestion frameworks that support batch, streaming, and event-driven data pipelines.
- Implement and maintain data observability tools to monitor pipeline health, track data lineage, and detect anomalies before they impact downstream users.
- Design and enforce automated data quality checks, validation rules, and anomaly detection to ensure teams can rely on high-integrity data.
- Own and optimize ETL/ELT orchestration (Airflow, Prefect) and ensure efficient, cost-effective data processing.
- Proactively support the health and growth of data infrastructure, ensuring it’s secure and adaptable to future needs.
- Partner with data engineering, software engineering, and platform teams to integrate data from transactional systems, streaming services, and third-party APIs.
- Provide technical mentorship to other engineers on data observability best practices, monitoring strategies, and pipeline reliability.
- Stay curious: research and advocate for new technologies that enhance data accessibility, freshness, and impact.
Preferred Qualifications
- Experience with event tracking, behavioral analytics, and CDP data pipelines (Google Analytics, Heap, Segment, RudderStack, etc.).
- Hands-on experience with dbt for data transformation.
- Understanding of data science and machine learning pipelines and how ingestion supports these workflows.