Senior Software Engineer - Open Source Analytics
| Company | Snowflake |
| --- | --- |
| Location | Menlo Park, CA, USA; Bellevue, WA, USA |
| Salary | $195,000 – $287,500 |
| Type | Full-Time |
| Degrees | |
| Experience Level | Senior |
Requirements
- 5+ years of experience designing and building scalable, distributed systems.
- Strong programming skills in Java, Scala, or C++ with an emphasis on performance and reliability.
- Deep understanding of distributed transaction processing, concurrency control, and high-performance query engines.
- Experience with open-source data lake formats (e.g., Apache Iceberg, Parquet, Delta) and the challenges associated with multi-engine interoperability.
- Experience building cloud-native services and working with public cloud providers like AWS, Azure, or GCP.
- A passion for open-source software and community engagement, particularly in the data ecosystem.
- Familiarity with data governance, security, and access control models in distributed data systems.
Responsibilities
- Pioneer new technical capabilities in the Open Source Analytics community.
- Design and implement features and enhancements for Apache Iceberg and Apache Polaris, focusing on scalability, performance, and usability: Iceberg DML/DDL transactions, schema evolution, partitioning, time travel, and more.
- Collaborate with the open-source community by contributing code, participating in discussions, and reviewing pull requests to ensure high-quality contributions.
- Architect and build systems that integrate open-source technologies seamlessly with Snowflake, enabling our customers to build and deploy massive data lake architectures across platforms and cloud providers.
- Collaborate with Snowflake’s open-source team and the Apache Iceberg community to contribute new features and enhance the Iceberg table format and REST specification.
- Work on core data access control and governance features for Apache Polaris.
- Contribute to our managed Polaris service, Snowflake Open Catalog, enabling customers to seamlessly manage and expand their data lake through Snowflake as well as external query engines like Spark and Trino.
- Build tooling and services that automate data lake table maintenance, including compaction, clustering, and data retention for enhanced query performance and efficiency.
Preferred Qualifications
- Experience contributing to open-source projects, especially in the data infrastructure space.
- Experience designing or implementing REST APIs, particularly in the context of distributed systems.
- Experience managing large-scale data lakes or data catalogs in production environments.
- Experience working on high-performance, scalable query engines such as Spark, Flink, or Trino.