
Senior Software Engineer – Polaris & Data Lake Catalog
| Company | Snowflake |
| --- | --- |
| Location | Bellevue, WA, USA |
| Salary | $195,000 – $276,000 |
| Type | Full-Time |
| Degrees | |
| Experience Level | Senior, Expert or higher |
Requirements
- 8+ years of experience designing and building scalable, distributed systems.
- Strong programming skills in Java, Scala, or C++ with an emphasis on performance and reliability.
- Deep understanding of distributed transaction processing, concurrency control, and high-performance query engines.
- Experience with open-source data lake formats (e.g., Apache Iceberg, Parquet, Delta) and the challenges associated with multi-engine interoperability.
- Experience building cloud-native services and working with public cloud providers like AWS, Azure, or GCP.
- A passion for open-source software and community engagement, particularly in the data ecosystem.
- Familiarity with data governance, security, and access control models in distributed data systems.
Responsibilities
- Design and implement scalable, distributed systems that support Iceberg DML/DDL transactions, schema evolution, partitioning, time travel, and more.
- Architect and build systems that integrate Snowflake queries with external Iceberg catalogs (e.g., AWS Glue, Databricks Unity) and various data lake architectures, enabling seamless interoperability across cloud providers.
- Develop high-performance, low-latency solutions for catalog federation, allowing customers to manage and query their data lake assets across multiple catalogs from a single interface.
- Collaborate with Snowflake’s open-source team and the Apache Iceberg community to contribute new features and enhance the Iceberg REST specification.
- Work on core data access control and governance features for Polaris, including fine-grained permissions such as row-level security, column masking, and multi-cloud federated access control.
- Contribute to our managed Polaris service, ensuring that external query engines like Spark and Trino can read from and write to Iceberg tables through Polaris in a way that is decoupled from Snowflake’s core data platform (see the configuration sketch after this list).
- Build tooling and services that automate data lake table maintenance, including compaction, clustering, and data retention for enhanced query performance and efficiency.
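To make the engine interoperability described above concrete, here is a minimal sketch, in Scala, of an external Spark job reading an Iceberg table through a REST catalog such as Polaris. The catalog alias `polaris`, the endpoint URL, the warehouse name, the credential, and the table `polaris.analytics.events` are hypothetical placeholders, and the Iceberg Spark runtime jar is assumed to be on the classpath.

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch: an external Spark engine reading an Iceberg table through a
// REST catalog such as Polaris. Endpoint, warehouse, credential, and table
// names are placeholders, not real values.
object RestCatalogReadExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("iceberg-rest-catalog-example")
      .master("local[*]") // local mode for illustration only
      // Enable Iceberg SQL extensions (MERGE, time travel syntax, etc.).
      .config("spark.sql.extensions",
        "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
      // Register an Iceberg catalog named "polaris" backed by the REST protocol.
      .config("spark.sql.catalog.polaris", "org.apache.iceberg.spark.SparkCatalog")
      .config("spark.sql.catalog.polaris.type", "rest")
      .config("spark.sql.catalog.polaris.uri", "https://<account>.example.com/api/catalog") // placeholder endpoint
      .config("spark.sql.catalog.polaris.warehouse", "my_catalog")                          // placeholder catalog name
      .config("spark.sql.catalog.polaris.credential", "<client-id>:<client-secret>")        // placeholder credential
      .getOrCreate()

    // Query an Iceberg table via the REST catalog; the engine talks only to the
    // catalog endpoint and the underlying table files, not to Snowflake's query engine.
    spark.sql("SELECT * FROM polaris.analytics.events LIMIT 10").show()

    spark.stop()
  }
}
```

The point of the sketch is that the external engine needs only standard Iceberg REST catalog properties; nothing in the configuration is specific to Snowflake’s data platform.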
Preferred Qualifications
- Contributing to open-source projects, especially in the data infrastructure space.
- Designing or implementing REST APIs, particularly in the context of distributed systems.
- Managing large-scale data lakes or data catalogs in production environments.
- Working on high-performance, scalable query engines such as Spark, Flink, or Trino.