Senior Software Engineer – Polaris & Data Lake Catalog

Company: Snowflake
Location: Bellevue, WA, USA
Salary: $195,000 – $276,000
Type: Full-Time
Experience Level: Senior, Expert or higher

Requirements

  • 8+ years of experience designing and building scalable, distributed systems.
  • Strong programming skills in Java, Scala, or C++ with an emphasis on performance and reliability.
  • Deep understanding of distributed transaction processing, concurrency control, and high-performance query engines.
  • Experience with open-source data lake formats (e.g., Apache Iceberg, Parquet, Delta) and the challenges associated with multi-engine interoperability.
  • Experience building cloud-native services and working with public cloud providers like AWS, Azure, or GCP.
  • A passion for open-source software and community engagement, particularly in the data ecosystem.
  • Familiarity with data governance, security, and access control models in distributed data systems.

Responsibilities

  • Design and implement scalable, distributed systems to enable support for Iceberg DML/DDL transactions, schema evolution, partitioning, time travel, and more.
  • Architect and build systems that integrate Snowflake queries with external Iceberg catalogs (e.g., AWS Glue, Databricks Unity) and various data lake architectures, enabling seamless interoperability across cloud providers.
  • Develop high-performance, low-latency solutions for catalog federation, allowing customers to manage and query their data lake assets across multiple catalogs from a single interface.
  • Collaborate with Snowflake’s open-source team and the Apache Iceberg community to contribute new features and enhance the Iceberg REST specification.
  • Work on core data access control and governance features for Polaris, including fine-grained permissions such as row-level security, column masking, and multi-cloud federated access control.
  • Contribute to our managed Polaris service, ensuring that external query engines like Spark and Trino can read from and write to Iceberg tables through Polaris in a way that’s decoupled from Snowflake’s core data platform.
  • Build tooling and services that automate data lake table maintenance, including compaction, clustering, and data retention for enhanced query performance and efficiency.

Preferred Qualifications

  • Experience contributing to open-source projects, especially in the data infrastructure space.
  • Experience designing or implementing REST APIs, particularly in the context of distributed systems.
  • Experience managing large-scale data lakes or data catalogs in production environments.
  • Experience working on high-performance, scalable query engines such as Spark, Flink, or Trino.