Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale... management, continuous delivery and deployment and open source cloud enabling technologies like Kubernetes and OpenStack. SRE...
Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale... management, continuous delivery and deployment and open source cloud enabling technologies like Kubernetes and OpenStack. SRE...
The Infra SRE-Infrastructure-Assurance team extends TikTok infrastructure's operability, observability, visibility... may change at any time. Responsibilities: - Perform SRE duties and operations on supported services in production...
The Global E-commerce SRE team of US Tech and Product works with engineering and product teams to build and run large...-scale, globally distributed, observable, fault-tolerant systems. As an SRE, you will deliver on production ownership...
infrastructure and is one of the biggest GCP customers. As a Principal SRE, you'll be at the forefront of building and maintaining... excellence, champion SRE best practices, and work collaboratively to ensure our systems are robust and performant. This includes...
infrastructure and is one of the biggest GCP customers. As a Principal SRE, you'll be at the forefront of building and maintaining... excellence, champion SRE best practices, and work collaboratively to ensure our systems are robust and performant. This includes...
infrastructure and is one of the biggest GCP customers. As a Principal SRE, you'll be at the forefront of building and maintaining... excellence, champion SRE best practices, and work collaboratively to ensure our systems are robust and performant. This includes...
, and general maker skills are a plus. Past experience with SRE or DevOps. Apple is an equal opportunity employer...
with cross functional teams including Product Management, Data, SRE, etc. to design and implement performant, internal...
through guides, codification, brown bags, and Tech Talks. What We're Looking For: 4+ years of professional SRE/DevOps...
teams like QA, SRE, and Operations. Assist in support of the existing code in production environments. Key...
public-facing systems to collaborate with external security teams. This role is a Software Engineering/SRE role tasked... and debuggable code that can be maintained by a large team, ideally in an imperative programming language used by SRE teams. Examples...
for a mission that matters at a company where you matter. Your Impact As a contributor in the APX SRE organization.... You are also obsessed about achieving the high quality and reliability our customers demand. You will work closely not only with the APX SRE...
must be United States citizens. 8+ years of professional experience in observability, SRE, or cloud platform roles...
services. Experience teaching reliability engineering (e.g. SRE) and/or other scale-oriented cloud systems practices to peers...
teams like QA, SRE, and Operations. Assist in support of the existing code in production environments. Key...
, etc.) Exposure to HPC interconnect technologies like InfiniBand or MPI Familiarity with batch workload managers Experience using SRE...
. Proven ability to engage and align cross-functional stakeholders, such as Product Management, SRE, Data Science, and ML...