and dedicated Site Reliability Engineering (SRE) team serving the forefront of the latest science and technology trends on cloud..., management, reliability and availability of this infrastructure spanning thousands of GPU nodes. As a senior SRE, you are responsible...
? If so, we have a great opportunity for you! NVIDIA is seeking a Senior Site Reliability Engineer (SRE) for the Data Science & ML Platform..., a strong background in SRE practices, systems, networking, coding, capacity management, cloud operations, continuous delivery...
of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars.... What You Will Be Doing: Develop and maintain large-scale systems supporting critical use cases for AI Infrastructure, driving reliability...
Job Description: Senior DevOps Engineer - AI/ML Infrastructure. Position Overview: We are seeking an experienced... Senior DevOps Engineer to build and maintain the production infrastructure for our enterprise AI automation platform...