techniques for Physical AI. GPU-based libraries, frameworks, tools, SDKs, and infrastructure for model training and inference... platform. This platform consists of three core pillars: systems for massively parallel AI training in the data center...
learning innovation. In this role, you will architect, scale, and optimize high-performance ML infrastructure used... and infrastructure for training and inference on large-scale, distributed GPU clusters. Develop internal tools and automation for ML...
analysis for AI training/inference applications. Large-Scale System Development & Debugging: Experience developing..., and distributed training functionalities. GPU Performance Analysis & Optimization Acuity: The ability to analyze profiling data...