with expertise in optimizing AI model training and inference, including distributed training/inference and acceleration. The ideal... generative AI models. Responsibilities - Design and optimize large model inference pipelines for low-latency, high-throughput...
with expertise in optimizing AI model training and inference, including distributed training/inference and acceleration. The ideal... creation and consumption on TikTok and serve billions of users. We are seeking an experienced AI model optimization engineer...
, but are not limited to, distillation frameworks, model acceleration, hardware-efficient inference, and their applications... and implementing efficient models for large-scale generative AI, with a particular emphasis on large model distillation and compression...
company. Currently, we are looking for Machine Learning Engineer - Model Serving Infrastructure to join our team to support... and advance that mission. - Responsible for the design and implementation of distributed inference infrastructure for feeds, ads...
to date with and apply cutting-edge techniques in large model optimization and inference acceleration. Qualifications: Minimum Qualifications... our hybrid work model, and the specific requirements may change at any time. We are seeking a Machine Learning Engineer...
paradigms - Deploy and optimize text/multimodal LLMs, including inference acceleration, model alignment during training... for large-scale ML infra and online/offline distributed systems, enabling AI to realize its potential value for billions...
benchmark tools and performance optimization of AI workloads specifically tailored for large-scale LLM training and inference... Hardware Acceleration (e.g., GPU/TPU/RDMA) or ML for Systems, and Distributed Storage. - Experience in AI model development...
and latest trend in inference and training optimization. Hands-on experience in mapping model architecture to low-level software... model architecture, especially SoTA models, distributed inference and deployment at scale is crucial. KEY RESPONSIBILITIES...
performance and optimization team across various frameworks and model architectures. This is a highly visible role with large... trend in inference and training. Experience in mapping model architecture to low-level software, hardware and understanding...