best-in-class observability tools. Ensure data quality and governance by managing metadata like schemas, lineage, and access control..., Kubernetes, and infrastructure-as-code tools like Terraform. Familiar with DevOps best practices, using version control (Git...
Monitor and continuously improve the performance and reliability of our systems, using best-in-class observability tools... technologies like Docker, Kubernetes, and infrastructure-as-code tools like Terraform. Familiar with DevOps best practices, using...
environment (Azure or others). Experience with Terraform or comparable IaC tools. Operational experience with an observability... stack, preferably Prometheus and Grafana. Development experience with CI/CD pipelines (Azure DevOps or others). Development...
, and monitoring compute resources across Slurm and Kubernetes environments. Develop observability, alerting, and auto-healing systems.... Skills / Must Have: 7+ years of experience in SRE, DevOps, or Infrastructure Engineering roles supporting large-scale...
in adopting DevSecOps best practices, automates the DevOps lifecycle, and advances platform architecture using modern tools... (Kubernetes, Terraform, Helm, Python). Champions secure coding, automated security testing, and observability, ensuring compliance...
programming, DevOps automation, and scalable microservices development. Core Responsibilities: System & Infrastructure Software... teams to integrate low-level capabilities into unified software workflows. DevOps & Containerization: Contribute to CI/CD...
to foster a robust engineering ecosystem. You'll champion DevOps culture, streamline development workflows, enhance system... and enhance system observability through effective monitoring, logging, tracing, and alerting strategies. Stay abreast...
, Monitoring and Observability using tools like CloudWatch, Splunk, Datadog, etc. Experience with databases like PostgreSQL, MySQL... extensive, hands-on experience across core AWS services and modern DevOps tooling: Compute & Serverless: AWS Lambda, AWS API...
one another and leveling up as a team. We enjoy a continuous deployment DevOps culture, and take owner-operator pride in supporting our code... addresses issues as they arise. Solves problems with multiple states and execution paths. Observability: Demonstrates a good...
. Observability: Design and maintain monitoring, alerting, and logging systems to provide real-time visibility into model serving... Reliability Engineering, DevOps, or Infrastructure Engineering roles. Strong proficiency in Kubernetes, Docker, and container...
and automated). Observability: LGTM stack (Loki, Grafana, Tempo, Mimir). Deployment: Automated CI/CD pipeline. Collaboration: Git... environment provisioning, and zero manual infrastructure changes. Secondary or Additional Responsibilities 01 | DevOps...
, data lakehouses, and data hubs, along with related capabilities such as ingestion, governance, modeling, and observability... manipulation and optimization. DEVOPS: Demonstrated experience in DevOps practices, including code management, CI/CD...
and proactively seeks new knowledge that will improve the availability, reliability, efficiency, observability, and performance... or infrastructure services; multi-threaded/concurrent programming. Platform/DevOps: exposure to containers, Kubernetes, CI/CD...
excellence through code reviews, telemetry-driven decisions, and modern DevOps practices. Improve test coverage, implement... integration tests, and ensure observability and reliability for live services. Participate in on-call rotations to support...
and cloud, using Docker, OpenShift and AKS; Developing DevOps CI/CD pipeline in GitLab using Maven and Pip; Executing application... performance monitoring, observability, and logging by utilizing and customizing AppDynamics, Splunk, Grafana, Prometheus...
Design, implement and deliver AI services to support product offerings for large-scale agent observability. Collaborate... responsibility for the development lifecycle and production readiness of the services you build and drive the team's DevOps culture...
Management – creates and maintains a central repository for logs to create metrics, observability, and incident root cause... Collaborates with Cloud Engineering, Data Engineering, Data Governance Office, DevOps, Information Security, Infrastructure...
across engineering teams. Collaborate with applications, DevOps, and platform teams to ensure platform tools are reliable, scalable... judgment to make sound technology choices. Champion coding standards, automation, observability, and security practices...
datasets. Exposure to DevOps and engineering hygiene practices such as containerization (Docker), infrastructure-as-code... or analytical requirements into scalable technical solutions, with solid grounding in code quality, reliability, observability...
, robustness, and observability. Lead project development across the organization and work with subject matter experts... services you build and drive the team's DevOps culture. Drive and uphold the best practices of modern software engineering...