resolution. Manage AWS infrastructure (EKS, RDS, S3, EC2 etc.). Partner with other SRE and Cloud engineering functions... to continuously improve the SRE ecosystem by automation, toil reduction, service improvements, observability improvements...
Job Category: KMBL Degree Level: Bachelor's Degree Job Description: Title : Observability Platforms and SRE Engg... and associated delivery platform. The Observability Platforms and SRE team is a group of experts developing, maintaining, scaling...
Commerce Cloud, eProcurement gateways, AWS Cloud infrastructure, Monitoring and Security. This role is essential to our global... measures, and reinforcing a culture of security within the engineering team. Site Reliability Engineering (SRE): You will lead...
: AWS Engineering, Azure Engineering, Containerisation (Kubernetes), and DevOps/CI-CD. You will own strategy, standards...-class customer outcomes. Job Description What You’ll Do Strategic & Technical Leadership Own the multi-cloud (AWS...
, and SRE teams to define observability standards and KPIs. Enable proactive incident detection and root cause analysis through... search,Technology->Cloud Platform->AWS App Development->Cloudwatch,Technology->DevOps->Continuous delivery...
, switches, etc. Apply DevOps and SRE methodologies to improve system reliability, scalability, monitoring, and incident... based automation software languages Excellent understanding and experience in AWS, EKS, Helm, and GitHub Actions. The...
! We are looking for an experienced Senior Site Reliability Engineer to join our Product SRE team Engineering team. Reporting to the Senior Director, Site..., scaling, and infrastructure management · Build scalable portals for SRE dashboards, SLI/SLO/SLA tracking, error budgets...
) with experience in AWS, GCP, Kubernetes, and GitOps to work with our Site Reliability Engineering (SRE) team. The successful candidate... will understand SRE practices and have a track record of implementing high-quality site reliability engineering practices (SLAs, SLOs...
an experienced hands-on Cloud SRE manager to lead high-severity incident and problem management across our GCP-centric platforms... with regional and global SRE counterparts with special attention to the below Incident Analysis & Problem Management: Implement...
to cloud (AWS/GCP) involving automation of multiple deployments using Terraform/CloudFormation for IaC. You'll participate... and performance, to work in the DevOps/SRE team Evaluate cloud services and architecture to identify strengths and weaknesses...
where problem is solved Additional Comments: Mandatory Skills: Site Reliability Engineer, AWS, DevOps, automation, Prometheus..., monitoring, framework, design review Skill to Evaluate: Site Reliability Engineer, AWS, DevOps, automation, Prometheus...
. Responsibilities: Preferred Skills AWS Certifications, such as: AWS Certified DevOps Engineer – Professional Solutions Architect... architectures using: SNS, SQS, Lambda, Step Functions Exposure to cost optimization strategies and billing analysis in AWS...
optimization across complex, global infrastructures. We are expanding our Site Reliability Engineering (SRE) team... detect and prevent issues. What makes you an ideal candidate: 2+ years of hands-on experience with AWS (EC2, ECS, EKS, RDS...
optimization across complex, global infrastructures. We are expanding our Site Reliability Engineering (SRE) team... detect and prevent issues. What makes you an ideal candidate 2+ years of hands-on experience with AWS (EC2, ECS, EKS...
by collaborating with Senior Engineers, SRE and application development teams to vet and validate test automation for our edge... such as OpenStack/Kubernetes. Experience working on well-known clouds like AWS/Azure/GCP would be a plus. Excellent written and verbal...
. Infrastructure & Cloud: Architect and operate AWS infrastructure (VPC, EKS, Transit Gateway, RDS, S3, ECR, IAM, etc.). Design..., aligning them with business needs. SKILLS Must have 8+ years of DevOps/SRE/Platform engineering experience, with 2...
Management Take ownership of how to procedurally deal with emergency situations. SRE should write the playbook on how to deal... and architectural skills) Expertise in various AWS services & their use cases. (EC2, Network, Lambda, IAM and more) Eagerness to keep...