AWS customers have come to rely on our track record of stellar operational performance. Our team ensures that AWS services, which are quickly growing in number and scale, can deliver on that reliability promise. We analyze trends and build software used across AWS to reduce the recurrence, duration, and size of customer-impacting events.

The AWS Safety Engineering team is looking for a strong engineering leader to build an innovative incident analysis and problem management platform. Our ideal candidate thrives in a fast-paced environment; relishes working with large-scale systems, understanding how systems fail at scale, and driving the identification of solutions; and, above all else, is a passionate builder of talent and teams. In this role you will be responsible for leading teams of engineers to create world-class incident analysis and problem management workflows for AWS services and customers. You must be willing to insist on the highest standards for quality, controllership, maintainability, and performance.
· 3+ years of non-internship professional software development experience
· Programming experience with at least one modern language such as Java, C++, or C#, including object-oriented design
· 1+ years of experience contributing to the architecture and design (architecture, design patterns, reliability and scaling) of new and current systems

We're building tools to create an immune system for AWS. The preferred candidate will have experience with AWS tools and a track record of building highly reliable software. Past experience working with machine learning researchers is a plus.