Dream of a world where any scale of computation can be performed on an ever elastic cloud available at your fingertips. Imagine the horsepower of millions of virtual servers that can host millions of simultaneous multiplayer gaming sessions, solve complex computational problems in sub-seconds, support mission critical applications across a variety of business sectors; available anywhere, anytime on any device. Do you want to be a part of the team that is making this dream a reality? If so, the Azure Compute Manager is the right place for you. The team is focused on providing 99.999% uptime to all Virtual Machines in Azure. We are rapidly transforming Virtual Machine failure detection and repair decision making to be backed by Machine Learning. We're creating a new service to handle 10x our current scale and provide an experimentation platform for new failure prediction models and repair strategies. The service is being built on Kafka to process millions of events from host machines in near
The candidate should have hands on experience with design, development and operational expertise with highly available distributed systems. You will be working on optimizing the utilization of compute resources across Azure servers while balancing security isolation, performance isolation and workload priorities.