Job Details
Posted date: Mar 04, 2026
There have been 2642 jobs posted with the title of Principal Software Engineer all time at Microsoft. There have been 2642 Principal Software Engineer jobs posted in the last month.
Category: Software Engineering
Location: Redmond, WA
Estimated salary: $222,050
Range: $139,900 - $304,200
Employment type: Full-Time
Work location type: 3 days / week in-office
Role: Individual Contributor
Description
Overview
Microsoft’s Cloud Operations and Innovation (CO+I) organization builds and operates the global datacenter infrastructure that powers Microsoft’s cloud. Within CO+I, the Engineering organization (CO+IE) delivers the software platforms, telemetry pipelines, and automation that enable scalable, reliable, and cost‑efficient datacenter planning and operations. These systems form a critical competitive advantage for Microsoft, translating physical infrastructure signals into intelligence that protects availability, improves sustainability, and enables continuous scale.
As a Principal Software Engineer, you operate as a hands‑on architect and AI expert at the intersection of cloud platforms, critical physical infrastructure, and intelligent operations. You design and incubate next‑generation AI Ops solutions for Critical Environments (CE)—powering real‑time situational awareness, proactive risk detection, and autonomous decision support across Microsoft’s global datacenter fleet.
We are looking for a Principal Software Engineer to help deliver automation capabilities that power long-range execution planning, drive workflow improvements, and build solutions that assist in the delivery of large-scale datacenters through efficient management of cost and schedule.
Aligned with Microsoft’s mission to empower every person and every organization on the planet to achieve more, you bring a growth mindset, strong ownership, and a bias for learning. You embody Microsoft’s values of respect, integrity, and accountability, fostering an inclusive engineering culture where diverse perspectives shape how AI responsibly and effectively transforms critical infrastructure operations.
Responsibilities
AI‑Driven Situational Awareness for Critical Environments
You lead the technical vision for AI‑assisted observability and operations across electrical and mechanical systems, including power distribution, cooling, and life‑safety infrastructure. By integrating telemetry from SCADA, EPMS, BAS, and IoT systems—leveraging protocols such as MODBUS, BACnet, and MQTT—you create unified, cloud‑native data platforms on Azure that enable intelligent reasoning over complex, hierarchical infrastructure.
Your AI expertise is applied end‑to‑end:
Telemetry Intelligence
You design semantic telemetry layers that normalize and contextualize signals from heterogeneous CE devices, enabling AI models to understand not just raw data, but system intent, topology, and operational state.
Anomaly Detection & Prediction
You develop and operationalize machine learning models that learn normal operating behavior of power and mechanical systems, detect subtle deviations, and identify early indicators of risk, well before issues escalate into incidents. These models support predictive maintenance, failure prevention, and faster time‑to‑detection across the fleet.
Event Correlation & Blast Radius Analysis
You architect AI‑driven correlation engines that reason across electrical, thermal, and operational signals to determine root cause, propagation paths, and customer impact, reducing cognitive load for operators and accelerating mitigation.
AI‑Assisted Triage and Decision Support
You enable AI‑powered operational workflows that surface the right information at the right time, combining telemetry, historical incidents, and system models to guide engineers through response actions with clarity and confidence.
From Human‑in‑the‑Loop to Autonomous Operations
As a Principal, you push the organization along the maturity curve from reactive monitoring to proactive and eventually autonomous CE operations. You design systems where AI augments human judgment by providing recommendations, validating safety boundaries, and automating routine actions, while respecting the stringent reliability and safety requirements of mission‑critical infrastructure.
You are deeply hands‑on, building prototypes, reference architectures, and early production systems that validate AI approaches against real‑world CE constraints. You partner closely with datacenter operations, electrical and mechanical engineers, and control‑system vendors to ensure AI solutions are physically grounded, explainable, and trusted by operators.
Technical Leadership and Organizational Impact
Beyond individual contributions, you provide technical leadership across CO+IE:
• Defining architectural patterns for AI Ops, event‑driven systems, and real‑time analytics in Critical Environments
• Mentoring engineers on applied ML, streaming data platforms, and reliable distributed systems
• Influencing roadmaps to ensure AI is embedded as a first‑class capability in CE tooling, not an afterthought
Your work helps transform Critical Environment telemetry into a strategic asset, enabling higher availability, optimized energy usage, improved sustainability outcomes, and safer operations at global scale, consistent with CO+I’s mission and North Star direction.
You will:
• Write high quality, maintainable, reusable code following SOLID principles.
• Collaborate with stakeholders and demonstrate developed features to them in an Agile environment.
• Resolve complex system integration challenges working with other members of the team and external teams.
• Share learnings and code assets developed with the CO+I engineering team.
• Leverage subject-matter expertise of product features and partner with appropriate stakeholders (e.g., project managers) to drive a workgroup's project plans, release plans, and work items.
• Act as a Designated Responsible Individual (DRI) and guide other engineers by developing and following the playbook, working on call to monitor the system/product/service for degradation, downtime, or interruptions, alerting stakeholders about status, and initiating actions to restore the system/product/service for simple and complex problems when appropriate.
• Proactively seek new knowledge and adapt to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale.
Qualifications
Required/minimum qualifications
Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
OR equivalent experience.
Background Check Requirements:
Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Additional or preferred qualifications
Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
OR equivalent experience.
Experience or exposure in data engineering and backend work.
Experience working with MODBUS, BACnet, datacenter critical environment telemetry, Azure IoT, AI Ops, LLM, agentic apps, Kusto, machine learning, MQTT, OPC-UA, or equivalent experience.
#COICareers
Software Engineering IC5 - The typical base pay range for this role across the U.S. is USD $139,900 - $274,800 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $188,000 - $304,200 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
https://careers.microsoft.com/us/en/us-corporate-pay
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.