The building blocks of Microsoft’s responsible AI program

The pace at which artificial intelligence (AI) is advancing is remarkable. As we look out at the next few years for this field, one thing is clear: AI will be celebrated for its benefits but also scrutinized and, to some degree, feared. It remains our belief that, for AI to benefit everyone, it must be developed and used in ways which warrant people’s trust. Microsoft’s approach, which is based on our AI principles, is focused on proactively establishing guardrails for AI systems so that we can make sure that their risks are anticipated and mitigated, and their benefits are maximized.

Over the past few years, principles around developing AI responsibly have proliferated and, for the most part, there is overwhelming agreement on the need to prioritize issues like transparency, fairness, accountability, privacy, and security. Yet while principles are necessary, having them alone is not enough. The hard and essential work begins when you endeavor to turn those principles into practices.

Today, I am sharing some details about the building blocks that are the basis for our responsible AI program at Microsoft: a governance structure to enable progress and accountability; rules to standardize our responsible AI requirements; training and practices to help our employees act on our principles and think deeply about the sociotechnical impacts of our AI systems; and tools and processes for implementation.

There’s much more to discuss around critical issues like facial recognition, large-scale language models and other sensitive AI applications that affect all of our lives. We are engaged on those issues and plan to stay engaged in the broader community-wide discussions, and to hear from and learn from others. It’s important to have these broader societal conversations because the collective choices we make will shape how we build and use AI and the future it will help bring about.

This blog doesn’t address those issues directly. It’s specifically focused on a core question we get asked by customers and the broader community about how we take principles and turn them into practice. This blog is about the processes, tools, training and other resources that we use to ensure that the AI solutions we develop actually reflect the principles we adopted.

Governance as a foundation for compliance

While there is much that is new and uncharted in the domain of responsible AI, there’s also much that can be learned from adjacent domains. Our responsible AI governance approach borrows the hub-and-spoke model that has worked successfully to integrate privacy, security and accessibility into our products and services.

Our “hub” includes: the Aether Committee, whose working groups leverage top scientific and engineering talent to provide subject-matter expertise on the state-of-the-art and emerging trends regarding the enactment of Microsoft’s responsible AI principles; the Office of Responsible AI, which sets our policies and governance processes; and our Responsible AI Strategy in Engineering (RAISE) group, which enables our engineering groups to implement our responsible AI processes through systems and tools. The three groups work together to set a consistent bar for responsible AI across the company and they empower our “spokes” to drive initiatives and be accountable for them.

The spokes of our governance include our Responsible AI Champs community. The Champs are appointed by company leadership and sit in engineering and sales teams across the company. They raise awareness about Microsoft’s approach to responsible AI and the tools and processes available, they spot issues and help teams assess ethical and societal considerations, and they cultivate a culture of responsible innovation in their teams.

Developing rules to enact our principles

In the fall of 2019, we published internally the first version of our Responsible AI Standard, a set of rules for how we enact our responsible AI principles underpinned by Microsoft’s corporate policy. We published the first version of the Standard with an eye to learning, and with a humble recognition that we were at the beginning of our effort to systematically move from principles to practices. Through a phased pilot across 10 engineering groups and two customer-facing teams, we learned what worked and what did not. Our pilot teams appreciated the examples of how responsible AI concerns can arise. They also struggled sometimes with the open-endedness of the considerations laid out in the Standard and expressed a desire for more concrete requirements and criteria. There was a thirst for more tools, templates, and systems, and for a closer integration with existing development practices.

Just over a year later, we’re previewing version two of the Responsible AI Standard with our employees. The revision reinforces a human-centered approach, building upon strong research and engineering foundations. It will mandate that teams building AI systems meet requirements that accrue to principle-specific goals. These goals help engage our engineering teams’ problem-solving instincts and provide context for the requirements.

For each requirement in the Responsible AI Standard, we will build out a set of implementation methods that teams can draw upon, including tools, patterns and practices crowdsourced from within and outside the company and refined through a maturity process. We expect this to be a cross-company, multi-year effort and one of the most critical elements for operationalizing responsible AI across the company. We will continue to collect and integrate feedback as we move towards finalizing version two of the Standard and its global implementation.

Drawing red lines and working through the grey areas

In the fast-moving and nuanced practice of responsible AI, it is impossible to reduce all the complex sociotechnical considerations into an exhaustive set of pre-defined rules. This led us to create a process for ongoing review and oversight of high-impact cases and rising issues and questions.

Our sensitive uses process requires that use cases that meet our review criteria are reported to our Office of Responsible AI for triage and review, which includes a deliberation when there is no existing precedent to draw upon. Since July 2019, we’ve processed over two hundred use case reviews, including an uptick in reviews since March 2020 as more Microsoft teams and customers sought to harness data and AI methods to mitigate the challenges of Covid-19.

This sensitive uses review process has helped us navigate the grey areas that are inevitably encountered and, in some cases, has led to new red lines. Outcomes of the process include our declining opportunities to build and deploy specific AI applications because we were not confident that we could do so in a way that upheld our principles. For example, Microsoft President Brad Smith spoke publicly about how, through our sensitive uses review process, we determined that a local California police department’s real-time use of facial recognition on body-worn cameras and dash cams in patrol scenarios was premature, and he shared the fact that we turned down the deal. In addition to navigating the technical challenges presented by facial recognition operating in an uncontrolled environment, our sensitive uses review process helped us to form the view that there needed to be a societal conversation around the use of facial recognition, and laws needed to be established. Thus, a red line was drawn for this use case as early as 2018.

In working through the complexities of several other cases, we also came to appreciate the importance of three key learnings. First, by digging into the details of use cases, we’ve been able to understand and articulate their different risk profiles, such as the impact of failure and misuse on stakeholders, and the readiness of the technology for the particular use case. Second, we’ve learned the important roles that benchmarking and operational testing play, helping to ensure that AI systems serve their stakeholders well and meet quality bars not just in labs, but also in the real world. And, third, we’ve learned how we need to communicate with our customers to empower them to deploy their systems responsibly.

These learnings helped inform new practices at Microsoft. For example, we developed Transparency Notes to help teams communicate the purposes, capabilities and limitations of an AI system so our customers can understand when and how to deploy our platform technologies. Transparency Notes fill the gap between marketing and technical documentation, proactively communicating information that our customers need to know to deploy AI responsibly. Our Face API Transparency Note was our first attempt at this new practice, and we now have a growing number of Transparency Notes being prepared across our platform offerings. We see synergies between our Transparency Notes and other industry efforts such as Model Cards, Datasheets for Datasets, and AI FactSheets, and we’re pleased to be playing an active role in the Partnership on AI’s ABOUT ML initiative to evolve the artifacts and processes for responsible AI documentation.

Evolving our mindset and asking hard questions

Today, we understand that it is critically important for our employees to think holistically about the AI systems we choose to build. As part of this, we all need to think deeply about and account for sociotechnical impacts. That’s why we’ve developed training and practices to help our teams build the muscle of asking ground-zero questions, such as, “Why are we building this AI system?” and, “Is the AI technology at the core of this system ready for this application?”

In 2020, our mandatory Introduction to Responsible AI training helped more than 145,000 employees learn the sensitive uses process, the Responsible AI Standard and the foundations of our AI principles.

Additionally, we introduced Envision AI, an applied workshop and practice for completing impact assessments. Developed by our Project Tokyo team and Office of Responsible AI, Envision AI takes participants through real scenarios that emerged while our Project Tokyo team was immersed in designing an approach to intelligent personal agent technology. Through interactive exercises, participants learn about human-centered AI and the mindset it requires, and they experience using the types of tools available to systematically think about the impacts of a technology on a broad set of people. Individuals and teams apply this directly to the task of conducting an impact assessment, which is required within our Responsible AI Standard. As is the norm for our responsible AI work, we built the workshop using a research-led, iterative test-and-learn approach, and we have been encouraged by the feedback we have received. We are in the process of scaling Envision AI to more teams across Microsoft and to groups outside the company.

The kind of mindset shift we are guiding involves an ongoing process of dialogue, integration and reinforcement. At times, our teams at Microsoft have experienced galvanizing moments that accelerated progress, such as triaging a customer report of an AI system behaving in an unacceptable way. At the same time, we’ve also seen teams wonder whether being “responsible” will be limiting, only to realize later that a human-centered approach to AI results in not just a responsible product, but a better product overall.

Pioneering new engineering practices

Privacy, and the GDPR experience in particular, taught us the importance of engineered systems and tools for enacting a new initiative at scale and ensuring that key considerations are baked in by design.

As we have been rolling out our responsible AI program across the company, the existence of engineering systems and tools to help deliver on our responsible AI commitments has been a priority for our teams. Although tooling – particularly in its most technical sense – is not capable of the deep, human-centered thinking work that needs to be undertaken while conceiving AI systems, we think it is important to develop repeatable tools, patterns and practices where possible so the creative thought of our engineering teams can be directed toward the most novel and unique challenges, not reinventing the wheel. Integrated systems and tools also help drive consistency and ensure that responsible AI is part of the everyday way in which our engineering teams work.

In recognition of this need, we are embarking on an initiative to build out the “paved road” for responsible AI at Microsoft – the set of tools, patterns and practices that help teams easily integrate responsible AI requirements into their everyday development practices. AzureML serves as the foundation for this paved road, leveraging the early integrations of our open-source tools, Fairlearn and InterpretML, so that our customers will also benefit from our development of engineering systems and tools.

Scaling our efforts to develop AI responsibly

As we look ahead, we’ll focus on three things: first, consistently and systematically enacting our principles through the continued rollout of our Responsible AI Standard; second, advancing the state of the art of responsible AI through research-to-practice incubations and new engineering systems and tools; third, continuing to build a culture of responsible AI across the company.

We are acutely aware that, as the adoption of AI technologies accelerates, new and complex ethical challenges will arise. While we recognize that we don’t have all the answers, the building blocks of our approach to responsible AI at Microsoft are designed to help us stay ahead of these challenges and enact a deliberate and principled approach. We will continue to share what we learn, and we welcome opportunities to learn with others.