
Microsoft’s Satya Nadella: “Data is full of biases – developers must invest in the tools to correct them”

Engineers risk building systemic bias into algorithms if they fail to make the right design decisions, Microsoft CEO Satya Nadella warned yesterday.

Speaking at the software giant’s Transformative AI summit in London, Nadella said the risk of automating bias was particularly high when it comes to natural language processing.

“One of the challenges of AI especially around language understanding are the models that pick up language learning from a vast corpus of human data,” he told an audience of customers at Millbank Tower. “Unfortunately the data is full of biases.”

“In fact you need to invest in tooling that de-biases when you model the language from the corpus of human data. That is one example of the kind of tooling that is required in order to make AI and the practices of AI work together with the ethical principles.”
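The debiasing tooling Nadella alludes to often works by identifying a bias direction in a model's word vectors and projecting it out, as in the "neutralise" step proposed by academic researchers. A minimal sketch of that idea, using toy four-dimensional vectors (the names, values and helper functions here are illustrative assumptions, not Microsoft's actual tooling):

```python
import numpy as np

# Toy 4-dimensional word vectors (illustrative values only -- real
# embeddings such as word2vec or GloVe have hundreds of dimensions).
vectors = {
    "he":         np.array([ 1.0,  0.2, 0.3, 0.1]),
    "she":        np.array([-1.0,  0.2, 0.3, 0.1]),
    "programmer": np.array([ 0.6,  0.8, 0.1, 0.4]),
}

def gender_direction(vecs):
    """Estimate a bias direction from a definitional pair, e.g. he/she."""
    d = vecs["he"] - vecs["she"]
    return d / np.linalg.norm(d)

def neutralize(vec, direction):
    """Remove the component of `vec` that lies along the bias direction."""
    return vec - np.dot(vec, direction) * direction

g = gender_direction(vectors)
debiased = neutralize(vectors["programmer"], g)

# After neutralising, "programmer" carries no component along he-she.
print(round(float(np.dot(debiased, g)), 6))  # 0.0
```

In practice the bias direction is estimated from many definitional pairs, and only words that should be gender-neutral are neutralised.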

Microsoft learnt this the hard way. In 2016, the company unveiled an AI Twitter bot called Tay that had been programmed for “conversational understanding”. Within 24 hours, Tay had begun to parrot the sexist and racist abuse it received. The bot was swiftly shut down.

The software giant is hardly alone in having accidentally built a prejudiced algorithm. Researchers at Boston University and Microsoft Research New England recently found that some word-embedding models are more likely to associate the word “programmer” with “man” and “homemaker” with “woman”. Facial recognition algorithms, meanwhile, are more likely to misidentify people of colour, potentially leading to unwarranted arrests.
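The kind of association the researchers measured can be checked with cosine similarity between word vectors. A minimal sketch, using toy three-dimensional vectors deliberately constructed to mimic the reported bias (real studies use pre-trained embeddings learned from large news corpora):

```python
import numpy as np

# Toy vectors chosen to mimic the bias the researchers measured; the
# values are illustrative assumptions, not real embedding weights.
vecs = {
    "man":        np.array([ 1.0, 0.1, 0.2]),
    "woman":      np.array([-1.0, 0.1, 0.2]),
    "programmer": np.array([ 0.7, 0.9, 0.1]),
    "homemaker":  np.array([-0.8, 0.2, 0.6]),
}

def cosine(a, b):
    """Cosine similarity: 1.0 means identical direction, -1.0 opposite."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# A biased embedding places "programmer" nearer "man" than "woman",
# and "homemaker" nearer "woman" than "man".
print(cosine(vecs["programmer"], vecs["man"]) > cosine(vecs["programmer"], vecs["woman"]))  # True
print(cosine(vecs["homemaker"], vecs["woman"]) > cosine(vecs["homemaker"], vecs["man"]))    # True
```

Because the embedding is trained on human-written text, these asymmetries are learned from the data rather than programmed in, which is precisely Nadella's point.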

Mitra Azizirad, Microsoft’s cloud AI marketing chief, told NS Tech that the company is committed to opening up the debiasing tools it has built in house to those who use its platforms. “Training these models is not just the realm of data scientists and engineers, but developers too. The ability for them to have access to the same kind of tools that we do internally is something we’re focused on.”

One of the requirements of the EU’s new General Data Protection Regulation (GDPR) is that organisations should be able to explain to users why algorithms have come to particular conclusions. Some in the industry fear that this could present a challenge to tech firms, but Azizirad is optimistic. “I don’t think that’s a tomorrow or next year kind of thing, but the goal is to be able to do that. No matter how far-flung things are, the thing about AI is that it’s 80 per cent data… You can’t profess not to understand what’s going on.”