Should brands open-source their data to remove AI bias?

18 : 12 : 2017 Technology : Digital : Civic Brands
Graphcore is a machine learning start-up

Josh Walker, journalist, The Future Laboratory

It’s been well documented that, for all the good intentions behind their learning and rapid decision-making, AI and algorithms are inherently biased.

Last year, a report from ProPublica claimed that a risk assessment program used in US courts was biased against black defendants. The system was found to be almost twice as likely to wrongly flag a black defendant as a future reoffender as it was a white defendant (45% compared with 24%). And who could forget the disaster that was Microsoft’s AI chatbot Tay? In less than 24 hours, it had turned from a machine with ‘conversational understanding’ into one spouting misogynistic, racist and extreme far-right views.
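One way to read the 45% versus 24% comparison is as a false-positive rate: of the people in each group who did not go on to reoffend, what share had been flagged as high risk? The sketch below shows that arithmetic. The records are invented purely to reproduce the two percentages and are not ProPublica’s data; the function name, the group labels "A" and "B" and the record layout are all assumptions made for illustration.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Return, per group, the share of non-reoffenders who were flagged high risk.

    `records` is an iterable of (group, flagged_high_risk, reoffended) tuples.
    This layout is a hypothetical one chosen for the example.
    """
    flagged = defaultdict(int)
    total = defaultdict(int)
    for group, flagged_high_risk, reoffended in records:
        if reoffended:
            # Only people who did NOT reoffend can be false positives.
            continue
        total[group] += 1
        if flagged_high_risk:
            flagged[group] += 1
    return {group: flagged[group] / total[group] for group in total}


# Invented toy data: 100 non-reoffenders per group, plus a few reoffenders
# that the calculation should ignore.
records = (
    [("A", True, False)] * 45 + [("A", False, False)] * 55
    + [("B", True, False)] * 24 + [("B", False, False)] * 76
    + [("A", True, True)] * 10 + [("B", True, True)] * 10
)

rates = false_positive_rate_by_group(records)
ratio = rates["A"] / rates["B"]

for group, rate in sorted(rates.items()):
    print(f"group {group}: {rate:.0%} of non-reoffenders flagged high risk")
print(f"ratio: {ratio:.2f}x")
```

With these made-up numbers the ratio comes out just under two, which is what ‘almost twice as likely’ describes. A real audit would also need to look at other error types, such as people rated low risk who did reoffend.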

While both of these examples of AI bias are no doubt shocking on the surface, what is perhaps more deeply unsettling is that they are mirrors of today’s society. Merely processing the information they have been fed, they embody the well-known computer science term GIGO, or Garbage In, Garbage Out.

Interestingly, a recent article from Motherboard reported that much of the biased nature of these algorithms comes down to copyright. With so much of the data used to train algorithms in the US protected by copyright, and AI researchers having to use public domain works to test their algorithms out, bias will no doubt be high. When you take resources like the much-trawled Wikipedia into account, where, according to an Editor Survey from back in 2011, only 8.5% of Wikipedia editors were noted as female, you start to see where the problems arise.

So with the market for machine learning applications set to reach £31bn ($40bn, €34bn) by 2020, according to market intelligence firm International Data Corporation, and with Narrative Science and the National Business Research Institute predicting that 62% of US enterprises will be using AI by next year, the time for change is now. It is up to brands to facilitate that change.

Though there would be natural reluctance from brands to hand a competitor's AI total access to their proprietary data, the very least they can do is make the effort to understand how these systems are working.

Within our Civic Brands macrotrend, we outlined the importance of pushing Total Transparency, noting that in order to create an ecosystem in which their products and services are future-proofed, brands will need to open up their data and collaborate with partners. With humans entrusting millions of micro and macro decisions to these algorithms, particularly within industries such as law and medicine, this push for transparency has never been more important.

And though brands would naturally be reluctant to open up their internal workings completely to an unknown AI, the very least they can do is make the effort to understand how these systems work. That applies both to brands trying to understand how a black box system functions and to consumers who, as algorithms become even more prevalent, will want to know how a brand’s algorithm reached its final decision.

As John Giannandrea, Google’s AI chief, recently said, ‘it’s important that we be transparent about the training data that we are using, and are looking for hidden biases in it, otherwise we are building biased systems. If someone is trying to sell you a black box system for medical decision support, and you don’t know how it works or what data was used to train it, then I wouldn’t trust it.’

Garbage in, garbage out.

For more on how your brand can step in as a force for good within society, see our Civic Brands macrotrend.