Facilitating the Spread of Knowledge and Innovation in Professional Software Development

The AI Misinformation Epidemic


Over the past five years, Google searches for Machine Learning have increased fivefold. Andy Stewart, a managing partner at Motive Partners, pointed out last week at the International Fintech Conference that 'for anything that has machine learning in it or blockchain in it, the valuation goes up, 2, 3, 4, 5x'. There is undeniably great interest from both the public and investors in Machine Learning and how it can be applied to different industry sectors.

In a recent article, 'The AI Misinformation Epidemic', Zachary Lipton, incoming assistant professor at Carnegie Mellon University, described how the wider public's interest in Machine Learning, combined with a lack of understanding of what is happening inside these systems, is creating a perfect storm of interest and ignorance that is causing a misinformation epidemic in the field. In a follow-up post, he clarified some of the points outlined in the first post.

In an outline of upcoming posts, Lipton attributed this epidemic to some AI influencers, some prophets of futurism, and a failure of the press to accurately describe AI in layman’s terms.

From a technical perspective, it’s not easy to explain in layman’s terms what’s happening in a Machine Learning system. It’s easier to describe and visualize procedural, deterministic algorithms, but many Machine Learning algorithms are based on probability theory, statistics and N-dimensional spaces. These concepts cannot easily be explained to the average reader within the length limits of current publications.
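
As a rough illustration of that contrast, the minimal Python sketch below places a procedural algorithm next to a tiny probabilistic model. It is only a sketch under stated assumptions: the function names, the three input features and the weights are hypothetical and do not come from Lipton's posts or from any real system. The sort can be followed step by step, while the model returns a probability whose value depends entirely on numbers that would normally be learned from data.

```python
import math


def quicksort(items):
    """Deterministic and easy to trace: same input, same steps, same output."""
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)


def predict_probability(features, weights, bias):
    """Logistic model: returns a probability rather than a yes/no answer."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical weights, chosen by hand for illustration only.
# In a real system they would be the result of training on data.
WEIGHTS = [0.8, -1.2, 0.05]
BIAS = -0.3

sorted_values = quicksort([3, 1, 2])
probability = predict_probability([1.0, 0.5, 10.0], WEIGHTS, BIAS)

print(sorted_values)
print(round(probability, 3))
```

Both functions are deterministic as code. The difference is in what a reader can be told: each step of the sort has an obvious reason, whereas the model's answer is a weighted sum pushed through a curve, and with thousands of weights and dimensions instead of three there is no comparably short story to tell.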

Even setting this aside, with the wealth of Machine Learning APIs available from major tech companies, it’s hard to convey the orders-of-magnitude difference in the work required between using a predictive analytics SaaS and rolling out your own implementation.

On the flip side, even if coverage of AI on the average tech or general-audience site is lacking, the field has a wealth of information freely available to anyone interested in learning. Most cutting-edge research in the field is published on arXiv and is available to everyone; there are numerous courses and nanodegrees on Machine Learning and AI from distinguished universities, and the open source ecosystem is vibrant and welcoming to anyone who wants to get her hands dirty on the subject.

Community comments

  • Computers are deterministic...

    by Will Hartung,

    Whenever something truly bizarre happens on a machine in my purview, I like to chime in, sarcastically, "Computers are deterministic". You know, plug in data on one side, get results on the other. Because, you know, computers are SUPPOSED to be deterministic.

    Sequential logic and state machines. "Logic". "State".

    Computers are, still, deterministic, but the problem we have with them as users and consumers, and even as developers, is that we are more and more unaware of the base state the computer is in when decisions are made. The complexity of the underlying software, the myriad of systems, the unknown interactions (both by design and not): all of these give the appearance of spontaneous behavior.

    Throw in "probabilistic" algorithms, and "intelligent behavior", and computers are going to get less and less "deterministic". We've all encountered things that can "never happen" in our systems.

    This is part and parcel of the rise of testing. We've given up understanding our software. Instead we just make sure it'll do what we ask for these specific tasks. "What happens for this edge case?" Many people don't know, unless they've tested for it. And even when the tests pass, that's no assurance of perfection, as the tests are, as all tests are by their nature, incomplete.

  • Deterministic to a T layers and layers of software

    by Alex Giamas,

    Fundamentally I don't disagree at all. Computers are deterministic and algorithms are steps that a somewhat "dumb" but really fast machine has to take in order to get from A to B.
    The problem is that what may appear as non-deterministic is a series of several layers of frameworks, libraries, APIs and a hardware stack that we no longer control or understand other than input/output.
    It was "easy" back in the early days of computing to understand pretty much everything that was happening beneath. Nowadays we have to stand on the shoulders of thousands of other developers to build even a basic single page informational web page if we want it to be up to our times.
    Also, each of these libraries may have its own bugs, unintended corner case behaviour or just a case with a wrong test in place. Add these up in layers and you end up with something that doesn't make sense in the application layer, maybe just because of a fault in a database driver, or something going wrong in a queue, or who knows what.

    That being said, in Machine Learning and AI you have a whole different class of problems. As someone said, "There are lies, damned lies, and statistics."
    Statistics have become the holy grail of our times, used to understand what is happening without much emphasis on how or why. We care about a complex probabilistic algorithm segmenting our users, but other than testing it with real users, we can't understand the why and how of each user's assignment.
    We can no longer easily visualize what's going on the same way we can do with a quicksort for instance..

    This I reckon is the main issue, one that is not easily solvable. The more our algorithms advance, the more we (we as in whoever is not a deeply involved ML/statistician person) will have to rely on ML outputs the same way we rely on an OAuth library working as expected, without so much of an understanding as to how things are happening.

    If you extrapolate this to non-techies it's of course even worse. If there are many software engineers who can't understand the inner workings of an AI algorithm, how can the popular press distinguish between true AI findings and AI-related news that has no scientific substance? It becomes impossible, and that's why we hear about AI all the time in the press/media and a good chunk of it is marketing talk rather than true AI...:)