Artificial intelligence and diagnostic radiology: Not quite ready to welcome our computer overlords

Published Date: March 30, 2012

Hal 9000 from 2001: A Space Odyssey:

Hal: “The 9000 series is the most reliable computer ever made. No 9000 computer has ever made a mistake or distorted information. We are all, by any practical definition of the words, foolproof and incapable of error. I am putting myself to the fullest possible use, which is all I think that any conscious entity can ever hope to do.”

Hal: “Look Dave, I can see you’re really upset about this. I honestly think you ought to sit down calmly, take a stress pill, and think things over.”

Hal: “I know I’ve made some very poor decisions recently, but I can give you my complete assurance that my work will be back to normal. I’ve still got the greatest enthusiasm and confidence in the mission. And I want to help you.”

Ken Jennings (winner of 74 consecutive Jeopardy! matches), after a devastating loss to Watson:

“I for one welcome our new computer overlords.”

Siri responding to the question “Will you marry me”:

“My End User Licensing Agreement does not cover marriage. My apologies.”

Watson in response to this Jeopardy! answer under the category of U.S. Cities:

Answer: Its largest airport is named for a World War II hero; its second largest, for a World War II battle.

Question: What is Toronto?¹

The year 2011 will likely be remembered as a major milestone in the mainstream introduction and adoption of artificial intelligence in our society. The introduction of “Siri,” the artificial intelligence personal assistant with a female voice that is included with every Apple iPhone 4S, has captured the collective imagination of millions of people around the world, who now use it routinely to plan, communicate, learn, and entertain themselves.

The highly anticipated and widely watched 3-day Jeopardy! match in February 2011 demonstrated that, despite their valiant efforts, the 2 best human players of all time could not keep up with IBM’s Watson computer, prompting Ken Jennings to paraphrase a line from The Simpsons and declare, “I for one welcome our new computer overlords,” after he and Brad Rutter were crushed.¹

What is significant about the “deep Q/A software” is its ability to perform so well and with lightning speed in an “open-domain” challenge, answering questions about almost any topic. What is particularly groundbreaking about Watson’s approach is its ability to break down a query using natural language processing and then to consider, analyze, and rank millions of possible answers within 3 seconds. It does this dynamically, without having any preprogrammed questions or answers memorized. This represents a fundamental advance over artificial intelligence software developed in the late 1970s, such as Mycin, which was created at Stanford to diagnose sepsis in the ICU, or Internist I and Internist II, which were programmed based on the experience and clinical expertise of a single “expert,” Dr. Jack Myers, the chairman of medicine at the University of Pittsburgh in the early 1980s. These systems were amazing in their era, but were relatively slow, difficult to interact with, and inflexible, limitations that made those programs impractical for use in actual clinical care.

The potential to bring Watson and other 21st-century technology to medicine, to analyze current and historic medical literature and to assist in patient surveillance, diagnosis, and therapy, is enormous. In my opinion, the application of what has been referred to as “artificial intelligence” will represent a major advance in the evolution of medicine and, specifically, in the evolving practice of diagnostic imaging. I personally had the opportunity to work with the Watson team before and after the Jeopardy! match to bring the technology to medical care. I have been extraordinarily impressed with the computer’s ability to rapidly acquire medical domain knowledge and to accurately suggest what seems to be an impressive list of diagnostic and therapeutic possibilities, listed in order of confidence. The potential of a massively parallel system that is able to read the equivalent of a million books per second and to respond within moments is tremendous in research, disease prevention, and clinical care. These “artificial intelligence” systems do not fall prey to such cognitive pitfalls as “satisfaction of search” (premature closure), incorrect assumption of a single rather than multifactorial cause, fatigue, or distractibility. Physicians in general, including radiologists, are reaching the point of information saturation due to the increasing volume and complexity of information. This has actually worsened with the increasing availability of electronic medical records and with “alert fatigue,” and will intensify with the approaching era of personalized medicine and the accompanying deluge of biomarkers, such as genomic, proteomic, and metabolic patient data.

In our own specialty, we diagnostic radiologists spend only a relatively small percentage of our day performing pure image recognition, comparison, and interpretation. The vast majority of our time is spent signing in and out of various computer systems, protocoling studies, trying to obtain relevant and useful information related to studies that we are interpreting, arranging the proper images/sequences/studies for comparison, trying to communicate information to technologists, patients, clinicians, and support personnel, and performing other miscellaneous tasks.

Most of us have varying levels of assistance in these tasks; for example, in our academic radiology practice I have residents and fellows to preread the studies, interface with technologists and clinicians, and, in some cases, review a patient’s electronic medical record or talk directly to the patients. In nonacademic practices there are, in some cases, designated personnel to preprocess the images, communicate with physicians, protocol studies, or summarize previous recommendations or important findings in the patient’s chart or from previous imaging studies.

Intelligent computer systems will, within the next 10 years, be used to automate many of these functions unrelated to image interpretation, which will not only make us substantially more efficient but will also decrease error rates and improve patient safety. Unlike residents and fellows, computer systems such as Watson can work more than 80 hours per week, do not experience fatigue, and do not graduate or leave at the end of the year; like residents and fellows, they will learn and continue to improve over time.

For those of you who are wondering whether radiologists will soon be replaced by artificial intelligence systems, such as Watson or Siri, there is encouraging news. It turns out that while these systems can do a fairly good job with extraction and analysis of structured and even unstructured text-based data, they are still at a surprisingly primitive level in their evaluation of images. Koch and Tononi published an article in Scientific American² suggesting that the ultimate test of “conscious awareness” was not the famous Turing test, which assesses whether a computer can fool a human into thinking it is another human, but rather the ability to determine what is wrong with an “ordinary” photograph. They use an example of an elephant sitting on top of the Eiffel Tower, which might be used in a Highlights magazine quiz for 5-year-olds, to illustrate the difficulty computers have with analyzing what is wrong with a given image. The current state of the art in computer science is still many years away from being able to solve these types of challenges, which suggests that radiology may be one of the last specialties to be vulnerable to being replaced (or, unfortunately, strongly assisted) by the current generation of artificial intelligence systems, however many TeraFLOPs of processing power they may possess.

The recent renaissance in artificial intelligence (AI) in medicine will likely have a major positive impact on the practice of diagnostic radiology and the practice of medicine in general within the next 10 years. It will allow us to spend a higher percentage of our time in the actual analysis and interpretation of medical images and will provide a much better summary of relevant patient information to help us determine the a priori probability of disease in a given patient, so that we can work more effectively, efficiently, and safely. However, we need to be cautious in our development and adoption of the technology. Just as our residents and fellows make mistakes, we need to understand that our fledgling AI systems have the potential to make even bigger blunders, and that they, at least for now, must be treated like a quirky but really enthusiastic and hardworking medical student with incredible potential, if not a great base of experience, common sense, or sense of humor and humility.

References

1. Baker S. Final Jeopardy: How can Watson conclude that Toronto is a U.S. city? Numerati. http://thenumerati.net/?postID=726&final-jeopardy-how-can-watson-conclude-that-toronto-is-a-u-s-city. Updated February 15, 2011. Accessed March 19, 2012.
2. Koch C, Tononi G. How will we know when we’ve built a sentient computer? By making it solve a simple puzzle. Scientific American. http://www.scientificamerican.com/article.cfm?id=a-test-for-consciousness. Updated June 13, 2011. Accessed March 19, 2012.
