By Simon Crox
Last week, when I put my 18-month-old nephew Ramsy to bed, he said: ‘Today I learned six new words!’ He seemed happy and I was proud of him. Of course, he never actually said this; it was just my imagination. Could he even be aware of how many words he had learned? Yet I realised that, without ever touching a dictionary, he would eventually speak fluently. Through daily life, the complexity and subtle rules of language would gradually become clear to him. In much the same way, software can learn – implicitly – from the data it is given. This is known as machine learning (ML).
Most software used in medical science is programmed – explicitly – to perform a task. Its algorithms encode knowledge on a given topic and can draw conclusions about a specific scenario. Conversely, ML software observes variables and looks for combinations that predict an outcome of interest. For example, prognostic ML software could analyse millions of electronic health records (EHRs) from cancer patients and learn which variables predict mortality. It could then analyse an individual patient’s EHR and predict mortality for that patient[1]. Researchers from Harvard expect such prognostic algorithms to come into use within the next five years. ML software thus shines at handling large amounts of highly complex data in an interactive way. In clinical medicine, ML could also help with generating differential diagnoses or with radiology. Recently, Radboudumc, together with Delft Imaging Systems, developed ML software that can tell from a patient’s X-ray whether the patient is at risk of tuberculosis[2]. It has since been used successfully in mobile clinics in Africa and Europe.
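To make the contrast with explicit programming concrete, here is a minimal sketch in Python using scikit-learn. The data, variable names, and coefficients are all invented for illustration – this is not a real EHR schema or a real prognostic model – but it shows the core idea: the model is never told the rules, it infers which variables predict the outcome from examples.

```python
# Sketch: an ML model learns mortality predictors from (synthetic) data
# rather than being explicitly programmed with medical rules.
# All variables and numbers below are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
age = rng.normal(65, 10, n)            # hypothetical patient age
tumour_stage = rng.integers(1, 5, n)   # hypothetical stage 1-4

# The "true" (hidden) relationship the model must discover:
risk = 0.05 * (age - 65) + 0.8 * (tumour_stage - 2) + rng.normal(0, 1, n)
mortality = (risk > 0).astype(int)

X = np.column_stack([age, tumour_stage])
X_train, X_test, y_train, y_test = train_test_split(
    X, mortality, random_state=0
)

# Fit on one part of the data, evaluate predictions on held-out patients
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The model ends up predicting mortality well above chance on patients it has never seen, purely by generalising from the training examples – the same principle, at toy scale, behind the EHR example above.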
However, ML software is not only used in clinical settings. At the moment, the toxicity of chemicals is mainly assessed in vivo with animal models, but toxicity predictions could also be made in silico, using computational methods. If accurate enough, ML could actually reduce the need for animal testing[3].
Of course, serious concerns must be addressed. ML algorithms require plentiful observations to make accurate predictions. In clinical medicine this is problematic, because EHR software is not easily accessible to ML software and, even more so, because of privacy legislation. Moreover, ML often requires manual input at first, which is highly labour-intensive. Another issue is the “black box” character of ML: it cannot explain how it arrived at a given outcome. Finally, we should keep in mind that these predictions reflect correlations, not causation. This can lead to so-called spurious correlations, as nicely depicted on www.tylervigen.com. For instance, if ice cream sales were high when most people drowned, that does not mean that selling ice cream causes people to drown.
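The ice cream example can be sketched in a few lines of Python. In this toy simulation (all numbers invented), both ice cream sales and drownings are driven by a third variable, temperature, so the two correlate strongly even though neither causes the other:

```python
# Sketch of a spurious correlation: two variables that share a common
# cause (temperature) correlate, without any causal link between them.
# All coefficients are made-up illustrative values.
import numpy as np

rng = np.random.default_rng(1)
temperature = rng.uniform(5, 35, 365)                     # daily temperature
ice_cream_sales = 10 * temperature + rng.normal(0, 20, 365)
drownings = 0.2 * temperature + rng.normal(0, 1, 365)

# Pearson correlation between the two "unrelated" variables
r = np.corrcoef(ice_cream_sales, drownings)[0, 1]
print(f"correlation: {r:.2f}")
```

A naive ML model fed only sales and drowning figures would happily use one to predict the other; only domain knowledge reveals the hidden common cause.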
Today, ML is probably still the child learning a new language: promising, but we cannot give it much responsibility yet. Once more mature, ML will likely provide useful new tools for both physicians and researchers.
References:
1. Obermeyer Z, Emanuel EJ. Predicting the future — big data, machine learning, and clinical medicine. N Engl J Med. 2016;375:1216-1219.
2. Van Dijk V. Kunstmatige intelligentie in de zorg: maken computers de radioloog overbodig? [Artificial intelligence in healthcare: will computers make the radiologist redundant?]. Ned Tijdschr Geneeskd. 2017;161:C3656.
3. Raies AB, Bajic VB. In silico toxicology: computational methods for the prediction of chemical toxicity. Wiley Interdiscip Rev Comput Mol Sci. 2016;6:147-172.