ml in medicine
some rough notes on ml in medicine
general
- 3 types of applications
- disease and patient categorization (e.g. classification)
- fundamental biological study
- treatment of patients
- philosophy
- want to focus on problems doctors can’t do
- alternatively, focus on automating screening that parents can do at home in a cost-effective way
- pathology - branch of medicine where you take some tissue from a patient (e.g. a tumor), look at it under a microscope, and make an assessment of what the disease is
- websites are often easier than apps for patients
- The clinical artificial intelligence department: a prerequisite for success (cosgriff et al. 2020) - we need designated departments for clinical ai so we don’t have to rely on 3rd-party vendors and can test for things like distribution shift (see the sketch after this list)
- challenges in ai healthcare (news)
- adversarial examples
- things can’t be de-identified
- algorithms / data can be biased
- correlation / causation get confused
- healthcare is nearly 20% of US GDP
- prognosis is a prediction of the likely course and outcome of a disease or its treatment
- diagnosis is actually identifying the problem and giving it a name, such as depression or obsessive-compulsive disorder
- AI is a technology, but it’s not a product
- health economics incentives align with health incentives: catching tumor early is cheaper for hospitals
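
a minimal sketch of the kind of distribution-shift check such a department might run: a two-sample KS test per feature between training-time and deployment-time data (the feature names, data, and 0.01 cutoff are all invented for illustration):

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
# training-time vs deployment-time samples of each feature (all made up)
train = {"age": rng.normal(60, 10, 1000), "creatinine": rng.normal(1.0, 0.3, 1000)}
deploy = {"age": rng.normal(66, 10, 1000), "creatinine": rng.normal(1.0, 0.3, 1000)}

for feat in train:
    stat, p = ks_2samp(train[feat], deploy[feat])
    flag = "<- possible shift" if p < 0.01 else ""
    print(f"{feat}: KS={stat:.3f}, p={p:.3g} {flag}")
```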
high-level
- focus on building something you want to deploy
- clinically useful - more efficient, cutting costs?
- effective - does it improve the current baseline
- focused on patient care - what are the unintended consequences
- need to think a lot about regulation
- USA: FDA
- Europe: CE (more convoluted)
- intended use
- very specific and well-defined
criticisms
- Dissecting racial bias in an algorithm used to manage the health of populations (obermeyer et al. 2019)
medical system
evaluation
- doctors are evaluated infrequently (and things like personal traits are often included)
- US has pretty good care but it is expensive per patient
- expensive things (e.g. Da Vinci robot)
- even if ml is not perfect, it may still outperform some doctors
medical education
- rarely textbooks (often just slides)
- 1-2% miss rate for diagnosis can be seen as acceptable
- how doctors think
- 2 years: memorizing facts about physiology, pharmacology, and pathology
- 2 years learning practical applications for this knowledge, such as how to decipher an EKG and how to determine the appropriate dose of insulin for a diabetic
- little emphasis on the mental logic of making a correct diagnosis and avoiding mistakes
- see work by pat croskerry
- there is limited data on misdiagnosis rates
- representativeness error - thinking is overly influenced by what is typically true
- availability error - tendency to judge the likelihood of an event by the ease with which relevant examples come to mind
- e.g. common infections tend to occur in epidemics, afflicting large numbers of people in a single community at the same time - having just seen a run of one infection biases the next diagnosis toward it
- confirmation bias
- affective error - decisions based on what we wish were true (e.g. caring too much about patient)
- See one, do one, teach one - teaching axiom
political elements
- why doctors should organize
- big pharma
- day-to-day
- Doctors now face a burnout epidemic: thirty-five per cent of them show signs of high depersonalization
- according to one recent report, only thirteen per cent of a physician’s day, on average, is spent on doctor-patient interaction
- one study found that during an average eleven-hour workday, six hours are spent at the keyboard maintaining electronic health records
- medicare’s r.v.u. (relative value unit) system changes how doctors are reimbursed, emphasizing procedural over cognitive work
- ai could help - make simple diagnoses faster, reduce paperwork, help patients manage their own diseases like diabetes
- ai could also make things worse - hospitals are mostly run by business people
medical communication
“how do doctors think?”
- easy to misinterpret things to be causal
- often no intuition for even relatively simple engineered features, such as averages
- doctors require context for features (e.g. this feature is larger than the average)
- often have some rules memorized (otherwise memorize what needs to be looked up)
- unclear how well doctors follow rules
- some rules are 1-way (e.g. only follow it if it says there is danger, otherwise use your best judgement)
- 2-way rules are better
- without proper education 1-way rules can be dangerously used as 2-way rules
- doesn’t make sense to judge 1-way rules on both specificity and sensitivity (see the sketch after this list)
- rules are often ambiguous (e.g. what constitutes vomiting)
- doctors adapt to personal experience - may be unfair to evaluate them on larger dataset
- sometimes said that doctors know 10 medications by heart
- Overconfidence in Clinical Decision Making (croskerry 2008)
- most uncertainty: family medicine [FM] and emergency medicine [EM]
- some uncertainty: internal medicine
- little uncertainty: specialty disciplines
- 2 systems at work: intuitive (uses context, heuristics) vs analytic (systematic, rule-based)
- a combination of both performs best
- doctors are often black boxes as well - validated infrequently, unclear how closely they follow rules
- doctors adapt to local conditions - should be evaluated only on local dataset
- potential liabilities for physicians using ai (price et al. 2019)
- What’s the trouble? How doctors think (groopman, New Yorker, 2007)
- JAMA Users’ Guide to the Medical Literature
- TRIPOD statement (22-item checklist for reporting prediction models)
- basic stats are covered on the step 1 exam
- How to Read Articles That Use Machine Learning: Users’ Guides to the Medical Literature (liu et al. 2019)
- Carmelli et al. 2018 - primer for clinical decision rules (CDRs) but also a good example of the sort of article I have envisioned creating
- Looking through the retrospectoscope: reducing bias in emergency medicine chart review studies. (kaji et al. 2018)
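
a minimal sketch of the 1-way rule logic above, to make the asymmetry concrete (the rule and cutoff are invented for illustration, not a real clinical rule):

```python
# a 1-way ("rule-in only") decision rule: firing means escalate; silence
# means fall back to clinician judgment, NOT that danger is ruled out.

def one_way_rule(d_dimer: float) -> bool:
    """Fires only to flag danger; not firing carries no information."""
    return d_dimer > 500  # illustrative cutoff

def triage(d_dimer: float, clinician_judgment: str) -> str:
    if one_way_rule(d_dimer):
        return "escalate"        # the only case the rule is designed to decide
    return clinician_judgment    # silence is not evidence of safety

print(triage(750, "discharge"))  # escalate
print(triage(200, "observe"))    # observe (clinician's call)
```

judging such a rule on specificity punishes it for cases it never decides; used as a 2-way rule ("didn’t fire, so the patient is safe"), the same rule becomes dangerous.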
communicating findings
- don’t use ROC curves, use deciles (see the sketch after this list)
- need to evaluate use, not just metric
- medicine’s “model” corresponds to ml’s “fitted model”
- retrospective (more confounding, looks back) vs prospective study
- internal/external validity = train/test error (though external validation typically uses a different patient population, so it is a stronger test)
- sensitivity = recall; precision = PPV (positive predictive value), not specificity
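
a sketch of both points above on synthetic data: report observed event rates by decile of predicted risk, and note how the two vocabularies line up on one confusion matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
p = rng.uniform(size=10_000)          # synthetic predicted risks
y = rng.uniform(size=10_000) < p      # outcomes drawn so the model is calibrated

# observed event rate within each decile of predicted risk
bins = np.quantile(p, np.linspace(0, 1, 11))
idx = np.digitize(p, bins[1:-1])      # decile index 0..9
for d in range(10):
    m = idx == d
    print(f"decile {d + 1}: mean predicted {p[m].mean():.2f}, observed {y[m].mean():.2f}")

# threshold the score and compute both vocabularies
yhat = p > 0.5
tp, fp = np.sum(yhat & y), np.sum(yhat & ~y)
fn, tn = np.sum(~yhat & y), np.sum(~yhat & ~y)
print("sensitivity (= recall):", tp / (tp + fn))
print("specificity:           ", tn / (tn + fp))
print("ppv (= precision):     ", tp / (tp + fp))
```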
examples
successful examples of ai in medicine
- ECG (NEJM, 1991)
- EKG printouts include a small automated interpretation
- there used to be bayesian networks / expert systems but they went away…
icu interpretability example
- goal: explain the model not the patient (that is the doctor’s job)
- want to know interactions between features
- some features are difficult to understand
- e.g. the max over a time window may look alarmingly high to a doctor unless they realize it is a windowed max
- some features don’t really make sense to change (e.g. was this thing measured)
- doctors like to see trends - patient health changes over time and must include history
- feature importance under intervention (see the sketch below)
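
one simple way to operationalize “importance under intervention” is permutation importance: intervene on a feature by shuffling it on held-out data and measure the drop in performance. a sketch with synthetic icu-style features (names and data are stand-ins):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))        # stand-ins: hr_max_6h, lactate, age
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=2000) > 0

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# shuffle each feature on held-out data and measure the score drop
res = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
for name, imp in zip(["hr_max_6h", "lactate", "age"], res.importances_mean):
    print(f"{name}: {imp:.3f}")
```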
high-performance ai studies
- chest-xray: chexnet
- echocardiograms: madani, ali, et al. 2018
- skin: esteva, andre, et al. 2017
- pathology: campanella, gabriele, et al. 2019
- mammogram: kerlikowske, karla, et al. 2018
medical imaging
- Medical Imaging and Machine Learning
- medical images often have multiple channels / are 3d - closer to video than images
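
a minimal sketch of what “closer to video” means in practice: a 3d convolution slides over depth as well as height and width (the use of torch and the shapes are illustrative assumptions):

```python
import torch
import torch.nn as nn

# a ct volume as (batch, channels, depth, height, width)
volume = torch.randn(1, 1, 64, 128, 128)

# a 3d kernel convolves across depth too, unlike a 2d conv on a single slice
conv = nn.Conv3d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
print(conv(volume).shape)  # torch.Size([1, 8, 64, 128, 128])
```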
improving medical studies
- Machine learning methods for developing precision treatment rules with observational data (Kessler et al. 2019)
- goal: find precision treatment rules
- problem: need large sample sizes but can’t obtain them in RCTs
- recommendations
- screen important predictors using large observational medical records rather than RCTs
- important to do matching / weighting to account for bias in treatment assignments (see the ipw sketch at the end of this section)
- alternatively, can look for a natural experiment / instrumental variable / discontinuity analysis - these approaches have many benefits
- modeling: should use ensemble methods rather than individual models
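
a minimal sketch of the weighting step recommended above: fit a propensity model for treatment assignment, compute inverse-propensity weights, then fit a weighted outcome model with an ensemble method (variables are synthetic; this is not the paper’s exact pipeline):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 5_000
severity = rng.normal(size=n)                                  # confounder
treated = rng.uniform(size=n) < 1 / (1 + np.exp(-severity))    # sicker patients treated more often
outcome = 0.8 * severity - 0.5 * treated + rng.normal(size=n) > 0

# 1. propensity scores: p(treated | confounders)
X_conf = severity.reshape(-1, 1)
ps = LogisticRegression().fit(X_conf, treated).predict_proba(X_conf)[:, 1]

# 2. inverse-propensity weights rebalance treated vs untreated groups
w = np.where(treated, 1 / ps, 1 / (1 - ps))

# 3. weighted outcome model (an ensemble method, per the recommendation above)
X = np.column_stack([severity, treated])
model = GradientBoostingClassifier(random_state=0).fit(X, outcome, sample_weight=w)
```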