ml in medicine
some rough notes on ml in medicine
general
- 3 types
    
- disease and patient categorization (e.g. classification)
 - fundamental biological study
 - treatment of patients
 
 - philosophy
    
- want to focus on problems doctors can’t do
 - alternatively, focus on automating problems parents can do to screen people at home in cost-effective way
 
 - pathology - branch of medicine where you take some tissue from a patient (e.g. tumor), look at it under a microscope, and make an assessment of what the disease is
 - websites are often easier than apps for patients
 - The clinical artificial intelligence department: a prerequisite for success (cosgriff et al. 2020) - we need designated departments for clinical ai so we don’t have to rely on 3rd-party vendors and can test for things like distribution shift
 - challenges in ai healthcare (news)
    
- adversarial examples
 - things can’t be de-identified
 - algorithms / data can be biased
 - correlation / causation get confused
 
 - healthcare is 20% of US GDP
 - prognosis is a guess as to the outcome of treatment
 - diagnosis is actually identifying the problem and giving it a name, such as depression or obsessive-compulsive disorder
 - AI is a technology, but it’s not a product
 - health economics incentives align with health incentives: catching tumor early is cheaper for hospitals
 
high-level
- focus on building something you want to deploy
    
- clinically useful - more efficient, cutting costs?
 - effective - does it improve the current baseline
 - focused on patient care - what are the unintended consequences
 
 - need to think a lot about regulation
    
- USA: FDA
 - Europe: CE marking (more convoluted)
 
 - intended use
    
- very specific and well-defined
 
 
criticisms
- Dissecting racial bias in an algorithm used to manage the health of populations (obermeyer et al. 2019)
 
medical system
evaluation
- doctors are evaluated infrequently (and things like personal traits are often included)
 - US has pretty good care but it is expensive per patient
 - expensive things (e.g. Da Vinci robot)
 - even if ml is not perfect, it may still outperform some doctors
 
medical education
- rarely textbooks (often just slides)
 - 1-2% miss rate for diagnosis can be seen as acceptable
 - how doctors think
    
- 2 years: memorizing facts about physiology, pharmacology, and pathology
 - 2 years learning practical applications for this knowledge, such as how to decipher an EKG and how to determine the appropriate dose of insulin for a diabetic
 - little emphasis on mental logic for making a correct diagnosis and avoiding mistakes
 - see work by pat croskerry
 - there is limited data on misdiagnosis rates
 - representativeness error - thinking is overly influenced by what is typically true
 - availability error - tendency to judge the likelihood of an event by the ease with which relevant examples come to mind
        
- common infections tend to occur in epidemics, afflicting large numbers of people in a single community at the same time
 - confirmation bias
 
 - affective error - decisions based on what we wish were true (e.g. caring too much about patient)
 - See one, do one, teach one - teaching axiom
 
 
political elements
- why doctors should organize
 - big pharma
 - day-to-day
    
- Doctors now face a burnout epidemic: thirty-five per cent of them show signs of high depersonalization
 - according to one recent report, only thirteen per cent of a physician’s day, on average, is spent on doctor-patient interaction
 - one study: during an average eleven-hour workday, six hours are spent at the keyboard, maintaining electronic health records
 - medicare’s r.v.u. (relative value unit) system - changes how doctors are reimbursed, emphasizing procedural over cognitive work
 - ai could help - make simple diagnoses faster, reduce paperwork, help patients manage their own diseases like diabetes
 - ai could also make things worse - hospitals are mostly run by business people
 
 
medical communication
“how do doctors think?”
- easy to misinterpret things to be causal
 - often no intuition for even relatively simple engineered features, such as averages
 - doctors require context for features (e.g. this feature is larger than the average)
 - often have some rules memorized (otherwise memorize what needs to be looked up)
    
- unclear how well doctors follow rules
 - some rules are 1-way (e.g. only follow it if it says there is danger, otherwise use your best judgement)
        
- 2-way rules are better
 - without proper education 1-way rules can be dangerously used as 2-way rules
 - doesn’t make sense to judge 1-way rules on both specificity and sensitivity
 
 
 - rules are often ambiguous (e.g. what constitutes vomiting)
 - doctors adapt to personal experience - may be unfair to evaluate them on larger dataset
 - sometimes said that doctors know 10 medications by heart
 - Overconfidence in Clinical Decision Making (croskerry 2008)
    
- most uncertainty: family medicine [FM] and emergency medicine [EM]
 - some uncertainty: internal medicine
 - little uncertainty: specialty disciplines
 - 2 systems at work: intuitive (uses context, heuristics) vs analytic (systematic, rule-based)
        
- a combination of both performs best
 
 - doctors are often black boxes as well - validated infrequently, unclear how closely they follow rules
 - doctors adapt to local conditions - should be evaluated only on local dataset
 
 - potential liabilities for physicians using ai (price et al. 2019)
 - What’s the trouble? How doctors think (New Yorker, 2007)
 - JAMA Users’ Guide to the Medical Literature
 - TRIPOD 22 points paper
 - basic stats in the step1 exam
 - How to Read Articles That Use Machine Learning: Users’ Guides to the Medical Literature (liu et al. 2019)
 - Carmelli et al. 2018 - primer for CDRs but also a good example of what sort of article I have envisioned creating.
 - Looking through the retrospectoscope: reducing bias in emergency medicine chart review studies. (kaji et al. 2018)
 
communicating findings
- don’t use ROC curves, use deciles
 - need to evaluate use, not just metric
 - model -> fitted model
 - retrospective (more confounding, looks back) vs prospective study
 - internal/external validity ≈ train/test error (although external validation usually uses a different patient population, so it is a stronger test)
 - sensitivity = recall; specificity = true negative rate, which is not the same as precision (precision = positive predictive value)
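
A rough sketch of the reporting ideas above, using only the Python standard library. It bins patients into risk deciles (mean predicted risk vs. observed event rate per bin) and computes sensitivity, specificity, and precision at one threshold. Sensitivity coincides with recall, while specificity and precision are different quantities. The function names (`decile_table`, `threshold_metrics`), the 0.5 threshold, and the synthetic data are illustrative assumptions, not taken from any of the cited papers.

```python
import random


def decile_table(y_true, y_prob, n_bins=10):
    """Sort patients by predicted risk and summarize each risk bin."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # fewer patients than bins
        rows.append({
            "decile": b + 1,
            "n": len(chunk),
            "mean_predicted": sum(p for p, _ in chunk) / len(chunk),
            "observed_rate": sum(y for _, y in chunk) / len(chunk),
        })
    return rows


def _ratio(num, den):
    """Safe division: undefined metrics come back as NaN."""
    return num / den if den else float("nan")


def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Confusion-matrix metrics for the rule 'flag if risk >= threshold'."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        flagged = p >= threshold
        if flagged and y:
            tp += 1
        elif flagged and not y:
            fp += 1
        elif not flagged and not y:
            tn += 1
        else:
            fn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),  # same as recall
        "specificity": _ratio(tn, tn + fp),  # true negative rate
        "precision": _ratio(tp, tp + fp),    # positive predictive value
    }


# Synthetic, perfectly calibrated risks (made-up data, for illustration only).
rng = random.Random(0)
probs = [rng.random() for _ in range(1000)]
outcomes = [1 if rng.random() < p else 0 for p in probs]

table = decile_table(outcomes, probs)
metrics = threshold_metrics(outcomes, probs, threshold=0.5)

if __name__ == "__main__":
    for row in table:
        print(row)
    print(metrics)
```

A decile table like this shows a clinician how many patients land in each risk band and how often the event actually happened there, which is usually easier to act on than a single area-under-the-curve number.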
 
examples
successful examples of ai in medicine
- ECG (NEJM, 1991)
 - EKG has a small interpretation on it
 - there used to be bayesian networks / expert systems but they went away…
 
icu interpretability example
- goal: explain the model not the patient (that is the doctor’s job)
 - want to know interactions between features
 - some features are difficult to understand
    
- e.g. max over this window, might seem high to a doctor unless they think about it
 
 - some features don’t really make sense to change (e.g. was this thing measured)
 - doctors like to see trends - patient health changes over time and must include history
 - feature importance under intervention
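
To make the windowed-feature point concrete, here is a small sketch, assuming hypothetical heart-rate and lactate readings. It computes a trailing "max over this window" feature and a "was this measured" indicator. The names `rolling_max`, `heart_rate`, and `lactate` and the window of 3 readings are illustrative assumptions only.

```python
from collections import deque


def rolling_max(values, window):
    """Max over the trailing `window` readings at each time step."""
    recent = deque(maxlen=window)  # old readings fall off automatically
    out = []
    for v in values:
        recent.append(v)
        out.append(max(recent))
    return out


# Hypothetical hourly heart-rate readings with one transient spike.
heart_rate = [82, 85, 120, 88, 84, 83, 81]
hr_max_3 = rolling_max(heart_rate, window=3)

# Hypothetical lab values; None means the test was not ordered that hour.
lactate = [None, 2.1, None, 1.8]
lactate_measured = [int(v is not None) for v in lactate]

if __name__ == "__main__":
    print(hr_max_3)          # [82, 85, 120, 120, 120, 88, 84]
    print(lactate_measured)  # [0, 1, 0, 1]
```

The single spike of 120 keeps the windowed feature at 120 for three consecutive steps, which is why such a value can look surprisingly high to a doctor reading it as a current measurement. The measurement indicator is a feature that describes clinician behavior rather than patient state, so "changing" it in an explanation has no clear clinical meaning.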
 
high-performance ai studies
- chest-xray: chexnet
 - echocardiograms: madani, ali, et al. 2018
 - skin: esteva, andre, et al. 2017
 - pathology: campanella, gabriele, et al. 2019
 - mammogram: kerlikowske, karla, et al. 2018
 
medical imaging
- Medical Imaging and Machine Learning
    
- medical images often have multiple channels / are 3d - closer to video than images
 
 
improving medical studies
- Machine learning methods for developing precision treatment rules with observational data (Kessler et al. 2019)
    
- goal: find precision treatment rules
 - problem: need large sample sizes but can’t obtain them in RCTs
 - recommendations
        
- screen important predictors using large observational medical records rather than RCTs
            
- important to do matching / weighting to account for bias in treatment assignments
 - alternatively, can look for natural experiment / instrumental variable / discontinuity analysis
 - has many benefits
 
 - modeling: should use ensemble methods rather than individual models
 
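
A minimal sketch of why weighting matters when treatment assignment is biased, using inverse probability weighting on a made-up cohort. This is one common weighting approach and is not claimed to be the specific procedure of Kessler et al. Assumptions: a single binary confounder (`severe`), propensities estimated as the treated fraction within each stratum (real analyses would typically fit a propensity model), and entirely synthetic counts in which treatment truly raises recovery by 0.1 in both strata.

```python
from collections import defaultdict


def build_cohort():
    """Synthetic cohort of (severe, treated, recovered) rows; numbers are made up."""
    cells = [
        # severe, treated, n_patients, n_recovered
        (1, 1, 80, 32),  # sicker patients are treated more often
        (1, 0, 20, 6),
        (0, 1, 20, 16),
        (0, 0, 80, 56),
    ]
    rows = []
    for severe, treated, n, n_rec in cells:
        rows += [(severe, treated, 1)] * n_rec
        rows += [(severe, treated, 0)] * (n - n_rec)
    return rows


def naive_effect(rows):
    """Unadjusted difference in recovery rate, treated minus untreated."""
    treated = [y for _, a, y in rows if a == 1]
    control = [y for _, a, y in rows if a == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw_effect(rows):
    """Inverse-probability-weighted difference in recovery rate."""
    n_by = defaultdict(int)
    treated_by = defaultdict(int)
    for s, a, _ in rows:
        n_by[s] += 1
        treated_by[s] += a
    # Propensity = P(treated | severe), estimated per stratum.
    prop = {s: treated_by[s] / n_by[s] for s in n_by}

    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for s, a, y in rows:
        w = 1 / prop[s] if a == 1 else 1 / (1 - prop[s])
        num[a] += w * y
        den[a] += w
    return num[1] / den[1] - num[0] / den[0]


cohort = build_cohort()
naive = naive_effect(cohort)
adjusted = ipw_effect(cohort)

if __name__ == "__main__":
    print(f"naive difference: {naive:+.2f}")
    print(f"IPW difference:   {adjusted:+.2f}")
```

In this toy cohort the naive comparison makes the treatment look harmful (about -0.14) because sicker patients are treated more often, while the weighted estimate recovers the built-in +0.10 effect. The approach only corrects for confounders that are actually measured and included in the propensity estimate.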