Thrombolytics for stroke: The evidence

A summary of the evidence for (or against) thrombolytics for stroke

Thrombolytics for stroke: undoubtedly the biggest controversy in emergency medicine. Also, the topic of this week’s Emergency Medicine Cases Journal Jam podcast. Rory Spiegel, Anton Helman, and I take a deep dive into the evidence. Why would we do this? No, it isn’t just that we have too much time on our hands. The Journal Jam podcast exists because we truly believe it is important to understand why we do what we do, both to ensure we are always providing the best care for our patients and so that we can explain that care to them. The evidence for (or against) thrombolytics is important precisely because the topic is so controversial. You will hear arguments on both sides. So will your patients. It is only through a familiarity with the studies, their strengths, and their weaknesses that you will be able to decide for yourself what the evidence really shows and guide your patients to the best decision for their circumstances.

What follows are the notes I made while preparing for the podcast. First, I review the major randomized controlled trials looking at thrombolytics for stroke. That is followed by a discussion of the things I think are important to consider when trying to interpret this data. (Many folks might want to skip straight to this discussion section.)

The Major RCTs

These are the major RCTs in chronological order.


MAST-Italy

Multicentre Acute Stroke Trial–Italy (MAST-I) Group. Randomised controlled trial of streptokinase, aspirin, and combination of both in treatment of acute ischaemic stroke. Lancet (London, England). 1995; 346(8989):1509-14. PMID: 7491044

This is a randomized, multicenter, open-label, controlled trial

  • Patients: 622 adult stroke patients within 6 hours of symptom onset. (A total of 14,083 were screened to identify those 622).
  • Intervention: Streptokinase 1.5 million units over 1 hour. (There was also an aspirin versus no aspirin component to the trial.)
  • Comparison: No treatment
  • Primary outcome: Death and disability (modified Rankin scale 3-6) at 6 months.

Results

  • The trial was stopped early because of an indication of early harm. The original sample size was supposed to be 1500.
  • There was no difference in death or disability at 6 months (63% with streptokinase, 65% in controls, OR 0.9; 95% CI 0.7-1.3).
  • Mortality at 6 months was increased (36% vs 24%, OR 1.7; 95%CI 1.2-2.5).
  • Disability (mRs 3-5) at 6 months was decreased (27% vs 40%, OR 0.5; 95%CI 0.4-0.8).
  • Other adverse events: 1 case of anaphylactic shock, 1 patient with severe non-intracranial hemorrhage requiring transfusions.
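The three efficacy results above fit together arithmetically: the composite outcome is simply death plus disability, so a rise in one and a fall in the other can cancel out. Here is a minimal Python sketch using only the rounded percentages reported above. The variable and function names are illustrative and do not come from the trial.

```python
# Rounded percentages reported above for MAST-I at 6 months.
ARMS = {
    "streptokinase": {"death": 36, "disability": 27},
    "control": {"death": 24, "disability": 40},
}


def composite(arm):
    """Death or disability (mRS 3-6) as the sum of its two exclusive parts."""
    return arm["death"] + arm["disability"]


for name, arm in ARMS.items():
    print(f"{name}: death {arm['death']}% + disability {arm['disability']}% "
          f"= composite {composite(arm)}%")

# How far each component moved with treatment, in percentage points.
shift_in_death = ARMS["streptokinase"]["death"] - ARMS["control"]["death"]
shift_in_disability = (
    ARMS["streptokinase"]["disability"] - ARMS["control"]["disability"]
)
print(f"death: {shift_in_death:+d} points, disability: {shift_in_disability:+d} points")
```

The control components sum to 64% rather than the reported 65%, which is presumably rounding in the published figures. The point is that the fall in disability is almost exactly matched by the rise in deaths.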

[Figure: MAST-I outcomes]

Comments

  • This is an open label trial. This is incredibly important, as the modified Rankin score is subjective, and not perfectly reproducible.
  • This trial seems to be the basis for the idea that thrombolytics will either kill you or help you, but as we will see, I don’t think that common assertion is well supported by the remainder of the evidence.

ECASS

Hacke W, Kaste M, Fieschi C. Intravenous thrombolysis with recombinant tissue plasminogen activator for acute hemispheric stroke. The European Cooperative Acute Stroke Study (ECASS). JAMA. 1995; 274(13):1017-25. PMID: 7563451

This is a randomized, multicenter, double-blind, placebo controlled trial

  • Patients: 620 adult patients (aged 18-80) with moderate to severe hemispheric strokes and no major early changes on CT within 6 hours from symptom onset.
  • Intervention: rt-PA 1.1mg/kg
  • Comparison: placebo
  • Primary outcome: Dual primary outcomes: A difference of 15 points on the Barthel index at 90 days and a difference of 1 point on the modified Rankin score at 90 days between the two groups (not improvement like we see elsewhere).

Results

  • There was no difference in the Barthel index at 90 days (75 vs 85, p=0.99)
  • There also was no difference in the modified Rankin score at 90 days (3 vs 3, p=0.41)
  • There was a statistically significant increase in mortality with treatment (22.4% vs 15.8%, p=0.04)
  • In the per protocol analysis, there was no difference in the Barthel index, but there was improvement in the mRs with tPa.

Comments

  • Despite those primary outcome numbers, the authors state: “we conclude that thrombolytic therapy is effective in improving some functional and neurologic measures in the subgroup of stroke patients with moderate to severe neurologic deficit and without extended infarct signs on the initial CT scan.” This frustrating pattern of conclusions that don’t match results recurs throughout this literature.
  • There were protocol violations in 17.4% of the patients, so the authors focus more on the per-protocol (or target population) analysis. However, there is no reason to think these same protocol violations wouldn’t occur in the real world, so I think it is more appropriate to focus on the intention to treat analysis.
  • They do something statistically that I have never seen before, which is adjust what they consider a statistically significant result after the data was collected by looking at the correlation between the Barthel and modified Rankin scores. It isn’t clear to me how or when this is appropriate.

NINDS 1

NINDS study group. Tissue plasminogen activator for acute ischemic stroke. The New England journal of medicine. 1995; 333(24):1581-7. PMID: 7477192 [free full text]

A randomized, double-blind, placebo controlled trial

  • Patients: 291 adult patients with a stroke measurable on the NIHSS with a clearly definable onset time within 3 hours.
  • Intervention: t-Pa (0.9 mg/kg; 10% given as a bolus and 90% over 60 minutes)
  • Comparison: Placebo
  • Primary outcome: 2 primary outcomes: complete resolution of symptoms, or an improvement of greater than 4 points on the NIHSS at 24 hours.

Results

  • There was no significant difference between the two groups at 24 hours. (47% improved with tPa vs 39% with placebo, RR 1.2 (95% CI 0.9-1.6), p=0.21).
  • As a secondary outcome of part 1, they looked at the 3 month outcomes (the same outcomes as in part 2). The outcomes are the same as they saw with NINDS 2, which is discussed below.

[Figure: NINDS 1 results]

Comments

  • It is probably important to note how strange it is that, despite being a multi-center randomized trial, NINDS 1 didn’t get its own publication. This is a big, expensive trial, and every other study on this list made it into one of the elite journals, even if negative. Presumably, this was done to downplay the fact that it was a negative trial (which seems to have worked, because I never hear people talking about the negative NINDS trial).
  • It is not clear from the manuscript when they decided that NINDS should be 2 trials, and without a pre-published protocol, some people have argued that they shifted the goalposts (changed primary outcomes) after data had already been collected.

NINDS 2

NINDS study group. Tissue plasminogen activator for acute ischemic stroke. The New England journal of medicine. 1995; 333(24):1581-7. PMID: 7477192 [free full text]

  • Patients: 333 adult patients with a stroke measurable on the NIHSS with a clearly definable onset time within 3 hours. For brevity, I am not going to talk about the exclusion criteria for the other studies. However, if you are currently using tPa, it is almost certainly because of this one study, so its use should probably closely follow this study’s protocol. The exclusion criteria were:
    • Stroke or serious head trauma within previous 3 months
    • Major surgery within 14 days
    • History of ICH or symptoms suggestive of SAH
    • SBP >185mmHg or DBP >110mmHg; or aggressive treatment required to reduce BP to specified limits
    • Rapidly improving or minor symptoms
    • GI or GU bleeding within previous 21 days
    • Arterial puncture at noncompressible site within previous 7 days
    • Seizure at onset of stroke
    • Anticoagulants or antithrombotics within 48 hours preceding onset of stroke
    • Elevated PTT/PT or platelets <100k
    • Glucose <50 mg/dl or >400 mg/dl
  • Intervention: t-Pa (0.9 mg/kg; 10% given as a bolus and 90% over 60 minutes)
  • Comparison: Placebo
  • Primary outcome: There were 4 primary outcome measures. Proportion of patients with minimal or no deficit at 3 months, as measured by either Barthel index, modified Rankin scale, Glasgow outcome scale, or NIHSS.

Results

  • There were 4 primary outcomes so it is hard to know how to present the data. The modified Rankin scale is talked about the most, which is funny because it is the one that shows the biggest outcome difference. In terms of favourable outcome, the reported results are:

Outcome scale            tPa    Placebo   P value
Barthel index            50%    38%       0.026
Modified Rankin scale    39%    26%       0.019
Glasgow outcome scale    44%    32%       0.025
NIHSS                    31%    20%       0.033

  • There wasn’t a significant difference in mortality between the groups (17% with t-PA and 21% with placebo, p=0.30).
  • Symptomatic hemorrhage increased from 1% to 7%
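One simple way to see what multiple primary outcomes do to the statistics is to apply a standard multiple-comparison correction to the four reported P values. The sketch below uses Bonferroni and Holm corrections at an overall alpha of 0.05. This is an illustration under stated assumptions, not a reconstruction of the trial's own analysis, and both methods are conservative when outcomes are strongly correlated, as these four scales are.

```python
# P values reported above for the four NINDS 2 primary outcomes.
P_VALUES = {
    "Barthel index": 0.026,
    "Modified Rankin scale": 0.019,
    "Glasgow outcome scale": 0.025,
    "NIHSS": 0.033,
}
ALPHA = 0.05  # conventional overall significance level (an assumption here)


def bonferroni(p_values, alpha=ALPHA):
    """Each P value must fall at or below alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return {name: p <= threshold for name, p in p_values.items()}


def holm(p_values, alpha=ALPHA):
    """Step-down Holm procedure: test smallest P first, stop at first failure."""
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    total = len(ordered)
    results = {}
    still_rejecting = True
    for rank, (name, p) in enumerate(ordered):
        if still_rejecting and p <= alpha / (total - rank):
            results[name] = True
        else:
            still_rejecting = False
            results[name] = False
    return results


print("Bonferroni threshold:", ALPHA / len(P_VALUES))
print("Bonferroni:", bonferroni(P_VALUES))
print("Holm:", holm(P_VALUES))
```

With these inputs the Bonferroni threshold is 0.0125, and neither method leaves any of the four outcomes significant. How much weight to put on that depends on how correlated you think the four scales are.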

[Figure: NINDS 2 main results]

[Figure: NINDS 2 results by time]

Comments

  • It isn’t good form to have 4 primary outcomes. It changes the math, and you really should adjust your statistics to account for multiple comparisons, but at least they were all consistent in this study.
  • In the data presented in the manuscript, the two groups look like they were pretty evenly matched when the trial started. However, there may be some important differences between the two groups that are hidden in the presentation of means and medians. When looking at the actual distribution of strokes, the placebo group had significantly more severe strokes (NIHSS>20) and fewer mild strokes (NIHSS 0-5). (See Mann 2002 for a description.) Baseline stroke severity (or NIHSS) is the single biggest determinant of ultimate outcome in stroke, so any difference in baseline stroke severity could significantly bias the results of a small trial with relatively narrow margins of benefit to harm.
  • In NINDS part 1 they report the outcome in terms of degree of improvement (a measure that takes into account how sick you were when you started). NINDS part 2 only reports its data in terms of where patients end up. It’s not clear why they would use a different approach between the two trials, and they don’t present the data on degree of improvement in NINDS 2 at all.
  • They break the outcomes up into a 0-90 minute group and a 90-180 minute group. There are no differences at all between those two groups, which doesn’t fit well with the time is brain hypothesis. There is a constant pressure on us to give t-Pa as quickly as possible, but it is important not to rush. Stroke is not as easy a diagnosis to make as STEMI, and we really need to make sure we aren’t giving t-Pa to patients who don’t have strokes.
  • On the other hand, if you believe the “time is brain” mantra, the NINDS group is not representative of the patients you will be treating. The patients were not recruited consecutively. There had to be 1 patient in the less than 90 minute group for every patient in the 90-180 minute group. In real life, the less than 90 minute group is essentially non-existent, meaning the results we see in NINDS will be much better than we see in real life (if time matters).
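To make the baseline severity concern concrete, here is a small Python sketch showing how an imbalance in stroke severity alone can produce an apparent benefit. Every number in it is hypothetical and chosen only for illustration. None of them come from NINDS, and the drug is assumed to have no effect at all.

```python
# HYPOTHETICAL numbers for illustration only; none of these come from NINDS.
# Chance of a good outcome by baseline severity, identical in both arms
# (that is, the treatment is assumed to do nothing).
P_GOOD_OUTCOME = {"mild": 0.60, "moderate": 0.30, "severe": 0.05}

# Hypothetical share of each arm falling in each severity stratum.
TREATMENT_MIX = {"mild": 0.30, "moderate": 0.50, "severe": 0.20}
PLACEBO_MIX = {"mild": 0.20, "moderate": 0.50, "severe": 0.30}


def expected_good_outcome(mix, p_good=P_GOOD_OUTCOME):
    """Overall good-outcome rate as a severity-weighted average."""
    return sum(share * p_good[stratum] for stratum, share in mix.items())


treatment_rate = expected_good_outcome(TREATMENT_MIX)
placebo_rate = expected_good_outcome(PLACEBO_MIX)
apparent_benefit = treatment_rate - placebo_rate

print(f"treatment arm: {treatment_rate:.1%}")
print(f"placebo arm:   {placebo_rate:.1%}")
print(f"apparent benefit with zero true effect: {apparent_benefit:.1%}")
```

In this made-up scenario a 10 point shift between the mild and severe strata yields a 5.5 point difference in good outcomes with no treatment effect. Whether the real imbalance in NINDS was large enough to matter is exactly what the argument is about.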



MAST-Europe

Hommel M, Cornu C, Boutitie F, Boissel JP. Thrombolytic therapy with streptokinase in acute ischemic stroke. The New England journal of medicine. 1996; 335(3):145-50. PMID: 8657211 [free full text]

A multicenter, randomized, double-blind, placebo controlled trial

  • Patients: 310 patients with moderate to severe stroke in the territory of the middle cerebral artery presenting within 6 hours.
  • Intervention: Streptokinase 1.5 million units over 1 hour
  • Comparison: Placebo
  • Primary outcome: Death and disability using a Rankin score greater than or equal to 3 at 6 months.

Results

  • The trial was stopped early due to increased mortality. The sample size calculation called for 600 patients.
  • There was no difference in the primary outcome of death or disability (81.8% vs 79.5%, p=0.60)
  • There was a significant increase in mortality at 10 days (34% vs 18%, p=0.002), but it was no longer statistically significant by 6 months (47% vs 38%, p=0.13)
  • There was a large, statistically significant increase in symptomatic intracerebral hemorrhage (21.2% vs 2.6%, p<0.001).

Comments

  • One of the problems with stopping trials early is potentially under-estimating harms. Technically, there was no statistical difference at 6 months, but a 9% absolute increase in mortality would certainly be important.
  • The authors do make an important point in their discussion: “Only the NINDS trial reported less severe disability without an increase in the mortality rate due to intracranial hemorrhages. The possibility cannot be ruled out that the results of the NINDS trial are due to chance; the results of a single trial do not provide sufficient evidence of the efficacy and safety of a drug, especially when similar trials have conflicting results.”
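To put the mortality numbers in more familiar terms, the sketch below converts the reported percentages into an absolute risk difference and a number needed to harm. It uses only the point estimates quoted above, and the function names are illustrative. The 6 month difference was not statistically significant, so the true value is quite uncertain.

```python
def absolute_risk_difference(treated_pct, control_pct):
    """Excess risk in the treated group, in percentage points."""
    return treated_pct - control_pct


def number_needed_to_harm(treated_pct, control_pct):
    """Patients treated per one additional bad outcome (point estimate only)."""
    difference = absolute_risk_difference(treated_pct, control_pct)
    if difference <= 0:
        raise ValueError("no excess risk in the treated group")
    return 100 / difference


# Mortality percentages reported above for MAST-Europe (streptokinase, placebo).
MORTALITY = {"10 days": (34, 18), "6 months": (47, 38)}

for label, (treated, control) in MORTALITY.items():
    difference = absolute_risk_difference(treated, control)
    nnh = number_needed_to_harm(treated, control)
    print(f"{label}: +{difference} points, NNH about {nnh:.1f}")
```

Taken at face value, a 9 point absolute increase corresponds to roughly one extra death for every 11 patients treated.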

[Figure: MAST-Europe outcomes]

[Figure: MAST-Europe survival]


ASK

Donnan GA, Davis SM, Chambers BR. Streptokinase for acute ischemic stroke with relationship to time of administration: Australian Streptokinase (ASK) Trial Study Group. JAMA. 1996; 276(12):961-6. PMID: 8805730

This is a randomized, double-blind, multicenter, placebo controlled trial.

  • Patients: 340 adult stroke patients (aged 18-80 years) within 4 hours of symptom onset.
  • Intervention: 1.5 million units of streptokinase over 1 hour
  • Comparison: Placebo
  • Primary outcome: Combined death and disability (Barthel index <60) at 3 months.

Results

  • The trial was stopped early due to increased mortality. The original sample size calculation called for 600 patients. The mortality rate was almost doubled with streptokinase (RR 1.77; 95% CI 1.11-2.82).
  • No difference in death or disability at 3 months (relative risk toward unfavourable outcomes with treatment of 1.08; 95% CI 0.74-1.58.)
  • Increased significant intracranial hemorrhage, defined here as a confluence of blood causing significant mass effect (13.2% with streptokinase vs 3% with placebo, p<0.01). There were also increases in hypotension (33% vs 6%) and anaphylactic shock (2 vs 0%) with treatment.
  • In terms of secondary outcomes, they do note that the outcomes look better in the group of patients treated within 3 hours, but the differences were not significant.
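A quick sanity check for any reported ratio and its confidence interval: these intervals are usually built on the log scale, so the point estimate should sit at the geometric midpoint of the two bounds. The sketch below assumes that kind of log-symmetric interval (an assumption, since the paper's exact method is not described here) and computes the upper bound implied by the estimate and lower bound.

```python
def implied_upper_bound(estimate, lower_bound):
    """Upper CI limit implied by log-scale symmetry: upper = estimate**2 / lower."""
    if estimate <= 0 or lower_bound <= 0:
        raise ValueError("ratios and their bounds must be positive")
    return estimate ** 2 / lower_bound


# Relative risks and lower bounds reported above for the ASK trial.
death_or_disability_upper = implied_upper_bound(1.08, 0.74)
mortality_upper = implied_upper_bound(1.77, 1.11)

print(f"death or disability: implied upper bound {death_or_disability_upper:.2f}")
print(f"mortality: implied upper bound {mortality_upper:.2f}")
```

The implied upper bounds come out at about 1.58 for death or disability and about 2.82 for mortality. Small discrepancies from published values are expected because the inputs are rounded.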

[Figure: ASK trial results]

[Figure: ASK trial mortality]


ECASS II

Hacke W, Kaste M, Fieschi C. Randomised double-blind placebo-controlled trial of thrombolytic therapy with intravenous alteplase in acute ischaemic stroke (ECASS II). Second European-Australasian Acute Stroke Study Investigators. Lancet (London, England). 1998; 352(9136):1245-51. PMID: 9788453

A multicenter randomized, double-blind, placebo controlled trial

  • Patients: 800 adult stroke patients (aged 18-80) with a clinical diagnosis of moderate to severe hemispheric ischemic stroke within 6 hours of symptoms onset. Patients were excluded if they had brain swelling in more than 33% of the MCA territory on CT.
  • Intervention: t-PA (0.9mg/kg)
  • Comparison: placebo
  • Primary outcome: Proportion of patients with a favourable outcome (a score of 0-1 on the modified Rankin scale). (This trial was started post-NINDS, and had a pre-specified 0-3 hour subgroup.)

Results

  • There was no difference in the primary outcome. 40% of patients had a favourable outcome with t-Pa as compared to 37% with placebo.
  • There was no change in mortality (11% in both groups).
  • There was an increase in major intracerebral hemorrhage (11.8% vs 3.1%).
  • In the 0-3 hour group, there was no benefit for the primary outcome (34% with t-Pa vs 29% with placebo, p=0.28).
  • In the 0-3 hour group, mortality was higher with t-PA (14% vs 8%).
  • One secondary outcome, percentage of patients with mRs 0-2 did favor tPa.

[Figure: ECASS II results]

Comments

  • Once again, the conclusions are so incongruent with the results that I feel compelled to point them out: “We conclude that alteplase at a dose of 0·9 mg/kg does not increase mortality or morbidity, despite a 2·5-fold increase in symptomatic intracranial haemorrhage. The safety data are consistent with those of the NINDS trial. These results support the view that alteplase should be part of the routine management of acute ischaemic stroke within 3 h of symptom onset, and probably beyond, in selected patients and in experienced centres.”
  • In this trial they specifically mention trying to mask the investigators to coagulation studies. This brings up a very important point that isn’t mentioned elsewhere: although most of the other trials are blinded, the subsequent care of these patients could easily result in a loss of blinding, whether it was bleeding from IV sites or the coagulation studies ordered. With relatively subjective outcome measures, like the modified Rankin score, unblinding becomes incredibly important.
  • If you look at the ordinal analysis, the only thing that seems to change between the groups is that some 2s become 3s. A modified Rankin score of 2 is “slight disability; unable to carry out all previous activities, but able to look after own affairs without assistance.” A score of 3 is “requiring some help, but able to walk without assistance.” The difference between “unable to carry out all previous activities” and “some help” is very subtle, so you could imagine that any unblinding could sway these numbers.


ATLANTIS B

Clark WM, Wissman S, Albers GW, Jhamandas JH, Madden KP, Hamilton S. Recombinant tissue-type plasminogen activator (Alteplase) for ischemic stroke 3 to 5 hours after symptom onset. The ATLANTIS Study: a randomized controlled trial. Alteplase Thrombolysis for Acute Noninterventional Therapy in Ischemic Stroke. JAMA. 1999; 282(21):2019-26. PMID: 10591384

A multicenter, placebo controlled, double-blind, randomized trial

  • Patients: 613 adult patients with acute ischemic stroke within 3-5 hours after symptom onset.
  • Intervention: rt-PA 0.9mg/kg
  • Comparison: placebo
  • Primary outcome: Excellent neurologic recovery (NIHSS≤1) at 90 days.

Results

  • There was no difference in the primary outcome (32% with placebo and 34% with t-PA, p=0.65)
  • Mortality was not statistically different, but was higher with t-PA (11.0% vs 6.9%, p=0.09)
  • There was an increase in symptomatic intracranial hemorrhage (6.7% vs 1.3%, p<0.001)

Comments

  • This trial was initially designed to look at the 0-6 hour time frame. It was stopped early on because there were safety concerns in the 5-6 hour time frame, and so the trial was turned into a 0-5 hour trial. (The patients enrolled up until this point were considered a separate trial, and are reported in “ATLANTIS A”.) It was then stopped again after NINDS came out, and they decided (apparently not understanding the need for replication in science) to stop enrolling patients in the 0-3 hour time frame. They also added more centers at that time. Finally, part way through they changed the enrollment criteria to include a CT criterion based on the ECASS 1 trial. The result is a bit of a mess, methodologically speaking.
  • More concerning than all the protocol changes along the way, they stopped the trial early despite the fact that the trial did not meet any of the pre-specified safety criteria for stopping. That is suspicious, especially in an industry sponsored study. (Genentech not only sponsored the trial, but was responsible for all the data management and analysis.) There is a clinically significant increase in mortality that might not have reached statistical significance specifically because they decided to under-power the trial.
  • I need to comment on the crazy biased discussions in these papers again. They state: “there was no significant difference in mortality between groups, although the trend toward improved 90-day mortality seen in the NINDS trial was not seen.” As a reminder, the mortality numbers in NINDS were 17% with t-PA and 21% with placebo, p=0.30. This trial had the same absolute difference (but now with worse outcomes for t-Pa) and a lower p value. How can you call the NINDS difference a trend and simply ignore the opposite trend here?!
  • Although they included patients in the 0-3 hours time frame, they don’t report those results anywhere in the manuscript.

ATLANTIS A

Clark WM, Albers GW, Madden KP, Hamilton S. The rtPA (alteplase) 0- to 6-hour acute stroke trial, part A (A0276g): results of a double-blind, placebo-controlled, multicenter study. Thrombolytic therapy in acute ischemic stroke study investigators. Stroke. 2000; 31(4):811-6. PMID: 10753980 [free full text]

A multicenter, placebo controlled, double-blind, randomized trial

  • Patients: 142 adult stroke patients presenting within 6 hours of symptom onset.
  • Intervention: rt-PA 0.9mg/kg
  • Comparison: placebo
  • Primary outcome: 3 primary efficacy outcomes: decrease of 4 or more on the NIHSS at both 24 hour and 30 day, as well as infarct size at 30 days.

Results

  • There was a statistically significant improvement at 24 hours (21% of the placebo group had an improvement of 4 or more on the NIHSS as compared to 40% with t-Pa, p=0.02)
  • However, by 30 days the results were the opposite, with 75% of the placebo group and only 60% of the t-PA group improving by 4 or more points (p=0.05).
  • Mortality was much higher with t-Pa (25% vs 7%, p<0.01)
  • Symptomatic intracranial hemorrhage was higher (11% vs 0%, P<0.01).
  • At 90 days, the median modified Rankin score was 2 in the placebo group and 5 in the t-Pa group – a much bigger difference than seen anywhere else (p=0.05).

DIAS 2

Hacke W, Furlan AJ, Al-Rawi Y. Intravenous desmoteplase in patients with acute ischaemic stroke selected by MRI perfusion-diffusion weighted imaging or perfusion CT (DIAS-2): a prospective, randomised, double-blind, placebo-controlled study. The Lancet. Neurology. 2009; 8(2):141-50. PMID: 19097942 [free full text]

A multicenter, placebo controlled, double-blind, randomized trial

  • Patients: 193 adult patients (aged 18-85) with hemispheric ischemic strokes (NIHSS 4-24) and a penumbra of potentially salvageable tissue measuring at least 20% on CT or MRI, presenting within 3-9 hours after symptom onset.
  • Intervention: Desmoteplase 90 mcg/kg or desmoteplase 125 mcg/kg
  • Comparison: placebo
  • Primary outcome: Clinical response at 90 days, defined by a composite of improvement of 8 or more points on the NIHSS or an NIHSS ≤ 1, a modified Rankin score 0-2, and a Barthel index of 75-100.

Results

  • 63% of patients received treatment between 6-9 hours.
  • There was no difference in the primary outcome between the 3 groups.
  • 90 day mortality was 6% with placebo, 5% with 90 mcg/kg and 21% with 125 mcg/kg
  • Symptomatic hemorrhage occurred in 3.5% of the 90 mcg/kg group, 4.5% of the 125 mcg/kg group, and 0% with placebo. (To get a sense for how much the definitions vary among these trials, the rate of symptomatic hemorrhage using the NINDS criteria is: 8% with 90 mcg/kg, 10.6% with 125 mcg/kg, and 14.3% with placebo.)

Comments

  • The blinding technique sounds imperfect: the syringes had different volumes in them, but the volume was covered with a label. However, the physician still pushed the drug.
  • Although this is not unique in this literature, they do tell us that the sponsors were involved with everything, including interpretation of the data and writing the manuscript.
  • Desmoteplase is theoretically a superior thrombolytic because of high fibrin specificity, no activation of beta-amyloid, and a lack of neurotoxicity. Although it fails here in the 3-9 hour time frame, it seems strange they wouldn’t have tested it head to head against t-Pa in the 0-3 hour time frame.

ECASS III

Hacke W, Kaste M, Bluhmki E. Thrombolysis with alteplase 3 to 4.5 hours after acute ischemic stroke. The New England journal of medicine. 2008; 359(13):1317-29. PMID: 18815396 [free full text]

A multicenter, placebo controlled, double-blind, randomized trial

  • Patients: 821 adult stroke patients (aged 18-80) who were able to receive the drug within a 3-4 hour time frame after symptom onset (later extended to 3-4.5 hours). They excluded patients with a NIHSS >25.
  • Intervention: t-Pa 0.9mg/kg
  • Comparison: placebo
  • Primary outcome: Disability at 90 days (looking at a modified Rankin scale 0-1)

Results

  • More people in the treatment group ended up with a favourable modified Rankin score (0-1): 52.4% with t-Pa vs 45.2% with placebo.
  • There was not a statistically significant difference in the Glasgow outcome scale or the Barthel index between the two groups, and the global outcome (encompassing all of the above) was also not statistically significant, but was close (odds ratio 1.28, 95%CI 1.00-1.65, p=0.05).
  • Mortality was unchanged: 7.7% with t-Pa and 8.4% with placebo, p=0.68
  • Symptomatic hemorrhage was higher. According to their definition 2.4% vs 0.2%, p=0.008. According to the NINDS definition, 7.9% vs 3.5%.

Comments

  • The placebo group started with an NIHSS score that was statistically higher (and a full point higher is probably also clinically significant). This baseline imbalance could have impacted the final results. (You would expect more people with milder strokes to end up with a modified Rankin score of 0 or 1.)
  • Also, the placebo group had a higher baseline rate of prior stroke (14.1% vs 7.7%, p=0.003) which would significantly impact the ultimate mRs.
  • Using the dichotomous cutoff on the modified Rankin scale of 0-1 can be misleading. It groups patients with a score of 2 (slight disability, still independent) with those who are dead. Clearly, those are very different from a patient’s perspective. (If you use the cut-off of mRs 0-2 in this study, there wasn’t a statistical difference between the groups).
  • They changed their definition of symptomatic intracranial hemorrhage to include a subjective component: the hemorrhage had to be identified as the ‘predominant cause’ of neurologic deterioration. However, we know in these studies that there will be more bleeds in the t-Pa group, so adding any criterion that discounts bleeds biases the study in favour of the thrombolytic group.
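The effect of the dichotomous cut-off can be seen with a toy example. The sketch below uses hypothetical patient counts per modified Rankin score (not data from ECASS III or any other trial) and shows how the same two groups can look different at a 0-1 cut-off and identical at a 0-2 cut-off.

```python
# HYPOTHETICAL patient counts per modified Rankin score, for illustration only.
# Index 0 is mRS 0 (no symptoms) and index 6 is mRS 6 (dead).
TREATED = [20, 30, 10, 10, 10, 10, 10]
CONTROL = [20, 20, 20, 10, 10, 10, 10]


def proportion_favourable(counts, cutoff):
    """Share of patients with an mRS at or below `cutoff`."""
    if not 0 <= cutoff < len(counts):
        raise ValueError("cutoff must be a valid mRS score")
    return sum(counts[: cutoff + 1]) / sum(counts)


for cutoff in (1, 2):
    treated = proportion_favourable(TREATED, cutoff)
    control = proportion_favourable(CONTROL, cutoff)
    print(f"mRS 0-{cutoff}: treated {treated:.0%}, control {control:.0%}, "
          f"difference {treated - control:+.0%}")
```

In this made-up data the only difference between the groups is ten patients moving from a score of 2 to a score of 1. At the 0-1 cut-off those ten patients are counted the same as if they had been saved from death, and at the 0-2 cut-off the difference disappears entirely.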

IST 3

Sandercock P, Wardlaw JM. The benefits and harms of intravenous thrombolysis with recombinant tissue plasminogen activator within 6 h of acute ischaemic stroke (the third international stroke trial [IST-3]): a randomised controlled trial. Lancet (London, England). 2012; 379(9834):2352-63. PMID: 22632908 [free full text]

A multicenter open-label, randomized, controlled trial

  • Patients: 3035 adult patients with acute stroke within 6 hours of symptom onset. Patients were excluded if they “had a clear indication” for t-Pa or if there was a clear contraindication. Then, only if both the physician and the patient thought that the treatment was “promising, but not proven” would the patient be enrolled. 53% of the patients were older than 80 (excluded from other trials.)
  • Intervention: t-Pa 0.9mg/kg
  • Comparison: standard care
  • Primary outcome: Proportion of patients alive and independent at 6 months (using an Oxford Handicap score of 0-2).

Results

  • There was no difference in the primary outcome: 35% of the control group and 37% of the tPa group were alive and independent at 6 months (OR 1.06, 95%CI 0.92-1.24, p=0.41).
  • Mortality at 6 months was unchanged (27% vs 27%, p=0.672)
  • Mortality was increased at 7 days with t-Pa (11% vs 7%, OR 1.59, 95%CI 1.23-2.07, p=0.0004).
  • Symptomatic intracranial hemorrhage was increased with t-Pa (7% vs 1%, p<0.0001).

Comments

  • This is the largest trial of thrombolytics for stroke (by a large margin), but unfortunately it also has the worst methods. That being said, the methods here really should bias in favour of the tPa group, so it is worth noting that in the largest stroke trial we have to date (by a lot) there was no benefit at 6 months.
  • I don’t know who I should be more frustrated with, the authors or the journal editors, but once again despite a clearly negative primary outcome, the conclusion in the abstract is: “For the types of patient recruited in IST-3, despite the early hazards, thrombolysis within 6 h improved functional outcome.” There is nothing remotely scientifically accurate about that sentence.
  • Selection bias is clearly a huge problem in this trial. They didn’t take all-comers, but instead patients in whom both the physician and the patient thought tPa was promising but not proven – whatever that means. (Actually, maybe I fit into that category?) I have heard people argue both that this biases in favour of tPa and in favour of placebo. I don’t know, but I know it makes the data significantly less reliable.
  • The pilot phase (first 10%) was blinded. The rest of the trial wasn’t. They don’t explain this decision. I assume it was to save money, but it was unfortunate. During the small blinded phase, the outcomes were better with control than with tPa. (See figure 3)
  • There were a variety of different follow up methods, but it sounds like the majority was via the mail. As I discuss below, the modified Rankin score is a highly subjective outcome. This is an unblinded trial, and patients or their family members were responsible for scoring the primary outcome.
  • More patients in the tPa group were treated in an ICU level environment, and we know that location of care affects outcomes in strokes.
  • They originally wanted to include 6000 patients, but changed the sample size due to slow recruitment. The original power calculation was based on wanting to show differences in the subgroups.
  • There was a secondary outcome that is reported as positive, but that outcome (the ordinal analysis) was added only after they realized they wouldn’t make it to the planned 6000 patients. An external statistician “persuaded” them that this was a more efficient analysis, but changing outcome measures after data has been collected is fraught with problems.
  • In one of many subgroups, they try to note that tPa looked better in the group that was treated in less than 3 hours. However, they ignore the fact that the 3-4.5 hour group did much worse, and the 4.5-6 hour group looked like it did better again, which looks more like the drunken walk of random chance than a real finding supporting the time is brain theory. Another possible explanation would be the baseline differences they noted between the groups.
  • Another major subgroup of interest was the patients over 80 years of age. They report a trend towards tPa helping in these patients. However, given that the overall result was negative, you know that positive effect had to be balanced, so patients under 80 years old did worse with tPa.
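The subgroup balancing argument is just a weighted average: the overall effect equals each subgroup's effect weighted by its share of the trial. The sketch below solves that equation for the under 80 group. It uses the overall difference (37% vs 35%) and the over 80 share (53%) from above, but the over 80 benefit is a hypothetical placeholder, not a trial result, and the calculation assumes the age mix was the same in both arms.

```python
def implied_effect_in_other_subgroup(overall_effect, known_effect, known_share):
    """Solve overall = share * known + (1 - share) * other for `other`."""
    if not 0 < known_share < 1:
        raise ValueError("known_share must be strictly between 0 and 1")
    return (overall_effect - known_share * known_effect) / (1 - known_share)


OVERALL_EFFECT = 37 - 35  # percentage points, from the primary outcome above
SHARE_OVER_80 = 0.53  # from the patient description above
HYPOTHETICAL_EFFECT_OVER_80 = 4.0  # NOT a trial result; placeholder for illustration

effect_under_80 = implied_effect_in_other_subgroup(
    OVERALL_EFFECT, HYPOTHETICAL_EFFECT_OVER_80, SHARE_OVER_80
)
# Largest over 80 benefit that still leaves the under 80 group at zero effect.
break_even_over_80 = OVERALL_EFFECT / SHARE_OVER_80

print(f"implied effect under 80: {effect_under_80:+.2f} points")
print(f"break-even benefit over 80: {break_even_over_80:.2f} points")
```

Under these assumptions, any benefit in the over 80 group larger than about 3.8 percentage points implies net harm in the under 80 group.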

Further reading

  • SGEM: Episode 29 Stroke me, Stroke me
  • EM Lit of Note: IST 3
  • Hoffman JR, Cooper RJ. How is more negative evidence being used to support claims of benefit: The curious case of the third international stroke trial (IST-3). Emergency medicine Australasia : EMA. 2012; 24(5):473-6. PMID: 23039286
  • Radecki RP, Chathampally YG, Press GM. Annals of Emergency Medicine Journal Club. rt-PA and Stroke: does IST-3 make it all clear or muddy the waters? Annals of emergency medicine. 2012; 60(5):666-7. PMID: 23089093

In Japan, tPa was licensed at a dose of 0.6mg/kg. Registry data seemed to indicate a decrease in intracerebral hemorrhage without a sacrifice in efficacy. This led to the following RCT comparing standard and low dose tPa:

ENCHANTED

Anderson CS, Robinson T, Lindley RI. Low-Dose versus Standard-Dose Intravenous Alteplase in Acute Ischemic Stroke. The New England Journal of Medicine. 2016; 374(24):2313-23. PMID: 27161018 [free full text]

This is a multicenter, randomized, open-label, non-inferiority trial. It has a 2×2 design, with a subset of the patients being investigated for aggressive blood pressure control. I focus on the interventional therapy here:

  • Patients: 3310 adult (18 years and older) stroke patients who were eligible for thrombolytic therapy within 4.5 hours.
  • Intervention: Low dose tPa (0.6mg/kg)
  • Comparison: Standard dose tPa (0.9mg/kg)
  • Primary outcome: Death or disability (2-6 on the modified Rankin Scale) at 90 days.

Results

  • The primary outcome, death or disability (mRs 2-6) (the inverse of the usual “excellent outcome”) occurred in 53.2% of the low dose group and 51.1% of the standard dose group (OR 1.09; 95%CI 0.95 – 1.25). This confidence interval crossed the prespecified non-inferiority boundary of 1.12, so fails to show non-inferiority.
  • Using the ordinal analysis, the low dose was statistically non-inferior (OR 1.00; 95% CI, 0.89 to 1.13; P = 0.04 for noninferiority).
  • The number of patients alive and independent (mRs 0-2) at 90 days was essentially identical (62.4% and 63%).
  • Major symptomatic intracerebral hemorrhage was less common with the lower dose (1.0% vs 2.1%, P = 0.01). (Rates of symptomatic hemorrhage by NINDS criteria were much higher, but still statistically different: 5.9% vs 8.0%, P = 0.02.)
  • 7-day mortality was lower in the low dose group (3.6% vs 5.3%, P = 0.02), but 90-day mortality was not statistically different (8.5% vs 10.7%, P = 0.07).

Comments

  • This is an open-label trial. I assume they did this practically because they decided to use different bolus sizes (10% in the standard group and 15% in the low dose group). However, this is unfortunate. It would have been incredibly easy to blind this study, as they were just looking at different doses, and it would have made the findings much more reliable.
  • Although the outcome was negative by their definition of non-inferior, the numbers here are essentially identical from a clinical standpoint, and mortality was higher in the standard dose group.
  • Over 60% of the population was recruited in Asia. It is not clear if the population in Asia has different outcomes with thrombolytics (some studies have hinted at higher rates of hemorrhage), but if so that could limit the study’s external validity.
  • They had to screen almost 70,000 patients to find 3310 eligible, which reminds us that most stroke patients are not eligible for tPa.
  • Although common in stroke studies, the primary outcome was based on a phone assessment.

Further Reading


Here is a nice table from Ken Milne that summarizes these papers in a colour coded fashion:

SGEM table.png

Trying to understand these studies

There is no simple summary of these studies. If there was, there probably wouldn’t be any controversy. There are some things we need to know if we are going to try to make sense of this research to guide our practice and speak to our patients.

The outcomes

All of these studies use neurologic scoring systems or scales as their primary outcomes. To understand the studies, it is important to understand the scales. The most common scale used is the modified Rankin scale (mRs), so I will delve into it in a bit more detail. For reference, other scales used are:

The modified Rankin scale has 7 categories:

  • 0: No symptoms
  • 1: No significant disability. Able to carry out all usual activities, despite some symptoms.
  • 2: Slight disability. Able to look after own affairs without assistance, but unable to carry out all previous activities.
  • 3: Moderate disability. Requires some help, but able to walk unassisted.
  • 4: Moderately severe disability. Unable to attend to own bodily needs without assistance, and unable to walk unassisted.
  • 5: Severe disability. Requires constant nursing care and attention, bedridden, incontinent.
  • 6: Dead.

At first glance, the scale seems pretty straightforward, but humans are complex and real life examples blur the lines. What counts as being unable to carry out all previous activities? How much help is “some help”? I am in my thirties with no neurologic problems, but I require a lot of help getting through my life. My grandfather’s days are spent sitting in front of the TV smoking. My grandmother looks after him completely. He could have a stroke resulting in a dense hemiplegia of the upper extremity, and as long as he had one hand to smoke and use the remote control, you wouldn’t know the difference. So is he “able to carry out all his usual activities, despite some symptoms” or does he “require some help, but able to walk unassisted”? All of us have good days and bad days, and our score here could vary widely.

Most examples will be more subtle than that, but these are elderly patients with comorbidities and existing disabilities. Furthermore, there are complex social interactions that influence one’s sense of disability, especially after an acute illness. The score one is given might depend on the time of day, day of the week, or even who you happen to be talking to.

This is borne out in the literature. Even among trained neurologists, when more than one person scores the same patient, they frequently arrive at different results. See, for example, this paper (Quinn 2009) where the kappa value for the modified Rankin scale is 0.46 (very poor).
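For readers who have not met kappa before, here is a minimal sketch of how an agreement statistic of this kind is calculated. It uses plain (unweighted) Cohen's kappa, which is an assumption on my part; the cited paper may have used a weighted version. The ten pairs of scores are invented purely for illustration and are not data from any trial.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: agreement beyond what chance alone would give."""
    n = len(rater_a)
    # Observed agreement: fraction of patients given the same score by both raters
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: based on how often each rater uses each category
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical mRS scores from two assessors for the same 10 patients
neurologist_1 = [0, 1, 1, 2, 2, 3, 3, 4, 5, 2]
neurologist_2 = [0, 1, 2, 2, 3, 3, 3, 4, 4, 1]

kappa = cohens_kappa(neurologist_1, neurologist_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the two assessors agree on 6 of 10 patients, yet kappa comes out at roughly 0.5, because some of that agreement would be expected by chance alone.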

The bottom line is that the categories on these scales are not black and white. There is a subjectivity present, which is especially important in unblinded trials, or if blinding was broken.

Blinding

Leaving aside IST3, the majority of trials here were blinded. Unfortunately, there are some potential problems with blinding. If you have ever used tPa, you know it has a somewhat foamy appearance in the syringe. These trials used saline as the placebo, and clinicians might have been able to tell the difference between the two in the syringe. More importantly, if you have ever looked after a patient given tPa, you know this drug is almost impossible to blind. Nurses may note excess bleeding from IV sites or gingival bleeding. Also, changes in routinely drawn lab values might make it obvious which group the patient had been allocated to.

Obviously, we are always concerned about blinding in trials. However, it is especially important if the outcome being measured is not objective. I trust the mortality differences reported here, even if the groups became unblinded. However, as discussed above, the primary outcome in all these trials relied on scoring systems that involve subjective assessment. The difference between a modified Rankin scale of 1 and 2 is not obvious at the best of times, but especially not if the data is collected from family members over the phone (or by mail) who may have been unblinded.

Definitions

Every trial has different definitions of intracranial hemorrhage. When summarizing the data, I tried to include the definition that I think is most important, which is the large symptomatic bleeds. If you are comparing the bleeding rates from different trials, or at your own hospital, make sure you read the definitions carefully. Honestly, though, I don’t worry as much about the bleeds themselves, because I want to know their clinical outcomes, which will be captured in the death and disability numbers.

Some statistics

What is a p value? You might be surprised to find that the topic is hotly debated, and a mere mention of p values can get a statistician far more excited than you ever thought possible. I am not going to wade into that morass, for now, but it is important to know what we mean when we say a study is “statistically significant”. (If, for whatever reason, you want to read a little more about p values, this is an excellent article). In the eyes of Ronald Fisher, the statistician who invented the p value, it was an informal way to judge whether data was worthy of a second look. The p value doesn’t define truth. The foundation of science is replication, and the p value was only intended to tell us which studies are worth replicating. It is also worth noting that the p value of 0.05 has no special meaning. We use it in medicine because we had to choose something and it’s relatively practical for biologic experiments. However, in physics the standardly accepted p value is 0.0000003. (See Fatovich and Phillips 2017)

Most importantly, the p value can only be judged in the context of the scientific literature. Much like diagnostics in our clinical practice, to properly interpret a p value, we must know the pretest probability. Unfortunately, there is no clear method to judge a study’s pre-test probability. In general, we know the chance that a new medication will help patients is low, because the vast majority of trials of new medications are negative. If you start with a general pretest probability of 5% for any new treatment (it’s probably not that high), you might increase that slightly before NINDS was started, because we knew that thrombolytics work for MI, and the pathophysiology is similar. On the other hand, we have to bring our estimate back down somewhat because there were already 2 negative trials of the same therapy for stroke.

If we were to assume a 10% pretest probability that thrombolytics would work in NINDS (and I think that is being generous), applying a p value of 0.025 (the average of the 4 primary outcome p values), you would get a post-test probability of 77%. (This assumes you ignore the problems with NINDS and just take the numbers as they are.) A 77% post-test probability is good. Based on this result it is now more likely than not that thrombolytics work, but there is still a 23% chance that they don’t work. Clearly, this study should be replicated. A second statistically significant result, starting with this new pretest probability, might be enough to convince us that thrombolytics work. With a 77% post-test probability, it might even be reasonable to use this experimental treatment while you are waiting for further research to be completed. What is clearly inappropriate is to stop all further study and declare the treatment to be the standard of care.

(The most tenuous part of these calculations is determining the pretest probability. Based on the fact that at least 90% of newly tested medications fail to show benefit, I think that the numbers here are too high. However, if you think I was being too conservative, and want to make the pretest probability that tPa would work in NINDS 25%, the post-test probability is still only 91%. Higher, and more convincing, but still in need of replication. You can play around with these numbers using this calculator.)
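If you would rather see the arithmetic than trust a calculator, here is a rough sketch of one common way to do it. It treats the p value as the false positive rate and assumes the trial had 80% power. Both are simplifying assumptions of this sketch, not inputs taken from the calculator mentioned above, so the output lands close to, but not exactly on, the figures quoted (about 78% rather than 77% for the first case).

```python
def post_test_probability(pretest, p_value, power=0.8):
    """Chance the treatment truly works after one 'significant' trial.

    Assumptions (for illustration only): the p value is used as the false
    positive rate, and `power` is the chance a real effect would be detected.
    """
    true_positive = power * pretest
    false_positive = p_value * (1 - pretest)
    return true_positive / (true_positive + false_positive)

# Pretest probabilities of 10% and 25%, with p = 0.025 as in the text
generous = post_test_probability(0.10, 0.025)
very_generous = post_test_probability(0.25, 0.025)

# A second significant trial, starting from the first post-test probability
after_replication = post_test_probability(generous, 0.025)

print(f"{generous:.0%}, {very_generous:.0%}, {after_replication:.0%}")
```

The last line is the point of the exercise: one more positive trial would move the estimate from "more likely than not" to something close to certainty, which is why replication matters so much.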

This is a fundamental rule of science that we often forget in medicine. Research needs to be replicated. P values don’t tell us the truth, they just alter our post-test probability. Consider sepsis protocols. Consider therapeutic hypothermia. The results are very similar. In both those cases, much like with thrombolytics for stroke, we allowed our clinical practice to get somewhat ahead of the evidence. That is understandable, because we want to help our patients, but we must learn from these (not unexpected) reversals. Where is the validation study for thrombolytics?

The fragility index

The fragility index is a powerful and intuitive statistical concept. The index tells you how many people in a study would have had to have a different outcome in order for the study to become “not statistically significant” (to have a p value above 0.05). A fragility index of 100 tells you that 99 patients could have slipped from a good to a bad outcome and the trial would have still been statistically significant. A fragility index of 1, on the other hand, tells you that if even a single patient had a different outcome, the trial would have been reported as negative instead of positive. It is a powerful tool, because it gives you a sense of how easily random chance could have changed the results of a trial.
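To make the idea concrete, here is a minimal sketch of how a fragility index can be calculated. It assumes a two-sided Fisher exact test and flips control patients from a bad to a good outcome one at a time until significance is lost. The function names and the trial counts are hypothetical, chosen only for illustration; they are not the NINDS or ECASS 3 data.

```python
from math import comb

def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability of a table with x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is at least as unlikely as the observed one
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

def fragility_index(good_treat, n_treat, good_ctrl, n_ctrl, alpha=0.05):
    """Number of control patients who must switch to a good outcome
    before the result stops being statistically significant."""
    flips = 0
    while good_ctrl + flips <= n_ctrl:
        p = fisher_exact_p(good_treat, n_treat - good_treat,
                           good_ctrl + flips, n_ctrl - good_ctrl - flips)
        if p >= alpha:
            return flips  # 0 means the trial was not significant to begin with
        flips += 1
    return None

# Hypothetical trial: 70/150 good outcomes with treatment, 40/150 with placebo
example_fi = fragility_index(70, 150, 40, 150)
print("Fragility index:", example_fi)
```

Published fragility indices usually modify whichever group had fewer events, so treat this as an illustration of the logic and not as a reference implementation.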

How does this help us when looking at the stroke literature? Given that the foundation of our current practice is NINDS, it would be good to know if NINDS was likely to give us the same results if replicated. The fragility index can give us a sense of the robustness of the results. Unfortunately, NINDS had 4 primary outcome measures, so there isn’t just a single fragility index. However, the results are very similar. For 3 of the primary outcomes, the fragility index was 3. In other words, if 3 extra people in the control group had had a good outcome, the trial would have been statistically negative. (For the fourth primary outcome, the fragility index is 4). Clearly, this is a small number. Random chance (not to mention the various sources of bias in the trial) could easily have turned this trial from positive to negative. Therefore, it would not be surprising if a replication of NINDS turned out to be negative. (At this point, I would probably be more surprised at the trial being run than I would be to find out it was negative).

Josh Farkas talks about a related concept, called the instability index, which he suggests tells you how much imprecision there is in the final outcome based on imbalance between groups, post-randomization crossover, and loss to follow up. You are better off just reading his post here. For NINDS, he calculated the instability index as 6.5. Meaning, we shouldn’t be surprised if 6.5 patients changed outcomes if the trial was repeated. Compared to the fragility index of 3, this tells us NINDS is unlikely to give us the same results if it were replicated. (Please note, unlike the fragility index, which is a well recognized statistical tool, the instability index is just a (useful) product of Josh’s imagination).

There is one other study supporting tPa use: ECASS 3. The fragility index? One!!

Baseline imbalances

There are two positive trials listed above. In both of those trials, the placebo group had a higher NIH stroke score than the tPa group when patients were enrolled. This isn’t anything nefarious. It is just something that happens by chance when you are dealing with small trials. Unfortunately, the single largest predictor of outcome in strokes is how severe the stroke is at baseline. These trials did not measure how much you improved, but instead asked how many patients were functionally independent at the end of the trial.

Imagine we were testing two different pain medications. Drug A takes patients’ pain from an average of 7/10 on arrival to 3/10 at 1 hour. Drug B reduces pain from 8/10 to 4/10. If we are interested in the change in pain, both drugs reduce pain by 4/10 and we would consider them equivalent. However, if we discovered that patients consider any pain score of less than or equal to 3 to be “minimal pain”, we might ask: “how many patients had a pain score of 0-3 at the end of the trial?” If we set that as our primary outcome, the baseline imbalance between the two groups makes Drug A seem superior to Drug B.

Because the placebo groups in both NINDS and ECASS 3 had sicker patients to begin with, it is not surprising that fewer of those patients were independent at 3 months.
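The pain medication example can be put into numbers with a very small sketch. The individual baseline scores below are invented so that the group averages match the example (7/10 and 8/10), and every patient is assumed to improve by exactly 4 points, which is of course a simplification.

```python
from statistics import mean

def summarize(baseline_scores, drop=4, threshold=3):
    """Apply an identical pain reduction to every patient and report two outcomes."""
    final_scores = [score - drop for score in baseline_scores]
    return {
        # Outcome 1: average change from baseline
        "mean_change": mean(b - f for b, f in zip(baseline_scores, final_scores)),
        # Outcome 2: share of patients ending at or below the 'minimal pain' cutoff
        "share_minimal_pain": sum(f <= threshold for f in final_scores) / len(final_scores),
    }

drug_a = summarize([6, 7, 7, 7, 8])  # hypothetical group, mean baseline 7/10
drug_b = summarize([7, 8, 8, 8, 9])  # hypothetical group, mean baseline 8/10

print(drug_a)
print(drug_b)
```

Both drugs do exactly the same thing to every patient, yet the dichotomized outcome makes Drug A look four times better (80% versus 20% reaching "minimal pain"), purely because its patients started out less sick.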

Conflicts of interest

Pharma runs our studies. We understand that, but we also know that pharma-run studies have a much higher likelihood of being positive than studies run by non-conflicted sources. There is clear evidence of bias in the way that these studies are written up (ignoring primary outcomes and emphasizing secondary outcomes in conclusions). Aside from IST3, all the major RCTs had significant industry involvement. It’s hard to know how much industry involvement affects the results, but until we can fix the underlying scientific system, we have to account for this inherent bias when interpreting published studies. This, of course, is not a problem unique to studies of thrombolytics for stroke, but is nevertheless a problem evident in these trials.

A common misconception: Rapid recovery

When discussing thrombolytics with other doctors, I hear a lot of anecdotes. The people who are convinced that tPa works almost never suggest that NINDS was a great study, nor do they point out a promising analysis of the data. The refrain is almost always the same: I pushed tPa and the patient got better in front of my eyes.

Witnessing such an event is indeed powerful. We all want to help our patients, and in these cases it seems like the medication being pushed saved the patient. Unfortunately, it is a mirage. Thrombolytics simply don’t work that way. In none of these trials did patients improve immediately. NINDS part 1 was specifically designed to look for improvement at 24 hours, and there was none. Thrombolytics may provide some long term benefit, but there is no evidence here that they have an immediate impact. (There was a difference with tPa, but it did not reach statistical significance. NINDS may simply have been underpowered.)

I know people don’t like data. So, instead, consider the many other stroke anecdotes. You are called urgently to a room to assess a patient with a dense hemiplegia. You rapidly activate the stroke protocol and the patient is whisked off to the CT scanner. You meet them back in the room after a rapid review of the images, ready to discuss the harms and benefits of tPa, only to discover that their symptoms are resolving. I have seen this hundreds of times. Sometimes, their symptoms have resolved before I can even order the CT. We tend to ignore these patients, because we were not the saviours – because we don’t get the credit – but patients rapidly resolving on their own are far more common than patients rapidly resolving after tPa. If a stroke patient’s symptoms resolve in the first 24 hours, whether or not you gave tPa, that is called a transient ischemic attack.

I should also mention, it isn’t really clear how thrombolytics could have an effect at 3 months but not at 24 hours. This is one of the facts that leads me to believe that much of the difference we are seeing in NINDS was due to baseline imbalance between the groups. (At 24 hours, they measured a 4 point improvement on the NIHSS – so it wouldn’t matter where you started. At 3 months they measured how many patients were doing well, in which case it really matters how sick you were at the outset.)

How do these studies compare to thrombolytics in MI?

There were over 60,000 patients in the MI studies, as compared to about 10,000 with stroke. All of the MI studies were positive. Thrombolytics improved mortality in MI. Every thrombolytic agent worked. The thrombolytics worked early and late. None of this is true for thrombolytics in stroke.

Of course, the major difference is that thrombolytics only worked in one specific type of MI: the STEMI. By defining this narrow population, they were able to ensure benefit (2.5% decreased mortality) despite a very narrow therapeutic window (1% major bleeding). We have not identified any such subgroup for stroke.

Why not include meta-analyses here?

There is probably too much clinical heterogeneity here to simply pool the results together. With different inclusion criteria, timing, definitions, and agents used, it isn’t clear that a single number can summarize this data. On the other hand, it is inappropriate to simply ignore data based on retrospectively chosen criteria.

Meta-analyses are great for increasing statistical power, but do nothing to help us with bias. The various flaws discussed above are simply compounded when trials are combined. The larger sample size doesn’t get us any closer to the truth.

Combining trials together also makes larger trials more important. Unfortunately, in the stroke literature, the largest trial, contributing almost half of all patients in current meta-analyses, is the deeply flawed, open-label IST3 trial. Allowing such a biased trial to overpower the others doesn’t make much sense.

Stopping negative trials early and effect on meta-analyses

Another problem with meta-analyzing the data here is the imbalance created by stopping only negative trials early. The trials stopped early should have included a total of 3900 patients based on their initial study protocols. (I include MAST-I in this group). Instead, they only enrolled 1827 patients. The roughly 2000 patients missing from the negative trials are more than were actually enrolled in the positive trials (NINDS and ECASS 3) combined. The result is a significant imbalance.

Imagine that I wanted to prove to you that my basketball team was an excellent 3 point shooting team, being able to make 50% of their shots. We run a few trials. Player 1 misses the first 2 shots and we decide to stop the trial early “because of statistical futility”. Player 2 also misses the first 2 shots and we stop the trial early. Player 3 makes 7 out of 10 tries. Player 4 makes 1 of 4 shots, but our statistician tells us this is unlikely to reach significance, so we stop. Finally, player 5 makes 6 out of 10 shots. Is this team a good three point shooting team? Three of the five players were awful, shooting 0% or 25%. However, in total, the team has attempted 28 shots, and made 14 of them. By eliminating the poor shooting of 3 players, we created an unbalanced sample that was able to “prove” that my team can shoot 50%. Although this is a relatively simplistic example, it gives you a sense of the problem of combining results when all the negative trials were stopped early.
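The tally in the basketball example is easy to check. This short sketch uses only the shot counts given above and compares the pooled shooting percentage with the simple average across players.

```python
# (shots made, shots attempted) for each player, as described in the example
players = {
    "Player 1": (0, 2),    # stopped early
    "Player 2": (0, 2),    # stopped early
    "Player 3": (7, 10),
    "Player 4": (1, 4),    # stopped early
    "Player 5": (6, 10),
}

made = sum(m for m, _ in players.values())
attempted = sum(a for _, a in players.values())

# Pooling all shots lets the two players who finished dominate the total
pooled_rate = made / attempted

# Giving every player equal weight tells a different story
per_player = {name: m / a for name, (m, a) in players.items()}
mean_player_rate = sum(per_player.values()) / len(per_player)

print(f"Pooled: {pooled_rate:.0%}, average per player: {mean_player_rate:.0%}")
```

Pooled, the team shoots 50%. Weighted equally by player, the figure is 31%. Neither number is "the truth", but the gap shows how much early stopping of the poor shooters shapes the pooled result.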

Why do we ignore the negative trials?

One of the greatest scourges in modern medicine is publication bias, where negative trials never see the light of day. This is not (as far as we know) the case when it comes to thrombolytics for stroke, but for some reason, we have decided to ignore the negative trials anyway.

The current stroke literature, as outlined above, includes 4 papers stopped early for harm or futility, 6 negative trials, and 2 positive trials. That is almost exactly the random distribution that you might expect from studying an intervention with no effect. Just because we decided to paint a bullseye around the two positive trials, doesn’t mean that we actually hit the mark. (The Texas sharpshooter fallacy.)

Does agent matter?

Three of these trials (ASK, MAST-Italy, and MAST-Europe) used streptokinase. All three were negative trials. In fact all three were stopped early due to harm. The question is, should these three trials be treated as different because streptokinase is in some way different than t-Pa? It isn’t clear what the answer should be. There are theoretical reasons that streptokinase might be different than t-Pa, but also theoretical reasons that would indicate it shouldn’t be. There is no difference between the various thrombolytic agents in the management of STEMI and it is rare in medicine to see true differences between different medications of the same class. However, the outcomes with lytics in stroke are clearly different from those in STEMI, so it is not easy to extrapolate from that literature. If you look at the outcomes of these three trials and compare them to the outcomes of the t-Pa trials, it is hard to see any clear differences, although the mortality numbers are the highest with treatment in these three trials. The Cochrane reviews have not identified a difference between the agents. (See Wardlaw 2014)

Does time matter?

Although there is some convincing evidence that harms increase the later that tPa is given, it is not clear that the “time is brain” mantra is based in science. Physiologically, it never made much sense, as neurons die 3-6 minutes after their blood supply is lost – orders of magnitude different from the 180-270 minute timeframes we are talking about. This Cochrane review concluded that the current data does not support a significant difference in outcomes between the 0-3 and 3-6 hours groups. IST3 provides us with the rather implausible result that patients presenting less than 3 hours after symptom onset benefit, those between 3 and 4.5 hours are harmed, and those in the 4.5-6 hour time frame are again helped by tPa. Maybe, for patients presenting at 3.5 hours, we should wait a bit?

Although it is certainly possible that early treatment results in earlier reperfusion to an ischemic penumbra that is not yet dead, there is another explanation that would explain the better outcomes seen in earlier presenters: selection bias. A patient seen at 90 minutes might still be a TIA which will self-resolve, but by 3 hours that is less likely. Similarly, a patient at 90 minutes might still be post-ictal or having a migraine, but the longer you wait, the more of these self-resolving stroke mimics will resolve. The result is that more patients in the early group are likely to have TIAs and stroke mimics. These patients will, of course, have much better 3 month outcomes than patients having strokes, making earlier treatment erroneously look more effective than later treatment.

The idea that patients treated early might naturally be expected to fare better is important when interpreting NINDS. As part of the NINDS protocol, there had to be an even distribution between patients in the 0-90 minute group and the 90-180 minute group. What that means is that, even among the patients in the 0-3 hour window, the majority were excluded from this trial (because so few patients show up in the first 90 minutes). The result is a very select group of patients presenting, on average, much earlier than patients present in real practice. Consequently, we should expect these patients to fare better than the patients we actually see.

Another common mistake: hemorrhage versus good outcome

Often, when skeptics of thrombolytics discuss this topic, the potential benefit of tPa is weighed against the 6% rate of symptomatic intracerebral hemorrhage. I think this is a mistake, when you consider the primary outcomes here. These trials looked at death and disability at 3-6 months. That is a reasonable, patient centered outcome (aside from the subjectivities mentioned above). The harms of an intracerebral hemorrhage are included within that outcome. So it is not a balance of whatever benefit you think these studies show against the harms of hemorrhage; it’s the overall benefit despite the harms of hemorrhage.

That isn’t to say there isn’t harm from these medications. I think these trials fairly consistently demonstrate an increased risk of early death. That is a harm. The risk of death trends back towards neutral with time, but that is expected. Run any trial long enough, and the mortality rate will be 100% in both groups. In a group of older stroke patients, with multiple comorbidities, we should expect a number of patients to die in both groups, independent of tPa, which tends to make the groups seem more similar. So there is probably harm here, and the benefits are uncertain, but we should not set up a false balance between functional benefit on one hand and bleeding on the other.

Conclusion

The thrombolytics debate isn’t about numbers or statistics. This isn’t a question that can be answered simply by dissecting these trials (believe me, I have tried). The reason that this issue is still debated is all about the reliability of the data.

Stroke is a devastating condition and every clinician wants to do everything in their power to help their patients. Unfortunately, good intentions are not enough, and it is generally our sickest patients in whom we need to be most careful about the delicate balance between doing good and doing harm. I have read all this literature through more times than should ever be done. I can’t tell you for sure whether thrombolytics work. Physiologically speaking, they are clearly doing something, as is evidenced by the increase in bleeding. There is a hint at benefit throughout a number of these papers, but that has to be tempered by the various sources of imbalance and bias in this literature.

My guess is that there must be some subgroup of patients who are benefiting to balance out harms in others. (Although I am not absolutely certain that there is any benefit here). Unfortunately, our current approach is akin to giving thrombolytics to all chest pain patients, or at least to any patient with a positive troponin. In that population, lytics fail. We don’t have an ST elevation equivalent to guide us in stroke. My biggest concern is that the push to define tPa as the “standard of care” has robbed us of the important research that would have discovered this subgroup.

Bottom line? I don’t know. If NINDS was replicated today, I would open the odds between 4:1 and 9:1 against the same results. (In other words, I think there is about a 10-20% chance that if the same protocol was run, we would see the same results). I think we clearly need more research. I think basic philosophy of science and statistical tenets tell us that we must attempt to replicate NINDS. Or maybe this whole debate will simply disappear, as endovascular therapy becomes the new norm. More on that next time…

How do I discuss this with my patients?

I don’t work at a stroke center, and because of EMS bypass protocols this isn’t a conversation I have frequently. I tend to say something like:

“There is a treatment we sometimes use for stroke that is supposed to break down the clot causing the stroke. The treatment is controversial, and you will probably hear different things from different doctors. The issue is that out of 13 major trials, only 2 have shown benefit, and both of those trials have some problems, and they were both paid for by the people who make the drug. There are some risks that we’re certain about: about 1 in 12 patients will have severe bleeding resulting in worse neurologic outcome. Despite that risk, in the best case scenario, about 1 in 10 people given this drug early will have a noticeable improvement in their function after 3 months. Unfortunately, it isn’t clear how reliable the science has been, and we don’t know which patients have the greatest chance at benefit or harm. The choice to receive this medication remains up to each individual patient.”

Other FOAMed

theNNT: Thrombolytics for stroke

SGEM: Episode 85 Won’t get fooled again (tPa for CVA)

St. Emlyn’s: Kicking against the prick: Systematic Review of stroke thrombolysis

FOAM Cast: ACEP tPA policy; Dr Jerome Hoffman on ACEP’s tPa clinical policy

Life in the Fastlane: The Use of Thrombolysis as a Treatment for Acute Stroke

EMCrit: tPA for ischemic stroke debate

A special thanks to Dr. Ken Milne for reviewing the discussion section of this post to ensure I wasn’t making any major errors.

Author: Justin Morgenstern

Community emerg doc, FOAM enthusiast, evidence junkie “One special advantage of the skeptical attitude of mind is that a man is never vexed to find that after all he has been in the wrong.” - William Osler

17 thoughts on “Thrombolytics for stroke: The evidence”

  1. What an absolutely amazing review. Thank you so much. Especially love the section on ‘how do I discuss this with patients’.


    1. Great job. Hopefully I will feel much more confident about pulling the trigger in the future. I was very careful about using tPA for stroke until a 44 yo woman with essentially no medical nor vasculopathic history was brought to ER with her husband who gave an excellent 1 hour history of the evolution of her symptoms. She was talking with her husband and began to stutter and then devolved quickly into a word salad with apparent sentence structure intact. Very weird. She sat there talking about “German beach peanut butter has perfect tax value after weasel dropped dimples…” in response to “how are you.” She appeared oblivious to the fact she wasn’t making sense. After she was ruled in, tPA was administered and her word salad stopped mid sentence. She told us she felt fine and didn’t see what the fuss was about.


      1. Thanks for the comment.
        As I mention in the post, you have to be very careful considering anecdotes. It is possible that tPa works that quickly, but so far the evidence indicates that it doesn’t; that there is no improvement at 24 hours.
        More importantly, the use of anecdotes promotes intrinsic biases in human thought. You remember this one story because it was very powerful, but what about the hundreds of times that you pushed tPa and nothing happened? Why choose the one positive story over the many negative? And what about the many times that patients rapidly improved exactly like this, but tPa was never given?
        Anecdotes are powerful. But they are dangerous, because they can and do easily over-ride rational thought.

      2. Yes. Exactly. I was pointing out how easy it is for anecdotes to be biased; how easy it is to just select the anecdotes that suit your purpose. The point is that you must focus on the evidence, not the anecdotes.

  2. Here is the response from our neurologist:

    The authors of this open this up to all studies – relevant or not

    1) first study not relevant – it was not tPA
    2) second study was 6 hours, not 4.5 or 3. We know that it is very time sensitive. Therefore study not really relevant
    3) NINDS 1 and 2 both showed significantly positive results at 3 months, as noted in the article
    4) MAST-Europe and ASK were also not tPA and therefore not relevant
    5) ECASS 2 was 6 hours – not relevant
    6) ATLANTIS B is such a mess I don’t think you can draw any conclusions either way

    etc

    so exclude > 4.5 hours, exclude low dose, exclude not iv tpa and all studies – the way we are using the drug – showed benefit

    somebody DID do a pooled analysis of just alteplase and just < 4.5 hours and found benefit similar to the smaller individual studies

    https://www.ncbi.nlm.nih.gov/pubmed/20472172

    I get it that people are not convinced by the two positive studies. But there are, as far as I can tell, ONLY positive studies for the way we treat (<4.5 hours, IV alteplase, 0.9 mg/kg). You cannot include up to 6 hours or include streptokinase or include the wrong dose and say “see, I told you it does not work”

    1. Thank you so much for the comment.

      The cherry picking of trials is precisely our problem with the stroke literature. We ignore trials, as you say, because of the agent used, the time frame, the dose, or the population. However, all of these caveats are added after the fact. This is the Texas sharp-shooter fallacy: we are painting the bullseye after the shots have been fired. Remember, all these trials were run precisely because the researchers thought the intervention would work. If there was a good reason to think that treatment up to 6 hours wouldn’t work, they wouldn’t have run the trials. It was only when retrospectively trying to come up with explanations for why NINDS2 was positive while all the other trials were negative that those distinctions were made.

      And the evidence for those distinctions is very weak. From the stroke literature, there is no evidence that tPA is superior to the other agents used. From the MI literature, all the lytic agents, in massive trials, have proven to be identical, with one exception: tPA has the highest rate of intracerebral hemorrhage.

      There are two possible explanations for why there are two positive studies among many negative. First, it is possible that there was something special about those studies, such as the drug used or the timing in which it was given. Second (and, after reviewing the literature, I think this is more likely), this is simply what you would expect from random results in small trials. The only way to determine whether the positive results were the result of random chance (or baseline imbalances between the groups) or of a real benefit from tPA is to replicate the trial.

      Replication is the foundation of science. It was the real focus of this post. From this science I can’t be sure that thrombolytics don’t work. But, we also can’t be sure that they do. This is clearly an outstanding question that desperately needs to be settled, and the only way to do that is to replicate NINDS.

      1. My neurology group states that this is all true .. except for one thing – which is the time thing and why many of the early trials chose 6 hours. Didn’t think 6 hours was picked because there was any certainty that it would work till then but rather that they wanted enough patients to run the trial and back in the early days the thought of doing a trial with a three hour cut off was radical and challenging – you need to assess, CT, get labs, consent, randomize, etc. In fact the NINDS trial was run at three hours after some of the negative 6 hour ones

        We do see this sort of evolution in trials – look at the EST trials. Before 2015 we had all negative trials. After 2015 we had 4 positive trials. What changed? Patient selection and devices – part of which was based on learning from prior failures. This actually happens all the time – for example, in MS they recently ran a Rituxan trial for primary progressive MS – not effective. However, younger, earlier patients were a subgroup who did benefit. So ocrelizumab (the drug they ended up getting approved for MS) ran a trial on just this group and showed benefit. (The real problem is that when the FDA approval comes out it’s for ALL primary progressive MS – not just younger and early.) So yes, you can torture data till it shows you what you want, BUT you can also use it to learn what works and what does not and re-run the trial with the selected group

        Replication of NINDS would certainly be useful but at this point don’t see how it’s going to happen. You’d need to get funding to run the trial as well as IRB approval to run a trial with placebo against FDA approved standard of care treatment. These things should not be a barrier but I bet they would.

      2. Two points:
        1) Although it is true that occasionally we see refinement in technique that allows us to identify sub-populations that truly benefit from therapies, this is rare, and more importantly not the case with thrombolytics for stroke. NINDS was followed by a series of negative studies. There is no evidence that the selection criteria, agent or timing were what made NINDS positive. If they had determined the correct selection criteria, we would have expected multiple positive trials after NINDS, but instead we see mostly negative trials (unlike the endovascular literature). This question is exactly why we need a replication of NINDS: it’s possible that they happened to find the right population, but it is also possible, and probably more likely, that the positive result was the consequence of random chance and the baseline imbalances in the groups.
        2) Although it is true that it will be difficult to run a proper replication of NINDS, that is precisely what we should be fighting for. It is unfortunate that we let scientific misunderstanding and corporate interest guide what we call standard of care. As physicians, we have a responsibility to our patients that involves speaking up about these misunderstandings.
        There is an ethical imperative to replicate these studies. Although it is true that thrombolytics might help some patients, and if we ran a randomized trial, some patients would be denied this therapy, it is also possible that tPA is currently hurting patients. Instead of testing this in a few thousand patients, we allow those harms to be propagated over millions of patients over multiple decades. That is simply unacceptable. Because of its widespread use, the potential harm resulting from our uncertainty is orders of magnitude larger than any potential harm of running a proper randomized controlled trial.

  3. Thank you Justin for what I think is a phenomenal review of the literature and an objective and honest conclusion. I can’t imagine the time involved in doing so. I appreciate your generosity.
    What is incredible to me (but not surprising considering profit motives) is how the house of medicine has built an empire of “stroke centers of excellence” based on such an unconvincing body of evidence. It is also not surprising to see so many neurologists “convinced” by the literature, as many have also profited from the increased demand for their specialty. Not to mention the travesty of how many doctors have been sued in the last 15 years because thrombolytics in stroke have been deemed the standard of care based on the same unconvincing literature. It’s a sad commentary for patients, doctors, and medicine.

  4. Brilliant post as usual. My favourite part though was: Morality at 6 months was increased (36% vs 24%, OR 1.7; 95%CI 1.2-2.5).
    Don’t change it though, made the long read worthwhile.

    1. Usually I am mortified by my typos, especially as I proofread these posts 20 times. This time, I agree with you that this improves the post. If only tPA could increase morality – then we finally would have the miracle drug that has been so widely advertised.
