The PICO Method and Endovascular Stroke Trials

Update 9/21/2017: I’ve noticed that this post is one of the blog’s most-read, not just at the time it was written; even yesterday, 5 people clicked on it. I feel compelled to attach this little update to inform readers that in the 4.5 years since I wrote it, a lot has happened! There have been a bunch of positive endovascular stroke trials that completely changed the landscape for this condition. Here’s a 2015 post about the then-new and exciting results. Now things are getting even more interesting, because the DAWN and DEFUSE 3 studies, which enrolled subjects in very prolonged time windows (up to 16-24 hours from onset / last known well), have been halted prematurely due to evidence of benefit. Published results are still forthcoming, and I’m sure there will be a big splash at this year’s International Stroke Conference in L.A. Anyway, I just want to make sure that people who read this post don’t rely on its outdated information. The broader points about the PICO method remain valid, however.

Many thanks to Dr. Marcus Chacon for hosting journal club yesterday, and to Dr. Wendy Chacon for tolerating our presence in her house until well after the kids were in bed. Kudos to Dr. Cahill for presenting two of the papers with very short notice and to Dr. Alsherbini for taking on the third one. You both did a good job in somewhat difficult circumstances.

The first thing I wanted to follow up on is the PICO method. This acronym stands for Patient (or Population), Intervention, Comparison, and Outcome. It’s a useful way to formulate clinical questions and also to evaluate clinical research papers. An example pertinent to yesterday’s papers would be:

In patients with acute ischemic stroke presenting within 12 hours of last known well and having NIHSS scores between 5 and 25, does endovascular intervention with the Solitaire device, as compared to standard care, result in better neurological outcomes expressed on the modified Rankin scale?

The details of each of these can and should be carefully explored in any rigorous evaluation of a paper (eg, what’s “standard care”, exactly?), and I suggest that going forward, we use this format in any journal club regarding clinical studies.

Yesterday, we reviewed the three papers referenced in my post of February 10th. I found it difficult to guide us in a PICO-focused way because, seemingly, every statement made was quickly met with a vigorous defense of the endovascular perspective. If nothing else, we did succeed in identifying many weaknesses in the papers, but I’d like to use this opportunity to organize some of those criticisms in PICO format.

The first paper discussed was the SYNTHESIS study of IV tPA vs. endovascular treatment. A criticism of the subject Population was that they were not selected for the presence of a large vessel occlusion. To the extent that some subjects did not have such an occlusion, exposing them to the risk of endovascular intervention (catheterization and, in this study, the highly unusual regional IA tPA administration) without the possibility of benefit from large vessel recanalization biases the trial in favor of conservative therapy.

An additional criticism of the Intervention, beyond the regional IA tPA treatment in some subjects, was that the endovascular subjects did not receive IV tPA even though the study population consisted exclusively of patients presenting within the 4.5 hour window. In clinical practice, of course, we always give IV tPA to those eligible; the pressing question is whether to follow that with endovascular treatment or not.

A criticism of the Outcome measure was that limiting “favorable” outcomes to modified Rankin scale scores of, in this case, 0-1, prohibits the detection of a significant clinical benefit in someone who might have gone from Rankin 5 to Rankin 3.

The second study was IMS III. This study also did not select for Patients with documented large vessel occlusions. Unlike in SYNTHESIS, subjects randomized to endovascular therapy but lacking an occlusion at angiography did not receive any further treatment. Still, having subjects in the intervention group who don’t actually receive the intervention biases the study toward the null. (Think of it this way: What would happen if no subjects had large vessel occlusions? We’d be testing IV tPA vs. IV tPA and there should be no difference between the two arms of the trial.)
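
The dilution argument can be put in rough numbers. The sketch below is only an illustration under stated assumptions: it assumes that subjects in the intervention arm who never receive the intervention fare exactly as well as the comparison arm, and the 10-point benefit among treated subjects is a hypothetical figure, not a result from IMS III. The 334-of-434 fraction reflects the supplemental data mentioned further down, in which 100 of the 434 subjects randomized to endovascular care received either no angiogram or no endovascular treatment.

```python
def diluted_difference(true_benefit_points, fraction_treated):
    """Expected intention-to-treat difference (in percentage points) when
    only a fraction of the intervention arm actually gets the intervention.

    Simplifying assumption: untreated subjects in the intervention arm do
    exactly as well as the comparison arm.
    """
    if not 0.0 <= fraction_treated <= 1.0:
        raise ValueError("fraction_treated must be between 0 and 1")
    return true_benefit_points * fraction_treated


# Hypothetical true benefit among subjects who are actually treated
hypothetical_benefit = 10.0

# 334 of 434 subjects in the endovascular arm received endovascular treatment
observed = diluted_difference(hypothetical_benefit, 334 / 434)
print(f"{observed:.1f} percentage points")  # 7.7 percentage points

# Limiting case from the thought experiment: nobody has an occlusion
print(diluted_difference(hypothetical_benefit, 0.0))  # 0.0
```

The point is only directional: the more subjects in the intervention arm who go untreated, the smaller the difference the trial can be expected to observe, whatever the true benefit among treated subjects.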

The main criticism of the Intervention was that because the trial was begun a decade ago, the devices used in the beginning were outdated by the end, and numerous protocol amendments were required to keep pace with the rapidly changing technology.

My criticism of the Intervention was that, for much of the trial, subjects in the endovascular arm only received 0.6 mg/kg of tPA, which is less than standard of care. In my view, this is ethically OK as long as you’re offering something potentially better as part of the intervention. But since, as per the supplemental data (Figure S1, page 24), 100 of the 434 subjects randomized to endovascular care received either no angiogram at all or no endovascular treatment, my concern was that those subjects had to forgo standard care and didn’t even receive the possibility of further benefit in the intervention arm. (One of the later amendments changed the IV tPA dose in the endovascular arm to the usual 0.9 mg/kg.)

The final trial was MR RESCUE. The study Population was stratified according to the presence or absence of a “penumbral” pattern on CT or MR perfusion. There was strong criticism of the operational definition of penumbral imaging for being hard to understand, even by those highly experienced in the field. It was also asserted that the maximum allowable core infarct volume for the penumbral arms, 90 cc, was too high. In actuality, the mean predicted core infarct volumes in the two penumbral arms were 36-37 cc and the maximum was 58 cc. I forgot to mention this yesterday, but the investigators did also perform a sensitivity analysis to determine if a different infarct volume cutpoint would have affected their results and concluded that it would not have.

The fireworks in this journal club were occasioned by the ending discussion of how to interpret these studies and whether they should change our practice. My assertion that, flawed as they are (and as all studies are to some extent), these trials “failed to show a benefit of endovascular therapy” was met by one enraged denial, accusation of bias against endovascular therapy, and the rebuttal that they in fact show that “the treatments are equivalent”. In the interest of civility, I didn’t push the matter too hard last night, but it bears further examination. First, here’s a quote from the SYNTHESIS paper:

The study was designed to verify or refute an absolute difference of 15 percentage points between the proportions of patients with a favorable outcome in the two treatment groups.

From IMS III:

We calculated that a sample of 900 patients would provide an effect size of 10 percentage points (the absolute difference between the endovascular-therapy and intravenous t-PA groups in the proportion of participants with a modified Rankin score of ≤2 at 90 days) . . .

From MR RESCUE (hyperlink added):

The primary study hypothesis was that the presence of substantial ischemic penumbral tissue and a small volume of predicted core infarct, as visualized on multimodal CT or MR imaging, would identify patients who were most likely to benefit from mechanical embolectomy . . .

The hypothesis was tested by analyzing whether the pretreatment penumbral pattern had a significant interaction with treatment assignment (embolectomy vs. standard medical care) as a determinant of functional outcome scores across all seven levels of the modified Rankin scale (shift in disability levels).

For secondary analyses, patients with scores of 0 to 2 on the modified Rankin scale were classified as having a good functional outcome.

For SYNTHESIS, the most rigorous way to state the primary result is that the trial failed to show a 15-percentage-point difference in the proportion of subjects achieving a Rankin score of 0-1. IMS III failed to show a 10-percentage-point difference in the proportion of subjects achieving a Rankin score of 0-2. MR RESCUE failed to show an interaction between the presence of a penumbral imaging pattern and the effect of embolectomy on the Rankin score. Secondary analyses failed to show significant differences in the proportion of subjects (whether all subjects or only the penumbral subjects) achieving Rankin scores 0-2.

I suppose it is technically correct to state that since there were no significant differences in outcomes, the treatment effects were similar. Clinically, however, that is absurd. When a new treatment becomes available, one that is expensive and invasive, it should be shown to be superior to cheaper, non-invasive treatments before it is widely adopted. Indeed, in their justification of the effect size they chose to seek in the SYNTHESIS trial, the authors state that it was based in part on “the need for a clinical effect sufficiently large to justify the switching from a well-established and simple procedure to one that is newer, more expensive, and more difficult to perform”. In this sense, all three trials have failed.

The more reasonable question is to what extent should these results alter our practice? After all, we just finished picking the studies apart. But we can also pick apart DEFUSE, DEFUSE 2, DIAS, DEDAS, EPITHET, and other studies supporting the hypothesis that sophisticated imaging selects patients well-suited to IV and endovascular treatments in expanded time windows. When those studies, some randomized but early phase and others observational, comprised the bulk of our evidence base, I think it was reasonable to be cautiously optimistic about the promise of multimodal imaging and endovascular treatments and to treat patients accordingly. For me, that meant disclosing to patients and families the still-uncertain status of such treatment approaches, but generally being quite willing to offer them.

Now that we have three (imperfect) randomized phase III trials, I think that we need to be more cautious. I wouldn’t throw out all of the foundational work noted above and use only these 3 trials to guide us, but these trials must have some effect. They must cause us to be more cautious in holding out multimodal imaging and endovascular treatment as being beneficial to our patients. We must disclose to our patients that notwithstanding our sincere hope that such treatments will ultimately prove beneficial, especially with our talented interventionalists, top-notch imaging, and latest devices, we actually don’t know for sure and indeed have recent evidence (yes, I’ll say it again) that failed to show benefit. To simply reject all three trials as flawed and not alter our practice at all would be irresponsible.

Again, what we really need to do is get involved in one of the ongoing or upcoming trials of these treatments . . .

Correction: 2/15/2013 @ 1803: I edited this post to correct an error in my analysis of IMS III. The 0.6 mg/kg tPA dose was in the intervention arm of the study, not the comparison arm as originally stated.

About Justin A. Sattin

I'm a vascular neurologist and residency program director. I created this blog in order to share some thoughts with my resident and other colleagues, and to foster my own learning as well.

3 Responses to The PICO Method and Endovascular Stroke Trials

  1. Aaron Struck PGY-3 Neurology University of Wisconsin says:

    I am glad that I had the opportunity to attend this journal club, and I wanted to present the neophyte perspective. First I was glad to see the passion that was exhibited by the residents and attendings. If anything will help the science along it is the shared enthusiasm that was evidenced during this journal club. It is always a challenge to me to take novel study results and to transfer that knowledge to how I practice or practice practicing, so this meeting was a wonderful opportunity to see experts take on that very task.

    These studies represent the highest rigor of empiric medical science, the RCT (even if the questions they ask are not ideal), and as such they have to be seriously considered despite the warts of outdated technology and imperfect protocols. One of the issues addressed both by Dr. Sattin and during the club is the assertion that if a study has a negative result, it implies equivalence of the study arms. To me this is not an accurate statement. The purpose of this kind of study is to test the null hypothesis, and if the null hypothesis is not rejected, that does not prove the converse. It simply is not rejected. If the two treatments truly differ by the amount the study was designed to detect, the chance that the study misses it is 1-Power, usually in the area of 20-30%. That means that up to 30% of the time a true difference between the arms would go undetected. A chance that large of missing a real difference does not mean two things are equivalent. To take the extreme example of a study with a power of 0% and a beta error of 100%: if you have two patients, and you randomly treat one patient and they die, and you don’t treat the other patient and they live, inferential statistics in this horribly designed experiment would not show a statistical difference. But that in no way implies the treatment is safe; intuitively you would be suspicious it wasn’t. Likewise, I don’t think you can infer from these studies that endovascular treatment is safe. That said, you also cannot say it is harmful.
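
    The two-patient example can be checked directly. What follows is a minimal sketch, not part of the original comment: it computes a two-sided Fisher's exact test by hand for a 2x2 table (the function name and table layout are illustrative choices) and shows that with one patient per arm, no outcome can reach statistical significance.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are treatment arms, columns are outcomes (e.g. died, lived).
    The p-value sums the probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x first-column events in row 1
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small tolerance so ties are not lost to floating-point rounding
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# One treated patient dies, one untreated patient lives
print(fisher_exact_two_sided(1, 0, 0, 1))  # 1.0
```

    A p-value of 1.0 here says nothing about whether the treatment is safe; the experiment simply had no ability to detect a difference.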

    I think I agree with the consensus of people in attendance. These studies are not going to abrogate endovascular therapy. Instead they change a few things for me. I would counsel patients about endovascular therapy differently, as Dr. Sattin and Dr. Niemann agreed. The results must also color the multidisciplinary discussions regarding these treatments. I hope that these studies serve as encouragement to develop further investigations to isolate the group of patients that we can help the most with these powerful and rapidly evolving imaging and surgical tools. I personally think that the advancing imaging techniques will be critical in aiding in patient selection. I am glad Dr. Rowley was able to provide expert opinion regarding the methods used for perfusion in MR RESCUE, as that tempered my conclusion of the study.

    In my mind the real meat of reducing the morbidity and mortality of the stroke epidemic is trivariate: 1) improving the infrastructure to rapidly assess and treat new stroke patients (e.g. the Berlin Stroke-bulance), 2) improving the awareness of stroke in the public at large (I don’t think anybody in the US doesn’t have Chest Pain=MI branded into their grey matter; the same can’t be said for stroke), and 3) public health means of addressing vascular risk factors. If at the same time we can broaden the scope of patients that can be treated with advanced imaging and improve the effect size of treatment with endovascular techniques, that is a huge boon.

    As an aside, I definitely learned that chest pain = heart attack by watching TV and movies as a child. Think about all those people you saw clutching their chests in pain. If only 1/2 of them could have had a gaze preference and hemiplegia, we would be way ahead of the game. Maybe our California-trained stroke neurologists could use some of their connections.

    Example of public health efficacy: If not smoking reduces your per-annum rate of stroke by about 50%, and roughly 25% of people smoke, reducing the smoking rates by 50% would get rid of about 10% of strokes. Think about that in comparison to the roughly 1% of people that get tPA and the 13% of that 1% that benefit from tPA.
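
    The arithmetic above can be checked with a short calculation. This is a sketch using the comment's own round numbers: smokers are assumed to have twice the stroke rate of non-smokers (so that not smoking halves the rate), 25% of people smoke, and the function name is just illustrative.

```python
def fraction_of_strokes_prevented(smoking_prevalence, relative_risk, prevalence_reduction):
    """Fraction of all strokes avoided if smoking prevalence falls by
    prevalence_reduction (e.g. 0.5 for a 50% drop).

    Stroke burden is measured relative to a non-smoker's rate of 1.
    """
    def burden(prevalence):
        return (1 - prevalence) * 1.0 + prevalence * relative_risk

    before = burden(smoking_prevalence)
    after = burden(smoking_prevalence * (1 - prevalence_reduction))
    return (before - after) / before


# 25% smokers, smokers at twice the stroke rate, smoking rates cut in half
prevented = fraction_of_strokes_prevented(0.25, 2.0, 0.5)
print(f"{prevented:.0%}")  # 10%

# For comparison: roughly 1% of patients get tPA, and 13% of those benefit
benefit_from_tpa = 0.01 * 0.13
print(f"{benefit_from_tpa:.2%}")  # 0.13%
```

    Under these assumptions the stroke burden falls from 1.25 to 1.125 per non-smoker-equivalent, which is the 10% figure quoted above.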

    • Justin A. Sattin says:

      Thanks, Aaron, for the very thoughtful comments. To the question of whether two treatments can be considered equivalent, I'd add that some trials are specifically designed to test for "non-inferiority". One example was the SPACE trial of carotid stenting vs. endarterectomy. This study failed to show that stenting is non-inferior. Another example is the RE-LY trial of dabigatran vs. warfarin for stroke prevention in atrial fibrillation, in which dabigatran was found to be non-inferior.

      I fully agree with your public health perspective on stroke. Dr. William Powers, a very experienced stroke expert and recent visitor to our department said to me something like, "if we took all the money we spend on unproven acute stroke interventions and paid people at each primary care visit if their BP was at goal, we'd have a much more profound effect on the burden of stroke".

  2. Dave Weisman says:

    “We must disclose to our patients that notwithstanding our sincere hope that such treatments will ultimately prove beneficial, especially with our talented interventionalists, top-notch imaging, and latest devices, we actually don’t know for sure and indeed have recent evidence (yes, I’ll say it again) that failed to show benefit.”

    With respect, Justin, we can’t leave this to the patients and their families to sort out within the context of an acute stroke. At those times they’re confused, emotional, and, unless they’re stroke neurologists, ignorant.

    The trials were negative and I can think of no reason to offer these interventions at all, end of story. It might be rational for some doctors to look at the small differences in subgroups and offer these interventions to the slim minority of patients who come in with large anterior strokes, with documented clots after tPA, and in which tPA is given < 2 hours from stroke onset AND endovasc treatment can start 90 min later. (And for basilar strokes and for non-tPA strokes.)

    Personally, I will no longer offer these interventions in situations where we know they don't work, unless there is a further clinical trial where equipoise is met.

    "To simply reject all three trials as flawed and not alter our practice at all would be irresponsible."

    Beyond irresponsible. We were operating within an ignorant framework based on shoddy "data" which really came to a bunch of anecdotes and PROACT II. We filled in the gaps in our knowledge with best guesses, some of which were biased by hope and financial imperatives. Now we have more data. To ignore it is, first off, to wallow in ignorance and would also represent a callous disregard for the honor and dignity of our profession. Which is why I say what I say above.
