A few years ago, Joe Friel reviewed on his blog a scientific publication that purported to demonstrate the benefits of something called polarized training: "Polarized training has greater impact on key endurance variables than threshold, high intensity, or high volume training" by Thomas Stöggl and Billy Sperlich. (This publication is hereafter referred to as Stöggl and Sperlich.) In today's post, I will review this publication. In so doing, I will try not to be the kind of "sadistic scientist who hurries to hunt down errors", and will instead take a common sense approach to analyzing this study, to ascertain how much a reasonable person can take away from it about how best to train. Joe Friel thinks highly of this publication and I think highly of Joe Friel, and on that basis, I will make every effort to give this publication the benefit of the doubt.
One of the most insidious sources of error in science is cognitive bias. Stephen J. Gould provides an excellent discussion of this trap in his book "The Mismeasure of Man" (which I highly recommend to scientists and non-scientists alike) but in short, as much as we scientists would like to pretend that we dispassionately consider each question with no preconception about what answer we get, very often, we have a preference for one answer over another. When that happens, it is well known that even when we are trying to be unbiased, we will be especially sensitive to any evidence that supports our preferred answer and relatively blind to evidence that contradicts it, and all of this is subconscious and thus very difficult to avoid. When I read the Joe Friel blog on this publication, I was frankly horrified; what it said about training went against everything I was doing. Thus, my preference is to find that the Stöggl and Sperlich publication is wrong. Consider this fair warning both to myself and my readers that this trap is set and waiting for me. I will proceed nonetheless, doing my best to give this publication fair consideration.
Science is hard. Exercise science is harder.
In my opinion, one of the most important sentences in Stöggl and Sperlich is the following:
"An experimental study is difficult to conduct in elite athletes because typically neither the athletes nor their coaches like to have the athletes’ training intensity, duration or frequency altered."
When considering the ideal experimental design we might like to have for a study such as this, fairness and common sense demand that we bear in mind the real world difficulties of carrying out such a study. Sometimes, we just have to be grateful for what we can get.
What Is Polarized Training?
At the most informal, intuitive level, there could be nothing simpler; polarized training means concentrating on very high intensity ("fast") and very low intensity ("slow") training, and avoiding everything in between. That said, turning this into a precise definition that can be used for hypothesis testing is not so easy. How fast is fast? How slow is slow? Is one to do no riding at speeds in between or just less than other training approaches? How much less?
Further complicating the issue are the different ways that intensity of effort ("fast" or "slow") is measured. Obviously, speed in miles per hour is not a useful measure of intensity; 25 miles per hour is an impossible speed for me but easy for a professional bicycle racer. Thus, other, more biological measures are used, such as heart rate, breathing volume, and the amount of lactic acid in the blood. Even for these, there can be person-to-person variability, so the measurement is often expressed as a percentage of the highest value that person can achieve for that variable. Perhaps the most common way of measuring exercise intensity is heart rate. Heart rate intensity zones (typically, Zone 1 through Zone 5) are determined as a percentage of the maximum heart rate one can reach, or alternatively, of the heart rate at lactate threshold. Stöggl and Sperlich, on the other hand, use either the percentage of peak oxygen uptake (%VO2peak) or levels of blood lactate to define exercise intensity. This creates two problems for me. First, I lack the resources to measure my %VO2peak or my blood lactate. Second, there is not a direct correspondence between %VO2peak, blood lactate, and heart rate, nor is it clear which of the three is the better measure of effort. The thing I found most disturbing about Joe Friel's column is the assertion that LOW intensity rides should be ridden in Zone 1 rather than Zone 2. However, it is not at all clear to me when I read Stöggl and Sperlich whether their LOW intensity exercise corresponds to Zone 1, as Joe Friel says, or to Zone 2.
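To make the "percentage of maximum" idea concrete, here is a minimal sketch of how a heart rate might be turned into a zone. The zone boundaries below are numbers I made up for illustration; they are not from Stöggl and Sperlich or from Joe Friel, and real zone systems differ. The function name is likewise hypothetical.

```python
from bisect import bisect_right

# ASSUMPTION: illustrative boundaries, as fractions of maximum heart rate.
# They are invented for this sketch and do not come from any cited source.
# Each value is the point at which the next zone (2, 3, 4, 5) begins.
ZONE_LOWER_BOUNDS = [0.60, 0.70, 0.80, 0.90]


def heart_rate_zone(heart_rate_bpm, max_heart_rate_bpm):
    """Return the zone (1-5) for a heart rate under the assumed boundaries."""
    if max_heart_rate_bpm <= 0:
        raise ValueError("maximum heart rate must be positive")
    fraction = heart_rate_bpm / max_heart_rate_bpm
    # Count how many zone boundaries the fraction has reached or passed.
    return bisect_right(ZONE_LOWER_BOUNDS, fraction) + 1


# The same 130 beats per minute lands in different zones for two riders
# whose maximum heart rates differ.
zone_for_rider_a = heart_rate_zone(130, 200)
zone_for_rider_b = heart_rate_zone(130, 160)
```

Under these assumed boundaries, 130 beats per minute is a Zone 2 effort for the first rider and a Zone 4 effort for the second, which is why intensity is expressed relative to each person's own maximum.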
Overall Study Design
The participants were members of the Austrian national cross country ski, triathlon, running, or cycling teams. Training consisted of either running, cycling, or roller skiing. In the six months before the study, all participants had been engaged in an exercise program similar to HVT (high volume training), consisting mostly of LOW (slow) workouts. However, their training also included up to two days of LT (medium fast) workouts a week, making it also similar to the THR (threshold) training plan. The normal schedule for these participants was 5 days of training per week and they had been training for 8 to 20 years. In reviewing this study, Joe Friel struck a cautionary note: the relevance of this study to you depends very much on how similar you are to these participants. If, for example, you are a beginning athlete, you might do better using a different training plan than what scored best in this study.
The study started out with 48 participants, 12 assigned to each of the four training plans. Due to dropouts, the study finished (and reports the results from) 11 participants in the HVT plan, 8 participants in the THR plan, 12 participants in the POL plan, and 10 participants in the HIIT plan. The picture at the top of the post diagrams the four training programs compared in this study.
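The dropout arithmetic is easy to check. The sketch below uses only the counts given above (12 starters per plan and the number who finished each plan); the function and variable names are my own.

```python
STARTED_PER_PLAN = 12  # 48 participants split evenly across four plans

# Number of participants whose results were reported, by plan.
finished = {"HVT": 11, "THR": 8, "POL": 12, "HIIT": 10}


def dropout_summary(started_per_plan, finished_by_plan):
    """Return {plan: (number dropped, fraction dropped)} for each plan."""
    summary = {}
    for plan, finished_count in finished_by_plan.items():
        dropped = started_per_plan - finished_count
        summary[plan] = (dropped, dropped / started_per_plan)
    return summary


summary = dropout_summary(STARTED_PER_PLAN, finished)
total_started = STARTED_PER_PLAN * len(finished)
total_finished = sum(finished.values())
total_dropped = total_started - total_finished

for plan, (dropped, fraction) in summary.items():
    print(f"{plan}: {dropped} of {STARTED_PER_PLAN} dropped ({fraction:.0%})")
```

In total, 41 of the 48 starters finished. The losses were uneven: the THR plan lost 4 of its 12 participants, while the POL plan lost none.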
According to Stöggl and Sperlich, "Five key variables have been used as a benchmark to compare athletic performance in and between endurance athletes: (i) VO2peak (ii) velocity/power output at the lactate threshold (V/PLT) (iii) work economy (iv) peak running velocity or power output (V/Ppeak); and (v) time to exhaustion (TTE)." In some of my previous posts, I have discussed the problems that can occur if what you measure is not exactly the same as what you really want to know. These problems most definitely apply here, and I will discuss them in a general way at the end of this post, but for now, let's accept these measurements and see where that leads us.
What is the point?
The conclusion of Stöggl and Sperlich is that Polarized Training is better than High Volume Training, Lactate Threshold Training, or High Intensity Interval Training. Note that this is a much broader claim than the narrow conclusion that one of the specific training plans tested in this study is better than the other three. Rather, it says that when comparing any two training protocols, all things being equal, the polarized protocol will most likely be better. Does this study support that conclusion? I think not. I think this is a well designed and executed study, and that it convincingly supports the narrow conclusion but not the broad one. The protocol they designate as POL is almost certainly the best of the four tested, but it is not at all clear whether it is best because it is polarized or for some other reason. Thus, it is not clear to me that polarized training is better in general.
What are other explanations for the results of Stöggl and Sperlich? The first two possibilities that jump to mind are:
- Training can be too hard, too easy, or just right. Unsurprisingly, the best results are obtained when training is just right. How hard a training protocol is depends on both the volume of riding (how many hours) and the intensity of riding (how fast). Both the volume and the intensity of the four training programs differ one from another. It could be that the POL training program was "just right" whereas the others were "too hard" or "too easy."
- It is generally believed that there is value in varying one's training from time to time, that if one rides the same rides over and over, after a while, one will derive little benefit. The riders in this study had spent the previous six months doing some mix of the HVT and THR protocols. One or both of these protocols could have been "more of the same", and the benefit of the POL protocol may have been that it was something different.
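The first of these alternative explanations can be illustrated with a toy calculation. Everything in this sketch is an assumption of mine: the plans, the hours, the intensity weights, and the idea of scoring "load" as hours multiplied by a weight are all invented for illustration and are not taken from Stöggl and Sperlich.

```python
# ASSUMPTION: hypothetical weights for how taxing an hour at each
# intensity is. These numbers are invented, not measured.
INTENSITY_WEIGHT = {"LOW": 1, "LT": 2, "HIGH": 3}


def weekly_volume(hours_by_intensity):
    """Total hours ridden in the week."""
    return sum(hours_by_intensity.values())


def weekly_load(hours_by_intensity):
    """Hypothetical load score: hours times intensity weight, summed."""
    return sum(
        INTENSITY_WEIGHT[level] * hours
        for level, hours in hours_by_intensity.items()
    )


# ASSUMPTION: three made-up weekly plans (hours at each intensity).
hypothetical_plans = {
    "all slow": {"LOW": 12},
    "slow plus fast": {"LOW": 8, "HIGH": 2},
    "all fast": {"HIGH": 3},
}

for name, plan in hypothetical_plans.items():
    print(f"{name}: volume {weekly_volume(plan)} h, load {weekly_load(plan)}")
```

In this made-up example, the "slow plus fast" plan differs from the other two in how its intensity is distributed, but it also carries the highest overall load. If it produced the best results, the comparison alone could not tell us which of those two differences was responsible.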
What's not the point?
These are points that Stöggl and Sperlich never claim to have demonstrated, but rather are the known and accepted limits of the study:
- This Study Does Not Apply to Everyone. We have already discussed that even if you believe this study is correct for its participants, it may not be correct for you. That said, besides being talented and experienced athletes, these athletes were at a particular point in their training cycle; they had just come off of 6 months of a program that was predominantly HVT but which also included some THR. Thus, I do not find it surprising that they got more fitness by switching from HVT rather than participating in more of the same. Similarly, even if we assume that POL is the best protocol at this point in their training cycle, would more POL be the best strategy after the 9 weeks of POL done in this study was completed, or would changing to something else be better?
- This Study Does Not Measure All Aspects of Fitness. Are all forms of fitness interchangeable? Alternatively, do the metrics used in this study cover all forms of fitness? I do not think the authors of this study would claim either of these statements to be true. Rather, they had to use some metric of fitness to compare their training protocols and selected a fairly broadly accepted and easy to measure set, hoping that they would be applicable to at least some sorts of real world fitness. Assume these metrics do a fairly good job of identifying a cyclist who will do well in a typical bicycle race. Does that mean that these same metrics would do as good a job identifying a sprinter, or a randonneur? I would not assume so.
- This Study Does Not Evaluate All Effort Levels. Context: There has been an early version of the polarized training idea kicking around the training community for some time in the form of the "Zone 3 Syndrome" (see, for example, this website). The notion is that Heart Rate Zone 3 (halfway between easy and hard) is the "grey zone"; too fast to build endurance, too slow to build speed. As usual, the discussion is more complicated than that, but I think it gets the essence. When I read Joe Friel's post, my horror came from the notion that the "grey zone" had been expanded from "Zone 3" to "Zone 2 AND Zone 3 AND Zone 4"; only training in Zone 1 and Zone 5 is worth doing. I argued above that it is not clear that the LOW intensity training in Stöggl and Sperlich is Zone 1, but out of respect for Joe Friel, let's assume it is. Even if we make that assumption, I do not think Stöggl and Sperlich supports such an extreme conclusion. If we assume low intensity training corresponds to heart rate Zone 1, that THR training corresponds to the top of Zone 4, and HIGH training corresponds to Zone 5, then this study never looks at any training protocols that contained any significant effort in Zone 2, Zone 3, or in the bottom half of Zone 4, so it cannot speak to the value of exercise at these levels.
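The last point is really just bookkeeping, and can be written out as such. The sketch below takes the mapping assumed in the text for the sake of argument (LOW as Zone 1, the LT workouts of the THR plan as the top of Zone 4, HIGH as Zone 5) and lists what is left over. Splitting Zone 4 into two halves is my own device for representing "the bottom half of Zone 4"; the names are hypothetical.

```python
# Heart rate bands, with Zone 4 split in two so that "top of Zone 4"
# and "bottom half of Zone 4" can be told apart.
ALL_BANDS = [
    "Zone 1",
    "Zone 2",
    "Zone 3",
    "Zone 4 (bottom)",
    "Zone 4 (top)",
    "Zone 5",
]

# ASSUMPTION: the mapping adopted in the text for the sake of argument.
# It is not established that the study's intensities line up this way.
ASSUMED_BAND = {
    "LOW": "Zone 1",
    "LT": "Zone 4 (top)",  # the threshold workouts of the THR plan
    "HIGH": "Zone 5",
}


def untested_bands(all_bands, band_by_intensity):
    """Return the bands that no study intensity maps onto, in order."""
    tested = set(band_by_intensity.values())
    return [band for band in all_bands if band not in tested]


gaps = untested_bands(ALL_BANDS, ASSUMED_BAND)
print(gaps)
```

Under that assumed mapping, the leftover bands are Zone 2, Zone 3, and the bottom of Zone 4, which are exactly the effort levels the study cannot speak to.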
If You Can't Say Something Nice...
I'm afraid that despite my best efforts, this post reads like the output of the sadistic scientist I was trying not to be. Can't I find something of value in this study? Yes I can. The "polarized" training program worked; it was not worthless or harmful. If you are a talented, fit cyclist who has been riding long, slow miles, a training protocol that replaces some of your long, slow rides with high intensity intervals will probably do you more good than continuing with just long, slow rides.
The second lesson I take away from this paper is probably not something the authors intended and is very specific to me; this paper has forced me to confront my cognitive bias and to reconsider the value of exercise in Zone 1 versus Zone 2. I had stuck in my head the notion that Zone 1 was just for warm up, cool down, and recovery rides. Looking back at my collection of training books, it is clear that Zone 1 (along with Zone 2) is also recommended for building endurance. Many of the comments I have received on this blog have been to the effect that I might be doing my endurance training at too high an intensity. In the past, I have resisted this suggestion, but this study has caused me to reconsider. I don't yet know what impact this will have on my training; stay tuned to find out.
I have spent a great deal of time thinking about how I would have done this study differently. In fact, I don't know if I could have made it much better. My objections to the study have much more to do with the difficulty of studying exercise than anything else. That said, let me toss out a few thoughts.
- I would have skipped the HIIT training plan; I think it is too extreme.
- I probably would have skipped the HVT training plan, not because I don't find it interesting, but because if I were limited to only four plans, HVT would not make the cut. Also, it is too similar to what the athletes were doing before the study, and thus may be doomed to fail.
- I would have modified the THR plan by replacing the two shorter LT rides with one HIGH ride and one long LOW ride. This would have made it into a more mixed plan, one that covered the range of exercise intensities and more directly challenged the theory of polarized training by asking if POL is better than THR because of the absence of Zones 2 through 4 from POL, or because of the absence of Zone 5 from THR.
- This leaves me with two more plans I can test. One of them would be just like the POL plan, but I would replace the two long LOW rides with rides in Zone 2 rather than in Zone 1. To make up for the extra load this would have put on the riders, I would reduce the length of the two shorter LOW rides. The purpose of this plan is to test the relative value of Zone 1 and Zone 2 rides.
- My fourth plan would be just like POL except that I would replace the two short LOW rides with R (recovery) days, days with no riding. This would reduce the load of this plan, a disadvantage, but I think that disadvantage would be justified to test whether short Zone 1 rides have very much value. I would want to ask this question because the medical community has suggested that, for the purposes of building health (rather than fitness), they do not.
I am not a coach, so these might be terrible ideas that no experienced coach would ever suggest, and I would listen carefully to any coach who told me as much. In fact, I would be most interested to hear a coach comment on the logic behind the details of the protocols selected by Stöggl and Sperlich. Of course, Joe Friel, an exceptionally experienced coach, did blog about this study, but he did not comment on the specifics of the exercise plans. In the absence of such objections, however, these are the plans I came up with to answer the questions I have. What do you think?