Dr. Jim Elliott and I recently had a publication accepted in Physical Therapy in which we evaluated the performance of the Tampa Scale for Kinesiophobia in a sample of people with neck pain of varying cause and duration. Here's a link to the abstract. If you're reading this, chances are you already know about the TSK and what it is meant to measure. In case you don't, in brief, it's an 11-item scale (there is also a 17-item original version) that is intended to measure fear of movement or fear of injury/reinjury. Originally it was developed and tested for use in people with chronic low back pain, but since it has seen increasing use in other MSK conditions of late, we thought it prudent to test the properties of the scale when completed by people with neck pain.
This post isn't meant to simply rehash the results of that paper. But as a short summary, we compared the performance of the TSK against the Rasch model, which, in simple terms, evaluates the degree to which a scale can be considered a linear 'ruler' of sorts, rather than simply an ordinal-level scale. Along with other approaches like Item Response Theory (IRT), Rasch analysis is considered one of the 'new' approaches to scale construction and evaluation, in contrast to traditional approaches which are drawn from Classical Test Theory (e.g. factor analysis, internal consistency, convergent validity, etc.). It has been said that Classical Test Theory is a weak approach to evaluating the properties of a scale for a variety of reasons, not the least of which is that a scale can never be 'proven' valid, only supported, and that it is actually not difficult to find evidence of adequate psychometric properties for even a half-baked scale using Classical Test approaches. Rasch and IRT are considered more rigorous approaches owing to the existence of sound mathematical models to which the data must adequately fit before we can be satisfied. Rasch also allows some deeper exploration of scale function, such as Differential Item Functioning (do the scale properties differ across clinically relevant subgroups?) and response ordering (do the response options function the way they're expected to?). In the end, we found that the 11-item version of the TSK fit surprisingly well to the Rasch model, and when using the Rasch-transformed (logit) scores, the magnitude of association with things like pain and disability changed to a potentially important degree. So, the scale appears to work in people with neck pain.
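For readers who like to see the 'ruler' idea in numbers, here is a minimal sketch of why a raw sum score and a Rasch logit score are not interchangeable. It is not the analysis from our paper: it uses the simplest dichotomous (yes/no) form of the Rasch model, whereas the TSK has four response options and would need a polytomous version, and the item difficulties below are made-up values chosen purely for illustration. The function and variable names are likewise my own.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of endorsing a yes/no item under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def expected_raw_score(theta, difficulties):
    """Expected number of endorsed items for a person located at theta (logits)."""
    return sum(rasch_probability(theta, b) for b in difficulties)

def raw_score_to_logit(raw_score, difficulties, lo=-10.0, hi=10.0, tol=1e-9):
    """Find the theta whose expected raw score equals the observed raw score.

    For the Rasch model this is the maximum-likelihood person estimate.
    Solved by bisection, since the expected score rises steadily with theta.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_raw_score(mid, difficulties) < raw_score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical difficulties (in logits) for 11 yes/no items; NOT TSK estimates.
ITEM_DIFFICULTIES = [-2.0, -1.5, -1.0, -0.5, -0.25, 0.0, 0.25, 0.5, 1.0, 1.5, 2.0]

# Raw scores of 0 and 11 have no finite estimate, so only 1..10 are converted.
logits = {r: raw_score_to_logit(r, ITEM_DIFFICULTIES) for r in range(1, 11)}

# Size (in logits) of each one-point step on the raw score.
steps = {r: logits[r + 1] - logits[r] for r in range(1, 10)}

for r in range(1, 10):
    print(f"raw {r} -> {r + 1}: {steps[r]:.2f} logits")
```

Under these assumptions, a one-point change near the bottom or top of the raw score range covers more of the underlying logit 'ruler' than a one-point change in the middle, which is one reason associations with other measures can shift once raw scores are converted to logits.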
Astute readers will notice I said 'surprisingly well' in that summary. Truth be told, going into this analysis I didn't expect the TSK to look that good. Personally, I'm not a big fan of the scale, or perhaps more appropriately, I'm not a fan of the way it's conceptualized. There are two broad reasons for this that I'll describe below, then I'll wrap it all up at the end to explain what I think it all means.
Why I'm not a big fan of the TSK, reason 1: I don't generally like opinion-based scales
At least, not the way that they're frequently used. There are all sorts of issues with opinion-based scales in my, ahem, opinion. A scale like the TSK, which does not provide a 'neither agree nor disagree' option, forces people to indicate an opinion on each statement, even though that opinion may not really be well-formed or in fact even exist. It takes a certain type of person, perhaps the more pensive ones, to be able to look at a list of statements and have a well-defined opinion about each one. This is especially problematic for people with acute problems - most are probably still trying to understand their condition and have yet to establish firm opinions. Incidentally, the results of our analysis did suggest that the TSK may not work as well for people with acute problems. But there's an additional problem with many opinion-based scales, including the TSK, and that is the response options: strongly disagree, disagree, agree, strongly agree. While ordinal scales may not necessarily have equidistant response options, they should still be logically ordered. In my mind, if one were to imagine a continuum of agreement, there would be two poles with a 'neutral' in the middle (see figure).
Agreement and disagreement are simply two valences of opinion. By definition, whether I mildly, moderately or strongly disagree, I still disagree. Same goes for agreement. Based on our analysis it appears as though most respondents guessed correctly that the 'agree' or 'disagree' options meant some level of agreement/disagreement below 'strong', but that should really be more explicit in the scale. As another consideration, I propose that it would be fairly rare for anyone to have a strong opinion on the majority of items in this scale, which might limit its ability to detect change.
But, there is another reason for skepticism with this scale, that is...
Why I'm not a big fan of the TSK, reason 2: Kinesiophobia is not that well-defined
A phobia, by definition, is an irrational fear. The DSM-IV sets out fairly stringent criteria for diagnosis of a phobia. Notably, not only are the fear and the associated extreme anxiety/panic response largely irrational, they are also generally known to be irrational by the sufferer. It's not simply a case of incorrect knowledge or misinterpreted information. People who flee in terror at the sight of a dog, for example, will tell you they realize the fear is irrational, but the response is so automatic that they have a hard time controlling it.
In reviewing the items on the TSK, I'm not convinced that they are all inherently irrational. In fact, a statement such as 'pain lets me know when to stop exercising so I don't hurt myself' would probably be quite a rational belief in a lot of conditions including chronic low back pain for which the scale was originally intended. While this may seem a bit nitpicky, I do believe it's a relevant issue considering the majority of people who would use this scale are not psychologists and might not realize that kinesiophobia is not a diagnosable condition. It's just a word.
At the end of the day though, our analysis revealed that the TSK actually functioned well. So perhaps my concerns are unfounded. Or perhaps what is really needed is simply a reconceptualization of this scale and what it is, and more importantly isn't, measuring. The Rasch analysis suggested that the response options worked well enough, so presumably respondents guessed the ordering correctly, but I still believe a change to 'strongly disagree', 'somewhat (or mildly, or slightly) disagree', 'somewhat agree' and 'strongly agree' would be more in keeping with theoretical understandings of opinions. But it's the 'what is it measuring?' question that is arguably more pressing here. In looking at the items, it is clearly measuring some general aversion to movement and activity/exercise. Whether this aversion is rational or irrational is probably irrelevant to what the scale might mean - those who score higher are more likely to also rate themselves as more disabled (naturally), probably report more pain, and are probably less likely to respond to physical rehabilitation, which generally includes movement and exercise.
I believe it's time to do away with the term 'kinesiophobia', and perhaps call the scale something more like the 'Tampa Scale for Exercise Aversion'. Further, it's probably best not to assume that everyone who scores high on this scale requires psychological intervention. A high score may simply be an indication of poor pain control or a lack of understanding of one's condition. I believe that changing the way we conceptualize this scale can remove some of the stigma and lead to novel ways to intervene.
But of course, that's just my opinion :)