From the ¹Department of Physical and Rehabilitation Medicine, Turku University Hospital and University of Turku, Finland, ²Division of PM&R, Department of Orthopedic Surgery, Stanford University, Redwood City, ³Musculoskeletal Research Laboratory, VA Palo Alto Health Care System, Department of Bioengineering, Stanford University, CA, USA, ⁴Department of Orthopedics, Turku University Hospital and University of Turku, Finland and ⁵PM&R Sports Medicine Service, Department of Orthopaedic Surgery, Stanford University School of Medicine, CA, USA
Objective: To evaluate the evidence regarding the effectiveness of conservative treatment in reducing patellofemoral pain.
Data sources: CENTRAL, MEDLINE, CINAHL, and PEDro databases.
Study selection: Adults with patellofemoral pain, randomized controlled trials only, any conservative treatment compared with placebo, sham, other conservative treatment, or no treatment. Two independent reviewers.
Data extraction: Data were extracted from the full-text of the articles, based on Cochrane Collaboration recommendations. The outcome of interest was the difference between groups regarding change in pain severity.
Data synthesis: The majority of studies were underpowered. More than 80% of the 37 trials did not show a clinically significant benefit. Clinically significant effects of different sizes were found for 7 trials (6 studies out of 7 had short follow-ups). These effects were found for: (i) pulsed electromagnetic fields combined with home exercise –33.0 (95% CI –45.2 to –20.8); (ii) hip muscle strengthening –65.0 (95% CI –87.7 to –48.3) and –32.0 (–37.0 to –27.0); (iii) weight-bearing exercise –40.0 (95% CI –49.4 to –30.6); (iv) neuromuscular facilitation combined with aerobic exercise and stretching –60.1 (95% CI –66.9 to –54.5); (v) postural stabilization –24.4 (95% CI –33.5 to –15.3); and (vi) patellar bracing –31.6 (95% CI –35.2 to –28.0).
Conclusion: There is no evidence that a single treatment modality works for all patients with patellofemoral pain. There is limited evidence that some treatment modalities may be beneficial for some subgroups of patients with patellofemoral pain.
Key words: patellofemoral pain syndrome; chondromalacia patellae; conservative treatment; comparative effectiveness research; anterior knee pain.
Accepted Oct 20, 2017; Epub ahead of print Jan 25, 2018
J Rehabil Med 2018; 50: 00–00
Correspondence address: Mikhail Saltychev, Department of Physical and Rehabilitation Medicine, Turku University Hospital and University of Turku, PO Box 52, FI-20521, Turku, Finland. E-mail: firstname.lastname@example.org
Patellofemoral pain (PFP) is a common condition that is best understood as non-specific anterior knee pain resulting from dysfunction in the mechanical forces between the patella and the femur. While there is a lack of consensus regarding the precise pathophysiology of PFP, previous studies and systematic reviews have implicated a variety of factors in increasing the risk of developing PFP. These include many biomechanical factors, such as patellar maltracking, lower extremity muscle weakness (especially of the quadriceps as well as hip abductors and external rotators), delayed activation of vastus medialis, inflexibility of the lower extremity, and foot overpronation (1–4). Numerous different treatment strategies have been suggested to address these underlying factors (5).
Nearly 100 reviews on PFP management have been published, with almost 70 being systematic, including 17 meta-analyses. Among these reviews, there is significant variability regarding the interventions studied, inclusion criteria, statistical methods, sex of the target group, and outcome measures. The most recent meta-analyses reported a potential effect of exercise to improve altered biomechanics of the knee (6) and a potential positive effect of exercise over no treatment (7, 8). There is conflicting evidence regarding the influence of taping and orthoses (8–11). Finally, some studies report a positive effect of incorporating specific interventions, namely hip muscle training, into the rehabilitation programme (12). Otherwise, the primary commonality shared among previous reviews is the acknowledgement of a lack of sufficient evidence to uphold any specific treatment paradigm in the management of PFP. Expert consensus proposes the utility of a comprehensive, multimodal approach that implements a combination of interventions targeting a patient’s individual risk factor profile. Nevertheless, there is a clear need to clarify the existing evidence for which treatment strategies, if any, have a positive and clinically significant effect on PFP. There remains the unanswered question: “Is there any evidence that any treatment method affects the severity of patellofemoral pain, and what is the magnitude of such potential effects?”
While previous reviews on the topic have focused on particular interventions, this study was dedicated to a particular outcome of interest: pain relief. In order to achieve a useful result for clinicians, the outcome was limited to a single measure: a visual analogue scale (VAS). Thus, the emphasis was on providing clinicians with information regarding whether there is evidence that any conservative treatment might decrease the severity level of PFP amongst adult patients. By employing quantitative methods, the study focused on clarifying evidence based on clinical significance.
The inclusion criteria for considering studies for this review were based on the PICO (Population, Intervention, Comparison, and Outcome) framework, as follows: Population: adults with PFP; Types of studies: randomized controlled trials (RCTs); Intervention: any conservative treatment excluding injections or equivalent; Comparison: placebo, sham, other conservative treatment, or no treatment (including education); the comparison between different forms of similar methods (e.g. different stimulations) was beyond the scope of this review; and Outcome: the difference between intervention and control groups regarding change in pain level during follow-up, measured by a visual analogue scale (0–100) or a numeric rating scale (0–10).
The following exclusion criteria were chosen: adolescents (<17 years) and elderly people (>70 years of age); history of moderate or severe osteoarthritis of knee or hip joints, acute trauma in the trunk or lower extremities, disease of the peripheral nerves (e.g. diabetic neuropathy or radiculopathy), occlusive arterial disease of the lower extremities, or other probable specific cause of pain, e.g. patellar tendinitis, pre-patellar bursitis, plica syndrome, Sinding–Larsen–Johansson syndrome, and Osgood–Schlatter disease; records in a language other than English; abstracts not available; and invasive interventions, surgery, or pharmacological therapy as the only treatment.
The Cochrane Controlled Trials Register (CENTRAL), MEDLINE (via PubMed), CINAHL and Physiotherapy Evidence (PEDro) databases were searched in March 2017. The search clauses are shown in Table I. In order to avoid missing relevant studies, the use of limits was restricted and further selection was conducted manually. References for identified articles and reviews were also checked for relevance.
Table I. Summary of search strategy
Two independent reviewers screened the titles and abstracts of articles, assessed the full-text of potentially relevant studies, and rated the methodological quality of the included trials (Fig. 1). Disagreements between reviewers were resolved by consensus or by a third reviewer.
Fig. 1. Search and data extraction flow.
The ultimate goal of the review was to evaluate the available data quantitatively. Therefore, when extracting data, more records were omitted due to inability to provide the statistical data needed for analysis. For example, a study was excluded if variance was not reported or pain severity was assessed by tools other than visual analogue or numeric rating scales. Data needed for a quantitative analysis were extracted from the included trials using a standardized form based on recommendations by the Cochrane Handbook for Systematic Reviews of Interventions 5.1.0 Edition, part 7.6.9 (13).
Methodological quality was assessed according to the Cochrane Collaboration’s domain-based evaluation framework (13). Main domains were assessed in the following sequence: (i) selection bias (randomized sequence generation and allocation concealment); (ii) performance bias (blinding of participants and personnel); (iii) detection bias (blinding of outcome assessment); (iv) attrition bias (incomplete outcome data, e.g. due to dropouts); (v) reporting bias (selective reporting); (vi) other sources of bias. The scores for each bias domain and the final score of risk of systematic bias were graded as low, high, or unclear risk.
The effect sizes of the included trials were calculated as the raw mean difference in change of pain severity between groups. Thus, the effect sizes preserved the meaningful units: VAS points. When reported as numeric rating scale points, from zero to 10, the statistics were transformed into a continuous form of VAS points from zero to 100 by multiplying by 10. A statistically significant result is not necessarily clinically significant. Thus, the estimated effect sizes were compared with a minimal clinically important difference (MCID) for pain level, set at 15 VAS points. It has been suggested previously that the MCID of pain VAS may vary from 11 to 19 points out of 100 (14, 15). In some studies, a cut-off point of 20 points or even a 30% reduction has been suggested (16). For this study, the MCID cut-off for VAS was defined as 15 points out of 100. This number has also been demarcated previously as the width of repeatability error of VAS measurement (14, 17). The sensitivity tests included setting the pre-post correlation coefficient at 0.8 and 0.6. As the results of the tests were similar, the estimates were reported with the coefficient set at 0.6. The effect sizes were accompanied by their 95% confidence limits (95% CI).
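The calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure implemented in Comprehensive Meta Analysis: the standard deviation of change is imputed from the pre and post SDs with an assumed pre-post correlation r, the standard error is unpooled, and the confidence limits use a normal approximation. The function names and the example trial are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def nrs_to_vas(value):
    """Rescale a 0-10 numeric rating scale value to 0-100 VAS points."""
    return value * 10.0


def change_sd(sd_pre, sd_post, r):
    """SD of the pre-post change, imputed from an assumed correlation r."""
    return sqrt(sd_pre**2 + sd_post**2 - 2.0 * r * sd_pre * sd_post)


def raw_mean_difference(tx, ctrl, r=0.6, level=0.95):
    """Difference in mean change (treatment minus control) with a CI.

    tx and ctrl are dicts holding n, mean_pre, sd_pre, mean_post, sd_post,
    all expressed in VAS points (0-100). Negative values favour treatment.
    """
    def change(group):
        mean = group["mean_post"] - group["mean_pre"]
        sd = change_sd(group["sd_pre"], group["sd_post"], r)
        return mean, sd

    mean_tx, sd_tx = change(tx)
    mean_ctrl, sd_ctrl = change(ctrl)
    diff = mean_tx - mean_ctrl
    # Unpooled standard error of the difference (an assumption of this sketch).
    se = sqrt(sd_tx**2 / tx["n"] + sd_ctrl**2 / ctrl["n"])
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return diff, diff - z * se, diff + z * se


# Hypothetical trial reported on a 0-10 NRS, converted to VAS points first.
treated = {"n": 30, "mean_pre": nrs_to_vas(6.0), "sd_pre": nrs_to_vas(2.0),
           "mean_post": nrs_to_vas(3.0), "sd_post": nrs_to_vas(2.0)}
control = {"n": 30, "mean_pre": nrs_to_vas(6.0), "sd_pre": nrs_to_vas(2.0),
           "mean_post": nrs_to_vas(5.0), "sd_post": nrs_to_vas(2.0)}

effect, lower, upper = raw_mean_difference(treated, control)
print(f"{effect:.1f} (95% CI {lower:.1f} to {upper:.1f}) VAS points")
```

Rerunning the last call with r=0.8 instead of the default 0.6 mirrors the sensitivity test mentioned in the text: a higher assumed correlation narrows the interval without moving the point estimate.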
The effect sizes were calculated using Comprehensive Meta Analysis (CMA), Version 3.3 (available from www.meta-analysis.com). The study power analysis was conducted using PS: Power and Sample Size Calculations, Version 3.0 (available from http://biostat.mc.vanderbilt.edu/PowerSampleSize).
Of the 296 records retrieved from databases, 169 were screened, based on their titles and abstracts, and irrelevant records were excluded by agreement between 2 independent reviewers. Subsequently, 82 records were screened based on their full-text (Fig. 1). Sixty-two studies were considered potentially suitable for data extraction, of which 37 reported the results in such a form and breadth that they were considered sufficient for calculating effect sizes as planned. Within the samples, the sizes of treated groups varied from 7 to 111 (Table II). The majority of studies were underpowered, below the critical size of a sample calculated based on the 15-point cut-off for the change in pain VAS (29 participants in each group for a power of 0.80). In fact, only a few studies have been conducted on samples that were sufficient to identify the difference between changes of 15 points on the VAS (18–20). The dropout rates were generally low. Most studies were conducted on persons younger than 30 years. Only a few studies reported effect sizes. Thus, for the sake of conformity, effect sizes were calculated for all of the studies using the same procedure. The calculated effects of most of the studies appeared to be clinically (and mostly statistically) insignificant (Table III). The risk of systematic bias was considered high in 13 studies (Table III). Information regarding the analysis of methodological quality of the studies is available in more detail on request from a corresponding author.
Table II. Descriptive characteristics of the studies included in the quantitative analysis
Table III. Effect sizes of the studies included in the quantitative analysis. Clinically significant results are in bold
The trials were roughly categorized according to the intervention studied. Kinesio taping was compared with exercise, or placebo, or no treatment in 4 studies (21–24). Two trials compared open kinetic chain exercises with closed ones (25, 26). Five studies assessed the effectiveness of different electrical methods (27–31). Seven trials evaluated the effectiveness of bracing and insoles (19, 20, 32–36). The additional value of hip strengthening was assessed by 10 studies (18, 37–45). Two trials studied the effectiveness of weight-bearing exercise (46, 47). Diverse exercise programmes were evaluated by 3 studies (48–50). In addition, single studies assessed the effectiveness of biofeedback (51), ischaemic compression to trigger points (52), risk factor modifications (53), and postural stabilization (54). In total, the effect sizes of only 7 studies were clinically significant.
In a 12-month follow-up, the study by Servodio et al. conducted on 14 young cases and 17 controls demonstrated a superiority of pulsed electromagnetic fields when combined with a home exercise programme over home exercise alone, showing an effect size of –33.0 (95% CI –45.2 to –20.8) points (31).
In a sample of 50 cases and 50 controls (approximately 30 years old), the study by Timm et al. demonstrated a clinically significant effect of –31.6 (95% CI –35.2 to –28.0) points of patellar bracing vs no treatment after 4 weeks (20).
The effect size of the study by Khayambashi et al. (45) was –65.0 (95% CI –87.7 to –48.3) points, displaying the advantage of hip muscle strengthening over no treatment at the end of a 6-week programme in a small sample of women (14 cases and 14 controls). Similarly, the trial by Fukuda et al. (39) found an effect size of –32.0 (95% CI –37.0 to –27.0) points when comparing hip and knee muscle strengthening with knee exercises alone amongst young women (28 cases and 26 controls).
In the sample of 30 male participants (15 cases and 15 controls), the study by Herrington & Al-Sherhi achieved an effect size of –40.0 (95% CI –49.4 to –30.6) points when comparing 6-week weight-bearing exercise with no treatment. On the other hand, the same study did not demonstrate this effect when a weight-bearing exercise programme was compared with non-weight-bearing exercise (46).
The study by Moyano et al. showed a clinically significant effect of both proprioceptive neuromuscular facilitation combined with aerobic exercise and stretching over education: –60.1 (95% CI –66.9 to –54.5) and –26.7 (95% CI –33.1 to –20.35) points, respectively. The duration of both programmes was 16 weeks and the outcomes were assessed at the end of the programme (50).
Finally, a postural stabilization exercise demonstrated a clinically significant effect of –24.4 (95% CI –33.5 to –15.3) points in a 3-month follow-up in the study by Yilmaz Yelvar et al. amongst women (22 cases and 20 controls) (54).
Of the studies with clinically significant effect sizes, 4 were considered to have a low risk of bias (31, 39, 46, 50). Another 3 studies were considered to have a high risk of systematic bias (20, 45, 54).
In this systematic review, the effect sizes were calculated from the data extracted for 37 randomized controlled trials and the results were interpreted from the point of view of the clinical significance of effects. Of the 37 trials, 30 did not show a clinically significant result, understood as a decrease in pain severity of more than 15 VAS points. Studies conducted on relatively small samples reported clinically significant effects of: (i) pulsed electromagnetic fields combined with home exercise; (ii) hip muscle strengthening; (iii) weight-bearing exercise; (iv) neuromuscular facilitation combined with aerobic exercise and stretching; and (v) postural stabilization. One larger study with high risk of systematic bias demonstrated a clinically significant effect of patellar bracing. The fact that more than 80% of the 37 trials did not show a clinically significant benefit, combined with the relatively small sample sizes used in the majority of the 7 studies that did show a benefit, makes it difficult to provide strong clinical recommendations.
A weakness of this study lies in the fact, which is a weakness of the PFP research field in general, that there is still no common agreement on the definition of PFP. Thus, the practical value of the results may be substantially affected by the diversity of inclusion criteria across the identified trials. No meta-analysis was conducted. The included trials were so diverse in their populations, settings, and interventions, and their overall risk of systematic bias was so high that potential meta-synthesis was considered inappropriate (13). The review was limited to only 1 outcome (reduction in pain severity) measured by only one type of measure: VAS or NRS. However, an attempt was made to produce as comprehensive a view on the topic as possible. In addition, the results of the review were based on quantitative analysis of the effect sizes of trials calculated on a meaningful scale.
The results are consistent with previous reviews, in that there is a lack of strong evidence on the effectiveness of different approaches to deal with PFP. Most of the studies conducted so far have had sample sizes insufficient to detect a clinically significant reduction in PFP. Among the studies included in this review, 7 demonstrated a clinically significant effect; only 2 of them were conducted on a sample of sufficient size (20, 50). One of these 2 was considered to have a high risk of systematic bias (20).
Except for the single study by Moyano et al. (50), all the other trials that reported clinically significant results were conducted on specific subgroups of the population of people with PFP: amongst men (46) or women (39, 45, 54) exclusively, or amongst very young adults with a predominance of women (31). Thus, the possibility of a particular treatment being effective for specific subgroups of patients with PFP should be taken into account.
The concept of clinical significance has often been neglected in favour of the statistical significance of results. Indeed, as shown in Fig. 2 and Table III, several of the included studies demonstrated statistical significance (upper confidence limit to the left of zero) but not clinical significance (upper confidence limit to the left of the MCID of 15 points on the VAS).
Fig. 2. Forest plot of effect sizes across the included studies. Mean values and 95% confidence intervals. Solid vertical line (zero-line): level of statistical significance. Vertical dashed line: limit of clinical significance (visual analogue scale 15 points) of effect that favours the intervention. Effect sizes or their confidence intervals containing the delineated area are clinically insignificant.
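The distinction between the two reference lines in Fig. 2 can be expressed as a simple rule on the upper confidence limit. The sketch below is illustrative only: the function name and category labels are hypothetical, and it assumes, as in this review, that negative effect sizes favour the intervention and that the MCID is 15 VAS points.

```python
MCID = 15.0  # VAS points out of 100, the cut-off defined in this review


def classify(lower, upper, mcid=MCID):
    """Classify a 95% CI for an effect where negative values favour treatment.

    Only benefit is assessed: an interval lying entirely above zero
    (favouring the control) is reported as "not significant" here.
    """
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    if upper < -mcid:
        return "clinically significant"
    if upper < 0.0:
        return "statistically significant only"
    return "not significant"


# Confidence limits as reported in the Results for two included trials.
print(classify(-33.5, -15.3))   # postural stabilization (54)
print(classify(-35.2, -28.0))   # patellar bracing (20)
# Hypothetical interval lying between the zero line and the MCID line.
print(classify(-29.1, -10.9))
# Hypothetical interval crossing the zero line.
print(classify(-12.0, 3.0))
```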
Further controlled trials, conducted on sufficiently large samples, are needed to shed light on this topic. Future studies should examine whether subgroups of patients with PFP with different characteristics might benefit differently from particular treatments. As mentioned above, the major problem with PFP is the lack of a definitive description. Since PFP is a multifactorial syndrome, there may be an effective treatment for some aetiologies, but the same treatment may not be effective for others. This could lead to a situation in which trials fail to match an intervention to the risk factors for PFP existing within the sample. Thus, a null result could be observed if a proportion of a sample treated with an intervention did not have the associated risk factor needed for the treatment to be effective. For example, the effect of hip strengthening may be observed as a null effect if a substantial part of the sample does not have underlying hip weakness, or hip strengthening may work for the young female with poor neuromuscular control, whereas stretching may be the better choice for the older male with tight soft tissues. In other words, when the entire sample is analysed “as a whole”, the possible effect of treatment for a specific patient group may be “washed out”.
This study focused only on conservative treatment of PFP. Thus, it says nothing about the effectiveness of invasive methods, e.g. surgery or injections. The lack of evidence on the effectiveness of conservative methods does not mean that an invasive approach should be preferred. It is likely that the situation regarding missing evidence on effectiveness will also be the same for invasive treatment. A comprehensive systematic review on the effectiveness of surgery among patients with PFP is urgently needed. There are also no exact data on the role of early osteoarthritis in PFP (55). It has been reported that these 2 conditions are correlated, but the causality remains unclear and needs to be investigated.
Only a few studies have employed a placebo as a control intervention. For example, Herrington & Al-Sherhi reported a difference in treatment effects between weight-bearing exercise and no exercise at all, but no difference between weight-bearing and non-weight-bearing exercise (46). This leads to speculation that “it does not matter what treatment you give as long as you do something”.
Future studies should use a sufficient study power (at least 0.8) and the results should be tested for a confidence interval that exceeds the MCID of 15 VAS points rather than looking for a statistical difference between groups. Our calculations show that a study requires at least 29 experimental subjects and 29 control subjects to be able to reject the null hypothesis that the population means of the experimental and control groups are equal with probability (power) 0.8 if the true difference in the experimental and control means is 15. The calculations were based on the assumption that the standard deviation (SD) is around 20 VAS points, and the type I error probability was set at 0.05. In practice, many of the studies included in this review demonstrated an SD greater than 20.0 points. With an SD set at 25.0 points, a study would require 44 subjects per group in order to detect a difference at the level of clinical significance.
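The order of magnitude of these sample sizes can be checked with the standard normal-approximation formula for comparing two means, n per group = 2((z for alpha/2 + z for power) × SD / difference)². The sketch below is an approximation and not the calculation performed by the PS software; the function name is hypothetical. With an SD of 20 it gives 28 per group, one fewer than the 29 reported above, a gap of the size that can arise when a t-distribution-based calculation is replaced by the normal approximation; with an SD of 25 it gives 44.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(sd, difference, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided two-sample comparison.

    Uses the normal approximation with equal group sizes and a common SD.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2.0 * ((z_alpha + z_beta) * sd / difference) ** 2)


# MCID of 15 VAS points, with the two SD assumptions discussed in the text.
for sd in (20.0, 25.0):
    print(sd, n_per_group(sd, 15.0))
```

The required sample grows with the square of the SD, which is why a modest increase from 20 to 25 points raises the requirement by more than half.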
The message to clinicians from this review is that there is so far no evidence that a single treatment modality works for all patients with PFP. There is limited evidence that some treatment modalities may be beneficial for some subgroups of patients with PFP.
Registration. PROSPERO International prospective register of systematic reviews ID=CRD42014013828.
The authors have no conflicts of interest to declare.