VOLUME 14 NUMBER 2 PAGES 117–240 April 2009
Editors
International Advisory Board
Ann Moore PhD, GradDipPhys, FCSP, CertEd, FMACP Clinical Research Centre for Health Professions University of Brighton Aldro Building, 49 Darley Road Eastbourne BN20 7UR, UK Gwendolen Jull PhD, MPhty, Grad Dip ManTher, FACP Department of Physiotherapy University of Queensland Brisbane QLD 4072, Australia
K. Bennell (Victoria, Australia) K. Burton (Huddersfield, UK) B. Carstensen (Frederiksberg, Denmark) M. Coppieters (Queensland, Australia) E. Cruz (Setubal, Portugal) L. Danneels (Maríakerke, Belgium) S. Durrell (London, UK) S. Edmondston (Perth, Australia) J. Endresen (Flaktvei, Norway) L. Exelby (Biggleswade, UK) D. Falla (Aalborg, Denmark) J. Greening (London, UK) C. J. Groen (Utrecht, The Netherlands) A. Gross (Hamilton, Canada) T. Hall (West Leederville, Australia) W. Hing (Auckland, New Zealand) M. Jones (Adelaide, Australia) S. King (Glamorgan, UK) B.W. Koes (Amsterdam, The Netherlands) J. Langendoen (Kempten, Germany) D. Lawrence (Davenport, IA, USA) D. Lee (Delta, Canada) R. Lee (London, UK) C. Liebenson (Los Angeles, CA, USA) L. Maffey-Ward (Calgary, Canada) E. Maheu (Quebec, Canada) C. McCarthy (Coventry, UK) J. McConnell (Northbridge, Australia) S. Mercer (Queensland, Australia) D. Newham (London, UK) J. Ng (Hung Hom, Hong Kong) S. O’Leary (Queensland, Australia) L. Ombregt (Kanegem-Tielt, Belgium) N. Osbourne (Bournemouth, UK) M. Paatelma (Jyvaskyla, Finland) N. Petty (Eastbourne, UK) A. Pool-Goudzwaard (The Netherlands) M. Pope (Aberdeen, UK) G. Rankin (London, UK) E. Rasmussen Barr (Stockholm, Sweden) D. Reid (Auckland, New Zealand) A. Rushton (Birmingham, UK) C. Shacklady (Manchester, UK) M. Shacklock (Adelaide, Australia) D. Shirley (Lidcombe, Australia) W. Smeets (Tongeren, Belgium) C. Snijders (Rotterdam, The Netherlands) R. Soames (Dundee, UK) P. Spencer (Barnstaple, UK) M. Sterling (St Lucia, Australia) P. Tehan (Victoria, Australia) M. Testa (Alassio, Italy) M. Uys (Tygerberg, South Africa) P. van der Wurff (Doorn, The Netherlands) P. van Roy (Brussels, Belgium) B.Vicenzino (St Lucia, Australia) H.J.M. Von Piekartz (Wierden, The Netherlands) M. Wessely (Paris, France) A. Wright (Perth, Australia) M. Zusman (Mount Lawley, Australia)
Associate Editors Darren A. Rivett PhD, MAppSc, (ManipPhty) GradDipManTher, BAppSc (Phty) Discipline of Physiotherapy Faculty of Health The University of Newcastle Callaghan, NSW 2308, Australia E-mail:
[email protected] Deborah Falla PhD, BPhty(Hons) Department of Health Science and Technology Aalborg University Fredrik BajersVej 7, D-3 DK-9220 Aalborg Denmark Email:deborahfvhst.aau.dk Tim McClune D.O. Spinal Research Unit. University of Huddersfield 30 Queen Street Huddersfield HD12SP, UK E-mail:
[email protected] Editorial Committee Timothy W Flynn PhD, PT, OCS, FAAOMPT RHSHP-Department of Physical Therapy Regis University Denver, CO 80221-1099 USA Email:
[email protected] Masterclass Editor Karen Beeton PhD, MPhty, BSc(Hons), MCSP MACP ex officio member Associate Head of School (Professional Development) School of Health and Emergency Professions University of Hertfordshire College Lane Hatfield AL10 9AB, UK E-mail:
[email protected] Case reports & Professional Issues Editor Jeffrey D. Boyling MSc, BPhty, GradDipAdvManTher, MCSP, MErgS Jeffrey Boyling Associates Broadway Chambers Hammersmith Broadway London W6 7AF, UK E-mail:
[email protected] Book Review Editor Raymond Swinkels MSc, PT, MT Ulenpas 80 5655 JD Eindoven The Netherlands E-mail:
[email protected] Visit the journal website at http://www.elsevier.com/math doi:10.1016/S1356-689X(09)00009-5
Available online at www.sciencedirect.com
Manual Therapy 14 (2009) 117–118 www.elsevier.com/math
Editorial
Bring back the biopsychosocial model for neck pain disorders
1356-689X/$ - see front matter © 2009 Published by Elsevier Ltd. doi:10.1016/j.math.2009.01.004

The traditional pathoanatomical (biomedical) approach to the diagnosis of neck pain disorders is widely acknowledged as inadequate. It is well recognised that in the vast majority of individuals no pathology can be imaged which can reliably account for symptoms. Equally, pain is commonly the patient’s presenting complaint and certainly its dimensions cannot be imaged with conventional radiological techniques. The biopsychosocial model was introduced as a diagnostic and management paradigm to recognise, correctly, the multidimensional nature of pain. The model retained the biomedical aspect and added the role that psychological and social factors could contribute to pain perception and activity limitation.

There is no argument about the multidimensional nature of neck pain. However, in the absence of ‘red flags’ or demonstrable pathoanatomy, it now appears to be becoming commonly accepted, often without supporting data, that psychosocial features are the strongest drivers of neck pain, especially in compensable cases. The absence of imageable pathoanatomy seems to be commonly and incorrectly interpreted as the lack of any injury or biological event. The consequence is that the management advocated for these individuals then focuses on psychosocial aspects with little or no regard to biological features. While the pendulum was held at the far left in the pathoanatomical (biomedical) model, it now seems to have swung to the far right in a psychosocial model for both acute and persistent neck pain. Our question is: where has the middle ground of the biopsychosocial model gone?

There is undoubtedly an association between psychological features and neck pain, but this association is often not as strong as is commonly believed. For example, Kyhlback et al. (2002) found that baseline psychosocial factors of gender, age and self efficacy accounted for only 24% of the variance in pain and 36% of the variance in disability 12 months following whiplash injury. Similarly, a model proposed by Young Casey et al. (2008), which included standardised psychosocial measures of cumulative exposure to trauma, baseline depression, pain and pain beliefs in people with acute neck and back pain, accounted for only 28% of the variation in pain intensity and 58% of the variance in disability 3 months later. Despite these relatively weak relationships, it is almost considered a fact that psychosocial factors play a stronger role than physical ones in the presentation and development of neck pain, although few studies have actually included physical or biological factors in their analysis.

The magnitude of the contribution of psychosocial features can only be evaluated, and perhaps interpreted appropriately, with an understanding of concomitant biological features. This is clearly in evidence in studies which have measured both psychosocial and biological features simultaneously. Data from such studies demonstrate that some biological (physical) features are stronger predictors of pain and disability than psychosocial factors, or at least contribute significantly to predictive models that include measures of both biological and psychosocial substrates. In a cross-sectional study of office workers with and without neck pain, Johnston et al. (in press) demonstrated that when psychosocial domains, individual factors, task demands, quantitative sensory measures and measures of motor function were considered concomitantly, sensory and motor impairments had stronger influences on pain and disability than workplace and psychosocial features. In the case of whiplash, sensory disturbances, in particular increased cold sensitivity, are predictive of poor functional recovery (Kasch et al., 2005; Sterling et al., 2005). Adding the physical variables of movement loss and sensory disturbances to a predictive model comprising psychosocial factors almost doubled the percentage of successful classification of individuals with persistent symptoms 12 months post whiplash injury (Sterling et al., 2005).

We support the biopsychosocial model.
We advocate that future research addresses biological (physical), psychological and social features concurrently to more fully understand the interactions between these features in a pain state and in recovery. Studies must prestate hypotheses to test whether nominated biological or psychological features are mediators or moderators of the presenting pain and functional states to better understand their respective roles. Importantly the
clinician’s assessments of individuals with neck pain must follow a similar approach and individual patients should not be fitted to a predetermined management approach. Current evidence for the management of neck pain disorders does not support any single line of management, whether biologically or psychologically based. Rather, the evidence supports multimodal approaches, and a clearer understanding of the interactions between biological, psychological and social features of various neck pain disorders will inform better management, which is the aim of the biopsychosocial model.
References
Johnston V, Jimmieson NL, Jull G, Souvlis T. Contribution of individual, workplace, psychosocial and physiological factors to neck pain in female office workers. European Journal of Pain, in press.
Kasch H, Qerama E, Bach F, Jensen T. Reduced cold pressor pain tolerance in non-recovered whiplash patients: a 1 year prospective study. European Journal of Pain 2005;9:561–9.
Kyhlback M, Thierfelder T, Soderlund A. Prognostic factors in whiplash associated disorders. International Journal of Rehabilitation 2002;25:181–7.
Sterling M, Jull G, Vicenzino B, Kenardy J, Darnell R. Physical and psychological factors predict outcome following whiplash injury. Pain 2005;114:141–8.
Young Casey C, Greenberg M, Nicassio P, Harpin R, Hubbard D. Transition from acute to chronic pain and disability: a model including cognitive, affective and trauma factors. Pain 2008;134:69–79.
Gwendolen Jull NHMRC Centre of Spinal Pain Injury and Health School of Health and Rehabilitation Sciences, The University of Queensland, St Lucia, Qld 4072, Australia Michele Sterling NHMRC Centre of Spinal Pain Injury and Health School of Health and Rehabilitation Sciences, The University of Queensland, St Lucia, Qld 4072, Australia Centre of National Research on Disability and Rehabilitation, The University of Queensland, St Lucia, Qld 4072, Australia
Manual Therapy 14 (2009) 119–130 www.elsevier.com/math
Systematic Review
The validity and accuracy of clinical tests used to detect labral pathology of the shoulder – A systematic review
Wendy Munro*, Raymond Healy
University of Salford, Salford, Greater Manchester M6 6PU, UK
Received 16 July 2007; received in revised form 8 August 2008; accepted 27 August 2008
Abstract
Labral tears frequently require repair [Kim S, Ha K, Han K. Biceps Load test: a clinical test for superior labrum anterior and posterior lesions in shoulders with recurrent anterior dislocations. The American Journal of Sports Medicine 1999;27(3):300–3]. Physiotherapists need confidence in clinical tests used to detect labral pathology to accurately identify this condition. This review systematically evaluates the evidence for the accuracy of these tests with reference to study quality and key biases. Cochrane, Medline, Cinahl, AMED, DARE and HTA databases were searched to identify 15 studies evaluating 15 clinical tests for labral pathology against Magnetic Resonance Imaging (MRI) or surgery. Two independent reviewers assessed methodological quality using Quality Assessment of Diagnostic Accuracy Studies (QUADAS). Meta Disc calculated likelihood ratios (positive LR > 10, providing convincing diagnostic evidence of ruling a condition in; negative LR < 0.2, providing large to moderate evidence of ruling the condition out) and true positive rates (TPRs) against false positive rates (FPRs) in receiver operator characteristic (ROC) plots and summary receiver operator curves (SROCs). Probable overestimation of accuracy was caused by use of case control design, verification bias and use of a lesser reference standard. Six accurate tests, Biceps Load I (+LR: 29.09; −LR: 0.09), Biceps Load II (+LR: 26.32; −LR: 0.11), Internal Rotation Resistance (IRRT) (+LR: 24.77; −LR: 0.12), Crank (+LR: 13.59 and 6.46; −LR: 0.1 and 0.22), Kim (+LR: 12.62; −LR: 0.21) and Jerk (+LR: 34.71; −LR: 0.27), were identified from high quality single studies in selected populations. Subgroup analysis identified varying results of accuracy in the Crank test and the Active Compression (AC) test when evaluated in more than one study. Further evaluation is needed before these tests can be used with confidence.
© 2008 Elsevier Ltd. All rights reserved.
Keywords: Labral pathology; Screening; Sensitivity and specificity; Likelihood ratios
* Corresponding author. Directorate of Physiotherapy, Mary Seacole Building, Frederick Road Campus, University of Salford, Salford, Greater Manchester M6 6PU, UK. Tel.: +44 0161 295 2502; fax: +44 0161 295 2432. E-mail address: [email protected] (W. Munro).
1356-689X/$ - see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.math.2008.08.008

1. Introduction
Assessment and diagnosis have become an increasingly important aspect of the physiotherapist’s role in clinical specialist and extended scope roles. Differential diagnosis of the shoulder is a problematic area, with no standardised definitions, and with diagnostic criteria for defining disorders being inconsistent and unreliable (Green et al., 2003). Hanchard et al. (2004) advocate an evidence based conservative management approach which does not differentiate between subacromial impingement syndrome (SIS), posterior superior glenoid impingement (PSGI) and superior labral anterior posterior (SLAP) lesions, suggesting that such clear cut diagnosis is unnecessary. However, the presence of signs possibly indicating glenoid labral damage, e.g. pain on overhead activities, deep shoulder pain, painful catching and popping or clicking (Musgrave and Rodosky, 2001), should lead the clinician to consider further management outside the scope of physiotherapy such as arthroscopy or surgery. Symptoms of labral pathology can make it difficult to differentiate from other shoulder pathologies such as impingement and
acromio-clavicular joint arthritis (Musgrave and Rodosky, 2001). Knowledge of the tests available to assist in the differentiation of this diagnosis, the validity of these tests and the skills to perform them are therefore required. Physical examination has been described as more of an art than a science although carefully planned diagnostic test accuracy studies will provide more of a science to this art (Reider, 2004). Although SLAP lesions commonly occur in the young active overhead athlete (Andrews et al., 1985) and following a compressive or distraction force on the shoulder (Andrews et al., 1985; Snyder et al., 1990; Maffet et al., 1995), labral pathology may result from a sudden fall onto the outstretched hand or elbow with the shoulder in a somewhat adducted and extended position. This can lead to secondary symptoms of impingement caused by superior translation of the humeral head (Kumar et al., 1989; Altchek et al., 1992; Schmitz, 1999). Hasan (2006) has suggested the superior labrum to have a more meniscoid attachment to the glenoid than the rest of the labrum, making it susceptible to degenerative as well as traumatic lesions. Tests for labral pathology therefore need to be accurate in both general and athletic population settings in a wide age group of patients. Liume et al. (2004) and Jones and Galluch (2007) have systematically reviewed studies relating to clinical tests for instability and labral lesions and superior glenoid labral lesions respectively. Liume et al. (2004) reviewed 17 studies evaluating clinical tests for shoulder instability or labral lesion suggesting the Relocation test and the Anterior Release test to be most clinically relevant in diagnosing instability, and the Biceps Load tests I and II, the Pain provocation test and the Internal Rotation Resistance test (IRRT) to be most promising for labral tears. 
Jones and Galluch (2007) reviewed 12 studies and concluded that SLAP specific physical examination results cannot be used alone to diagnose SLAP lesions. This review, including additional studies, focuses on studies evaluating tests for labral pathology and adds to the previous literature with a thorough quality assessment of the included studies using Quality Assessment of Diagnostic Accuracy Studies (QUADAS), receiver operating characteristic and forest plots. Previous studies have either used QUADAS only, or levels of evidence to control for study quality. Subgroup analysis is carried out on single tests evaluated in different studies.

2. Methods
2.1. Search strategy
Publications were identified by searching the following databases: Cochrane (1995–2007), Medline (1996–June 2007), Cinahl (1982–June 2007), AMED (1985–June 2007), Health Technology Assessment (1995–June 2007) and the Database of Abstracts of Reviews of Effectiveness (1995–June 2007). A combination of MeSH terms (exp ‘sensitivity and specificity’/, exp shoulder joint/, exp joint instability/, exp shoulder injuries/, exp shoulder pain/) and text words (specificity, false negative, accuracy, screening, labral pathology, SLAP lesions, SLAP,
glenoid labrum, instability and individual test names) based on Devillé et al.’s (2000) optimal search strategy were used. The search was limited to articles in the English language.

2.2. Inclusion and exclusion criteria
The titles of the articles were screened and filtered and the abstracts of the filtered articles were screened by one reviewer (WM) for fulfilment of the inclusion and exclusion criteria. Inclusion criteria were: cohort and case control design, shoulder pain, clinical examination tests used to evaluate labral pathology, comparison against a reference standard, and inclusion of sensitivity and specificity values. Exclusions were: other pathologies leading to shoulder pain (e.g. referred from spine or internal organs, cerebrovascular accident (CVA)) and studies omitting values of either sensitivity or specificity. Where the first reviewer was uncertain whether a study should be included, a second reviewer (RH) was consulted and a decision made by consensus. To ensure completeness of the literature search, the references of the included studies were hand searched for further references and a citation search was carried out. No further studies were identified.

2.3. Data extraction and quality assessment
A standardised extraction form was piloted and then used independently by two reviewers (WM and RH) to maintain quality and objectivity (Deeks and Morris, 1996). Any disagreements were decided by consensus. Quality assessment was carried out on all studies which met the inclusion and exclusion criteria using the QUADAS tool (Whiting et al., 2003). This ensured that all studies were evaluated for individual quality items rather than being given a quality score, as advocated by Whiting et al. (2005). The QUADAS tool has been developed based on expert consensus and empirical evidence.
It has been shown to have varied reliability, with agreement on individual checklist items of 90% (Whiting et al., 2006) 76% (Davis et al., 2007) and 78% (Hollingworth et al., 2006) and kappa scores of 0.65 (Whiting et al., 2006), 0.39 (Davis et al., 2007) and 0.22 (Hollingworth et al., 2006) demonstrating good, fair and fair inter-rater reliability respectively (Altman, 1999). Differences appear to be down to numbers of reviewers, working proximity of the reviewers and experience in diagnostic accuracy systematic reviews (Hollingworth et al., 2006). The QUADAS tool has gained positive feedback in a pilot study by twenty reviewers, with eighteen considering the tool to cover all important items (Whiting et al., 2006). The tool includes questions relevant to spectrum bias, selection bias, disease progression bias, verification bias, incorporation bias, execution of the index and reference test, index test and reference standard test review bias, uninterpretable tests and withdrawals from the studies. These terms are explained in Table 1. Each item was scored yes, no or unclear according to the scoring guidelines of the tool (Whiting et al., 2003). The meaning of the questions relevant to the clinical applicability was discussed and agreed by the reviewers prior to use of the tool. For the purpose of the review, it was assumed that
Table 1
Glossary (QUADAS tool).
Spectrum bias: An appropriate sample of patients is considered to have a range of mild to severe, treated and untreated disease and different but commonly confused disorders. This would rule out the chance of spectrum bias occurring by minimising the prevalence of the condition for which the index test is testing.
Selection bias: The ideal study sample is a consecutive series of randomly selected patients. Selection bias may occur if patients are selected in a non-random manner e.g. only those who are having surgery.
Disease progression bias: The index and reference test should be carried out at the same time to avoid bias due to the progression of the disease.
Verification bias: This occurs when not all those who have the index test have the reference standard (partial verification bias) or when the index tests are verified by different reference standards (differential verification bias).
Incorporation bias: The index and reference tests are required to have an independent result. Bias occurs when the index test is used as part of the reference standard.
Execution of the tests: The test details are required to be sufficient to be able to perform the tests again.
Index test and reference standard review bias: This occurs when the examiner is not blinded to the result of either the index or the reference standard when interpreting the results of the other test.
Uninterpretable test results: Lack of inclusion of uncertain test results can bias the assessment of the test characteristics.
Withdrawals: Lack of reporting of withdrawals from the study can introduce bias.
in prospective studies the index test would be interpreted without knowledge of the reference standard. Retrospective studies were marked as unclear unless it was stated that the diagnosis was given prior to the interpretation of the reference standard. An appropriate spectrum of patients was considered to be a sample consisting of a wide age range of male and female patients with a range of conditions.

2.4. Data analysis
The objectives of the data analysis were guided by the recommendations of the Cochrane Methods Group on Systematic Review of Screening and Diagnostic Tests (1996):
To identify the number, quality and scope of primary studies;
To provide an overall summary of the diagnostic accuracy of tests studied;
To compare different tests in terms of their accuracy;
To determine if accuracy estimates depended on the study quality;
To determine whether accuracy varied in subgroups according to patient and test characteristics;
To determine further areas for research.
All studies with raw data available which were evaluated using QUADAS were included in the data analysis in order to
assess the effect of study quality on the accuracy of the tests. The raw data included the numbers of true positives (disease present and positive index test), false negatives (disease present and negative index test), false positives (disease absent with positive index test) and true negatives (disease absent with negative index test). These values were extracted from the individual studies and presented in 2 × 2 contingency tables and inputted into Meta Disc (Zamora et al., 2006) to provide results of sensitivity, specificity, and positive and negative Likelihood ratios (LR) with their confidence intervals.

2.5. Sensitivity, specificity and LR
Sensitivity is the proportion of patients with the pathology correctly identified by a positive index test (Peat et al., 2002), calculated using the formula TP/(TP + FN). Specificity is the proportion of patients without the pathology correctly diagnosed by a negative index test (Peat et al., 2002). This is calculated by TN/(FP + TN). LR are particularly relevant to clinicians, with the positive LR corresponding to the clinical concept of ruling in a condition and the negative LR to ruling out the condition. The advantages of the LR are that they do not vary as the underlying probability of the disease varies and they provide richer information to the clinician (Ebell, 1998). Positive LR are calculated using the formula sensitivity/(1 − specificity) and negative LR are calculated using (1 − sensitivity)/specificity. Where the raw data were unavailable to calculate the above results, the studies were excluded from data analysis (Guanche and Jones, 2003; Myers et al., 2005; Nakagawa et al., 2005; Parentis et al., 2006). Values of sensitivity, specificity and positive and negative LR from these studies are reported in Table 4 where available. The sensitivity and specificity values for each test in each study were evaluated in relation to each other in a receiver operating characteristic (ROC) plot.
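The formulas above can be illustrated with a minimal sketch. It is not the Meta Disc implementation; the function name `diagnostic_accuracy` and the cell counts in the usage lines are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class Accuracy:
    sensitivity: float
    specificity: float
    positive_lr: float
    negative_lr: float


def diagnostic_accuracy(tp: int, fn: int, fp: int, tn: int) -> Accuracy:
    """Compute accuracy measures from the four cells of a 2 x 2 table.

    Hypothetical helper for illustration only. It raises ZeroDivisionError
    when a denominator is zero (for example, no false positives, so that
    1 - specificity is 0).
    """
    sensitivity = tp / (tp + fn)                   # TP/(TP + FN)
    specificity = tn / (fp + tn)                   # TN/(FP + TN)
    positive_lr = sensitivity / (1 - specificity)  # sensitivity/(1 - specificity)
    negative_lr = (1 - sensitivity) / specificity  # (1 - sensitivity)/specificity
    return Accuracy(sensitivity, specificity, positive_lr, negative_lr)


# Hypothetical counts: 50 patients with the pathology, 100 without.
result = diagnostic_accuracy(tp=45, fn=5, fp=10, tn=90)
print(f"sensitivity={result.sensitivity:.2f}, specificity={result.specificity:.2f}")
print(f"+LR={result.positive_lr:.2f}, -LR={result.negative_lr:.2f}")
```

With these invented counts, both sensitivity and specificity are 0.90, which gives a positive LR of about 9 and a negative LR of about 0.11.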
Analysis was carried out according to the quality of the studies such that it was possible to determine from the graph (Fig. 1) which were the most valid and accurate tests. The ROC plot (Fig. 1) indicates the relationship between the true positive rate (TPR) and the false positive rate (FPR) of each test. Using Sackett’s (1992) rule of SpPin and SnNout, tests which are in the top left corner show high accuracy, with high sensitivity and high specificity. This means that the tests in this area are useful at ruling a disorder in when positive and at ruling a disorder out when negative. Tests close to the diagonal line indicate that discrimination between a positive and a negative test is no better than chance. LR were presented in forest plots (Figs. 2 and 3). An overall summary of the diagnostic accuracy of individual tests in varying population groups was provided using summary receiver operator curves (SROCs) (Figs. 4 and 5).

3. Results
3.1. Study selection
Searches retrieved 1924 references from which, following assessment against the review inclusion/exclusion criteria, 19
articles were obtained for closer examination. Four studies were excluded (Bennett, 1998; Berg and Ciullo, 1998; Holtby and Razmjou, 2004; Liu et al., 1996b) (see Fig. 6).

3.2. Quality assessment
Fifteen studies were included in the review, evaluating various tests for labral pathology (Table 2). Results of quality assessment are presented in Table 3. The major limitations in quality of the studies related to spectrum and selection bias, verification bias, clinical test review bias, diagnostic test review bias, reporting of reference test details, availability of clinical information and information regarding disease progression between tests. Strengths in the studies related to avoidance of incorporation bias, index test details and avoidance of withdrawal bias. Uninterpretable results were explained in the majority of studies.

Fig. 1. Plot in receiver operating characteristic space of estimates of the FPR and TPR of 11 individual tests used to detect labral pathology on clinical examination. These tests are represented in relation to the quality of the studies according to key sources of bias in diagnostic accuracy studies: 2 or less key sources of bias; 3 key sources of bias; 5 or more key sources of bias present or unclear. AC1, Active Compression (O’Brien 1998); AC2, Active Compression (Stetson 2002); AC3, Active Compression (McFarland 2002); C1, Crank (Mimori 1999); C2, Crank (Liu 1996); C3, Crank (Stetson 2002); AS1, Anterior Slide (Kibler 1995); AS2, Anterior Slide (McFarland 2002); BLI, Biceps Load I (Kim 1999); BLII, Biceps Load II (Kim 2001); NPPT, New Pain Provocation (Mimori 1999); IRRT, Internal Rotation Resistance (Zaslav 2001); K, Kim (Kim 2005); J, Jerk (Kim 2005); PIS, Posterior Impingement (Meister 2005); CR, Compression Rotation (McFarland 2002).
3.3. Relationship of test results to study quality
Lack of raw data from the primary studies meant that tests from only 11 of the 15 studies were included in the data analysis (Fig. 1). Fig. 1 plots the tests in ROC space in relation to the quality of the studies, using Lijmer et al.’s (1999) and Whiting et al.’s (2004) evidence of key sources of bias which can overestimate the accuracy of tests. These key sources of bias are case control design, partial and differential verification bias, absent/inappropriate reference standard, clinical test and diagnostic test review bias, and availability of clinical information. Tests with high TPRs (sensitivity) against minimal FPRs (1 − specificity) with minimal bias (red) in the studies demonstrate an optimum profile (Fig. 1). These were the Biceps Load tests I and II (Kim et al., 1999, 2001), the IRRT (Zaslav, 2001), the Crank test (Liu et al., 1996a) as well
Fig. 2. Positive Likelihood Ratios with 95% CI (axis: Positive LR, ticks at 0.01, 1 and 100). Positive likelihood ratios demonstrating large and conclusive changes from pre-test to post test probability are marked in bold (Jaeschke 1994). * Studies identified with 2 or less items of key bias using the QUADAS tool.
Kibler 1995 (AS): 4.26 (2.16–8.39)
Kim 1999 (BLI): 29.09 (7.34–115.27)*
Kim 2001 (BLII): 26.32 (8.61–80.45)*
Kim 2005 (K): 12.62 (6.54–24.35)*
Kim 2005 (J): 34.71 (11.10–108.55)*
Liu 1996 (C): 13.59 (3.55–52.10)*
Meister 2004 (PIS): 5.03 (1.75–14.46)
McFarland 2002 (CR): 0.99 (0.50–1.94)
McFarland 2002 (AS): 0.49 (0.16–1.47)
McFarland 2002 (AC): 1.05 (0.73–1.49)
Mimori 1999 (NPPT): 7.17 (1.62–31.78)
Mimori 1999 (C): 1.06 (0.61–1.83)
O’Brien 1998 (AC): 43.59 (15.47–122.84)
Stetson 2002 (C): 1.06 (0.51–1.18)*
Stetson 2002 (AC): 0.78 (0.51–1.18)*
Zaslav 2001 (IRRT): 24.77 (8.08–75.90)*
Fig. 3. Negative Likelihood Ratios with 95% CI (axis: Negative LR, ticks at 0.01, 1 and 100). Negative likelihood ratios demonstrating large and conclusive changes from pre-test to post test probability are marked in bold (Jaeschke 1994). * Studies identified with 2 or less items of key bias using the QUADAS tool.
Kibler 1995 (AS): 0.26 (0.17–0.41)
Kim 1999 (BLI): 0.09 (0.01–0.61)*
Kim 2001 (BLII): 0.11 (0.04–0.27)
Kim 2005 (K): 0.21 (0.10–0.44)
Kim 2005 (J): 0.27 (0.15–0.49)
Liu 1996 (C): 0.10 (0.03–0.30)*
Meister 2004 (PIS): 0.29 (0.17–0.49)
McFarland 2002 (CR): 1.00 (0.81–1.25)
McFarland 2002 (AS): 1.10 (0.99–1.22)
McFarland 2002 (AC): 0.96 (0.70–1.32)
Mimori 1999 (NPPT): 0.03 (0.00–0.39)*
Mimori 1999 (C): 0.22 (0.07–0.71)
O’Brien 1998 (AC): 0.01 (0.00–0.15)
Stetson 2002 (C): 0.95 (0.61–1.50)*
Stetson 2002 (AC): 1.50 (0.80–2.81)*
Zaslav 2001 (IRRT): 0.12 (0.04–0.35)*
as the Kim and Jerk tests (Kim et al., 2005). These results are reinforced by the positive and negative LR (Figs. 2 and 3). The Biceps Load test I (Kim et al., 1999), Biceps Load test II (Kim et al., 2001), Crank test (Liu et al., 1996a), IRRT (Zaslav, 2001), the Kim and Jerk tests (Kim et al., 2005) and the AC test (O’Brien et al., 1998) had high positive LR (>10) showing large and conclusive changes from pre-test to post-test probability of the target disorder. The Biceps Load test I (Kim et al., 1999), New Pain Provocation test (NPPT) (Mimori et al., 1999), AC test (O’Brien et al., 1998), all had negative LR less than 0.1 again providing large and conclusive changes from pre-test to post-test probability of the target disorder. However O’Brien et al.’s (1998) study evaluating the AC test and Mimori et al.’s (1999) study evaluating the NPPT and the
Crank test were subject to many of the key biases suggested by Lijmer et al. (1999) and Whiting et al. (2004) to overestimate the accuracy of test results. Studies without raw data and therefore not inputted into Meta Disc demonstrated three (Myers et al., 2005; Parentis et al., 2006), two (Guanche and Jones, 2003) and one (Nakagawa et al., 2005) key sources of bias. Of these, only the Resisted Supination External Rotation (RSER) test (Myers et al., 2005) demonstrated sensitivity and specificity values over 80%.

3.4. Subgroup analysis
Tests evaluated in more than one study were the AC (O’Brien et al., 1998; McFarland et al., 2002; Guanche and
Fig. 4. SROC of the Crank test carried out in different studies (Liu et al., 1996a; Mimori et al., 1999; Stetson and Templin, 2002).
W. Munro, R. Healy / Manual Therapy 14 (2009) 119–130
Fig. 5. SROC of the AC test carried out in different studies (O’Brien et al., 1998; McFarland et al., 2002; Stetson and Templin, 2002).
Jones, 2003; Myers et al., 2005; Nakagawa et al., 2005; Parentis et al., 2006), Anterior Slide (AS) (Kibler, 1995; McFarland et al., 2002; Nakagawa et al., 2005), Compression Rotation (CR) (McFarland et al., 2002; Nakagawa et al., 2005), Crank (Liu et al., 1996a; Mimori et al., 1999; Stetson and Templin, 2002; Myers et al., 2005; Nakagawa et al., 2005; Parentis et al., 2006), and the NPPT (Mimori et al., 1999; Parentis et al., 2006). Of these, only the Crank and AC tests had sufficient raw data to enter into Meta-DiSc (Zamora et al., 2006) to provide SROCs. Within Meta-DiSc, 0.5 was added to all cells in the table to allow statistics to be calculated where any cell contained a 0 value, as suggested by Cox (1970) cited in Zamora et al. (2006). These SROCs are shown in Figs. 4 and 5. These graphs show the Crank test to be the better test, with the area under the curve being closer to 1, although there is one outlier. The area under the curve for the AC test is closer to 0.5, showing this test to be no better than chance at discriminating between a positive and a negative result (Hopley and van Scalkwyk, 2007).
4. Discussion
The review identified six tests for labral pathology that demonstrated high sensitivity, specificity and LR values (Table 5). These came from studies of moderately sound methodological quality, and the results provided convincing or moderately strong diagnostic accuracy (ranging between 91 and 96%). The tests were:
Biceps Load test I (Kim et al., 1999, n = 75);
Biceps Load test II (Kim et al., 2001, n = 127);
IRRT (Zaslav, 2001, n = 110);
Crank test (Liu et al., 1996a, n = 62);
Kim test (Kim et al., 2005, n = 172);
Jerk test (Kim et al., 2005, n = 172).
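The accuracy measures discussed above all derive from a 2 × 2 table of index test result against reference standard. The following is a minimal sketch, not the Meta-DiSc implementation: the class and method names are hypothetical, and the cell counts are an assumption, back-calculated from the sensitivity, specificity and sample size reported for the Biceps Load test I (Kim et al., 1999, n = 75) rather than taken from the primary study. It also shows the 0.5 correction, applied only when a cell is zero.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts of index test result against the reference standard."""
    tp: float  # test positive, disease present
    fp: float  # test positive, disease absent
    fn: float  # test negative, disease present
    tn: float  # test negative, disease absent

    def corrected(self) -> "TwoByTwo":
        """Add 0.5 to every cell if any cell is zero, otherwise return self."""
        if min(self.tp, self.fp, self.fn, self.tn) > 0:
            return self
        return TwoByTwo(self.tp + 0.5, self.fp + 0.5, self.fn + 0.5, self.tn + 0.5)

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def positive_lr(self) -> float:
        # Sensitivity divided by the false positive rate.
        return self.sensitivity() / (self.fp / (self.fp + self.tn))

    def negative_lr(self) -> float:
        # False negative rate divided by specificity.
        return (self.fn / (self.fn + self.tp)) / self.specificity()

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)


# Assumed counts (back-calculated, not reported in the review): 10/11 true
# positives and 62/64 true negatives in a sample of 75.
table = TwoByTwo(tp=10, fp=2, fn=1, tn=62).corrected()
print(f"sensitivity={table.sensitivity():.3f} specificity={table.specificity():.3f}")
print(f"+LR={table.positive_lr():.3f} -LR={table.negative_lr():.3f}")
print(f"accuracy={100 * table.accuracy():.1f}%")
```

Under these assumed counts the sketch reproduces the values listed for this test in Table 4, which suggests, but does not confirm, that the counts are the ones the authors used.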
Confidence intervals for sensitivity and specificity were clinically acceptable (0.70–1.0), except for the Biceps Load test I and the Kim and Jerk tests, which ranged between 0.54 and 1.00. Although encouraging, the results should be treated with caution as test accuracy has been based on single studies, with the tests performed by the people who developed them (and therefore expected to be unusually skilled) in specialist settings on people referred for surgery. It cannot be assumed that the tests will produce the same results when carried out by less skilled examiners in unselected populations. Where tests were evaluated by more than one study, the results were less consistent (Figs. 4 and 5, Table 4). Overestimation of results appears to occur in studies demonstrating key bias (Kibler, 1995; O’Brien et al., 1998; Mimori et al., 1999) and where there are skilled practitioners who have developed the tests (Kibler, 1995; Liu et al., 1996a; O’Brien et al., 1998). Variations in thresholds used, mean age of the population and quality of the studies may account for the differences in results between O’Brien et al. (1998), McFarland et al. (2002) and Stetson and Templin (2002) on evaluation of the AC test (Fig. 5). Similarly, on analysis of the Crank test, although the same index test description and threshold were used by all authors (Liu et al., 1996a; Mimori et al., 1999; Stetson and Templin, 2002), the main difference between the studies was the quality of the methodology and the mean age of the patients. Lower accuracy levels are apparent when a number of tests are evaluated at the same time (McFarland et al., 2002; Stetson and Templin, 2002; Guanche and Jones, 2003; Nakagawa et al., 2005; Parentis et al., 2006; Myers et al., 2005) and where there is a higher mean age (McFarland et al., 2002; Stetson and Templin, 2002; Guanche and Jones, 2003; Myers et al., 2005; Parentis et al., 2006).
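The width of the confidence intervals discussed at the start of this paragraph depends heavily on how few affected patients a single study contains. The sketch below computes an exact (Clopper-Pearson) binomial interval by bisection. It rests on two assumptions that the review does not state: that the reported intervals are exact binomial intervals, and that the Biceps Load test I sensitivity corresponds to 10 true positives out of 11 affected shoulders (back-calculated, as above). The function names are hypothetical.

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_increasing(func, iterations: int = 100) -> float:
    """Find the root in (0, 1) of a function that increases from negative to positive."""
    low, high = 0.0, 1.0
    for _ in range(iterations):
        mid = (low + high) / 2
        if func(mid) < 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


def exact_binomial_ci(successes: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Two-sided Clopper-Pearson interval for a proportion successes / n."""
    if successes == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= successes | p) equals alpha / 2.
        lower = _bisect_increasing(
            lambda p: (1 - binomial_cdf(successes - 1, n, p)) - alpha / 2
        )
    if successes == n:
        upper = 1.0
    else:
        # Upper limit: P(X <= successes | p) equals alpha / 2.
        upper = _bisect_increasing(
            lambda p: alpha / 2 - binomial_cdf(successes, n, p)
        )
    return lower, upper


# Assumed counts: 10 of 11 affected shoulders testing positive.
low, high = exact_binomial_ci(10, 11)
print(f"sensitivity 10/11 = {10 / 11:.3f}, 95% CI {low:.3f} to {high:.3f}")
```

With only 11 affected shoulders the lower limit falls well below the 0.70 regarded as clinically acceptable, even though the point estimate exceeds 0.90.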
The IRRT (Zaslav, 2001), Kim and Jerk tests (Kim et al., 2005) in studies by their developers have however performed well on patients with an older mean age. Unsurprisingly, other tests when re-evaluated
Included trials
Articles retrieved (n=1924)
Articles obtained for closer examination (n=19)
Studies identified for quality appraisal (n=15)
Studies with 2 or less key sources of bias (n=9): Abduction Inferior Stability (n=1), Active Compression (n=3), Anterior Slide (n=1), Biceps Load I (n=1), Biceps Load II (n=1), Crank (n=4), Clunk (n=1), Compression Rotation (n=1), Forced Abduction (n=1), Internal Rotation Resistance test (n=1), Jerk (n=1), Kim (n=1), Posterior Jerk (n=1)
Studies with 2 key sources of bias (n=4): Active Compression (n=3), Anterior Slide (n=2), Compression Rotation (n=1), Crank (n=2), New Pain Provocation (n=1), Posterior Impingement (n=1), Resisted Supination External Rotation (n=1)
Studies with 5 or more key sources of bias (n=3): Active Compression (n=1), Anterior Slide (n=1), Crank (n=1), New Pain Provocation (n=1)
Excluded trials
Not specific to area of review (n=1902)
Systematic reviews (n=3)
Studies of tests not traditionally used to detect labral pathology (n=3)
Studies omitting values of either sensitivity or specificity (n=1)
Fig. 6. Flow chart of literature search.
in this age group have not been so accurate, as patients are more likely to have co-existing pathologies such as rotator cuff pathology and gleno-humeral arthritis. All the studies came from specialist settings (people referred to hospital consultants for surgery), making the studies subject to selection bias. It cannot be assumed that similar results would be seen in a population with a wider spectrum of shoulder problems of lesser severity and co-existence of pathologies, such as that found in primary care. In all of the studies, doctors performed the testing but none reported the experience of the examiners. Therefore the results cannot be assumed to be applicable to other health care professionals, such as physiotherapists, or doctors regardless of their level of experience.
Table 2 Tests reviewed.
Abduction Inferior Stability test (ABIS)
Active Compression (AC)
Anterior Slide (AS)
Biceps Load I (BLI)
Biceps Load II (BLII)
Compression Rotation (CR)
Clunk (Cl)
Crank (C)
Forced Abduction (FA)
Internal Rotation Resistance (IRRT)
Jerk (J)
Kim (K)
New Pain Provocation (NPPT)
Posterior Impingement Sign (PIS)
Resisted Supination External Rotation (RSER)

4.1. Considerations for future research
In considering the results of this review, the analytical methods must be considered. Devillé et al. (2002) advocates
Table 3 Quality assessment.

Study characteristics
Study (first author) | Design | Sample size | Mean age | % Males | Reference test | Test details
Guanche 2003 | P C | 61 | 38 | 77 | A | Crank, Active Compression
Kibler 1995 | U U | 226 | 24.6 | 75 | A | Anterior Slide
Kim 1999 | P C | 75 | 24.8 | 85 | A | Biceps Load I
Kim 2001 | P C | 127 | 30.6 | 70 | A | Biceps Load II
Kim 2005 | U C | 172 | 43 | 61 | A | Kim, Jerk
Liu 1996a | P U | 62 | 28 | 65 | A | Crank
McFarland 2002 | R C | 426 | 44 | 42 | A | Compression Rotation, Anterior Slide, Active Compression
Meister 2004 | P C | 69 | 23 | U | A | Posterior Impingement
Mimori 1999 | P U | 32 | 20.9 | 94 | A M | New Pain Provocation, Crank
Myers 2005 | P | 40 | 23.9 | 97.5 | A | Resisted Supination External Rotation, Crank, Active Compression
Nakagawa 2005 | P | 54 | 23 | 96 | A | Forced Abduction, Compression Rotation, Abduction Inferior Stability, Clunk, Anterior Slide, Crank, Active Compression, Posterior Jerk
O’Brien 1998 | P | 268 | U | U | R M CD A | Active Compression
Parentis 2006 | C P C | 132 | 45 | 74 | | Active Compression, Anterior Slide, New Pain Provocation, Crank
Stetson 2002 | P NR | 65 | 45.9 | 73 | A | Active Compression, Crank
Zaslav 2001 | P C | 110 | 44 | 58 | S | Internal Rotation Resistance

Quality checks
Columns: 1, appropriate spectrum; 2, inclusion, exclusion criteria; 3, appropriate reference test; 4, disease progression bias avoided; 5, partial verification bias avoided; 6, differential verification bias avoided; 7, incorporation bias avoided; 8, index test details; 9, reference test details; 10, test review bias avoided; 11, diagnostic test review bias avoided; 12, clinical info available; 13, uninterpretable results explained; 14, withdrawal bias avoided.
Study (first author) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14
Guanche 2003 | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | U | U | U | Y
Kibler 1995 | N | U | Y | U | N | N | Y | Y | Y | U | U | U | N | Y
Kim 1999 | N | Y | Y | Y | Y | Y | Y | Y | N | Y | U | U | Y | Y
Kim 2001 | N | Y | Y | U | Y | Y | Y | Y | N | Y | U | N | Y | Y
Kim 2005 | Y | Y | Y | U | Y | Y | Y | Y | N | U | Y | U | Y | Y
Liu 1996a | N | Y | Y | U | Y | Y | Y | Y | Y | Y | U | U | Y | Y
McFarland 2002 | Y | Y | Y | Y | Y | Y | Y | Y | Y | U | U | U | Y | Y
Meister 2004 | N | N | Y | U | Y | Y | Y | Y | N | U | U | U | Y | Y
Mimori 1999 | N | Y | Y N | U | N | N | Y | Y | Y | Y | U | U | Y | Y
Myers 2005 | N | N | Y | Y | Y | U | Y | Y | N | U | Y | U | U | U
Nakagawa 2005 | N | Y | Y | Y | Y | Y | Y | Y | N | Y | Y | U | U | U
O’Brien 1998 | U | N | N | U | N | N | U | Y | N | U | U | N | Y | Y
Parentis 2006 | Y | U | Y | U | Y | Y | Y | Y | N | U | U | U | Y | Y
Stetson 2002 | Y | N | Y | N | Y | Y | Y | Y | Y | Y | U | U | Y | Y
Zaslav 2001 | Y | Y | Y | U | Y | Y | Y | Y | N | Y | U | U | Y | Y

Design: P, prospective; U, unreported; C, consecutive; NR, non-randomised. Reference test: A, arthroscopy; R, radiography; M, MRI; S, surgical findings; CD, clinical data. Response to quality checks: Y, yes; N, no; U, unclear/unreported. For glossary of terms, refer to Table 1. Key sources of bias as identified by Lijmer et al. (1999) and Whiting et al. (2004) are: case control design, partial and differential verification bias, absent/inappropriate reference standard, clinical test and diagnostic test review bias and availability of clinical information.
Table 4 Results. Sensitivity (95% CI)a
Specificity (95% CI)a
O & SM
0.290
0.900
SLAP lesions
O & SM
Guanche 2003
Glenoid labral lesions
O
0.474 (0.310e0.642) 0.630
0.547 (0.495e0.599) 0.730
Active compression
Myers 2005
SLAP lesions
SM
0.778
0.111
Active Compression Active Compression
Nakagawa 2005 O’Brien 1998
SLAP lesions Labral tears
O & SM O & SM
Active Compression
Parentis 2006
SLAP lesions
O & SM
0.540 1.000 (0.933e1.000) 0.652
0.600 0.985 (0.957e0.997) 0.486
Active Compression
Stetson 2002
Labral tears
O & SM
Anterior Slide
Kibler 1995
SM
Anterior Slide
McFarland 2002
Superior glenoid labral tears SLAP lesions
O & SM
Anterior Slide
Nakagawa 2005
SLAP lesions
O & SM
0.538 (0.334e0.734) 0.784 (0.684e0.865) 0.079 (0.017e0.214) 0.050
Biceps Load I
Kim 1999
SLAP lesions
O
Biceps Load II
Kim 2001
SLAP lesions
O & SM
Clunk
Nakagawa 2005
SLAP lesions
O & SM
Compression Rotation Compression Rotation
McFarland 2002
SLAP lesions
O & SM
Nakagawa 2005
SLAP lesions
O & SM
Crank
Guanche 2003
Glenoid labral lesions
O
Crank
Liu 1996a
O & SM
Crank
Mimori 1999
Glenoid labral tears SLAP lesions
O
Crank
Myers 2005
SLAP lesions
SM
Crank
Nakagawa 2005
SLAP lesions
Crank
Parentis 2006
Crank
Stetson and Templin 2002 Nakagawa 2005
Test
First author
To evaluate
Setting
Abduction Inferior Stability
Nakagawa 2005
SLAP lesions
Active Compression
McFarland 2002
Active Compression
Forced Abduction
+LR (95% CI)a
−LR (95% CI)a
Accuracy (%)a 62.0
1.046 (0.735e1.489) 2.340
0.962 (0.702e1.319) 0.506
54.0
59.5 57.0 98.8
57.746 (20.432e163.20)
0.009 (0.001e1.149)
0.308 (0.170e0.476) 0.816 (0.657e0.923) 0.837 (0.796e0.873) 0.930
0.778 (0.550e1.175) 4.256 (2.161e8.385) 0.485 (0.160e1.472)
1.500 (0.801e2.810) 0.240 (0.16e0.36) 1.100 (0.992e1.220)
40.0
0.909 (0.587e0.998) 0.897 (0.758e0.971) 0.440
0.969 (0.892e0.996) 0.966 (0.904e0.993) 0.680
29.091 (7.342e115.27) 26.325 (8.613e80.455)
0.094 (0.014e0.608) 0.106 (0.042e0.269)
96.0
0.241 (0.103e0.435) 0.250
0.755 (0.700e0.805) 1.000
0.987 (0.501e1.945)
1.004 (0.809e1.246)
0.400
0.730
0.906 (0.750e0.980) 0.833 (0.516e0.979) 0.346
0.933 (0.779e0.992) 1.000 (0.292e1.000) 0.700
O & SM
0.580
0.720
SLAP lesions
O & SM
0.087
0.826
Labral tears
O & SM
0.955 (0.608e1.497)
O & SM
0.564 (0.396e0.722) 0.400
1.059 (0.612e1.831)
SLAP lesions
0.462 (0.266e0.666) 0.670
O
0.885 (0.698e0.976)
0.964 (0.899e0.993)
24.769 (8.083e75.902)
0.120 (0.041–0.347)
94.5
O
0.733 (0.541e0.877)
0.979 (0.940e0.996)
34.711 (11.099e108.55)
0.272 (0.150e0.493)
93.6
O
0.800 (0.614e0.923)
0.937 (0.883e0.971)
12.622 (6.543e24.351)
0.214 (0.104e0.437)
91.3
O
1.000 (0.846e1.000) 0.174
0.900 (0.555e0.997) 0.899
7.174 (1.619e31.782)
0.025 (0.002e0.394)
96.8
5.034 (1.752e14.463)
0.288 (0.170e0.487)
95.8
85.8 76.8 54.0
94.5 57.0 70.6 63.0
1.481
0.821
13.594 (3.547e52.099) 6.462 (0.477e87.549)
0.100 (0.034e0.296) 0.220 (0.068e0.711)
91.9 86.6 44.4 66.0 33.8 67.0
Internal Rotation Resistance
Zaslav 2001
Jerk
Kim 2005
Kim
Kim 2005
New Pain Provocation New Pain Provocation
Mimori 1999
Differentiate intra-articular pathology from impingement Postero-inferior labral lesion Postero-inferior labral lesion SLAP lesions
Parentis 2006
SLAP lesions
O & SM
Posterior Impingement Sign Posterior Jerk
Meister 2004
O & SM
0.755 (0.611e0.867)
0.850 (0.621e0.968)
Nakagawa 2005
Posterior labral tears and rotator cuff tears SLAP lesions
O & SM
0.250
0.800
56.0
Myers 2005
SLAP lesions
SM
0.828
0.818
82.5
Resisted Supination External Rotation
O, orthopaedic; SM, sports medicine. Shading indicates tests not included in data analysis due to lack of availability of raw data from the primary studies. a Gaps indicate that confidence intervals and values are not calculable due to lack of raw data.
the construction of 2 × 2 tables by extracting the raw data for data analysis and the reporting of pairs of complementary outcome measures. Although this was performed in the majority of studies, several studies (Guanche and Jones, 2003; Myers et al., 2005; Nakagawa et al., 2005; Parentis et al., 2006) omitted raw data. More sophisticated and accurate techniques such as reporting the relationship between TPR and FPR in ROC space and reporting LR are less frequently used. This is unfortunate, as LR have been identified as the most powerful results to judge the clinical utility of a test (Hayden and Brown, 1999), and it has been suggested that authors may overstate the value of test results in the absence of LR (Honest and Khan, 2002). Several of the studies included in this review demonstrate flaws consistent with sources identified by Lijmer et al. (1999) and Whiting et al. (2004) to overestimate diagnostic accuracy, which may have inflated their estimations of accuracy. Future research needs to address these methodological issues in order to provide confidence in the results. The quality of appraisal of study methodologies is often hindered by insufficient detail in the reporting (Devillé et al., 2002; Mallet et al., 2003), and so it was found in this review in relation to availability of clinical information and demographic details of the study populations.

Table 5 Test descriptions.

Test: Biceps Load test I (BLI) to identify SLAP lesions
Description: Patient is supine. Examiner sits adjacent to patient on the same side as affected arm, grasps wrist and elbow. Arm is abducted to 90° with the forearm supinated. External rotation is applied to the point of apprehension. Patient is asked to flex the elbow while the examiner resists with one hand and observes for any change in symptoms.
Threshold: Positive if apprehension increases or pain is reproduced. Negative if apprehension decreases or discomfort decreases.

Test: Biceps Load test II (BLII) to identify SLAP lesions
Description: The arm is elevated to 120° and externally rotated to its maximum point. The elbow is positioned in 90° flexion with the forearm supinated. The patient is asked to flex the elbow against resistance.
Threshold: Test is positive if there is pain during resisted elbow flexion or if there is an increase in the pain already present. Test is negative if no pain is elicited or if the pre-existing pain is unchanged or diminished by resisted elbow flexion.

Test: Internal Rotation Resistance test (IRRT) to differentiate between intra-articular pathology and impingement
Description: Standing, the examiner is behind the patient. Arm in 90° abduction in coronal plane and 80° ER. Manual isometric muscle test for ER compared with that for internal rotation.
Threshold: Positive if a patient with positive impingement has good strength in ER with weakness in IR; this is predictive of non-outlet impingement.

Test: Kim test (K) to identify postero-inferior labral lesions
Description: Patient is sitting. With the arm in 90° of abduction, the examiner holds the elbow and lateral aspect of the proximal arm and a strong axial load is applied. The arm is moved into 45° of elevation while the axial force is maintained and a posterior and inferior force is applied to the proximal arm.
Threshold: A positive test is indicated by a sudden onset of posterior shoulder pain, regardless of a clunk.

Test: Jerk test (J) to identify postero-inferior labral lesions
Description: Patient is sitting. The scapula is stabilised by the examiner with one hand and the patient’s arm is abducted to 90° and internally rotated 90°. An axial force is applied with the examiner’s other hand holding the elbow and a simultaneous horizontal adduction movement is applied.
Threshold: A positive test is indicated by a sharp pain with or without a clunk or click.

Test: Crank test (C) to identify glenoid labral tears
Description: Patient is upright with the arm elevated to 160° in the scapular plane. Joint load is applied along the axis of the humerus with one hand whilst the other hand performs humeral rotation. The test can be repeated in supine.
Threshold: A positive test is indicated by pain during the manoeuvre (usually external rotation) with or without a click, or by reproduction of symptoms as felt by the patient during overhead or work activities.
5. Limitations to the review
Like all systematic reviews, the results are dependent on the articles identified in the searches. The authors undertook exhaustive searches of the published English-language literature; however, searches for unpublished studies and foreign language studies were not carried out, which means a few relevant papers may have been missed. The power of the analysis is also dependent on the number of studies identified for each individual test. This was low; for many tests there was only one paper on which to base the analysis and hence the SROCs of the Crank and the AC tests may be of limited value. The authors’ decision to include case control studies could be seen as a weakness as this is not the optimal design. However, it was felt necessary to include for quality appraisal studies which have regularly been reported in narrative reviews to have high accuracy (Kibler, 1995; O’Brien et al., 1998).
6. Conclusion
There is limited evidence from single well carried out studies to suggest that the Biceps Load tests I and II, the IRRT, the Kim test and the Jerk test are accurate in differentiating
labral pathology from other pathologies in selected populations. However, other tests for labral pathology (AC, AS and Crank), when re-evaluated in studies not carried out by the developers of the tests, have not produced such accurate results. There is a need therefore for further evaluation of labral pathology tests to see whether these tests are as accurate when carried out in different populations by less skilled examiners. Physiotherapists working in extended roles are in an ideal position to do this. Further to this, future research needs to address the key sources of bias relative to diagnostic test screening and provide more detailed demographic information and adequate raw data in order to produce clinically relevant LR and allow for results to be analysed fully.

Acknowledgements
The authors would like to thank Dr Sarah Tyson from the University of Salford for her comments assisting with the redrafting of this review.

References
Altchek DW, Warren RF, Wickiewicz TL. Arthroscopic labral debridement: a three year follow up study. The American Journal of Sports Medicine 1992;20(6):702–6.
Altman DG. Inter-rater agreement, Practical statistics for medical research. 1st ed. London: Chapman & Hall; 1999. 14.3, p. 403–8.
Andrews JR, Carson WG, McLeod WD. Glenoid labrum tears related to the long head of biceps. The American Journal of Sports Medicine 1985;13:337–41.
Bennett WF. Specificity of the Speed’s test: arthroscopic technique for evaluating the biceps tendon at the level of the bicipital groove. Arthroscopy 1998;14(8):789–96.
Berg EE, Ciullo JV. A clinical test for superior glenoid labral or ‘SLAP’ lesions. Clinical Journal of Sport Medicine 1998;8(2):121–3.
Davis P, Fitzgerald A, Alderson P. Feasibility of the QUADAS tool for quality assessment of diagnostic studies in guideline development. In: 4th Annual guidelines international network conference. http://www.g-i-n.net/download/files/b33_davies.pdf; 2007 [accessed 25.01.08].
Deeks JJ, Morris JM. Evaluating diagnostic tests. Bailliere’s Clinical Obstetrics and Gynaecology 1996;10(4):613–30.
Devillé WLJM, Bezemer PD, Bouter LM. Publications on diagnostic test evaluation in family medicine journals: an optimal search strategy. Journal of Clinical Epidemiology 2000;53:65–9.
Devillé WLJM, Buntinx F, Bouter LM, Montori VM, De Vet HC, Van Der Windt DA, et al. Conducting systematic reviews of diagnostic studies: didactic guidelines. BMC Medical Research Methodology:1, http://www.biomedcentral.com/content/pdf/1471-2288-2-9.pdf, 2002;2 [accessed 26.06.07].
Ebell MH. An introduction to information mastery: reading an article about diagnosis. Department of Family Medicine, Michigan State University, http://www.poems.msu.edu/InfoMastery/Diagnosis/Diagnosis.htm; 1998 [accessed 12.03.07].
Green S, Buchbinder R, Glazier R, Forbes A. Interventions for shoulder pain (Cochrane review), The Cochrane Library. Chichester, UK: John Wiley & Sons, Ltd; 2003. Issue 4.
Guanche C, Jones DC. Clinical testing of the glenoid labrum. Arthroscopy: The Journal of Arthroscopic and Related Surgery 2003;19(5):517–23.
Hanchard N, Cummins J, Jeffries C. Evidence based clinical guidelines for the diagnosis, assessment and physiotherapy management of shoulder impingement syndrome. London, UK: Chartered Society of Physiotherapy; 2004. Section 7, p. 41–56.
Hasan SA. Superior labral lesions. Emedicine, http://www.emedicine.com/orthoped/topic317.htm 2006 [accessed 26.06.07].
Hayden SR, Brown MD. Likelihood ratio: a powerful tool for incorporating the results of a diagnostic test into clinical decision making. Annals of Emergency Medicine 1999;33(5):575–80.
Holtby R, Razmjou H. Accuracy of the Speed’s and Yergason’s tests in detecting biceps pathology and slap lesions: comparison with arthroscopic findings. Arthroscopy: The Journal of Arthroscopic and Related Surgery 2004;20(3):231–6.
Hollingworth W, Lenkinski R, Shibata DK, Bernal B, Zurakowski D, Comstock B, et al. Interrater reliability in assessing quality of diagnostic accuracy studies using the QUADAS tool: a preliminary assessment. Academic Radiology 2006;13(7):803–10.
Honest H, Khan KS. Reporting of measures of accuracy in systematic reviews of diagnostic literature. BMC Health Services Research(4), http://www.biomedcentral.com/1472-6963/2/4, 2002;2 [accessed 26.06.07].
Hopley L, van Scalkwyk J. The magnificent ROC (receiver operating characteristic curve), http://www.anaesthetist.com/mnm/stats/roc/Findex.htm; 2007 [accessed 12.03.08].
Jaeschke R. Users guide to the medical literature III. How to use an article about a diagnostic test A. Are the results of the study valid? JAMA 1994;271(5):389–91.
Jones GL, Galluch DB. Clinical assessment of superior glenoid labral lesions: a systematic review. Clinical Orthopaedics and Related Research 2007;455:45–51.
Kibler WB. Specificity and sensitivity of the anterior slide test in throwing athletes with superior glenoid labral tears. Arthroscopy 1995;11(3):296–300.
Kim S, Ha K, Han K. Biceps load test: a clinical test for superior labrum anterior and posterior lesions in shoulders with recurrent anterior dislocations. The American Journal of Sports Medicine 1999;27(3):300–3.
Kim SH, Ha KI, Ahn JH, Choi HJ. Biceps load test II: a clinical test for SLAP lesions of the shoulder. Arthroscopy 2001;17(2):160–4.
Kim SH, Park JS, Jeong WK, Shin SK. The Kim test: a novel test for posteroinferior labral lesion of the shoulder – a comparison to the Jerk test. The American Journal of Sports Medicine 2005;33:1188–92.
Kumar VP, Satku K, Balasubramaniam P. The role of the long head of biceps brachii in the stabilization of the head of the humerus. Clinical Orthopaedics 1989;244:172–5.
Lijmer JG, Moll WM, Heisterkamp S, Bonsel GJ, Prins MH, Van der Meulen J, et al. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA 1999;282(11):1061–6.
Liu SH, Henry MH, Nuccion S, Shapiro MS, Dorey F. Diagnosis of glenoid labral tears. A comparison between magnetic resonance imaging and clinical examinations. The American Journal of Sports Medicine 1996;24(2):149–54.
Liu SH, Henry MH, Nuccion SL. A prospective evaluation of a new physical examination in predicting glenoid labral tears. The American Journal of Sports Medicine 1996;24(6):721–5.
Liume JJ, Verhagen AP, Miedema HS, Kuiper JI, Burdorf A, Verhaar JAN, et al. Does this patient have an instability of the shoulder or a labrum lesion? JAMA 2004;292(16):1989–99.
Maffet MW, Gartsman GM, Moseley B. Superior labrum–biceps tendon complex lesions of the shoulder. The American Journal of Sports Medicine 1995;23(1):93–8.
Mallet S, Summerton N, Deeks J, Halligan S, Altman D. Systematic reviews of diagnostic tests in cancer: assessment of methodology and reporting quality [abstract]. In: XI Cochrane colloquium: evidence, health care and culture; 2003, Oct 26–31. Barcelona, Spain.
McFarland EG, Kim TK, Savino RM. Clinical assessment of three common tests for superior labral anterior–posterior lesions. The American Journal of Sports Medicine 2002;30(6):810–5.
Meister K, Buckley B, Batts J. The posterior impingement sign: diagnosis of rotator cuff and posterior labral tears secondary to internal impingement in overhand athletes. The American Journal of Orthopaedics 2004;33(8):412–5.
Mimori K, Muneta T, Nakagawa T, Shinomiya K. A new pain provocation test for superior labral tears of the shoulder. The American Journal of Sports Medicine 1999;27(2):137–42.
Musgrave DS, Rodosky MW. SLAP lesions: current concepts. The American Journal of Orthopaedics 2001;1:29–38.
Myers TH, Zemanovic JR, Andrews JR. The resisted supination external rotation test. The American Journal of Sports Medicine 2005;33(9):1315–20.
Nakagawa S, Yoneda M, Hayashida K, Obata M, Fukushima S, Miyazaki Y. Forced abduction and elbow flexion test: a new simple clinical test to detect superior labral injury in the throwing shoulder. The Journal of Arthroscopic and Related Surgery 2005;21(11):1290–5.
O’Brien SJ, Pagnani MJ, Fealy S, McGlynn SR, Wilson JB. The active compression test: a new and effective test for diagnosing labral tears and acromioclavicular joint abnormality. The American Journal of Sports Medicine 1998;26(5):610–3.
Parentis MA, Glousman RE, Mohr KS, Yocum LA. An evaluation of the provocative tests for superior labral anterior posterior lesions. The American Journal of Sports Medicine 2006;34:265–8.
Peat J, Mellis C, Williams K, Xuan W. Health science research, a handbook of quantitative methods. London: Sage Publications Ltd; 2002. Section 2, p. 237.
Reider B. Physical examination. The American Journal of Sports Medicine 2004;32(2):299–300.
Sackett DL. A primer on the precision and accuracy of the clinical examination. JAMA 1992;267(19):2638–44.
Schmitz MA. The recognition and treatment of superior labral anterior–posterior (SLAP) lesions in the shoulder. Medscape General Medicine(1), www.medscape.com/viewarticle/408488, 1999;1 [accessed 30.10.02].
Snyder SJ, Karzel RP, Del Pizzo W, Ferkel RD, Friedman MJ. SLAP lesions of the shoulder. Arthroscopy 1990;6:274–9.
Stetson WB, Templin K. The Crank test, the O’Brien test, and routine magnetic resonance imaging scans in the diagnosis of labral tears. American Journal of Sports Medicine 2002;30(6):806–9.
The Cochrane methods working group on systematic review of screening and diagnostic tests: recommended methods, http://www.nihs.go.jp/dig/cochrane/cochrane/sadtdoc1.htm; 1996 [accessed 26.06.07].
Whiting P, Rutjes AWS, Reitsma JB, Bossuyt PM, Kleijnen J. The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Medical Research Methodology:25, http://www.biomedcentral.com/1471-2288/3/25, 2003;3 [accessed 26.06.07].
Whiting P, Rutjes AW, Reitsma JB, Glas AS, Bossuyt PM, Kleijnen J. Sources of variation and bias in studies of diagnostic accuracy: a systematic review. Annals of Internal Medicine 2004;140(3):189–202.
Whiting P, Harbord R, Kleijnen J. No role for quality scores in systematic reviews of diagnostic accuracy studies. BMC Medical Research Methodology:19, http://www.biomedcentral.com/1471-2288/5/19, 2005;5 [accessed 26.06.07].
Whiting PF, Westwood ME, Rutjes AWS, Reitsma JB, Bossuyt PNM, Kleijnen J. Evaluation of QUADAS, a tool for the quality assessment of diagnostic accuracy studies. BMC Medical Research Methodology:9, http://www/biomedcentral.com/1471-2288/6/9, 2006;6 [accessed 23.01.08].
Zamora J, Abraira V, Muriel A, Khan KS, Coomarasamy A. Meta-DiSc: a software for meta-analysis of test accuracy data. BMC Medical Research Methodology 2006;6:31.
Zaslav KR. Internal rotation resistance strength test: a new diagnostic test to differentiate intra-articular pathology from outlet (Neer) impingement syndrome in the shoulder. Journal of Shoulder & Elbow Surgery 2001;10(1):23–7.
Manual Therapy 14 (2009) 131–137 www.elsevier.com/math
Original Article
Physiotherapists’ treatment approach towards neck pain and the influence of a behavioural graded activity training: An exploratory study
Frieke Vonk a,*, Jan J.M. Pool b, Raymond W.J.G. Ostelo b, Arianne P. Verhagen a
a Department of General Practice, Erasmus MC, PO Box 2040, 3000 CA Rotterdam, The Netherlands
b Institute for Research in Extramural Medicine (EMGO), VU University Medical Centre, Amsterdam, The Netherlands
Received 15 April 2007; received in revised form 15 November 2007; accepted 21 December 2007
Abstract
Physiotherapists’ treatment approach might influence their behaviour during practice and, consequently, patients’ treatment outcome; however, an explicit description of the treatment approach is often missing in trials. The purpose of this prospective exploratory study was to evaluate whether the treatment approach differs between therapists who favour a behavioural graded activity (BGA) program, conservative exercise (CE) or manual therapy, and whether BGA training influences the treatment approach. Forty-two therapists participated. BGA therapists received a 2-day training. Treatment approach was measured at baseline and at 3-month follow-up, using the Pain Attitude and Beliefs Scale for Physiotherapists (PABS-PTs), which generates data on the adoption of biomedical or biopsychosocial approaches. Differences were examined with analysis of variance (ANOVA) and independent Student’s t-test. Influence of the BGA training was examined with linear regression. At baseline, there were no significant differences between BGA, CE or manual therapists’ use of biomedical or biopsychosocial approaches, but there was a trend for BGA therapists to score higher on the biopsychosocial approach. At follow-up, their biopsychosocial score remained higher and their biomedical score was lower compared to CE therapists. Corrected regression analysis showed a 4.4 points (95% CI −7.9; −0.8) higher decrease for therapists who followed the BGA training compared to therapists who did not. Our results indicate no significant differences in treatment approach at baseline, and that BGA training might influence therapists’ treatment approach since the scores on the biomedical approach decreased. © 2008 Elsevier Ltd. All rights reserved.
Keywords: Attitude; Treatment approach; Neck pain; Physiotherapists
* Corresponding author. Tel.: +31 10 4087550; fax: +31 10 4089491. E-mail address: [email protected] (F. Vonk).
1356-689X/$ - see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.math.2007.12.005
1. Introduction
In the Netherlands, neck pain is one of the three most reported musculoskeletal pains and entails considerable costs for health care (Picavet and Schouten, 2003). Because generally no specific underlying pathology can be found, neck pain is often designated as non-specific (Bogduk, 1984). When musculoskeletal pain cannot be explained by an obvious physical cause and when only few guidelines are available, treatment regimens may reflect the clinicians’ beliefs (Foster et al., 2003a). Therapists’ attitudes influence their actual behaviour, which could have implications for the effectiveness of the treatment (Rainville et al., 2000; Linton et al., 2002; Houben et al., 2004). An observational study showed that the
F. Vonk et al. / Manual Therapy 14 (2009) 131–137
treatment style of clinicians (concerning prescription of pain medication or bed rest) was related to treatment outcome in low back pain (Von Korff et al., 1994). Health care providers who were fear avoidant also were more likely to advise a patient to avoid painful movements (Linton et al., 2002). Further, it is argued that therapists’ allegiance and adherence to treatment protocols are plausible contributors to differences in treatment outcome (Morley and Williams, 2006). Therefore, understanding therapists’ beliefs or treatment approach seems fundamental in developing better ways of managing pain complaints (Foster et al., 2003b). Insight into therapists’ treatment approaches and whether or not training can modify them could have implications for the education of therapists and for daily practice. Two different treatment approaches are described in the literature. The first is the traditional biomedical approach, in which treatment is focused on pain caused by physiological pathology or impairment (Turk and Flor, 1984). Therapists support a pain-contingent treatment approach, where treatment is guided by the amount of pain the patient experiences. The second is the biopsychosocial treatment approach, in which psychological and social factors are assumed to be important determinants in the development and maintenance of complaints, and in which pain can persist long after the initial pathology has healed. Therapists support a time-contingent approach in which patients’ activities are systematically increased (Fordyce, 1976; Lindstrom et al., 1992). To measure physiotherapists’ treatment approach, Ostelo et al. (2003) developed the questionnaire ‘Pain Attitudes and Beliefs Scale for Physiotherapists (PABS-PT)’, which was further validated by Houben et al. (2005b). From this questionnaire two categories can be generated: a biomedical approach and a biopsychosocial approach. 
The categories are not opposites of the same scale, but both are important in determining therapists’ treatment approach (Houben et al., 2005b). The questionnaire has been used to examine the treatment approach of different therapists, physiotherapy students, and general practitioners (GPs) (Houben et al., 2005a,b; Jellema et al., 2005). A recent review of five measurement tools for health care providers’ attitudes and beliefs concluded that the PABS-PT was one of two to have undergone the most thorough testing to date (Bishop et al., 2007). Although physiotherapists’ treatment approach may be important, an explicit description is often missing in trials. The aim of this exploratory study was to appraise the treatment approach of therapists in two ongoing trials (Vonk et al., 2004; Pool et al., 2006). Therefore, we formulated three research questions. First, do therapists who favour a behavioural graded activity (BGA) program differ in their treatment approach from those therapists who favour conservative exercise (CE) or manual therapy? Second, does the primary specialisation
(physiotherapy/manual therapy) influence the treatment approach? This influence is assumed because in the Netherlands certified manual therapists are specialised in manipulation techniques and are allowed to use them, whereas physiotherapists are not. Third, can BGA training, based on the principles of behavioural change as described by Fordyce (1976) and as applied by Lindstrom et al. (1992), influence therapists’ treatment approach?
2. Methods 2.1. Physiotherapists Therapists included in this study (n = 45) were involved in one of two ongoing randomised clinical trials (RCTs), i.e. Ephysion (Vonk et al., 2004) or the Neck Trial (Pool et al., 2006). In these trials a BGA program was compared with either conventional exercise (Ephysion) or manual therapy (Neck Trial) in sub-acute or chronic neck pain patients. Before assessment of the treatment approach, participating therapists were given the choice to decide which treatment arm they were most comfortable with to deliver within the trial. As a result, both the BGA and the CE treatment arm in the Ephysion study consisted of both physiotherapists and manual therapists. Three therapists from the Ephysion study were excluded: two applied after baseline measurement and one did not complete the baseline measurement. The BGA therapists from the Neck Trial were excluded because their treatment approach was only assessed after the BGA training. Consequently, insight into the influence of that training on their treatment approach was not possible. The 42 remaining therapists consisted of 30 therapists from the Ephysion study (13 CE therapists and 17 BGA therapists) and 12 manual therapists from the Neck Trial (see Fig. 1). All participating manual therapists were certified and registered by the Royal Dutch Association for Physical Therapy (KNGF). After baseline measurement, the BGA therapists received a 2-day training on the BGA approach. The remaining therapists participated in a consensus meeting to standardise their treatments (Vonk et al., 2004; Pool et al., 2006). 2.2. Questionnaires First, therapists’ characteristics were measured by a questionnaire, including gender, age, primary specialisation, work setting, and years of working experience. Second, therapists’ treatment approach towards neck pain was measured with the PABS-PT (Houben et al., 2005b). The PABS-PT is a 19-item questionnaire developed by Ostelo et al. 
(2003) and further validated by Houben et al. (2005b). It was designed to determine
Fig. 1. Overview of the compilation of the groups of therapists analysed to answer the research questions. [Flow diagram. Ephysion trial (n = 30): CE therapists (n = 13, of which 3 are manual therapists and 10 physiotherapists) and BGA therapists (n = 17). Neck Trial (n = 12): manual therapists (n = 12). Research question 1, "Do therapists who favour a different treatment arm differ in their treatment approach?": BGA therapists (n = 17) vs. CE therapists (n = 13) vs. manual therapists (n = 12). Research question 2, "Does primary specialisation (physiotherapy or manual therapy) influence the treatment approach?": 15 manual therapists in total vs. 10 physiotherapists. Research question 3, "Could BGA training influence therapists’ treatment approach?": BGA therapists (n = 17) vs. CE therapists (n = 13).]
physiotherapists’ treatment approach towards chronic low back pain. To make the questionnaire suitable for the present study we replaced ‘low back pain’ with ‘neck pain’. Therapists were asked to rate every item on a six-point Likert scale ranging from ‘totally disagree (1)’ to ‘totally agree (6)’. From this, two factors were generated, i.e. (1) a biomedical approach including 10 items, and (2) a biopsychosocial approach including nine items (Houben et al., 2005b). Each treatment approach is calculated by the sum of the items ranging from 10 to 60 on factor 1 and from 9 to 54 on factor 2. Higher scores on factor 1 indicate a biomedical treatment approach, and higher scores on factor 2 indicate a biopsychosocial treatment approach.
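The scoring rule described above (sum the six-point item ratings per factor) can be sketched as follows. This is a minimal illustration only: the assignment of item numbers to the two factors is a placeholder assumption, since the text gives the item counts (10 and 9) but not which items belong to which factor, and the function names are hypothetical.

```python
from typing import Dict, Iterable

# ASSUMPTION: placeholder item numbering. The real item-to-factor mapping
# is defined by Houben et al. (2005b) and is not given in the text.
BIOMEDICAL_ITEMS = range(1, 11)        # 10 items -> score range 10..60
BIOPSYCHOSOCIAL_ITEMS = range(11, 20)  # 9 items  -> score range 9..54


def factor_score(responses: Dict[int, int], items: Iterable[int]) -> int:
    """Sum the Likert ratings (1 = totally disagree .. 6 = totally agree)."""
    total = 0
    for item in items:
        rating = responses[item]
        if not 1 <= rating <= 6:
            raise ValueError(f"item {item}: rating {rating} is outside 1..6")
        total += rating
    return total


def score_pabs_pt(responses: Dict[int, int]) -> Dict[str, int]:
    """Return both factor scores for one therapist's 19 responses."""
    return {
        "biomedical": factor_score(responses, BIOMEDICAL_ITEMS),
        "biopsychosocial": factor_score(responses, BIOPSYCHOSOCIAL_ITEMS),
    }
```

Because the two factors are summed separately, a therapist can score high (or low) on both at once, which matches the statement that the categories are not opposite ends of one scale.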
2.3. Data collection The therapists in the Ephysion study received the PABS-PT twice: once at baseline (1 week before either the consensus meeting or the BGA training), and 3 months after the trial started. In the Neck Trial, therapists’ treatment approach was evaluated only 3 months after the trial started. Because the manual therapists from the Neck Trial showed no differences in demographics or characteristics compared with BGA and CE therapists and because they did not receive any training, their data were regarded as baseline data.
2.4. Statistical analysis 2.4.1. Research question 1 First, frequencies (number, mean, and standard deviation, SD) were calculated for demographics and characteristics of the participating therapists. To examine baseline differences in treatment approach we calculated scores for the biomedical and biopsychosocial approach and tested them using a one-way analysis of variance (ANOVA, research question 1). Fig. 1 shows which therapists were compared per research question. For further exploration of research question 1, we calculated a global treatment attitude at baseline, by combining the biomedical and biopsychosocial treatment approach after dividing the scores on both approaches into quartiles. Five different global treatment attitudes were derived, i.e. (1) therapists were considered to have a purely biomedical treatment attitude when their score was in the highest quartile on the biomedical treatment approach and in the lowest quartile on the biopsychosocial treatment approach, (2) they were considered to have a more biomedical treatment attitude when their score on the biomedical treatment approach was one quartile higher than their biopsychosocial score. The same applies vice versa for a (3) ‘purely’ or (4) ‘more’ biopsychosocial treatment attitude, and (5) therapists were considered to have a neutral treatment attitude when they scored both treatment approaches in
the same quartile. The division into global attitudes is descriptive; no further statistical analyses were carried out because of the small sample size. 2.4.2. Research question 2 Because of education differences, we assumed that primary specialisation (physiotherapy/manual therapy) could influence the treatment approach (research question 2). To examine this, the manual therapists from the CE treatment arm (n = 3) were added to the manual therapists (n = 12) of the Neck Trial. Then mean scores on the biomedical and biopsychosocial approach were calculated, and both groups were compared with an independent Student’s t-test (α = 0.05). 2.4.3. Research question 3 Finally, we evaluated whether BGA training could influence the treatment approach (research question 3). We calculated follow-up scores of the treatment approaches and the within-person changes between baseline and follow-up. Differences in follow-up scores were examined with independent Student’s t-tests and differences from baseline scores with dependent Student’s t-tests (α = 0.05). Then the possible influence of the BGA training on the within-person changes was evaluated with linear regression. Confounding was checked by separately adding variables that were assumed to influence the treatment approach. Variables were subsequently added to the multivariate model when they were related to both the BGA training (determinant) and the within-person change (outcome), and when they changed the regression coefficient of the BGA training by at least 10%; they were added in a block using the method ‘enter’. The examined variables were age (cut-off point 43 years, mean), gender, primary specialisation (physiotherapist/manual therapist), other trainings followed (biomedical/biopsychosocial training), experience of neck pain (yes/no), and work experience (cut-off point 18 years, mean) (Ostelo et al., 2003; Houben et al., 2005b).
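The derivation of the global treatment attitude in Section 2.4.1 can be sketched as below. Several details are assumptions made for illustration: the three quartile borders are passed in by the caller, a score exactly on a border is placed in the lower quartile, and "one quartile higher" is read as "any higher quartile that does not meet the 'purely' definition". The function names are hypothetical.

```python
from typing import Sequence


def quartile_of(score: float, borders: Sequence[float]) -> int:
    """Return the quartile (1..4) of a score, given three ascending borders.

    ASSUMPTION: a score equal to a border falls in the lower quartile.
    """
    quartile = 1
    for border in borders:
        if score > border:
            quartile += 1
    return quartile


def global_attitude(biomedical_q: int, biopsychosocial_q: int) -> str:
    """Combine two quartile ranks into one of five global attitudes."""
    if biomedical_q == biopsychosocial_q:
        return "neutral"
    if biomedical_q == 4 and biopsychosocial_q == 1:
        return "purely biomedical"
    if biopsychosocial_q == 4 and biomedical_q == 1:
        return "purely biopsychosocial"
    # ASSUMPTION: any remaining imbalance counts as a 'more ...' attitude.
    if biomedical_q > biopsychosocial_q:
        return "more biomedical"
    return "more biopsychosocial"
```

For example, `global_attitude(quartile_of(30, (24, 27, 29)), quartile_of(33, (34, 36, 39)))` classifies a therapist whose biomedical score is in the top quartile and whose biopsychosocial score is in the bottom quartile; the border values here are invented for the example.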
3. Results 3.1. Research question 1 In total, 42 baseline questionnaires were completed. Table 1 presents the baseline demographics, characteristics and treatment approaches of the three treatment arms. There were no significant differences in characteristics between the therapists. The overall mean age was 43.7 (SD 8.3) years and overall work experience was 19.1 (SD 7.5) years. In general, BGA therapists scored lower on the biomedical approach and higher on the biopsychosocial
Table 1 Baseline data on therapists’ gender/age, work characteristics and scores on treatment approach.

                                      Ephysion        Ephysion         Neck Trial
                                      CE therapists   BGA therapists   manual therapists
                                      (n = 13)        (n = 17)         (n = 12)
Male (n)                              11              14               9
Age in years, mean (SD)               42.6 (10.8)     44.3 (6.8)       44.2 (7.5)
Registered as manual therapist (n)    3               6                12
Work experience in years (SD)         17.1 (8.6)      19.8 (7.1)       20.3 (7.1)
Weekly hours work, mean (SD)          35.2 (9.7)      40.9 (12.0)      36.9 (11.7)
Biomedical, mean (SD)                 27.6 (4.7)      25.6 (5.4)       28.4 (8.7)
Biopsychosocial, mean (SD)            35.1 (4.7)      38.7 (4.5)       36.0 (6.4)

BGA = graded activity program.
approach compared to CE therapists and manual therapists. However, when tested with ANOVA, these differences were not significant for either the biomedical approach (p = 0.46) or the biopsychosocial approach (p = 0.14). The quartile borders (for calculating the global treatment attitude) lay at 24.2 and 29.0 points for the biomedical treatment approach and at 34.0 and 39.0 points for the biopsychosocial treatment approach, respectively. With these, the therapists were divided into five global treatment attitudes (Table 2). Table 2 shows that the majority of the CE therapists and manual therapists have a global biomedical attitude (76.9% and 58.3%, respectively) and the majority of the BGA therapists have a global biopsychosocial attitude (56.3%).

Table 2 The five different global treatment attitudes at baseline and the number (percentage) of therapists with that attitude per treatment arm.

                                   CE therapists   BGA therapists   Manual therapists
                                   (n = 13)        (n = 17)         (n = 12)
Purely biomedical attitude         3 (23.1%)       2 (12.5%)        6 (50%)
More biomedical attitude           7 (53.8%)       3 (18.8%)        1 (8.3%)
Neutral attitude                   0               3 (18.6%)        1 (8.3%)
More biopsychosocial attitude      1 (7.6%)        2 (12.5%)        1 (8.3%)
Purely biopsychosocial attitude    2 (15.4%)       7 (43.8%)        3 (25%)

The global treatment attitude was revealed by calculation of one overall score, which was done by combining the quartile scores of the biomedical and the psychosocial approach.

3.2. Research question 2 No differences were found for the influence of primary specialisation (physiotherapy/manual therapy) on the treatment approach. The mean biomedical score of
the manual therapists (n = 15) was 27.6 (SD 8.0) compared with 28.6 (SD 4.8) for the physiotherapists (n = 10) (mean difference [MD] 1.0, 95%CI −4.8; 6.8). The scores on the biopsychosocial approach were 35.7 (SD 5.9) and 35.3 (SD 5.1), respectively (MD −0.4, 95%CI −5.1; 4.4).
3.3. Research question 3 At 3-month follow-up, 27 questionnaires were returned in the Ephysion study. Three therapists (10%) did not return the follow-up questionnaire. They did not differ in demographics, characteristics and treatment approach at baseline compared to the other therapists. The treatment approach scores at follow-up are presented in Table 3. Table 3 shows significantly lower scores at follow-up on the biomedical approach for BGA therapists compared to CE therapists (MD −6.2 points, 95%CI −11.1; −1.3). The scores on the biopsychosocial approach for BGA therapists compared with CE therapists were significantly higher (MD 5.8 points, 95%CI 1.8; 9.9). With regard to the within-person changes from baseline to follow-up, the BGA therapists showed a significant decrease of 4.6 (95%CI 1.8; 7.4) points on the biomedical approach but no changes on the biopsychosocial approach. The CE therapists showed no within-person changes on either approach. Univariately, the BGA training was significantly related to the biomedical approach (B = −3.8, 95%CI −7.4; −0.3). The variables work experience and age were found to be confounders. However, because they were significantly correlated (r = 0.88) they could not be considered as separate variables. We considered work experience in physiotherapy a more important contributor to the development of a treatment approach than age and therefore added this variable to the multivariate model.

Table 3 Mean scores on the biomedical and biopsychosocial approach at 3-month follow-up and change scores from baseline to follow-up.

                              Scores at follow-up                Change scores from baseline to follow-up, mean (SD)
                              CE therapists    BGA therapists    CE therapists    BGA therapists
                              (n = 12)         (n = 15)
Biomedical, mean (SD)         26.9 (4.5)       20.7 (7.1)a       0.8 (3.7)        4.6 (4.9)b
Biopsychosocial, mean (SD)    34.5 (4.3)       40.4 (5.6)a       0.8 (3.5)        0.7 (4.8)

a BGA therapists’ scores on both approaches are significantly different from CE therapists’ scores.
b BGA therapists’ biomedical score has significantly decreased from the baseline score in Table 1.
Table 4 Final multivariate models of the influence of the BGA training on the within-person change on the biomedical and biopsychosocial approaches corrected for work experience.

Outcome                                                  Variables                  Ba       SE      95% CI
Within-person change on the biomedical approach          Constant                   0.81
                                                         BGA training               −4.37    1.73    −7.95, −0.79
                                                         Work experience (years)    2.43     1.73    −1.15, 6.01
Within-person change on the biopsychosocial approach     Constant                   6.99
                                                         BGA training               0.67     1.46    −2.35, 3.69
                                                         Work experience (years)    3.87     1.46    0.85, 6.89

a BGA training (1) vs. no BGA training (0); work experience 18 years (1) vs. work experience

3 months, NDI: 49 ± 17) and 31 controls participated. The whiplash group demonstrated elevated vibration, heat and electrical detection thresholds at most hand sites compared to controls (p < 0.05). Electrical detection thresholds in the lower limb were no different from controls (p = 0.83). Mechanical and cold pain thresholds were lower in the whiplash group (p < 0.05) with no group difference in heat pain thresholds (p > 0.1). SCL-90 scores were higher in the whiplash group but did not impact on any of the sensory measures. A combination of pain threshold and detection measures best predicted the whiplash group. Sensory hypoaesthesia and hypersensitivity co-exist in the chronic whiplash condition. These findings may indicate peripheral afferent nerve fibre involvement but could be a further manifestation of disordered central pain processing. © 2008 Elsevier Ltd. All rights reserved. Keywords: Whiplash injury; Sensory hypersensitivity; Hypoaesthesia; Quantitative sensory testing
1. Introduction Whiplash associated disorders (WADs) remain one of the most debated musculoskeletal conditions. Sensory disturbances including hypersensitive responses to mechanical, thermal and electrical stimulation have been consistently shown to be a feature of both the acute and chronic stages of the whiplash condition (Curatolo et al., 2001; Moog et al., 2002; Sterling et al., 2003a).

* Corresponding author. Centre of National Research on Disability and Rehabilitation Medicine (CONROD), The University of Queensland, Mayne Medical School, Herston QLD 4006, Australia. Tel.: +61 7 3365 5344; fax: +61 7 3346 4603. E-mail address: [email protected] (M. Sterling). 1356-689X/$ - see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.math.2007.12.004

Importantly some of the sensory changes have been shown to be associated with poor functional recovery (Kasch et al., 2005; Sterling et al., 2005). It is generally acknowledged that the sensory hypersensitivity represents augmented central nervous system pain processing mechanisms (Curatolo et al., 2001; Sterling et al., 2003a). However, some of the changes, particularly cold hyperalgesia and sympathetic nervous system (SNS) dysfunction, may be indicative of peripheral nerve pathology (Sterling et al., 2003a). This proposal has some basis as animal and cadaver models simulating whiplash injury have shown that the nonphysiological kinematic movement during the impact induces stresses in cervical neural tissue such as the nerve roots and spinal ganglia resulting in mechanical
A. Chien et al. / Manual Therapy 14 (2009) 138–146
compromise sufficient to cause structural damage (Ortengren et al., 1996; Taylor and Taylor, 1996; Cusick et al., 2001). Furthermore, mechanosensitivity has been demonstrated with clinical tests designed to provoke the brachial plexus as well as mechanical hyperalgesia over upper limb nerve trunks (Ide et al., 2001; Sterling et al., 2002a; Greening et al., 2005). Despite these findings, standard clinical neurological examination is often normal and deficits in nerve conduction studies are rarely found (Barnsley et al., 1998; Alpar et al., 2002). Although nerve conduction studies are reliable and reproducible when carried out by a single examiner (Chaudhry et al., 1994), they are limited by their ability to assess only large myelinated nerve fibres and the invasive nature of the technique. Quantitative sensory testing (QST) is proving to be a valuable tool to advance the classification of specific disorders and may be useful in illuminating the underlying mechanism of pain disorders (Edwards et al., 2005). Rolke et al. (2006) have demonstrated the validity of using comprehensive QST to obtain a complete somatosensory profile in order to characterize patients with suspected neuropathic conditions but such testing has never been undertaken in a WAD cohort. In a cross-sectional study design, comprehensive QST was used to further investigate the sensory presentation of chronic WAD. Different modalities were incorporated to provide an indirect measure of primary afferents that mediate both innocuous and painful sensation. We hypothesised that patients with chronic WAD would demonstrate elevated detection thresholds as well as widespread sensory hypersensitivity.
2. Materials and methods

2.1. Subjects Thirty-one volunteers (25 females, mean (SD) age 35.3 ± 10.7 years) with neck pain (3 months to 3 years duration) as a result of a motor vehicle crash were recruited. Subjects fulfilled the Quebec Task Force Classification criteria of WAD II, neck complaints and musculoskeletal signs but without conduction loss on clinical neurological examination (Spitzer et al., 1995). Subjects were excluded if they experienced concussion, loss of consciousness or head injury as a result of the accident, a previous history of neck or upper quadrant pain that required treatment and/or a diagnosed psychiatric disorder. The whiplash subjects were recruited via primary care practices and from advertisement within radio and print media. Thirty-one healthy volunteers (25 females, mean age 31.4 ± 8.9) also participated in the study. The control group was recruited from the general community provided they had never experienced trauma or injuries to the cervical spine, head, and upper quadrant requiring treatment. The study was approved by the institutional medical research ethics committee. All the subjects were unpaid volunteers and all gave written informed consent before inclusion.

2.2. Brachial plexus provocation test (BPPT) The BPPT which has been used in previous studies of whiplash (Sterling et al., 2002b; Sterling et al., 2003a) was performed. The angle of elbow extension was measured at pain threshold using a standard goniometer aligned along the mid-humeral shaft, medial epicondyle and ulnar styloid (Balster and Jull, 1997; Sterling et al., 2002b). Subjects indicated their pain during the test on a 10 cm visual analogue scale (VAS) where 0 indicated no pain and 10 was the worst pain imaginable.

2.3. QST 2.3.1. Pressure pain thresholds (PPTs) PPTs were measured using a pressure algometer (Somedic AB, Farsta, Sweden) with a probe size of 1 cm² and application rate of 40 kPa/s. Test sites included the articular pillars of C5/6, nerve trunk of the median nerve at the elbow bilaterally (palpated on the medial side of the biceps just before it forms its tendon) and at a bilateral remote site (muscle belly of tibialis anterior). The subjects depressed a button when the sensation under the probe changed from one of pressure alone to one of pressure and pain (Sterling et al., 2002b). Triplicate recordings were taken at each site and the mean values used for analysis.

2.3.2. Thermal (hot, cold) pain thresholds (TPTs) TPTs were measured using the Thermotest system (Somedic AB, Farsta, Sweden) over the mid-cervical spine and the distal aspect of C7/8 dermatomes (dorsal aspect of the hand). The temperature was preset to either increase or decrease at a rate of 1 °C/s from a baseline of 30 °C. The subject pressed a switch when the cold or warm sensation first became painful (Hurtig et al., 2001). The mean of three trials at each site was calculated for analysis.

2.3.3. Vibration detection thresholds (VTs) A vibrometer (Somedic AB, Sweden) with a tissue displacement range of 0.1–400 μm and a constant frequency of 120 Hz was used. In order to familiarise the subjects with the vibration stimulus, three trials of the test stimuli, or until the subject was able to consistently indicate the onset of the stimulus, were applied over the muscle belly of brachioradialis. Measures were taken over areas of the hand innervated by distal aspect of the C6 (palmar aspect of the 1st metacarpal), C7 (palmar aspect of 2nd metacarpal; dorsum of the
2nd metacarpal) and C8 dermatomes (dorsum of the 5th metacarpal). Subjects indicated when the vibration first appeared, the perception threshold (VPT), and when it disappeared, the disappearance threshold (VDT). The vibration threshold (VT) was the average of VPT and VDT. Triplicate recordings were taken at each site and the mean values used for analysis.
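The vibration threshold calculation just described (VT as the average of VPT and VDT, then the mean over triplicate recordings) can be written out as a short sketch. The function name and the example values are hypothetical and are not data from the study.

```python
from statistics import mean
from typing import Iterable, Tuple


def vibration_threshold(trials: Iterable[Tuple[float, float]]) -> float:
    """Mean VT over trials, where each trial is a (VPT, VDT) pair.

    VT for a single trial is the average of the perception threshold (VPT)
    and the disappearance threshold (VDT).
    """
    per_trial = [(vpt + vdt) / 2 for vpt, vdt in trials]
    return mean(per_trial)


# Three invented recordings at one site, in the device's displacement units.
site_vt = vibration_threshold([(0.5, 0.3), (0.6, 0.4), (0.4, 0.2)])
```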
2.3.4. Thermal (hot, cold) detection thresholds (TDTs) TDTs assess the function of afferent small myelinated A-delta fibres (cold sense) and unmyelinated C-fibres (warm sense) (Hallin et al., 1982; Adriaensen et al., 1983; Fowler et al., 1988). Incorporating the method of limits, the Thermotest (Somedic AB, Sweden) was used to measure TDTs over areas of the hand innervated by the C7 (dorsum over the 2nd metacarpal) and C8 (dorsum of the 5th metacarpal) dermatomes. The temperature was preset to either increase or decrease at a rate of 1 °C/s from a baseline of 30 °C. The patient pressed a switch when they first detected the sensation of warmth or cold.
2.3.5. Electrocutaneous detection and pain thresholds A non-noxious method of electrocutaneous stimulation was used in a method of limits procedure using the Neurometer device (Neurotron, Baltimore, USA). Sites tested were those innervated by C5/6 (anterior shoulder, inferior to shoulder joint line), C7 (distal phalanx of index finger), C8 (distal phalanx of 5th digit) and tibialis anterior as a remote site. Three different sinusoidal frequencies (2000 Hz, 250 Hz and 5 Hz) were applied to each site in order to evoke a response from a different subpopulation of sensory fibre (Katims et al., 1986; Katims et al., 1987). The subjects reported when they first perceived the sensation (perception threshold) and again at the intensity at which they could no longer feel the sensation (disappearance threshold). The mean of these two values was calculated and recorded three times for analysis. The same sites used to determine current detection thresholds were used to determine pain threshold but only a frequency of 250 Hz was used. As the stimulus intensity increased, the subject released a button when they first perceived the stimulus as painful. The procedure was repeated three times with the mean score recorded as electrical pain threshold. Ratios were obtained by dividing the electrocutaneous pain threshold by the electrocutaneous detection threshold. Low intensity electrical stimulation activates large A-beta nerve fibres. Current evoked pain at or close to detection threshold (ratio of less than 2:1) has been suggested to be a substrate of A-beta fibre allodynia (Sang et al., 2003).
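As a small illustration of the ratio described above, the sketch below divides a pain threshold by a detection threshold and flags ratios under the 2:1 value mentioned in the text. The function names and input values are hypothetical; both thresholds are assumed to be expressed in the same units.

```python
def pain_detection_ratio(pain_threshold: float, detection_threshold: float) -> float:
    """Electrocutaneous pain threshold divided by detection threshold."""
    if detection_threshold <= 0:
        raise ValueError("detection threshold must be positive")
    return pain_threshold / detection_threshold


def pain_close_to_detection(ratio: float, cutoff: float = 2.0) -> bool:
    """True when pain is evoked at or close to detection (ratio below 2:1)."""
    return ratio < cutoff


# Invented example: pain first reported at 1.5 times the detection intensity.
ratio = pain_detection_ratio(30.0, 20.0)
flag = pain_close_to_detection(ratio)
```

The 2:1 cut-off is kept as a parameter so that it can be varied; the text only states that a ratio below 2:1 has been suggested as a substrate of A-beta fibre allodynia, not that it is a diagnostic threshold.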
2.4. Sympathetic vasoconstrictor reflex (SVR) A laser Doppler (Moor Instruments, Devon, UK) was used to assess SNS function (Schurmann et al., 1999). Electrodes were attached to the thenar eminence of both hands. The test was performed with subjects in a comfortable supine position, arms resting at heart level. After a period of acclimatization and normal breathing, participants were asked to take a sudden deep breath. This provocation manoeuvre (inspiratory gasp) is known to cause a short sympathetic reaction and cutaneous vasoconstriction (Schurmann et al., 1999) and has been used in previous investigation of whiplash (Sterling et al., 2005). The procedure was repeated three times. Two quotients (SRF and QI) which describe vasomotor reflexes following the inspiratory gasp were calculated. SRF value represents the relative drop of the curve after the manoeuvre with the QI parameter also being influenced by the duration of perfusion decrease (Schurmann et al., 1999). 2.5. Questionnaires All participants completed the Neck Disability Index (NDI) (Vernon and Mior, 1991) and The Symptom Check List 90-R (SCL-90-R). The NDI was used to assess the extent of perceived functional disability. The SCL-90-R assessed the psychological well being of participants. 2.6. Procedure Once the informed written consent was obtained, testing was performed in the following order: SVR, BPPT, PPT (tibialis anterior, median nerve, C5/6), TDTs, TPTs, VTs, electrocutaneous detection (2000 Hz, 250 Hz, 5 Hz) and pain thresholds (250 Hz). The SVR testing was performed in a temperature-controlled laboratory. The temperature was set at 20 °C, lights were dimmed and ambient noise was kept low. The rest of the testing was completed in a standard air-conditioned laboratory. For all the measures, the left side was tested first followed by the right side. 2.7. Statistical analysis The SPSS 12.0 statistical package for Windows was used for analyses. 
Two-sample t-tests were used to examine within-subject side-to-side differences for all measures. A multivariate analysis of covariance (MANCOVA) was used to compare differences between the chronic whiplash group and controls. SCL-90-R scores were entered as covariates in the analysis. Receiver Operating Characteristic (ROC) analysis was used to examine the ability of each variable to discriminate between the groups. Variables with
a greater predictive capacity based on the significance level (p < 0.01) were entered in a logistic regression analysis to determine the best combination to predict group membership. The regression analysis was then subjected to cross-validation analysis (leave one out) to examine its reliability and generalisability. To determine differences in sensory measures between whiplash participants with or without arm pain, the Mann–Whitney U test was used. The presence of arm pain was defined as any pain (spontaneous or evoked) distal to the shoulder reported by the participants. For all analyses significance was set at p < 0.05.
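To illustrate what the ROC analysis measures for a single variable, the sketch below computes the area under the ROC curve as the proportion of (whiplash, control) pairs in which the whiplash value is higher, counting ties as half. This pairwise form is numerically the Mann–Whitney U statistic divided by the number of pairs. It is an illustrative sketch in Python, not the SPSS procedure used in the study, and the values are invented.

```python
from typing import Sequence


def roc_auc(cases: Sequence[float], controls: Sequence[float]) -> float:
    """Area under the ROC curve for one variable.

    Equals the probability that a randomly chosen case has a higher value
    than a randomly chosen control (ties count as 0.5).
    """
    if not cases or not controls:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Invented detection thresholds: higher values in the case group.
auc = roc_auc([3.0, 4.0, 5.0], [1.0, 2.0, 3.0])
```

An area of 0.5 means no discrimination. For a variable on which the whiplash group scores lower than controls (for example a pain threshold), the area falls below 0.5, and the variable discriminates equally well in the opposite direction.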
Table 1 Mean and standard deviation values for each variable.

Measures       Site         Whiplash             Control
                            Mean      SD         Mean      SD
PPT            Cx*          180.35    64.77      313.86    62.49
               Med*         212.67    99.17      300.97    61.26
               Tib Ant*     394.07    188.55     592.04    170.74
CPT 2000 Hz    Elb*         106.9     26.64      88.82     22.33
               Ind*         254.44    55.84      180       45.08
               Lit*         193.53    40.96      145.46    31.88
               Tib Ant      186.92    78.15      151.52    56.24
CPT 250 Hz     Elb          41.84     34.1       32.61     8.68
               Ind*         84.79     32.23      62.16     25.88
               Lit*         83.65     40.31      60.5      21.89
               Tib Ant      37.26     14.64      41.94     14.45
CPT 5 Hz       Elb          22        9.15       22.16     10.15
               Ind*         46.35     20.49      35.23     16.36
               Lit          42.53     25.79      34.84     14.02
               Tib Ant      27.89     17.38      23.11     10.03
CPT Pain       Elb*         0.33      0.19       0.21      0.1
               Ind*         0.55      0.17       0.34      0.14
               Lit*         0.53      0.16       0.35      0.15
               Tib Ant*     0.37      0.14       0.25      0.12
Vibration      Dor 5th*     0.48      0.4        0.29      0.12
               Dor 2nd*     0.4       0.27       0.26      0.09
               Palm 2nd*    0.46      0.31       0.28      0.16
               Palm 1st*    0.79      0.62       0.41      0.25
Heat Det       Ind*         34.91     2.29       32.35     1.43
               Lit*         34.43     2.2        32.32     1.12
Cold Det       Ind          28.99     1.55       29.58     0.85
               Lit*         28.62     2.05       29.56     0.82
Heat pain      Cx           44.67     3.12       45.71     2.6
               Hand         45.82     3.23       44.72     2.86
Cold pain      Cx*          15.4      8.45       8.03      3.21
               Hand*        14.78     7.97       9         3.05

PPT = pressure pain threshold, Cx = cervical spine, Med = median nerve, Tib Ant = tibialis anterior; CPT = current perception threshold, Elb = elbow, Ind = index finger, Lit = little finger; Dor 5th and Dor 2nd = dorsum surface of the 5th and 2nd metacarpal, Palm 1st and Palm 2nd = palmar surface of the 1st and 2nd metacarpal; Heat Det and Cold Det = heat and cold detection thresholds. *p < 0.05 on MANOVA of group difference between WAD and controls.

3. Results 3.1. Demographic details For the whiplash group, the mean (SD) symptom duration post injury was 16 ± 11 months. Twenty-four patients were involved in ongoing compensation claims; four had settled their claims and three had no compensation involved. The mean (SD) NDI score was 45.9% ± 18.8%, a moderate level of disability (Vernon and Mior, 1991). Forty-five percent of whiplash patients reported arm pain at the time of testing and 66% experienced headache.

3.2. Side to side differences There were no side to side differences for any variable in either group (all p > 0.05). The means of the left and right sides were calculated and used for further analysis.

3.3. BPPT The whiplash group demonstrated less elbow extension at pain threshold (22.3° ± 27.4°) (p = 0.05) and higher VAS scores (2.4 ± 2.3) compared to the control group (elbow extension: 11.0° ± 5.9°; VAS: 0.7 ± 1.1) (p = 0.05). 3.4. QST 3.4.1. Pain thresholds The whiplash group demonstrated lower PPTs at all test sites compared to controls (p < 0.05) (Table 1). There was no significant difference between the two groups for heat pain thresholds (p > 0.1), while cold pain thresholds were significantly reduced (pain at a higher temperature) at both sites in the whiplash group (p < 0.01) (Table 1). 3.4.2. VT Fig. 1 shows the average parameters for VT (mean and SD data shown in Table 1). The whiplash group
demonstrated elevated detection thresholds for all sites compared to the control group ( p < 0.05). 3.4.3. TDTs Heat detection thresholds were higher in the whiplash group for all test sites compared to the control group ( p < 0.01). Cold detection thresholds were reduced (detection at a lower temperature) in the whiplash group at the 5th metacarpal site ( p < 0.05) but no different from the controls at the 2nd metacarpal area ( p > 0.1) (Fig. 2).
A. Chien et al. / Manual Therapy 14 (2009) 138–146
3.4.4. Electrocutaneous stimulation thresholds
At 2000 Hz, the whiplash group demonstrated elevated electrical detection thresholds at the shoulder, index and little finger sites (p < 0.01). At 250 Hz the whiplash group demonstrated elevated electrical detection thresholds at the index and little finger sites (p < 0.01) and at 5 Hz the same group showed elevated detection thresholds at the index finger site (p < 0.05) (Fig. 3). There was no difference between the groups for electrical detection thresholds measured at tibialis anterior (p = 0.83). At 250 Hz, the whiplash group demonstrated lowered pain thresholds at all sites (p < 0.05). At tibialis anterior a 37% decrease was found, while at all other sites, the whiplash group demonstrated a 20% decrease in pain thresholds. For the electrocutaneous pain over detection threshold ratios, the whiplash group showed differences at all sites when compared to controls (p < 0.01) (Fig. 4). The index and little finger sites were found to have a pain over detection threshold ratio of less than two (Table 1).

Fig. 1. Mean (SE) VTs in the whiplash group and control groups. The stimulus was applied over areas of the hand innervated by C6 (Palm 1st), C7 (Palm 2nd, Dor 2nd) and C8 dermatomes (Dor 5th). *p < 0.05 Significantly different from the control group.
3.5. SVR
The whiplash group demonstrated higher QI (76.00 ± 12.54) (p = 0.05) and lower SRF (0.55 ± 0.17) (p = 0.05), indicating reduced vasoconstriction when compared to the control group (QI: 68.17 ± 7.58; SRF: 0.65 ± 0.08).

3.6. ROC analysis
Areas under the curve for the ROC analysis of all sensory tests are summarised in Table 2. Logistic regression showed that C5/6 and median nerve PPT, 2nd metacarpal heat detection threshold and index finger 2000 Hz detection and pain over detection ratio were the strongest variables and predicted group membership at 96.77%. Cross-validation demonstrated that the four variables combined revealed high sensitivity and specificity to predict group membership (90.32%).
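The area under the ROC curve reported for each sensory test can be read as the probability that a randomly chosen whiplash participant shows a more extreme value than a randomly chosen control. The following is a minimal sketch of that calculation, not the procedure used in the study: the function name `roc_auc` and the threshold values are hypothetical and purely illustrative.

```python
def roc_auc(cases, controls):
    """Area under the ROC curve by pairwise comparison.

    Equals the Mann-Whitney U statistic divided by
    len(cases) * len(controls); ties count as half a win.
    """
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical heat detection thresholds in deg C (not study data).
whiplash = [34.9, 36.1, 33.8, 35.2, 32.6]
control = [32.4, 33.0, 31.9, 33.9, 32.6]

auc = roc_auc(whiplash, control)
print(f"AUC = {auc:.2f}")
```

An area of 0.5 corresponds to no discrimination and 1.0 to perfect separation. For measures where lower values characterise the whiplash group (for example PPT), the two argument lists would be swapped or the values negated.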
Fig. 2. Thermal (warmth and cold) detection thresholds (mean ± SE) in the whiplash group and control groups. The stimulus was applied over the dorsum aspect of the hand corresponding to the C7 (Ind: dorsum over the 2nd metacarpal) and C8 (Lit: dorsum of the 5th metacarpal) dermatomes. **p < 0.01, *p < 0.05 Significantly different from the control group, respectively. [Panels: heat detection threshold and cold detection threshold; y-axis: temperature (°C); x-axis: site.]

Fig. 3. Electrocutaneous detection thresholds (mean ± SE). The figure illustrates the sites and frequencies demonstrating significant difference between the whiplash and control groups. **p < 0.01, *p < 0.05 Significantly different from the control group, respectively. (Sh2000, anterior shoulder 2000 Hz (C5/6); Ind2000, index finger 2000 Hz (C7); Lit2000, little finger 2000 Hz (C8); Ind250, index finger 250 Hz (C7); Lit250, little finger 250 Hz (C8); Ind5, index finger 5 Hz (C8)). [y-axis: electrocutaneous threshold (mA); x-axis: site and frequency (Hz).]
Fig. 4. Ratio of electrocutaneous detection threshold/electrocutaneous pain thresholds for the whiplash and control groups (mean ± SE). The whiplash group showed statistically significant difference for all sites when compared to controls (**p < 0.01). [y-axis: electrocutaneous pain over detection ratio; x-axis: site (Shoulder, Index, Little, TibAnt).]
3.7. Psychological distress (SCL-90-R)
The whiplash subjects showed elevated distress, in particular the subscales of somatization (72 ± 4 vs 41 ± 5), depression (73 ± 6 vs 43 ± 5) and general severity index (66 ± 5 vs 46 ± 5) compared to controls (p = 0.01) (Table 3). Comparing the means of the general severity index, 21 out of the 31 whiplash participants (68%) demonstrated elevated scores above the population norms (Derogatis, 1977). However, when SCL-90-R scores were entered into the analysis as a covariate, group differences remained significant for all measures (p < 0.05) and the effect size on the sensory measures was small (s2 ranged from 0.031 to 0.157).

3.8. Arm pain vs no arm pain
There was no difference between patients with reported arm pain (n = 13) and those without for age, gender and all QST measures (p > 0.05). There was no difference between the groups for NDI scores (arm pain 42.8 ± 15.0; no arm pain 44.8 ± 20.9) (p = 1.27).
Table 2 ROC curves, area under the curve and its significance to discriminate between the whiplash and control groups, for all variables.
Variable                                      Area   Standard error   Significance
C5/6 PPT                                      0.92   0.05             0.00*
Index finger, 2000 Hz detection threshold     0.9    0.04             0.00*
2nd Metacarpal, heat detection threshold      0.89   0.05             0.00*
5th Metacarpal, heat detection threshold      0.86   0.05             0.00*
Little finger, 2000 Hz detection threshold    0.86   0.05             0.00*
Index finger, pain over detection ratio       0.84   0.06             0.00*
Tib Ant PPT                                   0.8    0.06             0.00*
Little finger, pain over detection ratio      0.79   0.06             0.00*
Median nerve PPT                              0.78   0.07             0.00*
Tib Ant, pain over detection ratio            0.76   0.07             0.00*
QI value                                      0.73   0.07             0.01
Little finger, 250 Hz detection threshold     0.72   0.07             0.01
SVR value                                     0.71   0.08             0.01
Cx cold pain threshold                        0.71   0.08             0.01
Hand cold pain threshold                      0.71   0.08             0.01
Index finger, 250 Hz detection threshold      0.71   0.07             0.01
Shoulder, pain over detection ratio           0.7    0.07             0.01
Index finger, 5 Hz detection threshold        0.69   0.07             0.02
5th Metacarpal, cold detection threshold      0.69   0.08             0.02
Shoulder, 2000 Hz detection threshold         0.68   0.07             0.03
Dorsum 5th metacarpal, VDT                    0.68   0.08             0.03
Palmar 1st metacarpal, VDT                    0.68   0.08             0.02
Palmar 2nd metacarpal, VDT                    0.67   0.08             0.03
Tib Ant, 2000 Hz detection threshold          0.65   0.08             0.06
Dorsum 2nd metacarpal, VDT                    0.63   0.08             0.10
Hand, heat pain threshold                     0.61   0.08             0.18
Tib Ant, 250 Hz detection threshold           0.61   0.08             0.18
Little finger, 5 Hz detection threshold       0.61   0.08             0.17
2nd Metacarpal, cold detection threshold      0.6    0.08             0.22
Tib Ant, 5 Hz detection threshold             0.57   0.09             0.41
Shoulder, 250 Hz detection threshold          0.56   0.08             0.50
Cx heat pain threshold                        0.56   0.08             0.45
Shoulder, 5 Hz detection threshold            0.53   0.08             0.74
*p < 0.05.

4. Discussion
The results of this study confirm the presence of generalised sensory hypersensitivity in chronic whiplash. Consistent findings of widespread decreased pain thresholds to a variety of sensory stimuli (pressure, thermal, electrical) likely reflect augmented central pain processes as a contributing factor to whiplash pain (Curatolo et al., 2001; Moog et al., 2002). For the first time, the results of this study demonstrate the additional presence of elevated detection thresholds or hypoaesthesia. Hypoaesthesia was found for vibration, electrical and thermal stimulation. VTs were elevated by an average of 40% and present across areas of the hands innervated by the lower cervical nerve roots. This is consistent with dysfunction of large myelinated or A-beta sensory fibres (Lang et al., 1995; Greening and Lynn, 1998). Altered vibration detection sense is thought to be an early indicator of neural pathology (Greening et al., 2003). Electrical stimulation at detection threshold levels
Table 3 SCL-90-R psychological subscales (mean ± SD) for whiplash and control groups.
SCL-90-R subscale           Whiplash   Control   p Value
Somatization                72         41
Obsessive compulsive        57         42
Interpersonal sensitivity   51         42
Depression                  73         43
Anxiety                     54         41
Hostility                   51         41
Phobic anxiety              50         45
Paranoid ideation           48         42
Psychoticism                52         42
General severity index      66         46
9 (9.9%) 28 (30.8%) 13 (14.3%) 11 (12.1%) 12 (13.2%)
26 weeks                                       18 (19.8%)
Previous periods of shoulder complaints
  No                                           31 (34.1%)
  Yes, left shoulder                           23 (25.3%)
  Yes, right shoulder                          28 (30.8%)
  Yes, both shoulders                          9 (9.9%)
Previous neck complaints (minimally 1 week)
  No                                           36 (39.6%)
  Yes                                          55 (60.4%)
Development of complaints
  Rapid/acute                                  28 (31%)
  Gradual                                      63 (69%)
Shoulder pain (range 0–10)                     3.4 (2.2)
Shoulder restrictions (range 0–10)             4.5 (2.8)
J.G. Nomden et al. / Manual Therapy 14 (2009) 152–159

Table 2 Cohen’s Kappa and absolute agreement for dichotomous data.
Variables                                                             Kappa   Absolute agreement (%)
Active painful arc (present, absent)                                  0.46    74
Passive painful arc (present, absent)                                 0.52    76
Impingement (present, absent)                                         0.47    74b
Acromioclavicular swelling (present, absent)                          –       99a
Springing test first rib range of motion (normal, restricted)         0.26    66
Springing test first rib stiff (present, absent)                      0.09    68
Springing test first rib pain (present, absent)                       0.66    82a
–: Cohen’s Kappa could not be calculated because of incomplete filling of the 2 × 2 tables.
a Tests fulfilling criteria for acceptable reliability.
b Test only performed if no restrictions in glenohumeral range of motion were found.
Table 3 Weighted Kappa and absolute agreement for ordinal data.
Variables                      Kappa   Absolute agreement (%)
Range of motion
  HIN                          0.52    85a
  HIB                          0.73    94a
Pain
  HIN                          0.52    79
  HIB                          0.35    73
  Abduction active             0.65    90a
  Abduction passive            0.69    91a
  External rotation passive    0.50    82a
  Impingement                  0.62    91a
  Acromioclavicular joint      0.51    90a
a Tests fulfilling criteria for acceptable reliability.

calculated because of incomplete filling of the 2 × 2 tables. For two tests (‘acromioclavicular swelling’ and ‘springing test first rib pain’) acceptable reliability (absolute agreement > 80%) was found. Table 3 shows the results in absolute agreement for ordinal data. In two functional tests (‘pain HIN’ and ‘pain HIB’) the absolute agreement was less than 80%. In the other seven tests the reliability was acceptable. Data of the differences between observers, results of t-tests for differences in mean range of motion between observers, and the corresponding ICC are shown in Table 4. For the tests ‘abduction passive starting point of painful arc’ and ‘passive external rotation’ the difference between the observers was statistically significant. For these outcome variables no plots were made because systematic differences between the observers exist (Bland and Altman, 1986). In Figs. 1 and 2 Bland and Altman plots are shown for ‘abduction range of motion active’ and ‘abduction active starting point of painful arc’ to illustrate the magnitude and direction of differences across the range of measurements. No funnel shape was observed in the plots. Similar results are found in Bland and Altman plots for ‘abduction passive range of motion’,
‘abduction active end point of painful arc’ and for ‘abduction passive end point of painful arc’. Thus differences between observers were consistent across the range of measurements for these tests. In two tests (range of motion in active and passive abductions) an ICC of >0.75 was observed. For these tests the interobserver reliability was acceptable. In summary, 11 of the 23 tests (48%) had an acceptable interobserver reliability.
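The two agreement statistics reported for the dichotomous tests can be illustrated with a short calculation. The sketch below is an assumption-laden illustration rather than the authors' analysis: the function names and the ten ratings are hypothetical, and it computes unweighted Cohen's kappa only (the ordinal tests in Table 3 use a weighted kappa, which is not shown here).

```python
from collections import Counter


def absolute_agreement(r1, r2):
    """Percentage of cases on which both observers gave the same rating."""
    same = sum(a == b for a, b in zip(r1, r2))
    return 100.0 * same / len(r1)


def cohens_kappa(r1, r2):
    """Unweighted Cohen's kappa for two observers rating the same cases."""
    n = len(r1)
    p_obs = sum(a == b for a, b in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    # Agreement expected by chance from each observer's marginal totals.
    p_exp = sum(c1[k] * c2[k] for k in c1) / (n * n)
    if p_exp == 1.0:
        return None  # kappa is undefined when chance agreement is 1
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical ratings for ten patients (1 = present, 0 = absent).
observer_1 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
observer_2 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

agreement = absolute_agreement(observer_1, observer_2)
kappa = cohens_kappa(observer_1, observer_2)
print(f"agreement = {agreement:.0f}%, kappa = {kappa:.2f}")
```

In this made-up example the observers agree on 8 of 10 patients, yet kappa is considerably lower than the raw agreement because part of that agreement is expected by chance. This is one reason a test can reach high absolute agreement while its kappa stays modest.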
4. Discussion
Substantial variation in the interobserver reliability, ranging from poor to good reliability in the tests of physical examination of the shoulder girdle was found in this study. In the 23 tests performed 11 (48%) fulfilled the criteria of an acceptable reliability. For the tests on dichotomous data two out of seven tests showed acceptable reliability, for tests on ordinal data seven out of nine tests showed acceptable reliability and for tests on interval data two out of seven tests showed acceptable reliability (Tables 2–4). Thus, tests on ordinal data showed a higher reliability than tests on dichotomous or interval data. One might consider several explanations for the overall moderate reliability reported in this study. These explanations are related to the data level of the physical examination, training effects within patients, difference between observers and changes of the outcome as a result of the first physical examination.

4.1. Data level
An explanation for better reliability results of tests at ordinal data level could be that patients prefer more response options. Answering on a more gradual, ordinal, scale (no pain, little pain, much pain, and excruciating pain) might be easier than answering on a dichotomous scale: pain absent or present. On a gradual scale patients can indicate more precisely how they experience the pain during the test. The tests producing interval data were all tests based on visual estimation by the observer of active/passive range of motion and starting/end point of a painful arc. Two movements at most were performed during which the examiner had to do his assessment because this was the trial protocol. For the movements active and passive abductions a good reliability was found despite the large standard deviations of the mean difference between the observers. For the observer it may be more difficult (i.e. less reliable) to assess range of motion during the movement, as for instance the starting point or end point of a painful arc, than in an end position of active and passive abductions. A significant difference between the assessments of the two observers was found
Table 4 Differences between observer 1 and observer 2, results of t-test for related samples and ICCs.
Variable                          Observer 1 mean (SD)   Observer 2 mean (SD)   Difference mean (SD)   p (t-test)   ICC
Abduction range of motion
  Active                          160.2 (40.0)           160.2 (38.8)           0.0 (11.1)             1.000        0.96a
  Passive                         165.9 (33.0)           165.0 (34.3)           1.0 (10.0)             0.346        0.96a
Abduction active
  Starting point of painful arc   104.8 (39.2)           110.7 (37.2)           5.9 (28.5)             0.180        0.72
  End point of painful arc        158.0 (26.4)           153.0 (31.4)           5.0 (26.7)             0.226        0.57
Abduction passive
  Starting point of painful arc   114.7 (35.2)           126.9 (36.3)           12.2 (33.1)            0.032b       0.54
  End point of painful arc        162.6 (24.8)           160.9 (26.5)           1.6 (19.5)             0.617        0.72
                                  55.5 (19.4)            63.2 (21.5)            7.7 (14.2)
0.05). Data normality was confirmed using the Kolmogorov–Smirnov test, which allowed a number of t-tests for dependent gait variables to determine significant differences between the two experimental conditions (PRE and POST). Statistical tests were performed in the Statistica® software package, version 5.5, and the significance level was set at p < 0.05. Bonferroni’s correction was performed to adjust the significance level.
A.L.F. Rodacki et al. / Manual Therapy 14 (2009) 167–172

3. Results
The findings of the study are summarized in Table 1. These show a number of significant differences between the gait parameters before and after stretching. After stretching participants were able to achieve a 6.6% greater walking velocity achieved by greater step length with no change in cadence. The increase in step length was mainly achieved by virtue of greater motion about the pelvis with increases in both anterior tilt and rotation in the transverse plane. The gait pattern also showed changes in the temporal pattern. The increased gait velocity after stretching was accompanied by a reduction in the stance time, a lower proportion of time in double support and a longer swing duration. These temporal changes are indicative of improved balance.

Table 1 Mean gait variables (standard deviation) before (PRE) and after (POST) stretching exercises, the mean difference and variability within subjects’ trials (Vwt).
Variable (unit)   PRE            POST            Difference (%)   Vwt
CYD (s)           1.10 ± 0.09    1.09 ± 0.09     −0.6             0.02
STD (%)           62.0 ± 2.1     60.1 ± 2.4      −3.1 (*)         0.03
SWD (%)           38.0 ± 2.1     39.9 ± 2.4      +5.1 (*)         0.12
DSD (s)           0.18 ± 0.01    0.17 ± 0.005    −5.6 (*)         0.06
CAD (step/min)    55.3 ± 4.8     55.7 ± 4.3      +0.7             0.02
SLE (m)           0.51 ± 0.07    0.54 ± 0.06     +5.8 (*)         0.16
CLE (m)           0.014 ± 0.01   0.017 ± 0.01    +16.2            0.49
SPE (m s⁻¹)       0.96 ± 0.16    1.02 ± 0.15     +6.6 (*)         0.03
HSV (m s⁻¹)       1.09 ± 0.3     1.17 ± 0.2      +6.9             0.18
PAM (°)           12.9 ± 2.6     15.7 ± 4.6      +21.9 (*)        0.12
PRO (°)           5.0 ± 1.5      6.5 ± 1.7       +28.4 (*)        0.23
HAM (°)           24.4 ± 3.1     25.8 ± 3.7      +5.7             0.15
KAM (°)           49.9 ± 3.3     50.2 ± 5.5      +0.6             0.05
AAM (°)           23.2 ± 3.6     24.9 ± 5.7      +7.3             0.06
Significant differences (p ≤ 0.05) are marked (*). CYD – cycle duration; STD – stance phase duration; SWD – swing phase duration; DSD – double support phase duration; CAD – cadence; SLE – step length; CLE – toe clearance; SPE – gait speed; HSV – heel velocity at foot strike; PAM – pelvic anterior/posterior tilt amplitude; PRO – pelvic rotation; HAM – hip flexion/extension amplitude; KAM – knee flexion/extension amplitude; AAM – ankle dorsiflexion/extension amplitude; Vwt – variability within subjects’ trials.
4. Discussion
This study aimed to analyse the acute effects of stretching the hip flexor muscles on walking gait. It was hypothesized that the transient effect of a single bout of static stretching exercises would acutely increase joint range of motion and change gait pattern. These changes are expected to reduce the risk of falls in the elderly (Kerrigan et al., 1998, 2001, 2003; Evans et al., 2003). The gait pattern exhibited immediately before stretching showed dynamic temporal and spatial features similar to those reported in other studies (Murray et al., 1969; Winter, 1991; Prince et al., 1997; Kerrigan et al., 1998; Mills and Barrett, 2001). This indicated that the sample used in the present study was adequate to represent the general healthy elderly population living independently in the community. Aging-related conditions (e.g., balance problems, osteoarthritis) may produce changes in gait pattern that could influence our results. The stretching protocol used in this study was similar to several others, which have shown significant gains in range of motion (Murray et al., 1969; Taylor et al., 1990; Bandy et al., 1997; Prince et al., 1997; Feland et al., 2001a,b). Although the acute effects of stretching were not recorded during the experimental session to determine whether they were still present during the gait assessment, the short interval imposed (30 s) was considered sufficient to preserve most exercise effects. Spernoga et al. (2001) analysed the muscle–tendon elastic properties over a much longer period and detected that significant effects were still present 6 min after stretching. Gait speed has been suggested as the best independent fall-related predictor (Dargent-Molina et al., 1996). Guimarães and Isaacs (1980) and Woo et al. (1995) have demonstrated that fallers tend to have a lower gait velocity in comparison to non-fallers. Therefore, the greater walking speed found after stretching suggests that these exercises succeeded in improving some important functional effects of aging and resulted in improved mobility. Thus, stretching exercises may represent an important strategy to reduce risk of falls during walking.
Walking speed is ultimately determined by step length and cadence (Zakas et al., 2005). The greater walking speed found in the present study cannot be explained by cadence, which remained unaltered. Rather, increased step length as a result of increased pelvic rotation and tilting range of motion can be considered as the key to the greater walking speed after stretching. The greater range of motion around the pelvis may have allowed the heel of the swinging leg to strike further in front of the body (Rose and Gamble, 2006). Increased pelvic rotation is believed to have an important effect on gait dynamics by flattening the summit of the centre of mass path, which produces a smoother displacement of the body (Rose and Gamble, 2006). It is also described as causing a smoother change in the centre of mass that allows the elderly to attenuate the impact forces with the ground. Thus, it can be speculated that reducing the impact forces at heel strike may help to reduce head acceleration during progression and provide a facilitated stabilization of the visual platform (Yack and Berger, 1993) and fewer disturbances over the vestibular apparatus. Increased double support time in the elderly (Kemoun et al., 2002) is another well known predictor of falls. The longer duration of double support can be seen as a necessity to increase stability during progression for the next step (Viel, 2001). Therefore, smaller double support time may indicate a better stability during gait, which may also represent a measure of mobility. This reinforces the idea that stretching exercises can be an effective way to improve gait performance in the elderly. The anterior–posterior heel contact velocity and the toe clearance have been related to the risk of fall (Winter, 1991). The anterior–posterior heel contact velocity was similar to that described in other studies (1.15 m s⁻¹; Sadeghi et al., 2000). This variable is considered to be largely determined by the segmental angular velocities of the thigh, shank and foot of the swinging leg. The stability of the segmental angular velocities found in the present study can explain the unchanged anterior–posterior heel contact velocity and clearance.
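The statement that walking speed is determined by step length and cadence can be checked against the group means in Table 1. The sketch below rests on one labelled assumption: the reported cadence (about 55 per minute) is read as gait cycles per minute, which is consistent with the reported cycle duration of about 1.1 s, so each count covers two steps. The function name `gait_speed` and the parameter `steps_per_count` are hypothetical.

```python
def gait_speed(step_length_m, cadence_per_min, steps_per_count=2):
    """Walking speed in m/s from mean step length and cadence.

    steps_per_count=2 treats one cadence count as a full gait cycle
    (two steps); pass 1 if cadence is counted in single steps.
    """
    return step_length_m * steps_per_count * cadence_per_min / 60.0


# Group means for step length (SLE) and cadence (CAD) from Table 1.
pre = gait_speed(0.51, 55.3)
post = gait_speed(0.54, 55.7)
print(f"PRE {pre:.2f} m/s, POST {post:.2f} m/s")
```

Under that assumption the product of the means gives roughly 0.94 and 1.00 m/s, close to the reported gait speeds of 0.96 and 1.02 m/s. Exact agreement is not expected, because a product of group means only approximates the mean of individual speeds. With cadence nearly unchanged, the gain comes almost entirely from the longer step.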
5. Conclusion
Stretching exercises resulted in important modifications in gait characteristics that allowed the elderly to present a movement pattern more similar to that observed in healthy adults. These results suggest that these exercises constitute an attractive strategy to improve gait and/or reduce the negative influence of aging on a number of functional characteristics related to fall risk during gait. It is important to bear in mind that stretching exercises are an important component of physical fitness programs and should be viewed as one of the factors that influence gait performance.
Studies analyzing the long-term effects of stretching exercises performed under supervision are required to observe whether the transient effects shown in the present study occur as a result of a systematic training program. In addition, longitudinal studies relating stretching and the risk of fall are necessary to confirm these suppositions experimentally.

Conflicts of interest
Authors have exclusive academic interest in this manuscript and there are no conflicts of interest in the present submission.

References
Andersson BVG, Schultz AB. Transmission of moments across the elbow joint and the lumbar spine. Journal of Biomechanics 1979;12:745–55.
Bandy WD, Irion JM, Briggler M. The effect of time and frequency of static stretching on flexibility of the hamstring muscles. Physical Therapy 1997;77(7):1090–6.
Blake A, Morgan K, Bendall M. Falls by elderly people at home: prevalence and associated factors. Age & Ageing 1988;17:365–72.
Cameron I, Quine S. External hip protectors: likely non-compliance among high risk elderly living in the community. Archives of Gerontology and Geriatrics 1994;19:273–81.
Campbell AJ, Borrie MJ, Spears GF. Risk factors for falls in a community-based prospective study of people 70 years and older. Journal of Gerontology 1989;44:112–7.
Cummings SR, Black DM, Nevitt MC, Browner WS, Cauley JA, Genant HK, et al. Appendicular bone density and age predict hip fracture in women. JAMA 1990;263:665–8.
Dargent-Molina P, Favier F, Grandjean H, Baudoin C, Schott AM, Hausherr E, et al. Fall-related factors and risk of hip fractures: the EPIDOS prospective study. The Lancet 1996;348:145–9.
Evans JM, Zavarei K, Lelas JJ, Riley PO, Kerrigan DC. Reduced hip extension in the elderly: dynamic or postural? Archives of Physical Medicine and Rehabilitation 2003;84:A15.
Feland JB, Myrer JW, Merrill RM. Acute changes in hamstring flexibility: PNF versus static stretch in senior athletes. Physical Therapy in Sport 2001a;2:186–93.
Feland JB, Myrer JW, Schulthies SS, Fellingham GW, Measom GW. The effect of duration of stretching of the hamstring muscle group for increasing range of motion in people aged 65 years or older. Physical Therapy 2001b;81(5):1110–7.
Ferber R, Osternig LR, Gravelle DC. Effect of PNF stretch techniques on knee flexor muscle EMG activity in older adults. Journal of Electromyography and Kinesiology 2002;12:391–7.
Guimarães JMN, Farinatti PTV. Análise descritiva de variáveis teoricamente associadas ao risco de quedas em mulheres idosas. Rev. Bras. Med. Esporte 2005;11(5):299–305.
Guimarães RM, Isaacs B. Characteristics of the gait of old people who fall. International Rehabilitation Medicine 1980;2:177–80.
Halbertsma J, Goeken L. Stretching exercises: effect on passive extensibility and stiffness in short hamstrings of healthy subjects. Archives of Physical Medicine and Rehabilitation 1994;74:976–81.
Honeycutt PH, Ramsey P. Factors contributing to falls in elderly men living in the community. Geriatric Nursing 2002;23(5):250–5.
Kemoun G, Thoumie P, Boisson D, Guieu JD. Ankle dorsiflexion delay can predict falls in the elderly. Journal of Rehabilitation Medicine 2002;34:278–83.
Kerrigan DC, Lee LW, Collins JJ, Riley PO, Lipsitz LA. Reduced hip extension during walking: healthy elderly and fallers versus young adults. Archives of Physical Medicine and Rehabilitation 2001;82:26–30.
Kerrigan DC, Todd MK, Della Croce U, Lipsitz LA, Collins JJ. Biomechanical gait alterations independent of speed in the healthy elderly: evidence for specific limiting impairments. Archives of Physical Medicine and Rehabilitation 1998;79:317–22.
Kerrigan DC, Xenopoulos-Oddsson A, Sullivan MJ, Lelas JJ, Riley PO. Effect of a hip flexor-stretching program on gait in the elderly. Archives of Physical Medicine and Rehabilitation 2003;84:1–6.
King MB, Whipple RH, Gruman CA, Judge JO, Schmidt JA, Wolfson LI. The performance enhancement project: improving physical performance in older persons. Archives of Physical Medicine and Rehabilitation 2002;83:1060–9.
Kubo K, Kanehisa H, Fukunaga T. Effect of stretching training on the viscoelastic properties of human tendon structures in vivo. Journal of Applied Physiology 2002;92:595–601.
McHugh M, Magnusson S, Gleim G, Nicholas J. Viscoelastic stress relaxation in human skeletal muscle. Medicine & Science in Sports & Exercise 1992;24(12):1375–82.
Mills PM, Barrett RS. Swing phase mechanics of healthy young and elderly men. Human Movement Science 2001;20:427–46.
Murray MP, Kory RC, Clarkson BH. Walking patterns in healthy old men. Journal of Gerontology 1969;24:169–78.
Prince F, Corriveau H, Hébert R, Winter DA. Gait in elderly. Gait and Posture 1997;5:128–35.
Rose J, Gamble JG. Human walking. 3rd ed. Baltimore: Lippincott Williams & Wilkins; 2006.
Sadeghi H, Allard P, Prince F, Labelle H. Symmetry and limb dominance in able-bodied gait: a review. Gait and Posture 2000;12:34–45.
Spernoga SG, Uhl TL, Arnold BL, Gansneder BM. Duration of maintained hamstring flexibility after one-time, modified hold-relax stretching protocol. Journal of Athletic Training 2001;36(1):44–8.
Taylor DC, Dalton JD, Seaber AV, Garrett WE. Viscoelastic properties of muscle–tendon units: the biomechanical effects of stretching. The American Journal of Sports Medicine 1990;18(3):300–9.
Viel E. A marcha humana, a corrida e o salto: biomecânica, investigações, normas e disfunções. Manole; 2001.
Willy R, Kyle B, Moore S, Chleboun G. Effect of cessation and resumption of static hamstring muscle stretching on joint range of motion. Journal of Orthopaedic and Sports Physical Therapy 2001;31(3):138–44.
Winter DA. The biomechanics and motor control of human gait: normal, elderly and pathological. 2nd ed. Waterloo: University of Waterloo Press; 1991.
Woo J, Ho SC, Lau J, Chan SG, Yuen YK. Age-associated gait-changes in the elderly: pathological or physiological? Neuroepidemiology 1995;14:65–71.
Yack HJ, Berger RC. Dynamic stability in the elderly: identifying a possible measure. Journal of Gerontology 1993;48:225–30.
Zakas A, Balaska P, Grammatikopoulou MG, Zakas N, Vergou A. Acute effects of stretching duration of range of motion of elderly woman. Journal of Bodywork and Movement Therapies 2005;9:270–6.
Manual Therapy 14 (2009) 173–179 www.elsevier.com/math
Original Article
A neuropathic pain component is common in acute whiplash and associated with a more complex clinical presentation
Michele Sterling a,b,*, Ashley Pedler c
a Centre of National Research on Disability and Rehabilitation Medicine (CONROD), The University of Queensland, Mayne Medical School, Herston Road, Herston, QLD 4066, Australia
b Division of Physiotherapy, The University of Queensland, QLD 4006, Australia
c Division of Physiotherapy, The University of Queensland, QLD 4072, Australia
Received 21 September 2007; received in revised form 6 January 2008; accepted 21 January 2008
Abstract
Whiplash is a heterogeneous condition with some individuals showing features suggestive of neuropathic pain. This study investigated the presence of a neuropathic pain component in acute whiplash using the Self-reported Leeds Assessment of Neuropathic Signs and Symptoms scale (S-LANSS) and evaluated relationships among S-LANSS responses, pain/disability, sensory characteristics (mechanical, thermal pain thresholds, and Brachial plexus provocation test (BPPT) responses) and psychological distress (General Health Questionnaire-28 (GHQ-28)). Participants were 85 people with acute whiplash ( 0.09). None of the S-LANSS items could discriminate those with cold hyperalgesia (p = 0.06). A predominantly neuropathic pain component is related to a complex presentation of higher pain/disability and sensory hypersensitivity. The S-LANSS may be a useful tool and the BPPT a useful clinical test in the early assessment of whiplash. © 2008 Published by Elsevier Ltd.
Keywords: Whiplash; Neuropathic pain; Sensory hypersensitivity; Acute pain
* Corresponding author. Centre of National Research on Disability and Rehabilitation Medicine (CONROD), The University of Queensland, Mayne Medical School, Herston Road, Herston, QLD 4066, Australia. Tel.: +61 7 3365 5344; fax: +61 7 3346 4603.
1356-689X/$ - see front matter © 2008 Published by Elsevier Ltd. doi:10.1016/j.math.2008.01.009

1. Introduction
Whiplash associated disorders (WADs) are heterogeneous and costly musculoskeletal conditions. A proportion (approx. 20–30%) of whiplash-injured people demonstrate a complex presentation manifested by higher levels of pain and disability, cold and mechanical hyperalgesia and sympathetic nervous system dysfunction (Sterling et al., 2003a; Kasch et al., 2005). Hypoaesthesia to mechanical and thermal stimulations is also present in the chronic stage of the condition (Chien et al., 2009) as well as spinal cord hyperexcitability identified using nociceptor withdrawal reflexes (Banic et al., 2004). The presence of such phenomena suggests the existence of a neuropathic pain condition in some individuals with whiplash. The definition of neuropathic pain is controversial and without consensus (Bennett, 2003). For the purpose of this study, we have utilised the broad definition of The International Association for the Study of Pain (IASP) – neuropathic pain as being
174
M. Sterling, A. Pedler / Manual Therapy 14 (2009) 173e179
caused by a lesion or dysfunction of the nervous system. Sensory hypersensitivity fits with the ‘dysfunction’ aspect of this definition (Bennett, 2003). More importantly it has been shown that the early presence of some neuropathic features is associated with poor functional recovery at both short and long term follow-ups (Kasch et al., 2005; Sterling et al., 2005, 2006). In particular cold and generalized mechanical hyperalgesia occur within a few weeks of injury, in those with eventual poor recovery and persist virtually unchanged to the chronic stage of the condition (Sterling et al., 2003a). Additionally patients with chronic WAD and the presence of both cold and mechanical hyperalgesia demonstrate recalcitrance to physical rehabilitation (Jull et al., 2007). The identification of neuropathic pain has direct implications for treatment where it has been frequently argued that treatments should be directed toward particular pain mechanisms (Gallagher, 2006). In the case of the whiplash sub-group showing early neuropathic features such an approach may have the capacity to reduce the transition to chronicity. In the acute phase following whiplash injury, patients are frequently assessed by musculoskeletal clinicians. At this crucial stage of the condition it is important that practitioners can identify those at risk of poor recovery. Initial high levels of pain and/or disability are the most consistent prognostic factors for whiplash (ScholtenPeeters et al., 2003) and can be easily measured using validated questionnaires (Stewart et al., 2007). However, the clinical determination of sensory hypersensitivity is more difficult, time consuming and usually not evaluated by clinicians. Whilst mechanical hyperalgesia can be measured in the clinic using a commercial pressure algometer (Ylinen, 2007), there are no clinical devices available to quantify cold pain threshold. 
In recent times various screening tools have been developed to identify neuropathic pain (Bennett et al., 2007). Whilst most have been used in the investigation of more easily recognized neuropathic pain conditions such as diabetic neuropathies, some have identified neuropathic pain in musculoskeletal conditions including low back pain (Freynhagen et al., 2006). Whilst most of the tools require a physical assessment component, the Self-reported Leeds Assessment of Neuropathic Signs and Symptoms’ scale (S-LANSS) is particularly attractive for use in primary care as it is a self-report tool only (Bennett et al., 2005). The usefulness of such tools in the evaluation of whiplash is not known. The aims of this study were: (1) to evaluate the presence of a neuropathic pain component in an acute whiplash cohort, (2) to investigate relationships between S-LANSS scores and pain and disability, psychological distress and sensory features of acute whiplash and (3) to determine relationships of S-LANSS items and cold pain threshold in acute whiplash.
2. Methods
2.1. Participants
Eighty-five individuals (54 females; mean ± standard deviation (SD) age: 36.27 ± 12.69 years; mean ± SD symptom duration: 2.6 ± 1.2 weeks) reporting neck pain as a result of a motor vehicle crash participated in the study. The whiplash subjects were recruited via hospital accident and emergency departments, primary care practices and advertisement. They were eligible if they met the Quebec Task Force Classification of WAD I, II or III (Spitzer et al., 1995). Subjects were excluded if they were WAD IV, if they experienced concussion, loss of consciousness or head injury as a result of the accident, or if they reported a previous history of whiplash, neck pain or headaches that required treatment. Ethical clearance was gained from the Medical Research Ethics Committee of the institution involved.
2.2. Questionnaires
2.2.1. Neck Disability Index (NDI)
The NDI consists of 10 items addressing functional activities such as personal care, lifting, reading, work, driving, sleeping and recreational activities as well as pain intensity, concentration and headache (Vernon and Mior, 1991). There are six potential responses for each item ranging from no disability (0) to total disability (5). The overall score (out of 100) is calculated by totaling the responses of each individual item and multiplying by two. A higher score indicates greater pain and disability (Vernon and Mior, 1991). The NDI is a valid, reliable and responsive measure of neck pain and disability (Pietrobon et al., 2002) and has been frequently used in research of whiplash (Sterling et al., 2006; Stewart et al., 2007).
2.2.2. S-LANSS
The S-LANSS is a validated self-report version of the Leeds Assessment of Neuropathic Symptoms and Signs pain scale (Bennett et al., 2005). It consists of seven items and includes two self-examination items. A score of 12 or greater identifies patients with pain of a predominantly neuropathic nature (Bennett et al., 2007) (see Appendix 1).
2.2.3. General Health Questionnaire-28 (GHQ-28)
The GHQ-28 is a 28-item measure of emotional distress in medical settings (Goldberg, 1978) which is divided into four sub-scales: somatic symptoms, anxiety/insomnia, social dysfunction, and severe depression. The total score can be used as a measure of psychological distress. The GHQ-28 has been used in previous research of whiplash (Gargan et al., 1997; Sterling et al., 2003b).
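The scoring rules for the NDI and the S-LANSS can be sketched in code. The following is a minimal illustration rather than the authors' scoring procedure: it assumes each NDI item is scored 0 to 5 (six response options), and it takes the S-LANSS item weights from Appendix 1 and the cut-off of 12 from Section 2.2.2. All function and constant names are hypothetical.

```python
# Illustrative scoring helpers; all names here are hypothetical.

# Points awarded for a positive response to S-LANSS items 1-7 (Appendix 1).
SLANSS_WEIGHTS = (5, 5, 3, 2, 1, 5, 3)
# A total of 12 or greater indicates pain of a predominantly neuropathic nature.
SLANSS_CUTOFF = 12


def ndi_score(responses):
    """Return the NDI score out of 100 from ten item responses (assumed 0-5 each)."""
    if len(responses) != 10:
        raise ValueError("the NDI has 10 items")
    if any(r not in range(6) for r in responses):
        raise ValueError("each NDI item is assumed to be scored 0-5")
    # Sum the item responses (maximum 50) and multiply by two.
    return 2 * sum(responses)


def slanss_score(positive):
    """Return the S-LANSS total (0-24) from seven yes/no item responses."""
    if len(positive) != len(SLANSS_WEIGHTS):
        raise ValueError("the S-LANSS has 7 items")
    return sum(w for w, yes in zip(SLANSS_WEIGHTS, positive) if yes)


def is_predominantly_neuropathic(score):
    """Apply the cut-off of 12 or greater to an S-LANSS total."""
    return score >= SLANSS_CUTOFF
```

For example, a respondent answering yes only to items 1, 3 and 6 would score 5 + 3 + 5 = 13 and would fall in the predominantly neuropathic group under this cut-off.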
2.3. Quantitative sensory tests 2.3.1. Pressure pain thresholds (PPTs) PPTs were measured using a pressure algometer with a probe size of 1 cm2 and application rate of 40 kPa/s (Somedic AB, Farsta, Sweden). PPTs were measured bilaterally over the spinous processes of C2 and C5; over the median nerve trunk at the anterior elbow and at a remote site (tibialis anterior). These sites have been previously used in investigation of WAD (Sterling et al., 2003a). Triplicate recordings were taken at each site and the mean values used for analysis. 2.3.2. Cold pain thresholds Cold pain thresholds were measured bilaterally over the mid to lower cervical spine using the Thermotest system (Somedic AB, Farsta, Sweden) (Sterling et al., 2003a). Triplicate recordings were taken at each site and the mean values used for analysis. 2.3.3. Brachial plexus provocation test (BPPT) The BPPT was performed as described previously (Sterling et al., 2003a). The range of elbow extension was measured at the subjects’ pain threshold using a standard goniometer. If the subject did not experience pain, the test was continued until end of available range. At the completion of this test, the subjects were asked to record their pain perceived during the test on a 10 cm visual analogue scale (VAS). 2.4. Procedure Participants first completed all questionnaires. Quantitative sensory testing was then performed in the following order PPT, cold pain threshold and BPPT. The examiner performing these tests was blind to participant responses on the questionnaires. For all tests no verbal feedback was given to participants on their performance. PPT was performed in the following order C2, C5, median nerve, and tibialis anterior. For PPT, cold pain threshold and BPPT, testing was performed on the left side first. 2.5. Data analysis SPSS 14.0 for Windows was used for all analyses. 
Paired t-tests indicated no difference between sides (p > 0.05) for PPT, cold pain threshold or responses (elbow extension and VAS pain scores) to the BPPT, so the mean of left and right sides was used in further analysis. The participants were classified into two groups based on S-LANSS scores: (1) pain of predominantly neuropathic nature, defined as a score of 12 or greater on the S-LANSS (Bennett et al., 2007; Smith et al., 2007) and (2) non-neuropathic pain defined as a S-LANSS score of 0.28), pain (VAS) reported with the BPPT (p = 0.48) or GHQ-28 scores (p = 0.09). Results of ROC analysis indicated that none of the S-LANSS items could significantly discriminate between the group with cold hyperalgesia and those without (all p > 0.06). Areas under the curve ranged from 0.453 (Item 6) to 0.653 (Item 1).

Table 1 Response frequency to S-LANSS items in the acute whiplash group with a predominantly neuropathic pain component, S-LANSS ≥ 12.
| S-LANSS item | % Positive S-LANSS score |
|---|---|
| Item 1 (dysesthesia) | 73.7 |
| Item 2 (autonomic) | 36.8 |
| Item 3 (evoked pain) | 84.2 |
| Item 4 (paroxysmal) | 52.6 |
| Item 5 (thermal) | 63.2 |
| Item 6 (allodynia) | 84.2 |
| Item 7 (tender/numb) | 89.5 |
n = 29/85, 34% of cohort.

Table 2 Mean (SD) values for sensory and questionnaire data for each group.
| Variable | Group 1: Pain with predominantly neuropathic component (S-LANSS ≥ 12), N = 29 | Group 2: non-neuropathic pain (S-LANSS < 12), N = 56 | p-Value |
|---|---|---|---|
| Pain and disability (NDI) | 42.97 | 27.1 | 0.005 |
| Cold pain threshold (°C) | 16.38 | 12.3 | 0.036 |
| PPT – C2 (kPa) | 144.29 | 207.21 | 0.047 |
| PPT – C5 (kPa) | 146.8 | 242.68 | 0.003 |
| PPT – median nerve (kPa) | 222.23 | 230.8 | 0.79 |
| PPT – tibialis anterior (kPa) | 401.8 | 483.4 | 0.22 |
| BPPT – elbow extension (from 180°) | 56.5 | 35.3 | 0.003 |
| BPPT – VAS (/10) | 1.7 | 1.1 | 0.82 |
| GHQ-28 | 39.2 | 31.5 | 0.09 |
(19.5) (6.2) (101.7) (83.7) (95) (183.3) (28) (3.5) (15.6)
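To make the reported areas under the curve easier to interpret, the sketch below computes the area under the ROC curve for a single yes/no questionnaire item against a binary criterion such as cold hyperalgesia. It is an illustration only: the study used SPSS, the function name is hypothetical, and the example data are invented rather than taken from the cohort. The area is computed as the proportion of (criterion-positive, criterion-negative) pairs in which the positive case has the higher item score, with ties counted as one half, so a value of 0.5 corresponds to no discrimination.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    scores: numeric predictor values, e.g. 1/0 for a positive/negative item response.
    labels: True where the criterion (e.g. cold hyperalgesia) is present.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("both criterion groups must be non-empty")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented example data: item responses and criterion status for eight people.
item_positive = [1, 1, 0, 1, 0, 0, 1, 0]
cold_hyperalgesia = [True, True, True, False, False, False, False, True]
print(roc_auc(item_positive, cold_hyperalgesia))
```

Because a single binary item can only split the sample in two, its area under the curve is bounded by how unevenly positive responses are distributed between the two criterion groups.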
4. Discussion Whiplash is a heterogeneous condition with some individuals displaying a more complex clinical presentation that includes moderate to high levels of pain and disability, generalized hyperalgesia and hyperexcitable motor responses suggestive of a neuropathic pain condition (Moog et al., 2002; Banic et al., 2004). The results of this study support this proposal with 34% of an acute whiplash cohort demonstrating a predominantly neuropathic nature to their pain as assessed by the S-LANSS instrument. There is much current debate as to whether or not conditions such as whiplash, that demonstrate neuropathic type features but with no obvious injury to the nervous system, do in fact represent neuropathic pain (Fishbain et al., 2008). Recent investigation has shown that chronic low back pain (Freynhagen et al., 2006), fibromyalgia and complex regional pain syndrome 1 (Fishbain et al., 2008) may have a neuropathic component. Our data indicate that this may also be the case for acute whiplash. It has been argued that painful conditions should not be classified into two mutually exclusive groups, that is, either nociceptive or neuropathic in origin (Attal and Bouhissera, 2004; Bennett et al., 2006). These authors advocate a more flexible model of classification where the aim is to identify pain of predominantly neuropathic origin rather than an all or nothing phenomenon (Bennett et al., 2006) and the S-LANSS instrument was developed along these lines (Bennett et al., 2005). Therefore whilst we cannot say that 34% of our cohort have definitive neuropathic pain, our findings indicate that a significant proportion of individuals with acute whiplash injury demonstrate pain that is predominantly neuropathic in nature. The whiplash group with S-LANSS scores of 12 or greater demonstrated a clinical presentation that would support a neuropathic pain model. 
(17.6) (6.0) (104.4) (110) (110.4) (235.17) (19) (2.8) (14.6)
This group showed NDI scores indicating moderate to severe pain and disability, cold hyperalgesia, local mechanical hyperalgesia and heightened responses to the BPPT. Cold and mechanical hyperalgesia are common features of neuropathic pain (Bennett, 2006; Wasner et al., in press) and we have previously argued that generalized and heightened responses to the BPPT are likely to be an indication of central nervous system hyperexcitability (Sterling et al., 2002). Interestingly, there was no difference in PPTs at the upper or lower limb sites between the two whiplash groups. This may be considered unusual since widespread mechanical hyperalgesia is also considered to be a feature of neuropathic pain (Koelbaek-Johansen et al., 1999). However, we have recently shown that psychological factors such as distress and catastrophisation may play a role in sensory hypersensitivity at more distant sites (Sterling et al., 2008). As there was no difference in levels of distress between the two whiplash groups of this study, this may explain the lack of difference found in PPTs at distant sites. It also suggests that measurement of cold hyperalgesia, PPTs over the cervical spine and responses to the BPPT could provide a clearer clinical picture of possible neuropathic pain in whiplash. The findings of this study are relevant to clinical practice. Many individuals will consult a musculoskeletal clinician in the early acute stage post whiplash injury. It is clear that this is a crucial stage of the whiplash condition as it has been shown that there is limited recovery after two to three months post accident (Rebbeck et al., 2006). It is imperative that primary care practitioners consider the presence of adverse prognostic indicators in their assessment of patients with whiplash. Whilst some prognostic indicators (for example pain and disability levels, Scholten-Peeters et al., 2003) are relatively straightforward to measure in the clinic, others such as cold hyperalgesia (Sterling et al., 2006) are more difficult and require laboratory equipment to quantify.
For this reason we explored the ability of individual S-LANSS items and the total score to discriminate the group with cold hyperalgesia from those without, cold hyperalgesia being defined as 15 C (Bennett, 2006). None of the S-LANSS items or the total score discriminated the two groups and as such indicates that additional (physical) measures of cold hyperalgesia may be required for adequate assessment. In addition to its predictive capacity, cold hyperalgesia may also be an
indicator of non-responsiveness to physical rehabilitation, at least in chronic WAD (Jull et al., 2007). It is not known whether whiplash-injured patients identified as having pain of a predominantly neuropathic component may also show recalcitrance to standard interventions and this requires further investigation. There was no difference in psychological distress (GHQ-28 scores) between the predominantly neuropathic pain group and the non-neuropathic pain group. However, both whiplash groups were well above the threshold scores of 24/25 for the GHQ-28 (Goldberg, 1978), indicating that whiplash injury and its associated neck pain are distressing irrespective of symptom level. This would support previous findings where elevated levels of distress were found in the majority of an acute whiplash cohort but decreased in those who eventually recovered, closely paralleling decreasing pain and disability levels (Sterling et al., 2003b). Whether or not psychological distress decreases over time in the nonneuropathic group of our study remains to be seen. It has been argued that assessment for the presence of a neuropathic pain component should not only comprise questionnaires but physical examination is also essential (Hansson, 2007). Our findings of a lack of relationship between S-LANSS items and cold pain threshold would concur with this suggestion. The question for the assessment of whiplash is which physical examination tests should be included. At the present time, cold pain threshold is difficult to measure in the clinic but options may include the use of thermorollers set at predetermined temperatures (Jensen and Baron, 2003). Pressure algometry has been
suggested as a useful clinical tool but our results indicate that measurement of PPTs at distant sites may not provide information of neuropathic components to whiplash pain. Instead the BPPT may be a useful clinical tool, due to the differences in elbow extension (bilaterally) between the predominant neuropathic and non-neuropathic group of our study. Heightened bilateral limited elbow extension with the BPPT may provide indication of central hyperexcitability (Sterling et al., 2003a) and our results would support this proposal. It should be noted that in these studies, elbow extension was measured at pain threshold only and that if the BPPT is an indication of augmented central pain processing, then care will be required with its use in order to avoid potential symptom exacerbation.
5. Conclusion
A predominantly neuropathic component to acute whiplash pain was present in 34% of this cohort and was associated with a more complex presentation of higher pain and disability levels, cold hyperalgesia, local cervical hyperalgesia and less bilateral elbow extension with the BPPT. The S-LANSS may be a useful tool to include in the early assessment of whiplash injury. However, there was no relationship between S-LANSS items and cold pain threshold, indicating that physical measures of sensory hypersensitivity may also need to be included in the assessment of acute whiplash.
Appendix 1
Leeds Assessment of Neuropathic Symptoms and Signs (S-LANSS)
Think about how your pain that you showed in the diagram has felt over the last week. Please tick the descriptions that best match your pain. These descriptions may, or may not, match your pain no matter how severe it feels.
1. In the area where you have pain, do you also have ‘pins and needles’, tingling or prickling sensations?
a. NO – I don’t get the sensations (0)
b. YES – I do get these sensations (5)
2. Does the painful area change colour (perhaps looks mottled or more red) when the pain is particularly bad?
a. NO – The pain does not affect the colour of my skin (0)
b. YES – I have noticed that the pain does make my skin different from normal (5)
3. Does your pain make the affected skin abnormally sensitive to touch? Getting unpleasant sensations or pain when lightly stroking the skin might describe this.
a. NO – The pain does not make my skin in that area abnormally sensitive to touch (0)
b. YES – My skin in that area is particularly sensitive to touch (3)
4. Does your pain come on suddenly and in bursts for no apparent reason when you are completely still? Words like ‘electric shocks’, jumping and bursting might describe this.
a. NO – My pain doesn’t really feel like this (0)
b. YES – I get these sensations often (2)
5. In the area where you have pain, does your skin feel unusually hot like a burning pain?
a. NO – I don’t have burning pain (0)
b. YES – I get these sensations often (1)
6. Gently rub the painful area with your index finger and then rub a non-painful area (for example, an area of skin further away or on the opposite side from the painful area). How does this rubbing feel in the painful area?
a. The pain area feels no different from the non-painful area. (0)
b. I feel discomfort, like pins and needles, tingling or burning in the painful area that is different from the non-painful area. (5)
7. Gently press on the painful area with your finger then gently press in the same way onto a non-painful area (the same non-painful area that you chose in the last question). How does this feel in the painful area?
a. The pain area feels no different from the non-painful area. (0)
b. I feel numbness or tenderness in the painful area that is different from the non-painful area. (3)
References
Attal N, Bouhissera D. Can pain be more or less neuropathic? Pain 2004:110.
Banic B, Petersen-Felix S, Andersen O, Radanov B, Villiger P, Arendt-Nielsen L, et al. Evidence for spinal cord hypersensitivity in chronic pain after whiplash injury and in fibromyalgia. Pain 2004;107:7–15.
Bennett G. Neuropathic pain: a crisis of definition. Anesthesia and Analgesia 2003;97:619.
Bennett G. Can we distinguish between inflammatory and neuropathic pain? Pain Research and Management 2006;11:11–5.
Bennett M, Attal N, Backonja M, Baron R, Bouhassira D, Freynhagen R, et al. Using screening tools to identify neuropathic pain. Pain 2007;127:199–203.
Bennett M, Smith B, Torrance N, Lee A. Can pain be more or less neuropathic? Comparison of symptom assessment tools with ratings of certainty by clinicians. Pain 2006;122:289–94.
Bennett M, Smith B, Torrance N, Potter J. The S-LANSS score for identifying pain of predominantly neuropathic origin: validation for use in clinical and postal research. The Journal of Pain 2005;6:149–58.
Chien A, Eliav E, Sterling M. Hypoaesthesia occurs with sensory hypersensitivity in chronic whiplash: indication of a minor peripheral neuropathy? Manual Therapy 2009;14:137–45.
Fishbain D, Lewis J, Cutler R, Cole B, Rosomoff H, Rosomoff R. Can the neuropathic Pain Scale discriminate between non-neuropathic and neuropathic pain. Pain Medicine 2008;9:149–60.
Freynhagen R, Baron R, Gockel U, Tolle T. painDETECT: a new screening questionnaire to identify neuropathic components in patients with low back pain. Current Medical Research and Opinion 2006;22:1911–20.
Gallagher R. Management of neuropathic pain. Clinical Journal of Pain 2006;22:S2–8.
Gargan M, Bannister G, Main C, Hollis S. The behavioural response to whiplash injury. The Journal of Bone and Joint Surgery 1997;79-B:523–6.
Goldberg D. Manual of the general health questionnaire. Windsor: NFER-Nelson; 1978.
Hansson P. Diagnostic work up of neuropathic pain: computing, using questionnaires or examining the patient? European Journal of Pain 2007;11:367–9.
Jensen T, Baron R. Translation of symptoms and signs into mechanisms in neuropathic pain. Pain 2003;102:1–8.
Jull G, Sterling M, Kenardy J, Beller E. Does the presence of sensory hypersensitivity influence outcomes of physical rehabilitation for chronic whiplash? – A preliminary RCT. Pain 2007;129:28–34.
Kasch H, Qerama E, Bach F, Jensen T. Reduced cold pressor pain tolerance in non-recovered whiplash patients: a 1 year prospective study. European Journal of Pain 2005;9:561–9.
Koelbaek-Johansen M, Graven-Nielsen T, Schou-Olesen A, Arendt-Nielsen L. Muscular hyperalgesia and referred pain in chronic whiplash syndrome. Pain 1999;83:229–34.
Moog M, Quintner J, Hall T, Zusman M. The late whiplash syndrome: a psychophysical study. European Journal of Pain 2002;6:283–94.
Pietrobon R, Coeytaux R, Carey T, Richardson W, De Vellis R. Standard scales for measurement of functional outcome for cervical pain or dysfunction: a systematic review. Spine 2002;27:515–22.
Rebbeck T, Sindhausen D, Cameron I. A prospective cohort study of health outcomes following whiplash associated disorders in an Australian population. Injury Prevention 2006;12:86–93.
Scholten-Peeters G, Verhagen A, Bekkering G, van der Windt D, Barnsley L, Oostendorp R, et al. Prognostic factors of whiplash associated disorders: a systematic review of prospective cohort studies. Pain 2003;104:303–22.
Smith B, Torrance N, Bennett M, Lee A. Health and quality of life associated with chronic pain of predominantly neuropathic origin in the community. Clinical Journal of Pain 2007;23:143–9.
Spitzer W, Skovron M, Salmi L, Cassidy J, Duranceau J, Suissa S, et al. Scientific monograph of Quebec task force on whiplash associated disorders: redefining “Whiplash” and its management. Spine 1995;20:1–73.
Sterling M, Jull G, Kenardy J. Physical and psychological predictors of outcome following whiplash injury maintain predictive capacity at long term follow-up. Pain 2006;122:102–8.
Sterling M, Jull G, Vicenzino B, Kenardy J. Sensory hypersensitivity occurs soon after whiplash injury and is associated with poor recovery. Pain 2003a;104:509–17.
Sterling M, Kenardy J, Jull G, Vicenzino B. The development of psychological changes following whiplash injury. Pain 2003b;106:481–9.
Sterling M, Jull G, Vicenzino B, Kenardy J, Darnell R. Physical and psychological factors predict outcome following whiplash injury. Pain 2005;114:141–8.
Sterling M, Pettiford C, Hodkinson E, Curatolo M. Psychological factors are related to some sensory pain thresholds but not nociceptive flexion reflex threshold in chronic whiplash. Clinical Journal of Pain 2008;24:124–30.
Sterling M, Treleaven J, Jull G. Responses to a clinical test of mechanical provocation of nerve tissue in whiplash associated disorders. Manual Therapy 2002;7:89–94.
Stewart M, Maher C, Refshauge K, Bogduk N, Nicholas M. Responsiveness of pain and disability measures for chronic whiplash. Spine 2007;32:580–5.
Vernon H, Mior S. The neck disability index: a study of reliability and validity. Journal of Manipulative and Physiological Therapeutics 1991;14:409–15.
Wasner G, Naleschinski D, Binder A, Schattschneider J, McLachlan E, Baron R. The effect of menthol on cold allodynia in patients with neuropathic pain. Pain Medicine; in press. doi:10.1111/j.1526-4637.2007.00290.x.
Ylinen J. Clinimetrics: pressure algometry. The Australian Journal of Physiotherapy 2007;53:207.
Manual Therapy 14 (2009) 180–188 www.elsevier.com/math
Original Article
Effect of motor control and strengthening exercises on shoulder function in persons with impingement syndrome: A single-subject study design*
Jean-Sébastien Roy a,*, Hélène Moffet a,b, Luc J. Hébert c,d, Richard Lirette e
a Centre for Interdisciplinary Research in Rehabilitation and Social Integration, Canada
b Department of Rehabilitation, Faculty of Medicine, Laval University, Canada
c Department of Radiology, Faculty of Medicine, Laval University, Canada
d National Defence of Canada, Canada
e Club Entrain Medical Center, Canada
Received 4 July 2007; received in revised form 14 January 2008; accepted 21 January 2008
Abstract The aim of the study was to evaluate the effect of an intervention including shoulder control and strengthening exercises on function in persons with shoulder impingement. Eight subjects with shoulder impingement were evaluated weekly during the nine weeks of this single-subject design study. The study was divided into three phases (A1–B–A2) and involved repeated measures of shoulder pain and function (Shoulder Pain And Disability Index (SPADI) questionnaire), painful arc of motion, peak torque and 3-dimensional scapular attitudes. During the intervention phase, each subject participated in 12 exercise sessions supervised by a physiotherapist. Measures taken during the intervention and post-intervention phases were compared to pre-intervention values. All subjects showed significant improvement in the SPADI at the end of the study. A disappearance of a painful arc of motion in flexion and abduction (n = 6), an increase in isometric peak torque in lateral rotation (n = 3) and abduction (n = 2), and changes in the scapular kinematics, mainly in the sagittal plane, were also observed. The present results provide preliminary evidence to support the use of shoulder control exercises to reduce pain and improve function of persons with shoulder impingement. © 2008 Elsevier Ltd. All rights reserved. Keywords: Rehabilitation; Kinematics; Exercise
* Institution to which the work should be attributed is Centre for Interdisciplinary Research in Rehabilitation and Social Integration, Quebec Rehabilitation Institute, 525, Boulevard Hamel, Quebec City (QC), Canada G1M 2S8.
* Corresponding author. Centre interdisciplinaire de recherche en réadaptation et en intégration sociale, Institut de réadaptation en déficience physique de Québec, Local H-1602, 525, Boulevard Wilfrid-Hamel, Québec (QC), Canada G1M 2S8. Tel.: +1 418 529 9141x6559; fax: +1 418 529 3548. E-mail address: [email protected] (J.-S. Roy).
1356-689X/$ - see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.math.2008.01.010
J.-S. Roy et al. / Manual Therapy 14 (2009) 180–188
1. Introduction
More than a third of painful shoulder diagnoses are related to disorders of the rotator cuff that are often associated with a clinical entity called shoulder impingement syndrome (SIS) (Matsen and Arntz, 1990). SIS has been described as a repeated mechanical compression of the subacromial structures under the coracoacromial arch during arm elevation (Matsen and Arntz, 1990). In a systematic review, Michener et al. (2004) concluded from limited evidence that exercises and joint mobilization are efficacious for people with SIS. Many other studies have also reported a positive effect of exercises,
such as strengthening, stretching, and motor control exercises, on shoulder function (Brox et al., 1993; Bang and Deyle, 2000; Ludewig and Borstad, 2003; McClure et al., 2004; Walther et al., 2004; Ginn and Cohen, 2005). However, the duration of the proposed exercise programs (three weeks to six months), as well as their intensity and the level of subject supervision, diverged widely across studies. Several studies have identified impairments associated with SIS. They have reported that people with SIS present weakness of scapulohumeral muscles (Warner et al., 1990; Leroux et al., 1994) and improper control of the glenohumeral (G/H) and scapulothoracic (S/T) movements during arm elevation. Improper control is characterized by changes in muscle activation levels. More specifically, lower activity of the serratus anterior, higher activity of the upper and lower trapezius (Ludewig and Cook, 2000), and lack of coordination between the different parts of the trapezius have been observed (Wadsworth and Bullock-Saxton, 1997; Cools et al., 2003). This inadequate muscle control is believed to contribute to a reduction of amplitude in posterior tilting and lateral rotation of the scapula during arm elevation (Ludewig and Cook, 2000; Borstad and Ludewig, 2002). Lower activity of the infraspinatus and subscapularis (Reddy et al., 2000) as well as inadequate coactivation of the scapulohumeral muscles (Myers et al., 2003) have also been reported. This abnormal muscle control is most likely associated with a reduction of the subacromial space (Graichen et al., 1999; Hébert et al., 2003b) leading to impingement. The hypothesis that can be derived from these studies is that improper control of G/H and S/T joint movements and strength deficits in the scapulohumeral and scapulothoracic muscles seem to be partly responsible for the level of shoulder disability in patients with SIS.
Hence, this highlights the importance of assessing a comprehensive rehabilitation program that combines two types of exercises used in rehabilitation for SIS: supervised motor control exercises to correct the abnormal G/H and S/T movements and strengthening exercises. As a first step to assess the potential benefit of such a program, the individual responses to this type of rehabilitation intervention must be evaluated. The aim of this study was to evaluate, using a single-subject design, the effects of a 4-week supervised rehabilitation intervention based on a combination of shoulder control and strengthening exercises on shoulder function in persons with SIS.
2. Methods
2.1. Subject selection
Eight subjects with unilateral SIS, diagnosed by an orthopaedic surgeon, were recruited (Table 1). The subjects were included if they had at least one positive finding in each of these categories (Hébert et al., 2003b): (1) painful arc of movement during flexion or abduction, (2) positive Neer or Kennedy–Hawkins impingement signs, or (3) pain on resisted lateral rotation, abduction or Jobe test. Exclusion criteria were type III acromion, calcification or fracture; shoulder instability; previous shoulder surgery; and cervicobrachialgia or shoulder pain during neck movement. All subjects signed an informed consent form. This study was approved by the Ethics Committee of the Quebec Rehabilitation Institute.
2.2. Study design
An A1–B–A2 single-subject design was used (Backman et al., 1997). The study was divided into three phases over a 9-week period. Within the first two weeks, three evaluations of the outcome measures were performed (phase A1). During the following four weeks (phase B), each subject participated in 12 supervised exercise sessions and the immediate effect of the intervention was assessed at the end of each week. The last three weeks consisted of the post-intervention phase (phase A2) during which the short-term effects of the intervention were assessed once a week. The subjects were assessed and treated by the same physiotherapist.
2.3. Outcome measures
The main outcome was the pain and disability level, which was evaluated at the beginning of the study and each week thereafter using the Shoulder Pain And Disability Index (SPADI). The SPADI is a valid and reliable self-administered questionnaire (Roach et al., 1991). Higher scores indicate a greater level of pain and disability (0–100). Secondary outcomes were the presence of a painful arc of motion, assessed at the same time period as the SPADI, the isometric peak torque, the pain intensity during strength tests and the 3-dimensional scapular attitudes (3DSA), assessed at the beginning of the study and the end of A1, B and A2 phases. In a seated position, the presence of a painful arc of shoulder motion during flexion and abduction was evaluated.
If pain was present during one of the two trials performed in each plane of movement, the subject was considered to have a painful arc of motion in that plane. In a supine position, the maximal isometric strength of shoulder abductors (shoulder at 10° of abduction; elbow at 0°) and lateral rotators (shoulder at 0° of abduction; elbow at 90°) was assessed with a dynamometer (Chatillon CSD 300, Greensboro, NC). The mean torque (n = 2) in Newton-meters was calculated. The intensity of pain during these tests was measured with a visual analogue scale (VAS). The VAS scores for each muscle
J.-S. Roy et al. / Manual Therapy 14 (2009) 180–188
Table 1 Subjects’ characteristics at the initial evaluation.
| Subject | Age (years) | Gender | Dominant side | Impaired side | Weight (kg) | Height (m) | Duration (months)b | Mean SPADIc |
| S1 | 53 | F | Right | Right | 57 | 1.61 | 12 | 69.6 |
| S2 | 40 | M | Left | Right | 96 | 1.80 | 26 | 27.0 |
| S3 | 32 | F | Right | Right | 85 | 1.74 | 3 | 28.1 |
| S4 | 49 | F | Right | Left | 93 | 1.51 | 16 | 67.9 |
| S5 | 60 | F | Right | Right | 69 | 1.62 | 3 | 47.1 |
| S6 | 29 | F | Left | Right | 62 | 1.61 | 5 | 38.2 |
| S7 | 56 | F | Right | Left | 79 | 1.69 | 8 | 42.0 |
| S8 | 50 | F | Left | Left | 63 | 1.70 | 48 | 26.5 |
| Total | 46a (11) | 1 male, 7 female | 5 right, 3 left | 5 right, 3 left | 75.5a (14.9) | 1.66a (0.09) | 15.1a (15.4) | 43.3a (17.4) |
a Mean (1 standard deviation).
b Time between the appearance of the symptoms and the initial evaluation.
c Mean of the three SPADI scores during phase A1.
group were averaged (n = 2) to calculate the final outcome (0–100). The 3DSA were calculated at two shoulder positions, 90° of abduction and 70° of flexion, with the Optotrak Probing System (Northern Digital Inc., Waterloo, Ontario, Canada) (Hébert et al., 2000; Roy et al., 2007). These positions were chosen because it has been shown that a reduced posterior tilting at those two positions along with five other variables could explain 91% of the variance of the pain and disability level experienced by subjects with SIS (Hébert et al., 2003a). Two trials were recorded at each position and the mean (n = 2) was used for the analysis. For each trial, six body landmarks were digitized: three on the scapula (acromial angle, inferior angle, root of the spine), and three on the trunk (C7 spinous process, right and left posterosuperior iliac spines). The position of the scapula was calculated relative to the trunk. The three scapular rotations used to describe the 3DSA were lateral/medial rotation, anterior/posterior tilting, and protraction/retraction (Fig. 1). The coordinate system and Euler angle sequence of rotations were defined in accordance with ISB recommendations (Wu et al., 2005).
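The step of expressing the scapula relative to the trunk and reading off three rotations can be sketched as follows. This is a minimal illustration, not the authors' software. It assumes that 3 x 3 rotation matrices for the trunk and scapula frames have already been built from the digitized landmarks, with the frame axes as columns in laboratory coordinates. All function names are hypothetical, and the mapping of each angle to a clinical rotation (and its sign) should be taken from the ISB recommendations rather than from this sketch.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    """Transpose a 3x3 matrix (the inverse, for a rotation matrix)."""
    return [[a[j][i] for j in range(3)] for i in range(3)]


def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def scapula_relative_to_trunk(r_trunk, r_scapula):
    """Orientation of the scapula frame expressed in the trunk frame."""
    return matmul(transpose(r_trunk), r_scapula)


def euler_yxz(r):
    """Decompose r = Ry(a) * Rx(b) * Rz(c); return (a, b, c) in degrees.

    Assumes the middle angle b is away from +/-90 degrees (no gimbal lock).
    """
    a = math.atan2(r[0][2], r[2][2])
    b = math.asin(max(-1.0, min(1.0, -r[1][2])))
    c = math.atan2(r[1][0], r[1][1])
    return tuple(math.degrees(v) for v in (a, b, c))


if __name__ == "__main__":
    # Hypothetical frames: a rotated trunk and a scapula offset from it.
    trunk = rot_z(math.radians(25.0))
    offset = matmul(matmul(rot_y(math.radians(20.0)),
                           rot_x(math.radians(-10.0))),
                    rot_z(math.radians(15.0)))
    scapula = matmul(trunk, offset)
    print(euler_yxz(scapula_relative_to_trunk(trunk, scapula)))
```

Because the scapula is expressed in the trunk frame before decomposition, the recovered angles do not depend on how the subject's trunk is oriented in the laboratory, which is the point of digitizing trunk landmarks at each trial.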
Fig. 1. Representation of the scapular rotations around the Y, X and Z axes. The scapular rotations are defined in accordance with the ISB recommendations. The sequence of rotations used is YsXsZs.
2.4. Phase A1: pre-intervention At the first evaluation visit, the outcomes were all evaluated and the participants were taught a standardized home exercise program. This program, performed daily, comprised submaximal isometric contraction exercises in abduction and lateral and medial rotations against a wall. This program was prescribed for ethical reasons since it was not possible to leave the subjects without any intervention for two weeks. Participants were evaluated at the end of each week during this 2-week phase. At the last evaluation, a standardized physical examination was performed (shoulder range of motion [ROM], evaluation of scapular movements during arm elevation). The results of this examination were used to determine the intensity of the exercises performed in phase B.
2.5. Phase B: intervention Before developing the intervention program, a review of the literature on SIS and motor learning principles was conducted and a focus group of physiotherapists was held. The aims of the intervention were then determined: first, to promote proper scapular kinematics during arm elevation against gravity and, second, to strengthen the scapulohumeral and scapulothoracic muscles with an external resistance. The decision to introduce strengthening exercises with an external resistance only once proper shoulder control had been observed was taken to ensure a gradual loading of the muscle-tendon-bone units without any setback in the pain level. As a result, more emphasis was put on shoulder control during the intervention. The subjects participated in three exercise sessions per week. Exercises of increasing difficulty in terms of movement plane, ROM, number of repetitions, speed and resistance were performed. Two indicators were used to determine the level of difficulty of the exercises: quality of shoulder motion and perceived intensity of pain. The intervention started with shoulder control exercises during arm elevation in the frontal, sagittal and scapular planes. These exercises were progressed following a 6-phase retraining program and began under the close supervision of the physiotherapist, who directed the retraining with feedback (Table 2). The retraining phases were graded according to: (1) the level of resistance applied on the shoulder during arm elevation (no resistance/passive movement; active assisted; active with or without external resistance); and (2) the use or non-use of feedback during the movement. The phases ranged from no resistance with feedback to active movement with external resistance without feedback. In each retraining phase, the ROM was gradually increased as shoulder control improved until proper control was achieved for the full ROM in each vertical plane. When the subject was able to perform a series of 10 repetitions with proper control, series were added to reach three. Then, the subject moved up to the next phase. At the end of each session, exercises in diagonal planes were performed. Subjects had to touch targets in a determined sequence, which took into account the maximum ROM they were able to reach in each vertical plane. Once abduction up to a range of 90° was properly controlled, humeral lateral rotation at 90° of abduction was performed. When proper control was achieved with supervision, the exercise was practiced alone as a home exercise. The criterion to start strengthening exercises was to be able to perform pain-free arm elevations with a resistance of 0.45 kg. Humeral medial and lateral rotation at 0° of shoulder abduction using Thera-Bands (red to blue level), push-ups with a progression from vertical wall to
standard horizontal push-ups, and horizontal arm abduction in supine performed with a dumbbell (starting with 0.45 kg) were the exercises performed. The number of repetitions was increased from one to three series of 10. When three series were easily performed, resistance was progressively increased.
2.6. Phase A2: post-intervention At the end of phase B, an individualized home exercise program was given. The content of this program was determined according to the level of shoulder control and strength reached at the end of phase B and was reviewed at the two subsequent visits.
2.7. Data analysis Outcome values obtained in phases B and A2 were compared to the pre-intervention values using two standard deviations above and below the pre-intervention mean (A1-interval). For the outcomes measured on a weekly basis (SPADI; painful arc of motion), two consecutive SPADI scores outside the A1-interval or an absence of a painful arc of motion for two consecutive evaluations were necessary to conclude that a significant change had occurred in the corresponding B and A2 phases. For the outcomes measured less frequently (peak torque; pain during peak torque), one measurement outside the A1-interval was necessary to conclude that a significant change had occurred in phases B and A2. Finally, the differences between 3DSA of phase A1 and 3DSA of phases B and A2 were calculated and illustrated graphically to describe the direction of changes during the study.
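The two-standard-deviation rule described in the data analysis can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the example scores are hypothetical, the sample standard deviation (n - 1 denominator) is assumed because the text does not name the estimator, and a score on either side of the band is counted as "outside".

```python
from statistics import mean, stdev


def a1_interval(baseline_scores, n_sd=2.0):
    """Return (low, high): the baseline mean minus/plus n_sd standard deviations."""
    centre = mean(baseline_scores)
    # Assumption: sample standard deviation; the paper does not specify.
    spread = stdev(baseline_scores)
    return centre - n_sd * spread, centre + n_sd * spread


def first_sustained_change(scores, interval, run_length=2):
    """Index at which the first run of `run_length` consecutive scores
    outside the interval begins, or None if there is no such run."""
    low, high = interval
    streak = 0
    for index, score in enumerate(scores):
        if score < low or score > high:
            streak += 1
            if streak == run_length:
                return index - run_length + 1
        else:
            streak = 0  # a score back inside the band resets the run
    return None


if __name__ == "__main__":
    baseline = [45.0, 50.0, 55.0]      # hypothetical phase A1 SPADI scores
    weekly = [48.0, 38.0, 35.0, 30.0]  # hypothetical phase B SPADI scores
    band = a1_interval(baseline)
    print(band, first_sustained_change(weekly, band))
```

For the outcomes measured less frequently, the same helper can be called with `run_length=1`, which matches the single-measurement criterion given for peak torque and pain during peak torque.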
Table 2 Phases for retraining of shoulder control and manual feedback given according to scapular dyskinesis.
| Phases | Step 1 | Step 2 | Step 3 | Step 4 |
| 1a | Passive elevation | Final position actively kept for 5 sec | Active return with manual feedback if needed | Verbal feedback |
| 2a | Active assisted elevationb | Final position actively kept for 5 sec | Active return with manual feedback if needed | Verbal feedback |
| 3a | Active elevation with manual feedback if needed | Final position actively kept for 5 sec | Active return with manual feedback if needed | Verbal feedback |
| 4a | Phase 3, but without manual feedback |
| 5 | Phase 4, but without visual feedback |
| 6 | Phase 5, but with the elevation performed faster, and then with a load |

| Types of dyskinesis | Description of the scapular dyskinesis | Manual feedback |
| 1 | Decrease of the scapular lateral rotation | Guidance of lateral rotation with a lateral pressure on the inferior angle of the scapula |
| 2 | Tilt of the scapular inferior angle | Restriction of the tilt with an anterior pressure on the inferior angle of the scapula |
| 3 | Elevation of the superior border of the scapula | Restriction of the scapular elevation with an inferior pressure on the acromion |
| 4 | Tilt of the medial scapular border | Restriction of the tilt with an anterior pressure on the medial border of the scapula |
a In front of a mirror.
b Movement assisted by the physiotherapist to reduce the load on the shoulder.
3. Results Seven of the eight subjects showed significant improvement in the SPADI during phase B. For five subjects, the improvement started following the first intervention week, whereas for two subjects, the improvement started following the second week. All eight subjects showed significant improvement during phase A2 (Fig. 2). In flexion, one subject (subject 5 [S5]) did not experience a painful arc of motion, while the seven other subjects presented a painful arc of motion during phase A1. During phase B, only one subject (S8) presented significant improvement with disappearance of pain during flexion for two consecutive evaluations. In phase A2,
five subjects (S1, S2, S3, S4, S8) presented significant improvement. At the last evaluation, only one subject (S7) had a painful arc of motion. In abduction, all eight subjects presented a painful arc during phase A1. Two subjects showed significant improvement during phase B (S2, S5) and six during phase A2 (S2, S3, S4, S5, S7, S8). Two subjects (S6, S7) still presented a painful arc in abduction at the last evaluation. Significant increase in isometric abduction peak torque was seen at the end of phases B and A2 for one subject and at the end of phase B for two subjects (Fig. 3). In lateral rotation, a significant increase in peak torque was found in only one subject following phase B, and in three subjects following phase A2 (Fig. 3). Six of the eight subjects (S1, S3, S4, S5, S6 and S8) exhibited significant
Fig. 2. Profile of the SPADI scores. Profiles of the SPADI scores over the three phases of the study (pre-intervention [A1], intervention [B] and post-intervention [A2]). The grey band represents two standard deviations above and below the pre-intervention mean and the line in the middle of this band indicates the mean (n = 3) value during phase A1. The * indicates significant changes in the SPADI during phases B and A2.
Fig. 3. Isometric peak torque in abduction and lateral rotation. The * indicates significant changes in the peak torque during phases B and A2.
reduction of pain intensity during strength testing following phase A2 in abduction and lateral rotation. For the 3DSA in abduction, posterior tilting was increased for seven subjects following phase B and was still increased for five subjects at the end of phase A2; lateral rotation was increased for five subjects following phase B and for six subjects following phase A2; finally, protraction was increased for seven subjects following phases B and A2 (Fig. 4). In flexion, posterior tilting was increased for five subjects following phase B and for six subjects following phase A2; lateral rotation was increased for four subjects at the end of phases B and A2; finally, protraction was increased for four subjects following phases B and A2 (Fig. 5).
3.1. Compliance with the intervention All eight subjects participated in the 12 supervised sessions and performed both shoulder control and strengthening exercises (Table 3). The shoulder control exercises were progressed for all subjects from exercises in the vertical and diagonal planes, to exercises in lateral rotation at 90° of abduction. Strengthening exercises were begun between the third and seventh session with strengthening in medial and lateral rotations. Two subjects had to stop these exercises after three days because of an increased level of pain. Only S4 did not perform push-ups because of pain during their execution. Finally, four subjects performed the horizontal abduction
Fig. 4. Three-dimensional scapular attitude at 90° of abduction. Phases B (white diamonds) and A2 (black triangles) 3DSA differences established in comparison with the mean pre-intervention (A1) phase 3DSA.
strengthening exercise. The four other subjects did not perform horizontal abduction since they had pain during its execution with a dumbbell of 0.45 kg.
Fig. 5. Three-dimensional scapular attitude at 70° of flexion. Phases B (white diamonds) and A2 (black triangles) 3DSA differences established in comparison with the mean pre-intervention (A1) phase 3DSA.
4. Discussion The present results suggest that a rehabilitation program based on motor control and strengthening exercises is effective in reducing shoulder pain and promoting better function in persons with SIS. These improvements were accompanied, for most subjects, by a reduction in pain during maximal contractions and disappearance of the painful arc of motion. Interestingly, the improvement persisted after the end of the supervised intervention, suggesting that home exercises were sufficient to maintain or even enhance the benefits of the intervention. Our results support the findings of other studies that have shown the positive effects of rehabilitation in persons with SIS (Brox et al., 1999;
Bang and Deyle, 2000; Ludewig and Borstad, 2003; McClure et al., 2004; Walther et al., 2004; Ginn and Cohen, 2005). The main contribution of this study is to propose a 4-week exercise program, based mainly on motor control principles, that provides a fast improvement in shoulder pain and function. In comparison to previous studies in which exercises have been used to improve shoulder control in individuals with SIS, our results seem promising. Indeed, Conroy and Hayes (1998) reported no difference in pain following a supervised exercise program of similar duration (three weeks) but composed of other types of exercises (stretching and isometric strengthening). The addition of joint mobilization to their program led, however, to a better functional outcome. As in the present study, Ludewig and Borstad (2003) also observed a significant improvement in shoulder function following home exercises. However, their home exercise program lasted more than twice as long (10 weeks) as ours. Finally, improvement in shoulder function has also been demonstrated by Brox et al. (1999) following a much longer supervised exercise program of three to six months. The intervention proposed in this study includes shoulder control exercises targeting the specific impairments described in patients with SIS (Ludewig and Cook, 2000; Borstad and Ludewig, 2002). More specifically, the exercises were designed, in part, to promote larger amplitude of posterior tilting and lateral rotation of the scapula during arm elevation. Such changes in scapular rotations were not consistently found among subjects. Variability in the response to the intervention in a relatively small sample of subjects may explain this result. One can also argue that the measure used to quantify scapular rotations was not sensitive enough to capture changes that are relevant to function. When looking at individual data, changes of small magnitudes were observed following intervention for some subjects. 
They were mostly found in the sagittal plane with larger posterior tilting amplitude. It is known that posterior tilting elevates the anterior part of the acromion and that the acromiohumeral distance in people with SIS is decreased by only 1.2–1.3 mm around 90° and 110° of arm elevation (Hébert et al., 2003b). Therefore such small increases in posterior tilting could have resulted in less compression of the subacromial structures (Ludewig and Cook, 2000), which may have had an impact on overall shoulder pain and function. Only small changes were observed in the isometric peak torques following the intervention. In the present study, more emphasis was put on exercises promoting better shoulder control in the first weeks of the intervention. Strengthening exercises were only introduced when proper shoulder control was achieved. Once started, strengthening exercises were progressed in order to gradually load the muscle–tendon–bone units without any
Table 3 Description of the exercises performed during phase B. Subjects
S1 S2 S3 S4 S5 S6 S7 S8
Vertical planes
Diagonal planes
Lateral rotation at 90 of abduction
Medial and lateral rotation with Thera-Band
Push-ups
From
From
From
For
From
For
From
For
From
For
3 5 3 2 11 9 11 10
4 3 4 7 4 5 7 3
9 10 9 2b 9 8 3b 10
6 4 5 NP 6 8 8 4
7 4 8 NP 7 5 5 9
9 12 8 NP NP NP NP 7
4 1 5 NP NP NP NP 6
1 1 1 1 1 1 1 1
a
(4) (3) (4) (6) (4) (5) (4) (3)
For 12 12 12 12 12 12 12 12
2 2 4 4 2 3 2 1
(7) (4) (8)
a
(11) (8) (9) (8)
For 11 8 7 7 11 8 8 11
7 5 6 7 2 4 2 2
a
(7) (11) (7)
(7) (5)
Horizontal abduction in supine
Abbreviations: From, session where the exercise was first performed; For, the total number of sessions where the exercise was performed; NP, exercise not performed. a The number in brackets represents the session where the exercise was first performed with a dumbbell. b The exercise had to be stopped because of an increased level of pain.
setback in the pain level. In some subjects who experienced pain during strengthening exercises, these exercises had to be stopped or progressed more slowly than expected. One can hypothesize that tension or compression of the degenerated rotator cuff tendons may have been responsible for the increase in shoulder pain. Hence, the number of weeks during which they were performed was probably not large enough to bring about changes in shoulder strength. In comparison, McClure et al. (2004) observed significant gains in the isometric strength of the rotators and abductors of the shoulder following a 6-week program composed of more intense strengthening exercises. Undoubtedly, strengthening exercises help improve function in subjects with SIS. They should, however, be introduced at a proper stage during recovery to avoid pain recurrence and performed at a sufficient intensity to promote functional changes. Although all the subjects showed improvement in shoulder pain and function, they did not reach a normal level at the last evaluation. A longer follow-up evaluation could have provided more information on the long-term outcomes and guided us on the need for some subjects to have a longer duration of supervised intervention. The home exercises performed during the pre-intervention phase may have introduced an additional source of variability in the measurements, potentially leading to larger 95% confidence intervals and a reduction in our capacity to detect changes. The use of a single-subject design limits the generalizability of the results and, by performing repeated measurements of outcomes, bias may have been introduced. Finally, not having an independent evaluator may have reduced the strength of our conclusions. The use of a self-administered questionnaire as the primary outcome, as well as standardized measurement procedures and valid outcomes, enhances, however, the confidence in our results. 
This study has brought a deeper understanding of the mechanisms that led to the changes observed following
the proposed program. However, a randomised controlled trial is needed to confirm the present findings.
5. Conclusions Results of this study suggest that a 4-week program including motor control and strengthening exercises reduces shoulder pain and improves function in persons with SIS. To better understand how shoulder control is modified, further studies need to evaluate changes in muscle and interjoint coordination using electromyography and motion analysis systems. Nonetheless, this study provides preliminary evidence to support the use of shoulder control exercises to promote better function in people with SIS.
References
Backman CL, Harris SR, Chisholm JA, Monette AD. Single-subject research in rehabilitation: a review of studies using AB, withdrawal, multiple baseline, and alternating treatments designs. Archives of Physical Medicine and Rehabilitation 1997;78:1145–53.
Bang MD, Deyle GD. Comparison of supervised exercise with and without manual physical therapy for patients with shoulder impingement syndrome. Journal of Orthopaedic & Sports Physical Therapy 2000;30:126–37.
Borstad JD, Ludewig PM. Comparison of scapular kinematics between elevation and lowering of the arm in the scapular plane. Clinical Biomechanics (Bristol, Avon) 2002;17:650–9.
Brox JI, Gjengedal E, Uppheim G, Bohmer AS, Brevik JI, Ljunggren AE, et al. Arthroscopic surgery versus supervised exercises in patients with rotator cuff disease (stage II impingement syndrome): a prospective, randomized, controlled study in 125 patients with a 2 1/2-year follow-up. Journal of Shoulder and Elbow Surgery 1999;8:102–11.
Brox JI, Staff PH, Ljunggren AE, Brevik JI. Arthroscopic surgery compared with supervised exercises in patients with rotator cuff disease (stage II impingement syndrome). British Medical Journal 1993;307:899–903.
Conroy DE, Hayes KW. The effect of joint mobilization as a component of comprehensive treatment for primary shoulder impingement syndrome. Journal of Orthopaedic & Sports Physical Therapy 1998;28:3–14.
Cools AM, Witvrouw EE, Declercq GA, Danneels LA, Cambier DC. Scapular muscle recruitment patterns: trapezius muscle latency with and without impingement symptoms. American Journal of Sports Medicine 2003;31:542–9.
Ginn KA, Cohen ML. Exercise therapy for shoulder pain aimed at restoring neuromuscular control: a randomized comparative clinical trial. Journal of Rehabilitation Medicine 2005;37:115–22.
Graichen H, Bonel H, Stammberger T, Haubner M, Rohrer H, Englmeier KH, et al. Three-dimensional analysis of the width of the subacromial space in healthy subjects and patients with impingement syndrome. American Journal of Roentgenology 1999;172:1081–6.
Hébert LJ, Moffet H, McFadyen BJ, St-Vincent G. A method of measuring three-dimensional scapular attitudes using the Optotrak probing system. Clinical Biomechanics (Bristol, Avon) 2000;15:1–8.
Hébert LJ, Moffet H, Dionne CE, McFadyen BJ, Dufour M, Lirette R. Shoulder impingement syndrome: clinical indicators and short-term predictors of disability. Archives of Physical Medicine and Rehabilitation 2003a;84:A7.
Hébert LJ, Moffet H, Dufour M, Moisan C. Acromiohumeral distance in a seated position in persons with impingement syndrome. Journal of Magnetic Resonance Imaging 2003b;18:72–9.
Leroux JL, Codine P, Thomas E, Pocholle M, Mailhe D, Blotman F. Isokinetic evaluation of rotational strength in normal shoulders and shoulders with impingement syndrome. Clinical Orthopaedics and Related Research 1994:108–15.
Ludewig PM, Borstad JD. Effects of a home exercise programme on shoulder pain and functional status in construction workers. Occupational and Environmental Medicine 2003;60:841–9.
Ludewig PM, Cook TM. Alterations in shoulder kinematics and associated muscle activity in people with symptoms of shoulder impingement. Physical Therapy 2000;80:276–91.
Matsen FA, Arntz CT. Subacromial impingement. In: Rockwood CA, Matsen FA, editors. The Shoulder. 9th ed. Philadelphia: WB Saunders Co; 1990. p. 623–46.
McClure PW, Bialker J, Neff N, Williams G, Karduna A. Shoulder function and 3-dimensional kinematics in people with shoulder impingement syndrome before and after a 6-week exercise program. Physical Therapy 2004;84:832–48.
Michener LA, Walsworth MK, Burnet EN. Effectiveness of rehabilitation for patients with subacromial impingement syndrome: a systematic review. Journal of Hand Therapy 2004;17:152–64.
Myers JB, Hwang JH, Pasquale MR, Rodosky MW, Ju YY, Lephart SM. Shoulder muscle coactivation alterations in patients with subacromial impingement. Medicine & Science in Sports & Exercise 2003;35(5):S346.
Reddy AS, Mohr KJ, Pink MM, Jobe FW. Electromyographic analysis of the deltoid and rotator cuff muscles in persons with subacromial impingement. Journal of Shoulder and Elbow Surgery 2000;9:519–23.
Roach KE, Budiman-Mak E, Songsiridej N, Lertratanakul Y. Development of a shoulder pain and disability index. Arthritis Care & Research 1991;4:143–9.
Roy JS, Moffet H, Hébert LJ, St-Vincent G, McFadyen BJ. The reliability of three-dimensional scapular attitudes in healthy people and people with shoulder impingement syndrome. BMC Musculoskeletal Disorders 2007;8:49.
Wadsworth DJ, Bullock-Saxton JE. Recruitment patterns of the scapular rotator muscles in freestyle swimmers with subacromial impingement. International Journal of Sports Medicine 1997;18:618–24.
Walther M, Werner A, Stahlschmidt T, Woelfel R, Gohlke F. The subacromial impingement syndrome of the shoulder treated by conventional physiotherapy, self-training, and a shoulder brace: results of a prospective, randomized study. Journal of Shoulder and Elbow Surgery 2004;13:417–23.
Warner JJ, Micheli LJ, Arslanian LE, Kennedy J, Kennedy R. Patterns of flexibility, laxity, and strength in normal shoulders and shoulders with instability and impingement. American Journal of Sports Medicine 1990;18:366–75.
Wu G, van der Helm FC, Veeger HE, Makhsous M, van Roy P, Anglin C, et al. ISB recommendation on definitions of joint coordinate systems of various joints for the reporting of human joint motion – part II: shoulder, elbow, wrist and hand. Journal of Biomechanics 2005;38:981–92.
Available online at www.sciencedirect.com
Manual Therapy 14 (2009) 189–196 www.elsevier.com/math
Original Article
Physiotherapists’ use of advice and exercise for the management of chronic low back pain: A national survey
S. Dianne Liddle a,*, G. David Baxter b, Jacqueline H. Gracey a
a Health and Rehabilitation Sciences Research Institute, University of Ulster, Shore Road, Newtownabbey, Northern Ireland
b Centre for Physiotherapy Research, School of Physiotherapy, University of Otago, New Zealand
Received 31 January 2007; received in revised form 25 January 2008; accepted 30 January 2008
Abstract The objective of the study was to establish the specific use of advice and exercise by physiotherapists for the management of chronic low back pain (LBP). A questionnaire was mailed to a random sample of 600 members of the Irish Society of Chartered Physiotherapists. Open and closed questions were used to obtain information on treatments provided to chronic LBP patients. Respondents’ treatment goals were also investigated, along with the typical methods used to assess treatment outcome. Four hundred and nineteen of the sample returned the questionnaire; 280/419 (67%) indicated that they currently treated LBP, of whom 76% (n = 214) were senior grade therapists. Advice and exercise, respectively, were the treatments most frequently used for chronic LBP: advice was most commonly delivered as part of an exercise programme, with strengthening (including core stability) the most frequently used exercise type. Supervision of exercise and follow-up advice were underutilised with respect to the recommendations of relevant clinical guidelines. Pain relief was an important treatment goal. Emphasis on exercise programme supervision, incorporating reassurance that it is safe to stay active and that ‘hurt does not mean harm’, must be more effectively disseminated and promoted in practice. The influence of follow-up advice on exercise adherence warrants further investigation. © 2008 Elsevier Ltd. All rights reserved.
Keywords: Advice; Exercise; Chronic low back pain; Adherence
* Corresponding author. Room 1F114, Health and Rehabilitation Sciences Research Institute, University of Ulster, Shore Road, Newtownabbey, Co. Antrim BT37 OQB, Northern Ireland. Tel.: +44 02890 366423; fax: +44 02890 368068. E-mail address: [email protected] (S.D. Liddle).
1356-689X/$ - see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.math.2008.01.012
S.D. Liddle et al. / Manual Therapy 14 (2009) 189–196
1. Introduction The intractability of chronic low back pain (LBP; i.e. symptoms > 12 weeks or 3+ recurrent episodes within 12 months) has led to the adoption of a wide variety of treatment approaches by healthcare professionals (Cherkin, 1998; Foster et al., 1999; Gracey et al., 2002; Armstrong et al., 2003; Snook, 2004), with variable results (Cottingham and Maitland, 1997; Carpenter and Nelson, 1999; Miller and Timson, 2004). The resulting socioeconomic implications have been identified by various authors (Maniadakis and Gray, 2000; Bartley and Coffey, 2001; Ehrlich, 2003; Speed, 2004; Waddell, 2004). Current research evidence supports the use of advice and exercise for the management of chronic LBP (Hilde et al., 2002; Liddle et al., 2004; van Tulder et al., 2004; Hayden et al., 2005; Liddle et al., 2007a), and previous surveys investigating the physiotherapeutic management of LBP throughout the UK and Ireland have highlighted the popularity of these treatments (Foster et al., 1999; Gracey et al., 2002; Byrne et al., 2006). The study by Kerssens et al. (1999) is one of the few that have investigated the advice given to LBP patients; their report, compiled from a database of private practice
consultations within the Netherlands, concluded that physiotherapists’ advice for LBP management is often dependent on the individual therapist, with many differences between therapists in the amount of information provided during treatment, and in the provision of follow-up support after treatment. These findings, along with the conclusions from a recently published systematic review of the use of exercise for chronic LBP (Hayden et al., 2005), have underlined the benefits of individually tailored exercise programmes, and the influence of supervision during treatment on adherence. Similarly, when compared to a general exercise programme, one that was individually tailored to the needs and capabilities of the patient was shown to be more effective in reducing the disability and pain experienced by subacute and chronic LBP patients (Descarreaux et al., 2002). The maintenance of exercise-induced gains is often the most challenging aspect of exercise prescription, being intricately related to the successful integration of exercise science with behavioural techniques, in order to promote adherence and individual goal achievement (ACSM, 2000, p. 140). Kerssens et al. (1999) concluded that the majority of advice or information that was being given to LBP patients was specifically related to home exercises and back care instructions. There is now strong evidence from randomised controlled trials (RCTs) using advice for the management of chronic LBP, to support the use of advice to remain active in addition to specific advice relating to the most appropriate exercise, and/or functional activities for each individual patient (Liddle et al., 2007a). European guidelines for the management of chronic LBP (Airaksinen et al., 2004) support the above; however, there is evidence to suggest that such guidelines and recommendations are frequently not applied in practice (Armstrong et al., 2003; Grol and Buchan, 2006). 
No previous LBP surveys have specifically investigated the use of both advice and exercise for the management of chronic LBP in current practice, in particular the type and frequency of advice and exercise being offered. Therefore the aim of this survey was to establish the relative importance of advice and exercise for the management of chronic LBP amongst physiotherapists practicing throughout Ireland, with a specific focus on how these treatment approaches would typically be provided. In addition, given the inherent association between therapists’ treatment goals, and their choice of treatment, respondents’ treatment goals, and typical methods of assessing the outcome of treatment were also investigated.
2. Methods

2.1. Survey design

A cross-sectional (self-administered) postal questionnaire was developed to investigate physiotherapists’ use of advice and exercise for the management of chronic LBP: the specific information requested within this study precluded the use of any previously validated questionnaire for this purpose. For the purposes of this survey, chronic LBP was defined as representing patients with symptoms of greater than 12 weeks duration, or those with 3 or more recurrent episodes within the previous 12 months (Liddle et al., 2004). Questions were based on the findings of two systematic reviews carried out by the authors: the first of which investigated the type and frequency of advice provided for LBP, and the other the type and quality of exercise for chronic LBP patients (Liddle et al., 2004). A sample questionnaire is included in the Appendix (published online). The main survey was conducted between March and June of 2004.

2.2. Sampling frame

No ethics framework existed in the Republic of Ireland at the time this study was undertaken. The study protocol and questionnaire were reviewed and approved by the executive board of the Irish Society of Chartered Physiotherapists (ISCP), following which a random sample (n = 600) of ISCP members was provided for the survey: a stratified systematic sampling procedure was employed. This sample size was agreed with a statistician, based on the power calculation (from the results of a pilot study), and an anticipated 50–60% response rate. The power calculation indicated that between n = 216 and n = 385 respondents were required for 80% power, and 95% confidence that the response ratio to a given question was not due to chance. As it was not possible to identify each therapist’s practicing area of expertise, in order to identify the most relevant subgroups, information on identified clinical interests of therapists was obtained from the database.
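As a rough check on the sampling arithmetic above, the anticipated number of returned questionnaires can be compared with the range given by the power calculation. The following is a minimal sketch that uses only the figures quoted in the text; the function name and the reading of 216 and 385 as a lower and an upper target are assumptions of this example, not part of the survey protocol.

```python
# Figures quoted in the text: 600 questionnaires mailed, an anticipated
# 50-60% response rate, and 216-385 respondents from the power calculation.
MAILED = 600
REQUIRED_LOWER = 216  # assumption: treated here as the minimum target
REQUIRED_UPPER = 385  # assumption: treated here as the most demanding target


def expected_returns(mailed: int, response_pct: int) -> int:
    """Expected number of returned questionnaires for a whole-number response rate."""
    return mailed * response_pct // 100


low = expected_returns(MAILED, 50)   # pessimistic scenario
high = expected_returns(MAILED, 60)  # optimistic scenario

print(f"Anticipated returns: {low} to {high}")
print(f"Pessimistic scenario meets lower target: {low >= REQUIRED_LOWER}")
print(f"Optimistic scenario meets upper target: {high >= REQUIRED_UPPER}")
```

Under these assumptions the anticipated returns (300 to 360) exceed the lower target in both scenarios but fall short of the upper one, which is one way to see why the achieved response rate matters for the later analyses.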
From a total membership of approximately n = 1600 therapists, the random sample was subsequently drawn from n = 1000 therapists with at least one of the following clinical interests: acupuncture, manipulative therapy, sports medicine, women’s health, the workplace, community care, education, private practice. This procedure was adopted in an attempt to reduce the number of nonresponders, and respondents who were not employed in a setting that included the treatment of LBP patients.

2.3. Questionnaire

A pilot survey was undertaken to ensure the relevance of the content, and clarity of questions in the questionnaire. The final questionnaire contained a combination of 23 open and closed questions, divided into four sections: therapist’s background and LBP experience; information on the therapist’s management strategies for chronic LBP patients, and factors influencing their
prescription of exercise and advice; the specific type, frequency and mode of delivery of advice and exercise offered to chronic LBP patients; therapists’ use of outcome measures and treatment goals with chronic LBP patients. Closed-ended questions predominated, with the use of a ranked or descriptive answer format: this was intended to enhance question relevance, to allow direct comparisons between respondents, to facilitate objective analysis, and to optimise the level of data analysis where possible (Hicks, 1999, p. 20). The anonymity and confidentiality of the survey design were intended to minimise the influence of social desirability on responses (Metcalfe et al., 2001). The FORMIC software package (Formic limited, London) was used to simplify questionnaire layout, and to allow automated raw data entry into the Statistical Package for the Social Sciences for Windows, Version 11 (SPSS Inc., Cary, NJ; 1989–2001). In order to accurately monitor response rate, each questionnaire was allocated a unique identification number. Those respondents not employed in a setting that included the treatment of LBP were given the option (rather than failing to respond) of answering the first four questions, which provided information on therapist background and experience. Each questionnaire package contained a hand-signed covering letter explaining the study, and a postage-paid, preprinted return envelope (Edwards et al., 2002). Five weeks after the initial distribution of questionnaires, a reminder questionnaire package was sent to nonresponders (Edwards et al., 2002). Postage-paid preprinted postcards with simple ‘tick and return’ response options were then mailed to remaining non-responders (n = 181) after a further six weeks to investigate reasons for non-response, with a further four weeks allocated for replies.
2.4. Data analysis

Automated computer scanning was used to input answers to closed questions; responses to open questions were inputted manually. Apart from descriptive statistics (presented for all responders; n = 419), the remaining statistical analyses were completed on data from responders currently treating LBP (n = 280). As data were not normally distributed and measured on ordinal scales, non-parametric statistics were used. Where appropriate, the level of significance was set at p < .05. Chi-square (χ²) analysis was used to explore the relationship between two variables. The Friedman Test (the non-parametric equivalent of the one-way analysis of variance (ANOVA)), was used to determine if a significant difference existed between mean ranks: if a significant difference was found, then the Wilcoxon signed rank test was used to compare each rank to establish where the differences occurred. The Kruskal Wallis Test (the non-parametric equivalent of a one-way between groups ANOVA) was used to explore whether the number of treatment sessions provided to chronic LBP patients changed in relation to the therapists’ clinical grade. All statistical tests were carried out using Statistical Package for the Social Sciences for Windows, Version 11 (SPSS Inc., Cary, NJ; 1989–2001).

3. Results

3.1. Respondents

There was a 70% response rate to the survey (n = 419); 67% of respondents (n = 280/419) indicated that they currently treated LBP patients, and therefore completed the entire questionnaire: 43% (n = 119/280) were employed in a public hospital, and 41% (n = 115/280) in private practice. The remaining respondents (n = 139) did not treat LBP: 93 were employed in a public or private hospital, 34 in community care and learning disabilities, 5 in private practice, and 7 in third level education. Twenty-seven percent (n = 49) of nonresponders returned the pre-paid postcard. The most common reason given for not responding was ‘do not treat LBP’; ‘other’ reasons, such as being on (study) leave, or a career break, were also given. Since 73% (n = 132) of non-responders did not return the prepaid postcard, these results represent a very limited description of non-responders. The clinical profile of those respondents currently treating LBP (n = 280) is presented in Table 1: 78% (n = 217) were experienced clinicians with at least six years of experience treating LBP, and 44% (n = 122) had been qualified for more than 10 years. Fig. 1 details the professional development spinal courses completed by respondents.
3.2. Chronic LBP management

NB: The following analyses concentrate only on those respondents currently treating LBP patients, i.e. n = 280.
Table 1
Clinical profile of respondents currently treating LBP (n = 280).

                        Place of work
Clinical grade          Public     Private    Private    Community   Total (%)
                        hospital   hospital   practice   care
Basic                   47         5          1          1           54 (19)
Senior                  66         11         13         19          109 (39)
Clinical specialist     2          –          1          –           3 (1)
Manager                 4          2          –          3           9 (3)
Private practitioner    –          5          100        –           105 (38)
Total                   119        23         115        23          280 (100)
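The marginal totals of Table 1 can be cross-checked from the individual cell counts. The sketch below is only an illustration: the dictionary layout is an assumption of this example, and cells shown as a dash in the table are entered as zero.

```python
# Cell counts transcribed from Table 1; a dash in the table is entered as 0.
# Column order: public hospital, private hospital, private practice, community care.
TABLE_1 = {
    "Basic": [47, 5, 1, 1],
    "Senior": [66, 11, 13, 19],
    "Clinical specialist": [2, 0, 1, 0],
    "Manager": [4, 2, 0, 3],
    "Private practitioner": [0, 5, 100, 0],
}

# Row totals per clinical grade, column totals per place of work.
row_totals = {grade: sum(counts) for grade, counts in TABLE_1.items()}
column_totals = [sum(column) for column in zip(*TABLE_1.values())]
grand_total = sum(row_totals.values())

print(row_totals)
print(column_totals)
print(grand_total)
```

Both sets of margins sum to the 280 respondents who were treating LBP at the time of the survey.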
Respondents were asked to rank the type(s) of treatment they used most frequently with chronic LBP patients; higher ranks indicating more frequent use. Advice, active exercise(s), and mobilisation techniques were ranked first, second and third, respectively (see Table 2). The differences in rank (based upon averaged individual rankings) between these three most popular treatments were all statistically significant (Wilcoxon Signed Ranks Test, p < .001). Exercise accounted for the greatest amount of total treatment time respondents gave to chronic LBP patients (median = 40%); however, advice and ‘other treatments’ each accounted for a median of 30% of treatment time. Acupuncture, manipulation, and traction were typically not ranked by respondents. The number of treatment sessions most often provided ranged between 6 and 10 (64% of respondents, n = 179), with a further 27% providing a maximum of 5 sessions. Only 9% of respondents indicated that they would provide greater than 10 sessions. There were no significant differences in the number of treatment sessions provided in relation to the therapist’s clinical experience (Kruskal Wallis Test, df = 4, p = .238).

Table 2
Treatments provided for chronic LBP patients listed by rank frequency.

Name of treatment provided                    Mean rank   % As first rank
Advice                                        2.06        61.8
Active exercises                              2.56        51.1
Mobilisation techniques                       3.50        25.4
McKenzie                                      5.44        6.8
Electrotherapy (including ice or heat)        5.65        8.6
Neurogenic pain techniques/neural tension     6.24        3.2
Massage                                       6.84        5.0
Traction                                      7.59        1.8
Manipulation (grade v)                        8.28        1.1
Acupuncture                                   8.85        2.9
Other                                         9.00        3.6

Fig. 1. Percentage of respondents having completed each professional development spinal course (n = 280). Bar chart; x-axis: name of professional development spinal course; y-axis: %. Key: Mb/P, muscle imbalance/pilates; NG Pain, neurogenic pain management; MACP, Manipulation Association of Chartered Physiotherapists Manual Therapy Course; MSc MT, Masters level in Manual Therapy.

3.3. Duration of LBP
The chronic LBP subgroup represented the largest proportion of respondents’ LBP caseloads (Friedman’s Chi square = 28.690, df = 2, p < .001). Chi-square analysis revealed a significant association (χ² = 126.343, p < .001) between the place of work and the proportion of chronic LBP patients included in each respondent’s LBP caseload: of those working in public hospitals, 50% (n = 139) indicated that chronic LBP formed the largest proportion of their LBP caseload, compared to 25% (n = 70) of private practitioners.

3.4. Type and frequency of advice

Table 3 shows the mean rank given by respondents to each type of advice. There were no significant differences in the type of advice given in relation to the place of work, or the clinical experience of the therapist. Respondents typically gave advice during the treatment session along with some form of supplementary information, e.g. booklet or exercise sheet. Follow-up advice after the last treatment session was not typically provided. Respondents considered the patient’s age as having the most influence on the advice they provided to chronic LBP patients; cognitive/behavioural factors and current clinical guidelines were also considered highly influential.

Table 3
Type of advice provided for chronic LBP patients listed by rank frequency.

Type of advice                                         Mean rank   % As first rank
Advice as an adjunct to exercise                       2.33        44.3
Advice as part of a functional restoration approach    2.41        42.9
Advice to stay active                                  3.20        28.6
Advice with another intervention                       3.26        28.6
Advice as part of a back school approach               3.80        16.8

3.5. Type and frequency of exercise

Ninety-eight percent (n = 273) of respondents frequently used exercise to manage chronic LBP, and expected patients to carry out home exercises. However, only 56% (n = 156) provided a supervised exercise programme. Strengthening exercise (including core stability) was most frequently used by respondents, regardless of their clinical experience, with flexibility exercise ranked second, and aerobic exercise ranked third: there were statistically significant differences between all three ranks (Wilcoxon Signed Ranks Test, p < .001). The most influential factor for respondents when prescribing exercise to chronic LBP patients was the patient’s current pain intensity; cognitive factors were also considered important. Clinical guidelines were considered by respondents to have little influence on the exercise they prescribed (only n = 16 respondents indicated that clinical guidelines influenced their prescription of exercise to chronic LBP patients).

3.6. Respondents’ treatment goals and assessment of treatment outcome

Respondents’ treatment goals are ranked in order of importance in Table 4. There was no significant difference in the rank importance that respondents gave to improved function when compared to pain relief (ranked second) (Wilcoxon Signed Ranks Test, p = .206). The importance given by respondents to pain relief is reflected in the use of pain-related outcome measures: pain intensity was used by 70% (n = 195) of respondents to assess treatment outcome. An assessment of the patient’s satisfaction with treatment was also considered important by 60% (n = 168) of respondents. In contrast, back-specific functional outcomes, such as the Roland Morris or Oswestry Disability Questionnaires, were used by only 16% (n = 44) and 12% (n = 33) of respondents, respectively; ‘other’ outcome measures were used by 14% (n = 40) of respondents.
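The results above rest on mean ranks and on the Friedman test applied to respondents’ rankings. The authors ran their analyses in SPSS; the sketch below is not their procedure but a minimal pure-Python illustration of the textbook Friedman statistic, without a correction for ties. The function names and the four example rankings are hypothetical.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def friedman(rows):
    """Return (mean rank per column, Friedman chi-square, degrees of freedom).

    Each row holds one respondent's scores for the same k items.
    No tie correction is applied.
    """
    n = len(rows)
    k = len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    mean_ranks = [total / n for total in rank_sums]
    chi_square = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    return mean_ranks, chi_square, k - 1


# Hypothetical rankings from four respondents for three treatments (1 = used most often).
example = [[1, 2, 3], [1, 2, 3], [2, 1, 3], [1, 3, 2]]
mean_ranks, chi_square, df = friedman(example)
print(mean_ranks, chi_square, df)  # [1.25, 2.0, 2.75] 4.5 2
```

A lower mean rank corresponds to a treatment that respondents placed nearer the top of their lists, which is how the mean rank columns in the tables are read. With three items the statistic has two degrees of freedom, as in the caseload comparison reported in Section 3.3.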
Table 4
Respondents’ treatment goals listed by rank importance.

Treatment goal                                   Mean rank   % As first rank
Improved function                                2.34        47.9
Pain relief                                      2.56        55.7
Return to work/usual activities                  2.84        35.0
Change patient perceptions about chronic LBP     4.29        19.3
Prevent recurrence                               4.36        10.4
Increased spinal range of movement               4.82        5.4
Others                                           6.79        0.7

4. Discussion

The principal findings of this national survey indicate that the most frequently used treatments adopted for chronic LBP, within the Irish health system (public or private sectors), are advice and exercise, respectively. However, despite current recommendations that it is safe for this patient subgroup to remain active, that ‘hurt does not mean harm’, and respondents’ recognition of the primary importance of functional improvement, it appears that pain relief continues to be a major treatment priority for physiotherapists. Whilst respondents regularly prescribed home exercises, they did not appear to be routinely providing supervised exercise classes. This is despite the potential role of supervision in enhancing exercise adherence and thus treatment outcomes (Sluijs et al., 1993; ACSM, 2000, p. 162; Liddle et al., 2004). In addition, follow-up advice after the last scheduled face-to-face treatment, to provide support and promote long-term self management, does not appear to be a common treatment strategy. Perhaps if therapists devoted more time to incorporating supervision and follow-up, the maintenance of exercise gains in the longer term could be facilitated, helping to reduce the socioeconomic burden of chronic LBP. More investigation into why supervision and follow-up are not commonly provided is necessary to tackle this apparent shortfall in practice.

4.1. Chronic LBP management

While advice and exercise were clearly the most frequently used treatments for chronic LBP (62% and 51% of respondents, respectively), mobilisation techniques were also popular (ranked third); 25% (n = 71) of respondents indicated that they used mobilisation techniques most frequently. Respondents indicated that they typically included ‘other treatments’ with advice and exercise, and spent similar proportions of time on each. However, this trend is typical in practice, having been reported by previous authors investigating LBP management (Foster et al., 1999; Kerssens et al., 1999; Li and Bombardier, 2001; Gracey et al., 2002). The variations in treatment approach did not appear to influence the number of treatment sessions provided, with 6–10 sessions being the norm. The evidence from randomised controlled trials of chronic LBP and exercise (Liddle et al., 2004), and guidelines for exercise prescription and behaviour change (ACSM, 2000, p.
154), would suggest that it is unlikely that 6–10 sessions represents an adequate time frame for individuals with longstanding pain to develop adequate ‘self-management’ strategies, and functionally-related goals that are the necessary pre-requisites for effective long-term symptom management.

4.2. Type and frequency of advice

The frequent use of advice reported in this survey suggests that physiotherapists, regardless of their place of work or clinical experience, are aware of the need to encourage individuals during treatment to increase their activity, and learn to incorporate such changes into their daily lifestyle. Respondents also appear to be aware of the need to tailor advice according to the patient’s age, previous treatment experiences and beliefs, and to reflect current guidelines. It was reported that provision of advice throughout the treatment programme commonly included supplementary written
information as an additional guide for patients. It has been suggested that refresher programmes (following a course of treatment) may help to maintain the positive results of treatment for chronic LBP patients (Harkapaa et al., 1989; Bendix et al., 1998); however, the provision of follow-up advice is rare even in randomised controlled trials (Liddle et al., 2007a). Given that a large proportion of respondents reported frequently treating chronic LBP, perhaps the provision of follow-up advice after the last treatment session could help to decrease barriers to exercise (Middleton, 2004), reinforce the advice given during treatment, and ensure continued adherence to exercise and activity programmes: this hypothesis warrants further investigation.

4.3. Type and frequency of exercise

The majority of respondents currently treating LBP patients indicated that they frequently used exercise for the management of chronic LBP (n = 273/280). Core stabilisation has been identified as an important component of exercise programmes for chronic LBP patients, and it was clearly valued by those physiotherapists responding to this survey; of those respondents who indicated what type of active exercise they actually used, core stabilisation was by far the most popular. This finding is in keeping with those of a smaller scale survey of physiotherapists working in the acute hospital sector in Ireland (Byrne et al., 2006). It is unclear from this survey why supervision is not a common component of treatment, given its recognised value in exercise prescription (ACSM, 2000, p.
162), and support for it within clinical guidelines (Airaksinen et al., 2004); however, the fact that only n = 16 respondents indicated that clinical guidelines influenced their prescription of exercise to chronic LBP patients would suggest that perhaps current expectations about the impact of clinical guidelines are unrealistic, and therefore future guideline development should focus on the end-user, providing clear statements and educational materials (Grol and Buchan, 2006). The findings from a recent qualitative study underscore the importance that chronic LBP patients place on the provision of exercise programme supervision, not only for enhancing adherence but also for general reassurance (Liddle et al., 2007b).

4.4. Outcomes and goals of treatment

Pain relief, whilst important, is not widely considered a primary goal in chronic LBP management; rather, recent authors and groups have emphasised the importance of improving functional activities despite pain (Rainville et al., 1997; Davey and Broadbent, 1998; Frost et al., 2000; Deyo and Weinstein, 2001; Cohen and Rainville, 2002; Lively, 2002). This is also reflected in recommendations for outcome assessment in chronic
LBP, where improved function and return to work are two of the five proposed ‘core’ categories of outcome measure recommended for use with such patients (Deyo et al., 1998; Bombardier, 2000; Bombardier et al., 2001). This notwithstanding, the results of this survey highlight the emphasis still being placed by therapists on pain relief (as previously reported by Foster et al., 1999), and the apparent lack of use of clinically relevant outcome measures. However, this is a problem that is not confined to clinical practice, as this has also been highlighted as a weakness within clinical trials in this area (Liddle et al., 2004; Liddle et al., 2007a). Interestingly, pain intensity had a much greater influence on exercise prescription than level of function (66% versus 7% of respondents, respectively). The underlying reasons for this are unclear; however, it may represent an attempt by physiotherapists to incorporate patients’ treatment expectations more directly into their management of chronic LBP (Liddle et al., 2007b), and/or the influence that therapists’ attitudes to back pain may have on their treatment decisions (Pincus et al., 2005). Respondents clearly realise the value of improved function as a goal of treatment (ranked first), but the assessment of functional improvement appears to rely on subjective report and opinion: 33% of respondents who reported using ‘other’ categories of outcome measure used unvalidated subjective measures of functional improvement. This may be the result of limited time, availability and/or a lack of emphasis being placed on the clinical relevance of specific categories of outcome measure.

4.5. Limitations of the study

The principal limitation of this study was the sample size of respondents currently treating LBP (n = 280). It is important to note that the overall response rate was 70% (n = 419); however, only 67% of respondents (n = 280/419), equivalent to 47% of the original sample of 600, treated LBP patients, and therefore completed the whole questionnaire.
This factor limits the generalisability of the results, and underlines the need to make comparisons with current practice in other countries and healthcare settings (Byrne et al., 2006). It must also be borne in mind that those who take the time to respond to a questionnaire may be different from those who do not; therefore, the results of a survey cannot necessarily be generalised beyond those who have responded (Domholdt and Malone, 1985). In addition, the influence of social desirability bias on responses cannot be excluded. Finally, the authors acknowledge that the closed response format, predominantly used throughout this survey, may have led to differing interpretations of questions (Metcalfe et al., 2001); however, given the multifaceted nature of chronic LBP and its management, it was considered necessary to have some means of quantifying the data for statistical analysis.
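The proportions behind this limitation can be recomputed from the counts reported in the Methods and Results (600 questionnaires mailed, 419 returned, 280 respondents treating LBP, 49 postcards returned by 181 non-responders). The following is a minimal sketch; the helper name `pct` is an assumption of this example.

```python
SAMPLED = 600        # questionnaires mailed
RETURNED = 419       # questionnaires returned
TREATING_LBP = 280   # respondents treating LBP who completed the whole questionnaire
NON_RESPONDERS = SAMPLED - RETURNED
POSTCARDS_RETURNED = 49


def pct(part: int, whole: int) -> int:
    """Percentage of `whole` represented by `part`, rounded to the nearest integer."""
    return round(100 * part / whole)


print(f"Response rate: {pct(RETURNED, SAMPLED)}%")
print(f"Treating LBP, as a share of respondents: {pct(TREATING_LBP, RETURNED)}%")
print(f"Treating LBP, as a share of the sample: {pct(TREATING_LBP, SAMPLED)}%")
print(f"Non-responders returning the postcard: {pct(POSTCARDS_RETURNED, NON_RESPONDERS)}%")
```

The denominator matters: the 280 therapists who completed the whole questionnaire are 67% of the 419 respondents but 47% of the 600 therapists originally sampled.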
5. Conclusion

The findings of this survey demonstrate that respondents working in the public or private sector throughout Ireland recognise the value of advice and exercise for the management of chronic LBP. There is also evidence that a variety of treatments are being used alongside advice and exercise. Exercise programme supervision and follow-up advice, both considered important in facilitating the maintenance of advice and exercise-induced treatment gains, are not widely used by therapists responding to this survey. The potential benefit of follow-up advice (provided after the last treatment session), as a means of reinforcing the advice given during treatment, and ensuring continued adherence to exercise and activity programmes, warrants further investigation. The importance of treatments designed to improve chronic LBP patients’ function, using individually tailored and supervised exercise programmes, must be more strongly emphasised in clinical guidelines that focus on the end-user and provide clear statements and educational materials (Grol and Buchan, 2006), in order to reap the rewards of these treatments in the longer term.
Acknowledgements

The authors gratefully acknowledge the physiotherapists who took part in this study, and the support of the Department of Employment and Learning (Northern Ireland). There are no conflicts of interest.
Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.math.2008.01.012.
References

American College of Sports Medicine (ACSM). ACSM’s guidelines for exercise testing and prescription. 6th ed. Philadelphia: Lippincott, Williams and Wilkins; 2000.
Armstrong MP, McDonough S, Baxter GD. Clinical guidelines versus clinical practice in the management of low back pain. International Journal of Clinical Practice 2003;57:9–13.
Airaksinen O, Brox JL, Cedraschi C, Hildebrandt J, Klaber-Moffett J, Kovacs F, et al. European guidelines for the management of chronic non-specific low back pain. COST B13 Working Group, www.backpaineurope.org; 2004.
Bartley R, Coffey P. Management of low back pain in primary care. Oxford: Butterworth Heinemann; 2001. p. 19.
Bendix AF, Bendix T, Haestrup C, Busch E. A prospective, randomised 5-year follow-up study of functional restoration in chronic low back pain patients. European Spine Journal 1998;7:111–9.
Bombardier C. Outcome assessments in the evaluation of treatment of spinal disorders. Spine 2000;25:3100–3.
Bombardier C, Hayden J, Beaton DE. Minimal clinically important difference. Low back pain: outcome measures. Journal of Rheumatology 2001;28:431–8.
Byrne K, Doody C, Hurley DA. Exercise therapy for low back pain: a small-scale exploratory survey of current physiotherapy practice in the Republic of Ireland acute hospital setting. Manual Therapy 2006;11:272–8.
Carpenter DM, Nelson B. Low back strengthening for the prevention and treatment of low back pain. Medicine and Science in Sports and Exercise 1999;31:18–24.
Cherkin DC. Primary care research on low back pain: the state of the science. Spine 1998;23:1997–2002.
Cohen I, Rainville J. Aggressive exercise as treatment for chronic low back pain. Sports Medicine 2002;32:75–82.
Cottingham JT, Maitland J. A three-paradigm treatment model using soft tissue mobilisation and guided movement-awareness techniques for a patient with chronic low back pain: a case study. Journal of Orthopaedic and Sports Physical Therapy 1997;26:155–67.
Davey R, Broadbent H. Group rehabilitation for chronic back pain: a pilot study. British Journal of Therapy and Rehabilitation 1998;5:636–42.
Descarreaux M, Normand M, Laurencelle L, Dugas C. Evaluation of a specific home exercise programme for low back pain. Journal of Manipulative and Physiological Therapeutics 2002;25:497–503.
Deyo RA, Weinstein J. Low back pain. The New England Journal of Medicine 2001;344:363–70.
Deyo RA, Battie M, Beurskens AJHM, Bombardier C, Croft P, Koes B, et al. Outcome measures for low back pain research: a proposal for standardised use. Spine 1998;23:2003–13.
Domholdt EA, Malone TR. Evaluating research literature: the educated clinician. Physical Therapy 1985;65:487–91.
Edwards P, Roberts I, Clarke M, DiGuiseppi C, Pratap S, Wentz R, et al. Methods to influence response to postal questionnaires (Cochrane methodology review). The Cochrane Library 2002:4.
Ehrlich G. Back pain. The Journal of Rheumatology 2003;30(Supplement 67):26–31.
Foster N, Thompson K, Baxter GD, Allen JM. Management of nonspecific low back pain by physiotherapists in Britain and Ireland: a descriptive questionnaire of current clinical practice. Spine 1999;24:1332–42.
Frost H, Lamb SE, Shackleton CH. A functional restoration programme for chronic low back pain: a prospective outcome study. Physiotherapy 2000;86:285–93.
Gracey JH, McDonough S, Baxter GD. Physiotherapy management of low back pain: a survey of current practice in Northern Ireland. Spine 2002;27:406–11.
Grol R, Buchan H. Clinical guidelines: what can we do to increase their use? Strategies to close the gap between development and implementation of guidelines. Medical Journal of Australia 2006;185:301–2.
Harkapaa K, Jarvikoski A, Mellin G, Hurri H. A controlled study on the outcome of inpatient and outpatient treatment of low back pain: Part 1. Pain, disability, compliance, and reported treatment benefits three months after treatment. Scandinavian Journal of Rehabilitation Medicine 1989;21:81–9.
Hayden JA, van Tulder MW, Tomlinson G. Systematic review: strategies for using exercise therapy to improve outcomes in chronic low back pain. Annals of Internal Medicine 2005;142:776–85.
Hicks CM. Research methods for clinical therapists: applied project design and analysis. 3rd ed. Edinburgh: Churchill Livingstone; 1999.
Hilde G, Hagen KB, Jamtvedt G, Winnem M. Advice to stay active as a single treatment for low back pain and sciatica (Cochrane Review). The Cochrane Library 2002:3.
Kerssens JJ, Sluijs EM, Verhaak PFM, Knibbe HJJ, Hermans IMJ. Back care instructions in physical therapy: a trend analysis of individualized back care programs. Physical Therapy 1999;79:286–95.
Li LC, Bombardier C. Physical therapy management of low back pain: an exploratory survey of therapist approaches. Physical Therapy 2001;81:1018–28.
Liddle SD, Baxter GD, Gracey JH. Exercise and chronic low back pain: what works? Pain 2004;107:176–90.
Liddle SD, Gracey JH, Baxter GD. Advice for the management of low back pain: a systematic review of randomised controlled trials. Manual Therapy 2007a;12:310–27.
Liddle SD, Baxter GD, Gracey JH. Chronic low back pain: patients’ experiences, opinions and expectations for clinical management. Disability and Rehabilitation 2007b;29:1899–909.
Lively M. Sports medicine approach to low back pain. Southern Medical Journal 2002;95:642–6.
Maniadakis N, Gray A. The economic burden of back pain in the UK. Pain 2000;84:95–103.
Metcalfe C, Lewin R, Wisher S, Perry S, Bannigan K, Klaber Moffett J. Barriers to implementing the evidence base in four NHS therapies. Physiotherapy 2001;87:433–41.
Middleton A. Chronic low back pain: patient compliance with physiotherapy advice and exercise, perceived barriers and motivation. Physical Therapy Reviews 2004;9:153–60.
Miller J, Timson D. Exploring the experience of partners who live with a chronic low back pain sufferer. Health and Social Care in the Community 2004;12:34–42.
Pincus T, Vogel S, Santos R, Breen AC, Foster NE, Underwood M. The attitudes to back pain scale in musculoskeletal practitioners (ABS-MP); the development and testing of a new questionnaire. British Journal of Bone and Joint Surgery 2005;87-B(Suppl. II):207.
Rainville J, Sobel J, Hartigan C, Monlux G, Bean J. Decreasing disability in chronic back pain through aggressive spine rehabilitation. Journal of Rehabilitation Research and Development 1997;34:383–93.
Sluijs EM, Kok GJ, van der Zee J. Correlates of exercise compliance in physical therapy. Physical Therapy 1993;73:771–86.
Snook SH. Self-care guidelines for the management of non-specific low back pain. Journal of Occupational Rehabilitation 2004;14:243–53.
Speed K. ABC of Rheumatology: low back pain. British Medical Journal 2004;328:1119–21.
van Tulder M, Malmivaara A, Esmail R, Koes B. Exercise therapy for low back pain (Cochrane Review). The Cochrane Library 2004:4.
Waddell G. The back pain revolution. 2nd ed. London: Churchill Livingstone; 2004. p. 75.
Manual Therapy 14 (2009) 197–205 www.elsevier.com/math
Original Article
Reliability of assisted indentation in measuring lumbar spinal stiffness Tasha R. Stanton 1, Gregory N. Kawchuk* University of Alberta, Department of Physical Therapy, Faculty of Rehabilitation Medicine, 3-44 Corbett Hall, Common Spinal Disorders Lab, Edmonton, Alberta T6G 2G4, Canada Received 27 June 2007; received in revised form 22 January 2008; accepted 30 January 2008
Abstract
The reliability of manual methods to assess spinal stiffness is modest at best. In response, instrumentation has been developed which may be reliable, but is often difficult to use in clinical settings. The purpose of this study was to determine the intra-rater reliability of assisted indentation (AI), a smaller, less automated technique of measuring spinal stiffness in vivo. Twenty-three asymptomatic subjects were included in the study. The AI device was placed over the 4th lumbar spinous process in each prone, resting subject. Ten indentations were performed at approximately 2-min intervals while load and displacement data were collected simultaneously. From these data, two outcome variables were calculated: Global Stiffness (GS; slope of the force–displacement data) and Mean Maximal Stiffness (MMS; peak force/peak displacement). Intra-class correlation coefficient values for 10 consecutive measures of GS and MMS were 0.93 and 0.91, respectively. A repeated measures analysis of variance (ANOVA) did not demonstrate significant differences between any indentation trials from the same subject. Measurement of spinal stiffness using AI demonstrated excellent intra-rater reliability. These data, in addition to specific features of AI (small, transportable, relatively low cost, ease of operation), suggest that AI may be of benefit within clinical environments. © 2008 Elsevier Ltd. All rights reserved.
Keywords: Indentation; Assisted indentation; Reliability; Spinal stiffness; Posteroanterior compression
* Corresponding author. Tel.: +1 780 492 6891; fax: +1 780 492 4492. E-mail address: [email protected] (G.N. Kawchuk).
1 University of Sydney, School of Physiotherapy, Faculty of Health Sciences, East Street, Lidcombe, NSW 2141, Australia.
1356-689X/$ - see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.math.2008.01.011

1. Background and purpose
The manual assessment of low back stiffness remains a key tenet for many professionals who diagnose and treat low back pain. Most often, the clinical assessment of spinal stiffness involves a manual pressure test where a clinician uses their hands to apply pressure in a posteroanterior (PA) direction to the spinous process of interest. During the application of PA pressure, the clinician appreciates the resulting tissue response and forms a subjective impression of spinal stiffness. The resulting impression formed by the clinician during the pressure test is then used to judge if the spine is too compliant (hypermobility), too stiff (hypomobility), or within normal limits (Maitland et al., 2001). These judgments often provide a basis for individual treatment programs and have also been shown to be important in predicting therapeutic success when stabilization exercise programs are prescribed (Hicks et al., 2005). Unfortunately, the PA pressure test is based on human performance, interpretation and communication. As a result, the PA pressure test is highly variable in many respects including the magnitude of applied peak force (Latimer et al., 1998), the direction of force application (Caling and Lee, 2001) and in the
identification of a specific spinous process (Harlick et al., 2007) as a PA pressure target. In addition, the level of human sensitivity in detecting alterations in stiffness is limited. It has been estimated that the discrimination threshold for stiffness is of the order of 11% when using a pisiform grip to evaluate stiffness in the range of 6–11 N/mm (Maher and Adams, 1995). As a result, clinicians may be unable to perceive significant changes in spinal stiffness that occur below this threshold. Given the above, it is not surprising that stiffness values obtained from manual assessment of spinal stiffness vary considerably between clinicians (Snodgrass et al., 2006). Specifically, studies of between-clinician agreement have shown that the reliability of stiffness assessment remains poor with intra-class correlation coefficients (ICC) ranging between 0.03 and 0.55 (Fleiss, 1986; Maher and Adams, 1994; Binkley et al., 1995).
In response to the poor reliability (Fleiss, 1986; Maher and Adams, 1994; Binkley et al., 1995), large variability (Snodgrass et al., 2006), and limits of human perception (Maher and Adams, 1995) associated with manual assessment of spinal stiffness, mechanical instruments have been designed to measure the applied loads and resulting tissue deformations that occur during manual PA testing. These devices include the Spinal Physiotherapy Simulator (SPS) (Lee and Svensson, 1990), Lee and Evans’ stiffness assessment device (Lee and Evans, 1992), Stiffness Assessment Machine (SAM) (Latimer et al., 1996a–c), Spinal Posteroanterior Mobilizer (SPAM) (Edmondston et al., 1998), and the Rigid Frame Indentor (Kawchuk and Herzog, 1996). While the reliability of the majority of these instruments is high, these devices are designed primarily for research applications. As a result, many features of these devices such as their size, expense, and complex operation preclude their use in clinical settings.
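To put the 11% discrimination threshold cited above in concrete terms, the short sketch below converts it into an absolute stiffness change at a few baseline values. It is illustrative only: the function name is hypothetical, and treating the threshold as a fixed fraction across the whole 6–11 N/mm range is an assumption made for the example.

```python
# Illustrative only: the 11% figure is the discrimination threshold reported
# by Maher and Adams (1995). Applying it as a fixed fraction over the whole
# 6-11 N/mm range is an assumption made for this sketch.
DISCRIMINATION_FRACTION = 0.11


def smallest_detectable_change(stiffness_n_per_mm, fraction=DISCRIMINATION_FRACTION):
    """Return the stiffness change (N/mm) that sits at the discrimination threshold."""
    return stiffness_n_per_mm * fraction


for baseline in (6.0, 8.5, 11.0):
    delta = smallest_detectable_change(baseline)
    print(f"baseline {baseline:.1f} N/mm: changes below about {delta:.2f} N/mm may go unnoticed")
```

Under this assumption, the absolute change that escapes manual detection grows with the baseline stiffness being assessed.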
To exploit the increased performance of mechanical devices in assessing spinal stiffness yet avoid the limitations common to these research-based devices, a new stiffness assessment technique is proposed. This technique, assisted indentation (AI), uses manual load application with the addition of instrumentation designed to assist the operator and improve reliability and accuracy. Given recent findings that indicate stiffness may be a variable which helps predict outcome success (Childs et al., 2004; Hicks et al., 2005), there may be a future clinical need for a device which can measure stiffness accurately and reliably. Although the accuracy of AI has been shown to be excellent (absolute maximal difference of 0.22 mm compared to gold standard) (Kawchuk et al., 2006), the reliability of AI has yet to be determined. Therefore, the purpose of this study was to measure the in vivo, within-operator reliability of AI measurements of spinal stiffness. It was hypothesized that AI reliability would be excellent (ICC greater than 0.75) (Fleiss, 1986).
2. Methods
2.1. Subjects
Following approval from the University of Alberta Health Research Ethics Board, 23 consenting subjects were recruited from the University of Alberta and surrounding area over a 1-month period. This sample size was calculated a priori using a power of 80% and a level of significance of p = 0.05.
2.1.1. Inclusion criteria
Study subjects included asymptomatic males and females between the ages of 18 and 30 with no history of low back pain within the last year as well as no current low back pain.
2.1.2. Exclusion criteria
Subjects were excluded from this study if they reported back pain and/or medical conditions that could affect the safety of measurement of spinal stiffness using AI and/or intolerance to screening procedures designed to identify those persons sensitive to direct spinous process loading. Please refer to Table 1 for a detailed list of exclusion criteria.

Table 1 Exclusion criteria.
Injury related: Current low back pain; Low back pain within the last year; Previous back surgery; Lower extremity injury within the last year
Disease processes (Maitland, 1986): Osteoporosis; Ankylosing spondylitis; Known malignancy; Known spondylolisthesis; Multiple sclerosis; Severe scoliosis; Osteoarthritis; Rheumatoid arthritis
Subject factors: Pregnancy (unsure or confirmed); Medications affecting muscle function (e.g. steroids); Medications affecting pain recognition (e.g. pain medications); Unable to tolerate indentation

2.2. Research design
This study quantified (1) single operator reliability of AI measures in a sample human population and (2) repeatability of AI measures generated by a single operator within single subjects.
2.3. Instrumentation
A description of the device used to perform AI has been published previously (Fig. 1) (Kawchuk et al., 2006). In brief, the AI equipment is made up of an outer frame that is supported by an external support arm (Tenet Medical Engineering, Calgary, Alberta, Canada). The use of this rigid arm creates a stationary reference point. These structures suspend an inner probe that is moved manually to apply an external force to the anatomical target of interest. By using a ceramic air-bearing to hold the indenting probe, near frictionless movement of the inner probe with respect to the outer frame can be achieved, thereby reducing artifacts due to movement of the frame during indentation loading. To measure applied force, a compressive-tension load cell (Entran, Fairfield, NJ) is connected in series with the probe. The displacement of the probe is measured by a linear variable differential transformer (LVDT) (Honeywell International Inc., Morristown, NJ) attached between the probe and the outer housing. Because the displacement of the indenter is initiated by a manual process, but restricted by mechanical boundaries, this form of indentation is called “Assisted Indentation”. Signals from the load cell and the LVDT were conditioned appropriately and collected by customized LabVIEW software (National Instruments, Austin, TX) at a collection rate of 200 Hz.

Fig. 1. Assisted Indentor.

2.4. Calibration
Calibration of the assisted indentation device was achieved using masses of known magnitude applied to the load cell and spacers of known dimensions applied to the LVDT. After each application of increasing calibration mass or dimension, force and displacement signals were collected then plotted against the known mass or dimension. These data were then modeled with a linear data fitting technique. In each case, the r² value of the line of best fit was greater than 0.90. The resulting equation of the line of best fit was then used to determine the units of measure for the output voltage of each transducer. Calibration was completed prior to subject testing.
2.5. Spinal stiffness measurement
In each prone subject, the AI device was placed perpendicular to the L4 spinous process with a contact load of less than 1 N (Fig. 2). The subject was then instructed to breathe out comfortably then to hold his/her breath for the duration of the indentation (approximately 5 s) (Kawchuk and Fauvel, 2001). During indentation, the indentation probe was advanced manually (approximately 2 mm/s) into the spine until a force threshold of 100 N was read from a visual indicator. This level of force application was considered to be safe as forces up to 200 N have been used within an asymptomatic human population (Latimer et al., 1998) and forces up to 105 N within a symptomatic human population (Latimer et al., 1996b) without any adverse effects reported. When the 100 N threshold was reached, the indentor position was maintained at this load for approximately 1 s after which the indenter was removed from contacting the subject.

Fig. 2. Placement of the Assisted Indentor on a subject.

To decrease variability in the rate of indentation loading, the equipment operator viewed a computerized bar graph which increased in size at a rate of 2 mm/s. Next to this graph, a second bar graph displayed the actual displacement of the AI device. With these two displays, the operator could continually adjust their performance to match the desired indentation rate.
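Both signals used in this procedure depend on the linear calibration described in Section 2.4. The following is a minimal sketch of that kind of fit for the load cell, not the authors' software: the masses, voltages, noise values and function names are all hypothetical, standard gravity is assumed for converting mass to force, and only the requirement that r² exceed 0.90 comes from the text.

```python
G = 9.80665  # m/s^2, standard gravity (assumed for converting mass to force)


def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus r^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical calibration run: known masses (kg) and made-up load cell voltages.
masses_kg = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
known_force_n = [m * G for m in masses_kg]
noise_v = [0.003, -0.002, 0.001, -0.001, 0.002, -0.003]
volts = [0.02 + 0.05 * f + e for f, e in zip(known_force_n, noise_v)]

# Signal plotted against the known quantity, as in the calibration description.
slope, intercept, r2 = fit_line(known_force_n, volts)
if r2 <= 0.90:
    raise ValueError("calibration fit does not exceed the r^2 level reported in the text")


def volts_to_newtons(v):
    """Invert the fitted line so a measured voltage is expressed in newtons."""
    return (v - intercept) / slope


print(f"slope={slope:.5f} V/N, intercept={intercept:.5f} V, r^2={r2:.5f}")
```

The same pattern would apply to the LVDT, with spacer thickness in millimetres taking the place of force.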
2.6. Study procedure
Once informed consent was obtained from the subjects, a verbal history questionnaire was completed to ensure that subjects met the inclusion criteria and did not possess any factors that would cause exclusion from the study. Following the questionnaire, each subject’s height and weight were recorded, and Body Mass Index (BMI; kg/m²) was calculated (Astrand et al., 2003). With the subject lying prone on a plinth, the subject’s spine was palpated by the researcher and the L4 spinous process identified. Although identification of spinous processes in the lumbar spine has demonstrated moderate accuracy with use of preferred palpation procedures (47% were on the level intended) (Harlick et al., 2007), a standardized procedure was utilized in this study to reduce this error. Specifically, the horizontal line between the iliac crests was used to identify the L4/5 interspace and the vertebra above was determined to be L4 (if this line between the iliac crests gave a spinous process, this was identified as L4) (Grieves, 1984). The L4 vertebra was chosen as the site of indentation as this has been shown to be a commonly symptomatic area in patients with low back pain (Maitland et al., 2001). The skin over the presumed L4 spinous process was then marked using a pen to provide a visual guide for placing the indenter. The indenter was then placed over the ink marking and a series of five consecutive indentations were provided to familiarize subjects with the indentation process. Once the familiarization indentations were completed, 10 consecutive spinal stiffness measurements (indentations) were collected, each separated by a time period of approximately 2 min. During times between indentations, subjects were instructed to remain in a resting prone position and to remain stationary and relaxed. Each subject was examined at one time period. Indentations were performed by one researcher (T.S.) who had logged approximately 100 h of using the indentation device prior to data collection for this experiment.
During the indentation process, all subjects held an analog trigger to indicate if their level of discomfort during indentation increased from baseline. If the subject wanted indentation to cease for any reason they were instructed to squeeze the trigger fully, which produced an audible alarm alerting the researcher to remove the indentor. If this situation occurred, the researcher re-positioned the indentor and indentation was attempted again. Re-positioning of the indentor was allowed a maximum of two times, after which further indications of painful indentation excluded the subject from further participation.
2.7. Analysis of spinal stiffness measurement
Indentation data (force and displacement) were used to calculate the spinal stiffness at the indentation site. Stiffness was quantified in two ways: (1) Global Stiffness (GS); and (2) Mean Maximal Stiffness (MMS). GS, calculated as the slope of the force–displacement curve between 30 N and maximal force, represents the stiffness of the underlying tissues during the indentation itself. It is assumed that the relationship between force and displacement is linear between 30 and 100 N given previous work (Latimer et al., 1996b). MMS, the second variable representing stiffness, was computed by taking the average stiffness value (N/mm) over the time period where the maximal indentation force has been held for a period of approximately 1 s. The MMS variable is therefore a ratio between the applied maximal force and the resultant maximal displacement of the underlying tissues (Fig. 3).
2.8. Statistical analysis
For data analysis purposes, all five of the familiarization trials were discarded (Latimer et al., 1996b,c). In addition, the first trial (stiffness measurement during rest) of the 10 experimental indentations was discarded as this trial has been shown to be highly variable (Latimer et al., 1996b,c) while stiffness measurements from subsequent trials (after the first trial) have demonstrated stability (Latimer et al., 1996b,c). To assess intra-rater reliability of the researcher/instrument in measuring spinal stiffness, the intra-class correlation coefficient (3,1) was calculated (Shrout and Fleiss, 1979). To describe repeatability, inter-trial inconsistency values for stiffness variables were calculated by taking the difference between two consecutive indentations expressed as a percentage of the average of the same two indentations.
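The two stiffness variables defined in Section 2.7 can be sketched as follows. This is one hedged reading of those definitions, not the authors' analysis code: the function names are hypothetical, the trace is synthetic, GS is taken as a least-squares slope over loading samples at or above 30 N up to the first peak-force sample, and MMS is taken as the mean of force divided by displacement over roughly 1 s starting at that sample.

```python
SAMPLE_RATE_HZ = 200  # collection rate stated in the paper


def ls_slope(x, y):
    """Least-squares slope of y against x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def global_stiffness(force, disp, lower_n=30.0):
    """GS: slope of force vs displacement from lower_n up to peak force (N/mm)."""
    peak = force.index(max(force))  # first sample at which peak force occurs
    pairs = [(d, f) for d, f in zip(disp[:peak + 1], force[:peak + 1]) if f >= lower_n]
    return ls_slope([d for d, _ in pairs], [f for _, f in pairs])


def mean_maximal_stiffness(force, disp, hold_s=1.0, rate_hz=SAMPLE_RATE_HZ):
    """MMS: mean of force / displacement over the hold at peak force (N/mm)."""
    peak = force.index(max(force))
    n_hold = int(hold_s * rate_hz)
    ratios = [f / d for f, d in zip(force[peak:peak + n_hold], disp[peak:peak + n_hold])]
    return sum(ratios) / len(ratios)


# Synthetic indentation (made-up shape): 2 mm/s loading sampled at 200 Hz,
# a curved toe region up to 30 N, then a linear region of 14 N/mm up to 100 N.
disp = [i / 100 for i in range(901)]  # 0 to 9 mm in 0.01 mm steps
force = [30.0 * (d / 4.0) ** 2 if d <= 4.0 else 30.0 + 14.0 * (d - 4.0) for d in disp]
disp += [9.0] * 200    # hold for about 1 s at peak displacement
force += [100.0] * 200

gs = global_stiffness(force, disp)
mms = mean_maximal_stiffness(force, disp)
print(f"GS = {gs:.2f} N/mm, MMS = {mms:.2f} N/mm")
```

With a toe region in the trace the two variables differ, because MMS includes displacement accumulated below 30 N while GS does not.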
Finally, to further explore repeatability and investigate the possibility that a gradual change in stiffness values may occur with successive indentations, a condition that may not be reflected in ICC values, a repeated measures analysis of variance (ANOVA) with a Bonferroni correction was performed.
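As a sketch of the reliability statistics named in Section 2.8, the block below computes ICC(3,1) from two-way ANOVA mean squares, following the Shrout and Fleiss (1979) form (BMS − EMS) / (BMS + (k − 1)·EMS), and the inter-trial inconsistency for consecutive trials. The function names and the stiffness values are hypothetical, and taking the absolute value of the difference between consecutive trials is an assumption, since the text does not state whether the difference is signed.

```python
def icc_3_1(scores):
    """ICC(3,1) for a subjects-by-trials table (list of equal-length rows)."""
    n = len(scores)        # subjects
    k = len(scores[0])     # trials per subject
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_trials = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_subjects - ss_trials
    bms = ss_subjects / (n - 1)               # between-subjects mean square
    ems = ss_error / ((n - 1) * (k - 1))      # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems)


def inter_trial_inconsistency(trials):
    """Percent difference between consecutive trials, relative to their mean."""
    return [abs(b - a) / ((a + b) / 2.0) * 100.0 for a, b in zip(trials, trials[1:])]


# Hypothetical stiffness values (N/mm): three subjects, three trials each.
stiffness = [
    [10.0, 11.0, 10.0],
    [14.0, 14.0, 15.0],
    [12.0, 13.0, 12.0],
]

icc = icc_3_1(stiffness)
inconsistency = inter_trial_inconsistency(stiffness[0])
print(f"ICC(3,1) = {icc:.3f}")
print("inter-trial inconsistency (%):", [round(v, 2) for v in inconsistency])
```

Because ICC(3,1) removes the systematic trial effect before forming the ratio, a gradual drift across trials would not necessarily lower it, which is the reason the text gives for the additional repeated measures ANOVA.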
Fig. 3. Graphical example of stiffness measurement output. The top graph represents the load–displacement curve of a single AI (force on y-axis, displacement on x-axis) with vertical white bars representing the section over which slope was taken (GS). The bottom graph shows the indentation profile (numerical scale for force and displacement on y-axis, time on x-axis). In the bottom graph, the upper trace is the applied force while the bottom trace is the resultant displacement. The vertical white bars represent the section over which force was divided by displacement (MMS).
3. Results
A total of 30 subjects were recruited to participate in this project with three excluded due to previous back or lower extremity injury within the last year, two excluded for exceeding the age limit, and two excluded prior to formal testing (did not pass the indentation screening procedure in that they reported discomfort with indentation even after the indentor was re-positioned twice). This resulted in 12 male and 11 female subjects who participated in this study (n = 23) (see Table 2 for subject demographics). In this experiment, the reliability of the stiffness measures was described by the ICC which was calculated to be 0.91 for GS and 0.93 for MMS. Additionally, an estimate of the consistency in stiffness measures was obtained by calculating the inter-trial inconsistency value which was 6.23% (4.52%) for the GS and 7.71% (5.33%) for MMS (see Figs. 4 and 5 for individual subject representation of inter-trial inconsistency values). The repeated measures ANOVA did not reveal significant differences between any indentation trials for either GS or MMS (p = 0.09–1.00 and p = 1.00 for all comparisons, respectively). See Figs. 6 and 7 for the graphical representation of the change in stiffness values over time.
4. Discussion
Data from this study support the hypothesis that AI has excellent reliability (ICC > 0.75) (Fleiss, 1986). Specifically, AI exhibited excellent intra-rater reliability for all outcome variables used to quantify L4 stiffness. Furthermore, the average inter-trial inconsistency remained below 8% for all stiffness variables. Compared to the manual testing of stiffness, ICC values found for the AI technique were much higher (Table 3). Overall, reliability values for the evaluation of spinal stiffness using the manual PA pressure test have been found to be poor (Matyas and Bach, 1985; Maher and Adams, 1994; Binkley et al., 1995). Matyas and Bach (1985) first found poor reliability of manual PA stiffness assessment when they reported Pearson’s r ranging from 0.09 to 0.46. Unfortunately, these reliability results using Pearson’s r cannot be compared directly to the current study. Later studies also noted poor reliability with ICC (1,1) values ranging from 0.03 to 0.37 (Maher and Adams, 1994; Binkley et al., 1995). With improvements to the testing protocol and delineation of stiffness into ranges, reliability increased to a fair level (Fleiss, 1986) with an ICC value reported to be 0.55 (range 0.50–0.62) (Maher et al., 1998). The ICC value of the PA pressure test increased further when an 11-point stiffness rating scale was employed and more rigorously controlled testing protocols were used (ICC = 0.77) (Maher et al., 1998). Although improvements in the reliability of the manual assessment of spinal stiffness have been demonstrated, these improvements occur only under standardized and artificial conditions that are not typically employed in the clinical environment. It may be argued that any form of instrumented stiffness assessment, such as AI, is also not typical of the clinical procedures (i.e. PA testing) due to increased size of the instrumentation and necessary operator training. However, if the desire is to objectively quantify stiffness in a reproducible way, then changing clinical practice to involve use of scales to delineate stiffness levels or involve use of an instrument becomes important. If changes to clinical practice are mandated and/or desirable, using a method of stiffness assessment that combines high reliability values with minimal clinical invasiveness is paramount. With this in mind, AI may become a viable option for clinical stiffness testing due to its excellent reliability values as well as a design that allows for ease of use by a single operator in a small footprint, low cost device (approximately $10,000 Canadian) that does not require advanced mechanization such as motors, pulleys or pistons.

Table 2 Mean (standard deviation) of subject demographic characteristics.
                 Male (n = 12)     Female (n = 11)
Age (years)      26.17 (3.10)      24.45 (3.21)
Height (m)       1.79 (0.065)      1.63 (0.052)
Weight (kg)      76.23 (9.64)      58.41 (8.28)
BMI (kg/m²)      23.85 (2.39)      21.59 (2.21)

Fig. 4. Inter-trial inconsistency values (mean ± standard deviation) for GS estimates of L4 stiffness values.

The observation that AI exhibits greater reliability than manual assessment of spinal stiffness was expected for three reasons. First, AI measures several variables in an objective manner, increasing the reliability of spinal stiffness assessment. Specifically, use of technology to quantify force and displacement data (load cell and
an LVDT, respectively), in addition to customized computer programming, allows consistency of force application and real-time visualization of results. Second, AI reduces variability in factors shown to alter spinal stiffness measures including visual occlusion (Maher and Adams, 1996), peak force (Latimer et al., 1998), frequency of PA loading (Lee and Svensson, 1990; Lee and Liversidge, 1994), direction of force application (Caling and Lee, 2001), and force angulation (Kawchuk and Herzog, 1996). Finally, we elected to employ stiffness variables which considered regions of data that were larger than those used in previous studies. This approach was chosen because the most clinically important region of a load–displacement graph remains unknown. While there is some evidence to suggest that stiffness may play a role in predicting outcomes of specific treatments (Childs et al., 2004; Hicks et al., 2005), an understanding of the physiologic basis of spinal stiffness, or its alteration due to pathology or treatment, remains elusive.

Fig. 5. Inter-trial inconsistency values (mean ± standard deviation) for MMS estimates of L4 stiffness values.

Fig. 6. Change in GS values over time for all subjects. GS values (mean ± standard deviation) normalized to indentation trial 2.

With respect to other studies, the reliability values for AI, although slightly lower, are comparable to those found for mechanical indentation devices (Table 3). Intra-class correlation coefficient values have been reported to be over 0.90 for almost all mechanical indentation instruments. Specifically, the SPAM was found to have an ICC value of 0.979 at L5 (Edmondston et al., 1998), Lee and Evans’ stiffness assessment device had an ICC value of 0.99 for L3/4 and 0.95 for L4/5 (Lee and Evans, 1992), SAM had an ICC value of 0.96 for lumbar vertebrae (Latimer et al., 1996a–c), and Rigid Frame Indentation had ICC values of 0.99–1.00 for varying experimental conditions (Kawchuk and Herzog, 1996). Interestingly, the reliability of AI was higher than that of the SPS, for which an ICC value of 0.88 was found at L3 (Lee and Svensson, 1990). That mechanical indentation devices have higher reliability values (overall) than AI is expected. While the rate of indentation during the AI procedure is standardized using a visual cue (graphic display of force data), slight variations in the rate of indentation were likely to occur. These variations may alter the resulting measures as (1) the target tissues are viscoelastic and may exhibit rate-dependent behaviors (White and Panjabi, 1990) and (2) variations in the data may influence stiffness analysis techniques such as GS which is based on a linear approximation of shape.
While these variations were not of sufficient magnitude to create poor reliability, they may account for the slightly lower reliability values that occur with AI compared to other automated techniques. Further support for the comparability of AI to mechanical techniques is suggested by our repeated measures ANOVA results; no significant differences were present between any of the indentation trials for both GS and MMS measures. This observation suggests that all indentations using the AI found similar stiffness values regardless of the time at which the stiffness measure was taken. This finding strengthens the excellent reliability values by demonstrating consistency over time with the stiffness measurements. Further, these ANOVA data suggest that repeated AI trials do not affect viscoelasticity of the target tissue. This is likely due to tissues reaching a steady state of viscoelastic change following sufficient familiarization trials and experimental indentations and/or adequate time between all indentations such that between-trial tissue recovery was complete.

Fig. 7. Change in MMS values over time for all subjects. MMS values (mean ± standard deviation) normalized to indentation trial 2.

Table 3 Intra-class correlation values for three methods of stiffness assessment.
Method of stiffness assessment    ICC value
Manual                            0.03–0.77
Mechanical                        0.88–1.00
Assisted indentation              0.91–0.93

It should be noted that large differences in individual subject inter-trial inconsistency values were exhibited with some single subjects having inter-trial inconsistency values approaching 30% (1 SD). This suggests that the consistency of stiffness results obtained by AI may be specific to the individual and may be influenced by other factors not defined in this study. Possible factors that could explain measurement inconsistency with these few subjects may include inconsistent localization of the indentation contact point between trials or failure to control subject specific factors which influence stiffness (e.g. intra-abdominal pressure, muscle contraction, subject movement, etc.) (Kawchuk and Fauvel, 2001). In this situation, changes in measured spinal stiffness may occur as the indentation test may involve different anatomy. In addition, the subject’s baseline stiffness could also be a confounding factor. Although a formal analysis was not performed, it was observed that those subjects with high baseline stiffness values for GS and MMS (stiff back) often had large changes in their stiffness values over time. Finally, variables such as plinth padding (Maher et al., 1999), subject positioning (Edmondston et al., 1998), adipose tissue (Viner et al., 1997), and breathing (Beaumont et al., 1991) must be controlled within a single subject if stiffness measures are to be compared within the same subject over time.
Several limitations of this study are noted. First, only intra-operator reliability was measured in asymptomatic subjects. As a result, we cannot comment on intra-operator reliability in a symptomatic population nor inter-operator reliability. Second, our results apply specifically to a sample of patients with an average age of 25 years and average BMI of 22 kg/m²; generalization to those outside this group is unwarranted.
5. Conclusion
Measurement of spinal stiffness using AI demonstrated excellent intra-rater reliability.
Due to the smaller and less cumbersome nature of AI compared to other mechanical instruments, AI may be a viable technology for clinical use; however, further research is needed to quantify inter-rater reliability and to investigate the responsiveness of this instrument (sensitivity and specificity) to alterations in stiffness values.
Acknowledgements
Funding for this project and for Tasha Stanton was provided by the Province of Alberta Graduate Scholarship, the Strathcona Physiotherapy Foundation and NSERC. Support for Greg Kawchuk was supplied by the Canada Research Chairs program. We would like to express our sincere appreciation to Gian Jhangri and Dr. Trish Manns for their statistical assistance, Al Fleming and Sam Graziano for their technical support, and the members of the Common Spinal Disorders Lab at the University of Alberta for their feedback and support.
References
Astrand PO, Rodahl K, Dahl HA, Stromme SB. Textbook of work physiology: physiological bases of exercise. 4th ed. Champaign, IL: Human Kinetics; 2003.
Beaumont A, McCrumb C, Lee M. Differences in the posteroanterior stiffness of the lumbar spine during tidal breathing and breath holding. In: Proceedings of the seventh biennial conference of the manipulative physiotherapists association of Australia, Sydney, New South Wales, Australia 1991, p. 244–51.
Binkley JM, Stratford PW, Gill C. Intrarater reliability of lumbar accessory motion mobility testing. Phys Ther 1995;75:786–92.
Caling B, Lee M. Effect of direction of applied mobilization force on the posteroanterior response in the lumbar spine. J Manipulative Physiol Ther 2001;24:71–8.
Childs JD, Fritz JM, Flynn TW, Irrgang JJ, Johnson KK, Majkowski GR, et al. A clinical prediction rule to identify patients with low back pain most likely to benefit from spinal manipulation: a validation study. Ann Intern Med 2004;141:920–8.
Edmondston SJ, Allison GT, Gregg CD, Purden SM, Svansson GR, Watson AE. Effect of position on the posteroanterior stiffness of the lumbar spine. Man Ther 1998;3:21–6.
Fleiss JL. The design and analysis of clinical experiments. 1st ed. New York: Wiley; 1986.
Grieves G. Mobilisation of the spine: notes on examination, assessment, and clinical method. 4th ed. Edinburgh; New York: Churchill Livingstone; 1984.
Harlick JC, Milosavljevic S, Milburn PD. Palpation identification of spinous processes in the lumbar spine. Man Ther 2007;12:56–62.
Hicks GE, Fritz JM, Delitto A, McGill SM. Preliminary development of a clinical prediction rule for determining which patients with low back pain will respond to a stabilization program. Arch Phys Med Rehabil 2005;86:1753–62.
Kawchuk G, Herzog W. A new technique of tissue stiffness (compliance) assessment: its reliability, accuracy and comparison with an existing method. J Manipulative Physiol Ther 1996;19:13–8.
Kawchuk GN, Fauvel OR. Sources of variation in spinal indentation testing: indentation site relocation, intraabdominal pressure, subject movement, muscular response, and stiffness estimate. J Manipulative Physiol Ther 2001;24:84–91.
Kawchuk G, Liddle T, Fauvel R. The accuracy of ultrasonic indentation: a comparison of three techniques. J Manipulative Physiol Ther 2006;29:126–33.
Latimer J, Lee M, Adams R. The effects of high and low loading forces on measured values of lumbar stiffness. J Manipulative Physiol Ther 1998;21:157–63.
Latimer J, Lee M, Adams R, Moran C. An investigation of the relationship between low back pain and lumbar posteroanterior stiffness. J Manipulative Physiol Ther 1996a;19:587–91.
Latimer J, Lee M, Goodsell M, Maher C, Wilkinson B, Adams R. Instrumented measurement of spinal stiffness. Man Ther 1996b;1:204–9.
Latimer J, Goodsell MM, Lee M, Maher CG, Wilkinson BN, Moran CC. Evaluation of a new device for measuring responses to posteroanterior forces in a patient population, Part I: Reliability testing. Phys Ther 1996c;76:158–65.
Lee R, Evans J. Load–displacement–time characteristics of the spine under posteroanterior mobilisation. Aust J Physiother 1992;38:115–23.
Lee M, Liversidge K. Posteroanterior stiffness at three locations in the lumbar spine. J Manipulative Physiol Ther 1994;17:511–6.
Lee M, Svensson NL. Measurement of stiffness during simulated spinal physiotherapy. Clin Phys Physiol Meas 1990;11:201–7.
Maher CG, Adams R. Reliability of pain and stiffness assessments in clinical manual lumbar spine examination. Phys Ther 1994;74:801–9.
Maher C, Adams R. A psychophysical evaluation of manual stiffness discrimination. Aust J Physiother 1995;41:161–7.
Maher CG, Adams RD. Stiffness judgments are affected by visual occlusion. J Manipulative Physiol Ther 1996;19:250–6.
Maher CG, Latimer J, Adams R. An investigation of the reliability and validity of posteroanterior spinal stiffness judgments using a reference-based protocol. Phys Ther 1998;78:829–37.
Maher CG, Latimer J, Holland MJ. Plinth padding confounds measures of posteroanterior stiffness. Man Ther 1999;14:145–50.
Maitland GD. Vertebral manipulation. 5th ed. London: Butterworth-Heinemann; 1986.
Maitland GD, Hengeveld E, Banks K, English K. Maitland’s vertebral manipulation. 6th ed. London: Butterworth-Heinemann; 2001.
Matyas T, Bach TM. The reliability of selected techniques in clinical arthrometrics. Aust J Physiother 1985;31:175–99.
Shrout PE, Fleiss JL. Intraclass correlation: uses in assessing rater reliability. Psychol Bull 1979;86:420–8.
Snodgrass SJ, Rivett DA, Robertson VJ. Manual forces applied during posterior-to-anterior spinal mobilization: a review of the evidence. J Manipulative Physiol Ther 2006;29:316–29.
Viner A, Lee M, Adams R. Posteroanterior stiffness in the lumbosacral spine: the correlation between adjacent vertebral levels. Spine 1997;22:2724–9 [discussion 2729–30].
White AA, Panjabi MM. Clinical biomechanics of the spine. 2nd ed. Philadelphia: Lippincott; 1990.
Manual Therapy 14 (2009) 206–212 www.elsevier.com/math
Original Article
Reliability, validity and responsiveness of the French version of the questionnaire Quick Disability of the Arm, Shoulder and Hand in shoulder disorders

Fouad Fayad a,*, Marie-Martine Lefevre-Colau b,e, Vincent Gautheron c,e, Yann Macé a, Jacques Fermanian d, Anne Mayoux-Benhamou a, Alexandra Roren a, François Rannou a, Agnès Roby-Brami e, Michel Revel a,e, Serge Poiraudeau a,e

a Department of Rehabilitation, Assistance Publique-Hôpitaux de Paris (AP-HP), Cochin Hospital, Paris Descartes University, 27 Rue du Faubourg Saint Jacques, 75679 Paris Cedex 14, France
b Department of Rehabilitation, AP-HP, Corentin-Celton Hospital, Paris Descartes University, Issy-les-Moulineaux, France
c Department of Rehabilitation, Bellevue Hospital, Jean Monnet University, Saint-Etienne, France
d Department of Biostatistics, AP-HP, Necker Hospital, Paris Descartes University, Paris, France
e Institut Fédératif de Recherche sur le Handicap (IFR 25), Institut National de la Santé et de la Recherche Médicale (INSERM), Paris, France

Received 27 July 2007; received in revised form 24 January 2008; accepted 30 January 2008
Abstract

We assessed the reliability, validity and responsiveness of the French short version of the scale Disability of the Arm, Shoulder and Hand-Disability/Symptom (F-QuickDASH-D/S) in patients with shoulder disorders. We extracted QuickDASH item responses from the responses to the full-length DASH questionnaire completed by 153 patients. In addition to collecting demographic and clinical data, subjective assessment of activities of daily living (ADL), active range of motion (ROM), and measurement of abduction strength (strength) were recorded by use of the Constant scale. Cronbach’s alpha coefficient was 0.89. The intraclass correlation coefficient was 0.94, which suggested excellent test–retest reliability. Correlation of the F-QuickDASH-D/S score with scores for F-DASH-D/S (r = 0.96), handicap (r = 0.79), ADL (r = 0.73), pain during activities (r = 0.63), strength (r = 0.58), pain at rest (r = 0.57) and ROM (r = 0.51) indicated good construct validity. Factor analysis identified 2 factors accounting for 59.1% of the variance. The responsiveness of F-QuickDASH-D/S was excellent, with standardized response mean and effect size values of 1.09 and 1.23, respectively. The F-QuickDASH-D/S has good reliability, construct validity and responsiveness. The strong correlation of its score with the full-length DASH-D/S scale score suggests that the QuickDASH-D/S could be the preferred scale because it is easier to use.

© 2008 Elsevier Ltd. All rights reserved.

Keywords: Shoulder; Disability; QuickDASH questionnaire; Outcome measure
* Corresponding author. Tel.: +33 1 58 41 25 41; fax: +33 1 58 41 25 45. E-mail address: [email protected] (F. Fayad). doi:10.1016/j.math.2008.01.013

1. Introduction

Symptomatic shoulder disorders constitute the third most common musculoskeletal reason, after back and neck pain, for consultation in medical practice (Rekola et al., 1993; Linsell et al., 2006; Feleus et al., 2008). Patients’ subjective perception of their disease status is decisive for both diagnostic work-up and subsequent therapeutic management. In addition, patient-reported outcome measures have become an important part of the assessment used in clinical studies. Numerous shoulder outcome-measure instruments are available (Fayad et al., 2005). The Disability of the Arm, Shoulder, and Hand scale (DASH) (Hudak et al., 1996) is among the self-administered questionnaires rated best for clinimetric properties (Bot et al., 2004; Gabel et al., 2006). From the original 30-item DASH questionnaire, a shorter, 11-item version, the QuickDASH, was recently developed (Beaton et al., 2005). The psychometric properties of the QuickDASH are similar to those of the original questionnaire, and the QuickDASH may be preferred because it takes less time to complete and carries less administrative burden. Furthermore, the QuickDASH has been selected by the American Medical Association’s Guides to the Evaluation of Permanent Impairment as the functional assessment measure of the upper extremity (Matheson et al., 2006).

Cross-cultural adaptation of validated outcome instruments has been advocated to facilitate their use in international multicenter clinical trials (Ware et al., 1995), which would also reduce the need to develop new instruments with the same purpose. The full-length version of the DASH has been translated into or validated in several languages (Atroshi et al., 2000; Dubert et al., 2001; Offenbaecher et al., 2002; Padua et al., 2003; Lee et al., 2005). These references are examples only; many more language versions exist. However, to date only the English, Swedish and Japanese versions of the QuickDASH have been validated (Beaton et al., 2005; Gummesson et al., 2006; Imaeda et al., 2006).

We aimed to assess the reliability, validity and responsiveness of the French version of the Disability/Symptom scale of the QuickDASH (F-QuickDASH-D/S) in patients with common shoulder disorders.
2. Patients and method

2.1. Patients

Patients with common shoulder conditions (rotator cuff tendinopathies, frozen shoulder, osteoarthritis and proximal humeral fractures after bone healing) referred to a tertiary care rehabilitation unit were considered for inclusion in this study. Exclusion criteria were age less than 18 years; symptom duration of less than 2 months; shoulder pain originating from neurological or vascular disorders or neoplasms; referred pain from internal organs; systemic rheumatic conditions; and inability to complete questionnaires because of cognitive impairment or language difficulties. French bioethics legislation does not require consent from the Hospital Ethics Committee for this type of study. The study was conducted in compliance with the protocol, Good Clinical Practice guidelines and the principles of the Declaration of Helsinki, and all patients provided informed consent.
2.2. Patient self-administered questionnaire

The full-length French DASH-D/S questionnaire (Dubert et al., 2001) was completed by 153 consecutive eligible patients. The QuickDASH item responses were extracted from the subjects’ responses to the full-length scale. The 11 items of the QuickDASH ask about the degree of difficulty in performing various physical activities because of arm, shoulder or hand problems (6 items); the severity of pain and tingling (2 items); and the problem’s effect on social activities, work, and sleep (3 items). Each item has 5 response options, ranging from 1, ‘‘no difficulty or no symptom,’’ to 5, ‘‘unable to perform activity or very severe symptom.’’ If at least 10 of the 11 items are completed, a score ranging from 0 (no disability) to 100 (most severe disability) can be calculated as [(sum of n responses / n) − 1] × 25, where n is the number of completed items (Beaton et al., 2005). Data for patients with more than 1 unanswered item on the questionnaire were excluded. The 2 optional scales of the QuickDASH (sport/music and work) were not part of the study because we chose to include patients with shoulder disorders without any restriction on age or activities, which led to the inclusion of many patients without professional or sport/music activities.

2.3. Statistical methods

2.3.1. Variables recorded other than the QuickDASH score

Demographic and clinical data were collected at the first visit (baseline) by a physician (FF). Parameters recorded were age, sex, body mass index, disease duration, pain scores at rest and during activities (on a visual analog scale [VAS], 0–100 mm), and perceived handicap (on a VAS, 0–100 mm). The following Constant subscale scores (Constant and Murley, 1987) were used: 20 points were allocated to subjective assessment of activities of daily living (ADL), 40 to active range of motion (ROM), and 25 to abduction strength (strength). The Constant subscale for pain was not used in this study.

2.3.2. Statistics

Data analysis involved use of Systat 9 Delta Soft for Windows (Systat Software, Point Richmond, CA). Quantitative variables (age, disease duration, body mass index, pain scores at rest and during activities, perceived handicap, ADL, ROM, strength, and F-DASH-D/S and F-QuickDASH-D/S scores) are described with medians and ranges. The qualitative variable (sex) is described with percentages. The chi-square test was applied to test for a normal distribution of the variables: in the whole sample of patients, a normal distribution could not be demonstrated for all variables; by contrast, in 2 subgroups of patients (see below), the variables of interest showed a normal distribution.
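The QuickDASH scoring rule described in Section 2.2 can be sketched in a few lines of Python. This is only an illustration of the published formula, not the authors' software: the function name `quickdash_score` and the use of `None` for an unanswered item are assumptions made for the example.

```python
from typing import Optional, Sequence


def quickdash_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Return the 0-100 disability/symptom score, or None if it cannot be scored.

    `responses` holds the 11 item answers (1-5); None marks an unanswered item
    (a convention assumed for this sketch).
    """
    if len(responses) != 11:
        raise ValueError("the QuickDASH disability/symptom scale has 11 items")
    answered = [r for r in responses if r is not None]
    if any(r not in (1, 2, 3, 4, 5) for r in answered):
        raise ValueError("each answered item must be an integer from 1 to 5")
    if len(answered) < 10:
        # More than 1 unanswered item: the score is not calculated.
        return None
    # [(sum of n responses / n) - 1] x 25
    return (sum(answered) / len(answered) - 1) * 25


print(quickdash_score([1] * 11))           # 0.0   (no disability)
print(quickdash_score([5] * 11))           # 100.0 (most severe disability)
print(quickdash_score([3] * 10 + [None]))  # 50.0  (one item unanswered)
```

Returning `None` rather than raising for an unscorable questionnaire mirrors the exclusion rule in the text, where such patients are simply dropped from the analysis.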
2.3.3. Reliability

Test–retest reliability was analyzed in a subgroup of 42 patients, selected at random by use of random numbers generated by computer. Each patient completed the questionnaire twice within a mean interval of 3.3 (range 2–9) days. The self-administered questionnaire was given at the second visit by a physical therapist (AR) just before the beginning of the rehabilitation program. No specific treatment for the shoulder was given between the 2 evaluations, and all these patients reported no change in functional status at the second visit. All the variables in this subgroup of patients can be considered, after a Kolmogorov–Smirnov test, to be normally distributed. Test–retest reliability was assessed with both the intraclass correlation coefficient (ICC), with a 2-way random-effects model (Shrout and Fleiss, 1979), and the Bland and Altman (1986) method. Internal consistency of the F-QuickDASH-D/S scale was assessed with the Cronbach’s alpha coefficient.
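As an illustration of the internal-consistency statistic named above, the following sketch computes Cronbach's alpha from a patients-by-items table using the usual formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the demonstration data are hypothetical and are not taken from the study; the authors used a statistical package, not this code.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for complete responses; rows[i][j] is patient i, item j."""
    n_items = len(rows[0])
    if len(rows) < 2 or n_items < 2 or any(len(r) != n_items for r in rows):
        raise ValueError("need at least 2 patients and 2 items, with no missing answers")
    # Sample variance of each item across patients.
    item_variances = [variance([row[j] for row in rows]) for j in range(n_items)]
    # Sample variance of the per-patient total score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses (5 patients x 3 items), not data from the study.
demo = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [3, 3, 4], [5, 4, 5]]
print(round(cronbach_alpha(demo), 3))  # 0.956
```

The sketch handles complete cases only; how a questionnaire with one missing item should enter the calculation is not specified in the text and is left out here.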
2.3.4. Construct validity

Construct validity of the F-QuickDASH-D/S was investigated on the whole sample of patients. Convergent construct validity was assessed by correlating the questionnaire scores with scores on variables supposedly assessing similar dimensions or concepts (Poiraudeau et al., 2001; Lefevre-Colau et al., 2003; Fermanian, 2005). We hypothesized that the F-QuickDASH-D/S score would have (1) strongest association with ADL score and perceived handicap; (2) moderate association with ROM, strength, pain at rest, and pain during activities; and (3) weakest association with age, disease duration and body mass index. Because a normal distribution could not be demonstrated for all parameters studied, the nonparametric Spearman rank coefficient (r) was used to assess the correlation between 2 quantitative variables. Spearman’s correlation was interpreted as excellent (>0.91), good (0.90–0.71), moderate (0.70–0.51), fair (0.50–0.31), or little or no correlation (≤0.30) [...] 1. Eigenvalues are obtained by matrix algebra and represent the part of the whole variation of the data that can be attributed to each factor.

2.3.5. Responsiveness

Responsiveness was analyzed in a subgroup of patients treated with a corticosteroid injection followed by a supervised 5-session program of physical therapy and a self-management program of rehabilitation at home. At this stage, data for 26 patients were analyzed. These patients had rotator cuff tendinopathies with subacromial bursitis (n = 8), frozen shoulder (n = 8), or osteoarthritis (n = 10). All the variables in this subgroup of patients can be considered, after a Kolmogorov–Smirnov test, to be normally distributed. Distribution-based responsiveness statistics were computed: the standardized response mean (SRM) and the effect size (ES) (Fortin et al., 1995). Values [...] 0.80 were considered to represent small, moderate, and large degrees of responsiveness, respectively (Husted et al., 2000). The relation between the change in the F-QuickDASH score and the change in the patient’s perceived handicap (reflecting the overall improvement perceived by the patient) was studied by use of Pearson correlation to establish the longitudinal construct validity of the F-QuickDASH.

3. Results

3.1. Demographic and clinical data
Demographic and clinical characteristics are shown in Table 1. Sixty-five patients had rotator cuff tendinopathies, 32 frozen shoulder, 25 osteoarthritis and 31 fractures of the humeral head. The fracture group was distinct: its disease duration was much shorter than that of the other groups, and its pain and disability scores were lower. Data for 2 patients were excluded because of more than 1 unanswered item on the questionnaire. Four patients did not respond to item 6, related to recreational activities, and 2 to item 2, related to heavy household chores. No item had a floor or ceiling effect. No patient recorded the minimum disability score of 0 on the F-QuickDASH-D/S scale, which would represent the maximum health status score (ceiling), and none recorded the maximum disability score of 100, which would represent the minimum health status score (floor).

Table 1
Demographic and clinical characteristics of the 153 patients, overall and by disorder, for whom the F-QuickDASH-D/S was validated.

Variables | Whole group (n = 153) | Rotator cuff tendinopathies (n = 65) | Frozen shoulder (n = 32) | Osteoarthritis (n = 25) | Fractures of the humeral head (n = 31)
Age (years) | 57.0 (23–89) | 57.0 (27–85) | 49.0 (30–69) | 70.0 (56–86) | 57.0 (23–89)
Sex, F (%) | 100 (65.4) | 40 (61.5) | 22 (68.7) | 18 (72.0) | 20 (64.5)
Body mass index (kg/m2) | 24.6 (16.4–42.1) | 25.3 (16.4–33.2) | 23.5 (17.1–42.1) | 26.8 (19.0–34.7) | 23.4 (18.6–37.6)
Disease duration (months) | 9.0 (2–180) | 10.0 (2–120) | 11.5 (2–34) | 36.0 (3–180) | 3.0 (2–12)
Pain score at rest (VAS, 0–100 mm) | 9.0 (0–82) | 18.5 (0–82) | 8.0 (0–72) | 23.0 (0–63) | 0 (0–16)
Pain score during activities (VAS, 0–100 mm) | 54.0 (0–100) | 60.0 (0–100) | 58.5 (0–92) | 70.0 (11–100) | 29.0 (0–72)
Perceived handicap (VAS, 0–100 mm) | 50.0 (0–90) | 50.0 (12–90) | 50.0 (20–80) | 50.0 (10–85) | 20.0 (0–80)
F-DASH-D/S score (range 0–100) | 48.3 (5.0–87.5) | 52.5 (5.8–85.8) | 50.8 (20.0–87.5) | 45.8 (27.5–80) | 20.7 (5.0–77.5)
F-QuickDASH-D/S score (range 0–100) | 50.0 (2.3–88.6) | 54.5 (2.3–81.8) | 51.1 (18.2–88.6) | 50.0 (20.5–79.5) | 21.6 (6.8–70.5)
ADL (range 0–20) | 12 (3–20) | 11 (4–18) | 10 (3–19) | 11 (3–17) | 17 (6–20)
ROM (range 0–40) | 28 (0–40) | 28 (10–40) | 17 (8–34) | 22 (0–38) | 32 (0–40)
Strength (range 0–25) | 5 (0–17) | 4 (0–16) | 5 (2–17) | 3 (0–16) | 9 (0–14)

Values are median (min–max), unless indicated. VAS: visual analog scale; F-DASH-D/S: French version of the Disability of the Arm, Shoulder and Hand questionnaire Disability/Symptom scale (30 items); F-QuickDASH-D/S: short version of F-DASH-D/S (11 items); ADL: subjective assessment of activities of daily living; ROM: active range of motion; and strength: measurement of abduction strength.

3.2. Reliability

Internal consistency was high, with a Cronbach’s alpha coefficient of 0.89. Test–retest reliability was analyzed for 42 of the patients (62% women) with mean age 59 ± 13.9 years (range 25–85 years). The F-QuickDASH-D/S scores at the first and the second visit were 48.3 ± 18.1 and 44.9 ± 20.1 (p = 0.001), respectively. Test–retest reliability analysis gave an ICC of 0.94 (95% confidence interval, 0.87–0.97), indicating excellent reliability. Bland and Altman analysis revealed test–retest results not strictly centered (mean 3.4 ± 6.0), but no systematic trend was observed (r = 0.29). The limits of agreement were −8.4 to 15.2 (Fig. 1).
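The Bland and Altman limits of agreement reported above follow from the mean and standard deviation of the test–retest differences as mean ± 1.96 SD. A minimal sketch is given here, assuming that 3.4 is the mean difference and 6.0 its standard deviation; the function names are hypothetical and the paired-data helper is shown only for completeness.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def limits_from_summary(bias: float, sd: float, z: float = 1.96) -> Tuple[float, float]:
    """95% limits of agreement from the mean difference and its SD."""
    return bias - z * sd, bias + z * sd


def limits_of_agreement(test: Sequence[float], retest: Sequence[float],
                        z: float = 1.96) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) from paired test and retest scores."""
    diffs = [a - b for a, b in zip(test, retest)]
    bias = mean(diffs)
    lower, upper = limits_from_summary(bias, stdev(diffs), z)
    return bias, lower, upper


# Summary values from the reliability subgroup (n = 42).
low, high = limits_from_summary(3.4, 6.0)
print(round(low, 1), round(high, 1))  # -8.4 15.2
```

The lower limit is negative, as expected when the mean difference (3.4) is smaller than 1.96 times its standard deviation (11.76).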
3.3. Construct validity

The scale had good convergent validity with perceived handicap and ADL scores; moderate correlation with scores for ROM, strength, pain at rest, and pain during activities; fair correlation with disease duration; and little correlation with age and body mass index. Furthermore, the F-QuickDASH-D/S scale had excellent correlation with the full-length F-DASH-D/S score (r = 0.96) (Table 2). Principal component analysis extracted 2 factors, explaining 59.1% of the variance. Varimax rotation showed that the first factor comprises 7 items of ADL and the second factor comprises 4 items, 3 related to pain. The loading of each item after varimax rotation is given in Table 3.
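The interpretation bands for Spearman's coefficient given in Section 2.3.4 can be applied mechanically to the coefficients of Table 2, as in the sketch below. The function name is hypothetical, and two choices are assumptions of this example rather than statements from the paper: coefficients are rounded to two decimals before banding, and a value of exactly 0.91 is counted as excellent.

```python
def interpret_correlation(r: float) -> str:
    """Map a correlation coefficient to the descriptive band used in the paper."""
    hundredths = round(abs(r) * 100)  # work in integer hundredths to avoid float edge cases
    if hundredths >= 91:              # assumption: 0.91 itself is treated as excellent
        return "excellent"
    if hundredths >= 71:
        return "good"
    if hundredths >= 51:
        return "moderate"
    if hundredths >= 31:
        return "fair"
    return "little or none"


# Spearman coefficients of the F-QuickDASH-D/S with other variables (Table 2).
table_2 = {
    "F-DASH-D/S score": 0.96,
    "Perceived handicap": 0.79,
    "ADL": 0.73,
    "Pain during activities": 0.63,
    "Strength": 0.58,
    "Pain at rest": 0.57,
    "ROM": 0.51,
    "Disease duration": 0.38,
    "Age": 0.22,
    "Body mass index": 0.15,
}
for name, r in table_2.items():
    print(f"{name}: r = {r:.2f} ({interpret_correlation(r)})")
```

Run on the Table 2 values, the bands reproduce the wording of Section 3.3: excellent for the full-length score, good for perceived handicap and ADL, moderate for pain, strength and ROM, fair for disease duration, and little or none for age and body mass index.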
Fig. 1. Bland and Altman plot of test–retest scores in analysis of the F-QuickDASH-D/S for shoulder disorders (x-axis: mean of the 2 scores; y-axis: differences between scores; reference lines at ±1.96 SD).

Table 2
Correlation of F-QuickDASH-D/S with other variables (n = 153).

Variable | Spearman correlation coefficient (r)
F-DASH-D/S score (range 0–100) | 0.96
Perceived handicap score (VAS, 0–100 mm) | 0.79
ADL score (range 0–20) | 0.73
Pain score during activities (VAS, 0–100 mm) | 0.63
Strength score (range 0–25) | 0.58
Pain score at rest (VAS, 0–100 mm) | 0.57
ROM score (range 0–40) | 0.51
Disease duration | 0.38
Age | 0.22
Body mass index | 0.15

QuickDASH-D/S: short version of the Disability of the Arm, Shoulder and Hand questionnaire (11 items); VAS: visual analog scale; ADL: activities of daily living; and ROM: range of motion.

Table 3
Factor loading of principal components of the F-QuickDASH-D/S (the highest loading of each item is marked with an asterisk).

Item | Factor 1 | Factor 2
1 | 0.766* | 0.086
2 | 0.834* | 0.215
3 | 0.823* | 0.228
4 | 0.562* | 0.359
5 | 0.579* | 0.338
6 | 0.745* | 0.239
7 | 0.347 | 0.663*
8 | 0.671* | 0.366
9 | 0.339 | 0.772*
10 | 0.065 | 0.690*
11 | 0.239 | 0.727*

F-QuickDASH-D/S: French short version of the Disability of the Arm, Shoulder and Hand questionnaire.

3.4. Responsiveness

Responsiveness was analyzed in a subgroup of 26 patients (20 women, mean age of 58.2 ± 11.0 years) treated with a corticosteroid injection followed by physiotherapy (Table 4). All patients were evaluated twice, at baseline and at a mean of 7.8 ± 3.9 weeks after treatment. Twenty-four patients reported improvement, 1 reported an unchanged clinical status and 1 a deteriorated clinical status after treatment. The mean F-QuickDASH-D/S score decreased significantly (48.4 ± 15.8 vs 29.0 ± 16.4, paired t-test, p < 0.0001), with SRM and ES values of 1.09 and 1.23, respectively, which indicates a large degree of sensitivity of the instrument to clinical improvement (Husted et al., 2000). The mean patient perceived handicap score decreased significantly (51.1 ± 19.4 vs 27.7 ± 23.0, paired t-test, p < 0.0001). The correlation between change in patient’s perceived handicap and change in QuickDASH-D/S score was moderate (r = 0.57).
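The two responsiveness statistics of Section 3.4 can be sketched as follows, using the usual definitions: the SRM is the mean change divided by the standard deviation of the change, and the ES is the mean change divided by the standard deviation of the baseline scores. The function names are hypothetical, and taking change as baseline minus follow-up (so that improvement is positive) is a convention assumed for this example.

```python
from statistics import mean, stdev
from typing import Sequence


def standardized_response_mean(baseline: Sequence[float], follow_up: Sequence[float]) -> float:
    """SRM: mean change divided by the SD of the individual changes."""
    changes = [b - f for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def effect_size(baseline: Sequence[float], follow_up: Sequence[float]) -> float:
    """ES: mean change divided by the SD of the baseline scores."""
    changes = [b - f for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


def effect_size_from_summary(mean_before: float, mean_after: float, sd_before: float) -> float:
    """ES computed from group means and the baseline SD."""
    return (mean_before - mean_after) / sd_before


# Group means and baseline SD reported for the 26 patients: 48.4 (SD 15.8) vs 29.0.
print(round(effect_size_from_summary(48.4, 29.0, 15.8), 2))  # 1.23
```

The ES can be checked from the published summary values alone, whereas the SRM cannot, because the standard deviation of the individual changes is not reported; computing it requires the paired scores.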
Table 4
Demographic and clinical characteristics of 26 patients included in the study of responsiveness of the F-QuickDASH-D/S.

Variables | Mean (SD)
Age (years) | 58.2 (11.0)
Sex, F (%) | 20 (76.9)
Disease duration (months) | 25.7 (39.4)
Pain score during activities (VAS, 0–100 mm) | 57.8 (22.0)
Perceived handicap (VAS, 0–100 mm) | 51.1 (19.4)
F-QuickDASH-D/S score (range 0–100) | 48.4 (15.8)

Values are mean (SD), unless indicated. F-QuickDASH-D/S: French short version of the Disability of the Arm, Shoulder and Hand questionnaire.

4. Discussion

Our results strongly suggest that the F-QuickDASH-D/S scale can be used for evaluating shoulder conditions. The reliability and internal consistency of the F-QuickDASH-D/S scale were similar to those of the original (English) version (ICC 0.94 and Cronbach’s alpha 0.89 vs 0.94 and 0.94, respectively, Beaton et al., 2005). Although our ICC for test–retest can be considered excellent, graphic representation of the test–retest scores by the Bland and Altman method revealed that, despite a marginal number of outliers (2.4%), the scores were not centered, although no systematic trend was observed. The Bland and Altman method revealed the following:

(A) The test–retest results were significantly different. This can be explained by the fact that the 2 observations (test and retest) were not independent, probably because knowledge of the first measurement affected the second measurement. Thus, we have a bias between the 2 series of measures. This phenomenon is frequently observed in test–retest studies (Nunnally and Bernstein, 1994). We note that this bias (mean 3.4), although significantly different from 0, is small. In the current study, we therefore measured, according to the terminology of Bland and Altman, not the repeatability of the test–retest results, which supposes independent observations, but, rather, their agreement, which is possible to evaluate in the presence of our small bias (Bland and Altman, 1999).

(B) The limits of agreement were −8.4 and 15.2. Thus, 95% of the differences in test–retest results could be expected to fall within this range, which could be considered clinically unimportant. To our knowledge, the minimal important difference for the QuickDASH score has not been published. However, the value computed for the DASH score was 12.6 (Schmitt and Di Fabio, 2004). We found a difference between measurements of greater than 12.6 for only 2 patients (4.8%) in our subgroup (n = 42). Thus, our results for differences between test and retest were in a clinically acceptable range.

Because no criterion standard exists to assess functional disability (Guyatt et al., 1993), we assessed construct validity. The F-QuickDASH-D/S scale has good correlation with ADL score and patient perceived handicap, which reflects its ability to measure shoulder disability. The F-QuickDASH-D/S scale has shown similar construct validity to the Japanese and English short versions (Beaton et al., 2005; Imaeda et al., 2006) as well as to the original full-length version (Beaton et al., 2001). In addition, the F-QuickDASH-D/S showed an excellent correlation with the full F-DASH-D/S (r = 0.96), as observed in the study by Beaton et al. (2005). These findings suggest that the F-QuickDASH-D/S scale should give a view of disability that is relatively similar to that provided by the full-length DASH.

We performed principal component factor analysis for our sample of 153 patients.
No consensus exists on the minimum number of subjects needed to perform principal component analysis. A minimum of 100–300 subjects has been proposed as necessary (Comrey, 1973; Kline, 1993), or 5–10 times the number of variables (Nunnally and Bernstein, 1994; Streiner, 1994). Principal component analysis of the F-QuickDASH-D/S scale revealed 1 major factor accounting for 48.4% of the total variance, which was consistent with results of the Japanese version (Imaeda et al., 2006). The current study is the first to provide results of factor analysis with varimax rotation of the QuickDASH-D/S and revealed 2 independent factors explaining 59.1% of the total variance. All items retained in each factor have a high loading (>0.5) in 1 factor and weak loading in the other. The assignment of item 7 is problematic because this item, representing social activities, would be more clinically relevant in factor 1, whereas it showed high loading in factor 2. That the factors could be easily identified after varimax rotation supports the robustness of the factorial structure of the scale.

We used exploratory and not confirmatory analysis to assess the factorial structure of the F-QuickDASH-D/S scale. Confirmatory analysis is considered more appropriate if the aim of a study is to confirm the existing second-order single-factor structure of the F-QuickDASH-D/S scale (de Vet et al., 2005). However, exploratory analysis is considered appropriate if the aim of the study is to examine the factor structure of the scale in a population or language in which the QuickDASH has not yet been evaluated (de Vet et al., 2005). Because the factorial structure of the QuickDASH-D/S scale in shoulder disorders is unknown in the French population, we considered that exploratory analysis was relevant. Imaeda et al. (2006) performed principal component analysis without varimax rotation. The authors also found 2 factors; the first had an eigenvalue of 5.12, which explained 47% of the total variance of the QuickDASH-D/S. The second factor had an eigenvalue of 1.74. The authors stated that ‘‘The unidimensionality was found to be strong as a result of a substantial difference between the first and the second factors’’.

The responsiveness of the F-QuickDASH-D/S shows that the scale has excellent ability to detect clinically meaningful changes in disability over time in patients with degenerative shoulder disorders after corticosteroid injection and physical treatment. As far as we know, this is the first study of the responsiveness of the QuickDASH-D/S scale in medical shoulder disorders. The sensitivity statistics of the F-QuickDASH-D/S scale are similar to those of the original version (Beaton et al., 2005), with higher SRM than was found for the full-length scale after arthroscopic acromioplasty in 25 patients (Gummesson et al., 2003).
Although SRM and ES are useful indicators of the amount of change, these sensitivity statistics lack discriminative power (Fortin et al., 1995). Thus, relevant changes must be assessed by comparing the change score with an external indicator of change such as self-perceived handicap. In the current study, the correlation between change in patient’s perceived handicap and change in QuickDASH-D/S score was moderate.

We studied the psychometric properties of the F-QuickDASH scale in patients with various shoulder pathologies: rotator cuff tendinopathies, frozen shoulder, osteoarthritis and proximal humeral fracture. As far as we know, the current study was the first to evaluate this self-administered questionnaire in non-operative proximal humerus fracture, although this condition is common and causes prolonged disability. The fracture group was characterized by a shorter duration of symptoms and lower pain and disability scores than the other groups. This particularity may be explained by the fact that patients in this group were seen after bone healing.
Nevertheless, our study shows the applicability of the F-QuickDASH in patients with medical and traumatic shoulder disorders. The validation of this short version of the DASH outcomes tool may help clinicians and physical therapists by facilitating the monitoring of disability and dependence in their patients.

Our study is limited by the fact that we had to extract the QuickDASH item responses from the full-length DASH questionnaire for the psychometric testing of the scale. This use of data may constitute a bias: patients’ responses to the 11 items might have been different if only the QuickDASH had been administered. Thus, our results could lead to an overestimation of the similarity between the short and full-length scales (Haavardsholm et al., 2000). This problem is inherent to many studies validating the short versions of scales (Gummesson et al., 2006; Imaeda et al., 2006; Baron et al., 2007). Two other limitations warrant acknowledgment. First, because all patients were recruited in a tertiary care centre, the results may not be generalizable to a primary care setting. Second, the results of the current study are limited to shoulder disorders and cannot be generalized to other upper extremity disorders.
5. Conclusion

The F-QuickDASH-D/S scale is a reliable, valid and responsive instrument for assessing disability in common shoulder disorders. Its psychometric properties are comparable to those of the full-length version of this scale. Therefore, the QuickDASH-D/S could be the preferred scale because it is easier and quicker to use.

Acknowledgement

The authors thank the patients who participated in the study and the technical staff of the Department of Rehabilitation Medicine, Cochin Hospital, Paris, France, for their help with data collection.
References Atroshi I, Gummesson C, Andersson B, Dahlgren E, Johansson A. The disabilities of the arm, shoulder and hand (DASH) outcome questionnaire: reliability and validity of the Swedish version evaluated in 176 patients. Acta Orthopaedica Scandinavica 2000;71:613e8. Baron G, Tubach F, Ravaud P, Logeart I, Dougados M. Validation of a short form of the Western Ontario and McMaster Universities Osteoarthritis Index function subscale in hip and knee osteoarthritis. Arthritis and Rheumatism 2007;57:633e8. Beaton DE, Katz JN, Fossel AH, Wright JG, Tarasuk V, Bombardier C. Measuring the whole or the parts? Validity, reliability, and responsiveness of the disabilities of the arm, shoulder and
212
F. Fayad et al. / Manual Therapy 14 (2009) 206e212
Manual Therapy 14 (2009) 213–221 www.elsevier.com/math
Original Article
Inter- and intra-examiner reliability of single and composites of selected motion palpation and pain provocation tests for sacroiliac joint*
Amir Massoud Arab a,*, Iraj Abdollahi a, Mohammad Taghi Joghataei b, Zahra Golafshani c, Anoshirvan Kazemnejad d
a Department of Physical Therapy, University of Social Welfare and Rehabilitation Sciences, Evin, Koodakyar Avenue, P.O. Box 19834, Tehran, Iran
b Iran University of Medical Sciences, Tehran, Iran
c University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
d Department of Biostatistics, School of Medical Sciences, Tarbiat Modarres University, Tehran, Iran
Received 14 March 2007; received in revised form 2 February 2008; accepted 7 February 2008
Abstract
The sacroiliac joint (SIJ) has been implicated as a potential source of low back and buttock pain. Several types of motion palpation and provocation tests are used to examine the SIJ. It has been suggested that using a cluster of motion palpation or provocation tests is a more acceptable method of assessing the SIJ than using a single test. This study examined the inter- and intra-examiner reliability of single tests and of composites of motion palpation and provocation tests together. Twenty-five patients between the ages of 20 and 65 years participated. Four motion palpation and three provocation tests were performed three times on both sides (left, right) by two examiners. The kappa coefficient and the prevalence-adjusted and bias-adjusted kappa (PABAK) were calculated to evaluate reliability. PABAK for intra- and inter-examiner reliability of individual tests ranged from 0.36 to 0.84 (95% CI: −0.22 to 1.12) and 0.52 to 0.84 (95% CI: −0.18 to 1.08), which is considered fair to substantial. PABAK for intra- and inter-examiner reliability of clusters of motion palpation or provocation tests ranged from 0.44 to 0.92 (95% CI: −0.36 to 1.2), which is considered moderate to excellent reliability. PABAK for intra- and inter-examiner reliability of composites of motion palpation and provocation tests ranged from 0.44 to 1.00 (95% CI: −0.22 to 1.12) and 0.52 to 0.92 (95% CI: −0.02 to 1.32), which is considered substantial to excellent. It seems that composites of motion palpation and provocation tests together have sufficiently high reliability for use in clinical assessment of the SIJ.
© 2008 Elsevier Ltd. All rights reserved.
Keywords: Sacroiliac joint; Reliability; Low back pain; Test
1. Introduction
* This research was reviewed and approved by the Human Subject Committee at the University of Social Welfare and Rehabilitation Sciences.
* Corresponding author. Tel./fax: +98 21 22418746 (Office). E-mail addresses: [email protected], amarab@uswr.ac.ir (A.M. Arab).
1356-689X/$ - see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.math.2008.02.004
Low back pain (LBP) is one of the most frequent musculoskeletal complaints in today’s societies. Epidemiologic studies have indicated a lifetime prevalence of 70–80% in the western population (Ehrlich, 2003). Several factors have been associated with the development of LBP. The sacroiliac joint (SIJ) has been implicated as a potential source of the pain in low back and buttock
with or without lower extremity symptoms (Fortin et al., 1994a,b; Schwarzer et al., 1995; Slipman et al., 2001). Schwarzer et al. (1995) reported a 13–30% prevalence of SIJ pain in LBP patients.

A wide variety of diagnostic tests are used to evaluate the SIJ in patients with LBP. These tests fall into three categories: (1) motion palpation tests to assess movement; (2) pain provocation tests to stress SIJ structures; and (3) tests designed to assess the location and relative symmetry of SIJ landmarks. Pain provocation tests attempt to assess whether the structure being stressed is a source of pain, while motion palpation tests may be used to assess SIJ dysfunction. In clinical practice, treatment should be based on an accurate diagnosis. Many SIJ tests can be influenced by structures in the low back, hip and other tissues, so the tests may lose precision (Maigne et al., 1996).

For diagnostic tests to yield meaningful results in clinical practice, they should be both valid and reproducible. Previous studies of individual SIJ tests indicate that inter-examiner reliability is poor for motion palpation tests and ranges from poor to excellent for provocation tests (Potter and Rothstein, 1985; Laslett and Williams, 1994; Vincent-Smith and Gibbons, 1999). Current evidence suggests that a single test is not reliable enough to be used for diagnosing SIJ pain or dysfunction, whereas the use of a cluster of tests (combining the results of a number of tests) is a more acceptable method (Cibulka et al., 1988; Haas, 1991; Kokmeyer et al., 2002; Laslett et al., 2005; Robinson et al., 2007).

A review of the literature revealed some limitations in previous studies of the reliability of SIJ tests. Potter and Rothstein (1985) reported only percent agreement and did not calculate kappa; their results were therefore not corrected for chance agreement. Some studies were unclear as to when the performed tests were considered positive (Dreyfuss et al., 1996). In some studies tests were classified as positive or negative without apparent reference to a particular side (Cibulka and Koldehoff, 1999). In most previous studies motion palpation and provocation tests have been evaluated separately, and test clusters are combinations of either motion palpation or provocation tests. Many clinicians use combinations of motion palpation and provocation tests, yet the studies evaluating reliability and validity deal with either single tests or only one of the two types (Bernard, 1997; Mooney, 1997).

Different types of motion palpation and provocation tests have been reported in the literature. Some investigators have published systematic methodological reviews of studies of SIJ tests, in which articles were scored against a criteria list covering study population, test procedure and results (van der Wurff et al., 2000a,b; Stuber, 2007). Among the several provocation tests previously described to assess SIJ pain, we selected the Patrick–FABER test, the thigh thrust or posterior shear test and the resisted abduction test, which have acceptable levels of sensitivity and specificity, high method scores and valid authors’ conclusions in systematic methodological reviews (van der Wurff et al., 2000a,b; Stuber, 2007). The inter-examiner reliability of the single thigh thrust and Patrick’s tests has been reported as poor to excellent (Laslett and Williams, 1994; Dreyfuss et al., 1996; Strender et al., 1997), but no data were found regarding intra-examiner reliability. We found no study that has evaluated the reliability of the resisted abduction test. Of the motion palpation procedures, we included the standing flexion test, Gillet test, sitting flexion test and prone knee flexion test (Potter and Rothstein, 1985; Cibulka and Koldehoff, 1999; Meijne et al., 1999; Vincent-Smith and Gibbons, 1999; Riddle and Freburger, 2002), which are widely used in clinics.

Despite several published studies regarding the reliability of single tests and clusters of motion palpation or provocation tests, we found no study that has examined the reliability of composites of motion palpation and provocation tests. With the exception of one study (Kokmeyer et al., 2002), the paradoxical effects of bias and prevalence on the kappa coefficient have not been considered in previous studies. This is important since the magnitude of kappa can be influenced by the prevalence and bias index (BI) (Byrt et al., 1993; Sim and Wright, 2005). These effects should therefore be considered when calculating kappa so that the results can be interpreted properly. The current study examined selected pain provocation and motion palpation tests and evaluated the inter- and intra-examiner reliability of single tests and composites of tests, taking into account the side on which each test was positive and the effects of bias and prevalence on kappa.
2. Methods

2.1. Subjects

A total of 25 subjects aged 20–65 years presenting to the Orthopaedic and Physical Therapy Departments with LBP were selected for inclusion in the study. All subjects signed an informed consent form approved by the human subjects committee at the University of Social Welfare and Rehabilitation Sciences before participating in the study. Patients were included in the study if their reported pain was below L5, over the posterior aspect of the SIJ around the posterior superior iliac spine (PSIS) and buttock, with or without leg pain. Patients were excluded if they had only midline or symmetrical pain above the level of L5 or radicular pain with neurological signs (sensory or motor deficit) (Laslett et al., 2003, 2005; Young et al., 2003). Subjects with a history of spinal surgery, fracture of the spine, pelvis or lower extremities, hospitalization for severe trauma or car accident, leg length difference, hip/knee dysfunction, pregnancy, any systemic disease, or liver and/or kidney failure were
also excluded. Two physical therapists with 1 year of experience, blinded to all patient information, tested the subjects. Both examiners were given a written description of the test procedures and instructed to practice on each other before examining patients.

2.2. Procedures

The patients were assigned to one of two rooms by an independent observer. The first examiner completed the tests with a subject, and the second examiner repeated the examination after a 15 min rest period. The second examiner carried out the test procedures in a random order, different from the first examination sequence. This process was repeated three times in a random order with a break of 30 min between repetitions. The first and second examiners were randomly selected and were blinded to each other's results. All seven tests were applied three times to both sides of all subjects by both examiners. Overall, 2100 measurements were taken (7 tests × 3 repetitions × 2 sides × 25 subjects × 2 examiners). The procedure for each test was as follows.

2.3. Patrick–FABER test

The patient lies supine on the table, and the examiner stands next to him/her. The examiner brings the ipsilateral hip into flexion, abduction and external rotation and the knee into flexion so that the heel rests on the contralateral knee. The examiner then fixates the contralateral anterior superior iliac spine (ASIS) and applies pressure to the subject’s flexed knee. The test is positive when similar buttock or groin pain below L5 is reproduced (Maigne et al., 1996; Kokmeyer et al., 2002; Robinson et al., 2007).

2.4. Thigh thrust or posterior shear test

The subject lies supine on the table. The examiner flexes the hip and knee so that the hip is in approximately 90° of flexion and slight adduction and the thigh is at a right angle to the table, while the knee remains relaxed. One of the examiner’s hands cups the sacrum and the other arm and hand wrap around the flexed knee. Axial pressure is applied along the long axis of the femur, producing an anterior-to-posterior shear at the SIJ. The test is considered positive when familiar pain is provoked over the posterior aspect of the SIJ below L5 (Kokmeyer et al., 2002; Laslett et al., 2003, 2005).

2.5. Resisted abduction test

The subject lies supine with the leg fully extended and abducted to 30°. The examiner holds the ankle and pushes medially while the subject pushes laterally. The test is positive when similar pain is produced over the SIJ below L5 (Broadhurst and Bond, 1998).

2.6. Standing flexion test

The standing flexion test is performed by palpating the PSISs while the subject bends forward from a standing position. The test is negative if the PSISs appear to move equally and symmetrically, and positive on the side on which the PSIS moves cranially more than on the other side. A positive result in the standing flexion test indicates limited movement of the ilium on the sacrum, and therefore limited SIJ motion on the side of the superior PSIS (Potter and Rothstein, 1985; Cibulka and Koldehoff, 1999; Riddle and Freburger, 2002).

2.7. Sitting flexion test

The procedure is similar to the standing flexion test except that it is performed with the patient sitting on a level surface. The test is positive on the side on which the PSIS moves cranially more than on the other side and negative if the PSISs move equally. A positive result in this test indicates limited movement of the sacrum on the ilium, and limited SIJ motion on the side of the superior PSIS (Potter and Rothstein, 1985; Cibulka and Koldehoff, 1999; Riddle and Freburger, 2002).

2.8. Gillet test

While the subject stands, the examiner palpates the PSISs. The subject is asked to stand on one leg while pulling the opposite knee up to the chest. The test is repeated with the other leg. On the normal side, the PSIS moves inferiorly. The test is positive if the PSIS on the side on which the knee is flexed and pulled to the chest remains at the level of the other PSIS, moves down only minimally or, paradoxically, moves superiorly (Potter and Rothstein, 1985; Dreyfuss et al., 1996; Meijne et al., 1999).

2.9. Prone knee flexion test

The subject lies prone. While the examiner holds both heels, the patient’s knees are passively flexed to 90°. The leg lengths are compared by examining the soles of the left and right heels in the prone and the prone knees-flexed positions. The test is negative if no relative change in leg length occurs between the two positions. If one leg appears shorter than the other in the prone knee-extended position, apparent lengthening of the short leg in the prone knees-flexed position implies a hypothesized posterior innominate rotation (Potter and Rothstein, 1985; Cibulka and Koldehoff, 1999; Riddle and Freburger, 2002).
k = kappa coefficient, SE = standard error, 95% CI = 95% confidence interval, kmax = maximum kappa coefficient, PI = prevalence index, BI = bias index, and PABAK = prevalence-adjusted and bias-adjusted kappa.
0.78 (0.14) 0.49e1.07 0.78 0.52 (0.08) 0.84 0.5 (0.26) 0.02 to 1.03 0.84 0.72 (0.04) 0.76 0.78 0.52 (0.08) 0.68 1 0.76 (0) 0.84 0.56 (0.19) 0.17e0.95 0.62 (0.25) 0.11e1.12 0.48 (0.2) 0.07e0.88 0.5 (0.22) 0.06e0.95 R L Resisted abduction
0.79 0.48 (0.04) 0.6 0.75 0.6 (0.08) 0.68
1 0.44 (0) 0.68 0.80 0.44 (0.08) 0.52 0.6 (0.18) 0.24e0.96 0.4 (0.21) 0.00e0.82 0.62 0.48 (0.12) 0.6 0.51 0.6 (0.16) 0.68 0.49 (0.2) 0.09e0.89 0.51 (0.22) 0.08e0.95 0.44 (0.19) 0.06e0.83 0.4 (0.21) 0.00e0.82 R L Thigh thrust
1 0.36 (0) 0.52 0.80 0.44 (0.08) 0.52
1 0.36 (0) 0.52 0.70 0.48 (0.12) 0.6 0.44 (0.19) 0.06e0.83 0.49 (0.2) 0.09e0.89 0.31 (0.2) 0.08 to 0.70 0.74 0.28 (0.08) 0.36 0.4 (0.21) 0.03 to 0.82 0.80 0.44 (0.08) 0.52 0.75 0.24 (0.12) 0.44 0.91 0.24 (0.04) 0.44 0.41 (0.18) 0.07e0.78 0.40 (0.19) 0.03e0.78 R L FABER
0.72 0.4 (0.12) 0.44 0.89 0.48 (0.04) 0.6 0.41 (0.18) 0.07e0.78 0.75 0.24 (0.12) 0.44 0.27 (0.25) 0.22 to 0.78 0.52 0.6 (0.16) 0.52 R L Prone knee flexion
0.34 (0.21) 0.06 to 0.7 0.48 (0.2) 0.07e0.88
0.58 (0.16) 0.25e0.91 0.75 0.24 (0.12) 0.6 0.33 (0.26) 0.18 to 0.85 0.61 0.64 (0.12) 0.6
0.75 0.6 (0.08) 0.84 0.64 0.36 (0.16) 0.68 0.75 (0.16) 0.42e1.08 0.64 (0.16) 0.32e0.96 0.65 0.56 (0.12) 0.76 0.73 0.32 (0.12) 0.6 0.91 0.32 (0.04) 0.76 0.82 0.28 (0.08) 0.68 0.73 (0.14) 0.45e1.01 0.65 (0.15) 0.34e0.96 Sitting flexion R L
0.65 (0.18) 0.29e1.02 0.56 (0.17) 0.21e0.9
0.51 0.6 (0.16) 0.68 0.73 0.32 (0.04) 0.6 0.51 (0.22) 0.08e0.95 0.55 (0.17) 0.2e0.9 0.6 0.64 (0.12) 0.76 0.51 0.6 (0.16) 0.68 0.6 (0.21) 0.18e1.02 0.51 (0.22) 0.08e0.95 0.89 0.48 (0.04) 0.76 0.75 0.44 (0.16) 0.68 0.68 (0.16) 0.35e1.01 0.61 (0.17) 0.27e0.96 R L Standing flexion
PABAK kmax PI (BI) 95% CI
0.41 (23) 0.03e0.87 0.34 (0.21) 0.06 to 0.7
PABAK k (SE)
0.25 0.6 (0) 0.52 0.61 0.44 (0.16) 0.36
kmax PI (BI) 95% CI
0.25 (0.26) 0.2 to 0.77 0.23 (0.22) 0.2 to 0.67
PABAK k (SE) kmax PI (BI) 95% CI k (SE)
0.42 (0.22) 0.01 to 0.87 0.44 0.56 (0.12) 0.6 0.49 (0.2) 0.09e0.89 0.70 0.48 (0.12) 0.6
Inter-tester Tester 2 Side Tester 1
3. Results

Twenty-five subjects (15 males and 10 females) between the ages of 20 and 65, with a mean age of 43 ± 10 years, participated in the study. The subjects’ mean height was 168 ± 7 cm and mean weight was 68 ± 10 kg. Table 1 presents the intra- and inter-examiner reliability estimates for each single motion palpation and provocation test used in the study. For intra- and inter-examiner reliability of individual provocation tests, PABAK ranged from 0.36 to 0.84 and 0.52 to 0.84 and kappa from 0.31 to 0.62 and 0.44 to 0.78 (95% CI: −0.08 to 1.12 and 0.06 to 1.07). For intra- and inter-examiner reliability of individual motion

Tests
Table 1 Intra- and inter-examiner reliability of the single motion palpation and pain provocation test.
R L

2.10. Data analysis

MedCalc® statistical software was used for data analysis. The kappa coefficient (k), which discounts the proportion of agreement expected by chance, was calculated with its 95% confidence interval to assess reliability. Although the kappa coefficient is widely used to assess reliability, two main paradoxes can influence its magnitude. Thus, alongside the obtained value of kappa, the paradoxical effects of prevalence and bias on kappa must be considered for better interpretation. When raters classify cases as either positive or negative with respect to a clinical sign, a prevalence effect exists when the proportion of agreements on the positive classification differs from that on the negative classification. This can be expressed by the prevalence index (PI). When the PI is high, i.e. approaches 1.0, chance agreement is also high and kappa is reduced accordingly. Bias is the extent to which the raters disagree on the proportion of positive (or negative) cases and is expressed by the bias index (BI). For example, in the 2 × 2 contingency table shown in Table S1, cells a and d indicate, respectively, the numbers of subjects for whom both examiners agree on a negative and on a positive result, and cells b and c indicate the numbers of subjects on whom the examiners disagree. PI is |a − d|/n, where |a − d| is the absolute value of the difference between cells a and d and n is the total number of subjects. BI is |b − c|/n. Some statisticians have devised kappa adjustments that take account of prevalence and bias influences by calculating the prevalence-adjusted and bias-adjusted kappa (PABAK). Adjusting for bias is equivalent to replacing cells b and c by their average ([b + c]/2), while adjusting for prevalence is equivalent to replacing cells a and d by their average ([a + d]/2), and then calculating kappa in the usual fashion. We included bias and prevalence effects on the kappa coefficient by calculating BI, PI and PABAK values as suggested by others (Byrt et al., 1993; Hoehler, 2000; Sim and Wright, 2005). Kappa maximum (kmax) was also calculated.
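The indices described in the data analysis can be sketched in a few lines of Python. This is a minimal illustration only, not the MedCalc procedure used in the study: the function names and the example cell counts are hypothetical, and the cell labels are assumed to follow the convention of Table S1 (a = both examiners negative, d = both positive, b and c = disagreements). The sketch also checks numerically that averaging the cells as described gives PABAK = 2·Po − 1, where Po is the observed proportion of agreement.

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2 x 2 agreement table.

    a: both examiners negative, d: both examiners positive,
    b: examiner 1 positive / examiner 2 negative,
    c: examiner 1 negative / examiner 2 positive.
    Undefined (division by zero) when chance agreement equals 1.
    """
    n = a + b + c + d
    p_observed = (a + d) / n
    # Chance agreement from each examiner's marginal totals.
    p_chance = ((a + c) * (a + b) + (b + d) * (c + d)) / n**2
    return (p_observed - p_chance) / (1 - p_chance)


def prevalence_index(a, b, c, d):
    """PI = |a - d| / n."""
    return abs(a - d) / (a + b + c + d)


def bias_index(a, b, c, d):
    """BI = |b - c| / n."""
    return abs(b - c) / (a + b + c + d)


def pabak(a, b, c, d):
    """Kappa after replacing a, d and b, c by their respective averages."""
    agree = (a + d) / 2
    disagree = (b + c) / 2
    return cohen_kappa(agree, disagree, disagree, agree)


# Hypothetical counts for 25 subjects (not taken from the study).
table = (15, 3, 1, 6)
p_observed = (table[0] + table[3]) / sum(table)

print("kappa:", round(cohen_kappa(*table), 3))
print("PI:", prevalence_index(*table), "BI:", bias_index(*table))
print("PABAK:", round(pabak(*table), 3))
print("2*Po - 1:", round(2 * p_observed - 1, 3))
```

With these made-up counts, kappa is noticeably lower than PABAK even though the examiners agree on 21 of 25 subjects, which is the prevalence effect the text describes.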
0.88 0.56 (0.04) 0.6 0.72 0.4 (0.12) 0.44
Gillet
0.88 0.56 (0.04) 0.92 0.83 0.72 (0.04) 0.92 0.88 (0.11) 0.66e1.10 0.83 (0.16) 0.51e1.15 0.88 0.56 (0.04) 0.76 0.76 (0) 0.84 1 1 1
0.6 (0) 0.68 (0)
0.84 0.84 0.75 (0.17) 0.41e1.08 0.7 (0.2) 0.3e1.09 3 of 3 provocation R L
0.65 (0.18) 0.28e1.02 0.62 (0.25) 0.11e1.12
0.89 0.48 (0.04) 0.76 0.75 0.6 (0.08) 0.68 0.68 (0.16) 0.35e1.01 0.5 (0.22) 0.06e0.95 0.78 0.52 (0.08) 0.68 0.51 0.6 (0.16) 0.68 0.63 (0.16) 0.30e0.96 1 0.36 (0) 0.68 0.41 (0.23) 0.03 to 0.87 0.88 0.56 (0.04) 0.6 2 of 3 provocation R L
0.56 (0.19) 0.17e0.95 0.51 (0.22) 0.08e0.95
0.92 0.08 (0.04) 0.6 0.83 0.2 (0.16) 0.52 0.59 (0.16) 0.28e0.91 0.51 (0.17) 0.17e0.85 0.75 0.24 (0.12) 0.44 0.73 0.32 (0.04) 0.6 0.41 (0.18) 0.07e0.78 0.55 (0.17) 0.2e0.9 0.91 0.16 (0.04) 0.44 0.91 0.16 (0.04) 0.44 0.42 (0.18) 0.06e0.78 0.42 (0.18) 0.06e0.78 1 of 3 provocation R L
0.8 (0.04) 0.92 0.8 (0.04) 0.76 0.77 (0.21) 0.35e1.2 0.77 0.33 (0.35) 0.36 to 1.04 0.78 R L 4 of 4 motion palpation
0.77 (0.21) 0.35e1.2 0.77 0.8 (0.04) 0.92 0.46 (0.36) 0.23 to 1.17 0.46 0.84 (0.08) 0.84
0.84 (0) 0.52 (0)
0.84 0.84 0.45 (0.36) 0.26 to 1.17 1 0.78 (0.14) 0.48e1.07 1
0.6 0.64 (0.12) 0.76 0.92 0 (0.04) 0.44 0.6 (0.21) 0.18e1.02 0.44 (0.17) 0.08e0.79 R L 3 of 4 motion palpation
0.6 (0.21) 0.18e1.02 0.68 (0.14) 0.39e0.96
0.6 0.64 (0.12) 0.76 0.68 0.12 (0.16) 0.68
0.41 (0.27) 0.11 to 0.94 0.71 0.68 (0.08) 0.68 0.42 (0.18) 0.07e0.79 0.91 0.16 (0.12) 0.44
0.78 0.52 (0.08) 0.84 0.81 0.36 (0.08) 0.84 0.78 (0.14) 0.49e1.07 0.81 (0.12) 0.57e1.06 0.78 0.52 (0.08) 0.68 1 0.28 (0) 0.68 0.56 (0.19) 0.17e0.95 0.65 (0.15) 0.34e0.96 1 0.44 (0) 0.84 0.91 0.32 (0.04) 0.76 0.8 (0.13) 0.53e1.06 0.73 (0.14) 0.45e1.01 R L 2 of 4 motion palpation
PABAK
0.76 0.08 (0.12) 0.6 1 0.52 (0) 0.68
kmax PI (BI) 95% CI
0.6 (0.15) 0.29e0.91 0.56 (0.2) 0.16e0.95
PABAK k (SE)
0.69 0.12 (0.16) 0.52 0.89 0.48 (0.04) 0.76
kmax PI (BI) 95% CI
0.52 (0.16) 0.19e0.85 0.68 (0.16) 0.35e1.01
PABAK k (SE) kmax PI (BI)
0.91 0.16 (0.04) 0.44 0.89 0.48 (0.04) 0.76
Inter-tester Tester2
95% CI k (SE)
0.42 (0.18) 0.06e0.78 0.68 (0.16) 0.35e1.01 R L
Side Tester1
Tests clusters
1 of 4 motion palpation

Table 2 Intra- and inter-examiner reliability for cluster of motion palpation or provocation tests.

palpation tests, PABAK ranged from 0.44 to 0.76 and from 0.60 to 0.84, and kappa from 0.23 to 0.73 and from 0.33 to 0.75 (95% CI: −0.2 to 1.01 and −0.18 to 1.08). The ranges for kappa and PABAK run from the test with the lowest score to the test with the highest score. The results of the intra- and inter-examiner reliability for clusters of provocation and motion palpation tests separately are presented in Table 2. For intra- and inter-examiner reliability of clusters of provocation tests, PABAK ranged from 0.44 to 0.84 and 0.52 to 0.92 and kappa from 0.41 to 0.75 and 0.50 to 0.88 (95% CI: −0.03 to 1.08 and 0.06 to 1.1). For clusters of motion palpation tests, PABAK ranged from 0.44 to 0.92 and 0.44 to 0.84 for intra- and inter-examiner reliability, and kappa ranged from 0.41 to 0.80 and 0.33 to 0.81 (95% CI: −0.11 to 1.06 and −0.36 to 1.06) for intra- and inter-examiner reliability. The ranges for kappa and PABAK run from the cluster with the lowest value to the cluster with the highest value. Table 3 presents the intra- and inter-examiner reliability for composites of motion palpation and provocation tests together. Kappa and PABAK for intra-examiner reliability ranged from 0.00 to 1.00 and 0.44 to 1.00 (95% CI: −1.92 to 1), and for inter-examiner reliability from 0.00 to 0.77 and 0.52 to 0.92 (95% CI: −1.32 to 1), respectively. The ranges for kappa and PABAK run from the composite with the lowest value to the composite with the highest value.

4. Discussion

To interpret kappa values, the guidelines proposed by Landis and Koch (1977) were used. Based on the kappa values, the results of this study mostly demonstrate fair to moderate reliability for the single motion palpation and provocation tests and moderate to substantial reliability for clusters of provocation or motion palpation tests (Tables 1–3).

4.1. Kappa and PABAK

The kappa values for composites of motion palpation and provocation tests together revealed that reliability lies along a continuum from no agreement (k = 0) to excellent (e.g. k = 1). Although the magnitude of kappa is widely used to test reliability, its interpretation is not straightforward, as other factors can influence the magnitude of the coefficient. The main factors are prevalence and bias (Hoehler, 2000; Sim and Wright, 2005). This issue is explained in the data analysis section. As discussed, in Table S1, a 2 × 2 contingency table of data from two examiners, cells a and d indicate,
k = kappa coefficient, SE = standard error, 95% CI = 95% confidence interval, kmax = maximum kappa coefficient, PI = prevalence index, BI = bias index, and PABAK = prevalence-adjusted and bias-adjusted kappa.
Table 3 Reliability for the composites of the tests. Tests clusters Side Tester1 k (SE)
Tester2 95% CI
R L
0.34 (0.23) 0.11 to 0.8 0.45 (0.19) 0.06e0.83
1 mp/2p
R L
1 mp/3p
PABAK k (SE)
1 0.52 (0) 0.81 0.36 (0)
0.52 0.52
Inter-tester 95% CI
kmax PI (BI)
95% CI
kmax PI
PABAK
0.68 0.6
0.33 (0.26) 0.18 to 0.85 0.61 0.64 (0.12) 0.6 0.47 (0.18) 0.11e0.84 1 0.28 (0) 0.52
0.41 (27) 0.11 to 0.94 0.71 0.68 (0.08) 0.68 0.2 (0.25) 0.03 to 0.70 0.66 0.56 (0.12) 0.44
0.33 (0.35) 0.36 to 1.04 0.78 0.8 (0.04) 0.76 0.43 (0.26) 0.07 to 0.94 0.43 0.68 (0.16) 0.68
0.41 (0.27) 0.11 to 0.94 0.71 0.68 (0.08) 0.68 0.33 (0.35) 0.36 to 1.04 0.78 0.8 (0.04) 0.76
R L
0.5 (0.26) 0.02 to 1.03 0.84 0.72 (0.04) 0.76 0.33 (0.35) 0.36 to 1.04 0.78 0.8 (0.04) 0.76
0.33 (0.35) 0.36 to 1.04 0.78 0.8 (04) 0.45 (0.36) 0.26 to 1.17 1 0.84 (0)
0.62 (0.25) 0.13e1.12 0.62 0.76 (0.08) 0.84 0.33 (0.35) 0.36 to 1.04 0.78 0.8 (0.04) 0.76
2 mp/1p
R L
0.4 (0.27) 0.13 to 0.93 1 0.63 (0.16) 0.3e0.96 1
0.00 (0.4) 0.78 to 0.78 0.00 0.5 (0.22) 0.06e0.95 0.75
2 mp/2p
R L
0.62 (0.25) 0.11e1.12 0.5 (0.22) 0.06e0.95
1 0.76 (0) 0.84 0.75 0.6 (0.08) 0.68
0.62 (0.25) 0.13e1.12 0.51 (0.26) 0.00e1.03
2 mp/3p
R L
0.62 (0.25) 0.11e1.12 0.77 (0.21) 0.35e1.2
1 0.76 (0) 0.84 0.77 0.8 (0.04) 0.92
0.77 (0.21) 0.35e1.2 0.77 0.8 (0.04) 0.92 0.45 (0.36) 0.26 to 1.17 1 0.84 (0) 0.84
0.77 (0.21) 0.35e1.2 0.77 0.8 (0.04) 0.92 0.45 (0.36) 0.26 to 1.17 1 0.84 (0) 0.84
3 mp/1p
R L
0.77 (0.21) 0.35e1.2 0.49 (0.2) 0.09e0.89
0.77 0.8 (0.04) 0.92 0.70 0.48 (0.12) 0.6
0.62 (0.25) 0.13e1.12 0.51 (0.26) 0.00e1.03
0.62 (0.25) 0.11e1.12 0.49 (0.2) 0.09e0.89
3 mp/2p
R L
1.0 (0.0) 1.0e1.0 0.62 (0.25) 0.13e1.12
1 0.84 (0) 1.0 0.62 0.76 (0.08) 0.84
0.77 0.8 (0.04) 0.92 0.77 (0.21) 0.35e1.2 0.45 (0.36) 0.26 to 1.17 1 0.84 (0) 0.84
0.77 (0.21) 0.35e1.2 0.77 0.8 (0.04) 0.92 0.64 (0.34) 0.02 to 1.32 0.64 0.88 (0.04) 0.92
3 mp/3p
R L
1.0 (.0) 1.0e1.0 1 0.45 (0.36) 0.26 to 1.17 1
1.0 0.84
0.00 (0.67) 1.32 to 1.32 0.00 0.92 (0.08) 0.84 0.64 (0.34) 0.02 to 1.32 0.64 0.88 (0.04) 0.92
0.00 (0.67) 1.32 to 1.32 0.00 0.92 (0.08) 0.84 0.45 (0.36) 0.26 to 1.17 1 0.84 (0) 0.84
4 mp/1p
R L
1.0 (0.0) 1.0e1.0 1 0.84 (0) 1.0 0.00 (0.67) 1.32 to 1.32 0.00 0.92 (0.08) 0.84
0.00 (0.67) 1.32 to 1.32 0.00 0.92 (0.08) 0.84 0.00 (0.98) 1.92 to 1.92 0.00 0.96 (0.04) 0.92
0.00 (0.67) 1.32 to 1.32 0.00 0.92 (0.08) 0.84 0.64 (0.34) 0.02 to 1.32 0.64 0.88 (0.04) 0.92
4 mp/2p
R L
1.0 (0.0) 1.0e1.0 1 0.84 (0) 1.0 0.00 (0.67) 1.32 to 1.32 0.00 0.92 (0.08) 0.84
0.00 (0.67) 1.32 to 1.32 0.00 0.92 (0.08) 0.84 0.00 (0.98) 1.92 to 1.92 0.00 0.96 (0.04) 0.92
0.00 (0.67) 1.32 to 1.32 0.00 0.92 (0.08) 0.84 0.64 (0.34) 0.02 to 1.32 0.64 0.88 (0.04) 0.92
4 mp/3p
R L
1.0 (0.0) 1.0e1.0 1.0 0.84 (0) 1.0 0.00 (0.98) 1.92 to 1.92 0.00 0.96 (0.04) 0.92
0.00 (0.67) 1.32 to 1.32 0.00 0.92 (0.08) 0.84 0.00 (0.98) 1.92 to 1.92 0.00 0.96 (0.04) 0.92
0.00 (0.67) 1.32 to 1.32 0.00 0.92 (0.08) 0.84 0.00 (0.67) 1.32 to 1.32 0.00 0.92 (0.08) 0.84
mp = motion palpation and p = provocation.
0.68 (0) 0.36 (0)
0.84 (0) 0.84 (0)
0.68 0.68
0.56 (0.2) 0.16 to 0.95 1 0.5 (0.19) 0.11e0.89 0.5
PABAK k (SE)
0.52 (0) 0.48 (0.2)
0.76 0.84
0.8 (0.2) 0.6 0.6 (0.08) 0.68
0.59 (0.22) 0.16e1.02 0.46 (0.19) 0.09e0.83
0.86 0.64 (0.04) 0.76 0.64 0.36 (0.16) 0.52
0.62 0.76 (0.08) 0.84 0.51 0.72 (0.12) 0.76
0.77 (0.21) 0.35e1.2 0.62 (0.25) 0.13e1.12
0.77 0.8 (0.04) 0.92 0.62 0.76 (0.08) 0.84
0.62 0.76 (0) 0.84 0.51 0.72 (0.12) 0.76
0.62 0.76 (0) 0.84 0.70 0.48 (0.12) 0.6
A.M. Arab et al. / Manual Therapy 14 (2009) 213–221
1 mp/1p
kmax PI (BI)
respectively, the numbers of subjects for whom both examiners agree on a negative or a positive result, and cells b and c indicate the numbers of subjects on whom the examiners disagree (cell b: examiner 1 positive while examiner 2 negative; cell c: examiner 1 negative while examiner 2 positive) (Table S1). In our data, PI for the individual tests is not very high, and kappa and PABAK are similar (Table 1). For clusters of provocation and motion palpation tests, however, and especially for composites of motion palpation and provocation tests, PI values are close to 1 (Tables 2 and 3), indicating that kappa is affected by prevalence. For further explanation, the raw data of the 2 × 2 contingency tables for composites of motion palpation and provocation tests are displayed in Table S2, which appears in the electronic version only. As an example, for inter-examiner reliability of the composite of four motion palpation and two provocation tests on the right side, the proportion of examiners’ agreement on negative results is high (23 of 25 patients) but agreement on positive results is 0 (a = 23, d = 0, b = 2, c = 0). The PI is therefore high (0.92) and PABAK is 0.84, while κ = 0.00 (Table 3). Table S2 presents the raw data of the 2 × 2 tables of two examinations for the other composites of tests for better interpretation. Thus PABAK was used to interpret the results, especially for test clusters. The standards proposed by Landis and Koch (1977) were also used to interpret the magnitude of PABAK (Hoehler, 2000; Kokmeyer et al., 2002; Sim and Wright, 2005).
4.2. Reliability of the individual tests
Using PABAK, therefore, our data indicate fair to substantial reliability for the individual tests (Table 1).
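The worked example given earlier (a = 23, b = 2, c = 0, d = 0) can be checked numerically. The following is a minimal sketch, not the authors' analysis code (the study cites MedCalc); the function name `agreement_stats` and the dictionary keys are illustrative assumptions. It uses the standard definitions from Byrt et al. (1993): observed agreement Po = (a + d)/n, chance agreement Pe from the marginal totals, kappa = (Po - Pe)/(1 - Pe), PABAK = 2Po - 1, PI = |a - d|/n and BI = |b - c|/n.

```python
def agreement_stats(a, b, c, d):
    """Agreement statistics for a 2 x 2 table of two examiners' ratings.

    a, d: counts on which both examiners agree (one cell per category).
    b, c: counts on which the examiners disagree.
    Hypothetical helper for illustration only.
    """
    n = a + b + c + d
    po = (a + d) / n                                      # observed agreement
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2   # chance agreement
    # Kappa is undefined when chance agreement is perfect (pe == 1).
    kappa = (po - pe) / (1 - pe) if pe != 1 else float("nan")
    return {
        "kappa": kappa,
        "pabak": 2 * po - 1,      # prevalence-adjusted bias-adjusted kappa
        "pi": abs(a - d) / n,     # prevalence index
        "bi": abs(b - c) / n,     # bias index
    }


# Composite of four motion palpation and two provocation tests, right side.
stats = agreement_stats(a=23, b=2, c=0, d=0)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

With these counts the observed agreement (23/25) equals the chance agreement implied by the marginal totals, so kappa collapses to zero even though the examiners agree on 92% of patients. PABAK does not use the marginal totals and so is unaffected.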
Some authors have suggested that motion palpation tests are reliable (Herzog et al., 1989; Cibulka and Koldehoff, 1999), while other studies have demonstrated low reliability for individual motion palpation tests and poor to substantial reliability for single pain provocation tests (Potter and Rothstein, 1985; Laslett and Williams, 1994; Strender et al., 1997; Meijne et al., 1999). However, these studies did not use exactly the same tests. We attempted to select tests with acceptable levels of validity, sensitivity and specificity (van der Wurff et al., 2000a,b). Reliability can be influenced by several factors such as the participants, therapists and clinical tests. In earlier studies, some researchers used asymptomatic subjects. In this study the participants were recruited from LBP patients with clinical signs suggestive of SIJ involvement, and patients with symptoms suggesting other sources of LBP were excluded. For the pain provocation tests, a concordant pain response is one in which there is reproduction of a pain that is similar to or exactly the same as the complaint, and discordant pain is provocation of a pain that is
atypical of the complaint. The tests in this study were classified as positive or negative with reference to a particular side, as has been recommended. In some clinical situations, performing a test on one side may produce pain on the opposite side, and the test may be improperly considered positive. We considered tests positive only if a concordant pain was reproduced on the same side. Laslett (1998) believes that insufficient pressure when applying provocation tests may generate many false negatives and affect the reliability. It has been assumed that variability in the applied force and its duration could affect the results of provocation tests (Levin et al., 2001; Levin and Stenstrom, 2003). O’Haire and Gibbons (2000) attributed poor reliability of SIJ motion palpation tests to the lack of reliability of SIJ landmark palpation and location. The moderate and substantial reliability for single provocation and motion palpation tests in the present study could be explained by our addressing these factors. Unlike the other tests, the reliability of the resisted abduction test has not been reported previously. Broadhurst and Bond (1998) reported 87% sensitivity and 100% specificity for it. Our data indicate substantial reliability for it as a single test (Table S1). It has been supposed that in this test the leg is used as a lever with the fulcrum at the inferior border of the SIJ, thereby stressing the cephalic aspect of the SIJ.
4.3. Reliability for the cluster of motion palpation or provocation tests
Considering PABAK, the results of this study showed moderate to excellent reliability for the clusters of motion palpation or provocation tests (Table 2). Comparing the reliability data provided in Table S1 and Table 1 shows that test clusters achieved better reliability than individually performed tests.
Among the different clusters of motion palpation tests, reliability for the cluster of three positive out of four tests was substantial and higher than for the other types. For clusters of provocation tests, reliability of three positive out of three tests was substantial and better than for the other types (Table S1, Table 1). In the multi-test regimens recommended for evaluation of the SIJ, only clusters of motion palpation or provocation tests, regardless of the involved side, have been used. Robinson et al. (2007) found good reliability for clusters of provocation tests and poor reliability for the single tests on the left and right sides.
4.4. Reliability for composites of motion palpation and provocation tests
Our findings indicate substantial to excellent reliability for composites of motion palpation and provocation tests (Table 3). Because pain provocation and motion palpation tests assess SIJ pain and dysfunction, respectively, composites of motion palpation and provocation tests are generally used to assess and
diagnose SIJ disorders in common clinical practice. We attempted to examine whether the combination of motion palpation and pain provocation tests is reliable. For composites of motion palpation and provocation tests, the reliability of a composite of three or more motion palpation tests together with two or more provocation tests was found to be excellent and better than that of other composites (Table 3). Cibulka and Koldehoff (1999) used only four palpation tests and categorized the result as positive or negative, regardless of the side of SIJ dysfunction. Riddle and Freburger (2002) examined the degree of agreement between therapists for the same tests by taking into account the side and type of the presumed dysfunction and found poor reliability for the composite results of four tests. The problem with this study is that internal consistency between test results was not assessed. Thus they did not account for either type or side of dysfunction. Kokmeyer et al. (2002) showed good reliability for a multi-test regimen of five provocation tests. Robinson et al. (2007) assessed the reliability of two clusters of three or five provocation tests with regard to the side of pain and showed good reliability of the clusters. As noted, in those studies only clusters of palpation or provocation tests were examined. Based on the results of our study, we advocate composites of three or more positive motion palpation tests and two or more positive pain provocation tests for clinical use. One of the limitations of this study is the use of testers with only 1 year of experience. We examined the reliability of tests using two therapists with 1 year of experience in order to determine whether single tests or composites of tests are reliable when used clinically, even by testers with little experience.
5. Conclusion
This study showed fair to substantial reliability for the individual motion palpation or pain provocation tests. Our data demonstrated moderate to substantial intra- and inter-examiner reliability for clusters of motion palpation or pain provocation tests. Given the excellent reliability found in this study for composites of motion palpation together with pain provocation tests, such composites could be used as a reliable method for SIJ assessment in clinical practice. Kappa is affected by the paradoxical effects of prevalence and BI, and it seems better to calculate PABAK in order to interpret reliability appropriately in such studies.
Appendix A. Supplementary data
Supplementary data associated with this article can be found in the online version at doi:10.1016/j.math.2008.02.004.
References
Bernard TN. The role of the sacroiliac joints in low back pain: basic aspects of pathophysiology, and management. In: Vleeming A, Mooney V, Dorman T, Snijders C, Stoeckart R, editors. Movement, stability & low back pain. The essential role of the pelvis. 2nd ed. Edinburgh: Churchill Livingstone; 1997. p. 73–88.
Broadhurst NA, Bond MJ. Pain provocation tests for the assessment of sacroiliac joint dysfunction. Journal of Spinal Disorders 1998;11(4):341–5.
Byrt T, Bishop J, Carlin JB. Bias, prevalence and kappa. Journal of Clinical Epidemiology 1993;46:423–9.
Cibulka MT, Delitto A, Koldehoff RM. Changes in innominate tilt after manipulation of the sacroiliac joint in patients with low back pain: an experimental study. Physical Therapy 1988;68:1359–63.
Cibulka MT, Koldehoff R. Clinical usefulness of a cluster of sacroiliac joint tests in patients with and without low back pain. Journal of Orthopaedic and Sports Physical Therapy 1999;9(2):83–9.
Dreyfuss P, Michaelsen M, Pauza K, McLarty J, Bogduk N. The value of medical history and physical examination in diagnosing sacroiliac joint pain. Spine 1996;21(22):2594–602.
Ehrlich GE. Low back pain. Bulletin of the World Health Organization 2003;81(9):671–2.
Fortin J, Dwyer A, West S, Pier J. Sacroiliac joint: pain referral maps upon applying a new injection/arthrography technique. Part I: asymptomatic volunteers. Spine 1994a;19:1475–82.
Fortin J, Aprill C, Ponthieux B, Pier J. Sacroiliac joint: pain referral maps upon applying a new injection/arthrography technique. Part II: clinical evaluation. Spine 1994b;19:1483–9.
Haas M. Interexaminer reliability for multiple diagnostic test regimens. Journal of Manipulative and Physiological Therapeutics 1991;14(2):95–103.
Herzog W, Read LJ, Conway PJ, Shaw LD, McEwen MC. Reliability of motion palpation procedures to detect sacroiliac joint fixations. Journal of Manipulative and Physiological Therapeutics 1989;12(2):86–92.
Hoehler FK. Bias and prevalence effects on kappa viewed in terms of sensitivity and specificity. Journal of Clinical Epidemiology 2000;53:499–503.
Kokmeyer DJ, van der Wurff P, Aufdemkampe G, Fickenscher TC. The reliability of multitest regimens with sacroiliac pain provocation tests. Journal of Manipulative and Physiological Therapeutics 2002;25(1):42–8.
Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–74.
Laslett M. The value of the physical examination in diagnosis of painful sacroiliac joint pathologies. Spine 1998;23:962–4.
Laslett M, Williams M. The reliability of selected pain provocation tests for sacroiliac joint pathology. Spine 1994;9(11):1243–9.
Laslett M, Young S, Aprill C, McDonald B. Diagnosing painful sacroiliac joints: a validity study of a McKenzie evaluation and sacroiliac provocation tests. The Australian Journal of Physiotherapy 2003;49(2):89–97.
Laslett M, Aprill C, McDonald B, Young S. Diagnosis of sacroiliac joint pain: validity of individual provocation tests and composites of tests. Manual Therapy 2005;10(3):207–18.
Levin U, Nilsson-Wikmar L, Harms-Ringdahl, Stenstrom CH. Variability of forces applied by experienced physiotherapists during provocation of the sacroiliac joint. Clinical Biomechanics 2001;16:300–6.
Levin U, Stenstrom CH. Force and time recording for validating the sacroiliac distraction test. Clinical Biomechanics 2003;18:821–6.
Maigne JY, Aivaliklis A, Pfefer F. Results of sacroiliac joint double block and value of sacroiliac pain provocation tests in 54 patients with low back pain. Spine 1996;21(16):1889–92.
MedCalc statistical software. Broekstraat 52, B-9030 Mariakerke, Belgium.
Meijne W, van Neerbos K, Aufdemkampe G, van der Wurff P. Intraexaminer and interexaminer reliability of the Gillet test. Journal of Manipulative and Physiological Therapeutics 1999;22(1):4–9.
Mooney V. Sacroiliac joint dysfunction. In: Vleeming A, Mooney V, Dorman T, Snijders C, Stoeckart R, editors. Movement, stability & low back pain. The essential role of the pelvis. 2nd ed. Edinburgh: Churchill Livingstone; 1997. p. 37–52.
O’Haire C, Gibbons P. Inter-examiner and intra-examiner agreement for assessing sacroiliac anatomical landmarks using palpation and observation: pilot study. Manual Therapy 2000;5(1):13–20.
Potter NA, Rothstein JM. Intertester reliability for selected clinical tests of the sacroiliac joint. Physical Therapy 1985;65(11):1671–5.
Riddle DL, Freburger JK. Evaluation of the presence of sacroiliac joint region dysfunction using a combination of tests: a multicenter intertester reliability study. Physical Therapy 2002;82(8):772–81.
Robinson HS, Brox JI, Robinson R, Bjelland E, Solem S, Telje T. The reliability of selected motion- and pain provocation tests for the sacroiliac joint. Manual Therapy 2007;12(1):72–9.
Schwarzer AC, Aprill CN, Bogduk N. The sacroiliac joint in chronic low back pain. Spine 1995;20(1):31–7.
Sim J, Wright CC. The kappa statistic in reliability studies: use, interpretation, and sample size requirements. Physical Therapy 2005;85:257–68.
Slipman CW, Whyte 2nd WS, Chow DW, Chou L, Lenrow D, Ellen M. Sacroiliac joint syndrome. Pain Physician 2001;4(2):143–52.
Strender LE, Sjoblom A, Sundell K, Ludwig R, Taube A. Interexaminer reliability in physical examination of patients with low back pain. Spine 1997;22(7):814–20.
Stuber KJ. Specificity, sensitivity and predictive values of clinical tests of the sacroiliac joint: a systematic review of the literature. Journal of the Canadian Chiropractic Association 2007;51(1):30–41.
Vincent-Smith B, Gibbons P. Inter-examiner and intra-examiner reliability of the standing flexion test. Manual Therapy 1999;4(2):87–93.
van der Wurff P, Hagmeijer RH, Meyne W. Clinical tests of the sacroiliac joint. A systemic methodological review. Part 1: reliability. Manual Therapy 2000a;5(1):30–6.
van der Wurff P, Meyne W, Hagmeijer RH. Clinical tests of the sacroiliac joint. A systemic methodological review. Part 2: validity. Manual Therapy 2000b;5(2):89–96.
Young S, Aprill C, Laslett M. Correlation of clinical examination characteristics with three sources of chronic low back pain. The Spine Journal 2003;3:460–5.
Manual Therapy 14 (2009) 222–230 www.elsevier.com/math
Professional Issue
Classification of low back-related leg pain: A proposed patho-mechanism-based approach
Axel Schäfer a,b,*, Toby Hall b,c,1, Kathy Briffa b,2
a Rückenzentrum am Michel, Ludwig Erhart Straße 18, 20459 Hamburg, Germany
b School of Physiotherapy, Curtin University of Technology, GPO Box U1987, Perth, WA 6845, Australia
c Manual Concepts, P.O. Box 1236, Booragoon, WA 6954, Australia
Received 13 July 2006; received in revised form 30 September 2007; accepted 4 October 2007
Abstract Leg pain is a frequent accompaniment to low back pain, arising from disorders of neural or musculoskeletal structures of the lumbar spine. Differentiating between different sources of radiating leg pain is important to make an appropriate diagnosis and identify the underlying pathology. It is proposed that low back-related leg pain be divided into four subgroups according to the predominating pathomechanisms involved. The first subgroup features central sensitization with mainly positive symptoms such as hyperalgesia, the second subgroup involves denervation with significant axonal damage showing predominantly negative sensory symptoms and possibly motor loss and the third subgroup involves peripheral nerve sensitization with enhanced nerve trunk mechanosensitization. The fourth subgroup features somatic referred pain from musculoskeletal structures, such as the intervertebral disc or facet joints. Accordingly, four groups of patients with leg pain associated with structures in the lower back can be identified:
1. Central sensitization.
2. Denervation.
3. Peripheral nerve sensitization.
4. Musculoskeletal.
Each group presents with a distinct pattern of symptoms and signs. Although there may be considerable overlap between the classifications, the authors propose the existence of an overriding mechanism. The importance of distinguishing low back-related leg pain into these four groups is to facilitate diagnosis and provide a more effective, appropriate treatment.
© 2007 Published by Elsevier Ltd.
Keywords: Neuropathic pain; Low back pain; Leg pain; Sciatica; Classification; Diagnosis
1. Introduction
* Corresponding author. Rückenzentrum am Michel, Ludwig Erhart Straße 18, 20459 Hamburg, Germany. Tel.: +49 40 43280274.
E-mail addresses: [email protected] (A. Schäfer), info@manualconcepts.com (T. Hall), k.briff [email protected] (K. Briffa).
1 Tel./fax: +61 8 93164080.
2 Tel.: +61 8 9266 3666; fax: +61 8 9266 3699.
1356-689X/$ - see front matter © 2007 Published by Elsevier Ltd. doi:10.1016/j.math.2007.10.003
Low back pain (LBP) is one of the major health problems in western industrial societies with a lifetime prevalence of 84% (Taylor et al., 2000). The high economic cost this imposes on society is comparable to other disorders such as heart disease, depression or diabetes (Maetzel and Li, 2002). Accompanying leg
A. Schäfer et al. / Manual Therapy 14 (2009) 222–230
pain is present in approximately 25–57% of all LBP cases (Heliövaara et al., 1987; Cavanaugh and Weinstein, 1994; Selim et al., 1998), but these cases account for a disproportionately large amount of the costs of medical care and disability compensation caused by LBP (Ren et al., 1999). Furthermore, accompanying leg pain is an important predictor for chronicity of LBP and an indicator of the severity of the disorder (Selim et al., 1998). The primary pathology causing referred leg pain is often indistinct, as many structures are capable of evoking a similar pattern of pain (Adams et al., 2002; Bogduk and McGuirk, 2002). Failure to distinguish different forms of referred pain in the assessment of LBP is reported to be a common error leading to inappropriate investigations and treatment (Bogduk and McGuirk, 2002). Low back-related leg pain may be due to damage or dysfunction of neural or musculoskeletal structures. Possible events causing damage to neural structures may be mechanical, such as intervertebral disc protrusion, or biochemical, caused by cytokines or other inflammatory mediators. The resulting perturbation of neural structures may lead to a variety of clinical manifestations, ranging from negative symptoms such as motor disturbances and loss of sensation to positive symptoms such as paraesthesias or hyperalgesia associated with central sensitization (Table 1). Furthermore, it is well known that nerve injury alone is not always painful (Boden et al., 1990; Beattie et al., 2001) and that patients exhibiting severe symptoms may not necessarily have evidence of nerve root compression on imaging studies (Boos et al., 1995; Ohnmeiss et al., 1997). Consequently, pain is not a necessary event following neural compromise, and diagnosis based on pathology may not be the most relevant. A focus on pathomechanisms may be more appropriate.
Although, as yet, it is not possible to diagnose pain mechanisms by clinical evaluation, a thorough examination protocol consisting of neurological examination, screening for central sensitization and assessment of nerve tissue mechanosensitization may help to elucidate some of the mechanisms currently considered responsible for signs and symptoms seen in low back-related leg pain. Depending on the assumed predominance of pathomechanisms, differentiation of low back-related leg
pain into four distinct subgroups is proposed. These categories are central sensitization comprising major features of central nervous system sensitization, denervation arising from significant axonal compromise without evidence of central nervous system changes, peripheral nerve sensitization arising from nerve trunk inflammation without clinical evidence of significant denervation, and musculoskeletal pain referred from non-neural structures such as the disc or facet joints. The purpose of this article is to present the rationale for the proposed classification system and the corresponding signs and symptoms for each of the groups. Background pathoanatomy and pathomechanisms will be reviewed and an algorithm for clinical classification presented.
2. Pathoanatomy and pathomechanisms
2.1. Peripheral events
2.1.1. Inflammation
The lumbar intervertebral disc (IVD) plays a central role in the development of low back-related leg pain and radiculopathy (Yoshizawa et al., 1995). The pathomechanisms involved are internal disc disruption, fissure formation and nucleus pulposus (NP) prolapse or sequestration leading to inflammation of the nerve root, and subsequent pain of nerve origin, even without mechanical compression. Inflammation caused by biochemical substances from the NP plays a significant role in the development of low back-related leg pain (Olmarker et al., 1993; Olmarker, 1997; Brisby, 2003). One suggested cause of this kind of inflammation is endplate fracture of the vertebrae, where NP material may become exposed to osteochondral blood and susceptible to progressive degradation of NP matrix (Bogduk, 2005). Degenerative changes of the IVD, associated with internal disc disruption, commonly lead to fissures in the annulus, which allow inflammatory mediators to disperse through the disc and contact the innervated outer third of the annulus (Videman and Nurminen, 2004; Peng et al., 2005). These chemicals may cause excitation of nociceptive afferents and thereby discogenic pain, which may then refer into the lower limb (O’Neill et al., 2002). In case of a full annular rupture, NP material and inflammatory mediators
Table 1 Positive and negative symptoms and signs in neuropathic pain.
Positive symptoms, sensory: pain, paroxysm, dysaesthesia, paraesthesia
Positive symptoms, motor: spasm
Positive signs, sensory: hyperalgesia (thermal and mechanical), allodynia (light touch, pin-prick), wind-up
Positive signs, motor: hyperreflexia, clonus, Babinski
Negative symptoms, sensory: hypoaesthesia, hypopathia
Negative symptoms, motor: palsy, weakness
Negative signs, sensory: hypoaesthesia
Negative signs, motor: muscle weakness, hyporeflexia
may leak into the spinal canal, contact nerve tissues such as transiting or exiting nerve roots and lead to inflammation of these structures (Videman and Nurminen, 2004). A number of pro-inflammatory cytokines are associated with inflammation such as interleukin 6 (IL-6), which is up-regulated after macrophage infiltration (Takada et al., 2004; Mulleman et al., 2006). Increased levels of IL-6 could be one of the major causes for neurological signs and symptoms, especially neurogenic pain (Takada et al., 2004). Other inflammatory mediators may be expressed on the surface of NP cells (Kayama et al., 1998) such as the pro-inflammatory cytokine tumour necrosis factor α and nitric oxide (NO), which may enhance neuropathic pain states (Olmarker and Larsson, 1998; Brisby et al., 2000; Olmarker and Rydevik, 2001). Additionally, inflammatory changes may cause an increase in sodium channel density/conductance in the nerve root and dorsal root ganglion, which in turn may contribute to increased ectopic discharges and nerve trunk mechanosensitivity (Devor and Rappaport, 1990; Chen et al., 2004; Devor, 2006). This concept of chemically induced nerve root pain is supported by animal experimental evidence indicating that locally induced inflammatory processes in the vicinity of nerves may lead to marked pain behaviour with increased allodynic and hyperalgesic responses even in the absence of axonal damage (Eliav et al., 1999, 2001). Other studies have demonstrated that locally induced neuritis to a peroneal or sciatic nerve caused pressure and stretch mechanosensitivity of the nerve trunk (Bove et al., 2003; Dilley et al., 2005). This is consistent with findings by Olmarker and Myers (1998), who found lowered mechanical and heat pain thresholds, but with minor evidence of axonal damage, after application of NP material to nerve roots in rats. It is possible that these processes could account for some types of movement-dependent referred pain (Bove et al., 2003).
For example, Greening et al. (2005) demonstrated altered median nerve movement and elevated nerve trunk mechanosensitivity to pressure and stretch in whiplash patients and patients with non-specific arm pain. These findings suggest that injury related inflammation may cause widespread changes to nerve fibres leading to increased nerve trunk mechanosensitivity and dysfunction at the peripheral terminals (Greening, 2004).
2.2. Compression
Mechanical nerve root compression can be caused by prolapsed IVD tissue, osteophytes, facet joint hypertrophy or ligamentum flavum hypertrophy (Taylor and Twomey, 1986; Kobayashi et al., 2005). The putative effects of nerve root compressions include impaired intraradicular blood flow, increased endoneural fluid pressure and nerve fibre deformation (Rydevik et al., 1984,
1991; Olmarker et al., 1989). This combination of increased endoneural fluid pressure and decreased blood flow may result in neuronal ischaemia leading to breakdown of axonal myelin sheaths and alteration of the blood–nerve barrier (Cornefjord et al., 1997; Kobayashi et al., 2004; Igarashi et al., 2005). Such structural nerve damage may be the cause of sensory and motor dysfunction and radiating pain. On the other hand, contrary to this notion, it is well known that compression of nerves does not always cause pain (McNab, 1972; Wiesel et al., 1984; Kjaer et al., 2005), although the reason for this is unclear. One factor may be the rate of nerve compression. Rapid-onset neural compromise is likely to be associated with inflammatory change and development of neural irritation according to the process described above (Kobayashi et al., 2005). Another common situation is chronic, gradual onset nerve compression (Olmarker et al., 1990), but the effect of chronic compression has been less thoroughly studied than acute compression. Although the extent of nerve injury from chronic or acute compression cannot be easily compared in animal experiments, it seems that acute nerve injury causes more severe changes (Olmarker et al., 1990; Yoshizawa et al., 1995; Cornefjord et al., 1997; Igarashi et al., 2005). A typical example of chronic nerve root compression of gradual onset is spinal stenosis, where inflammation is usually not well developed. In this example, pain usually occurs after sustained extension loading such as in standing or walking (Takahashi et al., 1995a, b), which causes reduced foraminal and spinal canal volume, vascular compromise and nerve root anoxia (Blau and Logue, 1978). In pure spinal stenosis, there is usually an absence of nerve trunk mechanosensitivity (Arbit and Pannullo, 2001).
2.3. Central events
Continued noxious input from the peripheral nervous system as a result of inflammation or compression of nerve structures may lead to augmented response of signalling neurons in the central nervous system, a process commonly referred to as central sensitization (Campbell and Meyer, 2006). Most of the changes leading to central sensitization take place in the dorsal horn of the spinal cord, where intense and sustained nociceptor activation, especially C fibre activation, leads to phosphorylation of N-methyl-D-aspartate receptors in nociceptive specific dorsal horn neurons. This may cause longer lasting changes in their excitability so that previously subthreshold C fibre inputs can now drive the postsynaptic neuron (Costigan and Woolf, 2000). A subtype of dorsal horn neurons are wide dynamic range (WDR) neurons where tactile Aβ and nociceptive C fibres converge. Therefore, involvement of WDR neurons may lead to enhanced synaptic efficacy of tactile Aβ fibres and consequently innocuous Aβ signals are coded as pain (Simone
et al., 1991; Woolf and Doubell, 1994). Co-existent with the above-mentioned changes, nerve injury may alter the properties of Aβ fibres, such that they begin to act like nociceptive fibres expressing neuropeptides, which enables these fibres to evoke central sensitization (Woolf and Salter, 2000). This mechanism is called cell phenotypic shift. Similarly, Aβ fibres may sprout onto Lamina II in the dorsal horn, an area which normally receives only nociceptor information (Mannion et al., 1996). Sensitization of WDR neurons, phenotypic shift and Aβ fibre sprouting lead to enhanced pain in response to normally innocuous signals (allodynia), and this again may drive central sensitization. In addition to enhanced pain input, spontaneous activity in central nociceptive neurons may be caused by loss of sensory input due to damage of primary afferent axons (deafferentation) in the dorsal nerve root (Baumgärtner et al., 2002). Furthermore, diminished inhibitory mechanisms may contribute to enhanced pain processing including cell death of inhibitory interneurons in the dorsal horn (Woolf and Mannion, 1999) as well as changed descending modulatory mechanisms from the brain stem (Ren and Dubner, 1996, 2002; Gardell et al., 2003). Finally, secondary changes in cortical and subcortical brain regions, triggered by cognitions, emotions and attention, may further enhance central sensitization and development of spontaneous activity and pain (Tracey et al., 2002; Zusman, 2002; Apkarian et al., 2005). In summary, the main mechanisms responsible for enhanced pain processing associated with central sensitization are sensitization of nociceptive specific dorsal horn neurons, especially WDR neurons, disinhibition, deafferentation, phenotypic switch and sprouting of Aβ fibres, as well as changes in cortical and subcortical brain regions.
2.4. Referred leg pain from musculoskeletal structures
A large proportion of low back-related leg pain is accounted for by disorders of musculoskeletal structures (Bogduk and McGuirk, 2002). It has been shown that intervertebral discs (Ohnmeiss et al., 1997; O’Neill et al., 2002), facet joints (Mooney and Robertson, 1976; Schwarzer et al., 1994), sacroiliac joints (Fortin et al., 1994) and muscles (Travell and Simons, 1983) may refer pain into the lower limb. The convergence theory explains the reason for this phenomenon. Here, afferent impulses from different regions converge upon the same viscerosomatotopic neurons in the central nervous system, causing a mental projection of pain to the region corresponding with the spinal nerve through which the afferent nerve fibres enter the spinal cord (Jinkins, 2004). For example, a projection neuron for the fifth lumbar nerve may receive input from the hip, thigh,
leg, and foot. Noxious input of sufficient strength from an injured facet joint or intervertebral disc can activate this projection neuron so that, via second-order neurons, the contralateral somatosensory cortex receives information that the nociceptive input arises from the lumbar structure and the extremity (Gillette et al., 1993).
2.5. Proposed classification and their corresponding signs and symptoms
Based on the mechanisms described above, the following classification of low back-related leg pain into four categories is proposed (Table 2).
2.5.1. Central sensitization
Some patients report primarily positive symptoms such as paraesthesias, dysaesthesias, hyperalgesia, dynamic mechanical allodynia and stimulus independent pain driven by central processes (Woolf and Mannion, 1999; Baumgärtner et al., 2002). Clinical features of central sensitization (Table 2) are revealed by pain descriptors such as shooting, lancinating or burning. The patient may report paroxysms, or complain of mechanical or thermal allodynia. The neurological examination may reveal light touch allodynia or altered pin prick thresholds (Bennett, 2001). A number of studies have investigated the influence of central sensitization on sensory changes. There is good evidence demonstrating mechanical pressure and thermal hyperalgesia attributed to central sensitization in acute and chronic whiplash patients (Moog et al., 2002; Sterling et al., 2003, 2005; Scott et al., 2005). Mechanical hyperalgesia is also a feature of complex regional pain syndrome which is now widely acknowledged to be enhanced by central sensitization linked to neuroimmune activation (Rommel et al., 2001; Alexander et al., 2005). Although there are no published data so far, it seems reasonable to extrapolate that central sensitization following nerve root injury in the lumbar spine may also be associated with mechanical and/or thermal hyperalgesia.
2.5.2. Denervation
Denervation can be caused by structural nerve damage with primarily negative symptoms such as sensory or motor deficits (Baumgärtner et al., 2002). For example, radiculopathies defined as ‘‘objective loss of sensory and/or motor function as a result of conduction block in axons of a spinal nerve or its roots’’ (Merskey and Bogduk, 1994) are seen as a common cause of neuropathic pain (Dworkin et al., 2003; Baron and Binder, 2004; Jensen et al., 2004). Clinical examination of neurological function (muscle power, reflexes and skin sensitivity to light touch and
A. Schäfer et al. / Manual Therapy 14 (2009) 222–230
Table 2. Diagnostic group and related features.

Diagnostic group | Central sensitization | Denervation | Peripheral nerve sensitization | Musculoskeletal
Classification | Neuropathic | Neuropathic | Neuropathic or nociceptive | Nociceptive
Symptomatic structure | Neural | Neural | Neural | Musculoskeletal
Mechanisms | Sensitization of WDR neurons; disinhibition; forebrain-mediated CS | Wallerian degeneration; demyelination | | Convergence
Effect | Enhanced processing of peripheral input; distal pain; hyperaesthesia; hyperalgesia; paraesthesia; allodynia; LANSS score ≥12; may have features of the diagnostic groups denervation and peripheral sensitization | Conduction block; deafferentation; segmentally distributed distal pain; hypoesthesia; weakness; palsy; diminished light touch and pinprick; diminished or absent reflexes; muscle weakness; minimal features of peripheral sensitization; LANSS score | |

0.4) when an orthopedic surgeon and a rheumatologist performed the test in
Fig. 1. Neer impingement sign. Patient (n = 33) responses at test–retest (positive or negative response per patient; test examiner A and re-test examiner A).
K. Johansson, S. Ivarson / Manual Therapy 14 (2009) 231–239
Fig. 2. Neer impingement sign. Patients’ (n = 33) responses from examiners A and B (positive or negative response per patient).
consecutive patients with shoulder problems. Dromerick et al. (2006) evaluated the Neer impingement sign, among other tests, in patients recruited from an academic inpatient stroke rehabilitation service. They reported good interexaminer reliability (κ = 0.78) using two examiners who evaluated patients with hemiplegic shoulder pain. These researchers present differing conclusions about reliability for some of the maneuvers in focus. Their results are based on more or less divergent materials, which makes comparisons difficult and could explain the variation in levels of agreement. When interpreting the perfect to almost perfect levels of agreement in the current study, some methodological aspects must be taken into consideration. Only two examiners were included in this study, which limited the sources of variation and influenced the levels of agreement. This is partly compensated for by the reasonable number of participating patients, but should be taken into account when extrapolating the results. Further, the evaluated maneuvers had a dichotomous response, a positive or negative finding. This also limits the possible measurement variation and consequently affects the levels of agreement, for both intra- and interexaminer reliability. However, a dichotomous response reflects how these tests are used in clinical practice. At the second test occasion, the re-test, there was a risk of patients remembering their responses from the first test occasion and a possibility that the patients tried to be helpful when responding. On the other hand, clinical experience is that these responses are distinctly expressed both verbally and in body language, supporting a true test response. Agreement levels could further be biased by the fact that the examiner remembered the test response from the first test occasion
Fig. 3. Hawkins–Kennedy impingement test. Patient (n = 33) responses at test–retest (positive or negative response per patient; test examiner A and re-test examiner A).
Fig. 4. Hawkins–Kennedy impingement test. Patients’ (n = 33) responses from examiners A and B (positive or negative response per patient).
that influenced the interpretation of the second. A preselected group of participants with suspected subacromial pain was enrolled in this study. This limits the number of negative responses, but in the actual clinical encounter these maneuvers are chosen especially when subacromial soft tissue involvement is suspected. The results of this study include both negative and positive responses (Table 1), but positive responses are most prevalent for three of the four maneuvers. Only the Jobe supraspinatus test had a more even distribution (Table 1). The patients also reported limited duration of symptoms, and none reported extensive pain or disability. Patients with higher pain ratings or more disabled shoulders, probably due to increased involvement of surrounding tissues, could make the test response more difficult to interpret and thereby increase variability. But since these tests have been reported as highly sensitive (Park et al., 2005), inclusion of patients with more disabled shoulders would probably increase the number of positive tests and not diminish reliability. In summary, all these aspects could influence the κ-coefficients (Sim and Wright, 2005). The standardization used (Appendix) emphasizes the importance of locking the thoraco-scapular movement. This is crucial in order to provoke the subacromial structures as well as to obtain this high degree of reproducibility. This is supported by the study by De Wilde et al. (2003). The Jobe supraspinatus test was performed unilaterally, a conscious choice in order to secure a correct performance. Jobe and Moynes (1982) recommended the Jobe supraspinatus test as useful both when examining
Fig. 5. Patte maneuver. Patient (n = 33) responses at test–retest (positive or negative response per patient; test examiner A and re-test examiner A).
Fig. 6. Patte maneuver. Patients’ (n = 33) responses from examiners A and B (positive or negative response per patient).
strength in the supraspinatus muscle and when strengthening it. In the current study, this test, as well as the Patte maneuver (Leroux et al., 1995), was interpreted in relation to pain provocation and not only muscle force, which differs from the original description. When these maneuvers are performed by, for example, a PT in patients with suspected subacromial impingement, the main response is either no pain or pain combined with a varying degree of muscular weakness. The muscular weakness could be a result of muscular-tendinous changes and/or is probably due to neuromuscular inhibition in the presence of pain (Farina et al., 2004). Accordingly, pain or no pain as the test response seems more relevant, since muscle force is hard to evaluate in the presence of pain. Intra- and interexaminer reliability are affected by different sources of variation that could influence reproducibility. The examiners’ experience with the maneuvers used could be of importance, but the results of the current study, where experience differed, indicate that equal experience is not necessary to reach almost perfect intra- and interexaminer reliability. In this study, variation was limited by standardizing the maneuvers. Further, the within-subject variation was monitored. The stability of the current shoulder complaint was assessed by VAS for pain at rest as well as VAS for functional disability before each test occasion. Further, the duration of pain in case of a positive response was monitored using VAS. The pain always returned to the pre-test level before the start of the test by the second examiner. Since these factors were stable, the examiner(s) seem to be the main source of variation. Altogether, these factors can often be controlled in the actual clinical encounter to support reliability when using these maneuvers in daily practice.
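The discussion above turns on how κ-coefficients behave for dichotomous (positive or negative) responses. As a minimal sketch, and not the authors' statistical procedure, the following Python code computes Cohen's kappa for two paired series of responses and attaches the verbal labels of Landis and Koch (1977). The function names and the two sets of counts are hypothetical and chosen only for illustration; they are not data from this study.

```python
from typing import List, Sequence, Tuple


def cohen_kappa(first: Sequence[bool], second: Sequence[bool]) -> float:
    """Cohen's kappa for two paired series of dichotomous ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be paired and non-empty")
    n = len(first)
    # Observed agreement: share of patients given the same response twice.
    p_observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement, from each series' own proportion of positives.
    p1 = sum(first) / n
    p2 = sum(second) / n
    p_chance = p1 * p2 + (1 - p1) * (1 - p2)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when all ratings are identical")
    return (p_observed - p_chance) / (1 - p_chance)


def landis_koch_label(kappa: float) -> str:
    """Verbal label for kappa following the Landis and Koch (1977) bands."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


def paired(both_pos: int, only_first: int, only_second: int,
           both_neg: int) -> Tuple[List[bool], List[bool]]:
    """Expand hypothetical 2x2 cell counts into two paired rating series."""
    first = ([True] * (both_pos + only_first)
             + [False] * (only_second + both_neg))
    second = ([True] * both_pos + [False] * only_first
              + [True] * only_second + [False] * both_neg)
    return first, second


# Hypothetical counts for 33 patients, two disagreements in each scenario.
balanced = paired(both_pos=24, only_first=1, only_second=1, both_neg=7)
skewed = paired(both_pos=30, only_first=1, only_second=1, both_neg=1)

for name, (a, b) in [("balanced", balanced), ("skewed", skewed)]:
    k = cohen_kappa(a, b)
    print(f"{name}: kappa = {k:.3f} ({landis_koch_label(k)})")
```

Both hypothetical scenarios have the same observed agreement (31 of 33 patients), yet kappa is lower in the second one, where negative responses are rare. This illustrates why a preselected sample with few negative responses can influence the κ-coefficients.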
Fig. 7. Jobe supraspinatus test. Patient (n = 33) responses at test–retest (positive or negative response per patient; test examiner A and re-test examiner A).
Fig. 8. Jobe supraspinatus test. Patients’ (n = 33) responses from examiners A and B (positive or negative response per patient).
5. Conclusion

The Neer impingement sign, the Hawkins–Kennedy impingement test, the Patte maneuver and the Jobe supraspinatus test are all highly reliable. In combination with earlier research about their validity, these four maneuvers seem suitable for use in clinical practice to identify patients with subacromial pain with impingement phenomena. However, their ability to discriminate between structures in the area is limited. Their high level of intra- and interexaminer reliability, together with validity aspects, is the clinician’s tool in the diagnostic procedure. A homogeneous diagnostic classification is a prerequisite for a relevant choice of treatment and necessary when implementing research results into clinical practice.

Acknowledgements

We wish to thank participating patients, involved family physicians and physical therapists, especially Sofi Tagesson, for making this study possible as well as Henrik Magnusson and Elisabeth Wilhelm for cooperation in the statistical area. Financial support: Linköping University and Hemborgs Memorial.

Appendix

All maneuvers were performed with the patient in a seated position.

The Neer impingement sign

The patient’s arm was forward flexed combined with medial rotation in the gleno-humeral joint. The examiner prevented the thoraco-scapular movement by fixating the acromion with a depressive force.

[Figure: Illustrates the Neer impingement sign.]

The Hawkins–Kennedy impingement test

The patient’s arm was positioned in 90° flexion in the gleno-humeral joint as well as in the elbow. Then the gleno-humeral joint was forcibly rotated medially by lowering the forearm while supporting the elbow. The examiner prevented the thoraco-scapular movement by fixating the acromion with a depressive force.

[Figure: Illustrates the Hawkins–Kennedy impingement test.]

The Patte maneuver

The patient’s arm was positioned in 90° flexion in the gleno-humeral joint with the elbow in 90° flexion and then medially rotated by lowering the forearm. The patient was then instructed to activate lateral rotation against the examiner’s resistance. The examiner prevented the thoraco-scapular movement by fixating the acromion with a depressive force.

[Figure: Illustrates the Patte maneuver.]

The Jobe supraspinatus test

The patient’s arm was extended and medially rotated and elevated to 90° abduction in the scapular plane (90° abduction and then 30° horizontal adduction). The examiner instructed the patient to maintain position and resist a downward pressure.

[Figure: Illustrates the Jobe supraspinatus test.]
References

Çalış M, Akgün K, Birtane M, Karacan I, Çalış H, Tüzün F. Diagnostic value of clinical diagnostic tests in subacromial impingement syndrome. Annals of the Rheumatic Diseases 2000;59:44–7.
De Wilde L, Plasschaert F, Berghs B, Van Hoecke M, Verstaete K, Verdonk R. Quantified measurement of subacromial impingement. Journal of Shoulder and Elbow Surgery 2003;12:346–9.
Dromerick AW, Kumar A, Volshteyn Edwards DF. Hemiplegic shoulder pain syndrome: interrater reliability of physical diagnosis signs. Archives of Physical Medicine and Rehabilitation 2006;87:294–5.
Farina D, Arendt-Nielsen L, Merletti R, Graven-Nielsen T. Effect of experimental muscle pain on motor unit firing rate and conduction velocity. Journal of Neurophysiology 2004;91:1250–9.
Fritz JM, Wainner RS. Examining diagnostic tests: an evidence-based perspective. Physical Therapy 2001;81:1546–64.
Hawkins RJ, Kennedy JC. Impingement syndrome in athletes. American Journal of Sports Medicine 1980;8:151–8.
Holtby R, Razmjou H. Validity of the supraspinatus test as a single clinical test in diagnosing patients with rotator cuff pathology. Journal of Orthopaedic and Sports Physical Therapy 2004;34:194–200.
Jobe FW, Moynes DR. Delineation of diagnostic criteria and a rehabilitation program for rotator cuff injuries. American Journal of Sports Medicine 1982;10:336–9.
Krebs DE. Measurement theory. Physical Therapy 1987;67:1834–9.
Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–74.
Leroux J-L, Thomas E, Bonnel F, Blotman F. Diagnostic value of clinical tests for shoulder impingement syndrome. Revue du Rhumatisme 1995;62:423–8 (Engl. ed.).
MacDonald P, Clark P, Sutherland K. An analysis of the diagnostic accuracy of the Hawkins and Neer subacromial impingement signs. Journal of Shoulder and Elbow Surgery 2000;9:299–301.
Neer CS. Anterior acromioplasty for the chronic impingement syndrome in the shoulder. The Journal of Bone and Joint Surgery 1972;54-A:41–50.
Neer CS. Impingement lesions. Clinical Orthopaedics and Related Research 1983;173:70–7.
Nørregaard J, Krogsgaard MR, Lorenzen T, Jensen EM. Diagnosing patients with longstanding shoulder joint pain. Annals of the Rheumatic Diseases 2002;61:646–9.
Park HB, Yokota A, Gill HS, El Rassi G, McFarland G. Diagnostic accuracy of clinical tests for the different degrees of subacromial impingement syndrome. The Journal of Bone and Joint Surgery 2005;87-A:1446–55.
Sigholm G, Styf J. Subacromial pressure during diagnostic shoulder tests. Clinical Biomechanics 1988;3:187–9.
Sim J, Wright CC. The Kappa statistic in reliability studies: use, interpretation, and sample size requirements. Physical Therapy 2005;85:257–68.
Streiner DL, Norman GR. Health measurement scales: a practical guide to their development and use. 2nd ed. New York: Oxford University Press Inc.; 1998. p. 104–27 [chapter 8].
Valadie III A, Jobe C, Pink M, Ekman EF, Jobe FW. Anatomy of provocative tests for impingement syndrome of the shoulder. Journal of Shoulder and Elbow Surgery 2000;9:36–46.
Van der Windt DAWM, Koes BW, Boeke AJP, Devillé W, De Jong BA, Bouter LM. Shoulder disorders in general practice: prognostic indicators of outcome. British Journal of General Practice 1996;46:519–23.
Available online at www.sciencedirect.com
Manual Therapy 14 (2009) 240 www.elsevier.com/math
Diary of events

Back and beyond
Theme: The lumbar spine and pelvis
Dates: Sat 28–29th March 2009
Venue: East Midlands Conference Centre, Nottingham
For more details visit www.physiofirst.org.uk

NZMPA biennial scientific conference, Heritage Hotel, Rotorua, New Zealand, 28, 29 & 30 August 2009. The theme is ‘Striving for Excellence in OMT’ & also celebrating 40 years of Manual Therapy in New Zealand. The conference co-ordinator is Vicki Reid, Phone 0800 646 000 or 09 476 5353, Fax 09 476 5354, e-mail: [email protected] Website: www.nzmpa.org.nz

NOI International conference UK and Ireland
Nottingham, UK: April 15–17, 2010
Dublin, Ireland: April 21–23, 2010
For further details www.noi2010.com Fax +3906 51882443

Janet G. Travell, MD Seminar Series, Bethesda, USA
For information, contact: Myopain Seminars, 7830 Old Georgetown Road, Suite C-15, Bethesda, MD 20814-2432, USA. Tel.: +1 301 656 0220; Fax: +1 301 654 0333; website: www.painpoints.com/seminars.htm E-mail: [email protected]

If you wish to advertise a course/conference, please contact: Karen Beeton, Associate Head of School (Professional Development), School of Health and Emergency Professions, University of Hertfordshire, College Lane, Hatfield, Herts AL10 9AB, UK. E-mail: [email protected] There is no charge for this service.

1356-689X/$ - see front matter doi:10.1016/S1356-689X(09)00012-5