Development and reproducibility of a short questionnaire to measure use and usability of custom-made orthopaedic shoes

Jaap J. van Netten, MSc1,2, Juha M. Hijmans, PhD1,3, Michiel J.A. Jannink, PhD4,5, Jan H.B. Geertzen, MD, PhD1,2 and Klaas Postema, MD, PhD1,2,3

From the 1Department of Rehabilitation Medicine, Center for Rehabilitation, University Medical Center Groningen, 2Graduate School for Health Research, 3School of Behavioral and Cognitive Neurosciences, University of Groningen, Groningen, 4Roessingh Research and Development and 5University of Twente, Department of Biomechanical Engineering, Enschede, The Netherlands

OBJECTIVE: To develop a short and easy to use questionnaire to measure use and usability of custom-made orthopaedic shoes, and to investigate its reproducibility.

DESIGN: Development of the questionnaire (Monitor Orthopaedic Shoes) was based on a literature search, expert interviews, 2 expert meetings, and exploration and testing of reproducibility. The questionnaire comprises 2 parts: a pre part, measuring expectations; and a post part, measuring experiences.

PATIENTS: The pre part of the final version was completed twice by 37 first-time users before delivery of their orthopaedic shoes. The post part of the final version was completed twice by 39 first-time users who had worn their orthopaedic shoes for 2–4 months.

RESULTS: High reproducibility scores (Cohen’s kappa > 0.60 or intra-class correlation > 0.70) were found for all but one of the questions across both parts of the final version of the Monitor Orthopaedic Shoes questionnaire. The smallest real difference on a visual analogue scale (100 mm) ranged from 21 to 50 mm. It took patients approximately 15 min to complete one part.

CONCLUSION: Monitor Orthopaedic Shoes is a practical and reproducible questionnaire that can measure relevant aspects of use and usability of orthopaedic shoes from a patient’s perspective.

Key words: orthopaedic shoes, usability, questionnaire, patient satisfaction.

J Rehabil Med 2009; 41: 913–918

Correspondence address: Jaap J. van Netten, Center for Rehabilitation, University Medical Center Groningen, PO Box 30.001, NL-9700 RB Groningen, The Netherlands. E-mail: j.j.van.netten@rev.umcg.nl

Submitted March 3, 2009; accepted June 17, 2009

INTRODUCTION

Custom-made orthopaedic shoes (OS) can be used to diminish or prevent serious foot and/or ankle problems for a wide range of patient groups (e.g. patients with diabetes, rheumatoid arthritis, degenerative foot disorders, muscle diseases) (1–3). OS are frequently prescribed in the Netherlands, with almost 50,000 pairs provided in 2006 (4). The total cost of OS in that year was almost 60 million Euros (4). In England and Wales, approximately 200,000 persons were provided with OS by the National Health Service (NHS) in 2000, with an expenditure of almost 40 million pounds sterling (5).

The non-use of OS is a well-known problem. Non-use rates range from 20% to 25% for first-time users (2, 6, 7), and from 4% to 19% when experienced users who were provided with a subsequent pair of OS are also taken into account (8–11). Non-use is influenced by the usability of OS. Usability is “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction, in a specified context of use” (International Organization for Standardization (ISO), 9241-11). Within the 3 domains of usability named in this definition (effectiveness, efficiency and satisfaction), the following aspects are associated with non-use of OS: benefits of OS with regard to walking capacities, wound healing, pain, etc. (domain effectiveness); comfort and ease of use, and the efficiency of the delivery process of OS (domain efficiency); communication with prescribing specialists, and cosmetic appearance (domain satisfaction) (2, 6–21). However, a systematic review concluded that none of the randomized clinical trials focused on all these domains of usability (22). Subsequently, Jannink et al. (2) focused on all domains of usability, but only in patients with degenerative disorders of the foot. Therefore, further insight into the usability of OS in a wide range of patient groups is required.

Of the instruments that are available to gain insight into aspects of use and usability of OS, only the Questionnaire for Usability Evaluation (QUE) quantifies aspects of all domains of usability as defined by the ISO (22, 23). However, the QUE was developed specifically for patients with degenerative disorders of the foot. Even more importantly, the QUE is a lengthy and time-consuming questionnaire (completing one part, pre or post, takes about 30 min), which limits its practical application. Therefore, the aim of the present study was to develop a short and easy to use questionnaire that could measure the most relevant aspects of use and usability of OS from a patient’s perspective, in a wide range of patient groups. In addition, reproducibility of the questionnaire was studied.

METHODS

The development of the questionnaire, Monitor Orthopaedic Shoes (MOS), comprised 7 phases: 1) literature search and expert interviews; 2) development of the first (pilot) version (termed MOSv1); 3) expert meetings to discuss MOSv1; 4) development of the second version (MOSv2); 5) exploration of the reproducibility of MOSv2; 6) development of the final version (MOSfv); and 7) testing MOSfv for reproducibility. The procedures and patient inclusion criteria were the same for exploration (phase 5) and testing (phase 7) of reproducibility, and will therefore be described in the paragraph concerning phase 5. The methodology of the development of MOS is shown in Fig. 1.

Fig. 1. Methodology of the development of the Monitor Orthopaedic Shoes (MOS) questionnaire. OS: orthopaedic shoes.

MOS was developed specifically for first-time users of OS, because there is a large difference between first-time and experienced users, especially with regard to their expectations (8). MOS was developed primarily in Dutch, and reproducibility was determined for the Dutch-language version.

Phase 1 – Literature search and expert interviews

Articles were sought in MEDLINE (1989–2008) and EMBASE (1989–2008), using combinations of the following keywords: orthopaedic, therapeutic, surgical, prescribed, shoe, foot, patient satisfaction, usability, non-use, diabetes mellitus, rheumatic diseases, foot deformities, foot-diseases. The combination of keywords used in MEDLINE is shown in Appendix I. This combination was adapted to suit EMBASE. The initial selection of articles was based on the title and the content of the abstract. The reference lists of relevant publications found were checked carefully. A total of 17 articles concerning use and usability of OS were identified (2, 6–21).

To obtain additional information from clinical practice, semi-structured expert interviews were held with specialists in rehabilitation medicine (n = 3), certified orthopaedic shoe technicians (n = 3) and an experienced user of OS (n = 1). These experts were interviewed about the importance of the different aspects of usability of OS, different goals of OS, the relevance of different activities of the user, and the expectations of users.

Phase 2 – Development of MOSv1

Based on the literature search and the expert interviews, the most relevant aspects of use and usability of OS, divided into the domains of usability, were determined and operationalized into the questions of MOSv1. MOSv1 comprised 2 parts, a pre part (MOSprev1) and a post part (MOSpostv1). MOSprev1 is completed by patients prior to receiving their first pair of OS. MOSprev1 was designed to measure the expectations of the patient and the current status of several outcome variables. MOSpostv1 was designed as a follow-up to measure experiences with use and usability of patients’ OS and the change in outcome variables. MOSpostv1 is completed after patients have worn their OS for 2–4 months, with no insight into the answers of MOSprev1. The interviewed experts indicated that a period of 2 months is necessary to become accustomed to OS and that after 4 months problems with wearing-out of OS may have already developed.

MOSv1 comprised questions in different forms: multiple choice, visual analogue scale (VAS), open, and photo-based questions.

Phase 3 – Expert meetings

Content validity was taken into account during all phases of the development of MOS. It was not possible to determine criterion validity, because no measuring instrument similar to MOS exists. The QUE, which is most closely related, was developed only for patients with degenerative disorders of the foot, whereas MOS was developed for a wide range of patient groups.

In order to ensure good content validity, 2 meetings were organized with experts who did not take part in a previous phase of this study. One meeting was held with specialists in rehabilitation medicine (n = 2) and certified orthopaedic shoe technicians (n = 4), and a second with experienced users of OS (n = 5). In these meetings, all experts were asked to comment on all aspects of MOSv1. A general discussion was held during both meetings regarding the relevance of the questions, and any important aspects missing about the use and usability of OS. Following this, a specific discussion was held regarding the clarity and practical application of each question in MOSv1.

Phase 4 – Development of MOSv2

Based on the recommendations and suggestions made during the expert meetings, MOSv2 was developed. MOSv2 was sent to the participants of both expert meetings again for final comment. This did not result in any changes.

Like MOSv1, MOSv2 comprised 2 parts: a pre part, measuring expectations and the current situation (MOSprev2), and a post part, measuring experiences and changes (MOSpostv2). MOSv2 comprised questions in different forms: multiple choice, VAS, open, and photo-based questions. The expert meetings and pilot-testing indicated that patients understood the direction of the choices and how to answer the questions.

Phase 5 – Exploration of reproducibility of MOSv2

Patients. Inclusion criteria for patients were: (i) first-time user of OS; (ii) 16 years of age or older; (iii) able to read Dutch; and (iv) able to complete the questionnaire without help related to cognitive or physical impairments.

Study design. Although MOSv2 comprised 2 related parts, MOSprev2 and MOSpostv2, for practical reasons reproducibility was explored in 2 separate groups in a within-group design. Patients were recruited from 2 participating orthopaedic shoe companies (OIM Orthopedie, Haren, The Netherlands and Roessingh Revalidatie Techniek (RRT), Enschede, The Netherlands). For MOSprev2, patients meeting the inclusion criteria were recruited by the orthopaedic shoe technician after their visit for foot measurements for a first pair of OS, whereas for MOSpostv2 patients were telephoned by the orthopaedic shoe technician if they had worn their first pair of OS for 2–4 months. After patients gave informed consent, MOSv2 was sent to them by post. Once MOSv2 was completed for the first time and received by the researchers, MOSv2 was again sent to patients by post. Because of privacy matters, researchers had no insight into the personal characteristics of patients who did not give informed consent. The procedures of this study were approved by the medical ethics committee of the University Medical Center Groningen.

Data analysis. Reproducibility was analysed using SPSS for Windows, version 16.0 (SPSS Inc. Chicago, IL, USA). The reproducibility of the multiple choice questions was analysed using Cohen’s kappa. A kappa value of less than 0.40 indicates poor to fair agreement, 0.41–0.60 moderate agreement, 0.61–0.80 substantial agreement, and 0.81–1.00 almost perfect agreement (24). The reproducibility of the VAS questions was analysed using an intra-class correlation (ICC), absolute agreement, 2-way random model for single measures. This model measures the degree of absolute agreement among the scores on the first and the second completed questionnaire (25). In general, values over 0.70 are considered as high agreement (24). The 95% confidence interval of the ICC was also calculated. The open and photo-based questions were not included in the reproducibility analysis.
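The two agreement statistics described above can be illustrated with a minimal sketch. This is not the SPSS procedure used in the study: the function names (cohens_kappa, agreement_label, icc_a1) are hypothetical, the kappa is the unweighted form, and the ICC uses the standard mean-square formula for a 2-way random model with absolute agreement and single measures, here with 2 completions per patient. The 95% confidence interval is omitted.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Unweighted Cohen's kappa for paired categorical answers (test, retest)."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equally long, non-empty answer lists")
    n = len(first)
    # Observed proportion of identical answers.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal category frequencies.
    count_1, count_2 = Counter(first), Counter(second)
    expected = sum(count_1[c] * count_2[c] for c in count_1) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa):
    """Agreement bands as quoted in the text (reference 24)."""
    if kappa > 0.80:
        return "almost perfect"
    if kappa > 0.60:
        return "substantial"
    if kappa > 0.40:
        return "moderate"
    return "poor to fair"


def icc_a1(first, second):
    """ICC, 2-way random model, absolute agreement, single measures."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists with at least 2 patients")
    rows = list(zip(first, second))  # one row per patient
    n, k = len(rows), 2              # n patients, k completions
    grand = sum(map(sum, rows)) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(c) / n for c in zip(*rows)]
    # Mean squares of the two-way ANOVA without replication.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ms_err = sum(
        (x - row_means[i] - col_means[j] + grand) ** 2
        for i, r in enumerate(rows)
        for j, x in enumerate(r)
    ) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

Because the ICC is computed for absolute agreement, a systematic shift between the first and second completion lowers the value even when patients keep the same rank order. A consistency-type ICC would ignore such a shift.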

In addition, the standard error of measurement (SEM) and the smallest real difference (SRD) of the VAS questions were calculated. The SEM is the square root of the within-subject variance (26); the SRD is 1.96 × √2 × SEM (27). Both are expressed in the same dimension as the measurement. The SEM provides an interpretation of the magnitude of within-subject variability, whereas the SRD indicates the point where the difference between 2 consecutive measurements within a subject can be interpreted as a real difference (26, 27).
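As a worked illustration of these two definitions, the sketch below computes the SEM and SRD from paired VAS scores in mm. The text does not state which estimator of the within-subject variance was used; the pooled estimator for 2 completions per patient (half the squared difference, averaged over patients) is an assumption, and the function name sem_and_srd is hypothetical.

```python
import math


def sem_and_srd(first, second):
    """Return (SEM, SRD) in mm for paired VAS scores from two completions."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equally long, non-empty score lists")
    # Assumed estimator: with 2 scores per patient, the within-subject
    # variance of one patient is (difference ** 2) / 2; pool by averaging.
    within_var = sum((a - b) ** 2 / 2 for a, b in zip(first, second)) / len(first)
    sem = math.sqrt(within_var)
    srd = 1.96 * math.sqrt(2) * sem  # SRD as defined in the text (27)
    return sem, srd
```

Since 1.96 × √2 is approximately 2.77, the formula implies that an SRD of 21–50 mm corresponds to an SEM of roughly 7.6–18.0 mm.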

Results of the exploration of reproducibility. MOSprev2 was completed twice by 20 out of 23 patients; MOSpostv2 was completed twice by 9 out of 13 patients. The interval between completing MOSv2 for the first and the second time varied between 1 and 3 weeks. This is a usual test-retest interval (26). It was not expected that any clinically relevant changes in patients’ health status occurred in this period. High reproducibility scores (> 0.70) were found in 47 out of 62 questions. The SRD showed a range of 18–50 mm. Because this was only an exploration of the reproducibility in a small group of patients, results will not be presented in detail.

Additional questions were answered by all patients regarding: the inclusion of all relevant aspects of use and usability of OS; whether each question was clear to them; and different aspects of clarity and practical application of MOSv2 (for example: “Did you have enough space for your answers?”; “Is the font clear and readable?”; “What do you think of the length of MOS?”). Six questions (3 in MOSprev2 and 3 in MOSpostv2) were not clear to the patients. Patients were satisfied with the aspects of clarity and practical application.

Phase 6 – Development of MOSfv

Based on the results of this exploration, MOSfv was developed. Compared with MOSv2, 8 questions were deleted (5 in MOSprev2 and 3 in MOSpostv2), 11 questions were adjusted (7 in MOSprev2 and 4 in MOSpostv2), and 4 questions were added (2 in both MOSprefv and MOSpostfv). It was ensured that the most relevant aspects of use and usability were still covered in MOSfv.

As with the previous versions, MOSfv comprised 2 parts: MOSprefv, measuring expectations and the current situation by means of 15 multiple choice questions and 12 VAS questions, and MOSpostfv, measuring experiences and changes by means of 11 multiple choice questions and 19 VAS questions. Both parts also contain open (5 and 7, respectively) and photo-based (2 in both) questions. MOSfv was sent to the participants of both expert meetings again for comments. This did not result in any changes, and experts indicated that the most relevant aspects of use and usability were included in MOSfv.

Phase 7 – Testing MOSfv for reproducibility

The same inclusion criteria, procedures, and data analysis as in phase 5 were applied, apart from 2 differences with regard to the recruitment of patients. First, patients were recruited from OIM Orthopedie and Penders Voetzorg (Heythuysen, The Netherlands), instead of OIM Orthopedie and RRT, because Penders Voetzorg is a larger company with more patients. Secondly, the same method of administration was applied for MOSprefv and MOSpostfv: at the same time, all patients from the 2 participating orthopaedic shoe companies meeting the inclusion criteria were contacted by post by the orthopaedic shoe company. For MOSprefv, a mailing was sent to patients who had visited the orthopaedic shoe company for foot measurements for a first pair of OS in the last 2 months; for MOSpostfv, a mailing was sent to patients who had worn their first pair of OS for 2–4 months. The mailing contained an information letter and an informed consent form; the consent form could be returned to the researchers in a pre-stamped envelope. After a consent form had been received, MOSfv was sent to the patient by post. Once MOSfv was completed for the first time and received by the researchers, MOSfv was again sent to patients by post.

MOSprefv was sent to 58 patients, of whom 37 completed it twice and 3 completed it only once. MOSpostfv was sent to 49 patients, of whom 39 completed it twice and 2 completed it only once. The interval between completing MOSfv for the first and the second time varied between 1 and 3 weeks. This is a usual test-retest interval (26). It was not expected that any clinically relevant changes in patients’ health status took place in this period. Patient characteristics are summarized in Table I. There were no major demographic differences between the groups.

Table I. Characteristics of patients participating in the reproducibility study of the final version of the Monitor Orthopaedic Shoes (MOSfv) questionnaire
Characteristics		MOSprefv (n = 37)	MOSpostfv (n = 39)
Age, years, mean (SD)		66 (13)	67 (13)
Gender, female, n (%)		26 (70)	22 (56)
Disorder, n (%)*
	Diabetes mellitus	9 (24)	13 (33)
	Rheumatoid arthritis	10 (27)	7 (18)
	Foot disorder†	22 (60)	29 (74)
	Muscle disease	4 (11)	2 (5)
	Other disease	10 (27)	12 (31)
*More than one disorder was possible; disorders were indicated by patients themselves. †Foot disorder was unspecified. SD: standard deviation.

RESULTS

Tables II and III list the results with respect to the reproducibility analysis, for MOSprefv and MOSpostfv, respectively. Cohen’s kappa for 13 questions was between 0.61 and 0.80, indicating substantial agreement. Cohen’s kappa for 8 questions was between 0.81 and 1.00, indicating almost perfect agreement. Cohen’s kappa of one question (regarding the patient’s expectation of other people’s opinion of the cosmetic appearance) was 0.47, which indicates moderate agreement. Detailed analysis of this question revealed that most variation was between the response possibilities “neutral” and “beautiful” or “neutral” and “I do not know”. The ICC of all questions was above 0.70, which indicates high agreement. The SRD was in the range 21–50 mm. It took patients approximately 15 min to complete either the pre or the post part.

Table II. Results of the reproducibility analysis of the pre part of the final version of the Monitor Orthopaedic Shoes (MOSprefv) questionnaire (n = 37)
Domain	n (kappa)	Cohen’s kappa	n (ICC)	ICC (CI), smallest	ICC (CI), largest	SRD, mm, smallest	SRD, mm, largest
Use	1	0.66
Effectiveness
Walking capacity	3	0.72–0.76
Wounds	2	0.88–1.0
Pain			4	0.76 (0.56–0.87)	0.86 (0.74–0.93)	29	50
Sprains			2	0.76 (0.42–0.92)	0.93 (0.87–0.97)	25	31
Efficiency
Comfort in use			1	0.76 (0.57–0.87)		27
Satisfaction
Communication	4	0.72–0.80	2	0.84 (0.71–0.91)	0.84 (0.68–0.92)	26	27
Cosmetic appearance	2	0.47–1.0	1	0.72 (0.50–0.85)		42
Acceptability			2	0.84 (0.70–0.92)	0.85 (0.71–0.92)	21	31
Note: Three questions concerning demographic information are not shown. n: number of questions; the range of the Cohen’s kappa is shown; ICC: intra-class correlation; CI: confidence interval; the smallest and largest ICC with corresponding CI are shown; SRD: smallest real difference; the smallest and largest SRD are shown.

Table III. Results of the reproducibility analysis of the post part of the final version of the Monitor Orthopaedic Shoes (MOSpostfv) questionnaire (n = 39)
Domain	n (kappa)	Cohen’s kappa	n (ICC)	ICC (CI), smallest	ICC (CI), largest	SRD, mm, smallest	SRD, mm, largest
Use	5	0.75–0.90	1	0.92 (0.85–0.96)
Effectiveness
Walking capacity	2	0.78–0.89
Wounds	2	0.93–1.0
Pain			4	0.78 (0.52–0.91)	0.95 (0.89–0.98)	23	48
Sprains			2	0.84 (0.62–0.94)	0.93 (0.85–0.96)	22	27
Efficiency
Comfort in use			7	0.71 (0.50–0.85)	0.91 (0.82–0.95)	33	49
Satisfaction
Communication			2	0.76 (0.55–0.88)	0.81 (0.66–0.90)	37	41
Cosmetic appearance	1	0.82	1	0.86 (0.75–0.93)		36
Acceptability			2	0.72 (0.52–0.85)	0.73 (0.51–0.86)	41	42
Note: One question concerning demographic information is not shown. n: number of questions; the range of the Cohen’s kappa is shown; ICC: intra-class correlation; CI: confidence interval; the smallest and largest ICC with corresponding CI are shown; SRD: smallest real difference; the smallest and largest SRD are shown.

DISCUSSION

The result of this study is a practical and reproducible questionnaire that can measure the most relevant aspects of use and usability of OS from a patient’s perspective, for a wide range of patient groups. This questionnaire, MOS, can be applied at 2 levels: (i) to measure expectations (prior to prescription) and experiences with use and usability of OS in a group of patients; and (ii) to identify problems with OS on an individual level, prior to provision or after experience with OS. Insight into the use and usability of OS on both levels may lead to clearer identification of problems relating to the usability of OS, an increase in patient satisfaction, and an increased rate of use of OS.

It can be concluded that MOS is a reproducible questionnaire, because high agreement was found in all but one question of the final version. This one question explored the patient’s expectation of other people’s opinion of the cosmetic appearance of their OS. It was apparent that patients’ concern for other people’s opinion 3 months before actual delivery of their OS is subject to much variation. As the detailed analysis showed, answers in the categories “neutral”, “beautiful”, and “I do not know” varied the most; these answers should therefore be treated with care. It may be even better not to interpret this question at all. In contrast, the same question in the post part of MOS, which measures experiences, resulted in “almost perfect agreement”.

Relatively large SRD values were found, ranging from 21% to 50% of the VAS. When a part of MOS is completed twice by the same patient (e.g. by administering MOSpost after 3 and after 12 months of using OS), differences between consecutive scores should be large (greater than 21–50%) before they can be interpreted as a real difference. However, most scores were found near either one of the extreme end-points (e.g. patients either have a lot of pain or almost no pain; shoes fit either very badly or very well; etc.). It can be expected that a clinically relevant change will correspond with a shift from one extreme end-point to the other. The resulting large difference between the 2 scores will then likely be larger than the SRDs found. Consequently, the large SRDs do not present a problem in the application of MOS.
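The reasoning in this paragraph amounts to a simple comparison, sketched below. The function name is_real_difference and the scores in the usage note are hypothetical illustrations, not study data; only the SRD range of 21–50 mm comes from the results.

```python
def is_real_difference(score_1_mm, score_2_mm, srd_mm):
    """True if the change between 2 VAS scores (mm) exceeds the question's SRD."""
    return abs(score_2_mm - score_1_mm) > srd_mm
```

For example, a hypothetical shift from 85 mm to 15 mm (70 mm) exceeds even the largest SRD of 50 mm, whereas a shift from 60 mm to 40 mm (20 mm) does not exceed the smallest SRD of 21 mm.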

Validity is regarded as the most fundamental consideration in developing and evaluating a measuring instrument (28). Various categories of validity have been described in the literature. Of these, construct validity is the overarching category that can be addressed in 2 different ways, by criterion and content validity (26, 29). Criterion validity is not relevant in this study, as no similar measuring instrument exists that can be used as a criterion. Emphasis has therefore been placed on the content validity. Good content validity increases motivation and reduces dissatisfaction among users (e.g. OS technicians or medical specialists) and respondents (e.g. patients), and makes it more likely that other stakeholders (e.g. policy makers, health insurance companies) accept the results (26). These are all essential aspects when considering application of MOS. Apart from the opinion of experts, there is no method to analyse content validity. MOS is based on a literature search and all experts indicated that the most relevant aspects of use and usability of OS are included. We conclude that content validity of MOS can therefore be regarded as satisfactory.

All questions in MOS were formulated in a manner that was preferred by most experts. This is a strong point from a respondent-focused view, because it enhances clarity and the practical application of the questionnaire. On the other hand, it is also a limitation. The questions in MOS are formulated in different ways by using either a VAS or multiple choices with 2, 3, 4, or 5 response categories. Consequently, it is not possible to calculate an interpretable sum-score or to perform a factor-analysis on MOS. This means that the underlying constructs of the questionnaire cannot be determined. With MOS, it is therefore not possible to identify the relationships between the different aspects of usability or the relative importance of these aspects with regard to use. MOS can, however, be used to gain insight into patients’ opinions with regard to the measured aspects of use and usability separately.

A possible limitation of the reproducibility study is that MOSprefv and MOSpostfv were not completed by the same subjects. However, reproducibility was determined through a within-group design that is not affected by differences between the groups. Furthermore, both parts of MOSfv were completed by patients at the appropriate time (i.e. prior to receiving OS or after having used OS for 2–4 months). If the same patients had completed both parts, there would have been an interval of approximately 6 months between completing the pre and the post part. In that period, relevant changes in a patient’s health status may occur. Finally, there are no reasons to expect different reproducibility scores between the groups, because there were no large differences in patient characteristics between the groups and the method of administration was the same for both groups.

With regard to the application of MOS, some remarks can be made. First, MOS was specifically designed for first-time users. The difference between first-time and experienced users is essential in the interpretation of outcomes of MOS. First-time users compare their (future) pair of OS with their normal shoes, whereas experienced users compare their new pair of OS with their previous pair(s) of OS and possibly with previous normal shoes. This different frame of reference may influence expectations and experiences of a user’s (future) pair of OS, and thereby influence the outcomes of MOS. Therefore, when MOS is administered to a group of patients, the outcomes of first-time users and experienced users should be separated. Furthermore, it should be kept in mind that some questions may need to be added when MOS is administered to experienced users, for example with regard to the number of pairs that have previously been worn. The second remark concerns the language of the questionnaire. MOS has been developed specifically for the Dutch situation. Cross-cultural adaptation of MOS to other countries will be a useful step to enhance research regarding use and usability of OS around the world. The Dutch version of MOS and its English version (unilaterally translated by the authors) are available from the corresponding author.

In conclusion, MOS is the first practical and reproducible questionnaire that can measure the most relevant aspects of use and usability of OS from a patient’s perspective, for a wide range of patient groups. MOS is short and easy to complete, and can be used for evaluation of a group of patients, as well as for assessment of an individual patient.

ACKNOWLEDGEMENTS

We gratefully acknowledge financial support provided by the Development Fund for the Orthopaedic Shoe Company (OFOM); by the National Centre for Innovation in Rehabilitation Technology; by the Foundation OIM Orthopedie; and by the Foundation Beatrixoord North–Netherlands, The Netherlands. The authors would like to thank J. Stötefalk, G. Roelofs and I. Meijers and their orthopaedic shoe companies OIM Orthopedie (Haren, The Netherlands), RRT (Enschede, The Netherlands), and Penders Voetzorg (Heythuysen, The Netherlands), for their co-operation in this project. The Dutch version of MOSfv and its unilaterally translated English version can be obtained from the corresponding author.

REFERENCES
