Original Research

Reliability Assessment of an Innovative Wound Score

June 2016
1044-7946
Wounds 2016;28(6):206-213

Abstract

The authors describe an innovative wound score and demonstrate its versatility for scoring a variety of wound types in addition to diabetic foot ulcers (DFUs). To further test its merits, they determined its interobserver reliability in a prospective series of patients. The Wound Score system the authors created integrates the most important features of 4 predominantly used wound scoring systems. It utilizes a logical 0 to 10 format based on 5 assessments each graded from 2 (best) to 0 (worst). The versatility and reliability of the Wound Score were studied in a prospective series of 94 patients with lower extremity wounds. The Wound Score was quick to determine, applicable to a variety of wound types and locations, and highly objective for grading the severity of each of the 5 assessments. The Wound Score categorized wound types as “healthy,” “problem,” or “futile” for evaluation and management. Diabetes was present in 75.9%, with 70% of the DFUs scoring in the “problem” wound range. Interobserver reliability was high (r = 0.81). The objectivity, versatility, and reliability of the Wound Score system facilitate making decisions about the management of wounds, whether DFUs or not, and provide quantification for comparative effectiveness research for wound management.

Introduction

Wound scoring systems help with the evaluation and management of wounds. A secondary benefit is that they may help to validate effectiveness of different interventions of like-with-like wounds, ie, comparative effectiveness research (CER). More than a dozen diabetic foot ulcer (DFU) scoring systems exist, while there are about half that number for pressure ulcer classifications.1 Although many findings can be used to make evaluation and management decisions for wounds, 1 or 2 assessments typically provide the basis for the scoring of a wound in the most frequently published diabetic foot and pressure ulcer literature. The following is a brief analysis of 4 predominantly cited wound/ulcer classifications systems and the main criteria each use for grading a wound.

Wagner Classification. In 1979, Wagner published a system for decision making and management of DFUs.2 All decisions on whether to salvage or to refer to vascular service for revascularization or amputation were based on ankle-brachial indices (ABIs); if greater than 0.45, salvage was recommended. If salvage was the option then 6 disparate clinical findings, none of which was graded on a continuum of severity, were used as a basis for generating algorithms for management. In Wagner’s system, a grade 0 lesion (deformity/pre-ulcer) could result in an amputation, while a grade 5 (hindfoot gangrene) could be salvaged. In 1981, Wagner expanded his system to include non-DFUs, but lowered the ABI to 0.35 in this group to decide the crucial salvage/amputation decision.3 It is noteworthy that in a subsequent article, co-authored by Wagner in 1983,4 ABIs were stated as being inaccurate in diabetics, and the use of them in the decision-making process was not recommended. 

In summary, perfusion is the criterion for making the crucial decision for salvage versus amputation in the Wagner system. Even with its popularity, no reliability or validity studies are available to verify its precision in predictability of outcomes. At best, the only validity type that it meets is “face” type based on its widespread and long-standing use. With respect to outcomes, Wagner stated that 93% of cases resulting in amputation from partial toes to above knee healed utilizing his system.2

National Pressure Ulcer Advisory Panel (NPUAP). In 1989, the panel published a 4-stage grading system of pressure ulcers based solely on the assessment of wound depth and then updated it in 1989 and 2007.5 In this respect it is not actually a DFU classification system. Since depth is always an important consideration when evaluating and managing a wound, it has applications for all types of wounds.

In 2007 the panel supplemented its classification to include deep tissue injury (when skin continuity was still present) and unstageable (when an eschar or similar covering obscured the depth).6 Even though the NPUAP system is used extensively, it does not consider other important wound characteristics such as perfusion and bioburden. Like the Wagner system, its validity is based on usage that is “face” type. No reliability studies for the NPUAP system with respect to DFUs exist. In summary, for NPUAP staging, wound depth is the sole criterion.

University of Texas Health Science Center San Antonio Diabetic Wound Classification (UTDWC). In 1996 Lavery et al7 published a 16-square matrix for classifying diabetic foot wounds. Wound depths, similar to the NPUAP system, are expressed in a horizontal continuum from I to IV. Four columns, labeled from A to D, stage wounds on absence or presence of infection and/or ischemia. Consideration is not given to the severity of the infection or the criticalness of the ischemia.

The classification has been criticized for lacking management recommendations for each of the 16 options in the matrix. However, other publications8-10 have addressed this. Although no reliability studies are identified, the authors10 and others state that with movement down and to the right on the matrix, outcomes worsen and amputations become statistically significant as wounds increase in depth and stage. In summary, the UTDWC incorporates the perfusion (utilizing the term ischemia) assessment of Wagner combined with the presence or absence of infection and couples these 2 assessments with the depth assessment of the NPUAP staging system.  

Infectious Diseases Society of America (IDSA) Clinical Classification of a Diabetic Foot Infection. In 2004, the IDSA generated a clinical classification of a diabetic foot infection in its “Diagnosis and Treatment of Diabetic Foot Infections” IDSA guidelines.11 The IDSA categorizes diabetic foot infections into a 4-level continuum, using words rather than letters or grades, from “uninfected” to “severe” and describes the criteria for each using objective parameters. Again, no reliability or validity measurements are available using this system, although the IDSA guidelines provide direction for the evaluation and management of each level of DFU wound severity.

In developing a simple-to-use wound score that grades severity of assessments using objective criteria, the authors credit the Apgar scoring system12 used to evaluate the vitality of the neonate. The Apgar system is a 0 (worst) to 10 (best) scoring tool which grades 5 elements in a continuum of robustness from 2 (optimal) to 0 (worst possible) using objective findings for each grade. The challenge was to integrate the essential elements of the widely used wound evaluation systems (previously described) using an Apgar-like scoring model. This paper describes the features of the authors’  Wound Score and demonstrates its versatility for scoring a variety of wounds in addition to DFUs in a prospective series of 94 patients. In addition, interobserver reliability of the Wound Score was studied by measuring the correlations between 2 observers who independently scored each patient’s wound.

Methods

In a 1999 informal review of a 10-point evaluation system, Strauss and Strauss13 reported that approximately 90% of wounds scored as “healthy” (8 to 10 points) or “problem” (4 to 7 points) had favorable outcomes. Over a 12-year period, the authors have modified and refined this 10-point scoring system. Subsequently, they obtained approval from the Institutional Review Board (IRB) to formally study the reliability and validity of their Wound Score in a prospective series of patients.  

The Wound Score integrates the essential elements of the 4 most frequently utilized methods to evaluate DFUs: namely, the Wagner, NPUAP, UTDWC, and IDSA systems. Collectively, these 4 wound scoring systems consider only 3 elements: depth, infection, and perfusion. These assessments are essential for scoring wounds. However, 2 additional important assessments were added to generate a 5-element, 0 to 10 (Apgar-like) scoring system. Of nearly 60 possible wound descriptors, the 2 next most important in the authors’ experience are the size of the wound and the appearance of the wound base.14 Each of the 5 assessments (appearance, size, depth, infection, and perfusion) is graded on a 2 (best) to 0 (worst) continuum using objective criteria to establish each grade, and the 5 grades are then summated to generate a 0 to 10 wound score (Figure 1).
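The summation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' scoring form: the function name `wound_score` and the dictionary layout are hypothetical, and the half-point grading steps are an assumption inferred from the half-point values that appear elsewhere in the paper. The objective criteria that define each grade are in Figure 1 and are not reproduced here.

```python
from typing import Mapping

# The 5 assessments named in the text.
ASSESSMENTS = ("appearance", "size", "depth", "infection", "perfusion")


def wound_score(grades: Mapping[str, float]) -> float:
    """Sum five grades, each from 0 (worst) to 2 (best), into a 0-to-10 score."""
    missing = [name for name in ASSESSMENTS if name not in grades]
    if missing:
        raise ValueError(f"missing assessments: {missing}")

    total = 0.0
    for name in ASSESSMENTS:
        grade = grades[name]
        # Assumption: grades move in half-point steps between 0 and 2.
        if not 0 <= grade <= 2 or (grade * 2) != int(grade * 2):
            raise ValueError(f"{name} grade must be 0 to 2 in half-point steps")
        total += grade
    return total


# Hypothetical wound, graded for illustration only.
example = {"appearance": 2, "size": 1.5, "depth": 1, "infection": 1, "perfusion": 0.5}
print(wound_score(example))  # 6.0
```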

After obtaining IRB approval, the authors prospectively scored all inpatient lower extremity wounds that came to their attention through consults requested by attending physicians, whether the patient had diabetes mellitus or not. The research team for this paper included an orthopedic foot and ankle specialist who focuses on extremity wound management (first author), an emergency medicine-hyperbaric medicine specialist, podiatric medicine residents, and a biostatistician. The initial attention focused on evaluating the versatility, user friendliness, and reliability of the Wound Score in the first 94 patients who came to the team’s attention. Patients excluded from the study included those younger than 18 years, those with cancer in their wounds, those who did not agree to participate in the study, and those whose attending wound management physicians did not want them to be included.

Results

The analysis of 94 patients demonstrated the versatility and reliability of the Wound Score. In 11 patients (11.7%), data were insufficient to include them in the analysis. Of the remaining 83 patients, 46 (55.4%) were males with a mean age of 57.7 years and 37 (44.6%) were females with a mean age of 60.4 years. Neither gender differences in Wound Scores (P = 0.34) nor age differences between genders (P = 0.43) were significant. Sixty-three (75.9%) of the 83 patients had diabetes mellitus (Figure 2). Of the diabetic wounds, 54 (85.7%) were in the foot including toes, forefoot, midfoot, and hindfoot. The remaining 9 patients (14.3%) had lower extremity wounds proximal to the foot. For patients without diabetes (n = 20; 24.1%), slightly fewer wounds occurred in the foot (n = 9; 45%) versus more proximally in the lower extremity (n = 11; 55%). However, the mean difference in wound scores between these 2 groups was not significant (P = 0.69). The wound scores between those with diabetic foot wounds and those with wounds proximal to the foot were not significantly different (P = 0.35). Similarly, the wound scores between patients without diabetic foot wounds and wounds proximal to the foot for patients without diabetes were not significantly different (P = 0.74).
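The percentages in this paragraph follow directly from the reported counts. As a small consistency check, using only the numbers stated above (the helper names `pct` and `mismatches` are hypothetical), the arithmetic can be verified as follows:

```python
def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * part / whole, 1)


# (count, denominator, percentage reported in the Results)
REPORTED = [
    (11, 94, 11.7),  # insufficient data, of 94 enrolled
    (46, 83, 55.4),  # males, of 83 analyzed
    (37, 83, 44.6),  # females, of 83 analyzed
    (63, 83, 75.9),  # patients with diabetes mellitus
    (20, 83, 24.1),  # patients without diabetes mellitus
    (54, 63, 85.7),  # diabetic wounds in the foot
    (9, 63, 14.3),   # diabetic wounds proximal to the foot
    (9, 20, 45.0),   # nondiabetic wounds in the foot
    (11, 20, 55.0),  # nondiabetic wounds proximal to the foot
]


def mismatches(rows):
    """Return every row whose recomputed percentage differs from the report."""
    return [(part, whole, shown) for part, whole, shown in rows if pct(part, whole) != shown]


print(mismatches(REPORTED))  # an empty list means every percentage checks out
```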

The mean wound score of patients with diabetes mellitus was 4.9 points (on the 0 to 10 scale, with 10 being the best possible score). For the patients without diabetes mellitus, the mean score was 6 points. The differences between the mean wound score of the 2 groups were measured by Welch’s t-test15 and were statistically significant (P = 0.0008). Of the 3 wound types (ie, healthy, problem, and futile), the majority of wounds scored in the “problem” range (70%). Scoring for wound types between staff and resident physicians was very similar (Figure 3). The overall mean differences across the wound types were measured by 1-way analysis of variance (ANOVA) and were also statistically significant (P < 0.0001). Scores for 4 patients were not available from the staff physicians. Based on Tukey’s Studentized Range (also known as HSD) test for mean score difference between 2 wound types, each mean score difference between “healthy” and “problem,” “healthy” and “futile,” and “problem” and “futile” showed statistical significance (P < 0.0001).
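For readers unfamiliar with Welch’s t-test, the sketch below shows how the test statistic and its Welch-Satterthwaite degrees of freedom are computed for 2 groups with unequal variances and unequal sizes. The paper reports only the group means and the P value, so the score lists here are hypothetical values chosen for illustration; they are not study data, and the function name `welch_t` is likewise an assumption.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    # Squared standard error contributed by each group (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical wound scores, for illustration only (not study data).
group_a = [4, 5, 6]
group_b = [7, 8, 9, 10]

t, df = welch_t(group_a, group_b)
print(round(t, 3), round(df, 2))
```

The P value is then read from a t distribution with `df` degrees of freedom; that step needs a statistics library and is omitted to keep the sketch dependency-free.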

With respect to demonstrating the reliability of scoring wounds using the Wound Score between resident and staff physicians, coefficients of correlations and Cohen’s kappa statistics were used. The Pearson product-moment correlation of the Wound Score data from 76 pairs of scores was 0.81, and the degree of association was tested using the Student t-test (P < 0.0001; 95% CI = 0.714-0.875). The Cohen’s kappa method for further assessing reliability generated a value of 0.227 (P < 0.0001), which indicates “fair agreement” using this statistical technique.
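The reported correlation statistics can be approximately reproduced from r = 0.81 and n = 76 alone. The sketch below computes the t statistic for a Pearson correlation and a 95% confidence interval using the Fisher z-transformation. The paper does not state how its interval was derived, so the Fisher method is an assumption; it yields roughly 0.715 to 0.876, close to the reported 0.714 to 0.875 (the small gap is consistent with r having been rounded to 2 decimals). A Cohen’s kappa function is included with hypothetical ratings to show why kappa, which credits only exact matches, can be modest even when paired scores correlate strongly. All function names are hypothetical.

```python
from collections import Counter
from math import atanh, sqrt, tanh


def correlation_t(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


def fisher_ci(r: float, n: int, z_crit: float = 1.96):
    """Approximate 95% confidence interval for r via the Fisher z-transformation."""
    z = atanh(r)
    half_width = z_crit / sqrt(n - 3)
    return tanh(z - half_width), tanh(z + half_width)


def cohen_kappa(rater_a, rater_b) -> float:
    """Cohen's kappa for two raters labeling the same items.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(rater_a)
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Values reported in the text.
print(round(correlation_t(0.81, 76), 2))
low, high = fisher_ci(0.81, 76)
print(round(low, 3), round(high, 3))

# Hypothetical ratings, for illustration only (not study data).
staff = ["problem", "problem", "healthy", "futile", "problem", "healthy"]
resident = ["problem", "healthy", "healthy", "futile", "problem", "problem"]
print(round(cohen_kappa(staff, resident), 3))
```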

Discussion

The Wound Score addresses the major concerns with the most frequently used DFU scoring systems. Each only gives a limited view of the severity of the DFU and for the most part does not rate the severity of the observation. For example, Wagner’s primary decision for management is based upon perfusion, and if perfusion is acceptable, 6 single disparate observations determine the wound grade; the NPUAP staging is based upon depth only; the UTDWC matrix utilizes depth plus a combination of perfusion and infection information; and the IDSA grading is based upon the severity of infection. Although these scoring systems are frequently used and have merits for DFU wound management, the authors found no reports of interobserver reliability for any of them. Important to note, in Wagner’s second paper3 he expands his classification system to include non-DFUs as well as DFUs. The algorithms for management are identical except Wagner recommends the “salvage” approach for the non-DFU be based on an ABI of 0.35 versus 0.45 for the DFU.

The Wound Score integrates the essential features of each of the above scores, adds 2 additional assessments (wound size and appearance of the wound base), and grades each assessment on a continuum of severity. Except for the IDSA DFU infection score, none of the other grading systems considers the severity on a continuum. Rather, they use a primary feature(s) to rate the wound (ie, the assessment in question is only documented as present or absent). The use of a continuum of objective findings for grading each of the 5 assessments of the Wound Score on a 2 (best) to 0 (worst) scale generates a 0 to 10 Wound Score. It is intuitively obvious that the higher the score, the healthier the wound.   

The Wound Score is adaptable to wounds in patients with diabetes mellitus in locations other than the foot as well as wounds in patients without diabetes mellitus. This demonstrates its versatility, whereas the Wagner, UTDWC, and IDSA systems are limited to DFUs. In this prospective series of 83 patients, 14.3% of patients with diabetes mellitus had lower extremity wounds proximal to the foot. In addition, nearly a quarter (24.1%) of the wounds occurred in patients without diabetes mellitus. The authors feel that documentation and treatment of these wounds need to be as thorough as for DFUs to optimize care and minimize morbidity of patients with lower extremity wounds.

The Wound Score immediately quantifies the severity of the wound (ie, “healthy” (7.5 to 10 points), “problem” (3.5 to 7 points), or “futile” (0 to 3 points) types) and provides the basis for wound management (Figure 1). For example, a “healthy” wound only requires physiologically sound wound dressing agents to heal, while a “futile” wound would require amputation unless revascularization can be done successfully. For the “problem” wound, comprehensive management to avoid a lower limb amputation is required. Considerations include: 1) surgical management of the wound base with debridements and/or removal of deformities; 2) protection, off-loading, and/or stabilization; 3) optimal medical management of diabetes mellitus and other medical conditions; 4) selection of optimal wound dressing agents; and 5) addressing wound ischemia-hypoxia concerns with revascularizations, hyperbaric oxygen, and medical interventions.16
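The mapping from score to wound type can be written as a simple threshold rule. In this sketch the function name `classify_wound` is hypothetical, and the cutoffs assume scores fall on half-point steps so that the 3 stated ranges (0 to 3, 3.5 to 7, and 7.5 to 10) leave no gaps; the paper does not say how a value between ranges would be handled.

```python
def classify_wound(score: float) -> str:
    """Map a 0-to-10 Wound Score to the wound type named in the text."""
    if not 0 <= score <= 10:
        raise ValueError("Wound Score must be between 0 and 10")
    # Assumption: scores come in half-point steps, so these cutoffs
    # reproduce the stated ranges exactly.
    if score >= 7.5:
        return "healthy"   # 7.5 to 10 points
    if score >= 3.5:
        return "problem"   # 3.5 to 7 points
    return "futile"        # 0 to 3 points


for s in (8, 7, 3.5, 3):
    print(s, classify_wound(s))
```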

To the best of the researchers’ knowledge, this is the first interobserver wound scoring study for grading DFUs. Another favorable feature of the Wound Score is a minimal “learning curve” to use it effectively. This was demonstrated by the high interobserver correlation (r = 0.81) between experienced physician wound care providers and first-year through third-year podiatry residents. The special features of this Wound Score are discussed with each grader at their initial orientation to the Wound Score (Table 1). The orientation is brief, taking no longer than a couple of minutes. Typically, the initial orientation for using the Wound Score in this study included a discussion of the 5 assessments, the continuum of grading each assessment, and possible sources of confusion with evaluation of the wound (Table 1). A simplified scoring form is used to record the observations (Figure 4).

This study’s authors believe the use of the Wound Score provides the best possible approach for CER of wound care products for DFUs as well as for wounds in other locations in patients without diabetes mellitus. By appropriately addressing confounders such as underlying deformity, unresolved deep infection (of bone, bursa, and/or cicatrix), and ischemia-hypoxia, the effectiveness of wound care products can be objectively determined by comparing the results with wounds of similar Wound Score grades. Currently used DFU evaluation systems do not offer a complete enough evaluation of the wound to do so. Such information would not only be beneficial to the patient, but also cost effective to insurance payers.

The objectivity and use of grading continuums of the Wound Score lend themselves to additional research. The Wound Score provides an approach to measure validity by comparing like-with-like wounds, thereby facilitating CER studies of treatment interventions as well as meaningful, clinically important improvement (MCII). In a fashion similar to the interobserver reliability study reported in this paper, the reliability between observers of the Wagner, NPUAP, UTDWC, and IDSA wound evaluation systems could be statistically analyzed. A multifactorial analysis of each of the 5 assessments of the Wound Score paired with both the Wound Score and the outcome could possibly demonstrate which of the 5 assessments is most important in predicting outcomes. Finally, the relationship of patient wellness and their goals, as paired with the Wound Score, could be instrumental in making decisions regarding limb salvage versus amputation in those wounds in the transition zone between “futile” and “problem” (ie, in the 2.5 to 4 point range).

The authors believe the Wound Score has the potential to make a substantial contribution to the evaluation and management of DFUs as well as other wounds. Criticisms may be raised with this approach. It is a prospective series based on observational information and not a randomized, controlled trial (RCT). When preparing this study’s request to the IRB, the authors determined that an observational, prospective study format was better suited to assessing the usefulness of the Wound Score. This is in contrast to treatment interventions where RCTs would be more appropriate. The correlations based on the Wound Score were summated from the 5 assessments. However, there were minor variations (usually within a half point) in the grades the observers assigned to each assessment. To limit patient selection bias, no patient who came to the authors’ attention and met inclusion criteria was eliminated from the study. Only 3 patients refused to sign the informed consent to grade their wounds and to participate in the study, even though the study itself does not offer advice to the attending physician for evaluation and management of the wound. Several attending physicians who had patients with DFUs did not want their patients included in the study, and these patients were not evaluated. The authors feel these exceptions did not have an appreciable effect on the results reported. Finally, a crucial consideration in any scoring system is how it facilitates management and how it predicts outcomes. Although the Wound Score makes it easy to separate wounds into “healthy,” “problem,” and “futile,” information needs to be further analyzed as to the outcomes for each. This paper was not generated for that purpose.

Conclusions

The objectivity and versatility of the Wound Score made scoring of the initial series of 83 patients’ foot, leg, and ankle wounds easy, whether the patient had diabetes mellitus or not.  It provides quantitative information for scoring a wound as “healthy,” “problem,” or “futile.” It lends itself to objectively evaluating wound improvement (or deterioration) with subsequent scorings of the wound. The interobserver agreement in scores between the staff physician and resident scores was high (r = 0.81), which demonstrates its reliability. The speediness (usually a minute or less), logical approach, and objectivity to determine the Wound Score make it user friendly. Finally, the authors feel it is an especially valuable tool for CER of wound care products as well as other management interventions. It also provides objective criteria for MCII determinations.

Acknowledgements

The authors would like to thank Keith Penera, DPM for his assistance in formulating the IRB proposal, Diane Eisenstein for implementing the data processing, and the following doctors of podiatric medicine for scoring wounds and collecting data: Karim Manji, Derrick Lew, Alex Craig, Jose Ponce, Steven La, Chris Jones, Anna Tan, and Suzanna Chan. Their contributions were instrumental in bringing this paper to fruition even though none aided in the actual writing, data analysis, or editing of the manuscript.

From the Long Beach Memorial Medical Center Hyperbaric Medicine Program, Long Beach, CA; Clinical Professor of Orthopaedic Surgery, University of California, Irvine School of Medicine, Irvine, CA; Department of Mathematics and Statistics, Cal State University Long Beach, Long Beach, CA; and Memorial Hermann Surgical Hospital, Kingwood, TX

Address correspondence to:
Michael B. Strauss, MD
Hyperbaric Medicine Department
Long Beach Memorial Medical Center
2801 Atlantic Avenue
Long Beach, CA 90801
MStrauss@memorialcare.org 

Disclosure: Drs. Strauss and Miller receive royalties from MasterMinding Wounds. In addition, Dr. Strauss receives royalties from Diving Science. The authors have no conflicts of interest to disclose.

References

1. Strauss MB, Aksenov IV, Miller SS. Classification systems for the diabetic foot wound. MasterMinding Wounds. Flagstaff, AZ: Best Publishing Company; 2010:57-108.
2. Wagner FW Jr. The dysvascular foot: a system of diagnosis and treatment. Foot Ankle. 1981;2(2):64-122.
3. Wagner FW Jr. A classification and treatment program for diabetic, neuropathic and dysvascular foot problems. Instructional Course Lectures: The American Academy of Orthopaedic Surgeons. Vol 28. St. Louis, MO: Mosby; 1979.
4. Siegel ME, Stewart CA, Wagner FW Jr, Sakimura I. An index to measure the healing potential of ischaemic ulcers using Thallium 201. Prosthet Orthot Int. 1983;7(2):67-68.
5. Agency for Health Care Policy and Research Guidelines. Pressure Ulcers in Adults: Prediction and Prevention Clinical Practice Guideline. No. 92-0047. Rockville, MD: AHCPR; 1992.
6. National Pressure Ulcer Advisory Panel Website. http:www.npuap.org/resources/educational-and-clinical-resources/noise-pressure-injury-stages.
7. Lavery LA, Armstrong DG, Harkless LB. Classification of diabetic foot wounds. J Foot Ankle Surg. 1996;35(6):528-531.
8. Armstrong DG, Lavery LA. Diabetic foot ulcers: prevention, diagnosis and classification. Am Fam Physician. 1998;57(6):1325-1332, 1337-1338.
9. Armstrong DG, Lavery LA, Harkless LB. Treatment-based classification system for assessment and care of diabetic feet. J Am Podiatr Med Assoc. 1996;86(7):311-316.
10. Armstrong DG, Lavery LA, Harkless LB. Validation of a diabetic wound classification system. The contribution of depth, infection, and ischemia to risk of amputation. Diabetes Care. 1998;21(5):855-859.
11. Lipsky BA, Berendt AR, Cornia PB, et al; Infectious Diseases Society of America. 2012 Infectious Diseases Society of America clinical practice guideline for the diagnosis and treatment of diabetic foot infections. Clin Infect Dis. 2012;54(12):e132-e173.
12. Apgar V. A proposal for a new method of evaluation of the newborn infant. Originally published in July 1953, volume 32, pages 250-259. Anesth Analg. 2015;120(5):1056-1059.
13. Strauss MB, Strauss WG. Wound scoring system streamlines decision-making. BioMechanics. 1999;6(8):37-43.
14. Strauss MB, Aksenov IV, Miller SS. The wound score - a solution for the wound classification dilemma. MasterMinding Wounds. Flagstaff, AZ: Best Publishing Company; 2010:109-128.
15. Welch BL. The generalization of ‘Student’s’ problem when several different population variances are involved. Biometrika. 1947;34(1-2):28-35.
16. Strauss MB, Miller SS, Aksenov IV, Manji K. Wound oxygenation and an introduction to hyperbaric oxygen therapy: interventions for the hypoxic/ischemic wound. Wound Care Hyperbaric Med. 2012;3(2):36-51.
