Scoring Systems

The declared goal of intensivists is to provide medical treatment of the highest quality in order to achieve the best results for the patient. To reach this goal, intensivists have made enormous strides in the past 50 years, but with increasing difficulty. Because of the pressure of rising costs, intensivists must be able to justify their therapeutic efforts.

The first scoring systems, derived 20 years ago, were intended as an instrument for comparing patient groups with each other and for developing individual forecasts. The vital status of the patient at hospital discharge was defined as the most important, and most easily measurable, outcome. Other outcomes, such as 1-year mortality or quality of life after an intensive care stay, are much more difficult to gather.

Besides the treatment received, hospital mortality depends on a vast number of patient-related factors. Such characteristics include age, comorbidities and the reason for admission to the intensive care unit, and together they are described as “case mix”. To objectify the effect of treatment, it is necessary to adjust for all patient-related factors that have an effect on mortality. Severity scoring systems thus make intensive care patients comparable by adjusting for these factors. This method is called “case mix adjustment”.

William Knaus was a pioneer in evaluating these methods. In the late 1970s he developed the first version of the APACHE score (Acute Physiology and Chronic Health Evaluation). It was followed by two simplified versions: APACHE II, also created by Dr. Knaus and his colleagues, and SAPS (Simplified Acute Physiology Score), developed by Jean-Roger Le Gall. Both were subsequently updated; APACHE III and SAPS II are the latest versions. A new version of SAPS, SAPS 3, is currently under development and should be published soon. At the same time, Stanley Lemeshow developed the MPM (Mortality Prediction Model), which is used almost solely in the United States. APACHE II and SAPS II are the systems most commonly used in Europe. According to the European Society of Intensive Care Medicine (ESICM), SAPS II, APACHE II and MPM II are equally suitable for determining the severity of illness of the intensive care patient.

These scoring systems were developed by analysing information about important characteristics of a great number of intensive care patients from a database. Different models were used to choose and emphasize certain characteristics. The first version of APACHE used parameters from various sources that were agreed upon by experienced doctors (the subjective or expert method). The newer systems used statistical methods (the objective method) to filter out relevant parameters.

All models describe the relationship between independent variables (patient attributes) and the dependent variable (hospital mortality) in the form of a mathematical equation. APACHE II, APACHE III and SAPS II weight the independent variables by their relative importance and combine them in a formula that yields a predicted mortality between 0 and 100%. However, because the outcome “vital status at hospital discharge” has only two possible values (survival or death), it must be clear that these scores are not suitable for predicting the outcome of individual patients. Even an extremely high predicted mortality, e.g. 90%, would still mean that 10 of 100 patients leave the hospital alive, and it is impossible to determine which of them. For this reason the European consensus conference on predicting outcome in ICU patients strictly warns intensivists against using these scoring systems for individual outcome prediction.
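The kind of equation described above can be sketched as a logistic model. This is a minimal illustration only: the function name, the three patient attributes and every coefficient are hypothetical and are not the published weights of APACHE, SAPS or MPM.

```python
import math

def predicted_mortality(intercept, coefficients, values):
    """Logistic model: weighted sum of patient attributes mapped to a probability."""
    logit = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical weights, NOT those of any published scoring system.
INTERCEPT = -5.0
COEFFICIENTS = [0.03, 0.8, 1.2]  # age in years, chronic disease (0/1), emergency admission (0/1)

patient = [70, 1, 1]
risk = predicted_mortality(INTERCEPT, COEFFICIENTS, patient)
print(f"Predicted hospital mortality: {risk:.1%}")
```

The result is a probability for a group of similar patients. It says how many of them are expected to die, not which ones, which is why such a figure should not be read as an individual forecast.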

Scoring systems depend on the sample of patients from which they were developed. For this reason, the performance of these systems should be tested in the regional patient sample by evaluating calibration and discrimination. Hosmer and Lemeshow’s goodness-of-fit test checks the agreement between predicted and actual numbers of survivors and non-survivors across groups of patients. Another test, the area under the receiver operating characteristic curve, shows the capability of the system to discriminate between survivors and non-survivors. If the predicted mortality differs greatly from the observed mortality, customization of the coefficients can help to improve prognostic performance.
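The two checks named above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function names and the small data set are invented for the example, the grouping uses equally sized risk-ordered groups, and a real evaluation would use an established statistics package and a far larger sample.

```python
def auc(probabilities, died):
    """Discrimination: chance that a randomly chosen non-survivor has a higher
    predicted risk than a randomly chosen survivor (ties count as 0.5)."""
    deaths = [p for p, d in zip(probabilities, died) if d]
    survivors = [p for p, d in zip(probabilities, died) if not d]
    wins = 0.0
    for p_dead in deaths:
        for p_alive in survivors:
            if p_dead > p_alive:
                wins += 1.0
            elif p_dead == p_alive:
                wins += 0.5
    return wins / (len(deaths) * len(survivors))

def hosmer_lemeshow(probabilities, died, groups=10):
    """Calibration: Hosmer-Lemeshow statistic over risk-ordered groups.
    Assumes no group has an expected death count of exactly 0 or its full size."""
    pairs = sorted(zip(probabilities, died))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)   # expected deaths in the group
        observed = sum(d for _, d in chunk)   # observed deaths in the group
        size = len(chunk)
        # Equivalent to summing (O - E)^2 / E over deaths and survivors.
        statistic += (observed - expected) ** 2 / (expected * (1 - expected / size))
    return statistic

# Invented example data: predicted risks and vital status (1 = died).
risks = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
status = [0, 0, 1, 0, 1, 0, 1, 1]
print(auc(risks, status))                        # 0.8125
print(hosmer_lemeshow(risks, status, groups=2))
```

An area of 0.5 corresponds to chance and 1.0 to perfect separation. The Hosmer-Lemeshow statistic is conventionally compared against a chi-square distribution; a large value indicates that predicted and observed deaths disagree.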

The range of applications for severity of illness scores is large and includes internal quality management (analysis of effectiveness and efficiency), external quality management (comparison of different intensive care units) and the grouping of patients for clinical studies. For these purposes the predicted hospital mortality is compared with the actual hospital mortality. This relationship can be expressed by the so-called standardized mortality ratio (SMR), also called the O/E (observed/expected) ratio. If the observed mortality is higher than predicted, the SMR is greater than 1. If the observed mortality is lower than predicted, the SMR is less than 1. The SMR allows an independent comparison of the performance of different intensive care units, assuming careful documentation and correct use of the methods. For intensive care units whose performance differs significantly from the average, the possible reasons should be evaluated in detail.
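As a minimal sketch of the O/E calculation, the expected number of deaths can be taken as the sum of the individual predicted probabilities. The function name and the five-patient data set are invented for illustration.

```python
def standardized_mortality_ratio(predicted_probabilities, died):
    """SMR = observed deaths / expected deaths (sum of predicted probabilities)."""
    observed = sum(died)
    expected = sum(predicted_probabilities)
    return observed / expected

# Invented example: five patients, predicted risk and vital status (1 = died).
risks = [0.1, 0.2, 0.5, 0.7, 0.5]   # expected deaths: 2.0
status = [0, 0, 1, 1, 1]            # observed deaths: 3
smr = standardized_mortality_ratio(risks, status)
print(f"SMR = {smr:.2f}")           # above 1: more deaths than predicted
```

With so few patients the ratio is dominated by chance. In practice an SMR is reported for a whole unit over a defined period, usually together with a confidence interval.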

Another group of scores is used to determine the therapeutic workload. TISS (Therapeutic Intervention Scoring System) serves as the prototype. David Cullen designed this tool in the 1970s in an attempt to determine the severity of illness from the level of care. Although this relationship could not be verified (because the therapeutic effort is greatly influenced by the structure of the intensive care unit), TISS remained the most commonly used workload score for a long time. The system was revised by the same authors in 1983 and in 1996. Dinis Reis Miranda published a simplified version, the so-called TISS-28, which replaced the original TISS and is now the most widely used version.

All scoring systems discussed here have, however, methodological and application-related restrictions which necessitate a critical appraisal of obtained results. Besides methodological guidelines and provisos concerning the interpretation of results, the collected data must be correct and complete. The explicit definition of the data items forms a major prerequisite for a high data quality.

Recommended literature

Le Gall JR, Lemeshow S, Saulnier F. A new Simplified Acute Physiology Score (SAPS II) based on a European/North American multicentre study. JAMA 1993; 270(24): 2957-2963.

Knaus WA, Draper EA, Wagner DP, Zimmerman JE. APACHE II: a severity of disease classification system. Crit Care Med 1985; 13(10): 818-829.

Lemeshow S, Teres D, Klar J, Avrunin JS, Gehlbach SH, Rapoport J. Mortality Probability Models based on an international cohort of intensive care unit patients. JAMA 1993; 270(20): 2478-2486.

Cullen DJ, Civetta JM, Briggs BA. Therapeutic intervention scoring system: a method for quantitative comparison of patient care. Crit Care Med 1974; 2: 57.

Miranda DR, De Rijk A, Schaufeli W. Simplified Therapeutic Intervention Scoring System: the TISS-28 items. Results from a multicentre study. Crit Care Med 1996; 24(1): 64-73.

Boyd CR, Tolson MA, Copes WS. Evaluating trauma care: the TRISS method. J Trauma 1987; 27: 370-378.

Predicting outcome in ICU patients. 2nd European Consensus Conference in Intensive Care Medicine. Intensive Care Med 1994; 20(5): 390-397.

Chalfin DB, Cohen IL, Lambrinos J. The economics and cost-effectiveness of critical care medicine. Intensive Care Med 1995; 21: 952-961.

Cowen JS, Kelley MA. Errors and bias in using predictive scoring systems. Crit Care Clin 1994; 10(1): 53-77.

Teres D, Lemeshow S. Why severity models should be used with caution. Crit Care Clin 1994; 10(1): 93-110.