The ‘Centre Effect’ in Nephrology: What Do Differences between Nephrology Centres Tell Us about Clinical Performance in Patient Management?
Hodsman A.a · Ben-Shlomo Y.b · Roderick P.c · Tomson C.R.V.a
aUK Renal Registry, Southmead Hospital, and bDepartment of Social Medicine, University of Bristol, Bristol, and cDepartment of Public Health Sciences and Medical Statistics, School of Medicine, University of Southampton, Southampton, UK
Corresponding Author
Dr. Alexandra Hodsman
UK Renal Registry
Bristol BS10 5NB (UK)
Improving the quality of care provided by nephrology centres to patients with kidney disease requires a clear understanding of how to compare performance after adjustment for case mix, combined with a detailed understanding of the structure and processes that are associated with the achievement of good clinical results. In this review, we discuss how to measure quality of care (using process or outcome measures), how to take case mix into account, how best to display comparisons between nephrology centres, and how to study the causes of real variations in quality between centres. This is a narrative review; we include examples from other fields in which the centre effect has been studied, including education.
© 2011 S. Karger AG, Basel
Measuring variation in the quality of healthcare provided by different organisations is key to understanding how to improve the clinical and financial quality of the care that they provide, and is increasingly necessary as the costs and complexity of healthcare continue to increase. Publicly available sources of data (e.g. large administrative databases and disease-specific registries) can be used relatively easily to generate analyses of variation, but the interpretation of such analyses is complex, particularly in relation to adjustment for differences in case mix. In assessing organisational performance, it is critical to ensure that comparisons between organisations are fair and truly reflect underlying differences in quality, as failure to do so can be harmful and can stigmatise organisations.

Traditionally, epidemiological studies have examined how individual characteristics predict future outcome. In contrast, organisational characteristics, e.g. infrastructure, staffing and processes of care, can also influence outcome independently of the characteristics of the individual patients being treated (the latter giving rise to a ‘compositional effect’). This approach originated in the educational field, driven by the desire to identify which properties of schools add value to the academic achievement of pupils over and above their innate ability and family background. These higher level factors are referred to either as ‘contextual effects’ in relation to spatial exposures such as where one lives, or as ‘centre effects’ in relation to organisations such as hospitals or schools [2,3]. To further complicate matters, centre effects may interact with individual level factors, and hence the average effect for a centre may be misleading (the ‘constant risk fallacy’). For instance, some centres may be better than others at providing care for a particular co-morbidity, so two centres with similar overall performance may be quite different when patients with and without that co-morbidity are compared.
Measuring centre effects could identify both ‘failing’ organisations that require closure, investigation, additional investment or support, and high-performing organisations, enabling the dissemination of best practice to other organisations. It may therefore seem surprising, at first glance, that relatively little research has been undertaken in this area. However, it is clear that several major issues need to be considered and resolved before we can translate the potential insights provided by centre effects into real patient benefit. In the rest of this paper, we discuss the challenges of how one (a) measures and defines quality of care; (b) takes patient case mix into account; (c) presents comparative data on centres, and (d) unpacks centre effects into modifiable interventions that could be translated into clinical practice.
The development of clinical performance measures and the comparison of organisational performance have been driven both by an increased demand for accountability of public services and by the development of ‘evidence-based’ guidelines. Numerous clinical practice guidelines and associated audit measures for the management of patients with chronic kidney disease (CKD) have been developed and revised over the last decade. These measures have enabled clinicians to develop strategies to manage CKD populations more effectively and have, inevitably, made it possible to compare ‘clinical performance’ between centres providing care for patients with CKD.
Clinical performance measures can be divided into those which measure patient outcomes, such as survival, quality-adjusted survival or quality of life; those which measure intermediate outcomes, such as urea reduction ratio (URR), serum phosphate concentration or haemoglobin concentration, which are believed to have causal associations with morbidity and overall patient survival; and those which directly measure clinical processes, such as use of arteriovenous fistulae for haemodialysis or prescription of erythropoiesis-stimulating agents for anaemia, which are believed to influence the patient outcome either directly or through an intermediate outcome. The distinction between outcome and process measures is important, as it may be easy for a centre to change processes, especially if it knows it is being judged on such a measure (technically known as ‘gaming’), without necessarily improving outcomes. A systematic review of studies examining the association between quality of care, as measured by adherence to evidence-based standards of clinical care (process), and variability in hospital mortality rates found only an inconsistent association. Patient outcomes such as survival are more valid but much more susceptible to variations in case mix, so greater caution is required before assuming causal associations.
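To make the notion of an intermediate outcome concrete, the URR can be computed directly from pre- and post-dialysis urea concentrations. The sketch below is purely illustrative; the 65% threshold is given only as an example of a commonly used audit benchmark, not as a recommendation.

```python
def urea_reduction_ratio(pre_urea, post_urea):
    """URR (%): the percentage fall in urea across a dialysis session."""
    if pre_urea <= 0:
        raise ValueError("pre-dialysis urea must be positive")
    return 100.0 * (pre_urea - post_urea) / pre_urea

# Illustrative values: pre-dialysis urea 24 mmol/l, post-dialysis 7 mmol/l
urr = urea_reduction_ratio(24.0, 7.0)
print(round(urr, 1))  # 70.8

# 65% is used here as an example audit threshold (assumption, not a standard
# endorsed by this article)
meets_target = urr >= 65.0
```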
There are now multiple examples demonstrating real variation in outcomes between centres, both in CKD and in other patient populations (e.g. cardiac surgery), having accounted for differences in case mix [8,9,10,11]. The UK Renal Registry is unique amongst renal registries in providing annual centre-specific comparisons on a wide range of these clinical performance measures using electronic data extracts from over 95% of UK dialysis centres. UK dialysis centres are therefore able to benchmark their performance against a national average and against high-performing centres. Unfortunately, little is known about what drives these differences, assuming they are not due to case mix; they may depend on multiple clinical processes, one or more of which may be responsible for the variation in quality.
Whilst many intermediate outcome measures (e.g. haemoglobin, phosphate) are continuous in nature, it is common to describe poor or good performance by dichotomising such measures so that the proportion of patients meeting these definitions can be compared between centres, or within centres over time. The specification of the performance measure can markedly influence which organisations are identified as achieving good (or bad) results. Thus, it is possible to define a poor outcome as the proportion lying in one tail of the distribution or, alternatively, assuming a U- or J-shaped relationship with harder end points, one can define a poor outcome as lying in either tail or, conversely, a good outcome as lying within a range of acceptable values. For most biological variables, where the acceptable range is wide, there is an inevitable linear relationship between the individual centre mean and the proportion in that centre achieving the performance measure, due to the obvious auto-correlation. For instance, if the clinical performance measure for correction of anaemia is the proportion of patients with a haemoglobin ≥10 g/dl, then centres with a higher mean haemoglobin will have a higher proportion of patients meeting that ‘target’, assuming a normal distribution. However, if good performance is defined as having a haemoglobin value of between 10.5 and 12.5 g/dl, then both the mean and the variance (the spread of values around the mean, usually measured by the standard deviation or 95% reference range) become important in determining the proportion of patients meeting the target. Thus, it is possible for two centres to have identical mean haemoglobin values of 11.5 g/dl but for one centre to have a larger standard deviation and hence ‘worse’ performance (fig. 1). It is therefore important to understand the determinants both of the mean and of the variance, though most analytical approaches merely examine predictors of group mean values.
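The interplay between centre mean, spread and target definition can be illustrated numerically. The sketch below assumes normally distributed haemoglobin values and uses hypothetical centre parameters; it shows how two centres with the same mean can differ markedly on a range target while remaining much more similar on a one-tailed target.

```python
from math import erf, sqrt

def norm_cdf(x, mu, sd):
    """Cumulative distribution function of N(mu, sd), via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sd * sqrt(2.0))))

def prop_above(threshold, mu, sd):
    """Proportion of patients meeting a one-tailed target (value >= threshold)."""
    return 1.0 - norm_cdf(threshold, mu, sd)

def prop_in_range(lo, hi, mu, sd):
    """Proportion of patients meeting a range target (lo <= value <= hi)."""
    return norm_cdf(hi, mu, sd) - norm_cdf(lo, mu, sd)

# Two hypothetical centres with identical mean haemoglobin (11.5 g/dl)
# but different spread
centre_a = dict(mu=11.5, sd=1.0)
centre_b = dict(mu=11.5, sd=1.8)

for name, c in [("A", centre_a), ("B", centre_b)]:
    one_tailed = prop_above(10.0, **c)            # Hb >= 10 g/dl
    range_target = prop_in_range(10.5, 12.5, **c)  # 10.5-12.5 g/dl
    print(name, round(one_tailed, 3), round(range_target, 3))
```

On these assumed figures the range target separates the two centres (roughly 68 vs. 42% of patients in range) far more sharply than the one-tailed target does.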
This effect is not merely of theoretical interest. The UK Renal Registry has demonstrated this phenomenon when comparing centre performance using either a one-tailed or a range definition for both haemoglobin and phosphate, which resulted in a marked change in which centres were deemed ‘high performers’.
Patient outcomes can be influenced both by individual level factors (e.g. type of CKD) and by factors that are determined by the centre that manages them. The development of multilevel statistical models enables one to account for the hierarchical nature of these data, as these methods allow patients (level 1) to be nested within centres (level 2), which in turn can be embedded within geographical areas (level 3). If data were analysed only at centre level, then individual level data on case mix could only be adjusted for by some simple summary measure, e.g. the mean age of patients at that centre: this would be far less effective for adjustment than using individual level data. On the other hand, if data are analysed only at individual level, then no account is taken of the clustering of individuals within centres; as patients within a centre will be more similar to one another than to patients in other centres, conventional estimates of standard errors for the centre effects will be too narrow and p values too low. In addition, the multilevel approach allows one to quantify how much of the observed variability between centres is due to compositional differences arising from differing patient populations. Opponents of performance monitoring argue that unmeasured case mix will still confound most performance analyses. For instance, it is possible that poor compliance with treatment (e.g. missing dialysis sessions) is more common amongst socioeconomically deprived groups, and that this will confound comparisons of centres serving deprived versus less deprived populations. Using a multilevel model also allows exploration of the effects of socioeconomic status at centre level, i.e. whether patients treated in more affluent areas do better than those treated in less affluent areas. Despite having fewer case mix data, administrative databases have been shown to correlate well with clinical databases in the prediction of adjusted mortality rates.
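The idea of partitioning outcome variability into within-centre and between-centre components can be sketched with simulated data. The example below is a deliberately simplified illustration (no case mix covariates, balanced centres, assumed variance components) and uses a one-way ANOVA estimator to recover the share of variance attributable to centres (the intraclass correlation), rather than a full multilevel model.

```python
import numpy as np

rng = np.random.default_rng(0)

n_centres, n_per_centre = 50, 80
sigma_centre, sigma_patient = 0.5, 2.0  # assumed standard deviations

# Simulate a continuous outcome (e.g. haemoglobin, g/dl) with a shared
# centre-level shift plus independent patient-level variation
centre_effects = rng.normal(0.0, sigma_centre, n_centres)
y = (11.5
     + np.repeat(centre_effects, n_per_centre)
     + rng.normal(0.0, sigma_patient, n_centres * n_per_centre))
centre = np.repeat(np.arange(n_centres), n_per_centre)

# One-way ANOVA estimator of the two variance components
grand_mean = y.mean()
centre_means = np.array([y[centre == j].mean() for j in range(n_centres)])
msb = n_per_centre * ((centre_means - grand_mean) ** 2).sum() / (n_centres - 1)
msw = sum(((y[centre == j] - centre_means[j]) ** 2).sum()
          for j in range(n_centres)) / (n_centres * (n_per_centre - 1))
var_between = max((msb - msw) / n_per_centre, 0.0)

# Intraclass correlation: share of total variance attributable to centres
icc = var_between / (var_between + msw)
print(round(icc, 3))
```

With the assumed components the true intraclass correlation is 0.25 / (0.25 + 4.0) ≈ 0.06; ignoring even a clustering effect of this apparently modest size leads to the overly narrow standard errors described above.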
Adjustment for case mix can have a variable influence on centre effects for CKD patients. Fink et al. demonstrated variation between centres in dialysis dose in a US dialysis network. Approximately 12% of the variation between centres was accounted for by patient level case mix factors. Two other studies exploring centre variation of audit measures included in clinical practice guidelines for dialysis patients have demonstrated centre effects which were not modified by case mix adjustment [9,17]. Where adjustment has little influence on the observed centre effects, it is unlikely that even better measures of case mix would make much difference to the comparison, as the lack of attenuation suggests that case mix is not differentially associated with centre.
There is no consensus on the best approach to presenting statistical comparative data on healthcare organisations. League tables are a simple and intuitive way of presenting comparisons of multiple centres, are ‘sensitive’ in detecting variation between organisations and have popular appeal to a lay audience. They also have major limitations, including multiple comparisons and the identification of ‘poor’ centres merely as a function of ranking, even if actual differences in performance are minor. Differences between centres may be due to chance or have little clinical significance. An alternative is to use statistical process control (SPC) methodology (fig. 2). Cross-sectional SPC analyses are used to compare multiple organisations: for instance, funnel plots allow identification of ‘outlier’ organisations whose performance is significantly different from the average of all organisations being compared. Longitudinal SPC charts are used to measure consistency and changes in performance within an organisation over time. These are valuable as they act as a natural experiment allowing real-time monitoring within an organisation. In addition, they are likely to reflect real changes in the processes within an organisation, as potential confounding factors such as case mix will not change within a centre in the short term, though artefactual variations can still occur due to random variations in rare events or to changes in coding and/or completeness of data capture. Variation in data quality must always be considered a potential cause of between-centre variation in performance, and this might become an increasing problem with newly established registries or those without well-established data validation routines.
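The construction of funnel plot control limits can be sketched as follows. The example uses hypothetical centre data and the usual normal approximation to the binomial; limits at roughly 2 and 3 standard errors (about 95 and 99.8% coverage) widen as the number of patients treated in a centre falls, so small centres are not flagged as outliers on the strength of chance variation alone.

```python
import numpy as np

def funnel_limits(p_target, n, z):
    """Normal-approximation control limits for a proportion at denominator n."""
    se = np.sqrt(p_target * (1.0 - p_target) / n)
    return p_target - z * se, p_target + z * se

# Hypothetical centres: (patients meeting a target, patients treated)
centres = {"A": (70, 100), "B": (180, 300), "C": (20, 40)}
p_overall = (sum(k for k, _ in centres.values())
             / sum(n for _, n in centres.values()))

for name, (k, n) in centres.items():
    p = k / n
    lo95, hi95 = funnel_limits(p_overall, n, 1.96)    # ~95% limits
    lo998, hi998 = funnel_limits(p_overall, n, 3.09)  # ~99.8% limits
    flag = "outlier" if (p < lo998 or p > hi998) else "in control"
    print(f"{name}: p={p:.2f} 95% limits=({lo95:.2f}, {hi95:.2f}) {flag}")
```

Plotting each centre's proportion against its denominator, with these limit curves superimposed, yields the familiar funnel shape; points outside the 99.8% limits are conventionally treated as signals worth investigating rather than proof of poor care.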
Appropriate display of performance data is also important to prevent stigmatisation and gaming between organisations. For example, when post-operative surgical mortality rates for individual cardiac surgeons in New York State were published as part of a quality improvement programme, there were claims that some of the apparent improvement was due to ‘up-coding’ of pre-operative risk, although this has not been consistently demonstrated in other populations. The use of league tables may encourage this behaviour as, by definition, 50% of centres will be below the median even if there are no statistically significant differences between any of the centres ranked in a league table. A randomised controlled trial which compared SPC and simple ranking methodology confirmed this observation and found that public health physicians were less likely to identify organisations as outliers from funnel plots than from league tables.
However, there are practical issues about how best to present longitudinal data on multiple centres simultaneously in a format that is simple to digest. There is also the issue of timeliness of data. Annual reports present data over the past year, but with electronic data transfer it is possible to update analyses on a quarterly or monthly basis, though this may introduce type I errors due to multiple comparisons. Real-time analyses may be important if one wishes to identify which organisational factors are causally linked to upward or downward trends, and are essential for early intervention in very serious adverse events such as unexpectedly high mortality rates. A retrospective review after a year may be too late to identify anything other than large-scale or crude alterations to service provision, and more subtle changes may be missed. Further work on data presentation, and on its impact on clinical reasoning and decision making, is required to improve on existing methods.
A variety of different factors, usually those easiest to measure, have been suggested as potential influences on healthcare organisations. These include structural characteristics of the organisation (e.g. staff to patient ratios), institutional processes (e.g. organisation of outpatient review) and clinical processes (e.g. the use of written protocols). More subtle influences, such as the ethos and culture of the organisation, may also contribute but are more difficult to measure. A recent qualitative study of health professionals identified the following practices as representing ‘best practice’ within centres providing renal replacement therapy (RRT): multi-disciplinary meetings, regular audit, assessment of staff performance, methods of staff communication, and staff training in patient education; there was less consensus about staffing ratios and frequency of patient review. There are limited data linking structural characteristics within organisations providing RRT to variation in quality of care. Findings may not always be generalisable, as different health care systems may be governed by a variety of wider societal differences, e.g. national policy. Several studies from the US have looked at differences in outcomes based on different remuneration policies. A meta-analysis of differences in mortality between ‘for-profit’ and ‘not-for-profit’ haemodialysis centres identified a small but significant increase in the relative risk of mortality (after adjustment for case mix) for patients treated in for-profit centres. A more recent study showed that unadjusted mortality was similar in for-profit and not-for-profit centres. In this study, patients treated in for-profit centres were more likely to be older, black, diabetic and female, and more patients in for-profit centres achieved clinical benchmarks (including those for dialysis dose and correction of anaemia).
Univariate analyses showed no association between profit status and mortality, but multivariate models (including process measures associated with mortality) showed a small but significant association with increased mortality in for-profit centres. However, given these potentially conflicting results, the authors concluded that this finding might be due to overadjustment.
One key factor which is applicable across all health care systems is the relationship between organisational capacity/volume and patient outcome. Data from cardiac surgery show that better patient survival is associated with surgeons performing more operations in so-called ‘high-volume’ centres. It is unknown whether this phenomenon exists between centres providing RRT, although there is marked variation in capacity, for example between ‘hub’ and ‘satellite’ dialysis centres which represent the model of care in the UK.
A number of other institutional and clinical processes have been associated with better outcomes. Plantinga et al. [24,25], in a series of prospective cohort studies, have identified several candidate institutional processes associated with variation between dialysis centres across a variety of outcome measures including mortality, hospitalisation and performance against audit measures. These include increased patient-physician contact time and the frequency of sit-down care rounds. A series of analyses by the Dialysis Outcomes and Practice Patterns Study has described variation in practice between haemodialysis facilities across a number of aspects of care including vascular access, anaemia, dialysis dose and mineral metabolism. The variation in these institutional and clinical processes has been associated with differences in patient outcomes (after case mix adjustment). Variation in patient survival between centres has been associated with differential usage of central venous catheters for dialysis, prescribing patterns and the provision of foot clinics. These comparisons are, however, susceptible to the atomistic inference fallacy (analogous to the ecological fallacy), as centre level variables are derived from aggregates of patient level data, and these aggregates may or may not explain centre differences. For example, within the UK (at an individual level) poverty is associated with increased coronary heart disease mortality; yet at country level, developed (wealthier) countries have higher mortality from heart disease than developing countries.
There are limited data linking individual clinical processes to better outcomes in patients receiving RRT, but there are data in the wider literature. For example, Bradley’s group [27] examined the clinical processes associated with ‘door to balloon’ time in cardiac catheterisation for acute myocardial infarction. A number of candidate clinical processes were identified which significantly reduced the median door to balloon time: for example, times were shorter in hospitals where the catheterisation laboratory was activated by the emergency physician rather than by the cardiologist. There is detailed mapping of some processes in dialysis care, e.g. the Fistula First Initiative, but this project was designed to improve patient outcomes rather than to study differences between dialysis centres (www.fistulafirst.org).
Registries and other administrative databases have provided the potential to identify differences between organisations, although no renal registries other than the UK Renal Registry currently provide centre-specific analyses of clinical performance. The challenge now is to understand the structures and processes that underlie high clinical performance as a prerequisite to quality improvement. Analyses comparing the performance of schools have demonstrated how difficult it is to describe the ‘value added’ by institutions over and above differences in student characteristics, owing to the complexity of educational organisations; a scenario paralleled in healthcare. The renal literature on this topic is scanty. Plantinga et al. [24,25] have identified some candidate processes associated with improved outcomes in dialysis patients, and Fistula First is a good example of a quality improvement initiative which recognised that understanding clinical processes is critical to changing patient outcomes (www.fistulafirst.org). There is undoubtedly scope to extend this kind of work to other areas of RRT provision; detailed qualitative research will be required before the analysis of observational quantitative studies and the subsequent design of large cluster-randomised controlled trials, assuming the financial and pragmatic barriers to undertaking such research can be overcome. For example, a quality improvement programme demonstrated an improvement in URR values in a prospective cohort design as well as a reduction in centre variation. The same investigators followed this up with a randomised controlled trial of an intensive quality improvement intervention compared with standard feedback, which showed significantly improved URR values in centres which used the intensive intervention.
The current centre level comparisons of UK dialysis centres against the audit measures specified in the UK Renal Association Clinical Practice Guidelines provide an ideal starting point for further investigation into the factors underlying good performance through further qualitative and quantitative research in conjunction with multilevel analysis to allow case mix adjustment. For an individual dialysis centre, SPC methods enable rapid hypothesis testing of clinical or institutional processes, which may improve performance outcomes. Similar comparisons should be possible in other countries with well-established registries. These comparisons, following the principles outlined here, should stimulate centres – particularly those outlier centres shown to have exceptionally poor performance on a given quality marker – to initiate active quality improvement programmes focusing on that dimension of care. Similarly, analysis of structures and processes in centres with exceptionally high performance may provide important insights into how performance might be improved in other centres.
With quality improvement now at the forefront of healthcare policy in many countries, it is perhaps surprising that there are so few analyses of differences between organisations and the effects of these differences on patients’ outcomes in CKD or health care in general. We believe that such analyses can be a useful starting point and should form part of any quality improvement initiative to improve understanding of institutional factors that may improve patient outcomes.