



Conference Abstracts

Understanding population health needs: How data-driven population segmentation can support the planning of integrated care


Sabine Ingrid Vuik,

Erik Mayer,

Ara Darzi


Introduction: Internationally, integrated care organisations have started to look beyond delivering reactive care, moving towards a population health approach. However, to be able to improve the health of a population, it is crucial to understand the population. Different groups of patients have different risks, outcomes and needs. By understanding these groups the right care can be provided to the right people, improving the overall health of the population.

A number of methods have been developed to segment and understand populations, such as risk stratification, hierarchical diagnosis models like the ACG and CRG systems, and grouping patients based on age and conditions. However, there are important limitations to these approaches, as they tend to focus on high-use patients and do not explicitly consider different care settings.

More advanced data mining methods, borrowed from computer science and marketing, can be used by integrated care organisations to improve their understanding of their population. By applying data mining to patient databases, similar patients can be grouped into clusters in a data-driven approach. More importantly, this approach segments patients based on their care usage across a range of care settings, and can therefore directly inform new integrated care models that bridge these settings.

Methods: Combining CPRD and HES, we have constructed a database of 300,000 English patients with information on their care usage, morbidities and other characteristics. We apply cluster analysis to identify patients with similar use patterns across emergency care, elective care, outpatient care, primary care and prescribing. To understand the different segments and their care needs, we analyse the characteristics of each cluster.
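As an illustration only, the clustering step described above could be sketched as follows. The abstract does not name the clustering algorithm, so k-means is used here purely as an assumption. The patient records are synthetic and do not come from CPRD or HES, the yearly contact counts are invented, and the setting names and the "low user" type are hypothetical labels chosen for the example.

```python
import random
from statistics import mean

# Care settings used as segmentation variables (names are illustrative).
SETTINGS = ["emergency", "elective", "outpatient", "primary_care", "prescribing"]

def standardise(rows):
    """Scale each column to zero mean and unit variance so no setting dominates."""
    cols = list(zip(*rows))
    means = [mean(c) for c in cols]
    sds = [mean((x - m) ** 2 for x in c) ** 0.5 or 1.0 for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means (an assumed algorithm): assign to nearest centroid, then update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # leave the centroid unchanged if its cluster is empty
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels, centroids

# Synthetic yearly contact counts per setting for three made-up user types.
rng = random.Random(42)
typical = {
    "low user": [0, 0, 0, 2, 1],
    "high primary care user": [0, 0, 1, 12, 8],
    "specialist care user": [0, 2, 9, 3, 4],
}
patients = [
    [max(0, base + rng.randint(-1, 1)) for base in profile]
    for profile in typical.values()
    for _ in range(50)
]

scaled = standardise(patients)
labels, _ = kmeans(scaled, k=3)

# Characterise each cluster by its mean use per setting, on the original scale.
for j in range(3):
    members = [p for p, l in zip(patients, labels) if l == j]
    profile = [round(mean(col), 1) for col in zip(*members)]
    print(j, len(members), dict(zip(SETTINGS, profile)))
```

Standardising the variables first keeps high-volume settings such as prescribing from dominating the distance measure. The number of clusters is fixed at three here for brevity; in a real analysis it would have to be chosen from the data.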

Results: Our results show that, when applied to a general population of 300,000 patients, data-driven segmentation produces eight distinct care user types, such as “High needs but low risk of emergencies”, “High primary care users” and “Specialist care users”. The method identifies various patient types at both the high and low ends of the utilisation spectrum, and across diseases. For each segment, distinct care priorities can be identified, which should form the basis of a population health strategy. In addition, each segment can be targeted with tailored public health interventions, case management programmes or even a fully integrated new care provision model.

The methods can also be used for smaller initiatives focused on specific disease groups. We are in the process of applying the same techniques to the subset of patients with a diagnosis of diabetes. Initial analysis shows that cluster analysis segments diabetes patients into distinct groups, each with different care and prevention needs.

Discussion: Traditional population analysis methods have important limitations when used for integrated population health. Methods based on morbidities fail to capture the large part of the population without chronic conditions. Similarly, risk stratification works best for very high-risk patients, but leaves the rest of the population out of scope. A population health strategy should consider the entire population, and focus on prevention for patients with no conditions or at low risk. In addition, none of these methods provides insight into care usage across different care settings, a crucial element for integrated care.

Data mining methods are very flexible and can be adjusted to different projects by the choice of segmentation variables and population. A cluster analysis based on utilisation creates segments that divide the population into roughly equal groups with distinct care usage patterns. By considering care usage in different care settings, the information from this type of segmentation can directly inform the development of new integrated care models.

Conclusion: Integrated care organisations wanting to improve population health require information on care needs across the population and across care settings. While a range of traditional methods exists to segment the population, a data-driven approach based on utilisation can provide a unique insight into actual care use in different settings. This provides an evidence base for developing comprehensive and tailored population health strategies.

How to Cite: Vuik SI, Mayer E, Darzi A. Understanding population health needs: How data-driven population segmentation can support the planning of integrated care. International Journal of Integrated Care. 2016;16(6):A170. DOI:
Published on 16 Dec 2016.

