The SAIL databank: linking multiple health and social care datasets
2009

Sample size: over 500 million records
Evidence: high

Author Information

Author(s): Ronan A Lyons, Kerina H Jones, Gareth John, Caroline J Brooks, Jean-Philippe Verplancke, David V Ford, Ginevra Brown, Ken Leake

Primary Institution: Health Information Research Unit (HIRU), Swansea University

Hypothesis

Can a unique Anonymous Linking Field (ALF) be accurately assigned to person-based records for record-linkage research studies?

Conclusion

The SAIL databank provides a reliable matching process that enables consistent allocation of ALFs to records, making it a research-ready platform for record-linkage studies.

Supporting Evidence

  • Over 500 million records have been loaded into the SAIL databank.
  • The matching process achieved specificity values greater than 99.8% and sensitivity values greater than 94.6%.
  • Using the NHS number as a unique identifier resulted in error rates of less than 0.2%.
  • More than 95% of records from the PARIS database were successfully matched to the NHS Administrative Register.
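
For readers less familiar with the two accuracy measures quoted above, the following minimal sketch shows how sensitivity and specificity are computed from match counts. The function names and the counts are hypothetical and chosen only for illustration; they are not figures from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of genuine matches that the linkage process found."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of genuine non-matches that were correctly left unlinked."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, not taken from the paper.
tp, fn = 950, 50      # genuine matches: found vs. missed
tn, fp = 9990, 10     # genuine non-matches: left unlinked vs. wrongly linked

print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"specificity = {specificity(tn, fp):.3f}")
```

A high specificity means few records are linked to the wrong person, while a somewhat lower sensitivity means some genuine matches are left unlinked.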

Takeaway

The SAIL databank helps connect health and social care data so researchers can study how different factors affect people's health.

Methodology

An SQL-based matching algorithm (MACRAL) was developed to assign unique identifiers to records by matching datasets to the NHS Administrative Register.
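
The general idea of matching a dataset to a reference register and assigning a consistent identifier can be sketched as follows. This is not the MACRAL algorithm itself, which is SQL-based and whose details are not given here; it is a simplified Python illustration under stated assumptions. The field names (`nhs_number`, `surname`, `dob`, `sex`, `postcode`), the keyed-hash derivation of the ALF, and the fallback to an exact demographic match are all hypothetical.

```python
import hashlib
import hmac

# Assumption: a secret key held by whoever issues ALFs; hypothetical value.
SECRET_KEY = b"example-key-not-for-real-use"


def make_alf(nhs_number: str) -> str:
    """Derive an illustrative anonymous linking field from an NHS number."""
    digest = hmac.new(SECRET_KEY, nhs_number.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def demographic_key(rec: dict) -> tuple:
    """Normalise the fallback matching fields (hypothetical field names)."""
    return (
        rec["surname"].strip().lower(),
        rec["dob"],
        rec["sex"].upper(),
        rec["postcode"].replace(" ", "").upper(),
    )


def build_indexes(register: list) -> tuple:
    """Index the register by NHS number and by demographic key."""
    by_nhs = {person["nhs_number"]: person for person in register}
    by_demo = {}
    for person in register:
        by_demo.setdefault(demographic_key(person), []).append(person)
    return by_nhs, by_demo


def assign_alf(rec: dict, by_nhs: dict, by_demo: dict):
    """Return an ALF for the record, or None if no unambiguous match exists."""
    nhs = rec.get("nhs_number")
    if nhs in by_nhs:
        return make_alf(nhs)
    candidates = by_demo.get(demographic_key(rec), [])
    if len(candidates) == 1:  # refuse ambiguous demographic matches
        return make_alf(candidates[0]["nhs_number"])
    return None


# Made-up register entry and incoming record, for illustration only.
register = [{"nhs_number": "0000000001", "surname": "Jones",
             "dob": "1970-01-31", "sex": "F", "postcode": "AB1 2CD"}]
by_nhs, by_demo = build_indexes(register)
incoming = {"nhs_number": None, "surname": " jones", "dob": "1970-01-31",
            "sex": "f", "postcode": "ab12cd"}
print(assign_alf(incoming, by_nhs, by_demo) == make_alf("0000000001"))
```

Because the same person always resolves to the same register entry, records from different source datasets receive the same ALF and can be linked without exposing the underlying identifier.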

Potential Biases

The use of probabilistic record linkage may increase the risk of false positive matches.
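
The trade-off behind this bias can be illustrated with a small sketch. It assumes that each candidate pair of records receives a match score between 0 and 1 and that pairs scoring at or above a chosen threshold are accepted as links. The scores, the truth labels, and the thresholds below are hypothetical and are not taken from the study.

```python
# Each tuple is (match score, whether the pair is truly the same person).
# Hypothetical values for illustration only.
scored_pairs = [
    (0.99, True),
    (0.95, True),
    (0.90, False),
    (0.85, True),
    (0.60, False),
    (0.40, False),
]


def false_positives(pairs, threshold: float) -> int:
    """Count accepted links that join records of different people."""
    return sum(1 for score, same in pairs if score >= threshold and not same)


def missed_matches(pairs, threshold: float) -> int:
    """Count genuine matches rejected because they scored below threshold."""
    return sum(1 for score, same in pairs if score < threshold and same)


for threshold in (0.95, 0.50):
    print(threshold,
          false_positives(scored_pairs, threshold),
          missed_matches(scored_pairs, threshold))
```

Lowering the threshold recovers more genuine matches but admits more false positives; raising it does the reverse.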

Limitations

The study could not identify the sources of errors in the matching process due to the use of anonymised data.

Participant Demographics

The study involved datasets from primary care, secondary care, and social services in Wales.

Digital Object Identifier (DOI)

10.1186/1472-6947-9-3
