The SAIL databank: linking multiple health and social care datasets
Author Information
Author(s): Ronan A Lyons, Kerina H Jones, Gareth John, Caroline J Brooks, Jean-Philippe Verplancke, David V Ford, Ginevra Brown, Ken Leake
Primary Institution: Health Information Research Unit (HIRU), Swansea University
Hypothesis
Can a unique Anonymous Linking Field (ALF) be accurately assigned to person-based records for record-linkage research studies?
Conclusion
The SAIL databank provides a reliable matching process that enables consistent allocation of ALFs to records, making it a research-ready platform for record-linkage studies.
Supporting Evidence
- Over 500 million records have been loaded into the SAIL databank.
- The matching process achieved specificity values greater than 99.8% and sensitivity values greater than 94.6%.
- Using the NHS number as a unique identifier resulted in error rates of less than 0.2%.
- More than 95% of records from the PARIS database were successfully matched to the NHS Administrative Register.
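As a reminder of what the sensitivity and specificity figures above measure, here is a minimal sketch of how both are computed from match outcomes. The counts are invented purely for illustration and are not taken from the study; the function and variable names are likewise assumptions.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of records that truly belong to a person and were matched to them."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of non-matching records that were correctly left unmatched."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, NOT results from the SAIL study.
sens = sensitivity(true_pos=950, false_neg=50)
spec = specificity(true_neg=9990, false_pos=10)

print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

High specificity means few records are linked to the wrong person; lower sensitivity means some genuine matches are missed.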
Takeaway
The SAIL databank links anonymised health and social care records at the person level, so researchers can study how different factors affect people's health.
Methodology
An SQL-based matching algorithm, MACRAL (Matching Algorithm for Consistent Results in Anonymised Linkage), was developed to assign ALFs to records by matching each dataset against the NHS Administrative Register.
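The general idea of SQL-based staged matching against a register can be sketched as follows. This is not MACRAL itself: the table names, column names, the two stages shown, and all the data are assumptions made for illustration. The sketch also assumes the register already carries an ALF for each person, and it leaves out any fuzzy or probabilistic stage.

```python
import sqlite3

# Hypothetical demographic columns used for the fallback stage.
DEMOGRAPHICS = ("surname", "forename", "dob", "gender", "postcode")


def assign_alfs(conn: sqlite3.Connection) -> None:
    """Assign ALFs to incoming records in two deterministic stages."""
    # Stage 1: exact match on NHS number.
    conn.execute(
        """
        UPDATE incoming
        SET alf = (SELECT r.alf FROM register r
                   WHERE r.nhs_number = incoming.nhs_number),
            match_stage = 'nhs_number'
        WHERE alf IS NULL
          AND EXISTS (SELECT 1 FROM register r
                      WHERE r.nhs_number = incoming.nhs_number)
        """
    )
    # Stage 2: exact match on all demographics, accepted only when
    # exactly one register row agrees (ambiguous matches stay unlinked).
    same_person = " AND ".join(f"r.{c} = incoming.{c}" for c in DEMOGRAPHICS)
    conn.execute(
        f"""
        UPDATE incoming
        SET alf = (SELECT r.alf FROM register r WHERE {same_person}),
            match_stage = 'demographics'
        WHERE alf IS NULL
          AND (SELECT COUNT(*) FROM register r WHERE {same_person}) = 1
        """
    )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE register (
        alf INTEGER PRIMARY KEY, nhs_number TEXT UNIQUE,
        surname TEXT, forename TEXT, dob TEXT, gender TEXT, postcode TEXT);
    CREATE TABLE incoming (
        record_id INTEGER PRIMARY KEY, nhs_number TEXT,
        surname TEXT, forename TEXT, dob TEXT, gender TEXT, postcode TEXT,
        alf INTEGER, match_stage TEXT);
    """
)
# Entirely made-up people and identifiers.
conn.executemany(
    "INSERT INTO register VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1001, "N-0001", "JONES", "ANNA", "1970-01-31", "F", "ZZ1 1AA"),
        (1002, "N-0002", "EVANS", "DAVID", "1985-06-15", "M", "ZZ2 2BB"),
    ],
)
conn.executemany(
    "INSERT INTO incoming (record_id, nhs_number, surname, forename, dob,"
    " gender, postcode) VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "N-0001", "JONES", "ANNA", "1970-01-31", "F", "ZZ1 1AA"),
        (2, None, "EVANS", "DAVID", "1985-06-15", "M", "ZZ2 2BB"),
        (3, None, "SMITH", "JOHN", "1990-03-02", "M", "ZZ3 3CC"),
    ],
)

assign_alfs(conn)
results = conn.execute(
    "SELECT record_id, alf, match_stage FROM incoming ORDER BY record_id"
).fetchall()
print(results)
# [(1, 1001, 'nhs_number'), (2, 1002, 'demographics'), (3, None, None)]
```

Recording the stage at which each record matched makes it possible to report match quality separately for each stage, and records that match nothing simply keep an empty ALF.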
Potential Biases
The use of probabilistic record linkage may increase the risk of false positive matches.
Limitations
Because the data were anonymised, the study could not identify the sources of errors in the matching process.
Participant Demographics
The study involved datasets from primary care, secondary care, and social services in Wales.