Development of a repository of individual participant data from randomized controlled trials of therapist-delivered interventions for low back pain.
Hee SW., Dritsaki M., Willis A., Underwood M., Patel S.
BACKGROUND: Individual patient data (IPD) meta-analysis of existing randomized controlled trials (RCTs) is a promising approach to achieving sufficient statistical power to identify sub-groups. We created a repository of IPD from multiple low back pain (LBP) RCTs to facilitate a study of treatment moderators. Because the data are sparse and heterogeneous, the repository needed to be robust and flexible enough to accommodate millions of data points prior to any subsequent analysis. METHODS: We systematically identified RCTs of therapist-delivered interventions for inclusion in the repository; some were obtained through project publicity. We requested both individual items and aggregate scores of all baseline characteristics and outcomes for all available time points. The repository is a hybrid database that combines an entity-attribute-value model with a relational model and is capable of storing sparse heterogeneous datasets. We developed a bespoke software program to extract, transform and upload the shared data. RESULTS: There were 20 datasets with more than 3 million data points from 9328 participants. All trials collected covariate and outcome data at baseline and at follow-ups. The bespoke standardized repository is flexible enough to accommodate millions of data points without compromising data integrity. Data are easily retrieved for analysis using standard statistical programs. CONCLUSIONS: The bespoke hybrid repository is complex to implement and to query, but its flexibility in supporting datasets with varying sets of responses and outcomes of different data types is a worthwhile trade-off. The large standardized LBP dataset is also an important resource usable by other LBP researchers. SIGNIFICANCE: This work provides a flexible, adaptive database for pain studies that can easily be expanded and that allows future researchers to map, transform and upload their data in a safe and secure environment. The data are standardized and harmonized, which will facilitate future requests from other researchers for secondary analyses.
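The hybrid design described in METHODS can be illustrated with a minimal sketch. It is not the authors' schema or software: the table names (trial, participant, attribute, observation), the column names, the helper functions and the toy values below are all assumptions made for illustration, and SQLite stands in for whatever database system the repository actually uses. The relational tables hold the fixed structure (trials and participants), while a single entity-attribute-value table holds one row per recorded value, so a trial that never collected a given variable simply contributes no rows for it.

```python
import sqlite3


def build_repository():
    """Create a hypothetical hybrid schema: relational tables plus one EAV table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce referential integrity
    conn.executescript("""
        -- Relational part: fixed structure shared by every trial.
        CREATE TABLE trial (
            trial_id INTEGER PRIMARY KEY,
            name     TEXT NOT NULL UNIQUE
        );
        CREATE TABLE participant (
            participant_id INTEGER PRIMARY KEY,
            trial_id       INTEGER NOT NULL REFERENCES trial(trial_id),
            arm            TEXT NOT NULL
        );
        -- Attribute dictionary: one row per harmonized variable.
        CREATE TABLE attribute (
            attribute_id INTEGER PRIMARY KEY,
            name         TEXT NOT NULL UNIQUE,
            data_type    TEXT NOT NULL CHECK (data_type IN ('numeric', 'text'))
        );
        -- EAV part: one row per participant, variable and time point.
        CREATE TABLE observation (
            participant_id INTEGER NOT NULL REFERENCES participant(participant_id),
            attribute_id   INTEGER NOT NULL REFERENCES attribute(attribute_id),
            time_point     TEXT NOT NULL,
            value          TEXT,
            PRIMARY KEY (participant_id, attribute_id, time_point)
        );
    """)
    return conn


def load_toy_data(conn):
    """Insert made-up rows; trial_B does not collect 'employment'."""
    conn.execute("INSERT INTO trial VALUES (1, 'trial_A'), (2, 'trial_B')")
    conn.executemany(
        "INSERT INTO participant VALUES (?, ?, ?)",
        [(101, 1, "intervention"), (102, 1, "control"), (201, 2, "intervention")],
    )
    conn.executemany(
        "INSERT INTO attribute VALUES (?, ?, ?)",
        [(1, "age", "numeric"), (2, "pain_score", "numeric"), (3, "employment", "text")],
    )
    conn.executemany(
        "INSERT INTO observation VALUES (?, ?, ?, ?)",
        [
            (101, 1, "baseline", "45"), (101, 2, "baseline", "6"),
            (101, 3, "baseline", "employed"), (101, 2, "3_months", "3"),
            (102, 1, "baseline", "52"), (102, 2, "baseline", "7"),
            (102, 3, "baseline", "retired"),
            (201, 1, "baseline", "38"), (201, 2, "baseline", "5"),
        ],
    )
    conn.commit()


def wide_table(conn, time_point):
    """Pivot the EAV rows for one time point into one record per participant."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM attribute ORDER BY attribute_id")]
    rows = conn.execute(
        """
        SELECT o.participant_id, a.name, a.data_type, o.value
        FROM observation AS o
        JOIN attribute AS a ON a.attribute_id = o.attribute_id
        WHERE o.time_point = ?
        ORDER BY o.participant_id
        """,
        (time_point,),
    )
    wide = {}
    for pid, name, data_type, value in rows:
        # Variables a trial never collected stay None rather than needing a column.
        record = wide.setdefault(pid, dict.fromkeys(names))
        if data_type == "numeric" and value is not None:
            value = float(value)
        record[name] = value
    return wide


if __name__ == "__main__":
    repo = build_repository()
    load_toy_data(repo)
    for pid, record in wide_table(repo, "baseline").items():
        print(pid, record)
```

The sketch also shows the trade-off noted in CONCLUSIONS: adding a new variable needs only a new row in the attribute table instead of a schema change, but retrieving an analysis-ready dataset requires a join and a pivot, not a plain SELECT.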