The diagram below shows how linked data is extracted:
1. Identify Study Population
The study population is selected first. This can be done via linkage, where the researcher has already chosen the study population, or via selection from one or more of the health data collections. For example:
- All people who went to hospital for a colonoscopy (from HMDC)
- All people with colorectal cancer (from WA Cancer Registry)
- People in both these groups (from both HMDC and WA Cancer Registry).
Control populations are also identified (e.g. a random sample of people from the electoral roll who are the same age and gender as the cases).
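As a rough illustration of the selection examples above, the sketch below combines groups of people with set operations and then picks age- and gender-matched controls. The identifiers, field names, and both functions (`select_population`, `sample_controls`) are hypothetical and are not part of any DLB system; the actual selection process may differ.

```python
import random

# Made-up person identifiers, for illustration only.
hmdc_colonoscopy = {"P001", "P002", "P003", "P005"}
cancer_registry_colorectal = {"P002", "P004", "P005"}


def select_population(*groups, mode="all"):
    """Combine groups of person identifiers.

    mode="all": people present in every group (intersection).
    mode="any": people present in at least one group (union).
    """
    if not groups:
        return set()
    sets = [set(g) for g in groups]
    if mode == "all":
        return set.intersection(*sets)
    if mode == "any":
        return set.union(*sets)
    raise ValueError(f"unknown mode: {mode!r}")


def sample_controls(cases, roll, seed=0):
    """Map each case id to one randomly chosen roll entry of the same
    age and gender, or to None when no unused match is available."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    used = {case["id"] for case in cases}
    matched = {}
    for case in cases:
        candidates = [
            person for person in roll
            if person["id"] not in used
            and person["age"] == case["age"]
            and person["gender"] == case["gender"]
        ]
        pick = rng.choice(candidates) if candidates else None
        if pick is not None:
            used.add(pick["id"])  # each control is used at most once
        matched[case["id"]] = pick
    return matched


# People in both groups (colonoscopy AND colorectal cancer).
study_ids = select_population(hmdc_colonoscopy, cancer_registry_colorectal)

# Made-up demographic details for the cases and the electoral roll.
cases = [
    {"id": "P002", "age": 61, "gender": "F"},
    {"id": "P005", "age": 70, "gender": "M"},
]
electoral_roll = [
    {"id": "E101", "age": 61, "gender": "F"},
    {"id": "E102", "age": 70, "gender": "M"},
    {"id": "E103", "age": 70, "gender": "F"},
    {"id": "E104", "age": 61, "gender": "F"},
]
controls = sample_controls(cases, electoral_roll)
```

Passing `mode="any"` would instead return everyone who appears in at least one of the groups.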
2. Extract linkage keys
Once the study population is defined, the linkage team extracts the linkage keys for each requested dataset. The Project Manager then distributes these lists of keys to the relevant data collections for the service data to be attached.
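The key extraction step might be pictured as follows. This is only a sketch: it assumes a lookup table that maps each person to a single key per dataset, which is a simplification, and the table contents, key formats, and the `extract_keys` function are invented for illustration.

```python
# Hypothetical linkage table: person id -> {dataset name: linkage key}.
linkage_table = {
    "P002": {"HMDC": "H-1002", "Cancer Registry": "C-2002"},
    "P005": {"HMDC": "H-1005"},
}


def extract_keys(population, table, datasets):
    """Build one list of linkage keys per requested dataset."""
    keys = {name: [] for name in datasets}
    for person in sorted(population):  # sorted for a stable order
        person_keys = table.get(person, {})
        for name in datasets:
            key = person_keys.get(name)
            if key is not None:  # person has no record in this dataset
                keys[name].append(key)
    return keys


key_lists = extract_keys(
    {"P002", "P005"}, linkage_table, ["HMDC", "Cancer Registry"]
)
```

Each list in `key_lists` corresponds to what would be sent to one data collection.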
3. Attach service data
The Data Custodians arrange for the requested service data from their collection to be attached to the linkage keys. For core data collections, the files are sent back to the DLB Project Manager. For some external datasets, the service data is released directly to the researcher.
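Attaching service data can be thought of as filtering a collection down to the records whose keys were requested. The sketch below assumes each service record carries its linkage key in a `key` field; the record layout and the `attach_service_data` function are hypothetical.

```python
# Made-up service records held by one data collection.
hmdc_records = [
    {"key": "H-1001", "admit_date": "2010-01-05", "procedure": "appendectomy"},
    {"key": "H-1002", "admit_date": "2010-03-14", "procedure": "colonoscopy"},
    {"key": "H-1002", "admit_date": "2011-07-02", "procedure": "colonoscopy"},
    {"key": "H-1005", "admit_date": "2012-11-20", "procedure": "colonoscopy"},
]


def attach_service_data(requested_keys, collection):
    """Return every service record whose key is in the requested list.

    A key may match several records, e.g. repeat hospital visits.
    """
    wanted = set(requested_keys)
    return [record for record in collection if record["key"] in wanted]


extract = attach_service_data(["H-1002", "H-1005"], hmdc_records)
```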
4. Checking
The service data files come back in various formats. The DLB Project Manager arranges for a DLB analyst to check that the data matches the request and to convert all the data to fixed-width text files. Supporting documentation describing the data requested is also written.
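To illustrate what a fixed-width text file looks like, the sketch below checks that each record has the requested fields and then pads every value to a set column width. The layout, field names, and both functions are assumptions made for this example; the checks a DLB analyst actually performs are not described here.

```python
# Hypothetical file layout: (field name, column width in characters).
LAYOUT = [("key", 8), ("admit_date", 10), ("procedure", 12)]

# Made-up records standing in for a returned service data file.
records = [
    {"key": "H-1002", "admit_date": "2010-03-14", "procedure": "colonoscopy"},
    {"key": "H-1005", "admit_date": "2012-11-20", "procedure": "colonoscopy"},
]


def check_fields(rows, layout):
    """Raise ValueError if a row's fields differ from the requested ones."""
    expected = {name for name, _ in layout}
    for number, row in enumerate(rows, start=1):
        if set(row) != expected:
            raise ValueError(f"row {number}: fields do not match the request")


def to_fixed_width(rows, layout):
    """Format each row as one line, padding every field to its width."""
    lines = []
    for number, row in enumerate(rows, start=1):
        parts = []
        for name, width in layout:
            value = str(row[name])
            if len(value) > width:
                raise ValueError(f"row {number}: {name!r} exceeds {width} chars")
            parts.append(value.ljust(width))  # left-aligned, space-padded
        lines.append("".join(parts))
    return lines


check_fields(records, LAYOUT)
lines = to_fixed_width(records, LAYOUT)
```

Because every line has the same length and each field starts at a known column, the supporting documentation only needs to list the field names and widths for the file to be read back.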
5. Data release
The Project Manager prepares the data for release by encrypting it and burning it to a disc, then arranges for secure delivery to the researcher.
