BiBBS person level geographic information
Table ID: BiBBS_Geographic.bibbs_geog_person
Entity type | Variables | Entities | Updated | Related tables |
---|---|---|---|---|
Participant | 14 | 8440 | 2023-05-02 | NA |
Address locations of BiBBS participants, estimated at 6-monthly intervals from the first event of study participation (e.g. birth or registration). Contains person-level information, such as age, number of property moves and data quality indicators.
Data construction: The source data for this table is address strings from monthly searches of the NHS Tracing service (e.g. health care records), which contain limited geographic information and are temporally irregular. To improve the quality of the geographic data, the address strings are passed through the OS Places API, which attempts to match each address string to a geo-located address, providing additional geographic information.
Important information on temporal data: This dataset combines data from two separate timelines, which it is important to distinguish. Temporal variables fixed to the participant's age, such as age_m, are calculated at 6-monthly intervals based on date of birth. Separately, address data from the NHS Tracing service, dated by date_address_data_received, arrives at irregular times and may not align with the 6-monthly time points used to structure the dataset. Where this occurs, address data for the affected 6-monthly time points has been interpolated from the nearest available NHS Tracing data. The variable is_data_interpolated flags whether data has been interpolated, whilst months_to_closest_data_received can be used to assess the gap in time between the participant's age and the date of the address data.
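As a rough illustration of how these two variables might be used together, the sketch below keeps only time points whose address data is either not interpolated or lies within a chosen tolerance. The rows are made-up examples, and `MAX_GAP_MONTHS` and `usable` are hypothetical names that are not part of the dataset. Using the absolute gap is an assumption, made because the summary table reports negative values for months_to_closest_data_received.

```python
# Made-up example rows using column names documented for this table.
rows = [
    {"age_m": 0.0, "is_data_interpolated": 0, "months_to_closest_data_received": 0.0},
    {"age_m": 6.0, "is_data_interpolated": 1, "months_to_closest_data_received": 6.0},
    {"age_m": 12.0, "is_data_interpolated": 1, "months_to_closest_data_received": 40.0},
    {"age_m": 18.0, "is_data_interpolated": None, "months_to_closest_data_received": None},
]

MAX_GAP_MONTHS = 12  # hypothetical tolerance, chosen only for illustration


def usable(row, max_gap=MAX_GAP_MONTHS):
    """Return True if the address data for this time point is recent enough."""
    flag = row["is_data_interpolated"]
    gap = row["months_to_closest_data_received"]
    if flag is None or gap is None:
        return False  # no address data recorded for this time point
    if flag == 0:
        return True  # not interpolated
    # Interpolated: accept only if the gap is within the tolerance.
    return abs(gap) <= max_gap


kept = [r for r in rows if usable(r)]
kept_ages = [r["age_m"] for r in kept]
```

A suitable tolerance depends on the analysis; a stricter value discards more time points but ties each address more closely to the participant's age at that point.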
Usage instructions: Can be linked to bibbs_geog_property by “property_id” and bibbs_geog_lsoa by “LSOA11CD”.
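The linkage could be sketched as two left joins, so that person rows with a missing property_id or LSOA11CD are kept rather than dropped. This is a minimal illustration, not the study's own method: all rows are made-up, the fields `example_property_field` and `example_lsoa_field` are hypothetical stand-ins for whatever the linked tables contain, and the helper `left_join` assumes each key appears once in the linked table.

```python
# Made-up person-level rows; only property_id and LSOA11CD are documented link keys.
person_rows = [
    {"age_m": 0.0, "property_id": "P1", "LSOA11CD": "E01000001"},
    {"age_m": 6.0, "property_id": "P2", "LSOA11CD": "E01000002"},
    {"age_m": 12.0, "property_id": None, "LSOA11CD": None},
]

# Hypothetical stand-ins for bibbs_geog_property and bibbs_geog_lsoa.
property_rows = [
    {"property_id": "P1", "example_property_field": "a"},
    {"property_id": "P2", "example_property_field": "b"},
]
lsoa_rows = [
    {"LSOA11CD": "E01000001", "example_lsoa_field": 1},
]


def left_join(left, right, key):
    """Attach fields from `right` to each row of `left`, matching on `key`.

    Rows in `left` with a missing or unmatched key are kept unchanged.
    Assumes `key` is unique within `right`.
    """
    index = {r[key]: r for r in right}
    joined = []
    for row in left:
        match = index.get(row[key]) if row[key] is not None else None
        extra = {k: v for k, v in (match or {}).items() if k != key}
        joined.append({**row, **extra})
    return joined


linked = left_join(left_join(person_rows, property_rows, "property_id"),
                   lsoa_rows, "LSOA11CD")
```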
variable | label | value_type | summary |
---|---|---|---|
ParticipantType | Participant's relationship in the family | text | 3 unique values; 69677 non-missing values; 5 to 7 characters |
age_m | Participant's actual age (months) | decimal | mean (sd): 210.51 (186.57); min < median < max: 0.00 < 252.00 < 792.00; complete: 69677 (100.00%) |
age_closest_data_point | Age at latest available address data (months) | decimal | mean (sd): 189.90 (184.49); min < median < max: -1.00 < 252.00 < 770.00; complete: 63460 (91.08%) |
date_address_data_received | Date of latest available address data for this time point | date | from 1986-11-15 to 2022-02-15; complete: 63460 (91.08%) |
months_to_closest_data_received | Time (months) to most recently received address data | decimal | mean (sd): 27.54 (23.10); min < median < max: -38.00 < 24.00 < 432.00; complete: 63460 (91.08%) |
property_id | Unique ID of property (BiB generated) | text | 3479 unique values; 63462 non-missing values; 1 to 40 characters |
move_no | Cumulative number of times participant has moved house, including current data point | decimal | mean (sd): 0.07 (0.26); min < median < max: 0.00 < 0.00 < 2.00; complete: 64927 (93.18%) |
total_moves | Total number of times participant has moved house within BiB study period | decimal | mean (sd): 0.09 (0.29); min < median < max: 0.00 < 0.00 < 2.00; complete: 64927 (93.18%) |
LSOA11CD | LSOA 2011 code | text | 124 unique values; 63463 non-missing values; 1 to 9 characters |
data_source | Source of data: registration or tracing | decimal | 1: Registration; 2: Tracing; complete: 64927 (93.18%) |
not_in_eng_wales | If yes (1), sensitive columns have been set to NA to avoid disclosure of data | decimal | 0: No; 1: Yes; complete: 64927 (93.18%) |
no_historical_address_data | If yes (1), no address data was available for any time point | decimal | 0: No; 1: Yes; complete: 69677 (100.00%) |
low_qual_data | If yes (1), the geocoded data is of unreliable quality and has been removed | decimal | 0: No; 1: Yes; complete: 64927 (93.18%) |
is_data_interpolated | If yes (1), address record has been interpolated from closest available data | decimal | 0: No; 1: Yes; complete: 63460 (91.08%) |
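
The coded variables in the table above are stored as decimals, so it can help to map them to their labels before analysis. The sketch below is one possible way to do this. The code-to-label mappings are taken from the summary column, while the helper name `decode` and the example row are hypothetical.

```python
# Code-to-label mappings as listed in the summary column of the table.
DATA_SOURCE_LABELS = {1: "Registration", 2: "Tracing"}
YES_NO_LABELS = {0: "No", 1: "Yes"}
FLAG_COLUMNS = [
    "not_in_eng_wales",
    "no_historical_address_data",
    "low_qual_data",
    "is_data_interpolated",
]


def decode(row):
    """Return a copy of `row` with coded values replaced by their labels.

    Missing values (None) are left as they are.
    """
    out = dict(row)
    if out.get("data_source") is not None:
        out["data_source"] = DATA_SOURCE_LABELS[int(out["data_source"])]
    for col in FLAG_COLUMNS:
        if out.get(col) is not None:
            out[col] = YES_NO_LABELS[int(out[col])]
    return out


# Made-up example row.
example = {"data_source": 2.0, "low_qual_data": 0.0, "is_data_interpolated": None}
decoded = decode(example)
```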