Born in Bradford Data Dictionary

BiB person level geographic information

Table ID: BiB_Geographic.bib_geog_person

Entity type: Participant
Variables: 14
Entities: 29664
Updated: 2023-04-26
Related tables: NA

Address location of each BiB participant, estimated at 6-month intervals from the participant's first study event (e.g. birth or registration). Contains person-level information such as age, number of house moves and data quality indicators.
Data construction: The source data for this table are address strings from monthly searches of the NHS Tracing service (e.g. health care records). These strings contain limited geographic information and are temporally irregular. To improve the quality of the geographic data, each address string is passed through the OS Places API, which attempts to match it to a geo-located address and so provide additional geographic information.
Important information on temporal data: This dataset contains data on two separate timelines, and it is important to distinguish them. Temporal variables fixed to the participant's age, such as age_m, are calculated at 6-monthly intervals from date of birth. Separately, address data from NHS tracing services, dated by date_address_data_received, arrives irregularly and may not align with the 6-monthly time points used to structure the dataset. Where this occurs, address data for some 6-monthly time points has been interpolated from the nearest available NHS tracing data. The variable is_data_interpolated flags whether a record has been interpolated, whilst months_to_closest_data_received can be used to assess the gap, in months, between the participant's age at the time point and the address data used for it.
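The two variables described above can be combined to restrict an analysis to time points with sufficiently recent address data. The following is a minimal sketch only: the rows are made-up values, and the function name usable_rows and the 12-month tolerance are illustrative assumptions, not BiB recommendations.

```python
def usable_rows(rows, max_gap_months=12):
    """Keep rows that are observed, or interpolated from data within the tolerance.

    `max_gap_months` is a hypothetical threshold chosen for illustration.
    """
    kept = []
    for row in rows:
        gap = row["months_to_closest_data_received"]
        if gap is None:
            # No address data was available for this time point.
            continue
        if row["is_data_interpolated"] == 0 or abs(gap) <= max_gap_months:
            kept.append(row)
    return kept


# Made-up example rows using the documented column names.
rows = [
    {"age_m": 0.0, "is_data_interpolated": 0, "months_to_closest_data_received": 0.0},
    {"age_m": 6.0, "is_data_interpolated": 1, "months_to_closest_data_received": -4.0},
    {"age_m": 12.0, "is_data_interpolated": 1, "months_to_closest_data_received": 30.0},
    {"age_m": 18.0, "is_data_interpolated": None, "months_to_closest_data_received": None},
]

kept = usable_rows(rows, max_gap_months=12)
```

The absolute value is used because months_to_closest_data_received takes both negative and positive values in this table; an appropriate tolerance depends on the analysis.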
Usage instructions: This table can be linked to bib_geog_property by “property_id” and to bib_geog_lsoa by “LSOA11CD”.
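As an illustration of this linkage, the sketch below left-joins person-level rows to property and LSOA rows on the two documented keys. It assumes each table has been loaded as a list of dictionaries; the helper left_join, the key values and the fields example_property_field and example_lsoa_field are hypothetical placeholders, not actual columns of bib_geog_property or bib_geog_lsoa.

```python
def left_join(left_rows, right_rows, key):
    """Left-join two lists of dict rows on `key`; unmatched left rows are kept."""
    lookup = {row[key]: row for row in right_rows}
    joined = []
    for row in left_rows:
        match = lookup.get(row.get(key), {})
        # Person-level values win if a column name appears in both tables.
        joined.append({**match, **row})
    return joined


# Made-up rows; only property_id and LSOA11CD are documented linkage keys.
person_rows = [
    {"age_m": 0.0, "property_id": "P1", "LSOA11CD": "LSOA_A"},
    {"age_m": 6.0, "property_id": None, "LSOA11CD": None},  # no address data
]
property_rows = [{"property_id": "P1", "example_property_field": "value A"}]
lsoa_rows = [{"LSOA11CD": "LSOA_A", "example_lsoa_field": "value B"}]

linked = left_join(
    left_join(person_rows, property_rows, "property_id"),
    lsoa_rows,
    "LSOA11CD",
)
```

A left join keeps every person-level time point, including those with a missing property_id or LSOA11CD, so incomplete rows are not silently dropped.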

variable label value_type summary
ParticipantType Participant's relationship in the family text 3 unique values
851932 non-missing values
5 to 6 characters
age_m Participant's actual age (months) decimal mean (sd)
266.60 (185.82)

min < median < max
0.00 < 294.00 < 912.00

Complete: 851932 (100.00%)

age_closest_data_point Age at latest available address data (months) decimal mean (sd)
267.47 (188.16)

min < median < max
-1.00 < 300.00 < 911.00

Complete: 834563 (97.96%)

date_address_data_received Date of latest available address data for this time point date from 2007-04-15
to 2023-04-15

Complete: 834563 (97.96%)

months_to_closest_data_received Time (months) to most recently received address data decimal mean (sd)
-1.34 (16.92)

min < median < max
-109.00 < 0.00 < 186.00

Complete: 834563 (97.96%)

property_id Unique ID of property (BiB generated) text 32009 unique values
834579 non-missing values
1 to 40 characters
move_no Cumulative number of times participant moved house including current data point decimal mean (sd)
0.92 (1.40)

min < median < max
0.00 < 0.00 < 18.00

Complete: 850964 (99.89%)

total_moves Total number of times participant has moved house within BiB study period decimal mean (sd)
1.58 (1.88)

min < median < max
0.00 < 1.00 < 18.00

Complete: 850964 (99.89%)

LSOA11CD LSOA 2011 code text 3502 unique values
834573 non-missing values
1 to 9 characters
data_source Source of data: registration or tracing decimal 1 : Registration
2 : Tracing
Complete: 850964 (99.89%)

not_in_eng_wales If yes (1), sensitive columns have been set to NA to avoid disclosure of data decimal 0 : No
1 : Yes
Complete: 850964 (99.89%)

no_historical_address_data If yes (1), no address data was available for any time point decimal 0 : No
1 : Yes
Complete: 851932 (100.00%)

low_qual_data If yes (1), the geocoded data is of unreliable quality and has been removed decimal 0 : No
1 : Yes
Complete: 850964 (99.89%)

is_data_interpolated If yes (1), address record has been interpolated from closest available data decimal 0 : No
1 : Yes
Complete: 834563 (97.96%)
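
The coded variables above (data_source and the four yes/no flags) are stored as numbers. A minimal sketch of decoding them into their labels follows; the code-to-label mappings are taken from the table, while the function name decode_row and the example row are illustrative assumptions.

```python
# Code-to-label mappings as listed in the variable table.
DATA_SOURCE_LABELS = {1: "Registration", 2: "Tracing"}
YES_NO_LABELS = {0: "No", 1: "Yes"}
FLAG_COLUMNS = [
    "not_in_eng_wales",
    "no_historical_address_data",
    "low_qual_data",
    "is_data_interpolated",
]


def decode_row(row):
    """Return a copy of `row` with coded values replaced by their labels.

    Missing or unrecognised codes become None. Values must be numeric
    (1 or 1.0 both work); strings such as "1" are not converted.
    """
    decoded = dict(row)
    if "data_source" in row:
        decoded["data_source"] = DATA_SOURCE_LABELS.get(row["data_source"])
    for col in FLAG_COLUMNS:
        if col in row:
            decoded[col] = YES_NO_LABELS.get(row[col])
    return decoded


# Made-up example row.
example = decode_row(
    {"data_source": 2, "low_qual_data": 0, "is_data_interpolated": None}
)
```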