Stable characteristics – information collected once during the first interview
Some datafiles are particularly useful for analysts, irrespective of their area of research.
| Filename | Description |
|---|---|
| w_indall bw_indall | Household grid data for all persons in the household, including children and non-respondents. The variable pidp or the combination of variables “w_hidp w_pno” uniquely identifies each row of w_indall. The variable pidp or the combination of variables “bw_hidp bw_pno” uniquely identifies each row of bw_indall. |
| w_hhresp bw_hhresp | Substantive data collected from responding households. The variable w_hidp uniquely identifies each row in w_hhresp. The variable bw_hidp uniquely identifies each row in bw_hhresp. |
| w_indresp bw_indresp | Substantive data collected from responding adults (16+) including proxies. Some information collected in these questionnaires is better presented in multi-level files (see Table 2). The variable pidp or the combination of variables “w_hidp w_pno” uniquely identifies each row of w_indresp. The variable pidp or the combination of variables “bw_hidp bw_pno” uniquely identifies each row of bw_indresp. |
| w_youth bw_youth | Substantive data from the youth questionnaire. The variable pidp or the combination of variables “w_hidp w_pno” uniquely identifies each row of w_youth. The variable pidp or the combination of variables “bw_hidp bw_pno” uniquely identifies each row of bw_youth. |
| w_child bw_child | Childcare, consents and school information of all children (0-15 years) in the household. This is a derived data file collecting information pertaining to children as reported by their parents and guardians in the adult questionnaire. The variable pidp or the combination of variables “w_hidp w_pno” uniquely identifies each row of w_child. |
| w_egoalt bw_egoalt | Kin and other relationships between pairs of individuals in the household. This is a derived data file based on information collected in the household grid about relationships between household members. The combination of variables “pidp apidp” or “w_hidp w_pno w_apno” uniquely identifies each row in w_egoalt. The combination of variables “pidp apidp” or “bw_hidp bw_pno bw_apno” uniquely identifies each row in bw_egoalt. |
| w_income bw_income | This file contains reports of unearned income and state benefits for each individual. The combination of variables “pidp w_fiseq” uniquely identifies each row in w_income. The combination of variables “pidp bw_ficode bw_fiseq” uniquely identifies each row in bw_income. |
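
The identifier statements in the table above can be checked before files are merged. The following is a minimal sketch in plain Python, assuming a file has already been read into a list of dictionaries. The rows and their values are invented for illustration, and `is_unique_key` is a hypothetical helper rather than part of any Understanding Society tooling.

```python
from collections import Counter


def is_unique_key(rows, key_vars):
    """Return True if the combination of key_vars identifies each row uniquely."""
    counts = Counter(tuple(row[v] for v in key_vars) for row in rows)
    return all(n == 1 for n in counts.values())


# Invented rows standing in for a_indall (the Wave 1 household grid file).
a_indall = [
    {"pidp": 1001, "a_hidp": 500, "a_pno": 1},
    {"pidp": 1002, "a_hidp": 500, "a_pno": 2},
    {"pidp": 1003, "a_hidp": 501, "a_pno": 1},
]

print(is_unique_key(a_indall, ["pidp"]))             # True
print(is_unique_key(a_indall, ["a_hidp", "a_pno"]))  # True
print(is_unique_key(a_indall, ["a_hidp"]))           # False: two people share household 500
```
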
| Filename | Description |
|---|---|
| xwavedat | Stable characteristics of individuals, such as date of birth, country of birth and ethnicity, which are typically collected only once in the lifetime of the Study, are picked from different data files and put into this file. This file now includes all sample members ever enumerated in either Understanding Society or BHPS, and variables have been harmonised across studies where possible. The variable pidp uniquely identifies each row. |
| xivdata xivdata_bh | Some basic information about interviewers is stored in these files (non-harmonised). These are available in the Special License version of the survey SN8579. |
| xwaveid xwaveid_bh | Some basic sampling information from each wave, such as interview outcomes, is included in these files (non-harmonised). The variable pidp uniquely identifies each row in these files. |
| xwlsten | Contains information on the latest known sample status of individuals (Only BHPS). The variable pidp uniquely identifies each row. |
| xhhrel | Family matrix file which allows family members and households to be connected over time. The file compiles existing Understanding Society main survey data, particularly from the egoalt and indall files, and includes every sample member ever enumerated as part of the study. The variable pidp uniquely identifies each row and the variable osm_hh identifies the cross-wave household each pidp belongs to. |
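
Because pidp uniquely identifies each row of xwavedat, stable characteristics can be attached to any wave-specific individual file with a simple lookup on pidp. The sketch below uses plain Python and invented rows. `merge_on_pidp` is a hypothetical helper, and `birth_year` is a placeholder variable name, not an actual xwavedat variable.

```python
def merge_on_pidp(wave_rows, xwave_rows, keep_vars):
    """Left-join selected cross-wave variables onto wave-specific rows by pidp."""
    # Safe only because pidp is unique in the cross-wave file.
    lookup = {row["pidp"]: row for row in xwave_rows}
    merged = []
    for row in wave_rows:
        stable = lookup.get(row["pidp"], {})
        extra = {v: stable.get(v) for v in keep_vars}
        merged.append({**row, **extra})
    return merged


# Invented rows standing in for a_indresp and xwavedat.
a_indresp = [
    {"pidp": 1001, "a_hidp": 500, "a_pno": 1},
    {"pidp": 1002, "a_hidp": 500, "a_pno": 2},
]
xwavedat = [
    {"pidp": 1001, "birth_year": 1970},  # birth_year is a placeholder name
    {"pidp": 1002, "birth_year": 1985},
    {"pidp": 1003, "birth_year": 2001},
]

merged = merge_on_pidp(a_indresp, xwavedat, ["birth_year"])
for row in merged:
    print(row)
```
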
| Filename | Description |
|---|---|
| a_natchild f_natchild n_natchild2 bb_childnt bk_childnt bl_childnt | Some basic information about all biological children born to the sample members, whether co-resident or not. These are collected in the first wave for any sample. So, for example, a_natchild was collected in Wave 1 of UKHLS for GPS, EMBS and bl_childnt was collected in Wave 12 of the BHPS for the Northern Irish boost sample. These files are not harmonised. The combination of variables “pidp w_childno” or “w_hidp w_pno w_childno” uniquely identifies each row of the files w_natchild. The combination of variables “pidp bw_lncno” or “bw_hidp bw_pno bw_lncno” uniquely identifies each row of the files bw_childnt. |
| a_adopt f_adopt n_adopt n_stepchild bb_childad bk_childad bl_childad | Some basic information about all adopted and step children of the sample members, whether co-resident or not. These are collected in the first wave for any sample. So, for example, a_adopt was collected in Wave 1 for GPS, EMBS and bl_childad was collected in Wave 12 of the BHPS for the Northern Irish boost sample. These files are not harmonised. Note that in Wave 14, information about stepchildren for GPS2 was collected separately from that of adopted children and is available in a separate file n_stepchild. The combination of variables “pidp w_adoptno” or “w_hidp w_pno w_adoptno” uniquely identifies each row of the files w_adopt. The combination of variables “pidp bw_lacno” or “bw_hidp bw_pno bw_lacno” uniquely identifies each row of the files bw_childad. |
| w_newborn | Every wave after Wave 1, basic information about newborn children, such as birth weight, is collected from new parents. The combination of variables “pidp w_newchno” or “w_hidp w_pno w_newchno” uniquely identifies each row of the files w_newborn. |
| w_chmain | Information about child maintenance arrangements was collected in waves 3, 5, 7, 9,… The combination of variables “pidp c_absparno” or “c_hidp c_pno c_absparno” uniquely identifies each row in c_chmain. The combination of variables “pidp w_childpno” or “w_hidp w_pno w_childpno” uniquely identifies each row in w_chmain where w is e, g, i, k,… |
| w_parstyle | Every wave from Wave 4 onwards, information about parenting styles was collected. The combination of variables “pidp w_childpno” or “w_hidp w_pno w_childpno” uniquely identifies each row of the files w_parstyle. |
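
In the file and variable names above, w stands for the wave letter (a_ for Wave 1, n_ for Wave 14) and BHPS names carry an additional leading b (bl_ for BHPS Wave 12). A minimal sketch of that naming rule in plain Python follows. `wave_prefix` and `wave_names` are hypothetical helpers introduced here for illustration, and the upper bound of 26 is simply the length of the alphabet, not a statement about how many waves exist.

```python
import string


def wave_prefix(wave, bhps=False):
    """Return the name prefix for a wave number, e.g. 1 -> 'a_', BHPS 12 -> 'bl_'."""
    if not 1 <= wave <= 26:
        raise ValueError("wave must be between 1 and 26")
    letter = string.ascii_lowercase[wave - 1]
    return ("b" if bhps else "") + letter + "_"


def wave_names(stems, wave, bhps=False):
    """Prefix each wave-specific stem; pidp never takes a prefix."""
    prefix = wave_prefix(wave, bhps)
    return [s if s == "pidp" else prefix + s for s in stems]


print(wave_prefix(5))                                   # e_
print(wave_prefix(12, bhps=True))                       # bl_
print(wave_names(["pidp", "childpno"], 5))              # ['pidp', 'e_childpno']
print(wave_names(["hidp", "pno", "lncno"], 12, True))   # ['bl_hidp', 'bl_pno', 'bl_lncno']
```
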
| Filename | Description |
|---|---|
| a_marriage f_marriage ba_marriag bk_marriag bl_marriag | Start and end dates of past marriages and how each marriage ended were collected during adult interviews in the first wave a sample was selected. So, for example, a_marriage was collected in Wave 1 for GPS & EMBS (non-harmonised). The combination of variables “pidp w_marno” or “w_hidp w_pno w_marno” uniquely identifies each row of the files w_marriage. The combination of variables “pidp bw_marno” or “bw_hidp bw_pno bw_marno” uniquely identifies each row of the files bw_marriag. |
| a_cohabit f_cohabit bb_cohab bk_cohab bl_cohab | Start and end dates of past cohabitations and how each cohabitation ended were collected during adult interviews in the first wave a sample was selected. So, for example, a_cohabit was collected in Wave 1 for GPS & EMBS (non-harmonised). The combination of variables “pidp w_cohabno” or “w_hidp w_pno w_cohabno” uniquely identifies each row of the files w_cohabit. The combination of variables “pidp bw_lcsno” or “bw_hidp bw_pno bw_lcsno” uniquely identifies each row of the files bw_cohab. |
| bw_jobhist (bw_jobhistd) | Contains information about employment history between two waves collected during adult interviews (Only BHPS). The combination of variables “pidp bw_jspno” or “bw_hidp bw_pno bw_jspno” uniquely identifies each row in these files. |
| a_empstat e_empstat bw_lifemst bk_lifemst bl_lifemst | Employment history was collected during adult interviews in Wave 1 for part of the GPS & EMB samples and in Wave 5 for the rest of the samples; it was not asked for the IEMBS in Wave 6 (non-harmonised). The combination of variables “pidp w_spellno” or “w_hidp w_pno w_spellno” uniquely identifies each row of w_empstat. The combination of variables “pidp bw_leshno” or “bw_hidp bw_pno bw_leshno” uniquely identifies each row of bw_lifemst. |
| bc_lifejob | Contains information about jobs held in employment spells (Only BHPS). The combination of variables “pidp bw_ljseq” or “bw_hidp bw_pno bw_ljseq” uniquely identifies each row of this file. |
| w_hhsamp bw_hhsamp | This file contains information that the interviewer collects about each household, such as the condition of the property, the neighbourhood and the interview outcome. The variable w_hidp uniquely identifies each row in w_hhsamp. The variable bw_hidp uniquely identifies each row in bw_hhsamp. |
| w_indsamp bw_indsamp | Includes current interview outcome for anyone enumerated in the last interview wave, for example, whether they have responded, were only enumerated, couldn’t be contacted or refused, or were ineligible. If you restrict the data to cases where w_finloc=1/bw_finloc=1, then pidp uniquely identifies each row. |
| w_callrec | Includes information about each interview call made to each household, such as outcome of the call, interview ID. The combination of variables “w_hidp w_issueno w_callno” uniquely identifies each row in this file. |
| Timings Files | Various files are available that capture the time taken to complete questions and modules within individual and household questionnaires. Given that these files vary from wave to wave and are of limited, specialist use only, they are not released as standard. If you want to make use of them please contact usersupport@understandingsociety.ac.uk who will be happy to advise. |
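
The note on w_indsamp above says that pidp identifies each row only after restricting to cases where w_finloc=1. The sketch below illustrates that restriction in plain Python with invented rows. `restrict_to_finloc` and `pidp_is_unique` are hypothetical helpers, and the value 2 used for the second row is an arbitrary stand-in for any finloc value other than 1.

```python
def restrict_to_finloc(rows, finloc_var):
    """Keep only the rows where the wave's finloc variable equals 1."""
    return [row for row in rows if row[finloc_var] == 1]


def pidp_is_unique(rows):
    """Return True if no pidp appears in more than one row."""
    pidps = [row["pidp"] for row in rows]
    return len(pidps) == len(set(pidps))


# Invented rows standing in for b_indsamp; pidp 1001 appears twice.
b_indsamp = [
    {"pidp": 1001, "b_finloc": 1},
    {"pidp": 1001, "b_finloc": 2},  # arbitrary non-1 value, for illustration only
    {"pidp": 1002, "b_finloc": 1},
]

print(pidp_is_unique(b_indsamp))   # False
restricted = restrict_to_finloc(b_indsamp, "b_finloc")
print(pidp_is_unique(restricted))  # True
```

Running the uniqueness check after the restriction is a cheap safeguard before merging w_indsamp with other individual-level files on pidp.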



