Dissemination Protocol

Final decisions about data dissemination will be made only after the data has been collected, processed and reviewed for unique disclosure issues. Planning of the dissemination strategies, however, will begin when the data collection activity itself is being planned. Concerns about data dissemination procedures, both in general and with regard to particular data components, will be addressed during the planning process, particularly for collaborative activities. Nevertheless, GNHR retains the power to address, at any point in the data collection process, how data quality will be evaluated and how decisions will be made to withhold data that fails to meet quality standards.

Modalities of Microdata Files for Dissemination

i. As mentioned above, the purpose of building the GNHR is to consolidate structured and organized key information about current and potential beneficiaries of social programmes into a single common database, in order to create a single entry point for citizens to access the main social protection programmes. For that purpose, the GNHR database is composed of households ranked according to their level of poverty (non-poor / poor / extremely poor). In addition to the socio-economic categorization variables, the database contains individual information on each household member; biometrics of every household member above the age of 15, to ensure effective identification of people and to reduce the chance of duplication; a photograph of each household member; and the geolocation of each household. Any social programme may apply its own inclusion criteria to screen the GNHR database for potentially eligible beneficiaries.

ii. The GNHR creates multiple versions of any given microdata file; these differ in quality, content and number of records. They range from raw microdata files (containing all replies by each respondent, as obtained immediately after data entry) to cleaned and edited files for public use.

iii. As a general rule, the microdata files that GNHR will disseminate are the cleaned and edited files. If an SP Programme requires raw microdata files, it must submit a special request in which the requestor, on behalf of the SP Programme, explains the purpose and intended use of the data. This request must be approved by the GNHR Coordinator.

iv. In the event that a request for raw data files is made by an entity or person other than an SP Programme, GNHR will need to adjust the content and/or number of records. The content of records in microdata files for dissemination will be edited by suppressing direct and indirect identifiers to protect the anonymity of respondents. Suppressing information does not necessarily mean removing variables. In some cases, re-coding variables into less detailed categories to make them less informative will be sufficient; in other cases, it will be necessary to truncate the number of records contained in a disseminated microdata file, especially in the case of population census data, as a way to guarantee anonymization. The decision on what type of adjustment to apply to the data will always rest with GNHR, after it has evaluated the reasons for the information request.

v. In the context of this Protocol, data files will be disseminated in five (5) modalities. These files differ in their level of accessibility to users and the extent to which they are anonymized.
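Two of the adjustments described in item iv, re-coding a variable into less detailed categories and truncating the number of records, may be illustrated with the minimal sketch below. The field names, the age bands and the sampling fraction are assumptions made for illustration only; the Protocol does not prescribe them, and the actual adjustment remains a decision of GNHR.

```python
import random


def recode_age(age: int) -> str:
    """Re-code an exact age into a broad band (band limits are illustrative)."""
    if age < 15:
        return "0-14"
    if age < 65:
        return "15-64"
    return "65+"


def adjust_for_release(records, sample_fraction, seed=0):
    """Return a reduced, coarsened copy of `records` (a list of dicts)."""
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    keep = min(len(records), max(1, round(len(records) * sample_fraction)))
    sampled = rng.sample(records, keep)  # truncate the number of records
    released = []
    for row in sampled:
        row = dict(row)  # work on a copy; the source file is left untouched
        row["age_band"] = recode_age(row.pop("age"))  # exact age is not released
        released.append(row)
    return released


# Hypothetical raw records with an exact age.
raw = [{"record_id": i, "age": a} for i, a in enumerate([3, 17, 42, 66, 80, 29])]
out = adjust_for_release(raw, sample_fraction=0.5)
print(len(out), sorted(out[0].keys()))
```

In this sketch the released rows carry only the age band, and only half of the records are kept; the original list is not modified.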

Social Protection Programmes Files

These are the files to be shared with the SP Programmes so that they can apply their own inclusion criteria to screen for potentially eligible beneficiaries. These will be non-anonymized, cleaned and edited files. For this reason, each SP Programme shall develop an MoU with GNHR before starting any data sharing process.

Public Use Files (PUFs)

PUFs will be available to anyone who agrees to respect a core set of easy-to-meet conditions. These conditions relate to what can be done with the data (e.g. the data cannot be sold). PUFs will be available online, since the risk of identifying individual respondents is considered minimal. To that end, all content that can identify respondents directly (for instance, names, addresses and telephone numbers) will be removed. In addition, all relevant indirect identifiers, e.g. geographical information below the sub-national level, will be purged from the microdata file.
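The removal of identifiers from a PUF, as described above, may be pictured with the following minimal sketch. All field names are hypothetical, and treating "region" as the sub-national level is an assumption; a real file would follow the registry's own variable list.

```python
# Hypothetical field names, used for illustration only.
DIRECT_IDENTIFIERS = {"name", "address", "telephone"}
GEOGRAPHY_BELOW_SUBNATIONAL = {"district", "community", "gps_lat", "gps_lon"}


def to_public_use(record: dict) -> dict:
    """Return a copy of `record` without direct identifiers or fine geography."""
    blocked = DIRECT_IDENTIFIERS | GEOGRAPHY_BELOW_SUBNATIONAL
    return {k: v for k, v in record.items() if k not in blocked}


row = {
    "name": "Example Person",
    "address": "Example Street 1",
    "telephone": "000-0000",
    "region": "Example Region",      # assumed to be the sub-national level
    "district": "Example District",  # below sub-national, so it is dropped
    "poverty_level": "poor",
}
public_row = to_public_use(row)
print(public_row)
```

Only the region and the poverty level remain in the public record; the original record is left unchanged.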

Licensed Files / Research Files

Dissemination of this category of files is restricted to users who have been authorized to access them after submitting application documents and signing an agreement governing the use of GNHR data. The licensed files will be anonymized by removing direct identifiers, such as household members' names, to reduce the risk of identifying individuals. The data files may, however, still contain indirect identifiers that could identify respondents when matched against other data files, such as the voter ID database or the National ID database.
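The residual risk from indirect identifiers can be illustrated by counting how many records share the same combination of such variables: a record whose combination is unique is the easiest to match against an external file. The sketch below is one possible check, not a procedure stated in this Protocol; the variable names and the threshold k are assumptions.

```python
from collections import Counter


def risky_records(records, quasi_identifiers, k=2):
    """Return records whose combination of indirect identifiers occurs fewer than k times."""
    def key(row):
        return tuple(row[q] for q in quasi_identifiers)

    counts = Counter(key(row) for row in records)
    return [row for row in records if counts[key(row)] < k]


# Hypothetical licensed-file rows: names removed, indirect identifiers retained.
rows = [
    {"region": "Region A", "sex": "F", "birth_year": 1980},
    {"region": "Region A", "sex": "F", "birth_year": 1980},
    {"region": "Region B", "sex": "M", "birth_year": 1951},
]
flagged = risky_records(rows, ["region", "sex", "birth_year"])
print(flagged)
```

Here only the third row is flagged, because no other record shares its combination of region, sex and year of birth.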

GNHR Data Enclave

Some files will be offered to users under strict conditions in a data enclave. The GNHR data enclave will contain data that is particularly sensitive or that allows direct or easy identification of respondents. Examples include complete population census datasets and certain personal datasets containing highly confidential information. Users interested in accessing the data enclave will be given access to the particular approved data subset they have requested. Interested users will be required to complete an application form demonstrating a legitimate need to access the data to fulfil a stated statistical or research purpose. All outputs generated must undergo a full disclosure review by GNHR before release.
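The Protocol does not specify how the disclosure review of outputs is carried out. As one illustration, a commonly used check is to withhold table cells based on very few respondents. The sketch below assumes such a rule; the threshold of 5 and the table layout are hypothetical and are not GNHR rules.

```python
MIN_CELL_SIZE = 5  # illustrative threshold, not a GNHR rule


def review_table(table: dict, min_cell_size: int = MIN_CELL_SIZE) -> dict:
    """Replace counts below the threshold with None so that they are not released."""
    return {cell: (n if n >= min_cell_size else None) for cell, n in table.items()}


# Hypothetical output produced inside the enclave: household counts per cell.
counts = {
    ("Region A", "poor"): 120,
    ("Region A", "extremely poor"): 3,
}
reviewed = review_table(counts)
print(reviewed)
```

The cell with three households is withheld, while the larger cell is released unchanged.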

Remote Job Submission

This approach allows users to analyse confidential data by submitting their data processing and analysis programmes remotely to the GNHR data depositor. The user is given a synthetic dataset that replicates the structure and content of the actual datasets, which enables the researcher to develop programmes using tools such as SAS, SPSS or Stata. The programmes are then transmitted to the GNHR data depositor staff, who run the job against the actual dataset. The results are vetted for disclosure and then returned to the user. This process may carry a cost for the requester.
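The workflow described above can be sketched as follows. Python is used here purely for illustration, in place of SAS, SPSS or Stata. The schema, the column names and the example job are hypothetical, and the synthetic generator is deliberately simplistic: it reproduces the column structure with invented values rather than the statistical content of the real file.

```python
import random

# Hypothetical schema: column name -> function that draws one synthetic value.
SCHEMA = {
    "region": lambda rng: rng.choice(["Region A", "Region B"]),
    "household_size": lambda rng: rng.randint(1, 12),
    "poverty_level": lambda rng: rng.choice(["non-poor", "poor", "extremely poor"]),
}


def make_synthetic(n_rows, seed=0):
    """Build rows with the same columns as the real file but invented values."""
    rng = random.Random(seed)
    return [{col: draw(rng) for col, draw in SCHEMA.items()} for _ in range(n_rows)]


def user_job(rows):
    """The requester's analysis: mean household size per poverty level."""
    totals = {}
    for row in rows:
        size_sum, count = totals.get(row["poverty_level"], (0, 0))
        totals[row["poverty_level"]] = (size_sum + row["household_size"], count + 1)
    return {level: s / c for level, (s, c) in totals.items()}


# Step 1 (requester): develop and test the job on synthetic rows.
draft_result = user_job(make_synthetic(100))

# Step 2 (GNHR staff): run the same job on the actual rows. A small stand-in
# list is used here, since the actual data is held by GNHR.
actual_rows = [
    {"region": "Region A", "household_size": 4, "poverty_level": "poor"},
    {"region": "Region B", "household_size": 6, "poverty_level": "poor"},
]
result = user_job(actual_rows)  # to be vetted for disclosure before return
print(result)
```

The point of the design is that the requester's programme only needs the column structure to be developed and debugged; it is run on the actual records solely by GNHR staff.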