Date came into force or revised
In effect July 1, 2004; revised 2009; revised 2020
Status
Current
Policy statement
This policy is intended to prevent the possibility of associating statistical data with an identifiable individual. To protect the privacy of individuals, very small population numbers must be suppressed (masked) when the Ministry of Education and Child Care reports or otherwise publicly releases aggregated data.
Rationale or purpose of policy
This document outlines the data masking policy for the de-identification of personal information as used by the BC Ministry of Education (“The Ministry”). The policy provides direction and tools for risk-based assessment to support the determination of the de-identification method, and the appropriate de-identification level. The policy and guidance are based on de-identified data being the endpoint of an iterative, risk-based assessment. The purpose of this policy is to prevent unauthorized access to or disclosure of any form of personally identifiable information present in any data products produced by the Ministry.
The Ministry may consider de-identifying personal information for the following reasons:
- As a process of de-identification, aggregation and/or anonymization of personal information to the point where the data cannot be used, either alone or with other information, to identify any individual, based on what is reasonably foreseeable in the circumstances
- As a way of complying with the security provisions of FOIPPA, ensuring that reasonable steps have been taken to protect personal information from unauthorized disclosure (de-identification being one of many possible tools available).
This policy will serve the following objectives:
- Provide the Ministry of Education a policy framework in which personal information collected for the planning and evaluation of the Kindergarten to Grade 12 education sector is protected, and individuals cannot be identified within data products produced by the Ministry;
- Create a uniform and practical approach to de-identification to ensure consistency in its definition, language, and application;
- Implementation of this policy will enable the Ministry to:
- balance the risk of re-identification of de-identified personal information with the benefit and utility of de-identified data;
- demonstrate compliance with FOIPPA;
- mitigate the risk of non-compliance with FOIPPA;
- enhance public accountability by developing trust and confidence that data are used for public good while privacy is being protected; and
- better enable meaningful information sharing and research.
- Define authorities, responsibilities, and accountabilities for appropriate use and disclosure of personal information;
- Establish technological procedures for the preparation of data products containing de-identified, aggregated or anonymized personal information.
Authority
This policy is in alignment with the provisions of FOIPPA that are already in place. The following legal considerations guided the development of this Policy and provided context for its application.
Freedom of Information and Protection of Privacy Act (FOIPPA)
The FOIPPA provides the public with a right of access to information in the custody or under the control of provincial government Ministries and other public bodies. It also protects personal information by prohibiting the unauthorized collection, use, or disclosure of personal information by public bodies. This masking policy seeks to supplement the goals of the FOIPPA and the BC Government Open Information and Open Data Policy by encouraging the proactive disclosure and routine release of government information while also ensuring the protection of personal information and other confidential information through de-identification and/or aggregation of data.
This policy does not:
- affect an individual’s right of access to information under the FOIPPA; or,
- extend an individual’s right under the FOIPPA to request that the information and privacy commissioner review a proactive disclosure or routine release decision (that right only applies to a request for access made under the FOIPPA).
With respect to personal information, this policy:
- maintains the protection of personal information in accordance with the FOIPPA; and,
- does not affect an individual’s right to make a complaint to the information and privacy commissioner if personal information is inappropriately disclosed under this policy.
Policy Overview
The specific application of this policy is placed into four categories of data:
- No Change,
- Level 1,
- Level 2 and
- Special Case.
An important change in the Ministry’s masking policy is the removal of school level data from certain publicly released data products. An additional change is that any and all cells of fewer than 10 observations will be masked in a dataset to be released publicly (e.g. through Open Data or through a Freedom of Information (FOI) request). Under the previous masking policy, some data sets released to Open Data or through the FOI process included cells that reported one through nine observations in sensitive categories such as special needs categories and small communities. This may lead to the risk of potential re-identification of individual(s) either accurately or through the mosaic effect (the ability to combine Ministry data sets together with other publicly available information to identify and ascertain personal details about individuals (e.g. students, staff) in the BC Education system).
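The small-cell rule described above can be illustrated with a minimal Python sketch. The function name `mask_small_cells`, the category labels, and the sample counts are hypothetical and used for illustration only; the sketch also assumes that a count of zero is treated like any other cell below the threshold. Only the threshold of 10 and the ‘Msk’ marker are taken from this policy.

```python
MASK_THRESHOLD = 10   # from the policy: cells with fewer than 10 observations
MASK_MARKER = "Msk"   # marker this policy uses for masked values


def mask_small_cells(counts):
    """Return a copy of `counts` with every cell below the threshold masked."""
    return {
        cell: (n if n >= MASK_THRESHOLD else MASK_MARKER)
        for cell, n in counts.items()
    }


# Hypothetical counts, for illustration only.
example = {"Category A": 152, "Category B": 9, "Category C": 10, "Category D": 0}
masked = mask_small_cells(example)
print(masked)
```

A cell with exactly 10 observations is left unmasked by this rule alone; under this policy it would still be subject to the risk assessment that applies to cells of ten or more observations.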
The desired end point of the de-identification process should consider that the identifiability of an individual lies on a spectrum (see below), with graduated, managed privacy risks throughout. On one end, lower de-identification may be appropriate, and in other cases, higher de-identification (i.e., anonymization) may be required. The use of low/moderate de-identification serves to increase the security of the data while still being personal information, whereas anonymization may permit the Ministry to conclude that the provisions of FOIPPA are no longer applicable, because the data product contains no personal information (i.e., fully anonymized data).
The de-identification level and method will vary depending on the balance of re-identification risk and retaining data utility. The Ministry will ensure that the disclosure of de-identified data is compliant with FOIPPA (e.g. if information is de-identified, but still contains indirect identifiers, the disclosure must be authorized by Part 3 of FOIPPA).
Policy in full
The following policy must be adhered to by the Ministry when masking data products containing personal information that are intended for disclosure outside of the public body, or for public reporting, FOI requests, or disclosure to external stakeholders, researchers, or internal stakeholders.
Process:
- Any and all cells of fewer than 10 observations will be masked in any data set to be released publicly (i.e. through Open Data or in response to a Freedom of Information (FOI) request).
- For cells of ten or more observations, assess the level of re-identification risk, and then, based on that assessment, determine the appropriate level and the specific method(s) to mask the data.
- Assess de-identified data that can no longer be considered identifiable personal information (i.e. subject to FOIPPA), where any data that can directly identify an individual and/or for which there is a “reasonable expectation” that it could be used, either alone or with other information to identify the individual, has been removed.
- To confidently determine that the de-identified data is no longer considered identifiable personal information under FOIPPA, re-identifiability via the mosaic effect must be considered by the Ministry. All direct identifiers must be removed, and indirect identifiers manipulated so that the probability of re-identification is very low, while maintaining the utility of the data.
- The Ministry must consider three types of data masking based on:
- how the data are distributed and the effectiveness of the de-identification (method and level), e.g. doing at source masking.
- different types of de-identification to be analyzed (e.g. public releases are more likely to be subject to demonstration attacks whereas non-public data releases are more likely to be subject to re-identification attacks posed by insiders and breaches).
- removing data and showing ranges of exact numeric value, e.g. student outcomes, etc.
- While FOIPPA does not prevent disclosure of de-identified data, the Ministry will consider the ethical implications of de-identified data disclosure by carefully considering the context for the disclosure of de-identified data, including the method and level of de-identification applied, risk of re-identification and the intended objective of the use and/or disclosure of the data product.
- The Ministry will document the de-identification process and its results for the purposes of demonstrating due diligence, to enable the investigation of potential incidents of re-identification, and to provide evidence of compliance with this masking policy.
- Where the result of the de-identification process is considered hazardous (case where risk level is high and de-identification level is low), the Ministry will ensure that appropriate risk mitigation measures are applied.
- The Ministry must ensure that all Ministry-specific policies and standards related to the de-identification of personally identifiable information be reviewed by the Corporate Information and Records Management Office (CIRMO).
- Should any government employee discover an actual or suspected incident where de-identified data may have been re-identified by an unauthorized party, the employee must comply with the Information Incident Management Policy and immediately notify their supervisor and report to the Information Management Investigations Unit (IMIU) within the Office of the Chief Information Officer’s (OCIO) Corporate Information and Records Management Office by calling 250-387-7000 (toll-free: 1-866-660-0811) and selecting option 3.
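One of the masking types listed in the process above, showing ranges in place of exact numeric values, can be sketched as follows. This is an illustrative sketch only: the function name `to_range` and the default range width of 10 are assumptions, since this policy does not prescribe how ranges are to be constructed.

```python
def to_range(value, width=10):
    """Replace an exact non-negative integer with the range that contains it.

    The range width is a hypothetical parameter; the policy does not set one.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    if width < 1:
        raise ValueError("width must be at least 1")
    low = (value // width) * width
    high = low + width - 1
    return f"{low}-{high}"


# Hypothetical values, for illustration only.
for exact in (0, 87, 100):
    print(exact, "->", to_range(exact))
```

Wider ranges lower the re-identification risk but also reduce the utility of the data, which is the balance this policy asks the Ministry to assess.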
Additional considerations
- The Ministry may not release the following non-exhaustive list of identifiable personal information in any published reports/public document:
- the individual's name, address or telephone number;
- the individual's race, national or ethnic origin, colour, or religious or political beliefs or associations;
- the individual's age, sex, sexual orientation, marital status or family status;
- an identifying number, symbol or other particular assigned to the individual;
- the individual's fingerprints, blood type or inheritable characteristics;
- the individual’s exam results;
- information about the individual's physical or mental disability;
- information about the individual's educational, financial, criminal or employment history;
- anyone else's opinions about the individual;
- the individual's personal views or opinions, about themselves or someone else;
- information about an individual scholarship recipient; or
- information about individual student enrollment.
Appropriate masking steps will be taken to ensure that none of the above can be reverse-engineered via other available data sets from Ministry data releases.
- The Ministry must:
- ensure that its employees and contracted individuals are trained to follow proper security procedures;
- set up a process to monitor its employees and contracted individuals’ compliance with the FOIPPA by establishing a Security Access Matrix;
- analyze the types and level of sensitivity of the personal information in its custody and control.
- The Ministry must create a process to verify that information/data released in different reports (or different columns in the same report) will not create a hint or linkage to help to identify a particular individual.
- The Ministry must provide mechanisms for dealing with 100% (complete) or 0% (no) data.
- The identity of a small homogenous group (fewer than 10) will not be divulged by releasing information about their characteristics.
- Similarly, the Ministry will take into consideration GBA+, ESL, Aboriginal and French immersion cases.
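The linkage check called for in the considerations above can be illustrated with one simple case: a masked cell that can be recovered by subtracting the unmasked cells in the same row from a published total. The sketch below is a minimal illustration under stated assumptions; the function name `recoverable_by_subtraction` and the sample figures are hypothetical, and a real verification process would need to cover linkages across reports as well.

```python
MASK_MARKER = "Msk"  # marker this policy uses for masked values


def recoverable_by_subtraction(cells, total):
    """Return (cell_name, derived_value) if one masked cell can be derived.

    `cells` maps category names to a count or to the mask marker.
    When exactly one cell is masked and the total is published, the masked
    value equals the total minus the sum of the unmasked cells.
    Returns None when no single masked cell can be derived this way.
    """
    masked_names = [name for name, v in cells.items() if v == MASK_MARKER]
    if total == MASK_MARKER or len(masked_names) != 1:
        return None
    visible_sum = sum(v for v in cells.values() if v != MASK_MARKER)
    return masked_names[0], total - visible_sum


# Hypothetical row, for illustration only.
row = {"Category A": 40, "Category B": MASK_MARKER, "Category C": 25}
leak = recoverable_by_subtraction(row, total=70)
print(leak)
```

When the check finds a recoverable cell, one possible response is to mask a second cell or the total as well, so that the masked value can no longer be derived.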
Risk assessment
This policy considers a risk-based approach to de-identification. The risk in this context is defined as re-identification risk or the probability of de-identified data being re-identified. The Ministry will consider the impacts that would result if re-identification were to occur, including operational, financial, and legal impacts and the risk of harm to an individual (e.g. if it leaves an individual open to damage, distress or financial loss, a more rigorous form of risk analysis and de-identification would be required). As such, the Ministry will first determine the risk level to ensure the appropriate corresponding de-identification level and methods are selected. Risk level will depend on context (e.g. who is getting the data and under what circumstances). In determining the de-identification level and method, the Ministry must not undertake de-identification methods that result in circumstances that would be non-compliant with the FOIPPA.
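This policy does not prescribe a formula for the probability of re-identification. One commonly used measure, shown here purely as an assumption for illustration, takes the smallest group of records sharing the same combination of indirect identifiers and reports 1 divided by that group's size. The function name `max_reidentification_risk`, the field names, and the sample records are hypothetical.

```python
from collections import Counter


def max_reidentification_risk(records, indirect_identifiers):
    """Return 1/k, where k is the size of the smallest group of records
    that share the same values for the given indirect identifiers.

    This measure is an illustrative assumption, not a method set by policy.
    """
    if not records:
        return 0.0
    groups = Counter(
        tuple(record[field] for field in indirect_identifiers)
        for record in records
    )
    return 1.0 / min(groups.values())


# Hypothetical records, for illustration only.
records = [
    {"grade": 10, "gender": "F", "district": "A"},
    {"grade": 10, "gender": "F", "district": "A"},
    {"grade": 11, "gender": "M", "district": "A"},
]
risk = max_reidentification_risk(records, ["grade", "gender"])
print(risk)
```

In this sample, one record has a unique combination of grade and gender, so the measure reports the highest possible value; removing or generalizing one of those fields enlarges the groups and lowers the result.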
Policy approach
In creating this policy, the following four considerations were deemed important:
- sufficient protection of personal privacy and personal information
- minimal loss of information
- ease of interpretation and
- ease of implementation
The aim is to achieve a balance between the above considerations while addressing the shortcomings with the previous policy and updating currently reported datasets over the normal reporting cycle to align with the policy.
After a review of literature in May/June, 2019 and a survey of existing best practices, shortcomings (i.e., cases where personally identifiable information was present in the data) were identified and masked manually. It was found that:
- implementing these manual fixes would be overly complicated, and extremely resource intensive to automate; and,
- over 90% of school level data was masked in many student outcomes data following the manual fixes.
Hence, a decision to remove school level data where applicable was made.
Below is an overview of the remaining changes, categorized by the four levels mentioned previously.
No change
Some public datasets such as those related to aggregate scholarships and class sizes would see no change (i.e., existing rules regarding applying ‘Msk’ will remain unchanged) as it was found that the existing masking, if any, would suffice for these.
Level 1 Masking
Datasets affected by this include those reporting student characteristics and outcomes. Level 1 masking stipulates:
- at district level, where applicable, no sub-population (a subset of a larger population) level data or breakdown of numbers by facility type
- at district level, school districts with small student counts may be completely masked from reporting. For example, for the current school year, the following districts were found to have insufficient sample sizes to report at many of the pre-existing sub-population levels: 049 (Central Coast), 084 (Vancouver Island West), 087 (Stikine) and 092 (Nisga’a).
- at province level, breakdown by male/female may not be included for some datasets.
- at province level, all non-standard facility types will be presented as “OTHER” for some datasets.
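The last Level 1 rule above, presenting non-standard facility types as “OTHER”, can be sketched as a simple recoding step. This policy does not list the facility types, so the labels `STANDARD`, `ALTERNATE` and `DISTANCE`, the function name `collapse_facility_types`, and the counts below are all hypothetical.

```python
# Hypothetical label for the standard facility type; not taken from the policy.
STANDARD_TYPE = "STANDARD"


def collapse_facility_types(rows):
    """Sum counts per facility type, presenting non-standard types as OTHER.

    `rows` is an iterable of (facility_type, count) pairs.
    """
    totals = {}
    for facility_type, count in rows:
        label = facility_type if facility_type == STANDARD_TYPE else "OTHER"
        totals[label] = totals.get(label, 0) + count
    return totals


# Hypothetical province-level counts, for illustration only.
rows = [("STANDARD", 500), ("ALTERNATE", 12), ("DISTANCE", 7)]
collapsed = collapse_facility_types(rows)
print(collapsed)
```

Combining the smaller categories before reporting produces a larger single cell, which is less likely to fall below the small-cell threshold.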
Level 2 Masking
- This is applicable for student outcomes datasets. Any data with level 2 masking will also have level 1 masking.
- Mask extreme percentage values where leaving unmasked would allow identification of individual results.
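The Level 2 rule on extreme percentage values can be sketched as follows. This policy does not state which values count as extreme, so the cutoffs of 0% and 100% used as defaults here are assumptions, as is the function name `mask_extreme_percentage`.

```python
MASK_MARKER = "Msk"  # marker this policy uses for masked values


def mask_extreme_percentage(pct, low=0.0, high=100.0):
    """Mask a percentage at or beyond the cutoffs; return it unchanged otherwise.

    The default cutoffs are illustrative assumptions, not values set by policy.
    """
    if pct <= low or pct >= high:
        return MASK_MARKER
    return pct


# Hypothetical outcome percentages, for illustration only.
percentages = [0.0, 42.5, 100.0]
masked_percentages = [mask_extreme_percentage(p) for p in percentages]
print(masked_percentages)
```

A value of 100% or 0% reveals the result of every individual in the group, which is why such values are candidates for masking even when the underlying cell count is 10 or more.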
Special cases
These include some of the student characteristics and outcomes datasets. Some columns specific to each of these datasets will be removed to eliminate the possibility of reporting any personally identifiable information (e.g., age, gender).
Summary
To summarize, the implementation of this policy will:
- reduce data at risk through compliance by protecting sensitive information when sharing data outside of the Ministry; and,
- increase productivity by automating the masking process and reducing the burden on internal resources who previously had to maintain manually-developed masking scripts.
Definitions
- Aggregate refers to data that are gathered and presented in summary form, using several measurements. Does not equate to “Anonymized” as aggregate data could still contain re-identification risks if small number counts are not managed.
- Anonymized data refers to data that has been irreversibly de-identified and is highly unlikely to allow any individual to be identified through combining it with other data. Anonymized data does not imply that the data are free of any risk of re-identification, but that the re-identification risk has been assessed to be highly improbable by the Data Custodian.
- Data products refer to products produced by the Ministry of Education in the form of reports, dashboards, extracts, and responses to access requests.
- De-identified data refers to data where direct identifiers have been removed or masked; however, enough information may remain to potentially allow a user to re-identify the data through linkages with other data sets or by accessing the key code by which the data was de-identified.
- Individual Identifiers, according to FOIPPA Policy and Procedures Manual definitions, includes information such as a person’s name, Social Insurance Number (or other number unique to the person such as driver’s license number, employee number or health card number), address, date of birth (usually used in combination with other identifiers such as name to distinguish between people with the same name but different birth dates) or any other discrete element of personal information that would enable a third party to deduce the identity of the person concerned. EDUC categorizes individual identifiers into the following two types:
- Direct identifiers include information that relates specifically to an individual, such as that individual’s name, full address or a unique identification number (e.g. Social Insurance or Personal Education Number).
- Indirect identifiers include information that can be combined with other information to identify specific individuals (e.g. a combination of a student’s gender, birth date and special needs status).
- Homogenous group refers to alike datasets
- Masking can refer to any method used to hide the original values in a data set. The method most commonly used by the Ministry of Education is the suppression of small cell values, but techniques can include technology or procedural barriers to access.
- Mosaic effect occurs when seemingly innocuous, separate items of information, when put together, would allow someone to accurately infer information about an individual.
- Personal Information, under FOIPPA, means recorded information about an identifiable individual. Three requirements must be met to fall within the definition of personal information:
- The information is recorded. Oral conversations, even if personal in nature, are not considered personal information for the purposes of the Act, unless the conversation is recorded in some manner.
- The information is about an individual. Information about corporations does not fall within the definition of personal information. There may be instances, however, where information about a sole proprietor's business is so intertwined with his or her personal information that the distinction between personal and business information cannot be made easily. Detailed examination of these cases will be required in order to determine whether the requested information falls within the definition of personal information.
- The information is about an identifiable individual. In most cases, this means that the name of the individual is contained in the record, but, in other cases, it will be possible to identify an individual through any other information in the record. In any case where the identity of the individual can be ascertained or deduced by the information in the record, that information is personal information for the purposes of the Act. Personally identifiable information is defined in Appendix B.
- Record - In short, a record is any information that is recorded.
- Under FOIPPA, a “record” includes books, documents, maps, drawings, photographs, letters, vouchers, papers and any other thing on which information is recorded or stored by graphic, electronic, mechanical or other means. Records also include email and information stored electronically. However, the definition of a record under FOIPPA does not include a computer program or any other mechanism that produces records.
- School level data is a term often used interchangeably with student level data. School level data are data aggregated up to the level of the school; they are data about the schools themselves at that level of aggregation. Student-level data refers to any information that educators, schools, districts, and state agencies collect on individual students, including personal information (e.g., a student's age, gender, race, place of residence) and enrollment information.