2023-02-06

The Integrated Data Infrastructure (IDI) is a huge and growing database, accumulated from dozens of government agencies over many years.

Rules regulating the use of a sprawling Government database containing personal information about nearly every New Zealand resident have been breached more than 100 times, new data shows.

Many of the breaches were minor, and Stats NZ – the agency responsible for the Integrated Data Infrastructure (IDI) – said no individual’s privacy had been breached.

The tally underscores concerns from civil libertarians that the enormous – and growing – data trove is a privacy risk.

The IDI contains information on nearly all New Zealand residents. It links administrative data collected by individual Government agencies through the course of their work – and through surveys such as the Census – into a central repository.

Since its creation in 2013, it has been fundamental to delivering some Government policies and is regularly used by an expanding roster of approved outside researchers. Advocates say it helps better target social services, but critics argue the data is not collected evenly and gives the state more oversight of some groups than others.

Since April 2018, the agency has granted IDI access to 1400 researchers. Around 350 projects are currently using the database.

Researchers using the IDI have to meet strict conditions. The data they see is de-identified and only relevant to their specific project. They must also sign confidentiality agreements not to disclose any personal information they do see.

The database can only be accessed from a secure Data Lab, and Stats NZ checks all information before publication for privacy risks.

But as the database has grown – and permission to access its content widened – so too has the number of policy breaches.

Data released under the Official Information Act lists 103 policy breaches since 2015.

The number of breaches appears to be accelerating: there were 24 between 2015 and 2018, according to data previously obtained by the New Zealand Council for Civil Liberties (NZCCL). Between 2018 and November 2022, there were a further 79 breaches, data released to Stuff shows.

The most common breaches involved researchers failing to round data. Random rounding protects privacy by adjusting each result to a nearby multiple of three (for example, if a data set contained 7 people, the count might be randomly rounded to either 6 or 9).
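As a rough illustration of the rounding rule described above, here is a minimal Python sketch. The function name `random_round_base3` is hypothetical, and the weighting (the nearer multiple of three chosen two-thirds of the time, the farther one a third of the time) is an assumption based on the commonly described form of random rounding to base 3; the article itself says only that results are randomly rounded to a close multiple of three.

```python
import random


def random_round_base3(count, rng=random):
    """Randomly round a non-negative integer count to an adjacent multiple of 3.

    Hypothetical helper for illustration only, not Stats NZ's own code.
    """
    remainder = count % 3
    if remainder == 0:
        # Counts that are already a multiple of 3 are left unchanged.
        return count

    lower = count - remainder  # nearest multiple of 3 below the count
    upper = lower + 3          # nearest multiple of 3 above the count

    # Assumed weighting: the nearer multiple is picked with probability 2/3,
    # the farther one with probability 1/3.
    if remainder == 1:
        nearer, farther = lower, upper
    else:
        nearer, farther = upper, lower
    return nearer if rng.random() < 2 / 3 else farther
```

Under these assumptions, `random_round_base3(7)` returns either 6 or 9, matching the example above, while a count of 6 is returned unchanged. Because a reader cannot tell which true count produced a published figure, very small groups are harder to single out.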

Another common breach was researchers sharing images they had taken from the IDI, usually to request help from Stats NZ staff members or other researchers (taking photos is prohibited). Another repeated issue was researchers being granted access to the wrong projects.

The Integrated Data Infrastructure (IDI) is a colossal trove of information. It holds personal data about virtually every New Zealander, and millions more who have lived, and died, here. (Credit: Ella Bates-Hermans)

Some incidents carried potentially greater risk.

In one incident, health data from people who did not consent to their information being linked were briefly visible in the IDI. In another, an approved researcher gave unauthorised access to the IDI to two other staff in their agency (the agency was not identified).

One researcher wrote their password on a piece of paper and lost it, one incident report said. Another researcher posted their IDI code on Twitter.

In one incident, the door to a Data Lab at an external agency was left open. In another, a person who was given access to a Data Lab to respond to a medical incident made a phone call from the facility.

In a response provided with the list of breaches, Stats NZ said none of the incidents affected an individual’s privacy.

“While there have been policy breaches, these are minor and have not put the security of the data at significant risk, or allowed personal data to leave the Data Lab,” said Kate Satterthwaite, general manager of executive and government relations.

“As a responsible data custodian, Stats NZ captures all these incidents in our security incident database and takes steps to ensure they do not recur.”

A serious failure by someone accessing the IDI could have consequences, she said, including loss of access to the IDI and reputational damage within the research community. Serious breaches could be prosecuted under the Data and Statistics Act.

The disclosure comes as the agency faces pressure to competently deliver this year’s Census after the 2018 iteration was widely seen as a debacle.

Some groups – particularly Māori – were significantly undercounted, requiring the agency to make up the shortfall with administrative data from the IDI.

The growing tally of breaches shows the risk in maintaining such a vast database, said NZCCL chairman Thomas Beagle.

“No system can ever be fully secure, so the best way to protect data is not to collect it or retain it unless you need it,” he said.

“Stats NZ's approach of centralising microdata about every person makes for a very high-risk system – and that risk grows every day that new data is added.”

He said the list of breaches “contrasts starkly” with Stats NZ’s advertising campaigns ahead of the upcoming Census, which promote the agency’s ability to keep information private and secure.

“It is particularly concerning given that people's information is there [in the IDI] without their genuine consent,” Beagle said.

“You can't opt-out of your census return being kept forever in their databases. The same goes for most other sources of data Stats NZ collects about us.”