UK Biobank Data Debate
Lords Chamber
Lord Markham (Con)
My Lords, I thank the Minister for the Statement. This is clearly a serious incident that goes to the heart of public trust in one of our most important research assets. I pay tribute to the hundreds of thousands of volunteers whose data underpins the success of the UK Biobank and the breakthroughs it has enabled.
It is right that swift action has been taken to remove the listings and suspend access. It is also right to involve the Information Commissioner’s Office. However, the central issue before us is not just what has happened but what it reveals about our capacity to defend ourselves against cyber attacks.
First, on enforcement and accountability, we are told that the institutions involved have been banned. That is, of course, welcome, but is it sufficient? Were contractual terms breached in relation to data of this sensitivity? There must be clarity about deterrence and whether further sanctions, legal or financial, are available and will be pursued. Without that, I fear that we risk sending the wrong signal.
This incident also seems to highlight deeper weaknesses in our wider infrastructure. We continue to have a system that relies heavily on trust and contractual compliance, but without robust technical safeguards to prevent misuse; it is not enough simply to tell users not to download data—we must design systems so that they cannot do so inappropriately. This is a design issue as much as a behavioural one. From my time as Health Minister, I am aware that NHS databanks do not allow the downloading of data on to third-party servers. The data remains on our servers in a sectioned-off area to allow the customers to analyse and manipulate the data but not download it, so these types of breaches cannot take place.
There is a strong case for a clear step-by-step plan from UK Biobank, setting out exactly how data access will be reformed, including the technical controls that will be put in place, binding commitments to ensure that this cannot happen again, and the stopping of the ability to download the data directly. In addition, there is a strong case for reviewing the data storage and retention policies of all our health bodies.
During the cyber attack on the London blood testing organisation in 2024, I was amazed that the names of the people being tested were given to the companies, along with the samples, for them to perform the test results. They did not need to have those names at all; all they needed to have was a unique reference number, so that data did not need ever to be out there in the first place. What surprised me even further was to find out that this same company had data for individuals going back five, 10 or 15 years, and did not seem to have any deletion policies in place to make sure that the data was not even there to be hacked in the first place.
As the Minister responsible at the time, I proposed a review of the data storage and retention policies of all the NHS bodies and their associated contractual parties, but this was just before the election, so I am not aware whether or not that review took place in the end. I would be grateful, therefore, if the Minister could update us on whether this did in fact happen.
I turn to the point raised in Committee on the cyber security and resilience Bill currently going through the other place. The Conservatives tabled an amendment which would have required the Secretary of State to maintain a register of hostile actors targeting critical sectors, including health. Regrettably, that amendment was not accepted. In light of this incident, I ask the Minister whether the Government will now revisit that decision. If not, will he at least consider how we strengthen our understanding and monitoring of potential threats in this space?
While we must not lose sight of the immense value of UK Biobank, maintaining public confidence will be essential. That confidence depends on not only the integrity of the data but the strength of the safeguards around it. As the cyber security and resilience Bill comes to our House, we must make sure that we learn the lessons from this deeply regrettable breach. Indeed, a good test we must apply to the Bill is: if it had already been enacted, would the breach have happened in this case? This is a moment not just for a response, but for reform. I look forward to the Minister’s reply.
Lord Clement-Jones (LD)
My Lords, I thank the Minister for coming forward in relation to this Statement and join in acknowledging unreservedly the profound scientific value of UK Biobank and the extraordinary generosity of the half a million volunteers whose participation has driven life-saving discoveries in heart disease, cancer, dementia, Parkinson’s, and Covid immunity. I emphasise that nothing I say today diminishes that contribution or our commitment to seeing UK Biobank continue to thrive at the heart of the UK’s sovereign health data strategy. But we owe those volunteers honesty, and the honest description of what has happened here, as my honourable friend Victoria Collins said in the Commons last week, is that it was
“a profound betrayal of the people who trusted this institution with some of the most intimate details of their lives”,—[Official Report, Commons, 23/4/26; col. 472.]
including their sleep patterns, mental health, genetic data and medical history.
We welcome the swift removal of the three listings, the co-operation of the Chinese authorities, the self-referral to the ICO, the board-led review, and the development of what UK Biobank describes as the world’s first automated checking system. These are the right steps, but they are steps taken after the fact, and this House is entitled to ask how we arrived here. UK Biobank has apologised for the concern caused—that is not sufficient. We join our Commons Liberal Democrat colleagues in calling for a full and unequivocal apology to participants, not for causing concern but for the breach of trust itself.
We also cannot accept the framing that this was simply a matter of a few bad apples breaking their agreements. The platform allowed data to be downloaded. As the Minister himself confirmed in the Commons,
“this was not … a cyber-attack. This was a legitimate download … by a legitimately accredited organisation”.—[Official Report, Commons, 23/4/26; col. 473.]
That is precisely the problem: contractual promises are not an adequate safeguard for data of this sensitivity. There must be hard, technical barriers, and we are glad that a solution is now being implemented. The question is why it was not in place from the outset.
I have a series of questions for the Minister. First, on the scale of exposure, an associate professor from the Oxford Internet Institute has stated publicly:
“This is the 198th known exposure of UK Biobank data since last summer”,
and that UK Biobank data remains available online for anyone to download today. Will the Minister confirm how many data breaches at or by UK Biobank have been notified to the Government since the original ministerial Statement, and does the Minister have any reason to believe it will not become public that Biobank data has already been used to reidentify specific participants?
Secondly, on leadership and accountability, given the series of decisions, or failures of decision, that have brought us to this point, including the dismissal of earlier warnings, does the Minister have full confidence in the current leadership of UK Biobank? The board-led review is welcome, but its credibility will depend on its independence and transparency.
Thirdly, on reidentification risk, UK Biobank itself acknowledges that it cannot guarantee absolute confidentiality. Modern AI and social media make reidentification far more feasible than was the case when this data was first collected. Crucially, do the Government have contingency plans for large-scale reidentification of Biobank participants, given that, as the Oxford Internet Institute confirms, the data has leaked on nearly 200 occasions, as I mentioned earlier, and remains accessible online?
Fourthly, on the broader lesson for data and AI policy, this incident demonstrates something important: there is no panacea in simply handing patient data to AI systems and trusting that good intentions will follow. So much NHS and Biobank data has already been used in ways that violate the rules under which it was shared. As the Minister in the Commons acknowledged, this was a legitimate download—the rules failed to prevent it. If tearing up data governance rules produced easy wins, we would have seen the evidence by now. Instead, we have received repeated failures, and the Government must reflect on that when designing the new guidance on research data controls that they have promised.
Fifthly and finally, on system-wide lessons, can the Minister confirm that other UKRI and MRC cohort studies will be required to learn from this incident and that their governance will be reviewed? Will the Secretary of State require UK Biobank to publish a full step-by-step plan for reforming its data privacy—not guidance, not reassurances, but binding commitments? The volunteers who built UK Biobank did so in a spirit of trust and public service, and they deserve nothing less than ironclad protections, genuine accountability and the knowledge that their generosity will never again be treated as a governance afterthought.
The Minister of State, Department for Energy Security and Net Zero and Department for Science, Innovation and Technology (Lord Vallance of Balham) (Lab)
My Lords, I am grateful to the noble Lords, Lord Markham and Lord Clement-Jones, for those responses and questions. The Government agree that this is unacceptable; it is an abuse of UK Biobank’s data and something that we take extremely seriously, and it needs robust, technical solutions.
I start, though, by agreeing with both noble Lords in thanking the UK Biobank participants. The UK Biobank dataset is and remains critical in supporting scientific discoveries. It is quite an extraordinary resource that improves health in many of the ways described, such as predicting dementia and early warning signs for cancer, or finding genetic markers for stroke. This resource is probably the most powerful in the world to do that and none of it would have been possible without the generosity, support and trust of participants. That is why, importantly, this needs a complete and robust response.
I know that UK Biobank has apologised to its participants for what has happened. However, let me put on record the Government’s thanks to all the participants and give my assurance that we will get to the bottom of this and have a robust answer. I extend my thanks to the researchers who are working on these discoveries. We want to make sure that we can get it to be usable again so that this work can continue, but we must protect the data and the participants, as the noble Lord, Lord Clement-Jones, said.
I hope the House will recognise that the Government have acted quickly and seriously. We were made aware of this issue on Monday 20 April and took immediate action. First, within hours of it being raised, we had worked with the embassy in Beijing, the Chinese Government and Alibaba to have the relevant listings removed. They put in place measures to prevent listings being put up again in the same way and to automatically identify and remove relevant adverts. Secondly, we asked UK Biobank to immediately revoke access for the research institutions identified as the source of the information. Thirdly, we asked UK Biobank to stop access to its platform until a solution can be found.
On the point raised by the noble Lord, Lord Markham, that solution has to be a technical one. There are ways to do this, such as secure data platforms that stop people being able to download data. One thing worth reflecting on is that UK Biobank started in 2003. Its data became available in 2012; it was at the forefront of protecting data when it started and had robust mechanisms as to who could access it. What has happened is that, as the dataset has become very large, it has not kept up with the changing requirements for this, which is what needs to be put in place now.
The fourth action we took was to ask that participants should be informed immediately.
On what is going to happen and whether the approach from UK Biobank itself is robust, Members of this House will be familiar with the noble Lord, Lord Kakkar. He is the chair of UK Biobank and has assembled a team, including cyber experts, to undertake an urgent, in-depth review of what happened and why. That team will provide its findings to the board, and to us, on or before 10 May.
Further, as has been asked, the Government will issue new guidance on the control of data from research studies. This was in train anyway and will, I hope, be out within the next few weeks. It will apply to all the resources in the UK which are used in this way. Most of them—we think probably all of them, apart from UK Biobank—use a secure data platform, which has the controls.
The UK Biobank resource is important and people volunteered to be part of it because of the benefits it brings for others and for future generations. We need to work with UK Biobank to ensure that researchers with a legitimate need to use the datasets can resume their research, but we must put the participants first and foremost. I agree with the noble Lord, Lord Markham, that this has to be stopped; there has to be a system that can stop this, not just a process.
In the meantime, UK Biobank continues to monitor whether new listings have emerged, because data was downloaded in the past. Up until 2024, it was possible to download data and that was the system used. There was a trust system, backed up by legal contract, but this has, as we know, been shared. New listings will emerge—there have been additional listings posted since the Government were made aware of the issue last week—and we continue to work with the Chinese Government to remove them quickly. While it is now not possible for new downloads of UK Biobank data, there remains a risk that new listings will emerge from data downloads that happened in the past. We will keep the participants and this House updated.
In answer to the question of the noble Lord, Lord Clement-Jones, about the number of breaches, a high number have occurred: most of them are not very significant, but some are significant and all of them are unacceptable. That is what needs to happen. I want to be clear, though, that the need to get these datasets used by researchers around the world pulls in the opposite direction to the need to keep them 100% safe. Therefore, there has to be a system, which is where the secure data environments are so important.
In answer to the questions from both noble Lords about identification, the UK Biobank advises that information such as names, addresses, exact date of birth and NHS number are removed from all the data before it is made available. We do not think any of that was available and we are not aware that any participant has been identified. We also do not believe that there were any purchases of the three listings before we managed to get them taken down. However, we welcome the in-depth board-level review being undertaken, which needs to be comprehensive and cover technical, cultural and process issues. In answer to the noble Lord, Lord Clement-Jones, it is increasingly possible to triangulate in large datasets and get close to identification, and that remains a very real risk.
Turning more broadly to other points, the institutions the data originated from have had their access revoked. While UK Biobank has worked to secure its platforms, all access and downloads have been paused globally. I note, though, that the Chinese Government have been very supportive in getting these listings removed.
The Government are reviewing the way in which we share biodata. A commitment of the biological security strategy 2023, which I think the noble Lord, Lord Markham, referred to, was to reduce the risk of sensitive data being exploited for harmful purposes while maintaining legitimate research collaboration. This will include seeking to harmonise the security policies of the major holders of all genomic data in the UK. We expect to conclude this work over the next few weeks.
The point made about the cyber security and resilience Bill is important, as raised by the noble Lord, Lord Markham. The Bill grants the Secretary of State new powers to issue national security directions to regulated entities or regulators where the compromise or the threat of a compromise to their network and information system poses a national security risk. The use of these powers will always be underpinned by robust intelligence from GCHQ, including, where relevant, information about state actors involved in cyber threats. Minister Narayan explained in the other place that a register of foreign actors is therefore unnecessary in this particular context. We are committed to transparency. The Government are already able to communicate with Parliament and the public about such cyber security risks where it is appropriate to do so.
I end by saying again how important UK Biobank is, how unique it is worldwide in its breadth and depth of coverage, and how appalling it is that this leak occurred. We must make absolutely sure that this risk is eliminated going forward by making sure that a secure data environment is put in place. In 2024, a requirement was made for UK Biobank to put in place an airlock and the requirements that we are now talking about. On 26 January, we asked UK Biobank to put an airlock on the research access platform that it has been using since 2024. Pre-2024, it was all downloads and, post-2024, it is a research access platform, but, unfortunately, that was still downloadable. That is the bit that needs to be stopped now.