6 Research
6.3 Protecting the identity of individuals
Using information that does not identify individual patients is the surest way to protect
confidentiality. Whenever possible, anonymised data should be used for all purposes other
than direct care, including research. However, the Review Panel has found that, while from
a legal perspective patient data exists in one of two forms – with patients either
identified[53] or anonymous – in reality the situation is more complex. In particular, there is a
‘grey area’ of data that, on its own, does not identify individuals, but could potentially do
so if it were linked to other information.
Clearly, it is inappropriate to publish information that could lead to individuals being
identified. Therefore, the processing and disclosure of information with a high risk of
re-identification requires robust protection and governance. As with other areas of
information governance, the Review Panel found that there is great variation and
inconsistency in the language used to describe different kinds of information, for example
using terms like ‘personal’ or ‘identifiable’ interchangeably to describe the same thing, or
using the same term to mean slightly different things.
The Review Panel proposes a simple framework that describes three different forms of
data and clarifies the conditions under which data can be processed and disclosed (see
Figure 1 below). This framework draws on the Information Commissioner’s Office
Anonymisation Code of Practice, published on 20th November 2012. In summary, the three
states of data in the framework are:
‘Data for publication’
This is data that has been anonymised in line with the ICO Anonymisation Code of Practice
to the point where identifying individuals from it would require unreasonable effort and is
therefore unlikely. The data does not require a legal or contractual basis for processing and
can be publicly disclosed. This data is called de-identified data for publication.
‘Personal confidential data’
This is data in which individuals are clearly identified, or are easily identifiable. This data
should not be processed without a clear legal basis[54].
Personal confidential data should only be disclosed with consent or under statute[55], and any
disclosure must always be limited and accompanied by a contractual agreement that
mitigates the risk of misuse and inappropriate disclosure. The contractual agreement needs
to set out, as a minimum, the legal basis for the data flow, the purposes to which the data
can be put, the safeguards that should be in place to protect data and how the public are
informed about these.
The linkage of personal confidential data from more than one organisation for any purpose
other than direct care should only take place in specialist, well-governed, independently
scrutinised, accredited environments called ‘accredited safe havens’ (see section 6.5).
53 i.e. ‘Personal data’ as defined in the Data Protection Act 1998.
54 The legal basis for personal confidential data must conform with the Data Protection Act 1998 and common law duty of confidentiality.
55 While the public interest can also provide a legal basis for disclosure it should not be relied upon for routine data flows (see also section 8.6
and recommendation 12).
‘Data for limited disclosure’
This data is called de-identified data for limited disclosure or access. It has been through
a process of anonymisation, such as removing formal personal identifiers, using coded
references or pseudonyms in their place, or aggregating data so that individuals cannot be
identified from it directly. However, it would be relatively straightforward for a third party
to re-identify individuals or de-anonymise the data, especially if it were combined with
other data. This represents the ‘grey area’ of data.
This data should only be disclosed in accordance with the ICO Anonymisation Code of
Practice. The disclosure and processing of this data must always be subject to safeguards
for limited access, which have two components: a contractual agreement and a set of data
stewardship functions.
The contractual agreement mitigates the risk of re-identification and sets out, as a
minimum, the justification for the data flow, the purposes to which the data can be put,
the penalties and liabilities, and how the public are informed.
The data stewardship functions should include, but not be limited to, the technical
and organisational arrangements for security, human resources policies such as
contractual obligations, any training requirements, and data retention policies. A key
challenge is establishing conformance to the data stewardship functions while
enhancing a vital research and innovation community.
The Review Panel concluded that where de-identified data for limited access is
processed, assurance of the data stewardship component could be achieved in a
number of ways, including but not necessarily limited to:
• establishing the receiving organisation as an accredited safe haven (see section 6.5);
• using the facilities of an accredited safe haven; and
• volunteering for an audit from the ICO.
Furthermore, any data breach could result in a formal ICO investigation.
As with personal confidential data, the linking of de-identified data for limited
disclosure or access from more than one organisation for any purpose other than direct
care must only be done in ‘accredited safe havens’ (see section 6.5).
If it is not possible to meet both the contractual and data assurance criteria, then
de-identified data for limited disclosure or access must be treated in the same way as
personal confidential data and disclosed only with consent or under statute.
Figure 1: Simplified framework of data processing from a legal perspective

De-identified data for publication (class of data according to ICO code: Anonymised)
Description*: Personal confidential data which has been anonymised with a low residual risk of re-identification. This means third parties can only re-identify the persons with unreasonable effort.
Legal basis required for processing?: Not applicable.
Need to inform public?: Desirable.
Conditions for onward disclosure: No conditions for disclosure. Data may be published.

De-identified data for limited disclosure or limited access
Description*: Personal confidential data that has been anonymised but with a residual high risk of re-identification. This means that the data does not identify persons on its own, but there is a significant risk that third parties could re-identify the persons with reasonable effort. A defining characteristic is a data set containing a single identifier such as NHS number or postcode**.
Legal basis required for processing?: Legal basis requires safeguards that maintain anonymity. This means:
• a contract that prevents re-identification; and
• assured data stewardship arrangements***.
Linkage of this data from more than one organisation for any purpose other than direct care must only be done in the Health and Social Care Information Centre OR an accredited safe haven.
Need to inform public?: Recommended.
Conditions for onward disclosure: Either as de-identified data for publication OR to an environment covered by the same contractual arrangements as the disclosing party and confirmed data stewardship arrangements.

Personal confidential data (class of data according to ICO code: Identifiable)
Description*: Personal confidential data that has not been through anonymisation and that may or may not have been redacted. Examples include:
• any data set with greater than one direct identifier**; OR
• pseudonymised data with access to key for reversibility; OR
• pseudonymised data and holding one or more of the source data sets in identified form.
Legal basis required for processing?: Legal basis for processing is required that meets the common law duty of confidentiality, Human Rights Act 1998 and Data Protection Act 1998. This means:
• consent OR
• statute OR
• exceptionally, on public interest grounds.
Linkage of this data from more than one organisation for any purpose other than direct care must only be done in an accredited safe haven.
Need to inform public?: Required unless exempt.
Conditions for onward disclosure: With consent for direct care OR under statute OR anonymised AND with appropriate contract or agreement***.

* Please refer to ICO Anonymisation: Code of Practice with any associated health and social care system specific information governance statements or standards for detailed documentation support.
** Appendix 5 contains a list of direct identifiers. Named data should be regarded as personal confidential data.
*** Appendix 6 provides further detail on contracts.
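To make the conditions in Figure 1 easier to see in one place, the following is a minimal, purely illustrative sketch of how the three data states and their associated conditions might be represented in software, for example within an organisation’s own information governance tooling. The class names, field names and wording are assumptions made for this sketch; they are not defined by the Review Panel or by the ICO code.

```python
# Illustrative sketch only: representing the three data states in Figure 1 and
# their disclosure conditions as a simple lookup. Names and wording are
# assumptions for illustration, not a schema defined by the Review or the ICO.
from dataclasses import dataclass
from enum import Enum


class DataState(Enum):
    DE_IDENTIFIED_FOR_PUBLICATION = "de-identified data for publication"
    DE_IDENTIFIED_LIMITED_DISCLOSURE = "de-identified data for limited disclosure or access"
    PERSONAL_CONFIDENTIAL = "personal confidential data"


@dataclass(frozen=True)
class DisclosureConditions:
    legal_basis_required: bool        # does processing need a legal/contractual basis?
    inform_public: str                # "desirable", "recommended" or "required unless exempt"
    linkage_outside_direct_care: str  # where linkage may take place


CONDITIONS = {
    DataState.DE_IDENTIFIED_FOR_PUBLICATION: DisclosureConditions(
        legal_basis_required=False,
        inform_public="desirable",
        linkage_outside_direct_care="no restriction; data may be published",
    ),
    DataState.DE_IDENTIFIED_LIMITED_DISCLOSURE: DisclosureConditions(
        legal_basis_required=True,   # contract preventing re-identification plus assured stewardship
        inform_public="recommended",
        linkage_outside_direct_care="Health and Social Care Information Centre or an accredited safe haven",
    ),
    DataState.PERSONAL_CONFIDENTIAL: DisclosureConditions(
        legal_basis_required=True,   # consent, statute or (exceptionally) public interest
        inform_public="required unless exempt",
        linkage_outside_direct_care="an accredited safe haven",
    ),
}

# Example: look up the conditions attached to the 'grey area' state.
print(CONDITIONS[DataState.DE_IDENTIFIED_LIMITED_DISCLOSURE])
```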
6.4 Pseudonymising data at source
Pseudonymisation at source is a process that replaces identifiers in a data set with a coded
reference or pseudonym, so that information about an individual can be distinguished without
their ‘real-life’ identity being revealed. If the process of pseudonymisation is ‘enterprise
wide’, meaning it is standard across the whole health and social care system, a data set can
then be safely linked with another data set while the identity of the individual remains
protected. The Review Panel heard evidence that the health and social care system should
adopt a single mechanism to pseudonymise data at the source where it is collected, and
should seriously consider enterprise-wide pseudonymisation at source, which would in theory
allow improvements in linkage, protection of data and the use of information for activities
such as service improvement.
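By way of illustration only (this is not a mechanism specified by the Review Panel), the sketch below shows one common approach to pseudonymisation at source: a keyed hash (HMAC) applied to an identifier such as the NHS number, using a secret key held by the data controller. If the same key and method were used consistently across organisations, pseudonyms generated at different sources would match, which is what would allow linkage without the identifiers themselves being shared. The function names, field names and dummy values are assumptions made for this sketch.

```python
# Illustrative sketch only: keyed-hash pseudonymisation at source.
# The algorithm choice (HMAC-SHA-256), names and dummy values are assumptions
# for illustration and do not represent a mandated health and care standard.
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from an identifier such as an NHS number.

    The same identifier and key always produce the same pseudonym, so records
    created at different sources can later be matched without the identifier
    itself being disclosed. Without the key, reversal is computationally
    infeasible.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymise_record(record: dict, secret_key: bytes) -> dict:
    """Replace direct identifiers in a record with a pseudonym at source."""
    out = dict(record)
    nhs_number = out.pop("nhs_number")   # remove the direct identifier
    out.pop("name", None)                # drop any other direct identifiers held
    out["pseudonym"] = pseudonymise(nhs_number, secret_key)
    return out


# With an 'enterprise wide' key, two different sources produce matching pseudonyms.
key = b"secret-key-held-by-the-data-controller"   # dummy value
gp_record = {"nhs_number": "9990000001", "name": "Example Patient", "hba1c": 41}
lab_record = {"nhs_number": "9990000001", "name": "Example Patient", "egfr": 72}
assert (pseudonymise_record(gp_record, key)["pseudonym"]
        == pseudonymise_record(lab_record, key)["pseudonym"])
```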
The Review Panel heard evidence that the banking and card payment industry have a duty
of care to protect the identity and sensitive data of clients. Following significant
investment, a Banking and Payment Card Industry Data Security Standard[56] was adopted in
2010. The health and social care system has a similar duty of care and could consider
adopting a similar single standard.
However, there is a lack of clarity as to the costs, risks and benefits involved in adopting
such a system for the whole of health and social care. The Review Panel concluded that
there should be an evaluation of benefits, costs, risks and management issues of adopting
such a system (or systems).
6.5 Accredited safe havens
One area is particularly challenging from a privacy perspective: linking data sets.
Effective linkage must ensure that data for the same individual is brought together
from two or more data sets. This usually requires personal data.
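As a purely illustrative sketch of what ‘bringing together data for the same individual’ involves, the example below joins two data sets on a common key (an identifier or, within a controlled environment, a pseudonym) and strips that key from the linked output before release. The field names and values are assumptions for illustration; real linkage within an accredited safe haven would be subject to the governance described in this section.

```python
# Illustrative sketch only: linking records for the same individual from two
# data sets on a common key, then removing the key before the linked output
# leaves the controlled environment. Names and values are illustrative.

def link_datasets(dataset_a: list, dataset_b: list, key: str) -> list:
    """Join two lists of record dicts on `key` and drop the key from the output."""
    index_b = {record[key]: record for record in dataset_b if key in record}
    linked = []
    for record_a in dataset_a:
        match = index_b.get(record_a.get(key))
        if match is None:
            continue                      # no counterpart for this individual
        combined = {**record_a, **match}
        combined.pop(key, None)           # strip the linkage identifier before release
        linked.append(combined)
    return linked


admissions = [{"nhs_number": "9990000001", "admission_reason": "asthma exacerbation"}]
prescribing = [{"nhs_number": "9990000001", "prescribed": "salbutamol"}]
print(link_datasets(admissions, prescribing, key="nhs_number"))
# -> [{'admission_reason': 'asthma exacerbation', 'prescribed': 'salbutamol'}]
```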
The Data Sharing Review[57] (Thomas and Walport) stated that ‘safe havens’ “should be
developed as an environment for population based research and statistical analysis”.
The Review Panel recommends that data sets containing personal confidential data, or
data that can potentially identify individuals (de-identified data for limited disclosure
or limited access), are only disclosed for linkage in secure environments, known as
‘accredited safe havens’. The purposes for such linkage should be expanded to cover
audit, surveillance and service improvement.
Within the accredited safe haven, de-identified data for limited disclosure or access
must not be linked to personal confidential data unless there is a clear legal basis to do
so, and contracts must forbid this. Such linkage would re-identify the de-identified data
for limited access and would constitute a data breach.
56 https://www.pcisecuritystandards.org/security_standards/index.php
57 Data Sharing Review, Thomas and Walport, July 2008.
http://www.connectingforhealth.nhs.uk/systemsandservices/infogov/links/datasharingreview.pdf
The Health and Social Care Act 2012 provides primary legislation for the creation of an
accredited safe haven, called the Health and Social Care Information Centre (Information
Centre, see section 1.8).
The amount of data linkage required by the new health and social care system may be
beyond the resources of the Information Centre as currently envisaged. Additionally, much
of this linkage may be required at a local level, which is at odds with the Information
Centre’s national focus. This gives rise to the question of whether further accredited safe
havens will be required to support the health and social care system.
The Review Panel has found there are plans for at least 20 accredited safe havens. These
include safe havens within Royal Colleges, National Clinical Audit contract holders,
approximately 10 Data Management Integration Centres (discussed in more detail in the
Commissioning chapter), Public Health England[58] and the Clinical Practice Research
Datalink service of the MHRA. These accredited safe havens will need a clear legal basis to
link data[59]. Being an accredited safe haven does not necessarily mean that the organisation
is receiving personal confidential data, but does mean it can receive de-identified data for
limited disclosure or limited access.
Recommendation 10
The linkage of personal confidential data, which requires a legal basis, or data
that has been de-identified, but still carries a high risk that it could be re-
identified with reasonable effort, from more than one organisation for any
purpose other than direct care should only be done in specialist, well-governed,
independently scrutinised and accredited environments called ‘accredited
safe havens’.
The Health and Social Care Information Centre must detail the attributes of an
accredited safe haven in their code for processing confidential information, to
which all public bodies must have regard.
The Informatics Services Commissioning Group[60] should advise the Secretary of
State on granting accredited status, based on the data stewardship requirements
in the Information Centre code, and subject to the publication of an independent
external audit.
58 From 1st April 2013, a number of organisations will exist within Public Health England that will link data using section 251 as the legal basis.
These include cancer registries, registries for other diseases such as congenital anomalies, and health protection (see also section 8.6 and
recommendation 12).
59 The Health and Social Care Act 2012 provides the legal basis for the Health and Social Care Information Centre.
60 The Informatics Services Commissioning Group is responsible for providing advice on commissioning informatics services across the health
and social care system. Information governance is one of the key informatics services.
6.6 Governance and data stewardship of accredited safe havens
Data stewardship refers to the principles and recommended practices for the handling
of data.
The Review Panel concludes that there is a need for a consistent national minimum
standard of data stewardship, with the leadership (Boards or equivalent body) of
organisations with accredited safe havens held accountable for any failings. This should
be supported by a system of external, independent and published audit, and an
accreditation process for all organisations that act as accredited safe havens.
Professional standards and good practice
Data stewardship requirements for accredited safe havens
The Review Panel concludes that accredited safe havens should be required to meet
the following requirements for data stewardship:
• Attributing explicit responsibility for authorising and overseeing the anonymisation
process, e.g. through a Senior Information Risk Officer.
• Appropriate techniques for de-identification of data, the use of ‘privacy enhancing
technologies’ and re-identification risk management.
• The use of ‘fair processing notices’.
• A published register of data flowing into or out of the safe haven, including a
register of all data sets held (see the illustrative sketch below).
• Robust governance arrangements that include, but are not limited to, policies on
ethics, technical competence, publication, limited disclosure/access, regular
review process and a business continuity plan including disaster recovery.
• Clear conditions for hosting researchers and other investigators who wish to use
the safe haven.
• Clear operational control including human resources procedures for information
governance, use of role-based access controls, confidentiality clauses in job
descriptions, effective education and training, and contracts.
• Achieving a standard for information security commensurate with ISO27001[61] and
the Information Governance Toolkit (see section 12.9).
• Clear policies for the proportionate use of data including competency at
undertaking privacy impact assessments and risk and benefit analysis.
• Standards that are auditable.
• A standard template for data sharing agreements and other contracts that
conforms to legal and statutory processes.
• Appropriate knowledge management including awareness of any changes in the
law and a joined up approach with others working in the same domain.
• Explicit standard timescales for keeping data sets including those that have been
linked, which should be able to support both cohort studies and simple ‘one-off’
requests for linkage.
61 ISO27001 is the international best practice standard for an Information Security Management System. See: http://www.27000.org/index.htm
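As a minimal, purely illustrative sketch of the published register of data flows referred to in the list above, the structure below records what flowed, in which direction, on what legal basis and for what purpose. The field names and the example entry are assumptions made for this sketch, not a schema mandated by the Review Panel or the Information Centre.

```python
# Illustrative sketch only: a minimal structure for an accredited safe haven's
# published register of data flows and data sets held. Field names and the
# example entry are assumptions for illustration, not a mandated schema.
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DataFlowEntry:
    dataset_name: str      # which data set flowed in or out
    direction: str         # "inbound" or "outbound"
    counterparty: str      # disclosing or receiving organisation
    legal_basis: str       # e.g. contract reference, statute or consent
    purpose: str           # the purpose the data may be put to
    data_state: str        # e.g. "de-identified data for limited disclosure"
    flow_date: date


@dataclass
class SafeHavenRegister:
    datasets_held: list = field(default_factory=list)
    flows: list = field(default_factory=list)

    def record_flow(self, entry: DataFlowEntry) -> None:
        """Record a flow; inbound data sets are also added to the holdings list."""
        self.flows.append(entry)
        if entry.direction == "inbound" and entry.dataset_name not in self.datasets_held:
            self.datasets_held.append(entry.dataset_name)


register = SafeHavenRegister()
register.record_flow(DataFlowEntry(
    dataset_name="Example hospital activity extract",
    direction="inbound",
    counterparty="Example NHS Trust",
    legal_basis="data sharing contract (illustrative reference)",
    purpose="national clinical audit",
    data_state="de-identified data for limited disclosure",
    flow_date=date(2013, 4, 1),
))
```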
6.7 Exceptional disclosure in the public interest (section 251 of the
NHS Act 2006)
Sometimes researchers require specific information about individuals that cannot be
anonymised or pseudonymised in a safe haven, and gaining explicit consent may be highly
impractical. Legislation is in place that allows personal confidential data to be processed
for medical purposes such as research.
Regulations under section 251 of the NHS Act[62], often referred to simply as ‘section 251’,
allow the common law duty of confidence to be set aside under specific circumstances.
Applicants must demonstrate that the aim of the processing is in the public interest, that
anonymised information could not be used to achieve the required results, and that it
would be impractical, both in terms of feasibility and appropriateness, to seek specific
consent from each individual affected. For research the approval of a Research Ethics
Committee is also needed. The key test is one of necessity, not convenience.
The powers under the section 251 regulations only provide relief from the common law
duty of confidence. Any activity taking place with the support of section 251 must still
comply in full with the Data Protection Act.
Example: difficulties obtaining explicit consent
The Academy of Medical Sciences report, ‘Personal data for public good: using health
information in medical research’[63], identified a number of circumstances where it
may not be practicable to seek consent for the use of identifiable patient records in
research:
• the risk of introducing bias that will endanger the validity of the results: certain
segments of the study population may be particularly difficult to get in touch with
for consent, but excluding these people could bias the sample population, causing
the study to produce misleading results which may not be applicable to
underrepresented groups;
• seeking consent may compromise effective population coverage;
• the size of the study population and the proportion likely to be untraceable, which
might make contact impracticable;
• the overall financial and time burdens imposed; and
• the risk of inflicting harm or distress by contacting people. For example, the
Medical Research Council (MRC) notes that contacting people about a study
examining correlations between parents’ mental health and unexplained child
deaths might cause serious distress[64].
62 Originally enacted under section 60 of the Health and Social Care Act 2001.
63 ‘Personal data for public good: using health information in medical research’, 2006, http://www.acmedsci.ac.uk/p48prid5.html#description
64 See MRC Personal Information in Medical Research, p19.
The Health Research Authority and the Confidentiality Advisory Group
The Health Research Authority (HRA) was established in 2011 with the purpose of
protecting and promoting the interests of patients and the public in health research.
From April 2013, the HRA will take over the advisory functions on use of data from
the Ethics and Confidentiality Committee, including applications under section 251.
As part of this, the HRA has convened the Confidentiality Advisory Group to review
applications to access patient information without consent and provide expert,
independent advice on whether the applications should be approved. In the case of
research applications, the Confidentiality Advisory Group will provide advice to the
HRA; for non-research applications, the advice will be provided to the Secretary of
State for Health.
As the Confidentiality Advisory Group formally replaces the Ethics and Confidentiality
Committee of the National Information Governance Board on 1st April 2013, it is too
early for this review to have a view on how successfully it manages the balance of
risks and benefits from sharing personal confidential data.
6.8 Consent for consent
In some cases, researchers may need to access personal records to identify people with
particular characteristics to invite them to take part in clinical trials and other
interventional studies. The researcher must first establish a clear legal basis before they
can access the data. This process is often referred to as ‘consent for consent’ and can
present a barrier for researchers, although section 251 will provide a way forward in
some instances.
Professional standards and good practice
The searching of patient records for potential research subjects can be done legally
by fulfilling any of the following criteria:
• The researcher gains the explicit consent of every patient with a record in the
population pool being assessed.
• The search is conducted by a health or social care professional who has a
‘legitimate relationship’ with the patient, such as a clinician or social worker
(see section 3.6).
• The search is conducted by a researcher who is part of the clinical team[65].
• The search makes use of ‘privacy enhancing technologies’ (see below).
• Support under section 251 regulations is granted for the research.
65 GMC’s guidance on confidentiality (2009),
http://www.gmc-uk.org/mobile/confidentiality_40_50_research_and_secondary_issues (points 48 and 50)
Case study
Taken from: ‘The regulation and governance of health research’, Academy of Medical
Sciences (2011)[66]
Recruitment to swine flu study
In autumn 2009 the Clinical Research Network fast-tracked studies into pandemic flu
in response to the high national priority given to rapid research into the disease.
In one NIHR-funded study conducted across several sites there was a need to send
out questionnaires to patients who had been identified through anonymous data sets
as eligible for inclusion in the study, to ask them whether they would like to consent
to be involved. The involvement of the research team was required to print out
address labels to send out the questionnaires. At one site the local Research Ethics
Committee and university governance teams would not approve the research team
having access to patient’s names and addresses before they had consented to take
part in the study, and therefore a member of the clinical care team was required to
take on this role. Although a member of the clinical care team agreed to undertake
this activity, they were unable to complete it due to other (understandable)
priorities. Consequently, for that site, instead of 200 questionnaires only 30 were
sent out.
‘Privacy enhancing technologies’ in this case means analytical computer software that can
trawl clinical databases, select only those patients who are eligible for a specific study,
and reveal the identities of potential participants only to someone with a legitimate
relationship to the patient, such as their clinician or social worker. Where someone in the
health and social care team is to undertake the search, the researcher (and funder) should
provide adequate resource to facilitate this if necessary.
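The sketch below illustrates, in outline only, the kind of gating such software might perform: eligibility is assessed over de-identified records keyed by pseudonym, and identities are resolved only for a requester recorded as having a legitimate relationship with the patient. The data, names and access rule are assumptions made for this sketch and do not describe any particular product or service.

```python
# Illustrative sketch only: gating of identity resolution in a recruitment
# 'privacy enhancing technology'. All data, names and rules are assumptions
# for illustration and do not describe any real system.
from typing import Callable

# De-identified clinical records keyed by pseudonym (no names or addresses).
clinical_records = {
    "p001": {"age": 54, "diagnosis": "type 2 diabetes"},
    "p002": {"age": 31, "diagnosis": "asthma"},
}

# Identifying details held separately, alongside the responsible clinician.
identity_store = {
    "p001": {"name": "Patient A", "responsible_clinician": "dr_jones"},
    "p002": {"name": "Patient B", "responsible_clinician": "dr_patel"},
}


def find_eligible(criteria: Callable[[dict], bool]) -> list:
    """Return pseudonyms of eligible patients; the researcher sees no identities."""
    return [pid for pid, record in clinical_records.items() if criteria(record)]


def reveal_identity(pseudonym: str, requesting_user: str) -> dict:
    """Reveal identifying details only to the patient's own clinician."""
    details = identity_store[pseudonym]
    if details["responsible_clinician"] != requesting_user:
        raise PermissionError("No legitimate relationship with this patient")
    return details


# A researcher defines criteria and receives only pseudonyms...
eligible = find_eligible(lambda r: r["diagnosis"] == "type 2 diabetes")
# ...and the treating clinician, not the researcher, resolves who to contact.
contact = reveal_identity(eligible[0], requesting_user="dr_jones")
print(contact["name"])
```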
In most cases, once selected as a potential research subject, patients should be contacted
by an established member of their care team inviting them to take part in a study and
notifying them that a researcher may be in touch.
The Review Panel concludes that, wherever possible, privacy enhancing technologies
should be used to minimise the need for access to identifiable information.
The approaches taken by the Clinical Practice Research Datalink Service and the South
London and Maudsley Trust provide examples of how appropriate individuals can be
selected and approached to take part, without giving researchers direct access to
identifiable information before consent is obtained.
66 http://www.acmedsci.ac.uk/p47prid88.html
Example: South London and Maudsley Trust safe haven
The South London and Maudsley Trust has a ‘safe haven’ environment that gives
researchers access to de-identified data to select relevant individuals. Once the
researcher has made their selections, the administrator of the patient electronic
health record system then checks whether or not those individuals have provided
consent to be approached about relevant research. The list of those who have given
consent is then released to the researcher at one of the King’s Health Partners
organisations, who approaches the individuals with details of the relevant study and
obtains their consent to participate.