Federal regulations require IRBs to determine the adequacy of provisions to protect the privacy of subjects and to maintain the confidentiality of their data. To meet this requirement, federal regulations require researchers to provide a plan to protect the confidentiality of research data. Today, the majority of data is at some point collected, transmitted, or stored electronically. The purpose of this document is to help the research community develop best practices for managing electronic data. These best practices will need to adapt as technology evolves, so it is important that research teams keep current with the guidance and resources offered by the University. In addition, research is now a global enterprise, and investigators should understand the international laws or regulations that may apply when conducting research outside the United States.
The Principal Investigator (PI) is responsible for ensuring that research data is secure when it is collected, stored, transmitted, or shared. All members of the research team should receive appropriate training about securing and safeguarding research data. For example, the research team should understand they need to document their standard practices for protecting research data so that they can provide these details to the IRB if a mobile device is lost or stolen. Data security must be discussed regularly at research team meetings, and data security details must be included in the study data and safety monitoring plan. The University offers a wide range of Information Technology services for all faculty, staff, and students, and we encourage investigators to consult with their IT staff to develop standard best practices.
It is important to review this document and the related University Information Technology site on a regular basis, and to monitor the IRB for updates, since technology evolves rapidly and more guidance is expected from federal regulators.
- Anonymous data: Data that at no time has a code assigned that would permit the data to be traced back to an individual. This includes any information that was recorded or collected without any of the 18 identifiers as defined by HIPAA. Note that IP addresses are considered by the University and some international standards to be identifiable even though the address is linked to the computer and not specifically to the individual.
- De-Identified: Investigator cannot readily ascertain the identity of the individual
- Coded: Identifying information (such as name) that would enable the investigator to readily ascertain the identity of the individual to whom the private information or specimens pertain has been replaced with a code (number, letter, symbol, or any combination) and a key to decipher the code exists, enabling linkage of the identifying information to the private information or specimens
- PHI: Protected Health Information (as defined by the HIPAA Privacy Rule, 45 CFR 164.501)
- PII: Personally Identifiable Information: “(1) any information that can be used to distinguish or trace an individual’s identity, such as name, social security number, date and place of birth, mother’s maiden name, or biometric records; and (2) any other information that is linked or linkable to an individual, such as medical, educational, financial, and employment information.”1
- Sensitive Research Data: Data is considered sensitive when disclosure of identifying information could have adverse consequences for subjects or damage their financial standing, employability, insurability, or reputation.
1 OMB Memorandum M-07-16, Safeguarding Against and Responding to the Breach of Personally Identifiable Information.
Office for Human Research Protections (OHRP) Definitions
- Private information includes information about behavior that occurs in a context in which an individual can reasonably expect that no observation or recording is taking place, and information that has been provided for specific purposes by an individual and that the individual can reasonably expect will not be made public (for example, a medical record). Private information must be individually identifiable (i.e., the identity of the subject is or may readily be ascertained by the investigator or associated with the information) in order for obtaining the information to constitute research involving human subjects. 45 CFR 46.102(f).
- Individually Identifiable: OHRP generally considers private information or specimens to be individually identifiable as defined at 45 CFR 46.102(f) when they can be linked to specific individuals by the investigator(s) either directly or indirectly through coding systems.
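The definition of coded data above can be sketched in a few lines of Python. This is a minimal illustration, not a University-endorsed tool: the function name `code_records`, the field name `name`, and the `P-` code prefix are all hypothetical. It replaces the identifying field with a random code and returns the key as a separate object so that it can be stored apart from the coded data.

```python
import secrets


def code_records(records, id_field="name"):
    """Replace an identifying field with a random study code.

    Returns (coded_records, key). The key maps each code back to the
    original identifier and should be stored separately from the data.
    All names here are hypothetical and chosen for illustration only.
    """
    key = {}
    coded = []
    for record in records:
        code = "P-" + secrets.token_hex(4)
        while code in key:  # regenerate on the unlikely event of a collision
            code = "P-" + secrets.token_hex(4)
        key[code] = record[id_field]
        # Copy every field except the identifier, then attach the code.
        coded_record = {k: v for k, v in record.items() if k != id_field}
        coded_record["study_code"] = code
        coded.append(coded_record)
    return coded, key


if __name__ == "__main__":
    sample = [{"name": "Alice Example", "score": 7}]
    coded, key = code_records(sample)
    print(coded)  # the coded record no longer contains the name
```

Because the key exists, data handled this way remains coded rather than anonymous, and under the OHRP definition above it is still individually identifiable to anyone who holds the key.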
Researchers have a responsibility to be good data stewards. In the past, the majority of data was collected and stored on paper. At a minimum, data was protected by being locked in a file cabinet in a locked room that only members of the research team could access. Today, data is collected, transmitted, and stored on computers and mobile devices. Simply password-protecting a computer may not be sufficient to meet the rigorous security standards mandated by the University and/or sponsors. Researchers need to collaborate with their school, department or center IT staff who have the expertise to evaluate the security methods most appropriate for the sensitivity of the research data.
Data that will be shared with others requires additional oversight to uphold the privacy of the research participant and the confidentiality of their data. If data from the study is to be shared outside the research team, it is important that the researchers obtain the appropriate consent from study participants.
In the past, many consent documents had language that limited sharing of the data more so than was necessary or intended. It is important to think about future data use and to tailor the consent language and permissions to meet your future data sharing needs.
Some researchers may request permission to share identifiable data, but the majority will be sharing de-identified data. Many sponsors, including federal agencies, require data sharing as a condition of funding, and this must be reflected in the consent document and, most importantly, in the consent process (discussion). This includes the acknowledgement of the data sharing practices and the possible risk of re-identification when applicable. One should never guarantee that de-identified data cannot be relinked and the participant’s identity disclosed. As technology evolves, so does the potential risk of re-identification.
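One way to see why re-identification can never be ruled out is to count how many records share the same combination of seemingly harmless fields. The sketch below is an illustration only: the field names (`zip3`, `birth_year`, `sex`) and the sample records are invented, and a real assessment would need expert review.

```python
from collections import Counter


def smallest_group_size(records, fields):
    """Return the size of the smallest group of records that share the
    same values for the given fields. A result of 1 means at least one
    record is unique on those fields and may be easier to relink."""
    if not records:
        return 0
    groups = Counter(tuple(record[f] for f in fields) for record in records)
    return min(groups.values())


if __name__ == "__main__":
    # Invented sample data with no direct identifiers.
    sample = [
        {"zip3": "152", "birth_year": 1980, "sex": "F"},
        {"zip3": "152", "birth_year": 1980, "sex": "F"},
        {"zip3": "152", "birth_year": 1975, "sex": "M"},
    ]
    print(smallest_group_size(sample, ["zip3", "birth_year", "sex"]))
```

In the sample, the third record is the only one with its combination of values, so even without a name it stands out from the rest of the dataset.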
National Institutes of Health (NIH) Grants
The NIH Grants Policy Statement sets out specific requirements for ensuring data security when collecting identifiable research data in section 2.3.12, Protecting Sensitive Data and Information Used in Research:
“Recipients of NIH funds are reminded of their vital responsibility to protect sensitive and confidential data as part of proper stewardship of federally funded research, and take all reasonable and appropriate actions to prevent the inadvertent disclosure, release or loss of sensitive personal information. NIH advises that personally identifiable, sensitive, and confidential information about NIH-supported research or research participants not be housed on portable electronic devices. If portable electronic devices must be used, they should be encrypted to safeguard data and information. These devices include laptops, CDs, disc drives, flash drives, etc. Researchers and institutions also should limit access to personally identifiable information through proper access controls such as password protection and other means. Research data should be transmitted only when the security of the recipient’s systems is known and is satisfactory to the transmitter. See also Public Policy Requirements and Objectives—Federal Information Security Management Act.”
The NIH also instituted the Genomic Data Sharing (GDS) Policy to promote sharing, for research purposes, of large-scale human and non-human genomic data generated from NIH-funded research. The policy requires investigators to incorporate a genomic data sharing plan in the ‘resource sharing’ section of their application. This policy applies to proposals and applications submitted after January 25, 2015. More information is available at http://grants.nih.gov/grants/guide/notice-files/NOT-OD-14-124.html.
Assessing the data security method needed
Based on the type of data involved in the study, the IRB is required to 1) assess potential risks to participants, and 2) evaluate the researchers’ plan to minimize risks. All research activities result in some type of risk and the researcher has the responsibility to mitigate the risk of improper disclosure.
What is the risk?
- Is the data identifiable, de-identified (coded), or anonymous?
- Is sensitive information being collected that could result in harm to participants?
- What is the risk of harm to the participant or others?
What are the protections against anticipated threats or hazards (during collection, transmission, storage)?
- Encryption of data on device to protect against loss/theft of device
- Use of secure data transmission channels to protect against data interception
- Strong passwords to protect against unauthorized access
- Store data behind a secure Pitt or UPMC firewall whenever possible
- Ensure strong data security controls on all storage sites
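As an illustration of the "strong passwords" item in the list above, the following sketch generates a random password with Python's standard `secrets` module. The minimum length of 12 and the requirement for four character classes are assumptions made for this example, not University requirements; consult your IT staff for the applicable standard.

```python
import secrets
import string

MIN_LENGTH = 12  # assumed floor for illustration, not a University rule


def generate_password(length=20):
    """Generate a random password containing lowercase, uppercase,
    digit, and punctuation characters."""
    if length < MIN_LENGTH:
        raise ValueError(f"length must be at least {MIN_LENGTH}")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        # Retry until every character class is represented.
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)
                and any(c in string.punctuation for c in candidate)):
            return candidate


if __name__ == "__main__":
    print(generate_password())
```

The `secrets` module is used instead of `random` because it draws from the operating system's cryptographically secure random source.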
Many researchers are purchasing mobile apps or building their own apps to interact with study participants. Seek expert IT review and, if the app is commercially available, purchase it through the Pitt Purchasing Office so that a legal and data security review is performed. Even if the participant is asked to download a free app or is given money for the download, the researcher is still responsible for disclosing potential risks. It is possible that the app the participant downloads will capture other data stored on or linked to the phone on which it is installed (e.g., contact list, GPS information, access to other applications such as Facebook). The researcher has the responsibility to understand known or potential risks and convey them to the study participant. Commercially available apps publish “terms of service” that detail how app data will be used by the vendor and/or shared with third parties. It is the researcher’s responsibility to understand these terms, relay that information to participants, and monitor the terms for updates. Additionally, it is important that the researcher collect from the app only the minimum data necessary to answer the research questions. Refer to the IRB Mobile App guidance on the IRB website for more information.
- Developing a health app for a mobile device? Go to the Mobile Health Apps Interactive Tool to learn which federal laws may apply.
The process of transmitting data is often overlooked as a risk. The plan to protect confidentiality should describe the methods used to protect the data during collection and sharing, both within and outside the University. It is advisable to use a secure transmission process even if the data is anonymous, coded, or non-sensitive. If the research team adopts a secure data transmission process as standard practice, a data breach is less likely to occur. Email notifications are generally not secure, except in very limited circumstances, and should not be used to share or transmit research data. Text messages are stored by the telecommunications provider and therefore are not secure. Data should be encrypted when “in-transit,” and the University provides extensive guidance, software, and resources to assist researchers in this. Terms such as Secure Sockets Layer (SSL), HTTPS, or Secure File Transfer Protocol (SFTP) indicate that the data is being encrypted during transmission. Tools such as SecureZip, which can be used to encrypt files before transmission, are made available to all University faculty and staff at no cost.
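A simple guard against accidentally sending data over an unencrypted channel is to check the destination's URL scheme before any transfer. The sketch below is a minimal example under stated assumptions: the function name and the example addresses are hypothetical, and the set of accepted schemes is limited to the two protocols named above.

```python
from urllib.parse import urlparse

# Schemes treated as encrypted for this illustration (HTTPS and SFTP only).
ENCRYPTED_SCHEMES = {"https", "sftp"}


def uses_encrypted_transport(url):
    """Return True if the URL's scheme indicates an encrypted channel."""
    return urlparse(url).scheme.lower() in ENCRYPTED_SCHEMES


if __name__ == "__main__":
    # Hypothetical destinations, for illustration only.
    for target in ("https://example.org/upload", "ftp://example.org/upload"):
        print(target, uses_encrypted_transport(target))
```

A check like this only confirms that the protocol encrypts data in transit. It says nothing about how the recipient stores the data, which still needs to be reviewed separately.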
The first fact to remember is that the research data belongs to the University of Pittsburgh and not the researcher. It has become common practice to store some level of personal information in the Cloud with services such as Box, Google Drive, Dropbox, Salesforce.com, Evernote, Office365, and Amazon. Using such services can often result in cost savings; however, special attention must be paid to potential security risks, export control restrictions, and data ownership issues.
Currently, the University’s sanctioned cloud-based storage is Pitt Box. Only data that meets HIPAA de-identification standards should be stored on Pitt Box. For identifiable information, the best practice is to store the data on a server maintained by Computing Services and Systems Development (CSSD) or a server that has been sanctioned by UPMC’s Information Security Group. Using departmental servers to store research data is not recommended. These servers must be approved by the Pitt Information Security Officer and require extensive and costly IT support to meet all the antivirus, anti-malware, service update, and incident response standards. If you plan to use Pitt Box to store data, review the Best Practices for Storing Data in the Cloud section.
If you are considering the storage of any data outside of Pitt or UPMC networks, working through Pitt’s Purchasing Services will help you address the following questions that will be required by the University:
- Does the agreement with the vendor stipulate the University owns the data?
  - Contact the Office of Research
- Does the agreement with the vendor incorporate the University’s Personal Data Protection Addendum?
- How will the vendor make the data available in the event of a disaster?
- What security controls are in place to prevent the inadvertent or malicious disclosure of the data?
- What happens if a subpoena is issued?
- Does the vendor have Information Security/Cyber Liability insurance?
Collecting or storing research data using the internet adds complexity because one must consider which jurisdiction has authority: is it the jurisdiction of the researcher, the location of the study participants, or the location where the data is stored? Data may be collected in one jurisdiction but then stored in another. Researchers need to be aware that data security and privacy policies may differ between jurisdictions. It is important that researchers consider the applicable laws, including international laws and export control regulations, and, if needed, have agreements in place to ensure compliance.
Encryption, Encryption, Encryption
Encryption protects data by encoding information so that only authorized parties may read it. Encryption can occur “at-rest” where the data is being stored and “in-transit” as the data is being moved from one location to another. As previously stated, there are many tools and methods available to encrypt all types of data, and the University has extensive resources available at http://technology.pitt.edu/. If you need additional assistance, use the Support link or call the Technology Help Desk at 412-624-HELP.
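For the "in-transit" case, the following sketch shows how a script might set up an encrypted connection using Python's standard `ssl` module. It is an illustration under stated assumptions, not University guidance: the function name is hypothetical, and the choice of TLS 1.2 as the minimum version is an assumption for this example. Encryption "at-rest" is not covered here and is best handled with the tools the University provides.

```python
import ssl


def strict_tls_context():
    """Build a TLS context that verifies the server's certificate and
    hostname and refuses protocol versions older than TLS 1.2."""
    # create_default_context() enables certificate and hostname checks.
    context = ssl.create_default_context()
    # Assumed minimum for this sketch; adjust to your IT staff's standard.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


if __name__ == "__main__":
    ctx = strict_tls_context()
    print(ctx.verify_mode, ctx.check_hostname, ctx.minimum_version)
```

A context like this can be passed to standard library calls that accept one, for example `urllib.request.urlopen(url, context=ctx)`, so that the connection fails rather than silently falling back to an unverified or outdated channel.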
Children’s Online Privacy Protection Act (COPPA)
The University has a site license for the Qualtrics survey system, which is available at no cost to all Pitt faculty, staff, and students. This cloud-based research tool has been vetted and authorized by Pitt’s Office of General Counsel and Pitt’s Information Security Officer. The software is available to support teaching, academic research, and institutional business. Access to Qualtrics is available from my.pitt.edu under the My Resources tab. Under the license agreement, individuals with university-sponsored accounts do not have direct access to Qualtrics; however, Pitt members may assign access to any external collaborator.
Research Electronic Data Capture (REDCap) and WPIC WebDataXpress survey systems have also been approved by the Pitt Information Security Officer for use.
- Go to https://www.ctsiredcap.pitt.edu/redcap/ to access REDCap
- Contact Jack Doman, Director, Office of Academic Computing, at 412-246-6333 or firstname.lastname@example.org regarding WebDataXpress
If any other survey software is used, it must first undergo a data security review, and if commercially available, must be purchased through the University Purchasing Office.
Many investigators wish to collect the IP addresses of survey participants to provide a method of determining whether the user has previously completed the survey. As stated earlier, the University and some international standards consider IP addresses to be identifiable information. This is important to consider when conducting surveys, especially if the consent process indicates that a participant’s responses will be anonymous. When using Qualtrics, check the option to anonymize the data collection process and do not collect the IP address. If IP addresses are necessary to the research, include in the consent process that you will be recording this information.
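The preferred approach is not to collect IP addresses at all. Where an export already contains them, the columns can be removed before the file is shared or analyzed. The sketch below uses Python's standard `csv` module; the column names (`IPAddress`, `LocationLatitude`, `LocationLongitude`) and the sample row are assumptions for illustration and should be checked against the actual export.

```python
import csv
import io

# Assumed names of identifying columns; verify against your own export.
IDENTIFYING_COLUMNS = {"IPAddress", "LocationLatitude", "LocationLongitude"}


def strip_identifying_columns(csv_text, drop=IDENTIFYING_COLUMNS):
    """Return the CSV text with the listed columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in (reader.fieldnames or []) if name not in drop]
    out = io.StringIO()
    # extrasaction="ignore" silently skips the dropped columns in each row.
    writer = csv.DictWriter(out, fieldnames=kept, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    # Invented sample export using a documentation-only IP address.
    raw = "ResponseId,IPAddress,Q1\nR_1,203.0.113.7,yes\n"
    print(strip_identifying_columns(raw))
```

Removing the column after the fact does not change what was collected, so the consent process still needs to state that IP addresses were recorded if they ever were.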
The Data and Safety Monitoring Plan should indicate that research team meetings include discussion of, at a minimum, the following:
- Software on computers to protect against malware
- Data security to ensure all software updates and patches are being applied
- Data collection, transmission, and storage methods employed
- Data collected is only that data necessary to answer the research question
- Codes are not stored with the corresponding de-identified data
- Encryption methods are being used on all portable devices (laptops, mobile devices, and removable storage)
- What are the expectations of the research participant?
  - Investigator will protect their privacy and the confidentiality of their data
- What information is needed to answer the research question?
  - Investigator should collect information that is required to answer research questions and not more just because it is possible.
Think about personal privacy and do not promise confidentiality
Examples of suggested language:
Although every reasonable effort has been taken, confidentiality during Internet communication cannot be guaranteed and it is possible that additional information beyond that collected for research purposes may be captured and used by others not associated with this study.
Text messages are not encrypted or secure during their transmission and it is possible they could be intercepted and used by others not associated with this study.
Section 2 - Research Design and Methods
Question 2.6 – Research Activities: It is important to provide a detailed description of any activities the research participant will perform that entail the use of any electronic data (e.g., accessing a website, downloading an app, sending text messages, completing a survey) so the IRB can determine that risks are minimized.
Section 5 – Potential Risks and Benefits of Study Participation
Question 5.13 - Data and Safety Monitoring Plan: This section should include a data security plan to describe how the research data will be protected during collection, storage, transmission and destruction to ensure it meets the University of Pittsburgh policies.
Question 5.15 - Precautions will be used to maintain the confidentiality of identifiable information: Describe the data security controls that will be implemented to protect the data during collection, storage, transmission and destruction to ensure it meets the University of Pittsburgh policies.
University of Pittsburgh
- Information Technology Security Resources (Help desk: 412-624-HELP)
- Safe Computing for Faculty and Staff
- Information Technology Consulting Services
- Office of Research
- Purchasing Office
U.S. Food and Drug Administration
Mobile Medical Applications
U.S. Department of Health & Human Services
Human Subjects Research and the Internet
Federal Trade Commission
Understanding Mobile Apps