Research Data Management (Health Sciences)
Considerations for Sharing Data
Sharing data is one way that researchers contribute knowledge and build up the scientific record. When data is shared, research can be replicated and new research can be conducted with the data. Additionally, if the data is collected in a standardized way, it can be combined with other data sets to help inform larger-scale research. It is useful to consider if and how you will share your data before beginning your project, since it can affect the methods you use to collect your data. Here are some things to consider when sharing your data.
- What data can/will be shared?
- Will de-identified individual participant data be made available?
- What other documents need to be shared to make this data usable?
- When will the data be available?
- Will access to the data be restricted in any way?
- How will the data be made available?
Additionally, many funding agencies and respected journals require you to share your data in order to receive funding or have your paper published. Learn more about these requirements under the "Data Management/Sharing Plan" page of this guide.
Video: Considerations for Data Sharing
By Sara Samuel
- Sharing Data - Informational Bulletin: A one-page informational bulletin with an introduction to sharing data. Print a copy to post in your office or laboratory, or share with your research colleagues.
- Project Close-Out Checklist for Research Data: The close-out checklist describes a range of activities for helping ensure that research data are properly managed at the end of a project or at researcher departure. By Kristen Briney, Caltech Library
- Horton, N. J., & Stoudt, S. (2024). Editorial: Guidelines and Best Practices to Share Deidentified Data and Code. Journal of Statistics and Data Science Education, 32(3), 227–231. https://doi.org/10.1080/26939169.2024.2364737
De-identifying Data
De-identifying data is an important task that should be undertaken before sharing data. This is not easy! It can be difficult to completely de-identify data, and it takes time to do so. Also, be sure you are following your IRB-approved plan and any relevant informed consent protocols.
Some data sets cannot be completely de-identified. If you want to share data that cannot be completely de-identified, or if you have questions around de-identifying your data, please contact the Data Office for Clinical & Translational Research (DOCTR) for assistance.
Below are resources for learning more about de-identifying data sets.
- U-M Data Office for Clinical and Translational Research (DOCTR): Have questions about de-identifying your data? Visit DOCTR's website and click on "Contact DOCTR" on the right side of the screen.
- Methods for De-identification of Protected Health Information: Information from the U.S. Department of Health & Human Services
- The De-identification Standard: Information from the U.S. Department of Health & Human Services about the two methods used to achieve de-identification in accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule: Expert Determination and Safe Harbor.
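To give a feel for what a first pass at de-identification can look like in practice, here is a minimal Python sketch. It is an illustration only, not a complete or approved procedure: the column names (name, mrn, visit_date, age, and so on) are hypothetical, and the function covers just three ideas associated with the Safe Harbor method (removing direct identifier fields, reducing dates to the year, and grouping ages over 89 into a single "90+" category). Real data sets often contain identifiers in free-text fields or in combinations of variables that a script like this will not catch, so it does not replace review by DOCTR or your IRB-approved plan.

```python
import csv
import io

# Hypothetical column names, chosen for illustration only.
# Your data set will have its own names and may hold identifiers elsewhere.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "mrn", "street_address"}


def deidentify_rows(rows, id_columns=DIRECT_IDENTIFIERS,
                    date_column="visit_date", age_column="age"):
    """Return copies of the rows with some Safe Harbor-style changes applied.

    Assumes dates are ISO 8601 strings (YYYY-MM-DD) and ages are whole numbers.
    """
    cleaned = []
    for row in rows:
        # Drop columns that hold direct identifiers.
        out = {k: v for k, v in row.items() if k.lower() not in id_columns}

        # Keep only the year of each date.
        if out.get(date_column):
            out[date_column] = out[date_column][:4]

        # Group ages over 89 into a single category.
        if out.get(age_column) and int(out[age_column]) >= 90:
            out[age_column] = "90+"

        cleaned.append(out)
    return cleaned


# Small made-up example (no real participant data).
sample = (
    "name,mrn,visit_date,age,diagnosis\n"
    "Jane Doe,12345,2023-05-17,93,asthma\n"
    "John Roe,67890,2022-11-02,41,diabetes\n"
)
rows = list(csv.DictReader(io.StringIO(sample)))
for record in deidentify_rows(rows):
    print(record)
```

A script like this is best treated as a starting point for a conversation with DOCTR, since deciding which fields count as identifiers in your particular data set is the hard part.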
Locally Maintaining Data for Sharing
If you are able to share your data (as indicated in your IRB-approved research protocol and informed consent) but don't want it to be freely available through a repository, you can choose to maintain your data locally and share it upon request. This requires that you maintain your data in a way that allows you to access and share it when a request is made - see Organize Data for tips. To protect all parties involved, including researchers and research participants, we strongly recommend that you require the following information before sharing data:
- An IRB-approved research plan for the proposed research
- A signed Data Use Agreement
The Data Office for Clinical and Translational Research (DOCTR) can help with preparing your data to be shared. DOCTR will review your data to ensure that it does not contain any HIPAA identifiers and they can connect you with the office that can help create a Data Use Agreement. Please contact them if you are choosing to share your data upon request.
Selecting a Data Repository
There are several considerations to take into account when selecting a repository for sharing your data. In addition to the logistics and costs of depositing your data, you should also look for a repository that has similar data sets to help increase findability. Questions to ask when looking at repositories:
- Does this repository contain similar data sets?
- Does the repository have procedures in place for preservation and backup?
- If required for your data, does the repository have options for access restrictions (e.g. permissions management, restrictions on use)? Who will manage these restrictions?
- Does the repository allow for attaching a usage license to the data set?
- What procedures does the repository have in place for forward migration of storage technologies, to avoid obsolescence?
- How much will it cost to deposit the data and how will these costs be covered?
Video: Introduction to Data Repositories
By Sara Samuel
- Selecting a Repository for Data Resulting from NIH-Supported Research: This supplemental information is intended to help researchers choose data repositories suitable for the preservation and sharing of data. Although this guidance is aimed at NIH-funded researchers, it provides a useful list of repository characteristics to look for when selecting a repository for your data.
- Generalist Repository Selection Flowchart: The repository selection flow chart is a product of the Generalist Repository Ecosystem Initiative (GREI) and is designed to guide users through a series of considerations for selecting the right repository for sharing data.
Data Repositories
Below is a non-exhaustive list of repositories or repository-finding resources that may be suitable for sharing your data. We do not endorse any specific product - please contact the repositories and ask questions to determine which one will meet your needs for sharing your data.
- Deep Blue Data (U-M): Deep Blue Data is a repository offered by the University of Michigan Library that provides access and preservation services for digital research data that were developed or used in the support of research activities at U-M. Text to include Deep Blue Data in a data management plan is available.
- Dryad: An international, curated repository of data underlying scientific and medical publications.
- GitHub: A platform for hosting and sharing software, widely used for open source projects. Learn more about using GitHub here: About GitHub for educators and researchers
- Repositories for Sharing Scientific Data (NIH): A listing of NIH-supported repositories.
- ICPSR: The Inter-university Consortium for Political and Social Research (ICPSR) maintains the world's largest archive of computerized, numeric social science data. Topics include demography, economics, health care, politics, social behavior, public opinion.
- Qualitative Data Repository (QDR): Use the QDR archive to store, share, and discover a wide range of digital data and accompanying documentation generated or collected through qualitative and mixed-method research in the social sciences.
- Re3Data.org: Directory of data repositories.
- Zenodo: Began in the European Union, supported by CERN (European Organization for Nuclear Research). "All research outputs from all fields of science are welcome."