Research Data Management (Health Sciences)

Data Documentation

Investing time in creating data documentation will help to ensure that your research data can be found, understood, and used by others.

Ideally, you should begin to document your research data at the outset of your project, and continue to create and update the documentation throughout the course of your project. This will decrease the risk that your documentation will be incomplete, or that you will forget important details about your data.

Research data documentation falls into two categories: project documentation and dataset documentation.

Project documentation includes:

  • Where and how the data was collected
  • How the data files are structured and organized
  • How the data was validated
  • How the data was transformed
  • Who can access the data, and for what purpose
  • How the data can be used, and under what conditions

Dataset documentation includes:

  • Variable names and descriptions
  • Explanation of codes and classification schemes used
  • Algorithms used to transform data
  • File formats and software used
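
As one possible illustration of dataset documentation, the sketch below drafts a simple data dictionary (variable name, description, example value) from the header row of a CSV file. It assumes the data is stored as CSV with a header row. The function name `build_data_dictionary`, the sample columns, and the code meanings are hypothetical and are not taken from any particular dataset or standard.

```python
import csv
import io


def build_data_dictionary(csv_text, descriptions):
    """Return one entry per variable: name, description, and an example value.

    csv_text     -- contents of a CSV file that has a header row (assumption)
    descriptions -- dict mapping variable names to human-written descriptions
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    dictionary = []
    for name in reader.fieldnames or []:
        # Use the first non-empty value as an example; leave blank if none exists.
        example = next((row[name] for row in rows if row[name]), "")
        dictionary.append({
            "variable": name,
            # Flag undocumented variables so they are not silently skipped.
            "description": descriptions.get(name, "TODO: describe this variable"),
            "example": example,
        })
    return dictionary


# Hypothetical sample data and code scheme, for illustration only.
sample = "patient_id,sex,sbp_mmhg\n001,1,120\n002,2,\n"
notes = {
    "sex": "Sex (1 = female, 2 = male)",
    "sbp_mmhg": "Systolic blood pressure in mmHg",
}
for entry in build_data_dictionary(sample, notes):
    print(entry["variable"], "|", entry["description"], "|", entry["example"])
```

Variables without a written description are marked with a TODO so that gaps in the documentation stay visible as the dataset changes during the project.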

The Readme File

An essential piece of research data documentation, the readme file provides basic information about a data file or dataset. It helps ensure that the data can be correctly interpreted, both by you at a later date and by others when the data is shared or published. For readme file best practices and recommended content, see the Guide to writing "readme" style metadata from the Cornell University Research Data Management Service Group.
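
A readme file can be drafted by hand, but a small script can help keep its layout consistent across datasets. The sketch below is a minimal example that renders a plain-text readme skeleton. The section labels are assumptions loosely based on the project documentation items listed above, not the Cornell template, and the function name `render_readme` and the sample values are hypothetical.

```python
from datetime import date


def render_readme(title, contact, files, collected="", conditions="", today=None):
    """Return plain-text readme content for a dataset.

    files -- dict mapping file names to short descriptions
    The section labels below are illustrative assumptions, not a standard.
    """
    today = today or date.today()
    lines = [
        f"Title: {title}",
        f"Contact: {contact}",
        f"Readme last updated: {today.isoformat()}",
        "",
        "Data collection (where and how):",
        f"  {collected or 'TODO'}",
        "",
        "Access and conditions of use:",
        f"  {conditions or 'TODO'}",
        "",
        "Files:",
    ]
    for name, description in files.items():
        lines.append(f"  {name}: {description}")
    return "\n".join(lines) + "\n"


# Hypothetical values, for illustration only.
text = render_readme(
    title="Example clinic visit study",
    contact="researcher@example.org",
    files={"visits.csv": "One row per clinic visit"},
)
print(text)
```

Sections left empty are rendered as TODO, which makes it easier to start the readme at the outset of a project and fill it in as the work progresses.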