Data Management Plans for the Social Sciences

Suggested resources for designing a data management plan (DMP) for your research project.

Funding Agency Guidance

National Science Foundation

"The DMP should describe the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project. It should then describe the expected types of data to be retained."

http://www.nsf.gov/sbe/SBE_DataMgmtPlanPolicy.pdf  

Examples

Please note that these DMP excerpts are copyrighted by their respective authors.

Preferred:
“This research project will generate data resulting from sensor recordings (i.e. earth pressures, accelerations, wall deformation and displacement and soil settlement) during the centrifuge experiments. In addition to the raw, uncorrected sensor data, converted and corrected data (in engineering units), as well as several other forms of derived data will be produced. Metadata that describes the experiments with their materials, loads, experimental environment and parameters will be produced. The experiments will also be recorded with still cameras and video cameras. Photos and videos will be part of the data collection.”

“A total storage demand of 50 GB is anticipated at the University of Michigan, and 50 GB at Auburn University.”

“Based on the previous viscoelastic turbulent channel flow simulations, the amount of resulting binary data is estimated around 40 TB per year. Some text format data files are also required for post-processing in the laboratory and are anticipated to be around 1 TB per year.”

These three examples all illustrate parts of a good answer to this question. The first lists the various types of data that will be generated, the second states how much data will be created in total, and the third estimates the volume of data to be created per year. Your plan should address all of these elements, if possible.

Less Developed:
“The main goal of this project is to conduct simulations to better understand the thermosphere and ionosphere. Therefore, the data that will be produced from this project are simulations. The model that we utilize produces 3D data covering from 100 km to 600 km altitude with roughly 50 grid spacings. In the latitude and longitudinal directions, the spacing is typically 2.5 x 2.5 degrees.”

A common error in this section is to lapse into a recap of the project summary, as illustrated by this example. Stick to describing the types of data to be generated, touching on methods only when necessary to explain what (or how much) data you will be creating.

Summary

Describe the data you will produce in the course of the project (see specific agency guidelines for what types of data to include). Describe both the subject matter and the file format(s), and estimate how much data you expect to have. If you will be generating multiple data sets, answer the questions below for each data set. If you know you won’t be keeping all the data you generate, state what you will and won’t retain, and why.
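One way to arrive at a volume estimate is to list each data set with its format and expected yearly volume, then total the figures over the life of the project. The sketch below is only an illustration: the `DataSet` class, the `total_volume_tb` function, and the three-year project length are assumptions made for this example, and the yearly figures are borrowed from the turbulent channel flow excerpt quoted above.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    """One data set in the plan (hypothetical structure for illustration)."""
    name: str
    file_format: str
    tb_per_year: float  # expected volume in terabytes per year


def total_volume_tb(data_sets, years):
    """Total expected volume in TB across all data sets for the whole project."""
    return sum(d.tb_per_year for d in data_sets) * years


# Yearly figures taken from the quoted excerpt; the names are illustrative.
data_sets = [
    DataSet("simulation output", "binary", 40.0),
    DataSet("post-processing files", "text", 1.0),
]

PROJECT_YEARS = 3  # assumed project length, not stated in the excerpt

for d in data_sets:
    print(f"{d.name} ({d.file_format}): {d.tb_per_year} TB per year")
print(f"Total over {PROJECT_YEARS} years: {total_volume_tb(data_sets, PROJECT_YEARS)} TB")
```

Reporting both the per-year figure and the project total covers the two kinds of volume estimate shown in the preferred examples.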

Questions to Consider

  • What is the general subject and nature of the research data you will be generating? For example: interview transcripts, experimental measurements and protocols, code for statistical analysis, qualitative data, or simulations.
  • How will the data be created or captured? What file formats will you use?
  • Are there any privacy or confidentiality concerns?
  • If you will be using existing data, state that fact and where you got it. What is the relationship between the data you are collecting or generating and the existing data?