Please note that these DMP excerpts are copyrighted by their respective authors.
“Verilog, SPICE, and MATLAB files generated will be processed and submitted to FTP servers as .mat files with TXT documentation. The data will be distributed in several widely used formats, including ASCII, tab-delimited (for use with Excel), and MAT format. Instructional material and relevant technical reports will be provided as PDF. Digital video data files generated will be processed and submitted to the FTP servers in MPEG-4 (.mp4) and .avi formats. Variables will use a standardized naming convention consisting of a prefix, root, suffix system.”
“Plasma image data will be RGB colored JPG or TIFF format with resolution determined by the camera. Video data will be RGB colored AVI format.”
These examples illustrate a preference for non-proprietary data formats based on open standards.
“The data format includes digital data recorded by computers and instruments and metadata recorded in lab notebooks and reports.”
This answer is too vague to be informative.
“The output files will be in various binary formats that can be directly read by commercially available visualization software (TecPlot and IDL), and we will also produce data in standard NetCDF and HDF5 formats. All these data formats contain metadata that describes the simulation grid and simulation time of the sequence, the variables and their physical units.”
“Whenever possible, standard formats of data will be used, e.g.: images: TIFF, BMP, JPG.”
Naming the commercial software packages you plan to use is a good idea, but try to include the version numbers, if known.
“The format of the electronic data will be specific to the format used by the particular software in which it was created. For data generated from instruments, the output will often be in a proprietary ASCII format, in some cases non-proprietary text format will be available.”
If proprietary formats must be used, provide a means for translating them to standard formats or justify your decision not to do so.
“We will retain data in the form for which the University of Michigan’s long-term data repository, Deep Blue, offers the highest level of support (level 1 support). For images and image renderings, the format will be .tiff; for confocal microscopy coordinate files, the format will be .txt.; for data points appearing in tables and graphs the format will be .txt. The format for metadata will be .pdf, except for image processing tools, whose source code will be retained as .txt.”
This is a good example: it lists specific file formats and gives correct, up-to-date information about Deep Blue.
Describe the formats (file types) your data will be in. Proprietary formats are more difficult to preserve, because the software and hardware that read them quickly become obsolete. Whenever possible, store data in stable, non-proprietary formats, preferably those based on open, published standards. If your research will generate files in proprietary formats, consider converting those files to formats based on open standards for sharing and archiving.
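One way to prepare this part of a plan is to inventory the file types a project already produces. The sketch below is only an illustration: the `OPEN_FORMATS` set and the `summarize_formats` function are hypothetical names, and the list of extensions is an assumption drawn from the formats mentioned in the excerpts, not an authoritative registry.

```python
from collections import Counter
from pathlib import PurePath

# Assumption: a short, incomplete list of extensions for formats based on
# open standards. Adjust it to match your repository's preferred formats.
OPEN_FORMATS = {".txt", ".csv", ".tif", ".tiff", ".nc", ".h5", ".pdf"}


def summarize_formats(filenames):
    """Count files per extension and split the counts into two groups:
    extensions on the open-format list and all other extensions."""
    counts = Counter(PurePath(name).suffix.lower() for name in filenames)
    open_counts = {ext: n for ext, n in counts.items() if ext in OPEN_FORMATS}
    other_counts = {ext: n for ext, n in counts.items() if ext not in OPEN_FORMATS}
    return open_counts, other_counts


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    files = ["run01.mat", "run02.mat", "grid.nc", "notes.txt", "plot.TIFF"]
    open_counts, other_counts = summarize_formats(files)
    print("open formats:", open_counts)
    print("convert or justify:", other_counts)
```

Extensions in the second group are candidates for conversion to an open format, or for a short justification in the plan.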
Current DMP guidelines are not specific about metadata requirements.
A metadata record is a file that captures all the details about a data set that another researcher would need in order to use it in a separate or related line of inquiry. Metadata captures the who, what, when, where, why, and how of the data you produce. When data curators talk about metadata, they are normally referring to a machine-readable description that comes in a standardized format, often defined by an XML schema.
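As a rough illustration of a machine-readable record, the sketch below writes the who, what, when, where, why, and how of a data set as XML. The element names and the `build_metadata_record` function are hypothetical and do not follow any published schema; a real record would normally use a community standard chosen with your repository.

```python
import xml.etree.ElementTree as ET

# Assumption: these six element names are invented for illustration and
# are not taken from any published metadata schema.
FIELD_NAMES = ("who", "what", "when", "where", "why", "how")


def build_metadata_record(fields):
    """Return an XML string with one child element per descriptive field.
    Fields missing from the input are written as empty elements."""
    root = ET.Element("metadata")
    for name in FIELD_NAMES:
        child = ET.SubElement(root, name)
        child.text = fields.get(name, "")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    record = build_metadata_record({
        "who": "Example Lab",
        "what": "Plasma image series",
        "when": "2012-06-01",
        "where": "Example University",
        "why": "Characterize discharge structure",
        "how": "RGB camera, TIFF output",
    })
    print(record)
```

Because the output is plain XML, another researcher or a repository's software can parse it without access to the tools that produced the data.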