Metadata and Data Documentation

Introduction to tools, resources, standards, and support for metadata and data documentation.

Introduction

Many resources are available to help researchers find the metadata standard, vocabulary, or tool that best suits their research discipline. Below is a list of some of the most commonly used. If you would like assistance in finding the standards, vocabularies, or tools that best fit your research, contact Matt Carruthers, Metadata Engagement Librarian.

Resources for Finding Metadata Standards and Ontologies

  • Research Data Alliance Metadata Directory - The RDA Metadata Directory is a collaborative, open directory of metadata standards applicable to scientific data. Subject areas include arts and humanities, engineering, life sciences, physical sciences & mathematics, social & behavioral sciences, and general research data (multidisciplinary).
  • Linked Open Vocabularies (LOV) - LOV provides a searchable repository of vocabularies and ontologies used to describe many different disciplines and domains.
  • Data Documentation Initiative - DDI is an international standard for describing statistical and social science data. It contains a metadata specification, as well as a list of tools to help researchers work with DDI metadata.
  • FAIRsharing - FAIRsharing (formerly BioSharing) offers a searchable database of metadata standards, markup languages, taxonomies, and other resources for all disciplines.
  • BioPortal - BioPortal offers an extensive repository of biomedical ontologies, including a recommender tool to help choose the best ontology for your research.
  • Open Metadata Registry - The Metadata Registry provides services to developers and consumers of controlled vocabularies and is one of the first production deployments of the RDF-based Semantic Web Community's Simple Knowledge Organization System (SKOS).

Name Authority Files

Name authority files are controlled lists of names for individuals or organizations that help to identify those entities uniquely and consistently.

  • ORCID - ORCID provides a persistent digital identifier for researchers worldwide.
  • International Standard Name Identifier (ISNI) - ISNI provides a persistent digital identifier for the public identities of people and organizations across all fields of creative activity.
  • Virtual International Authority File (VIAF) - VIAF is an international service designed to provide convenient access to the world's major name authority files, including many authority files maintained by national libraries.
  • Library of Congress Name Authority File (LCNAF) - The LCNAF provides authoritative data for names of persons, organizations, events, places, and titles.
  • Union List of Artist Names (ULAN) - The ULAN is a structured vocabulary containing names and other information about artists, patrons, firms, museums, and others related to the production and collection of art and architecture.
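
Persistent identifiers such as ORCID iDs are easier to use consistently when they are checked before being recorded in a metadata record. The sketch below validates the format and check digit of an ORCID iD, assuming the ISO 7064 MOD 11-2 checksum that ORCID documents for its identifiers. The function names are illustrative and not part of any ORCID library, and a passing check only means the iD is well formed, not that it is registered to anyone.

```python
import re

# Assumed layout: four groups of four characters, where the final
# character is a check digit that may be 0-9 or X.
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check digit for the first 15 digits."""
    total = 0
    for char in base_digits:
        total = (total + int(char)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the iD is well formed and its check digit matches."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]
```

For example, `is_valid_orcid("0000-0002-1825-0097")` returns `True`, while changing the final digit to `8` makes the check fail.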

Tools for Creating and Managing Metadata

  • File Information Tool Set (FITS) - FITS identifies, validates and extracts technical metadata for a wide range of file formats.
  • JHOVE - JHOVE provides functions to perform format-specific identification, validation, and characterization of digital objects.
  • ExifTool - ExifTool is a platform-independent Perl library plus a command-line application for reading, writing and editing meta information in a wide variety of files. ExifTool is also available as a stand-alone Windows executable and a Macintosh OS X package.
  • Apache Tika - The Apache Tika™ toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). All of these file types can be parsed through a single interface, making Tika useful for search engine indexing, content analysis, translation, and much more.
  • Colectica for Excel - Colectica for Microsoft Excel is a free tool to document your spreadsheet data using DDI, the open standard for data documentation.
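
Command-line tools like those above can be scripted to collect technical metadata for many files at once. The following is a minimal sketch using ExifTool, assuming the `exiftool` executable is installed and on the system path. It relies on ExifTool's `-json` option, which prints a JSON array containing one object per file with a `SourceFile` key. The helper function names are hypothetical and chosen only for illustration.

```python
import json
import subprocess
from typing import Any


def build_exiftool_command(paths: list[str]) -> list[str]:
    """Build the argument list; -json requests machine-readable output."""
    return ["exiftool", "-json", *paths]


def parse_exiftool_json(output: str) -> dict[str, dict[str, Any]]:
    """Index ExifTool's JSON records by the file each one describes."""
    records = json.loads(output)
    return {record["SourceFile"]: record for record in records}


def extract_metadata(paths: list[str]) -> dict[str, dict[str, Any]]:
    """Run ExifTool on the given files and return their metadata.

    Assumes exiftool is installed; check=True raises CalledProcessError
    if ExifTool exits with an error (for example, a missing file).
    """
    completed = subprocess.run(
        build_exiftool_command(paths),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_exiftool_json(completed.stdout)
```

Calling `extract_metadata(["photo.jpg"])` on a hypothetical file would return a dictionary keyed by file path, which could then be written to a spreadsheet or mapped onto the fields of a chosen metadata standard.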