Research Data Management

What is research data?

Research data is:

  • Digital objects, such as text files, image files, and sound files
  • Databases: structured collections of records or data stored in a computer system

The Office of Research Integrity defines research data as:

  • Any information or observations that are associated with a particular project, including experimental specimens, technologies, and products related to the inquiry.

Following best practices for managing research data is essential throughout the data lifecycle.

Research Data Life Cycle


Data Management & Sharing SNAFU (a video in 3 short acts)

This short video, created by librarians at NYU's Ehrman Health Sciences Library, is a cautionary tale about data management: it shows what should not happen when a researcher makes a data sharing request.

Why manage your research data?

Data management is an essential part of the research process. Many funders, publishers, and institutions have data management and sharing policies. Well-managed data is easier to find, analyze, validate, and share, and it supports the published research built on it. Good data management practices are key to responsible research and reproducibility. In addition, data sets can be cited, which can enhance your research impact.

NIH has issued a new Data Management and Sharing (DMS) policy, effective January 25, 2023, to promote the sharing of scientific data. The policy requires researchers to:

  • Plan and budget for the managing and sharing of data
  • Submit a DMS plan for review when applying for funding
  • Comply with the approved DMS plan

Additional information is available on Einstein's Data Management and Sharing website.

The NSF Data Sharing Policy states that "investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants."

AHRQ's public access policy states that "digitally formatted scientific data resulting from unclassified research supported wholly or in part by Federal funding should be stored and publicly accessible to search, retrieve, and analyze."
In addition, all AHRQ-funded researchers will be "required to include a data management plan for sharing final research data in digital format, or state why data sharing is not possible."

Many journals require data be made available to readers. Here are a few examples.

FAIR Principles

  • Findable
    • Unique and persistent identifiers (e.g. DOIs, ORCIDs) and metadata allow data to be located quickly and efficiently
  • Accessible
    • Protocol is universally implementable (avoid using proprietary software and platforms)
    • Clearly stated conditions and requirements under which the data is accessible
    • Metadata are accessible even after data are no longer available
  • Interoperable
    • Common data elements to enable combining and sharing datasets (e.g. NIH CDE Repository)
    • Commonly used controlled vocabularies, ontologies, thesauri (e.g. MeSH)
    • Data dictionary to describe data
  • Reusable
    • Clear and accessible data usage license (e.g. Creative Commons)
    • Documentation of software, code, and similar files
    • Clear description of authorship and provenance
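
As a rough illustration of how the FAIR principles above might look in practice, the Python sketch below builds a small metadata record and checks it for a few FAIR-related elements. Everything in it is an assumption made for illustration: the field names, the `REQUIRED_FIELDS` list, the helper functions `missing_fields` and `undocumented_columns`, and all values (including the identifier and ORCID, which are placeholders) are hypothetical and do not come from any funder, repository, or metadata standard. A real project should follow the schema required by its chosen repository.

```python
# Hypothetical metadata record. Every value is a placeholder, not a real dataset.
metadata = {
    # Findable: a persistent identifier (placeholder, not a registered DOI)
    "identifier": "doi:10.0000/placeholder",
    "title": "Example blood pressure study",
    "creators": [{"name": "Example Researcher", "orcid": "0000-0000-0000-0000"}],
    # Accessible: conditions under which the data can be obtained
    "access_conditions": "Available on request under a data use agreement",
    # Reusable: a clear license plus authorship and provenance
    "license": "CC-BY-4.0",
    "provenance": "Collected by the example lab; cleaned with scripts in code/",
    # Interoperable: a data dictionary describing each column
    "data_dictionary": {
        "participant_id": {"type": "string", "description": "De-identified participant code"},
        "systolic_bp": {"type": "integer", "unit": "mmHg", "description": "Systolic blood pressure"},
        "visit_date": {"type": "date", "description": "Date of clinic visit (YYYY-MM-DD)"},
    },
}

# Assumed checklist for this sketch only; not an official FAIR requirement list.
REQUIRED_FIELDS = ["identifier", "access_conditions", "license", "provenance", "data_dictionary"]


def missing_fields(record):
    """Return the checklist fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


def undocumented_columns(columns, record):
    """Return dataset columns that have no entry in the data dictionary."""
    dictionary = record.get("data_dictionary", {})
    return [column for column in columns if column not in dictionary]


if __name__ == "__main__":
    dataset_columns = ["participant_id", "systolic_bp", "heart_rate"]
    print("Missing metadata fields:", missing_fields(metadata))
    print("Columns lacking a dictionary entry:", undocumented_columns(dataset_columns, metadata))
```

A check like this could be run before depositing data, so that gaps such as a missing license or an undescribed column are caught while they are still easy to fix.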