Research Data Management

Preserving and Sharing

The Research Data Management Policy at the University of Galway states that

  • Research data of future historical interest, and all research data that represent records of the University, including data that substantiate research findings, will be offered and assessed for deposit and retention in an appropriate national or international data service or domain repository, or a University repository.
  • Exclusive rights to reuse or publish research data should not be handed over to commercial publishers or agents without retaining the rights to make the data openly available for re-use, unless this is a condition of funding. An exemption from this must be agreed with the Technology Transfer Office.

Reasons for sharing data 

  • Impact & longevity: Your data may be cited by others. Open publications and data receive more citations, over longer periods
  • Compliance: Funders, publishers and institutions may require that you share your data
  • Transparency & quality: Your findings can be replicated and compared with other studies
  • Collaboration: Sharing creates opportunities for follow-on research and collaboration
  • Re-use: Your data can be used in novel ways. Data sharing facilitates re-use of your data for future or follow-on research and discovery, because data can be funded and collected once and then used many times for a variety of purposes
  • Efficiency: Data sharing is good research practice!

There may be reasons for not sharing your data, e.g. privacy and confidentiality issues or the commercial value of the data. Horizon 2020 has coined the phrase “As open as possible, as closed as necessary.”

If you are unable to share your data publicly, consider making it available internally to future researchers to facilitate follow-on research, and/or creating a metadata record in your chosen archive or repository. A metadata record describes your data and helps others to learn that it exists. To ensure this can happen, you will need to manage your data.

Ref: CONUL Research Group (2018). Where to submit data: CONUL Information Sheet.

Legal and Ethical considerations

Some data may not be suitable for sharing. There may be legal or ethical factors to consider e.g., consent, privacy, copyright or commercial considerations.

For further information, read the DCC's (Digital Curation Centre) guide on How to appraise and select research data for curation.

Data deposits should be accompanied by supporting documentation and metadata to help others make sense of your data. A data license to indicate how you expect the data to be used is also necessary. For further information, read the DCC's (Digital Curation Centre) guide on How to License Research Data.

Restricting access to your research data

There are many reasons why access to research data may need to be restricted. Some examples are provided below:

“We intend to make a patent application, and must avoid prior disclosure.”

“Don’t want to make locations of members of endangered species available to poachers.”

“The research data are confidential because of the arrangement my research group has made with the commercial partner sponsoring our research.”

“My data form part of a long-term study upon which my research group is entirely reliant for its on-going research publications and academic reputation.  We only share this with trusted colleagues.”

Sensitive Data

A Guide for Researchers from OpenAIRE: How to deal with sensitive data: Learn how to preserve your sensitive data safely

Generalist Repository Ecosystem Initiative (GREI) Workshop https://doi.org/10.5281/zenodo.7714262 

Attribution and credit for research outputs also apply to research data. Data citation provides the information necessary to locate, attribute and access the research data, and also enables the data to be verified or reused.

Benefits of data citation

  • Facilitates a link from publications to underlying data
  • Ensures recognition of scholarly effort
  • Increases transparency of research
  • Enables the measurement of impact
  • Facilitates access to and re-use of data

Citations should include a Persistent Identifier

  • ORCID (for people) 
  • DOI (Digital Object Identifier): a unique alphanumeric string assigned by a registration agency to identify content and provide a persistent link to its location. DOIs may be assigned to any item of intellectual property that is defined by structured metadata
  • ARK (Archival Resource Key): a URL designed to support long-term access to information objects. An ARK can refer to digital, physical, or intangible objects or living beings and groups

ADVICE: Use the Library's DOI service to obtain a DOI for a dataset. 
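DOIs appear in several forms (bare, prefixed with "doi:", or as a full link). The following is a minimal Python sketch of how such variants might be normalised into a resolvable https://doi.org/ link before being placed in a citation. The function name `doi_to_url` and the loose pattern `DOI_PATTERN` are illustrative assumptions, not an official validation rule.

```python
import re

# Loose pattern (an assumption for illustration): "10.", a 4-9 digit
# registrant code, "/", then a non-empty suffix without whitespace.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common prefixes that may precede the bare DOI.
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(identifier: str) -> str:
    """Return an https://doi.org/ link for a DOI given in any common form."""
    doi = identifier.strip()
    for prefix in PREFIXES:
        if doi.lower().startswith(prefix):
            # Drop the prefix so that only the bare DOI remains.
            doi = doi[len(prefix):].strip()
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"Not a recognisable DOI: {identifier!r}")
    return f"https://doi.org/{doi}"


# A DOI cited elsewhere in this guide, written in two different forms.
print(doi_to_url("http://doi.org/10.5281/zenodo.1065991"))
print(doi_to_url("doi:10.5281/zenodo.3946720"))
```

Storing the bare DOI and generating the link on output keeps citations consistent even when sources record the identifier differently.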

A dataset citation should include where applicable:

  • Author/Principal Investigator/Data Creator
  • Publication date/Release Date, for a completed dataset
  • Title of Data Source – formal title of the dataset
  • Version/Edition Number – the version of the dataset used in the study
  • Format of the Data – physical format of the data
  • Publisher
  • Resource type
  • Persistent Identifier - such as a DOI
  • Location or Identifier – a persistent URL where the dataset may be accessed e.g.,  Digital Object Identifiers (DOI), Handles, Archival Resource Key (ARK), etc.
  • Access Date and Time – when data is accessed online
  • Subset of Data Used – description based on organization of the larger dataset
  • Editor or Contributor – reference to a person who compiled data, or performed value-added functions
  • Publication Place – city, state and country of the distributor of the data
  • Data within a Larger Work – refers to the use of data in a compilation or a data supplement (such as published in a peer-reviewed paper)
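As a rough illustration of how the elements listed above might be combined, the following Python sketch assembles a citation string from a subset of them. The class name `DatasetCitation`, its field names, and the ordering of elements are assumptions made for this example; follow the citation style required by your repository or publisher. The sample values are taken from a reference cited elsewhere in this guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetCitation:
    """A subset of the citation elements; optional ones may be omitted."""
    creators: str
    year: int
    title: str
    publisher: str
    identifier: str                      # persistent identifier, e.g. a DOI link
    version: Optional[str] = None
    resource_type: Optional[str] = None

    def format(self) -> str:
        # Build the citation in one possible order:
        # creators (year). title. version. publisher. [type]. identifier
        parts = [f"{self.creators} ({self.year})", self.title]
        if self.version:
            parts.append(f"Version {self.version}")
        parts.append(self.publisher)
        if self.resource_type:
            parts.append(f"[{self.resource_type}]")
        # Avoid doubled full stops when a part already ends with one.
        text = ". ".join(part.rstrip(".") for part in parts) + "."
        return f"{text} {self.identifier}"


citation = DatasetCitation(
    creators="Stall, Shelley et al.",
    year=2020,
    title="Generalist Repository Comparison Chart",
    publisher="Zenodo",
    identifier="https://doi.org/10.5281/zenodo.3946720",
)
print(citation.format())
```

Keeping the elements as separate fields, rather than as one free-text string, makes it easier to reformat the same citation for different journals.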

See also guidelines provided by the Digital Curation Centre

Learn more about ...

Data citation principles

  • FORCE11: a community of scholars, librarians, archivists, publishers and research funders that has arisen organically to help facilitate the change toward improved knowledge creation and sharing

Software citation principles

  • Smith AM, Katz DS, Niemeyer KE, FORCE11 Software Citation Working Group. (2016) Software citation principles. PeerJ Computer Science 2:e86

Data repositories or archives

A data repository allows researchers to upload and publish their data, thereby making the data available for other researchers to re-use. Similarly, a data archive allows users to deposit and publish data but will generally offer greater levels of curation to community standards, have specific guidelines on what data can be deposited and is more likely to offer long-term preservation as a service. Sometimes the terms data repositories and data archives are used interchangeably. A data repository or archive will provide services such as:

  • Persistent identifier such as a “digital object identifier” or DOI; the presence of a DOI facilitates discoverability and citeability
  • Assistance with metadata provision e.g. through the use of a template
  • Allow you to apply a licence to your data
  • Aid compliance with the FAIR data principles (data that are Findable, Accessible, Interoperable, and Reusable) as data are published online with appropriate metadata and are assigned a persistent identifier, see Jones, Sarah, & Grootveld, Marjan. (2017, November). How FAIR are your data?. Zenodo. http://doi.org/10.5281/zenodo.1065991
  • Accept a wide range of data types
  • Long-term access and, in some cases, long-term preservation
  • Offer useful search, navigation and visualisation functionality
  • Reach a wider audience of potential users
  • Manage requests for data on your behalf

When to select a data repository?

Choose early so that you can familiarise yourself with the repository’s requirements. Requirements may include:

  • depositing in certain file formats
  • using a specific metadata standard
  • inclusion of documentation to help describe your data.

Understanding such requirements will enable you to design your data collection materials for easier metadata and documentation creation.

Initial questions

  • Has a data repository been specified by my funder? e.g.
    • NERC Data Centre for research funded by the UK’s Natural Environment Research Council

How to select a data repository

Ask:

  • Is it reputable? Is it listed in Re3data thereby meeting their conditions of inclusion?
  • Is it appropriate to my discipline?
  • Will it take the data you want to deposit?
  • Is there a size limit?
  • Does it provide a DOI / persistent identifier?
  • Does it provide guidance on how the data should be cited?
  • Does it provide access control, where necessary, for your research data?
  • Does it ensure long-term preservation / curation?
  • Does it provide expert help with e.g. metadata provision, curation?
  • Is there a charge?

Other questions may pertain depending on your requirements. For more information see the UK’s Digital Curation Centre’s checklist
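One way to compare candidate repositories is to record the answers to the questions above and list the requirements each candidate fails to meet. The Python sketch below is illustrative only: the class `RepositoryAssessment`, the function `unmet_requirements`, the question keys, and the example repository are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RepositoryAssessment:
    """Yes/no answers to the selection questions for one repository."""
    name: str
    answers: Dict[str, bool]


def unmet_requirements(assessment: RepositoryAssessment,
                       required: List[str]) -> List[str]:
    """Return the required criteria that are unanswered or answered 'no'."""
    return [q for q in required if not assessment.answers.get(q, False)]


# Criteria that this (hypothetical) project treats as essential.
required = [
    "listed_in_re3data",
    "provides_persistent_identifier",
    "offers_access_control",
]

candidate = RepositoryAssessment(
    name="Example Repository (hypothetical)",
    answers={
        "listed_in_re3data": True,
        "provides_persistent_identifier": True,
        "offers_access_control": False,
        "charges_fee": False,
    },
)

print(unmet_requirements(candidate, required))
# → ['offers_access_control']
```

Treating an unanswered question as unmet is a deliberate choice here: it prompts you to check the repository's documentation rather than assume a service is provided.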

Locate a data repository

Some universities have their own data repositories that allow researchers to deposit, share and license their data resources for discovery and use by others. There are more than 600 discipline-specific data repositories worldwide with community-specific standards. They may also be called data centres or archives.

re3data.org (Registry of Research Data Repositories) is the primary place to locate a data repository.  You can search it by specific research discipline and then filter by access categories, data usage licenses, whether the repository gives the data a persistent identifier etc. 

Re3data uses a series of symbols to indicate key services.

  • To be registered in re3data.org a research data repository must:
    • be run by a legal entity, such as a sustainable institution (e.g. library, university)
    • clarify access conditions  to the data and repository as well as the terms of use
    • have focus on research data

See also FAIRsharing.org which is a manually curated registry and has a historical focus on the life sciences.

Discipline-specific repositories have the expertise and resources to deal with particular types of data. They have different policies and may charge for their services.

See also PLOS recommended repositories

Multidisciplinary repositories

If there is no discipline-specific repository in your area, select a general repository. These can handle a variety of different data types. Charges may apply but can be included in a funding application. Key general repositories are listed below. This list is for information purposes only and is not exhaustive:

Data Hub provides free access to its core features letting you search for data, register published datasets, create and manage groups of datasets

Dataverse: a personal dataverse is easy to set up, allows you to display your data on your personal website, can be branded uniquely as your research program, makes your data more discoverable to the research community, and satisfies data management plans

Dryad hosts a wide range of data types. For some journals there is no charge to deposit in Dryad.

Figshare is a repository where users can make all of their research outputs available in a citable, shareable and discoverable manner

GitHub is a code hosting site where you can store and share code for free

Open Science Framework is a free open platform that supports research and enables collaboration

Zenodo is a multi-disciplinary data repository where researchers can deposit both publications and data and create links between them

See also:

ICPSR is an international consortium of more than 750 academic institutions and research organizations that maintains a data archive of more than 250,000 files of research in the social and behavioral sciences

How to find a trustworthy repository for your data? Guides for Researchers from OpenAIRE

See also the Generalist Repository Comparison Chart by Stall, Shelley et al. (2020). Generalist Repository Comparison Chart. Zenodo. https://doi.org/10.5281/zenodo.3946720

See also ANNEX 1 - Inventory of identified trusted repositories.xlsx produced as part of the following report:

Jahn, Najko, Laakso, Mikael, Lazzeri, Emma, & McQuilton, Peter. (2023). Study on the readiness of research data and literature repositories to facilitate compliance with the Open Science Horizon Europe MGA requirements (Version 1.0). Zenodo. https://doi.org/10.5281/zenodo.772801

When you make your data available you need to use a license so that potential users know what they are allowed to do with your data. 

A license states what can be done with the data and how that data can be redistributed e.g. Creative Commons Licences

Ball, A. (2014). ‘How to License Research Data’. DCC How-to Guides. Edinburgh: Digital Curation Centre.

Learn more about rights relating to research data from the UK Data Service 

UK Data Service, Best practice in governance of data for research: Licensing and accessing. Webinar, Slides and Q&A, 18 April 2018

Code

GitHub is a widely used platform for hosting and reviewing code. GitHub does not itself assign DOIs, but it integrates with repositories such as Zenodo and Figshare, which can archive a GitHub repository and assign it a DOI. This facilitates discoverability and citeability and enables your code to be cited in academic literature.

The following guide is designed to assist curators of research data in making informed and sustainable value assessments for long-term preservation:

Jonathan Dorey, Grant Hurley, & Beth Knazook. (2022). Appraisal Guidance for the Preservation of Research Data. Zenodo. https://doi.org/10.5281/zenodo.5942236