"Information in digital, computer-readable format or paper-based that is collected, generated or obtained during the course of or as a result of undertaking research, which is subsequently used by the Researcher as a basis for making calculations or drawing conclusions to develop, support or revise theories, practices and findings."
Examples of research data
Research Data Lifecycle
"The notion of a data lifecycle is one that has gained popularity as the culture of data sharing becomes part of our everyday research language. The data lifecycle extends the typical research cycle"
Corti, Louise, Van den Eynden, Veerle, Bishop, Libby, & Woolard, Matthew. (2014). Managing and sharing research data: a guide to good practice. London: Sage. p.17
CC BY-ND see https://www.jisc.ac.uk/guides/rdm-toolkit
Sensible file names and a well-organised folder structure make it easier to find and keep track of data files. Links to relevant advice and resources provided by the UK Data Service are outlined below.
UK Data Service table of file formats recommended and accepted by them for data sharing, reuse and preservation.
UK Data Service guidelines for organising and formatting your data
UK Data Service guidelines on file and folder structures
It is important to ensure that different copies or versions of files, files held in different formats or locations, and information that is cross-referenced between files are all subject to version control. Guidance on version control and authenticity is available from the UK Data Archive.
A Guide for Researchers from OpenAIRE: Data formats for preservation
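As a minimal sketch of the file naming and version control advice above, the Python helper below builds a consistent, sortable file name that carries a date and a version number. The naming pattern (project, description, ISO date, two-digit version) and the function name build_filename are assumptions made for illustration; they are not a convention prescribed by the UK Data Service.

```python
import re
from datetime import date


def build_filename(project, description, when, version, ext):
    """Return a name such as 'projx_interview-notes_2024-03-05_v02.csv'.

    The pattern is a hypothetical convention chosen for this example.
    """
    def slug(text):
        # Lowercase the text and collapse every run of characters that is
        # not a letter or digit into a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    extension = ext.lstrip(".").lower()
    return f"{slug(project)}_{slug(description)}_{when.isoformat()}_v{version:02d}.{extension}"


name = build_filename("ProjX", "Interview notes", date(2024, 3, 5), 2, ".CSV")
print(name)  # → projx_interview-notes_2024-03-05_v02.csv
```

Putting the date in ISO order (year, month, day) and zero-padding the version means that an alphabetical listing of a folder is also a chronological one.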
Documentation is the contextual and explanatory information required to make sense of the dataset. It is a user's guide to your data, making it understandable, verifiable, and reusable.
Document your data so that ...
Research data should be documented at various levels
Study level
Data level
Examples of data documentation
Learn more ...
Advice about good practice relating to documentation and metadata is available from the UK Data Service.
A Guide for Researchers from OpenAIRE: Electronic Lab Notebooks - should you go “e”?
Metadata is similar to documentation but is more structured, conforms to set standards and is machine readable. It is required to facilitate archiving, discovery and citation of the dataset.
Metadata is a formal structured description of a dataset, used by archives to create catalogue records. There are three categories of metadata:
Descriptive metadata includes author, title, keywords and abstract, and enables users to find resources online.
Administrative metadata includes information about when and how a resource was created as well as file type, technical information and access rights.
Structural metadata provides information about the relationship between the parts that make up a compound object, e.g. relating articles, issues and volumes of serial publications, or the pages and chapters of a book.
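The three categories can be pictured as three groups of fields within one record. The Python sketch below is only an illustration: the key names and every value are hypothetical placeholders, and a real record would follow whichever metadata standard the chosen archive requires.

```python
import json

# Hypothetical record: all keys and values are placeholders for illustration.
record = {
    "descriptive": {
        # Helps users find the resource.
        "title": "Example survey dataset",
        "author": ["A. Researcher"],
        "keywords": ["survey", "example"],
        "abstract": "Placeholder abstract.",
    },
    "administrative": {
        # When and how the resource was created, file type, access rights.
        "created": "2024-03-05",
        "file_type": "text/csv",
        "access_rights": "open",
    },
    "structural": {
        # How the parts of a compound object relate to each other.
        "parts": [
            {"file": "responses.csv", "role": "data"},
            {"file": "codebook.txt", "role": "documentation", "describes": "responses.csv"},
        ],
    },
}

# Serialising to JSON is one way to make the description machine readable.
serialised = json.dumps(record, indent=2, sort_keys=True)
print(serialised)
```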
Metadata describes the content, quality, condition, and other characteristics of a dataset. It enables data to be preserved, minimizes duplication of effort in the collection of expensive digital data and fosters the sharing of digital data resources.
Why is metadata essential?
Metadata enables data developers to:
Metadata enables users to:
Metadata enables organizations to:
Essential fields
Title: Name of dataset or research project that produced it. (Include both if applicable.)
Creator(s): Names and addresses of the group that created the data.
Identifier: Unique identifier or number that is used to identify the data. This could be an internal project number or code to reference the data.
Abstract/Description: A brief synopsis of the project or data that another researcher can review quickly to see the relevance of the project to what they are seeking.
Dates: All the dates associated with the project. The most important is probably the release date of the data, but you'll eventually want to include:
Rights: Any known intellectual property rights held for the data or project.
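A simple completeness check can catch essential fields that are missing before a dataset is deposited. The Python sketch below assumes the record is held as a dictionary; the key names (for example "creators" and "description") and the function name missing_essential_fields are hypothetical choices for this example, and the sample values are placeholders.

```python
# Hypothetical key names standing in for the essential fields listed above.
ESSENTIAL_FIELDS = ("title", "creators", "identifier", "description", "dates", "rights")


def missing_essential_fields(record):
    """Return the essential field names that are absent or empty in the record."""
    return [name for name in ESSENTIAL_FIELDS if not record.get(name)]


# Placeholder draft record: no dates recorded yet and no rights statement.
draft = {
    "title": "Example survey dataset",
    "creators": ["A. Researcher"],
    "identifier": "PROJ-0001",
    "description": "Placeholder synopsis.",
    "dates": {},
}

print(missing_essential_fields(draft))  # → ['dates', 'rights']
```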
Recommended fields
Contributor(s): Names and addresses of additional individuals that contributed to the project.
Subject: Keywords, phrases, or subject headings that will describe the subject or content of the data. (In adding these, think of how you would search for the materials.)
Funders: Organizations or agencies that funded the research or project.
Access Information: The location of the data and how the researcher can access the materials. (Confidentiality can be addressed here as well.)
Language: The language(s) of the content.
Location: If the data relates to a physical location, the spatial coverage should be documented.
Methodology: The process of how the data was generated, including the equipment or software used (including the version), the experimental protocol, data validation and quality assurance of the data, and any other relevant information.
Data Processing: Documenting the alterations made to the data will aid in preservation of the data and record who made changes and for what reasons at specific times.
Sources: Citations for the sources that were used during the project. (Include where the other data or material was stored and how it was accessed when appropriate.)
List of File Names: List all of the data files associated with the project and include the file extensions. (e.g., stone.mov)
File Formats: Format(s) of the data and any software that is required to read the data including the version. (e.g., TIFF, FITS, JPEG, HTML)
File Structure: Organization of the data file(s) (and the layout of the variables when applicable).
Variable List: List of variables in the data files, when applicable.
Code Lists: Explanation of codes or abbreviations used in the file names, variables of the data, or the project overall that will help the user understand the project. (e.g., "999" indicates a missing value in the data)
Versions: A date/time stamp for each file, with a separate identifier for each version.
Checksums: Used to test if your file has changed over time. (This will aid in the long term preservation of the data and help make it secure by tracking alterations.)
Related Materials: Links or location of materials that are related to the project. (e.g., articles, presentations, papers)
Citation: The recommended way to cite the data or the information needed.
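The List of File Names and Checksums fields can be produced together. The Python sketch below walks a folder, records a SHA-256 checksum for each file, and later reports any file whose checksum no longer matches. It uses only the standard library; the function names and the choice of SHA-256 are assumptions for illustration, and an archive may require a different algorithm.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 checksum of a file as a hexadecimal string."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in chunks so that large data files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file under the folder (by relative path) to its checksum."""
    folder = Path(folder)
    return {
        path.relative_to(folder).as_posix(): sha256_of(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def changed_files(folder, manifest):
    """Return files that were added, removed or altered since the manifest was made."""
    current = build_manifest(folder)
    names = manifest.keys() | current.keys()
    return sorted(name for name in names if manifest.get(name) != current.get(name))
```

A manifest built with build_manifest when the data is finalised can be stored alongside the metadata; running changed_files against it at a later date returns an empty list if nothing has changed.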
What is a metadata standard?
A Standard provides a structure to describe data with:
Standards provide a uniform summary description of a dataset.
The Research Data Alliance Standards Directory contains widely used metadata standards in the Arts and Humanities, Engineering, Life Sciences, Physical Sciences and Mathematics, Social and Behavioural Sciences and General Research Data.
The Digital Curation Centre provides links to information about discipline specific metadata standards, including profiles, tools to implement the standards, and use cases of data repositories currently implementing them.
Biosharing is an educational resource on inter-related data standards, databases and policies in the life, environmental and biomedical sciences.
The Data Documentation Initiative (DDI) is an international standard for describing the data produced by surveys and other observational methods in the social, behavioural, economic, and health sciences. DDI is a free standard that can be used to document and manage different stages in the research data lifecycle, such as conceptualization, collection, processing, distribution, discovery, and archiving. Documenting data with DDI facilitates understanding, interpretation, and use by people, software systems, and computer networks.
CEDAR (Center for Expanded Data Annotation and Retrieval) is a repository of community-defined metadata templates. Its goal is to improve metadata and its use in the biomedical sciences. The CEDAR metadata tools can be used to create, annotate, analyze, validate and search metadata based on the fields and relations defined in the metadata templates.
Fairsharing.org is a good place to start to find metadata standards for your discipline.
Google Dataset Search is a good place to start your search for datasets related to your discipline. It is important to note that it is not a comprehensive index to datasets available in repositories.
Learn more about data repositories and archives
Note: Minimum requirements for third-party hosting may be specified, e.g. Geoscience Data Journal specifies that the host repository must be able to mint a DOI. New data journals that are peer reviewed and citable include Scientific Data (Nature) and the Geoscience Data Journal (Wiley).
Reference: Ware, Mark, & Mabe, Michael. (2015). The STM report: An overview of scientific and scholarly journal publishing. STM: International Association of Scientific, Technical and Medical Publishers. pp. 141-2
The Library proactively supports and enhances the learning, teaching, and research activities of the University. The Library acts as a catalyst for your success as University of Galway’s hub for scholarly information discovery, sharing, and publication.
Library
University of Galway