In this section we will go over data sharing and data repositories.
Sharing your research data
Sharing the data that you use in your research helps to ensure that other researchers (and you) can find and re-use that data for future research projects. This is particularly important in the context of publicly funded research: if the public has paid for the collection and analysis of the data, then it’s a reasonable expectation that the data be shared.
As more and more research has become data-driven, the need to share and re-use data has grown with it. Almost everything we do these days has some connection to some sort of data, whether we realize it or not.
The FAIR Principles for Research Data website provides detailed information and rationales for data sharing. The FAIR Principles stipulate that data should be:
- Findable
- Accessible
- Interoperable
- Reusable
In March 2021, the Canadian Institutes of Health Research (CIHR), the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Social Sciences and Humanities Research Council of Canada (SSHRC) published their Tri-Agency Research Data Management Policy. It outlines specific steps that research institutions (like UBC) and researchers must take to ensure that research data “be responsibly and securely managed and be, where ethical, legal and commercial obligations allow, available for reuse by others.”
Does that mean that your research data needs to be freely available to everyone? Not necessarily. Some data cannot be freely and openly shared: data that has specific licensing constraints or data that is ecologically or personally sensitive. Although you will still be required to retain and preserve that data, it is possible to make other researchers aware of the existence of the data without opening up the data itself.
For your MGEM capstone project, however, you’re going to deposit your final report and the supporting data in a publicly accessible data repository, UBC Dataverse.
Data repositories
Data repositories are online platforms for storing and sharing research data files alongside the types of metadata Evan discussed. Many different types of organizations host data repositories, each for their own reasons. Some examples include:
- Mendeley Data, hosted by the publisher Elsevier
- Dryad, a non-profit organization, focusing mostly on life sciences data
- Environmental Data Initiative Repository
- British Columbia Data Catalogue, hosted by the government of BC
- Scholars Portal Dataverse, hosted by Scholars Portal
Research institutions might have data repositories that are not publicly accessible. For example, a research council or government agency might have a well-organized repository for its data that is accessible only from its own networks. The same is true for research labs within universities: most have some sort of data storage where their active (or working) data lives.
The re3data website provides a valuable registry of data repositories across many disciplines. It can be an excellent place to start when you’re looking for data to support your own research.
Our focus is on depositing research data in a publicly accessible repository to make it available for other researchers to find and re-use. In order for it to be findable, you have to pair your data files with some descriptive metadata and README files. This type of metadata is similar to what Evan just talked about, but is oriented more toward search and discovery systems. Preparing the metadata takes a bit of work and experience, but if you’ve been methodical in working with your data it shouldn’t be too onerous.
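To make the idea of pairing data files with a README more concrete, here is a minimal Python sketch that drafts a plain-text README listing every file in a data folder with its size and checksum. The function names (`sha256_of`, `build_readme`), the `README.txt` file name, and the Title/Author/Description fields are assumptions made for illustration; they are not a format required by UBC Dataverse, which collects its own descriptive metadata when you deposit.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_readme(data_dir: Path, title: str, author: str, description: str) -> str:
    """Assemble README text describing every file under data_dir.

    The header fields are illustrative only (an assumption, not a standard).
    """
    lines = [
        f"Title: {title}",
        f"Author: {author}",
        f"Description: {description}",
        "",
        "Files:",
    ]
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        if path.name == "README.txt":
            continue  # do not list the README inside itself
        rel = path.relative_to(data_dir).as_posix()
        size = path.stat().st_size
        lines.append(f"- {rel} ({size} bytes, sha256 {sha256_of(path)})")
    return "\n".join(lines) + "\n"
```

You could call `build_readme(Path("my_project_data"), "My capstone data", "Your Name", "One-sentence summary")` and write the result to `README.txt` in that folder (the folder name here is a placeholder). A generated file list like this is only a starting point: you would still add by hand the details a script cannot know, such as what each variable means, how the data were collected, and any licensing terms.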