Dataviz.Shef

Learning Path - Workflow
Increase your research impact through reproducible data visualisation workflows.
Dataviz.Shef Team

WELCOME!

This is the third learning path prepared by the Dataviz.Shef team. It is designed for those who have completed the Learning Path - Lab, or who are already experienced in creating data visualisations with tools such as Python, R, MATLAB, or JavaScript. If neither applies to you, we recommend working through the Learning Path - Lab before reading on.

We refer to external resources throughout, because many good resources are already available on the internet; we have organised them into relevant sections. In addition, the University has a partnership with LinkedIn Learning, which provides thousands of online training courses to staff and students through MUSE, and we have included some useful courses to help you get started.

The previous learning path explored what each programming language can do to produce suitable data visualisations across three stages: data processing, data visualisation, and sharing. This learning path guides you through a reproducible workflow and shows how it might help you increase your research impact.

Reproducibility

In computer science, reproducibility means that, as long as the environment and initial conditions are the same, a program produces the same result every time it is executed, whether it runs from beginning to end without stopping or is stopped and restarted. Reproducible data visualisation should follow the same principle.
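As a minimal sketch of this principle, the Python snippet below fixes the initial conditions (a random seed) and compares a fingerprint of the output across two runs. The function names `simulate_measurements` and `fingerprint` are illustrative assumptions, not part of any Dataviz.Shef tooling, and the "measurements" are synthetic values generated purely for the demonstration.

```python
import hashlib
import random


def simulate_measurements(seed: int, n: int = 5) -> list:
    """Generate synthetic 'measurements' from a fixed seed (illustrative only)."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    return [round(rng.gauss(0.0, 1.0), 6) for _ in range(n)]


def fingerprint(values: list) -> str:
    """Return a SHA-256 digest of the values so two runs can be compared."""
    payload = ",".join(f"{v:.6f}" for v in values).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Same seed (same initial conditions) should give the same data on every run.
first_run = simulate_measurements(seed=42)
second_run = simulate_measurements(seed=42)
print(fingerprint(first_run) == fingerprint(second_run))
```

Recording the seed and a digest of the input data alongside a figure is one simple way to let a reader check that they are starting from the same conditions as the author.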

Given the data and code provided by an author, we should be able to get the same output by following the same steps as the author. It is therefore important, for data visualisations as well as for publications, to ensure that data and code are preserved and remain accessible over a long period of time for further research or reference. We hope you find this information useful. If you have any questions, or have found any errors on this page, please contact us through our communication channels (email, Slack, Google Group).

Make yourself identifiable

Two researchers can share the same name or initials, for example because of cultural differences or different writing systems. It is therefore important to have a mechanism that identifies who you are, regardless of where and when you publish articles, deposit datasets, or make other contributions. To resolve this problem, ORCID (Open Researcher and Contributor ID) was introduced in 2012. It provides a persistent digital identifier (an ORCID iD) that distinguishes you from other researchers, together with a record that supports automatic links among all your professional activities. Your ORCID iD and connections are stored in the ORCID Registry, in an account you own and manage. To register, visit orcid.org.
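If you store ORCID iDs alongside your data or code (for example in a metadata file), it can help to check that they have been typed correctly. An ORCID iD is 16 characters, usually written in four hyphen-separated groups, and its final character is a check digit computed with the ISO 7064 MOD 11-2 algorithm. The sketch below is one way to validate that check digit in Python; the function names are our own, and the iD used in the example is the fictitious sample identifier that ORCID uses in its documentation.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string looks like an ORCID iD with a correct check digit."""
    compact = orcid.replace("-", "").strip().upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


# Fictitious sample iD from ORCID's documentation, not a real researcher.
print(is_valid_orcid("0000-0002-1825-0097"))
```

A check like this only catches typing mistakes. It does not confirm that the iD is registered or that it belongs to a particular person.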

Make your data identifiable

Identifying datasets and code is as important as identifying yourself. The DOI (digital object identifier) was introduced in 2000 to identify resources across academic, professional, and government information. Citing a DOI lets you reference your datasets and code within your publication. If other researchers can easily locate your data, they can use it to reproduce your work, make sense of your data, and communicate the findings more widely. Read the next section to learn more about the data repositories you can use.
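A DOI can be turned into a link by appending it to the `https://doi.org/` resolver. The sketch below shows one way to normalise a DOI written in different styles into such a link. The function name `doi_to_url` and the DOI `10.1234/example-dataset` are hypothetical, and the pattern used to check the DOI is a deliberately loose assumption (a `10.` prefix, a numeric registrant code, a slash, and a suffix) rather than a full validation.

```python
import re
from urllib.parse import quote

# Loose shape check only: "10.", 4 to 9 digits, "/", then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Normalise a DOI string and return its https://doi.org/ resolver link."""
    cleaned = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"Not a recognisable DOI: {doi!r}")
    # Percent-encode unusual characters in the suffix, keeping the slash.
    return "https://doi.org/" + quote(cleaned, safe="/")


# Hypothetical DOI used purely for illustration.
print(doi_to_url("doi:10.1234/example-dataset"))
```

Keeping DOIs in one consistent form in your code and figure captions makes it easier for readers to follow them back to the deposited data.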

Deposit your data

For reproducibility, we advise you to deposit data and code in a data repository where they can be accessed and cited by anyone around the world through a unique DOI (digital object identifier). With some repositories, you can use programming languages such as Python and R to fetch data directly from the source using the URL provided. If your data is not classified as sensitive and no discipline-specific repository is available, we encourage you to deposit data and code in the University's data repository, Online Research Data (ORDA), which is powered by figshare. For other recommendations, including discipline-specific repositories, visit the University Library's Research Data Repositories page.
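As a rough illustration of fetching data directly from a URL, the Python sketch below downloads a CSV file and parses it into rows using only the standard library. The function name `load_csv_from_url` is our own, and the example uses an inline `data:` URL as a stand-in so that it runs without network access. In practice you would pass the file download URL shown by your repository, and you may need to adjust the encoding or parsing to suit your file.

```python
import csv
import io
from urllib.request import urlopen


def load_csv_from_url(url: str, encoding: str = "utf-8") -> list:
    """Download a CSV file from a URL and return its rows as dictionaries."""
    with urlopen(url, timeout=30) as response:
        text = response.read().decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))


# Stand-in for a repository download link: a tiny CSV embedded in a data: URL.
# Replace this with the file URL provided by your data repository.
example_url = "data:text/csv;charset=utf-8,year,value%0A2020,1.5%0A2021,2.0"

rows = load_csv_from_url(example_url)
print(len(rows), rows[0]["year"], rows[0]["value"])
```

Loading the data from its deposited location, rather than from a local copy, means that anyone running your code starts from the same file that you cite.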

If you have a GitHub account, check out the following articles:

In some cases your data will come from external sources such as data.gov.uk. Don't forget to acknowledge the source of the information in your data and to state the licence that applies to it.

We also encourage you to look beyond data visualisation and visit the University Library's RDM page, which gives a broad overview of Research Data Management at the University along with more specific guidance.


Congratulations!

You have completed the Learning Path - Workflow. You now understand the importance of a reproducible data visualisation workflow and are able to deposit data and code in suitable data repositories. We update the learning paths from time to time, so keep an eye on this website and our community channels for the latest information.