News

“How to make your messy data usable?” and “Metadata and README” courses REGISTRATION CLOSED

In April, ELIXIR Estonia will hold two online data management courses: "How to make your messy data usable?" on the 4th of April and "Metadata and README" on the 11th of April. Both courses will be held in Zoom and in English. 

The "How to make your messy data usable?" course will be in two parts. The first is a 1.5-hour online lecture on how to make a spreadsheet usable for other people, held on the 4th of April at 10:00 in Zoom. The second is a practical workshop on cleaning your messy data with OpenRefine software, delivered as a video lecture that you can follow in your own time. Additionally, we will hold 3 Q&A sessions in Zoom, where you can talk about any problems you encountered with the OpenRefine software. In the "Metadata and README" lecture, we will go over what exactly metadata is, what minimum information should be included with each of the scientific results you are sharing, and how exactly you can write a README file. 

 

In recent years, more attention has been paid to what researchers do with the data (and other resources) they produce, especially in Europe but also elsewhere. The main idea is that when researchers use taxpayers' money, the taxpayers themselves should also have access to the results, free of charge. This means that the research should be published in open access journals and data should be made publicly available. 

Good data management may help you with that, or at least make the process easier on the whole. If you think about what to do with your data at the beginning of and during the project, and know what you plan to do with it at the end, the final steps will be easier. However, what counts as “good data management” is up for debate. The FAIR Principles concentrate on making your data findable, accessible, interoperable and reusable, so they are a good start. And let’s be honest, some of these things you are probably already doing. 

 

How to make your messy data usable? course information

In this course, we will be going over how to name your files and variables, version control, compile a data dictionary, and what to do with empty cells. In the second part, OpenRefine software is introduced. With this, you can easily clean up the messy data. For the more practical aspect of using the OpenRefine software, I will share a video that will teach the basics. You can watch it anytime and do the lessons yourself. On three days (6.04, 7.04 and 8.04) there will be a 1h slot (10:00-11:00) on Zoom, when you can come and ask any question you have regarding tables and OpenRefine software. 

 

Information about the lecture:

Lecture: 4th of April, 2022 at 10:00 (lecture, 1.5h; in English)

Q&A session: 6.04, 7.04 and 8.04 at 10:00 (Q&A, feedback, 1h)

Place: ZOOM (link will be sent to your email)

Register: https://forms.gle/axZTA5rw3bPnKDww9 REGISTRATION IS CLOSED

Registration closes at 23:59 on 31.03.2022 or when the course gets full.

Learning outcomes: 

  • Compile a data table that abides by the FAIR Principles
  • Recognize what a clean table for others to use looks like
  • Explain how to use OpenRefine to clean the messy data
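
The ideas of handling empty cells and compiling a data dictionary can be illustrated with a minimal Python sketch. It is not part of the course materials: the table, the column names and the "NA" marker are all hypothetical, and OpenRefine offers its own tools for the same tasks.

```python
import csv
import io

# Hypothetical example table; column names and values are invented for illustration.
RAW = """sample_id,collection_date,weight_g
S001,2022-04-04,12.5
S002,,13.1
S003,2022-04-06,
"""

MISSING = "NA"  # assumed marker for empty cells; choose one and document it


def fill_empty_cells(rows, marker=MISSING):
    """Return copies of the rows with empty cells replaced by an explicit marker."""
    return [{k: (v if v.strip() else marker) for k, v in row.items()} for row in rows]


def data_dictionary(rows, marker=MISSING):
    """Summarise each column: its name, number of missing values and an example value."""
    summary = []
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        present = [v for v in values if v != marker]
        summary.append({
            "variable": column,
            "missing": len(values) - len(present),
            "example": present[0] if present else marker,
        })
    return summary


rows = fill_empty_cells(list(csv.DictReader(io.StringIO(RAW))))
dictionary = data_dictionary(rows)
for entry in dictionary:
    print(entry)
```

A real data dictionary would also record units and a human-readable description for each variable; those have to be written by the person who knows the data.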

 

Metadata and README lecture information

In general, metadata is the descriptive information about your data. However, what exactly is metadata and how much of it should be included with your data? Good metadata can make up for human fallibilities. People forget and misplace things, and leave research projects, taking their knowledge of the research methodology and the data with them. Metadata ensures that we will be able to find the data, use it, preserve it and reuse it in the future.

  • Finding Data. Metadata makes it much easier to find relevant data. Most searches are done using text (like a Google search), so formats like audio, images, and video are limited unless text metadata is available. Metadata also makes text documents easier to find because it explains exactly what the document is about.
  • Using Data. To use a dataset, researchers need to understand how the data is structured, definitions of terms used, how it was collected, and how it should be read.
  • Reusing Data. Researchers often want to reuse data collected for another project for their own project. The data still needs to be found and used, but often at a higher level of trust and understanding. Reusing data often requires careful preservation and documentation of the metadata.

This means that metadata provides additional information that helps data consumers better understand the meaning of the dataset and its structure, and clarifies other issues, such as rights and license terms, the organization that generated the data, data quality, data access methods and the update schedule of datasets. What an actual metadata file includes varies between disciplines and the types of data you are working with. However, the documentation for your data should contain the minimum information required to be able to reuse (or understand) the data described. 

In this lecture, we will go over what metadata about your dataset should be included when you are sharing it. We will also go over some examples of how to write a good README file. 

 

Information about the lecture:

Time: 11th of April, 2022 at 10:00 (lecture, 2h; in English)

Place: ZOOM (link will be sent to your email)

Register: https://forms.gle/YKvQyd8wrx2cvyYf9 REGISTRATION IS CLOSED

Registration closes at 23:59 on 31.03.2022 or when the course gets full.

Learning outcomes: 

  • Understands the importance of good data management
  • Knows what metadata means in data files
  • Knows how to add metadata to the data
  • Knows what should be included in the README file
  • Can write a simple README file to accompany the data
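
As a rough illustration of what a simple README accompanying a dataset might contain, here is a minimal Python sketch that assembles one from a few metadata fields. The field names loosely follow the kinds of information mentioned above (license, organization, access, update schedule), but the exact fields, the values and the `build_readme` helper are assumptions made for this example, not a template from the lecture.

```python
# Hypothetical metadata; every value below is a placeholder for illustration.
metadata = {
    "Title": "Example dataset",
    "Organisation": "Example University",
    "Licence": "CC BY 4.0",
    "Access": "Deposited in a public repository",
    "Update schedule": "Not updated after publication",
}

# Hypothetical file listing: file name mapped to a one-line description.
files = {"measurements.csv": "One row per sample; empty cells are marked NA"}


def build_readme(metadata, files):
    """Assemble a plain-text README from metadata fields and file descriptions."""
    lines = ["README", ""]
    for key, value in metadata.items():
        lines.append(f"{key}: {value}")
    lines += ["", "Files:"]
    for name, description in files.items():
        lines.append(f"  {name} - {description}")
    return "\n".join(lines) + "\n"


readme_text = build_readme(metadata, files)
print(readme_text)
```

Which fields are actually required depends on the discipline and the repository, so the dictionary above should be treated as a starting point only.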

 

Data Management Courses in March

Here is a list of data management related courses/webinars taking place in March, 2022. 

Overview of Open Research Europe, the open access publishing platform launched by the European Commission, on 4th of March, 2022. 

More information: https://us06web.zoom.us/meeting/register/tZUsfuysqzssGN3Ac2HNkKXmNTHQcRmuvz-m 

Data management related webinars by Aalto University in Finland, from March to May. The topics include: data management plans, how to store data, hands on anonymisation, etc. 

More information and registration: https://www.aalto.fi/en/services/training-in-research-data-management-and-open-science

Version control for Scientific Research using Git/Github on 11th of March, 2022. 

More information: https://unitn.zoom.us/meeting/register/tZwkdOGvrT0rE9VRG7VjbOTOvbxthg3TwV-p 

A Global Galaxy course Smörgåsbord is coming again! March 2022

GTN Smörgåsbord is a global 5-day Galaxy Training event showcasing a wide variety of Galaxy Training Network tutorials. This will be an online event, spanning all time zones. All training sessions are pre-recorded, so you can work through them at your own pace, and manage your own time. A large community of GTN trainers will be available via online support to answer all your questions.

More info and registration at bit.ly/smorgasbord2

The FAIR Principles lecture - (registration closed)

On the 31st of January, 2022, ELIXIR-Estonia will be holding an online data management course: The FAIR Principles. This lecture will be a short overview about the principles and is part of a bigger data management course package.

In recent years, more attention has been paid to what researchers do with the data (and other resources) they produce, especially in Europe but also elsewhere. The main idea is that when researchers use taxpayers' money, the taxpayers themselves should also have access to the results, free of charge. This means that the research should be published in open access journals and data should be made publicly available. 

Good data management may help you with that, or at least make the process easier on the whole. If you think about how to manage your data at the beginning of and during the project, and know what you plan to do with it at the end, the final steps will be easier. However, what counts as “good data management” is up for debate. The FAIR Principles concentrate on making your data findable, accessible, interoperable and reusable, so they are a good start. And let’s be honest, some of these things you are probably already doing. 

 

In this course, we will be going over all the FAIR Principles and how they are applied in real life. This way, you will already know what to consider, while writing a grant, filling out your data management plan or doing your research. 

 

Information about the lecture

Date: 31.01.2022 

Language: English

Time: 10:15 - 13:45

Place: Zoom, link will be sent a couple of days before the lecture

We ask you to register responsibly. If you can't attend the lecture, please let us know as soon as possible via email (elixir@ut.ee). 

Register: the registration is closed

Registration closes at 23:59 on 24.01.2022 or when the course gets full. A confirmation email will be sent on the 25th of January, 2022. 

NB! Since this course is popular, we have added another session of the lecture, on 2nd of February, 2022. 

Register: the registration is closed

To avoid missing a course next time, subscribe to our newsletter at https://lists.ut.ee/wws/subscribe/elixir.news?previous_action=edit_list_request

 

Learning outcomes: 

  • Understands the importance of good data management
  • Knows what the FAIR Principles are and what they mean
  • Knows how to implement FAIR principles throughout the research project
  • Knows where to get more information about the FAIR Principles

FAIR principles in life science research practice online seminar

An online seminar on FAIR principles in life science research practice by ELIXIR Sweden and ELIXIR Netherlands takes place on December 9th, 2021. The event is held in Zoom, with pre-registration required.

This online workshop will showcase how to put the FAIR principles into practice and is specifically designed to cater to life science researchers at all career stages. It is organised by SciLifeLab Data Centre and NBIS and includes international speakers, presenters from the SciLifeLab community and optional hands-on exercises guided by experts from NBIS and SciLifeLab Data Centre. 

More info at https://www.scilifelab.se/event/fair-principles-in-life-science-research-practice/

ETAIS & ELIXIR seminar UT HPC Usage 101 18.11.2021

During this UT HPC Usage 101 lecture course we will provide participants with the basic knowledge required to submit, monitor, and control jobs on the compute nodes of UT HPC. We will talk about the principles of computing in a cluster and how it differs from computing directly from the command line of a private server. We will introduce the good practices and standards of behavior that good cluster usage requires. The convenient graphical tools for monitoring the resources used by your work will also be covered. Finally, we might touch on the basic Slurm commands.

We expect the users to have basic Linux command line experience, but very little or no cluster compute experience yet.

Learning outcome:
Knowing how to submit, monitor and control jobs at UT HPC.

The seminar will take place in Zoom on 18th of November 2021 at 14.00 and will last approximately 90 minutes. The seminar will not be recorded.

Registration is open at https://forms.gle/pVe86HVqjWB6GpLg8 and will close on 16th of November or when the course gets full. 

The seminar is held in English by Ulvi Gerst Talas and Ott Eric Oopkaup from UT HPC/ETAIS.

 

 

ELIXIR Nextflow course 22.-23.11.2021 Tartu

Overview

Nextflow is a powerful polyglot workflow language that provides a robust, scalable and reproducible way to run computational pipelines. In very practical and interactive sessions, participants will learn about Nextflow technology starting from basic through more advanced concepts, with the expectation they will acquire the proficiency to develop and deploy their own workflows.

This 2-day course will train participants to build Nextflow pipelines and run them. It is designed to provide participants with short and frequent hands-on sessions, while keeping theoretical sessions to a minimum.

A GitHub repository will be provided with all the necessary material and software installation guidelines.

Audience

This hands-on course is designed for absolute beginners who want to start using Nextflow to achieve reproducibility of data analysis. 

Learning outcomes

At the end of the course, participants are expected to be able to:

  • Develop a Nextflow pipeline from scratch.
  • Describe and explain Nextflow's basic concepts.
  • Implement short blocks of code into a Nextflow pipeline.
  • Execute/Run a Nextflow pipeline.
  • Test and modify a Nextflow pipeline.
  • Locate and fetch Nextflow pipelines from dedicated repositories.
  • Run a pipeline in diverse computational environments (local, HPC, cloud).
  • Share a pipeline.

Prerequisites

Knowledge / competencies

Applicants should be comfortable working with the CLI (command-line interface) in a Linux-based environment. If you do not feel comfortable with UNIX commands, you can refresh your knowledge with the UNIX Shell course materials here.

During the course, participants will need to connect to a remote server via the "ssh" protocol. You can learn how ssh works here.

Participants should be able to use a command-line/screen-oriented text editor (such as nano or vi/vim, which are already available on the server) or an editor that can connect remotely. The instructor will be using VS Code with the SSH extension, so using the same text editor is recommended for ease of comparison. 

Knowledge of containers is recommended but not mandatory. 

Participants are encouraged to bring their own laptop, but local Linux computers can also be used (please let us know about it during the registration). They will work on their local machine to learn the basics and develop their very first pipeline. They will also work in a dedicated UT HPC environment in order to learn to run pipelines in different computational environments. 

Schedule

22.11.2021 Day 1: Introduction to Nextflow. Learning the building blocks

23.11.2021 Day 2: Understand, develop and run a basic Nextflow pipeline

The course runs from 10.15 to 18.00.

Registration

Please register responsibly. Due to the COVID-19 situation, we accept only fully vaccinated persons, who must bring and show their vaccination certificate AND take a rapid antigen test before the course starts on Day 1 (9.30 on Monday 22.11.2021). We accept up to 12 people for the course based on the data submitted to the registration form.

This course description is adapted from SIB Nextflow course page.

 

How to make your messy data usable? (registration closed)

On the 25th of November 2021, ELIXIR-Estonia will be holding a new data management online course: How to make your messy data usable. The course will be held in English. This course will be in two parts. The first is a 1-hour online lecture on what makes a data table usable for other people, held on the 25th of November at 13:00 in Zoom. The second is a practical workshop on cleaning your messy data with OpenRefine software, delivered as a video lecture that you can follow in your own time. Additionally, we will hold 3 Q&A sessions in Zoom, where you can talk about any problems you encountered with the OpenRefine software.

 

In recent years, more attention has been paid to what researchers do with the data (and other resources) they produce, especially in Europe but also elsewhere. Since most of your data needs to be uploaded to a repository, it is essential that the data is tidy and that other people can easily read and understand it.  

In this course, we will be going over how to name your files and variables, version control, compile a data dictionary, and what to do with empty cells. In the second part, OpenRefine software is introduced. With this, you can easily clean up the messy data. For the more practical aspect of using the OpenRefine software, I will share a video that will teach the basics. You can watch it anytime and do the lessons yourself. On three days (29.11, 30.11 and 1.12) there will be a 1h slot (11:00-12:00) on Zoom, when you can come and ask any question you have regarding tables and OpenRefine software. 

 

Information about the lecture

Lecture: 25th of November, 2021 at 13:00 (lecture, 1h)

Q&A session: 29.11, 30.11 and 1.12 at 11:00 (Q&A, feedback, 1h)

Place: ZOOM (link will be sent to your email)

Register: registration closed

Registration closes at 23:59 on 24.11.2021 or when the course gets full.

Materials: https://doi.org/10.5281/zenodo.5720271 

 

Learning outcomes: 

  • Compile a data table that abides by the FAIR Principles
  • Recognize what a clean table for others to use looks like
  • Explain how to use OpenRefine to clean the messy data

BY-COVID: A new EU project coordinated by ELIXIR

 

A new EU-funded project coordinated by ELIXIR has been launched today. BY-COVID is a €12 million Horizon Europe project striving to tackle the data challenges that can hinder effective pandemic response. 

This interdisciplinary project involves 53 project partners from across 19 countries within Europe and draws together experts from the life sciences, policy and social science ensuring that the project is driven by a diverse range of science. 

ELIXIR-Estonia is part of the consortium, and associate professor Hedi Peterson from the University of Tartu is leading the actions from the Estonian side.

Find out more in the news release on the BY-COVID website.

ELIXIR UNIX Shell Courses - Autumn 2021

ELIXIR Estonia is organising two Unix Shell Courses (Basic and Advanced) in the coming weeks (September 28th, October 5th).

Advanced computing power is hidden away in clouds, clusters and supercomputers to which you do not have point-and-click access. As a general rule, these high-performance computing resources use Linux operating systems and are accessible only through a shell terminal. We are here to teach you the skills to master the terminal in your future work.

Basics Course 28th of September, 10:15-16:00, University of Tartu, Delta #2005

This course is aimed to provide basic survival skills in Linux and the terminal environment. We will teach you how to access files and folders, move around and hopefully shake off the fear of getting stuck somewhere along the way.

Objectives:

  • Obtain basic knowledge on dealing with files using command line (Linux or Mac)
  • Learn how to search across several text files, combine files, and extract specific information.
  • Tips and tricks for effective command line hacks that would save a lot of time.

No prior knowledge expected. 

Requirements: The lecture venue is a computer class with Linux computers. If you bring your own Windows laptop, please make sure to install Git Bash (https://gitforwindows.org/) or the PuTTY application (https://www.putty.org/) beforehand.

The lecturer is Dr. Priit Adler.

The course is free but please register responsibly at  https://forms.gle/7oHXsQWs1H5BjvRg7  (closes at 23:59 on 25.09 or when the course gets full).

Unix Shell Advanced Course 5th of October, 10:15-16:00, University of Tartu, Delta #2005 (COURSE FULL) and  New edition on 19th of October, 10.15-16.00, University of Tartu, Delta #2005

This course is aimed to streamline your skills in Linux and the terminal environment. We will teach you how to make your life easier by creating maintainable and flexible bash scripts for your commonly used workflows or SLURM jobs.
Objectives:

  • Learn how to iterate operations over many input files with bash loops and conditions
  • Learn how to combine complicated command line based workflows into maintainable bash scripts
  • Add useful utility functions and tools to your toolbox

Expected prior knowledge: Experience in using the basic commands covered in the Basics course (e.g. cd, ls, mkdir, mv, cp, head, cat, find, less, pwd)


Requirements: The lecture venue is a computer class with Linux computers. If you bring your own Windows laptop, please make sure to install Git Bash (https://gitforwindows.org/) or the PuTTY application (https://www.putty.org/) beforehand.

The lecturer is Dr. Priit Adler.


The course is free but please register responsibly at  https://forms.gle/DEoHkbkPCBC1FfQ16 (closes at 23:59 on 17.10 or when the course gets full).