Introduction to Bioinformatics 2016

Course Description

Using real-life examples, we will look into basic methods that are used extensively in Bioinformatics and Biostatistics.

Course topics

Bioinformatics is an interdisciplinary field that aims to analyze and interpret biological data using data science methods. In this course we will start with an introduction to data pre-processing techniques in Bioinformatics, then learn how to apply various Machine Learning and Data Mining methods to analyze different types of biological data and predict a biological signal of interest. Additionally, we will introduce publicly available web tools for analysis and visualization.

As we aim to provide hands-on practical skills, all lectures will be followed by practice sessions in R. R is one of the most popular programming languages used in data science for data exploration, visualization and statistical analysis, and RStudio gives it a very handy user interface. Moreover, the course will be based on real-life data sets to give participants an idea of the problems they might face when analyzing their own data.

Course structure

  • Lecture I: data pre-processing, feature selection techniques
    • + practice
  • Lecture II: prediction of phenotype with different ML algorithms
    • + practice
  • Lecture III: clustering, visualization and web tools (online practice)

Course tools



Prerequisites: basic statistics and math. No previous experience with R is required.


Mr. Dmytro Fishman
A PhD student in the field of Bioinformatics at the University of Tartu

Affiliation: University of Tartu/Quretec Ltd.

Dmytro received his Bachelor’s degree from the National University of Ukraine (KPI) and his Master’s degree from the University of Tartu (Estonia). He is now a PhD student at the University of Tartu, working in the field of bioinformatics.

Dmytro has experience teaching Data Mining, Machine Learning, Bioinformatics, Advanced Algorithms and Text Algorithms courses to post-graduate students at the University of Tartu. He recently became a certified trainer in the Data and Software Carpentry organisations, which aim to teach Data Science skills to scientists from different areas.

He has been a program committee member of the Summer School AACIMP and is currently a member of the UPEER organisation, which aims to contribute to the development of local scientific societies. He is an active participant in Kaggle competitions, one of which could be used as project work.

Fields of interest: Data Mining, Machine Learning, Bioinformatics, Image Recognition, Deep Learning, Advanced Algorithms.

Contacts: [email protected]

Ms. Elena Sügis
Bioinformatics researcher at the University of Tartu

Affiliation: University of Tartu/Quretec Ltd.

Currently, I work as a bioinformatics researcher at Quretec Ltd., and as a junior bioinformatics researcher at the Institute of Computer Science, University of Tartu, Estonia.

I have recently become a certified trainer in Data and Software Carpentry organizations that aim to teach fundamental Data Science skills needed to conduct research.

My own research involves applying biostatistics, data mining, network analysis and data integration to a wide range of biological questions.

At the moment, I mainly work for AgedBrainSYSBIO, a European collaborative research project that aims to address the basis of brain aging and associated neurodegenerative disorders like Alzheimer’s disease. In AgedBrainSYSBIO I’m developing new ways of data integration and statistical analysis of various biological data types in a consistent manner.

Additionally, I actively participate in events that popularize science and IT among the general public.

Fields of interest: Data mining, Bioinformatics, Biostatistics, Network analysis, Machine learning, Data integration.

Contacts: [email protected]