Master’s thesis defense, 2020

January 21-24, 2020. Faculty of Applied Sciences, UCU. Students of the master’s program in Data Science will defend their master’s theses. The event is free to attend. The address is Kozelnitska street 2a, Academic building of UCU, Lectorium (room #016).

Schedule of diploma defenses

January 21st

10:00, Serhii Brodiuk, Concept Embedding and Network Analysis of Scientific Innovations Emergence

Abstract. Novelty is an inherent part of innovations and discoveries. Such processes may be considered as the appearance of new ideas or as the emergence of atypical connections between existing ones. The importance of such connections suggests investigating innovations through a network (graph) representation of the space of ideas. In such a representation, a graph node corresponds to a relevant notion (idea), whereas an edge between two nodes means that the corresponding notions have been used in a common context. The question addressed in this research is whether it is possible to identify the edges between existing concepts where innovations may emerge. To this end, a well-documented scientific knowledge landscape has been used: we downloaded 1.2M manuscripts dated from April 2007 to September 2019 and extracted the relevant concepts from them using a concept-extraction platform. Combining approaches developed in complex networks science and graph embedding, we investigate the predictability of the edges (links) on the scientific knowledge landscape where innovations may appear. We argue that the conclusions drawn from this analysis are not limited to scientific knowledge analysis but are rather generic and may be applied to any domain that involves creativity.

10:40, Volodymyr Lut, Neural Architecture Search: a Probabilistic Approach

Abstract. In this work, we review different approaches to using probabilistic methods in existing AutoML solutions based on Reinforcement Learning. We focus on providing additional knowledge about the probability distribution supplied to Reinforcement Learning agents solving Neural Architecture Search tasks. Based on the results of this research, we propose an agent designed to model neural architectures for image classification tasks.

11:20, Oleksandr Smyrnov, A Multifactorial Optimization of Personnel Scheduling in Fleets of Seagoing Vessels

Abstract. The maritime industry is huge and involves many complex processes, since it carries most of the world’s goods transportation. During transportation, a crew serves each vessel, which raises the problem of the optimal distribution of crew members across vessels. This problem can be formalized as an integer programming problem. In practice, we saw that solving it is time-consuming, since there is a large number of free variables, which makes the solution inapplicable for the end-user. In this work, we describe an approach to speeding up the crew-optimization solution for the maritime industry using the Rolling Time Horizon technique. Our approach is 3.5 times faster than the benchmark and deviates from the optimal solution by less than 1%.
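The rolling-time-horizon idea mentioned in the abstract can be sketched in a few lines. This is a toy illustration, not the thesis implementation: the names `solve_window` and `rolling_horizon` are hypothetical, and a trivial greedy assignment stands in for the integer-programming sub-solver.

```python
# Illustrative rolling-time-horizon decomposition: instead of solving one
# large assignment problem over the full horizon, solve a sequence of small
# sub-problems over short windows and fix each window's decisions before
# rolling forward.

def solve_window(crew, demand):
    """Toy sub-solver: greedily assign available crew members to vessels.

    `demand` maps vessel -> number of crew needed in this window.
    A real implementation would call an integer-programming solver here.
    """
    assignment = {}
    pool = list(crew)
    for vessel, needed in demand.items():
        assignment[vessel] = [pool.pop() for _ in range(min(needed, len(pool)))]
    return assignment

def rolling_horizon(crew, demand_per_window):
    """Solve each window in turn, reusing the same crew pool."""
    return [solve_window(crew, demand) for demand in demand_per_window]

plan = rolling_horizon(
    crew=["a", "b", "c", "d"],
    demand_per_window=[{"v1": 2, "v2": 2}, {"v1": 1, "v2": 3}],
)
print(len(plan))  # one sub-solution per window
```

Because each window is far smaller than the full horizon, the sub-problems solve quickly; the trade-off is that fixing early decisions can move the combined plan slightly away from the global optimum, which matches the reported sub-1% deviation.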

13:00, Kostiantyn Ovchynnikov, Audience Profile Construction for Local Businesses Marketing Campaigns

Abstract. This work addresses the problem of automatic target user profile construction. We introduce a methodology and framework that lets small food-business owners build a demographic portrait of their customers through competitor detection and text processing.

13:40, Maksym Opirskyi, Topological Approach to Wikipedia Article Recommendation

Abstract. Human navigation in information spaces is of increasing importance given the ever-growing data sources we possess; an efficient navigation strategy would therefore greatly benefit the satisfaction of human information needs. Often, the search space can be understood as a network, and navigation can be seen as a walk on this network. Previous studies have shown that, despite not knowing the global network structure, people tend to be efficient at finding what they need. This is usually explained by the fact that people possess some background knowledge. In this work, we explore an adapted version of the network consisting of Wikipedia pages and the links between them, as well as human trails on it. The goal of our research is to find a procedure to label articles that are similar to a given one. Among other uses, this would lay the foundation for a recommender system for Wikipedia editors that suggests links from a given page to related articles. Our work therefore provides a foundation for making the Wikipedia navigation process more user-friendly.

14:20, Roman Moiseiev, Stock Market Prediction Utilizing Central Bank’s Policy Statements

Abstract. The stock market is quite unpredictable and affected by a vast number of factors. Many central banks, commercial banks, hedge funds, and other financial institutions task their R&D departments with predicting the probabilities of market movements, possible black swans, and other risks. In this work, I target inefficiencies in predicting the market reaction to central bank policy statements. Such statements have two parts: action and information. In complicated cases, automatic trading systems react to actions and may miss vital insights from the informational component. To address this, I collected historical data on monetary actions and press releases by the Federal Reserve, stock price data, and Fed Funds futures contract prices. Based on that, I built several classification models to predict the class of a policy statement. Afterward, I prepared a pipeline and an econometric model that can incorporate the class of a policy statement into the evaluation of the stock market reaction.

January 22nd

09:30, Vadym Korshunov, Region-Selected Image Generation with Generative Adversarial Networks

Abstract. Generative adversarial networks (GANs) are one of the most popular models capable of producing high-quality images. However, most existing works generate images from a vector of random values, without explicit control over the desired output properties. We study ways of introducing such control for a user-selected region of interest (RoI). First, we overview and analyze existing works in the areas of image completion (inpainting) and controllable generation. Second, we propose our GAN-based model, which unites approaches from the two mentioned areas, for controllable local content generation. Third, we evaluate the controllability of our model on three accessible datasets – CelebA, Cats, and Cars – and give numerical and visual results of our method.

10:10, Roman Riazantsev, 3D Reconstruction of Video Sign Language Dictionaries

Abstract. Today, virtual and augmented reality applications are becoming more and more popular. This trend creates a demand for 3D processing algorithms that can be applied to many areas. This work focuses on sign language video sequences: there are many pre-recorded photo and video dictionaries that can be transformed into 3D and unified in one place. We research the nuances of hand-pose video sequence analysis, as well as the influence of result refinement on 2D and 3D keypoint detection. Besides that, we designed a solution for the parametrization of hand shape and engineered a system for 3D hand pose reconstruction. The model shows good results on training data but lacks generalization; retraining on multiple datasets and using various data augmentation techniques should improve performance.

11:00, Oleh Misko, Ensembling and Transfer Learning for Multi-domain Microscopy Image Segmentation

Abstract. A lot of imaging data is generated in medicine, and particularly in microscopy. Researchers spend a lot of time analyzing this data due to slow algorithms and exhaustive manual work. Recent advancements in machine learning, and especially deep learning, have produced methods that can efficiently solve challenges in microscopy imaging. Image segmentation is one of the most common labor-intensive tasks that could be automated with deep learning approaches. One of the biggest challenges for computer algorithms in this domain is domain shift: the difference between the distribution of the data used for training and the distribution of new incoming data. In this work, we show that deep neural networks can efficiently segment microscopy images in the presence of domain shift. Moreover, we show that transfer learning from other medical tasks is an effective strategy for reducing the amount of required annotated data, whereas fine-tuning ImageNet models for microscopy segmentation gains little benefit.

11:40, Oleh Onyshchak, Image Recommendation for Wikipedia Articles

Abstract. Multimodal learning – simultaneous learning from different data sources such as audio, text, and images – is a rapidly emerging field of Machine Learning. It is also considered learning at the next level of abstraction, which will allow us to tackle more complicated problems such as creating cartoons from a plot or recognizing speech from lip movement. In this paper, we introduce a basic model for recommending the most relevant images for a Wikipedia article based on state-of-the-art multimodal techniques. We also introduce a Wikipedia multimodal dataset containing more than 36,000 high-quality articles.

13:30, Yevhen Pozdniakov, Changing Clothing on People Images Using Generative Adversarial Networks

Abstract. In recent years, Generative Adversarial Networks (GANs) have certainly become one of the biggest trends in the computer vision domain. GANs are used for generating face images and computer game scenes, transferring artwork style, visualizing designs, creating super-resolution images, translating text to images, etc. We present a model that solves an image-generation problem: generating new outfits on people’s images. This task is extremely important for the offline/online trade and fashion industry. Changing clothing on people’s images isn’t a trivial task: the generated part of the image should have high quality without blurring, and another problem is generating, for example, long sleeves on images with T-shirts. As a result, well-known models are not suitable for this task. In this master project, we reproduce a model for changing clothing on people’s images based on existing approaches and improve it in order to get better image quality.

14:10, Ivan Prodaiko, Person Re-identification in a Top-view Multi-camera Environment

Abstract. This thesis introduces the reader to the concepts of edge computing in the context of the person re-identification and tracking problem. It describes the challenges, limitations, and current state-of-the-art solutions. The author proposes a pipeline for the task, reports several experiments validating different parts of the system, and provides a theoretical explanation of the person re-identification process in an overlapping multi-camera environment.

14:50, Yaroslava Lochman, Hybrid Minimal Solvers for Single-view Autocalibration

Abstract. We introduce a new hybrid minimal solver that admits combinations of radially-distorted conjugate translations and radially-distorted parallel lines from the common scene plane to jointly estimate lens undistortion and affine rectification. The solver is the first to admit complementary geometric primitives for rectification purposes. In addition, a novel solver admitting three pairs of imaged parallel scene lines for the same problem is introduced. The proposed solvers are used with the Manhattan scene assumption to auto-calibrate cameras from a single image. The solvers are generated using elementary methods from algebraic geometry. As a result, they are simple, fast and robust. The solvers are used in an adaptive sampling framework that favors the feature combinations that are most frequently consistent with accurate scene plane rectifications. Auto-calibrations are recovered from challenging images that have either a sparsity of scene lines or scene texture. The method is fully automatic.


January 23rd

10:00, Dmitri Glusco, Replica Exchange For Multiple-Environment Reinforcement Learning

Abstract. In this project (Glusco and Maksymenko, 2019), we treat the Reinforcement Learning problem of exploration vs. exploitation. The problem can be rephrased in terms of generalization and overfitting, or of efficient learning. To tackle it, we combine techniques from several lines of research: we introduce noise as an environment characteristic (Packer et al., 2018); create a setup of multiple Reinforcement Learning agents and environments that train in parallel and interact with each other (Jaderberg et al., 2017); and use a parallel tempering approach to initialize environments with different temperatures (noise levels) and perform exchanges using the Metropolis-Hastings criterion (Pushkarov et al., 2019). We implemented a multi-agent architecture with parallel tempering based on two different Reinforcement Learning algorithms – Deep Q Network and Advantage Actor-Critic – and an environment wrapper for noise addition in OpenAI Gym (a toolkit for developing and comparing reinforcement learning algorithms). We used the CartPole environment to run multiple experiments with three different types of exchange: no exchange, random exchange, and smart exchange according to the Metropolis-Hastings rule. We implemented aggregation functionality to gather the results of all the experiments and visualize them with charts for analysis. The experiments showed that a parallel tempering approach with multiple environments at different noise levels can improve the performance of the agent under specific circumstances. At the same time, the results raised new questions that should be addressed to fully understand the behavior of the implemented approach.
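For illustration, the Metropolis-Hastings exchange rule referenced in the abstract can be sketched with the generic parallel-tempering acceptance formula; this is not the project’s code, and the mapping of RL quantities (e.g., negative episode reward) onto "energy" is an assumption.

```python
import math
import random

def exchange_accepted(energy_i, energy_j, temp_i, temp_j, rng=random.random):
    """Metropolis-Hastings criterion for swapping two replicas.

    Accept the swap with probability min(1, exp(delta)), where
    delta = (1/T_i - 1/T_j) * (E_i - E_j). In a parallel-tempering RL
    setup, temperature corresponds to the environment noise level.
    """
    delta = (1.0 / temp_i - 1.0 / temp_j) * (energy_i - energy_j)
    return delta >= 0 or rng() < math.exp(delta)

# A hot replica with high energy and a cold one with low energy swap
# deterministically (delta >= 0); the reverse swap is only accepted
# with probability exp(delta) < 1.
print(exchange_accepted(2.0, 1.0, temp_i=1.0, temp_j=2.0))
```

Swaps of this form preserve the joint stationary distribution across replicas, which is why "smart exchange" differs in principle from the random-exchange baseline the experiments compare against.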

10:40, Anastasiia Kasprova, Customer Lifetime Value for Retail Based on Transactional and Loyalty Card Data

Abstract. Customer Lifetime Value (CLV) is the present value of the future cash flows attributed to a customer during their entire relationship with the company (Farris et al., 2010). CLV represents a 360-degree view of the client’s business situation (McKinsey, Customer Lifecycle Management), which takes into account the probability of customer churn and their future purchases. Modeling CLV in retail is a complicated task due to the lack of access to historical purchase data and the difficulty of identifying customers and linking purchase history to a particular customer. In this research, historical transactional data were taken from twelve North American brick-and-mortar grocery stores to compare different approaches to CLV modeling in terms of segmentation and forecasting. The research has resulted in suggestions on CLV estimation for the offline retail business case, with the advantages and limitations of each approach.
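The definition above – present value of future cash flows, weighted by the chance the customer is still active – can be made concrete with a textbook retention-based formula. This is a minimal sketch of one common CLV variant, not the thesis’s model; the constant margin and retention rate are simplifying assumptions.

```python
def clv(margin_per_period, retention, discount, periods):
    """Simple retention-model CLV.

    Sums the per-period margin, weighted by the probability the customer
    is still active (retention**t) and discounted to present value
    ((1 + discount)**t), over a finite horizon of `periods`.
    """
    return sum(
        margin_per_period * retention**t / (1 + discount) ** t
        for t in range(periods)
    )

# A customer worth $100/period who never churns, with no discounting,
# is worth exactly 100 * periods.
print(clv(100, retention=1.0, discount=0.0, periods=3))  # 300.0
```

Transaction-level approaches (as compared in the thesis) replace the constant margin and retention with per-customer forecasts, but the discounted-sum structure stays the same.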

11:20, Philipp Kofman, Efficient Generation of Complex Data Distributions

Abstract. The active development of image processing methods requires large amounts of correctly labeled data, and the lack of quality data makes it impossible to use various machine learning methods. When the possibilities for collecting real data are limited, methods for synthetic data generation are used. In practice, we can formulate the task of high-quality generation of synthetic images as the efficient generation of complex data distributions, which is the object of study of this work. Generating high-quality synthetic data is an expensive and complicated process with existing methods. Two main approaches are used to generate synthetic data: image generation based on rendered 3D scenes, and the use of GANs for simple images. These methods have drawbacks, such as a narrow range of applicability and insufficient distribution complexity of the obtained data. When using GANs to generate complex distributions, in practice we face a visible increase in the complexity of the model architecture and training procedure. A deep understanding of the complex distributions of real data can be used to improve the quality of synthetic generation. Minimizing the differences between the real and synthetic data distributions can not only improve the generation process but also provide tools for solving the problem of data scarcity in image processing.

13:00, Borys Olshanetskyi, Context Independent Speaker Classification

Abstract. Speaker classification is an essential task in the machine learning domain, with many practical applications in identification and natural language processing. This work concentrates on speaker classification as a subtask of general speaker diarization for real-world conversation scenarios. We survey the domain of modern speech processing and present an original speaker classification approach based on recent developments in convolutional neural networks. Our method uses a spectrogram as the input to a CNN classifier, allowing it to capture spatial information about the distribution of voice frequencies. The presented results show performance beyond human ability and give strong prospects for future development.
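The spectrogram front end described in the abstract can be sketched with a plain short-time Fourier transform; this is a generic NumPy illustration of how audio becomes a 2D frequency-by-time "image" for a CNN, with frame length and hop size chosen arbitrarily, not the parameters used in the thesis.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a Hann-windowed short-time Fourier transform.

    Returns an array of shape (frequency_bins, time_frames), i.e. the 2D
    input a CNN classifier would consume.
    """
    window = np.hanning(frame_len)
    frames = [
        signal[i:i + frame_len] * window
        for i in range(0, len(signal) - frame_len + 1, hop)
    ]
    return np.abs(np.fft.rfft(np.array(frames), axis=1)).T

# One second of a 440 Hz tone at an 8 kHz sampling rate: the energy
# concentrates in a single horizontal band of the spectrogram.
sr = 8000
t = np.arange(sr) / sr
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)
```

Treating this 2D array as an image is what lets standard convolutional architectures exploit local time-frequency structure in the voice.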

13:40, Nazariy Perepichka, Parameterizing of Human Speech Generation

Abstract. In modern days, the synthesis of human images and videos is arguably one of the most popular topics in the Data Science community. The synthesis of human speech is less trendy but deeply bonded to it. Since the publication of the WaveNet paper by Google researchers in 2016, the state of the art has shifted from parametric and concatenative systems to deep learning models. Most work in the area focuses on improving the intelligibility and naturalness of the speech. However, almost every significant study also mentions ways to generate speech with the voices of different speakers. Usually, such an enhancement requires re-training the model to generate audio with the voice of a speaker that was not present in the training set. Additionally, studies focused on highly modular speech generation are rare; therefore, there is room left for research on ways to add new parameters for other aspects of speech, like sentiment, prosody, and melody. In this work, we aimed to implement a competitive text-to-speech solution with the ability to specify the speaker without model re-training, and to explore possibilities for adding emotions to the generated speech. Our approach generates good-quality speech with a mean opinion score of 3.78 (out of 5) points and the ability to mimic a speaker’s voice in real time, which is a big improvement over the baseline, which merely obtains 2.08. On top of that, we researched sentiment representation possibilities and built an emotion classifier that performs on the level of current state-of-the-art solutions, with an accuracy of more than eighty percent.

14:20, Oleh Lukianykhin, Reinforcement Learning for Voltage Control-based Ancillary Service using Thermostatically Controlled Loads

Abstract. Advances in demand response for energy imbalance management (EIM) ancillary services can change future power systems. These changes are the subject of research in academia and industry. Although the application of Machine Learning methods is an important and promising part of this research, the power systems domain has not fully benefited from it yet. Thus, the main objective of the presented project is to investigate and assess opportunities for applying reinforcement learning (RL) to achieve such advances by developing an intelligent voltage control-based ancillary service that uses thermostatically controlled loads (TCLs). Two stages of the project are presented: a proof of concept (PoC) and extensions. The PoC includes modeling and training a voltage controller with Q-learning, chosen for its efficiency achieved without unnecessary sophistication. The simplest power system relevant for demand response – 20 TCLs providing the ancillary service – is considered in the experiments. The power system model is developed with Modelica tools. The extensions aim to exceed the PoC performance by applying advanced RL methods: a Q-learning modification that uses a window of environment states as input (WIQL), smart discretization strategies for the environment’s continuous state space, and a deep Q-network (DQN) with experience replay. To investigate the particularities of the developed controller, modifications of the experimental setup, such as testing the controller for longer than training and different simulation start times, are considered. An improvement of 4% in median performance is achieved compared to the competing analytical approach – the optimal constant control chosen using a whole-time-interval simulation for the same voltage controller design. The presented results and corresponding discussions can be useful both for further work on RL-driven voltage controllers for EIM and for other applications of RL in the power system domain using Modelica models.
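The tabular Q-learning at the heart of the PoC boils down to one update rule; the sketch below shows that rule in isolation, with dictionary-based tables and arbitrary learning-rate and discount values that are illustrative, not the controller’s actual configuration.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    `q` maps state -> {action: value}.
    """
    best_next = max(q[next_state].values())
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error

# A controller state-action table with two states and one action:
# after observing reward 1.0, the value of ("s", "a") moves toward
# the discounted target.
q = {"s": {"a": 0.0}, "s2": {"a": 1.0}}
q_update(q, "s", "a", reward=1.0, next_state="s2")
print(q["s"]["a"])
```

The WIQL and DQN extensions mentioned above keep this same temporal-difference target but change how the Q-function is represented (a window of states as input, or a neural network with experience replay).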

January 24th

09:30, Dmytro Babenko, Determining Sentiment and Important Properties of Ukrainian-language User Reviews

Abstract. Every day, visitors leave countless reviews about hotels, restaurants, cafes, attractions, and other services. In most cases, they rate the service overall, and sometimes they also rate specific topics if the service provides this possibility. However, the main information about user opinion is hidden inside the body of the review text. Thereby, in this work, we propose a solution that analyzes one or several user reviews, determines their sentiment, and acquires the important characteristics of these reviews, determining which characteristics such reviews were influenced by. The proposed solution can detect sentiment in text and classify it as positive or negative. It then extracts the top positive and negative phrases, which can explain why the user left such a review. Besides, we can analyze all reviews about one hotel, or just several reviews, and summarize the most important positive and negative properties of a specific hotel.

10:10, Oleg Kariuk, Unsupervised Text Simplification Using Neural Style Transfer

Abstract. With the growing interdependence of the world’s economies, cultures, and populations, the advantages of learning foreign languages are becoming more apparent than ever. The growing internet and mobile phone user base provides significant opportunities for online language learning, the global market size of which is forecast to increase by almost $17.9 bn during 2019-2023. One of the most effective ways to improve in a foreign language is through reading. Graded readers – books in which the original text is simplified to lower grades of complexity – make the process of reading in a foreign language less daunting. Composing a graded reader is a laborious manual process. There are two possible ways to computerize the writing of graded readers for arbitrary input texts. The first lies in utilizing a variety of supervised sequence-to-sequence models for text simplification. Such models depend on scarcely available parallel text corpora: datasets in which every text piece is available in both the original and a simplified version. An alternative unsupervised approach lies in applying neural style transfer techniques, where an algorithm learns to decompose a given text into vector representations of its content and style and to generate a new version of the same content in a simplified language style. In this work, we demonstrate the feasibility of applying unsupervised learning to the problem of text simplification by using cross-lingual language modeling. It allows us to improve the previous best BLEU score from 88.85 to 96.05 on the Wikilarge dataset in an unsupervised fashion, and the SARI score from 30 to 43.18 and FKGL from 4.01 to 3.58 on the Newsela dataset in a semi-supervised one. Apart from that, we propose new penalties that provide more control during beam search generation.

11:00, Andrew Kurochkin, Meme Generation for Social Media Audience Engagement

Abstract. In digital marketing, memes have become an attractive tool for engaging an online audience: they impact buyers’ and sellers’ online behavior and information-spreading processes. Thus, the technology of generating memes is a significant tool for social media engagement. In this study, we collected a new memes dataset of ∼650K meme instances, applied a state-of-the-art deep learning technique – the GPT-2 model [1] – to meme generation, and compared machine-generated memes with human-created ones. We showed that MTurk workers can be used to approximately estimate users’ behavior in a social network, more precisely to measure engagement. Generated memes cause the same engagement as human memes that historically collected no engagement in the social network; still, generated memes are less engaging than random memes created by humans.

11:40, Kateryna Liubonko, Matching Red Links to Wikidata Items

Abstract. This work tackles the problem of matching Wikipedia red links with existing articles. Links in Wikipedia pages are considered red when they lead to nonexistent articles; other Wikipedia editions may contain articles that correspond to such red links. In our work, we propose a way to match red links in one Wikipedia edition to existing pages in another edition. We solve this task in the context of Ukrainian red links and existing English pages. We created a dataset of the 3,171 most frequent Ukrainian red links and a dataset of 2,957,927 pairs of red links and the most probable candidate pages in the English Wikipedia; this dataset is publicly released. We defined the task as a Named Entity Linking problem: red links are named entities, and we link Ukrainian red links to English Wikipedia pages. In this work, we provide a thorough analysis of the data and define its conceptual characteristics to exploit in entity resolution. These characteristics are graph properties (connections with the pages where red links occur, and connections with the pages that occur on the same pages as red links) and word properties (title names). The BabelNet knowledge base was applied to this task; we evaluated its power in terms of F1 score (29%) and regarded it as a baseline for our approach. To improve the results, we introduced several similarity metrics based on the mentioned red-link characteristics. Combined in a linear model, they resulted in an F1 score of 85%, which is our best result. In the thesis, we also discuss the bottlenecks and limitations of the current approach and outline ideas for future improvements. To the best of our knowledge, we are the first to state the problem and propose a solution for red links in the Ukrainian Wikipedia edition. All the code for this project is publicly released on GitHub.

13:30, Denys Porplenko, Generation of Sports News Articles from Match Text Commentary

Abstract. Nowadays, thousands of sporting events take place every day. Most sports news (results of sports competitions) is written by hand, despite its templated structure. In this work, we want to check whether it is possible to generate news from a broadcast – a set of comments that describe the game in real time. We solve this problem for the Russian language and treat it as a summarization problem, using extractive and abstractive approaches. Among extractive models, we did not get significant results; however, we built an Oracle model that showed the best possible result of 0.21 F1 for ROUGE-1. With the abstractive approach, we got 0.26 F1 for ROUGE-1 using the NMT framework, Bidirectional Encoder Representations from Transformers (BERT) as an encoder, and text augmentation based on a thesaurus. Other types of encoders did not show significant improvements.
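The ROUGE-1 F1 metric the abstract reports scores unigram overlap between a generated article and a reference. A minimal sketch of the computation (bag-of-words overlap, ignoring stemming and the other options real ROUGE toolkits provide):

```python
from collections import Counter

def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between a candidate summary and a reference.

    Counts how many candidate words (with multiplicity) also appear in
    the reference, then combines precision and recall into F1.
    """
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the home team won the match", "the home team won"))
```

An Oracle extractive model picks the comments that maximize this score against the reference article, which is why its 0.21 F1 bounds what any extractive approach on the same data could achieve.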

14:10, Serhii Tiutiunnyk, Context-based Question-answering System for the Ukrainian Language

Abstract. This work presents a context-based question answering model for the Ukrainian language that, given a context (a Wikipedia article) and a question about it, produces an answer. The model is based on Bidirectional Encoder Representations from Transformers (BERT) (Devlin et al., 2018) and consists of two parts. The first is a pre-trained multilingual BERT model, trained on Wikipedia articles in the top 100 most popular languages. The second is the fine-tuned model, trained on a dataset of questions and answers about Wikipedia articles. The training and validation data is the Stanford Question Answering Dataset (SQuAD) (Rajpurkar et al., 2016). There are no question answering datasets for the Ukrainian language, so the plan is to build an appropriate dataset with machine translation, use it for the fine-tuning stage, and compare the results with models fine-tuned on other languages. The next experiment is to train a model on Slavic-language datasets before fine-tuning on Ukrainian and compare the results.

14:50, Anastasiia Khaburska, Statistical and Neural Language Models for the Ukrainian Language

Abstract. Language Modeling is one of the most important subfields of modern Natural Language Processing (NLP). The objective of language modeling is to learn a probability distribution over sequences of linguistic units pertaining to a language. As it produces a probability of the language unit that will follow, the language model can be viewed as a form of grammar for the language, and it plays a key role in traditional NLP tasks, such as speech recognition, machine translation, sentiment analysis, text summarization, grammatical error correction, natural language generation. Much work has been done for the English language in terms of developing both training and evaluation approaches. However, there has not been as much progress for the Ukrainian language. In this work, we are going to explore, extend, evaluate and compare different language models for the Ukrainian language. The main objective is to provide a balanced evaluation dataset and train a number of baseline models.
