MS-AMLV 2021 Schedule

February 12

09:45, Opening session

10:00, Incorporating Metadata for Semantic Segmentation

Iaroslav Plutenko and Dmytro Fishman

The image processing workflow in the biomedical industry suffers from the diversity of input formats across domains, as there is no convention among vendors. Adapting a neural network model to one particular domain comes at the expense of accuracy in other domains, or in all domains in general. This paper outlines an upcoming project that aims to improve domain adaptation using the metadata supplied with imagery by implementing the concept of dynamic parameters.

10:30, Robust Visual Odometry for Realistic Point-Goal Navigation

Ruslan Partsey, Oles Dobosevych, and Oleksandr Maksymets

The ability to navigate in complex environments is a fundamental skill of a home robot. Recent approaches can learn the full navigation system directly from RGB-D observations. With access to a GPS+Compass sensor and acting in noiseless environments, they solve the task of autonomous navigation, achieving 0.95 SPL and succeeding in 99.6% of episodes. Despite extensive study, indoor navigation in unseen environments under noisy actuation and sensing, and without access to precise localization, remains an open frontier for research in Embodied AI. In this work, we focus on designing a visual odometry model for robust egomotion estimation and on integrating it with a navigation policy for efficient navigation under noisy actuation and sensing.
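For readers unfamiliar with the SPL figure quoted above, the following is a minimal sketch of how Success weighted by Path Length is commonly computed: each episode contributes its success flag scaled by the ratio of the shortest-path length to the length of the path the agent actually took. The function and argument names are illustrative assumptions, not taken from the authors' code.

```python
def spl(successes, shortest_lengths, path_lengths):
    """Success weighted by Path Length, averaged over episodes.

    successes:        1 if the episode reached the goal, else 0
    shortest_lengths: shortest-path length from start to goal (assumed > 0)
    path_lengths:     length of the path the agent actually travelled
    """
    if not (len(successes) == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same length")
    if not successes:
        raise ValueError("at least one episode is required")

    total = 0.0
    for success, shortest, travelled in zip(successes, shortest_lengths, path_lengths):
        # A successful episode scores 1.0 only if the agent took the shortest path;
        # longer paths are penalised proportionally, failures score 0.
        total += success * shortest / max(travelled, shortest)
    return total / len(successes)


# Three hypothetical episodes: optimal success, success with a detour, failure.
example = spl([1, 1, 0], [10.0, 10.0, 10.0], [10.0, 20.0, 5.0])
```

In this example the three episodes score 1.0, 0.5, and 0.0, so the average is 0.5.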

11:00, Point Cloud Human Pose Estimation using Capsule Networks

Oleksandr Onbysh and Andrii Babii

Human pose estimation based on point clouds is an emerging field that is developing alongside the popularity of 3D scanning devices. Built-in LiDAR technology in mobile phones and a growing VR market create a demand for lightweight and accurate models for 3D point clouds. Advanced deep learning tools are mainly used for structured data, and they face new challenges in unstructured 3D space. Recent research on capsule networks suggests that this type of model outperforms classical CNN architectures in tasks that require viewpoint invariance from the model. Capsule networks thus address multiple issues of classic CNNs, such as preserving the orientation and spatial relationships of extracted features, which could significantly improve performance on 3D point cloud classification tasks.

The project’s objective is to experimentally assess the applicability of the capsule neural network architecture to the task of point cloud human pose estimation and to measure its performance on non-synthetic data. Additionally, the project will measure the robustness of capsule networks to noise in 3D data compared to regular models, and compare the models’ performance with a restricted amount of training data.

12:00, 3D Head Model Estimation from a Single Photo

Rostyslav Zatserkovnyi and Orest Kupyn

Today, 3D human head models are used for various applications in fields such as computer vision, virtual reality, biometric systems, and healthcare. Since obtaining a high-quality head scan is an expensive and time-consuming process, machine learning algorithms are commonly used to estimate the shape and texture of a 3D model from a single image “in the wild”, or with significant calibration issues such as an imperfect pose, non-uniform illumination, or partial occlusion. However, modern research in the field often focuses on modeling the facial region rather than recreating a full human head model. In this work, we will review and analyze a selection of research papers concerning 3D face or head model reconstruction from uncalibrated images, propose a possible solution to the head modeling research gap, and outline the plan of a Master’s thesis project aiming to address this gap.

12:30, One-Shot Identity Preserving Photorealistic Face Reenactment

Orest Rehusevych and Orest Kupyn

In recent years, interconnected Computer Vision fields such as Face Reenactment, Talking Head Generation, and Face Manipulation have been growing rapidly. This is mainly due to emerging new applications of this technology and notable improvements in generation results over the last few years. In this paper, we review different approaches to face reenactment, types of training settings, and generative architectures. Based on the results of the studied models, we decided to focus on the problem of identity preservation and on improving the photorealism of the generated output in the one-shot setting, while maintaining the inference speed of previous SOTA works.

13:00, Berries Quality Detection In Visual Spectrum Using Neural Networks

Andrii Blagodyr and Viktor Sakharchuk

The paper presents a raspberry quality detection approach based on a convolutional neural network with the U-Net architecture. The implementation of the neural network consists of the following steps: 1) acquire training and testing data sets; 2) train the network; 3) make predictions on the test data. The research is carried out on data collected by the researchers for the experiment. The images used for training all have a standard size, which makes it easier to feed them into a convolutional neural network model. The dataset for the experiment was created manually from photographs of different varieties of raspberries and various states of the raspberry fruit.

13:30, Creating Slides from Video Lecture

Maksym Shylo, Anton Smirnov, and Valerii Krygin

Video recordings of lectures are no longer a rarity in the conditions of distance learning. Videos may be in an inconvenient format for students or contain various artifacts due to compression, camera quality, and other factors. It is useful to have a presentation of the study material that contains only the text from the board, because such a view of the material is most similar to the compendium. Moreover, some of the text may not be visible due to occlusions from people. To address these issues, we employ a neural network that removes people from the video via content-aware inpainting. To reduce duplication of slides due to camera shake, we perform video stabilization as a preprocessing step. Finally, we create slides by comparing changes between frames, color correcting, and binarizing them. As a result, we get slides with the extracted text or drawings from the board, which will help to simplify the creation of e-learning materials for both new and existing lecture recordings. Teachers will also be able to quickly provide lecture material to students even if they teach several complex subjects.
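The final slide-creation step described above (comparing changes between frames and binarizing them) can be sketched as follows. This is only an illustrative outline, not the authors' implementation: it assumes frames are already stabilized and inpainted, represents each frame as a list of rows of grayscale values in 0..255, omits color correction, and uses hypothetical threshold values.

```python
def binarize(frame, threshold=128):
    """Map each grayscale pixel to 1 (bright) or 0 (dark)."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in frame]


def changed_fraction(first, second):
    """Fraction of pixels that differ between two binarized frames of equal size."""
    total = sum(len(row) for row in first)
    differing = sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(first, second)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )
    return differing / total


def select_slides(frames, threshold=128, min_change=0.1):
    """Keep a frame as a new slide when it differs enough from the last kept slide."""
    slides = []
    for frame in frames:
        binary = binarize(frame, threshold)
        if not slides or changed_fraction(slides[-1], binary) > min_change:
            slides.append(binary)
    return slides


# Tiny 2x2 example: the second frame differs only by noise below the threshold,
# so it is merged with the first; the third frame has new content on the board.
frames = [
    [[0, 0], [0, 0]],
    [[0, 0], [0, 10]],
    [[255, 255], [0, 0]],
]
slides = select_slides(frames)
```

Comparing against the last kept slide, rather than the immediately preceding frame, is one simple way to avoid emitting a new slide for slow, gradual changes; a real pipeline would likely tune both thresholds on actual recordings.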

14:30, Improving Sequence Tagging Approach for Grammatical Error Correction Task

Maksym Tarnavskyi and Kostiantyn Omelianchuk

Grammatical Error Correction (GEC) is an important Natural Language Processing (NLP) task. It deals with building systems that automatically correct errors in written text. The main goal is to develop a GEC system that receives a sentence with mistakes and outputs a corrected version. One of the existing approaches to this task is iterative sequence tagging. The main idea is that the model takes erroneous sentences as input and predicts sequences of tags that need to be applied to these sentences to turn them into correct ones. The desired result of this work is an improved sequence tagging model that achieves higher results on commonly accepted benchmark datasets. To improve the model, the following steps can be taken: exploring the impact of large transformers or other training schemes, discovering data-weighted training strategies, exploring ensemble distillation techniques for improving a single model, exploring the possibility of extending the space of tagging operations, and exploring the combination of sequence tagging with a seq2seq approach.
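The tagging idea described above can be illustrated with a small sketch: each input token receives one edit tag, and applying the tags yields the corrected sentence. The tag names (`KEEP`, `DELETE`, `REPLACE_x`, `APPEND_x`) and the `predict_tags` callable are assumptions made for illustration; the actual tag vocabulary and model used in this work are not specified here.

```python
def apply_tags(tokens, tags):
    """Apply one edit tag per token and return the edited token list."""
    if len(tokens) != len(tags):
        raise ValueError("expected exactly one tag per token")

    output = []
    for token, tag in zip(tokens, tags):
        if tag == "KEEP":
            output.append(token)
        elif tag == "DELETE":
            continue  # drop the token
        elif tag.startswith("REPLACE_"):
            output.append(tag[len("REPLACE_"):])  # substitute a new word
        elif tag.startswith("APPEND_"):
            output.append(token)
            output.append(tag[len("APPEND_"):])  # insert a word after the token
        else:
            raise ValueError(f"unknown tag: {tag}")
    return output


def correct_iteratively(tokens, predict_tags, max_iterations=3):
    """Re-tag and re-apply until the (hypothetical) model predicts no more edits."""
    for _ in range(max_iterations):
        tags = predict_tags(tokens)
        if all(tag == "KEEP" for tag in tags):
            break
        tokens = apply_tags(tokens, tags)
    return tokens


# One pass with hand-written tags in place of model predictions.
corrected = apply_tags(
    ["She", "go", "to", "to", "school"],
    ["KEEP", "REPLACE_goes", "KEEP", "DELETE", "KEEP"],
)
```

Here `corrected` is `["She", "goes", "to", "school"]`. Iterating matters because some corrections depend on earlier ones, so a single tagging pass may not fix every error.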

15:00, Natural Language Inference for Fact-checking in Wikipedia

Mykola Trokhymovych and Diego Sáez-Trumper

The incoming flow of information is continuously increasing, along with the share of disinformation that can harm society. In the context of Wikipedia, automatically filtering unreliable content is very important to help editors keep Wikipedia as free of disinformation as possible. This project aims to implement software, exposed as an open API, that automatically performs fact validation. In Natural Language Processing (NLP), this task is related to Natural Language Inference (NLI), where a claim is compared with a reference to determine whether the claim is correct, incorrect, or unrelated. In this work, we analyze and compare state-of-the-art (SOTA) approaches. Although there have recently been many advances in the precision of NLI models, efficiency remains an open problem. Our goal is to build a production-ready model with both high accuracy and efficiency. We review the best work on word-based models and transfer those approaches to sentence-based models to achieve our research goal.

 

February 13

10:00, Real-time Simulation of Arm and Hand Dynamics using ANN

Mykhailo Manukian and Sergiy Yakovenko

The physics of body dynamics is a complex problem solved by the nervous system in real time during the planning and execution of movements. The human hand has 27 degrees of freedom (DOF), and the arm has 6 DOF for the elbow and shoulder joints. Due to the complexity of the hand’s structure and functions, we need a complex biological “computer” in our head to control it. Neuroprosthetics require similar computations for neural decoding and sensory feedback tasks. Furthermore, since physical simulations are computationally complex, this research aims to approximate them using machine learning methods such as ANNs. For this type of task, the most suitable network architecture is an RNN or a Transformer, which takes the temporal dynamics of arm and hand motion into account. This study will validate different RNNs (a shallow recurrent ANN and an LSTM) and Transformer architectures as candidates for the hand and arm motion control model. The input data for the ANN are muscle torques determined from the respective electromyographic (EMG) signals. Physical models of the arm and hand, which operate in the MATLAB Simulink environment, will provide validation data to train our ANN. Lastly, the resulting model should work in real time and have a latency of less than 4 ms when translating torques into limb position coordinates, to allow further use of such a model in real-life applications.

10:30, Real-time Inverse Dynamics from Motion Capture

Kateryna Zabava, Sergiy Yakovenko, and Valeriya Gritsenko

Inverse dynamics is an essential tool for biomedical applications. Machine learning, in particular, may offer an alternative to classical Newtonian physics to improve computational efficiency. Here we will combine machine learning algorithms for the real-time transformation of recorded motion capture of human reaching movements into applied joint moments. The main focus of this project is on real-time estimation of joint moments, i.e., with a maximum processing delay of 2 ms. The developed algorithm will be used as part of a physics engine that describes the neural control of human motion and decodes movement intent in individuals with neural damage.

11:00, Asymmetric Central Pattern Generator (CPG) Implemented with Spiking Neurons in ANN (Nengo) to Simulate CPG Model of Overground Asymmetric Locomotion

Yuriy Pryyma and Sergiy Yakovenko

In this paper, we try to use a network of spiking neurons to build a Central Pattern Generator (CPG) model of a biological system. Usually, these models are represented as a system of ion flows inside neurons, described by a number of differential equations. They usually work very well but are computationally intensive. Because of this, we can usually model only a shallow network of neurons and simple behavior. Spiking neural networks (SNNs) are the third generation of neural networks; they are closer to biological neurons than conventional artificial neural networks and should be a better fit for building a model of CPGs, as they are also very computationally efficient.

12:00, Batch Reinforcement Learning for Dynamic Pricing in E-Commerce

Andrii Holovko and Taras Firman

Existing reinforcement learning approaches for dynamic pricing rely on off-policy algorithms pre-trained on a replay buffer. However, the approximation error problem causes off-policy algorithms that are trained without interaction with the environment to fail in real-world environments due to overestimated value estimates. In dynamic pricing, this leads to sub-optimal pricing decisions, long-term loss of customers, and a drop in revenue. This work is dedicated to building and testing a reliable dynamic pricing engine for an e-commerce platform, based on an offline reinforcement learning algorithm trained on a fixed batch of data.

12:30, Scalable Prediction of Coordinated Information Spread

Petro Bodnar and Dmytro Karamshuk

This study aims to find methods to efficiently detect coordination among content spreaders on a massive scale. This paper has two significant aspects: methods for modeling information cascades and the identification of misinformation. We start with an overview of predicting information cascades, a classic framework for studying information diffusion. Afterward, we investigate the predictability of cascades and the most important features available for modeling them. Finally, we identify the advantages of generative models, especially self-exciting processes, in modeling cascades and in detecting the latent structure of the diffusion network, which is not always freely available. The results of this work will be delivered as a master’s thesis and a model that detects coordination among news providers based on publication time sequences.

13:00, Predicting Cognitive Scores of Patients with Alzheimer’s Disease

Sevil Smailova and Igor Koval

A neurodegenerative disorder such as Alzheimer’s disease (AD) begins with memory loss and develops over time, causing issues in conversation, orientation, and control of bodily functions. It is important to understand the dynamics of the progression of AD in order to apply therapeutic interventions efficiently at the early stages of the disease. Patterns of disease progression can be explored in data sets capturing the natural history of cognitive scores of AD patients. Such data sets are longitudinal, in the sense that they contain repeated measurements at multiple time points for multiple individuals. The purpose of the project is to overcome the known caveats of analyzing and predicting longitudinal medical data. Since AD evolves over a long period of time, often longer than the observation period for any individual, the modeling problem is more difficult than for time series whose seasonality is visible in each observation. Moreover, the progression of different cognitive scores may have different trends, seasonality, and amplitude, both by design and per individual. The proposed direction of work includes the long-term prediction of cognitive scores based on the analysis of discovered patterns in subgroups of AD patients, combining techniques proposed in the research area with the implementation of our own approach.

14:00, Development of Land Valuation System using Machine Learning algorithms

Natalia Novosad

Evaluating the real market value of land is a complicated and expensive process carried out by experts. The data we consider concerns land in Ukraine. Due to the land moratorium, farmland in Ukraine is usually undervalued. The objective is to find its fair value. This paper aims to compare different machine learning models and classical econometric models across three approaches: the income method, the comparative method, and the real options method.

14:30, Graph Neural Networks for BLE Mesh Optimization

Oleksandr Bratus

Bluetooth Low Energy (BLE) Mesh networking is one of the newest technologies in the wireless communication domain. Due to its low cost and low power consumption, it has already become widespread and has the potential for a wide range of applications. However, some disadvantages of the flooding algorithm and the problem of interference make optimizing the BLE Mesh network an important task. Although more and more research is devoted to BLE Mesh optimization, the application of machine learning methods to this problem remains an open question. In addition, in recent years, the natural graph structure of wireless interference patterns has been used to define graph neural network (GNN) architectures; therefore, GNNs have all the prerequisites for successfully solving the proposed problem. The main idea of this work is to propose a GNN model that can optimize the BLE Mesh network, and to test this model on simulated data.

How to join the online symposium

Please use this Zoom link to join us at the symposium: https://us02web.zoom.us/j/89915578405?pwd=QUQ4U1NINm56clIzM1FvZ3NjODFZQT09