Day 1. March 24
Section #1. Image, Audio, and Video Processing 1
9:00, Corner localization and camera calibration from imaged lattices
Andrii Stadnik, James Pritts, Anton Brazhnyi
Many computer vision algorithms rely on high-quality camera calibration. However, despite continual development, existing solutions remain sensitive to noise in the input data, especially for images with high distortion, such as those produced by mobile phones and GoPro-type cameras.
Recently, Duisterhof et al. in TartanCalib proposed a novel robust approach that iteratively refines the camera calibration and uses it to detect additional calibration board features, achieving superior robustness to noise compared to state-of-the-art methods. We believe it is possible to improve on that solution by iteratively refining the conjugate translations instead, thereby predicting the positions of previously undetected calibration board features. We would also like to test more advanced camera models, possibly using conversion between camera models, as proposed by Lochman et al. in BabelCalib.
9:30, Effective methods for audio spoofing detection
Dmytro Ivashchenko, Pablo Maldonado
Audio spoofing detection is an essential audio classification subtask. The idea is to build systems that detect whether human speech was spoofed using vocoders, filters, or deep generative models. Among the many challenges in this domain is developing detection methods that do not require extensive preprocessing and can work at scale. Semi-supervised methods are preferable because labeled data is hard to obtain. One possible approach is to use a model with an encoder trained on speech representations and a classifier that predicts whether the speech is spoofed based on the encodings. The desired outcome is to review existing methods and propose improvements, focusing on efficiency while keeping accuracy relatively high. To improve the model, we plan to: 1) explore which audio representations should be used for encoder training, 2) improve classifier training techniques and loss functions, 3) apply techniques from image classification and natural language processing, and 4) explore model distillation to improve efficiency.
10:00, Repeated pattern planar geometry estimation in an end-to-end trainable way
The purpose of this position paper is to propose a novel method for estimating the geometry of planes with repeated patterns, a frequent phenomenon in artificially constructed environments. This model has been designed with a focus on establishing a fully trainable end-to-end approach, where the entire process from input to output can be optimized through learning, without relying on any hand-engineered components or intermediate steps.
10:30 – 11:00, Break
Section #2. Image, Audio, and Video Processing 2
11:00, Hidden state refinement for optical flow forecasting
Anton Babenko, Roman Riazantsev
In recent years, optical flow has become widespread thanks to increased computational power and applications of optical flow estimation on mobile phones and edge devices: video editors, frame stabilization, and autonomous driving features. This work analyzes multiple approaches to optical flow estimation and identifies the main problems of optical flow methods: slow convergence and the use of custom blocks, which make them hard to port to mobile phones and edge devices. We propose to solve the slow convergence with hidden state refinement, providing an initialization for optical flow estimation based on several previous frames and their hidden state transformations, which imitate pixel movement at the hidden state level. The proposed method uses simple CNN and LSTM blocks, which makes the approach easy to port to mobile phones and edge devices. We use the Sintel, KITTI-15, FlyingChairs, FlyingThings, and HD1K datasets for our experiments.
11:30, Pulling yourself up by the bootstraps: advancing medical image segmentation via pseudo-labeling of public datasets
Roman Mishchenko, Dmytro Fishman
Deep neural networks have achieved outstanding performance in medical image segmentation tasks. They are used for disease diagnosis and support decision-making in clinical practice. The most popular approaches rely on large quantities of labeled data. However, there is a deficit of high-quality annotated data because labeling is a slow and expensive process. In this work, our contribution can be summarized in two points: 1) we propose an elegant semi-supervised method that utilizes unlabeled data to create a deep neural model with better performance, and 2) through the semi-supervised approach, we give new life to unlabeled data.
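One round of the pseudo-labeling idea behind such semi-supervised methods can be illustrated with a toy sketch. Everything here is a stand-in assumption: a nearest-centroid classifier plays the role of the segmentation network, the 2-D points play the role of images, and the prediction margin serves as a confidence score.

```python
import numpy as np

rng = np.random.default_rng(1)

# Small labeled set and a larger unlabeled set from two classes.
X_lab = np.vstack([rng.normal(-2, 1, (10, 2)), rng.normal(2, 1, (10, 2))])
y_lab = np.array([0] * 10 + [1] * 10)
X_unl = np.vstack([rng.normal(-2, 1, (200, 2)), rng.normal(2, 1, (200, 2))])

def fit_centroids(X, y):
    """'Train' the toy model: one centroid per class."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(C, X):
    """Predict class by nearest centroid; use the margin as confidence."""
    d = np.linalg.norm(X[:, None] - C[None], axis=2)
    conf = np.abs(d[:, 0] - d[:, 1])
    return d.argmin(axis=1), conf

# 1) Train on labeled data only.
C = fit_centroids(X_lab, y_lab)

# 2) Pseudo-label the unlabeled pool, keep only confident predictions,
#    and retrain on the enlarged training set.
pred, conf = predict(C, X_unl)
keep = conf > np.median(conf)
X_new = np.vstack([X_lab, X_unl[keep]])
y_new = np.concatenate([y_lab, pred[keep]])
C = fit_centroids(X_new, y_new)
```

In practice the loop is repeated, with the confidence threshold controlling how much pseudo-label noise enters training.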
12:00, Efficient Text-Guided 3D Synthesis With Latent Diffusion Models
Daniel Kovalenko, Oles Petriv
The emergence of diffusion models has greatly impacted the field of deep generative models, establishing them as a powerful family of models with unparalleled performance in applications such as text-to-image, image-to-image, and text-to-audio tasks. In this work, we propose a novel approach for text-guided 3D synthesis using latent diffusion probabilistic models. Our goal is to generate high-quality, high-fidelity colored 3D objects conditioned on text within seconds. We propose to use a triplane-vector space parametrization in combination with a Latent Diffusion Model (LDM) to generate smooth and coherent geometry. The LDM is trained on a large-scale text-3D dataset and is used as a latent triplane-vector texture generator. By using a triplane-vector space parametrization, we aim to improve the efficiency of the space representation and reduce the computational cost of synthesis. Additionally, we use an implicit neural renderer to decode geometry details from triplane-vector textures.
12:30 – 13:30, Lunch break
Section #3. Natural Text Processing & Synthesis
13:30, The 4th Stage of Genocide: Computational Analysis of Dehumanization on Russian Telegram
Kateryna Burovova, Mariana Romanyshyn
Dehumanization is a pernicious psychological process of denying some or all attributes of humanness to the target group. It is frequently cited as a common hallmark of incitement to commit genocide. Recent developments in the conceptualization of dehumanization and the dramatic shift in the international security landscape after the 2022 Russian invasion of Ukraine call for the development of new techniques that can be applied to the analysis and detection of this extreme violence-related phenomenon at scale. This position paper outlines an upcoming project aiming to develop the first detection system for instances of dehumanization in the Russian language. We collected the entire posting history of the most popular political bloggers on Russian Telegram to explore the evolution of dehumanizing rhetoric through computational modeling and to develop both language-specific and language-agnostic techniques for the detection of dehumanization. The new methods can be built into systems of anticipatory governance that seek to predict and prevent extreme violence, contribute to the collection of evidence of genocidal intent in the Russian invasion of Ukraine, and pave the way for large-scale studies of dehumanizing language and the representation of Ukrainians in Russian media.
14:00, Controllable text generation with Diffusion Models
Oleksandra Konopatska, Andrii Liubonko
Diffusion models are a rapidly developing approach in generative modeling. So far, most of the work in this field has been done with continuous data, showing significant results in the image and audio generation areas. The application of diffusion models for text generation is a non-trivial task, which has recently received wide attention due to new possibilities compared to conventional text generation approaches – in particular, the ability to work with the left and right context at the same time. This quality makes diffusion models particularly promising in solving controllable generation tasks, which require the generated text to meet certain conditions, such as sentiment or syntactic structure.
Approaches to conditional textual applications of diffusion models have emerged in recent months. For now, the proposed solutions still have limitations open to future improvement, such as the significant time required for training and decoding and stability issues in control and generation. This study aims to investigate and address the remaining open problems of diffusion-based approaches to controllable text generation, extend the method to multi-conditional control, and compare the performance of diffusion models with more conventional approaches.
14:30, Uncovering Evolving Discourse in Media Coverage of War in Ukraine: A Transformer-based Dynamic Topic Modeling Approach for Emerging Topic Detection
Yevhen Kravchenko, Andriy Kusyy
Topic modeling techniques aggregate documents from extensive collections based on their semantic similarity and uncover the underlying topics. With the recent development of transformer-based models, context-sensitive document embeddings combined with clustering procedures achieve state-of-the-art results in topic coherence and interpretability. As public discourse evolves over time, the temporal structure of a document corpus can be studied to reveal dynamic patterns of discourse development. Multiple dynamic topic modeling approaches have been proposed to address this problem under the assumption of a static set of topics represented continuously throughout the timeline. However, emerging topics, which are of great practical interest for media monitoring and trend-capturing tasks, cannot be detected under such a framework. In this work, we develop a novel approach to dynamic topic modeling, free from the static-topic-set limitation, based on document embeddings and a custom semi-supervised clustering procedure. We explore the algorithm's capacity for the emerging topic detection task and propose its application to a recent corpus of media publications regarding the war in Ukraine.
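The embedding-plus-clustering view of emerging topic detection can be illustrated with a minimal sketch. The Gaussian point clouds below stand in for transformer document embeddings, and the distance threshold is an arbitrary choice for illustration; the paper's actual semi-supervised clustering procedure is more involved.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in for transformer document embeddings: points in R^5.
topic_a = rng.normal([5, 0, 0, 0, 0], 0.3, (50, 5))
topic_b = rng.normal([0, 5, 0, 0, 0], 0.3, (50, 5))

# Centroids of the topics known so far (the clustering assignment step
# is omitted for brevity; each known cluster is fit directly).
centroids = np.stack([topic_a.mean(axis=0), topic_b.mean(axis=0)])

def emerging(batch, centroids, thresh=2.0):
    """Flag documents farther than `thresh` from every known topic centroid."""
    d = np.linalg.norm(batch[:, None] - centroids[None], axis=2)
    return d.min(axis=1) > thresh

# A new batch of documents: some from known topics, some from a topic
# that did not exist when the model was fit.
new_batch = np.vstack([
    rng.normal([5, 0, 0, 0, 0], 0.3, (5, 5)),   # known topic
    rng.normal([0, 0, 5, 0, 0], 0.3, (5, 5)),   # emerging topic
])
flags = emerging(new_batch, centroids)
```

A static-topic-set model would force the second group into one of the existing clusters; tracking the outlier set over time is what allows a new topic to be spawned instead.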
Day 2. March 25
Section #4. Inference, Analysis, Prognosis, Optimization 1
10:30, Causal Inference Under Network Interference
Estimating the average treatment effect on observational data is difficult due to unmeasured confounders or unbalanced datasets. The problem becomes even more complex for observational network data because of the unknown network exposure effect. This short position paper proposes and investigates a potentially new approach combining graph neural networks (GNNs) with the double/debiased machine learning method to approximate the network neighborhood exposure probability distribution and estimate the average treatment effect.
11:00, An ML approach for stock allocation based on dynamic trends
Volodymyr Antoshkiv, Yarema Okhrin
Portfolio construction is an essential task in finance. Portfolio managers use historical prices, statistical methods, news sentiment, insider information, and machine learning to construct an effective portfolio. Machine learning (ML) has become a trend in finance: many researchers use ML to predict future prices, allocate assets efficiently, and explore similarities between stocks. Markowitz's mean-variance (MV) model is the classical approach to portfolio construction. The main idea of the model is to describe risk as the standard deviation of stock returns and expected income as their mean. The desired result of this work is a new approach to stock (re)allocation: an AI-adaptive framework that accounts for the mental accounts (MA) of investors, making investors' asset allocation choices easier and improving portfolio performance. To develop this approach, we will: 1) discover how stock prices reflect historical data and newly available information, 2) explore existing extensions of the MV model and ML methods for asset allocation, 3) study mathematical properties of time series, especially grouping techniques for stocks, i.e., time series clustering, 4) explore possibilities to include more aspects in the MV model, and 5) explore possibilities for portfolio re-balancing.
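The classical Markowitz quantities mentioned above (mean returns as expected income, covariance as risk) fit in a few lines. The sketch below uses synthetic returns and computes the standard closed-form unconstrained minimum-variance portfolio, one simple special case of the MV framework rather than the framework the authors propose to extend.

```python
import numpy as np

# Synthetic daily returns for three assets (rows = days) with assumed
# means and volatilities, purely for illustration.
rng = np.random.default_rng(3)
returns = rng.normal([0.001, 0.002, 0.0015],
                     [0.01, 0.02, 0.015],
                     (250, 3))

mu = returns.mean(axis=0)               # expected returns (sample mean)
Sigma = np.cov(returns, rowvar=False)   # risk (sample covariance matrix)

# Unconstrained minimum-variance portfolio: w = Sigma^{-1} 1 / (1' Sigma^{-1} 1),
# i.e. the fully invested portfolio with the smallest variance w' Sigma w.
ones = np.ones(3)
w = np.linalg.solve(Sigma, ones)
w /= w.sum()

port_var = w @ Sigma @ w
```

By construction the weights sum to one, and the resulting portfolio variance cannot exceed that of any single asset, which is the diversification effect the MV model formalizes.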
11:30, Forecasting Seawater Transparency
Yevhen Stepanov, Dmytro Karamshuk
This study is inspired by the United Nations declaration of the Ocean Decade, as a significant portion of the ocean ecosystem remains to be explored and comprehended. The transmission of sunlight through seawater is a crucial factor in determining the productivity of aquatic ecosystems. The ability of light to penetrate the ocean's surface plays a vital role in the growth and survival of marine organisms, particularly those at the base of the food web, such as phytoplankton. Therefore, understanding the dynamics of light transmission in seawater is essential for comprehending the functioning of marine ecosystems and for the management and conservation of ocean resources. Secchi depth (ZSD) is a widely used metric for assessing water quality in marine environments. The primary objective of this study is to forecast seawater transparency using satellite data from the Copernicus Marine Service, a comprehensive ocean monitoring program that provides a wide range of data products, including information on the optical properties of seawater. We apply advanced approaches to modeling Secchi depth using time series models, including Autoregressive (AR), Autoregressive Moving-Average (ARMA), and Autoregressive Integrated Moving Average (ARIMA) models combined with a GAN (Generative Adversarial Network), as well as LSTM (Long Short-Term Memory) and RNN (Recurrent Neural Network) models. This approach can potentially lead to better results in time series forecasting. One of the key challenges encountered was the removal of clouds from satellite imagery in order to isolate the aquatic areas. To address this challenge, we employed a point inpainting method, an image processing technique used to fill in missing or corrupted parts of an image.
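As a minimal illustration of the AR family listed above, here is an AR(1) model fit by ordinary least squares on a synthetic series loosely shaped like a Secchi depth record; the mean, persistence, and noise level are assumptions for illustration, not values from the study's data.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic transparency series: AR(1) fluctuating around a 10 m mean,
# z_t = mu + phi * (z_{t-1} - mu) + noise.
n, phi, mu = 500, 0.8, 10.0
z = np.empty(n)
z[0] = mu
for t in range(1, n):
    z[t] = mu + phi * (z[t - 1] - mu) + rng.normal(0.0, 0.5)

# Fit AR(1) by ordinary least squares on (z_{t-1}, z_t) pairs:
# z_t = c + phi_hat * z_{t-1} + error.
x, y = z[:-1], z[1:]
A = np.column_stack([np.ones_like(x), x])
c, phi_hat = np.linalg.lstsq(A, y, rcond=None)[0]

# One-step-ahead forecast from the last observation.
forecast = c + phi_hat * z[-1]
```

ARMA and ARIMA extend this recursion with moving-average and differencing terms, and the LSTM/RNN models in the study replace the linear recursion with a learned one.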
12:00 – 13:00, Lunch break
Section #5. Inference, Analysis, Prognosis, Optimization 2
13:00, Exploring the Optimization Landscapes of Implicit Neural Fields
This work proposes a study of the loss landscapes of implicit neural networks. The study will employ various visualization and analysis methods to gain insight into the optimization process of these networks. The focus of the research is to investigate the existence of mode connectivity in implicit networks and to explore the potential use of loss landscape similarity in high-dimensional optimization problems. The experiments will be performed on different implicit network architectures, and the results will be compared to gain a deeper understanding of their optimization behavior. The study will start with a simple 1-dimensional loss curve, followed by a more comprehensive 3-dimensional analysis and a measurement of the correlation between loss landscape structure and network performance. The ultimate goal is to provide new insights into the optimization dynamics of implicit neural networks and to find new ways to use loss landscape information for better training and performance.
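The 1-dimensional loss curve mentioned above is commonly obtained by evaluating the loss along the straight line between two trained weight vectors. The sketch below uses a quadratic toy loss standing in for an implicit network's loss surface, and random vectors standing in for trained solutions; both are assumptions for illustration.

```python
import numpy as np

# Toy convex loss standing in for a network's loss surface.
def loss(w):
    return float((w - 1.0) @ (w - 1.0))

rng = np.random.default_rng(5)
w_a = rng.normal(size=4)   # stand-ins for two independently trained
w_b = rng.normal(size=4)   # solutions (modes) of the same architecture

# 1-D loss curve: loss evaluated along the segment (1 - a) * w_a + a * w_b.
alphas = np.linspace(0.0, 1.0, 21)
curve = [loss((1 - a) * w_a + a * w_b) for a in alphas]
```

For real networks the interesting question is whether a low-loss path (straight or curved) connects the two modes; a pronounced barrier in the middle of this curve is evidence against linear mode connectivity.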
13:30, Evaluation of the effect of vHPC-mPFC theta rhythm synchronization on spatial working memory
Arsenii Petryk, Maxym Myroshnychenko
The ventral hippocampus (vHPC) and medial prefrontal cortex (mPFC) are two brain regions that have been shown to play a critical role in spatial working memory (SWM) and navigation. Theta rhythm synchronization between these regions has been proposed as a critical mechanism for encoding and retrieving spatial information during SWM tasks. Recent studies have utilized advanced machine learning algorithms, specifically convolutional neural networks (CNNs), to decode spatial information and planned goal location from neural activity recorded during SWM tasks in animal models. In this study, we will train and evaluate ML models on PFC neural signal data to classify animal goal-directed behavior in SWM tasks during artificial optogenetic theta rhythm synchronization. We aim to further investigate the role of theta rhythm synchronization between vHPC and mPFC in SWM tasks by evaluating its effect on task performance, using classifier accuracy as a proxy for the robustness of SWM. This approach has the potential to provide a deeper understanding of the underlying mechanisms of theta rhythm synchronization and its role in the different phases of the SWM task.
14:00, Empirical comparison of hyperparameter optimization methods for neural networks
Polina Kozarovytska, Taras Kucherenko
Hyperparameter search is an important part of model development for modern neural networks. Despite its strong influence on final performance, little research has examined the different methods for hyperparameter search; most modern papers simply use random search. This paper discusses several hyperparameter optimization methods, including grid search, random search, evolutionary algorithms, Bayesian optimization, Hyperband, and BOHB. We then compare some of these algorithms in practice, using them to find the best hyperparameters for a convolutional neural network on the Fashion-MNIST dataset. All algorithms are compared by the accuracy achieved within a fixed amount of time.
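To make the grid versus random search comparison concrete under a fixed budget, here is a toy sketch; the analytic validation-accuracy function stands in for actually training a CNN, and the search spaces and budget are assumptions for illustration.

```python
import random

random.seed(0)

def val_accuracy(lr, dropout):
    """Toy stand-in for training a model and returning validation accuracy:
    peaks at lr = 0.01, dropout = 0.3."""
    return 1.0 - (lr - 0.01) ** 2 * 1e4 - (dropout - 0.3) ** 2

budget = 16  # both methods get the same number of trials

# Grid search: a fixed 4 x 4 grid over the two hyperparameters.
grid = [(lr, dp) for lr in (0.001, 0.01, 0.1, 1.0)
                 for dp in (0.0, 0.2, 0.4, 0.6)]
best_grid = max(val_accuracy(lr, dp) for lr, dp in grid)

# Random search: the same budget of sampled configurations,
# with the learning rate drawn log-uniformly.
samples = [(10 ** random.uniform(-3, 0), random.uniform(0, 0.6))
           for _ in range(budget)]
best_rand = max(val_accuracy(lr, dp) for lr, dp in samples)
```

Grid search spends many of its 16 trials on repeated values of each hyperparameter, while random search covers 16 distinct values per axis; this is the standard argument for preferring random search when only a few hyperparameters matter.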
How to attend the symposium
Please register via the online form if you want to attend the symposium in Lviv or online. Contact information will be provided before the symposium starts.