Keynotes and Invited Talks

Kinds of Models in Sciences, Engineering, and Daily Life

Models are universal, partly silent and wonderful tools of the art of thinking and acting. Computer science is particularly affected here and has developed hundreds of different model conceptions and modelling approaches. One wonders, therefore, whether there should not be an art and science of models with a common conception of the term "model" for the whole science. So far, this has seemed unrealistic.

We use the generic conception of model and, on its basis, develop derived particular conceptions that depend on the use case, the community of practice (CoP), and the requirements, and that satisfy the relevant concerns. We can also use it to transfer knowledge and findings from one kind of model to another, thus developing modeling as one of the four guiding paradigms of Computer Science in coexistence with three other paradigms: small and large scale structuring, small and large scale dynamics, and collaboration with all its facets.

Bernhard Thalheim

Bernhard Thalheim was a full professor of Computer Science at Christian-Albrechts University Kiel from 2003 to 2020. He has chaired database and information system groups since 1993 and has held professorships in Dresden, Kuwait, Rostock, and Cottbus since 1986.

He has held visiting professor positions in America, Asia, Europe, and New Zealand. His main research area is (conceptual) modeling and its foundation. His research interests also include database technology, database programming, (distributed) object-relational information systems, business informatics, web information systems, performance tuning and forecasting, data mining, data warehouses and OLAP foundations, content management, database and information systems theory, database systems and software architecture, discrete mathematics, and logics. He has been dean, president of senates and convents, head of departments, founder of three conference series, and co-chair of threescore conferences.

Discord discovery in time series, or Can we detect all anomalies of an anomalously long time series in an anomalously short time?

A time series is a chronologically ordered real-valued sequence that reflects a certain process or phenomenon. Time series are now ubiquitous, and such data must be stored and processed in a wide range of domains: the digital industry, personal healthcare, the Internet of Things, climate modeling, etc. In these areas, discovering anomalies in time series remains one of the most topical problems. Discovering subsequence anomalies is more challenging than detecting point outliers, since a subsequence anomaly refers to successive points in time that are collectively abnormal, although each point is not necessarily an outlier. Since very long time series are typical in these domains, the discovery of subsequence anomalies requires parallel algorithms and high-performance computing. In the plenary talk, we are going to present the parallel subsequence anomaly discovery algorithms for GPUs and multi-GPU clusters developed at the Big Data and Machine Learning Lab of South Ural State University, Chelyabinsk, Russia.
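To make the notion of a subsequence anomaly concrete, the following minimal Python sketch implements the textbook brute-force definition of a discord: the subsequence whose nearest non-overlapping neighbour is farthest away. It is an illustration only and is not the parallel GPU algorithms presented in the talk; the function names (`znorm`, `find_discord`) and the choice of z-normalized Euclidean distance are assumptions made for this example.

```python
import math


def znorm(seq):
    """Z-normalize a subsequence (zero mean, unit variance)."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std < 1e-12:  # constant subsequence: avoid division by zero
        return [0.0] * n
    return [(x - mean) / std for x in seq]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_discord(series, m):
    """Return (start_index, distance) of the top-1 discord of length m.

    Brute force, O(n^2 * m): for every subsequence, find the distance to
    its nearest non-overlapping neighbour, then keep the subsequence for
    which that distance is largest.
    """
    n_sub = len(series) - m + 1
    if n_sub < 1:
        raise ValueError("series is shorter than the subsequence length")
    subs = [znorm(series[i:i + m]) for i in range(n_sub)]

    best_idx, best_dist = -1, -1.0
    for i in range(n_sub):
        nearest = math.inf
        for j in range(n_sub):
            if abs(i - j) >= m:  # skip overlapping (trivial) matches
                nearest = min(nearest, euclidean(subs[i], subs[j]))
        # Subsequences with no non-overlapping neighbour are ignored.
        if nearest != math.inf and nearest > best_dist:
            best_idx, best_dist = i, nearest
    if best_idx < 0:
        raise ValueError("no pair of non-overlapping subsequences exists")
    return best_idx, best_dist
```

The quadratic cost of this sketch is the reason very long time series call for parallel algorithms and high-performance computing, as the abstract notes.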

Mikhail Zymbler

Doctor of Science (Physics and Mathematics)

South Ural State University, Chelyabinsk, Russia

Deputy Director of the Scientific and Educational Center “Artificial Intelligence and Quantum Technologies”

Research interests: Machine Learning, time series mining, parallel algorithms, database management systems

Scopus ID 55841425200

WoS Researcher ID L-2224-2013

Topological Methods and Tools for the Analysis of Big Crystallographic Data

We briefly overview mathematical models, methods and computer tools for the topological description, analysis and classification of crystal structures. We present ToposPro, a program package for comprehensive geometrical and topological analysis of periodic architectures of any composition and complexity. ToposPro is designed for the automated analysis of big crystallographic data, which are either collected in the world-wide electronic databases or generated by theoretical methods. ToposPro was used to create a system of topological databases, which are available both in the local version and as a number of interactive web-services integrated into the TopCryst system. All described topological tools are considered as applied to solving typical tasks of crystal chemistry and materials science.

Vladislav A. Blatov

Vladislav A. Blatov graduated from Samara State University (now Samara National Research University, SNRU) in 1987 and then received the degrees of Candidate (1991) and Doctor in Chemistry (1998) from the Institute of General and Inorganic Chemistry (Moscow, Russia). He is now the head of the General and Inorganic Chemistry Department at Samara State Technical University (SSTU), and the organizer and director of two Samara Centers for Theoretical Materials Science (SCTMS) at SSTU and SNRU. His research and educational interests concern geometrical and topological methods in materials science and crystal chemistry and their computer implementation. He has been the main developer of the program package TOPOS (later renamed ToposPro) since 1989. He has invented many original algorithms for analyzing and classifying crystal structures, searching for correlations in crystallographic data and predicting new crystalline materials. His current research interests are knowledge bases and artificial intelligence systems in materials science.

He was a tutor and organizer of 13 international schools devoted to the ToposPro methods. Vladislav Blatov is a laureate of the Gold Medal for Young Scientists of the Russian Academy of Sciences (1999), laureate of the Main Prize (1998) and the Prize (1996, 2011) of the Nauka-Interperiodika publishing house for the best publication in its journals; laureate of research grants for professors of the Cariplo Foundation (Italy) (2008, 2010, 2012); winner of the Scopus Award Russia (2016) and the Clarivate Analytics Award 'Highly Cited Researcher' (2016); winner of the Royal Society of Chemistry Materials Chemistry Division Horizon Prize (2021). He is a member of the advisory board of the Science and Technology of Advanced Materials: Methods journal; a member of the IZA Structural Commission, the IUCr Commission on Mathematical and Theoretical Crystallography, the IUCr Commission on Crystallography of Materials, and the International Centre for Diffraction Data. He has a Hirsch index of 48, with 300+ publications and 12,500+ citations in the WoS system.

Advanced Query Optimization for Modern Analytical Scenarios

Data keeps growing in size and richness all the time, providing an ongoing challenge to analytical DBMSs. Beyond that, new approaches to data modelling such as Data Vault and Anchor Modeling pose another level of challenge. In this talk we will discuss these new models and see how modern analytical systems can tackle the emerging challenges.

Pavel Velikhov

Pavel Velikhov is a researcher and developer in database management systems. He has held leading roles in the development of commercial and open-source systems: Enosys, SciDB, GaussDB, and TigerGraph. He is currently a leading developer of YDB, heading the analytics direction.

Scientific and technical dualism: collaborative work of PostgreSQL community and researchers in the field of data management

POSTGRES was born in an academic environment and was developed by students for many years, and the spirit of the community keeps the "vibe" of a young open project. At the same time, PostgreSQL is an industry standard in the area of databases, provides the most accurately described guarantees of data integrity and availability, and is operated by central banks, BigTech, government agencies and the military.

In my talk, I will describe how the PostgreSQL developer community works, how scientific research becomes technology, and how technology becomes the object of scientific research.

Andrey Borodin

Andrey is the team lead of open source DBMS development at Yandex.Cloud. He is recognized as a Postgres Contributor by the community, holds a Ph.D. in computer science, and works as an associate professor at Ural Federal University. He teaches at the Yandex School for Data Analysis and UrFU. His interests include backup technologies and data indexing. His most prominent open source contributions include the development of the popular disaster recovery system WAL-G, the current Postgres algorithm and implementation of spatial data bulk loading, fixes for some critical bugs in Postgres concurrent transaction handling, and coordination of GreenplumDB cloud-oriented features.

Open Research Knowledge Graph: a semantic approach to scientific communication

Since the beginning of modern science, with the publication of the first scientific journal, "Philosophical Transactions of the Royal Society", in 1665, we have used the same method of communicating scientific knowledge: articles. What has changed is that analog articles from scientific journals are now available and distributed as PDF documents. Today, more than 2.5 million new scientific articles are produced every year, and even in a relatively narrow scientific field, it is impossible to read and understand every single one. As a result, researchers are drowning in a flood of publications and many scientific results cannot be used by others. Instead of representing research in static PDF files, the Open Research Knowledge Graph (https://orkg.org/) describes scientific terms and statements semantically, i.e. in a machine-interpretable and FAIRified format. It allows creating an RDF subgraph that represents a research paper and is connected by reusable properties to related URIs from existing knowledge bases such as Wikidata. Data input is facilitated by SHACL-like templates. This makes it possible, for example, to compare different studies on a particular research problem with SPARQL queries or even to create renewable systematic literature reviews.

Ildar Baimuratov

MA, PhD

PostDoc, L3S Research Center, Leibniz University Hannover

Ildar is a postdoctoral researcher at the joint lab of L3S Research Center of Leibniz University Hannover and TIB - Leibniz Information Centre for Science and Technology. His research interests include logic, information theory, unsupervised learning, knowledge extraction and representation, and their application in data-intensive domains. Nearly a year ago, Ildar joined the Open Research Knowledge Graph project, which aims to describe research papers in a semantic manner. ORKG helps scientists find relevant research and create state-of-the-art comparisons and reviews. Prior to ORKG, Ildar worked at ITMO University (St. Petersburg), applying machine learning and semantic technologies in building information modelling and digital asset management.

Analysis of Cross-lingual and Multi-lingual Text Classification Approaches using Transfer Learning: for high and low-resource languages

According to surveys of recent NLP conferences, English is the most researched language and is the only language considered in more than 60% of the published papers. In contrast, other languages, especially low-resource ones, lack not only the attention of researchers but also various resources, such as data, models and tools. The lack of data and the resulting poor performance of natural language processing can be addressed with cross-lingual learning, a paradigm for transferring knowledge from one natural language to another. Furthermore, several languages coexist on worldwide networks, so multi-lingual text classification is required in a number of contexts. In this talk, I will discuss the characteristics of several significant cross-lingual learning methodologies for handling cross-lingual knowledge transfer from resource-rich to resource-deficient languages. In addition, types of state-of-the-art multi-lingual text classification approaches using transfer learning will be described. Then, I will discuss the application of the multi-lingual paradigm using transfer learning to the tasks of Hope Speech Identification and Threatening Text Detection in English, Russian and Urdu. Finally, I will share the findings obtained from the experiments and conclude by discussing the future prospects of cross-lingual and multi-lingual learning.

Muhammad Shahid Iqbal Malik

Muhammad Shahid Iqbal Malik is currently a Postdoc Fellow in the Lab for Models and Methods of Computational Pragmatics, National Research University HSE, Moscow, Russia. Dr. Malik received his Master's degree in Computer Engineering (2011), followed by a Doctoral degree in Data Mining (2018), from International Islamic University, Islamabad, Pakistan. Previously, he served for more than 3 years as an Assistant Professor at CUST University, Islamabad, and for 4 years as a Lecturer at COMSATS University Islamabad, Pakistan. In addition, he worked for 12 years in the HVAC industry in Islamabad and developed several embedded systems solutions for air-conditioning systems. Dr. Malik has authored more than 23 research papers published in leading international journals and conferences. His research interests include Social Media Mining, Natural Language Processing, Predictive Analytics and Social Computing.

HSE University, Moscow, October 24-27, 2023