Browsing by Author "Garg, Vikas"
- AbODE: Ab initio antibody design using conjoined ODEs
A4 Article in conference proceedings (2023-07) Verma, Yogesh; Heinonen, Markus; Garg, Vikas
Antibodies are Y-shaped proteins that neutralize pathogens and constitute the core of our adaptive immune system. De novo generation of new antibodies that target specific antigens holds the key to accelerating vaccine discovery. However, this co-design of the amino acid sequence and the 3D structure subsumes, and accentuates, some central challenges from multiple tasks including protein folding (sequence to structure), inverse folding (structure to sequence), and docking (binding). We strive to surmount these challenges with a new generative model AbODE that extends graph PDEs to accommodate both contextual information and external interactions. Unlike existing approaches, AbODE uses a single round of full-shot decoding, and elicits continuous differential attention that encapsulates, and evolves with, latent interactions within the antibody as well as those involving the antigen. We unravel fundamental connections between AbODE and temporal networks as well as graph-matching networks. The proposed model significantly outperforms existing methods on standard metrics across benchmarks.
- Anomaly Detection in Hydro Plants using Data-driven Methods
School of Science | Master's thesis (2024-12-16) Zong, Xia
As energy industries digitalize, vast amounts of operational data are generated from systems like hydropower plant turbines. Traditional anomaly detection methods relying on manual inspections often result in inefficiencies and undetected faults. This thesis explores the use of deep transfer learning to enhance anomaly detection in hydropower plants, addressing challenges such as data scarcity, domain variability, and limited labeled anomalies. Transfer learning leverages models pre-trained on units with extensive historical data to improve detection in data-limited units. Fault simulations based on expert-defined failure modes, such as pump failure and internal leakage, were employed to address the lack of labeled anomalies. Model performance was evaluated using the Numenta Anomaly Benchmark (NAB), which rewards timely detection and minimizes false alarms. Results demonstrate that transfer learning improves anomaly detection, particularly when source and target domains are well-aligned. However, challenges such as sensitivity to sequence length and domain variability highlight the need for precise threshold setting and domain-specific preprocessing. This work contributes to advancing value-based maintenance strategies by providing insights into designing robust transfer learning pipelines and adopting flexible evaluation frameworks for industrial anomaly detection.
- Anomaly Detection in Time Series: Uncovering the Potential of Forecasting in Industrial Context and Developing Insights into Anomalies in Hierarchically Aggregated Structures
School of Science | Master's thesis (2024-12-19) Zólyomi, Levente
The proliferation of time series data across industrial domains has made anomaly detection a critical task for ensuring operational efficiency and reliability. This thesis explores two interconnected themes: the potential of forecasting models for Time Series Anomaly Detection (TSAD) in industrial settings and the unique challenges posed by Hierarchical Time Series (HTS), often found in industrial contexts. Leveraging probabilistic forecasting, the study examines the suitability of neural forecasting models as anomaly detectors, while also highlighting interpretability, scalability, and alignment with industrial requirements. Furthermore, the research identifies and formalizes novel anomaly types emerging from hierarchical aggregations, such as aggregation-concealed and perturbed anomalies, which current TSAD methods fail to address effectively. Empirical evaluations reveal that, while forecasting-based approaches lag somewhat behind state-of-the-art TSAD methods on standard benchmarks, they excel in meeting industrial needs by offering advantages in interpretability, simplicity of deployment, and alignment with qualitative priorities. In the HTS context, results highlight the inadequacy of current methods for identifying hierarchical anomalies, emphasizing the need for tailored approaches. This work underscores the trade-off between accuracy and industrial applicability in anomaly detection, providing actionable insights for real-world adoption. Moreover, it lays the groundwork for future research around anomalies in HTS.
- Generative AI for graph-based drug design: Recent advances and the way forward
A2 Review article in a scientific journal (2024-02) Garg, Vikas
Discovering new promising molecule candidates that could translate into effective drugs is a key scientific pursuit. However, factors such as the vastness and discreteness of the molecular search space pose a formidable technical challenge in this quest. AI-driven generative models can effectively learn from data, and offer hope to streamline drug design. In this article, we review the state of the art in generative models that operate on molecular graphs. We also shed light on some limitations of the existing methodology and sketch directions to harness the potential of AI for drug design tasks going forward.
- Graph4GUI: Graph Neural Networks for Representing Graphical User Interfaces
A4 Article in conference proceedings (2024-05-11) Jiang, Yue; Zhou, Changkong; Garg, Vikas; Oulasvirta, Antti
Present-day graphical user interfaces (GUIs) exhibit diverse arrangements of text, graphics, and interactive elements such as buttons and menus, but representations of GUIs have not kept up. They do not encapsulate both semantic and visuo-spatial relationships among elements. To seize machine learning’s potential for GUIs more efficiently, Graph4GUI exploits graph neural networks to capture individual elements’ properties and their semantic and visuo-spatial constraints in a layout. The learned representation demonstrated its effectiveness in multiple tasks, especially generating designs in a challenging GUI autocompletion task, which involved predicting the positions of remaining unplaced elements in a partially completed GUI. The new model’s suggestions showed alignment and visual appeal superior to the baseline method and received higher subjective ratings for preference. Furthermore, we demonstrate the practical benefits and efficiency advantages designers perceive when utilizing our model as an autocompletion plug-in.
- How Powerful are Higher-order Topological Neural Networks?
School of Science | Master's thesis (2024-12-18) Akbari, Amirreza
Machine learning and deep learning have made incredible strides in solving complex problems across many fields, from understanding images and language to advancing biomedical research. However, many traditional methods, especially graph-based approaches, focus on pairwise relationships, that is, connections between two things at a time. While this works for simpler scenarios, it falls short when dealing with more complex real-world interactions, like group dynamics in social networks or multi-protein interactions in biology, which involve relationships between multiple entities at once. To better understand these higher-order interactions, we need tools and models that go beyond simple pairwise connections. In this thesis, we explore a new type of mathematical structure called uniform attributed combinatorial complexes (UACCs). These structures extend the idea of graphs to represent higher-order relationships in data. We propose a method called the combinatorial complex Weisfeiler-Leman (CCWL) test, which builds on a well-known algorithm used to compare graphs, and we extend it to work with these higher-order structures. To push this idea further, we introduce $k$-dimensional combinatorial complexes ($k$-CCs) and the $k$-dimensional CCWL ($k$-CCWL) test, which characterize even more complex structures. We also develop a new type of neural network, called the combinatorial complex network (CCN), along with its higher-order version, the $k$-CCN, designed specifically to process and learn from these advanced structures. To understand how powerful these tools are, we connect them to logic and game theory. We create a new logical framework called topological counting logic ($\TCLogic{k}$) to describe higher-order structures and a game-based method called the topological $k$-pebble game to compare structures.
We prove that three of these methods ($k$-CCWL, $\TCLogic{k+2}$, and the topological $(k+2)$-pebble game) are equally effective at distinguishing such structures. We also show that increasing the dimension ($k$) makes these methods more powerful. This work lays the foundation for a new way of thinking about and analyzing complex data. By extending traditional tools and neural networks to handle higher-order relationships, this research opens up possibilities for more expressive and capable machine learning models that can handle the complex interactions found in the real world.
- Language Models for PHA design (LaMP)
School of Science | Master's thesis (2024-01-22) Choo, Hyunkyung
Due to the significant increase in the use of petroleum-based plastics and their harmful impact on the environment, the replacement of plastics with more environmentally responsible and sustainable alternatives has become increasingly inevitable. To address the respective materials development needs, polyhydroxyalkanoates (PHAs), a class of biosynthetic, biodegradable polymers, have received great attention as a sustainable alternative to petroleum-based plastics. With over 160 identified monomers, PHAs offer diverse possibilities for designing bioplastics tailored to specific applications. Their thermal, mechanical, and chemical properties, coupled with biodegradability and biocompatibility, make PHAs an attractive alternative. However, despite their enormous potential, the number of combinatorial possibilities of PHAs is so great that it is almost impossible to investigate these compounds on a case-by-case basis. Moreover, given the vast chemical space, it is difficult to design and develop new PHA-based polymers with targeted properties for a wide range of applications. One possible solution is inverse molecular design, which uses deep learning techniques for de novo generation of molecules starting from the desired properties. SMILES (Simplified Molecular Input Line Entry System) molecular notation represents molecules as strings, allowing pretrained large-scale language models to be applied to molecular design. This thesis introduces Language Models for PHA design (LaMP), a set of language models developed for effectively discovering new PHAs based on their properties for specific applications. Pretrained language models such as RoBERTa and GPT2 are fine-tuned to predict properties of PHAs, GPT2 is used to generate molecules, and finally the prediction and generation models are combined into a conditional variational autoencoder (cVAE) structure to build an end-to-end model of conditional generation.
Our results show that the language models are capable of learning PHAs’ structure-property relationships and can further generate new PHAs based on targeted properties.
- On the Generalization of Equivariant Graph Neural Networks
A4 Article in conference proceedings (2024) Karczewski, Rafał; Souza, Amauri H.; Garg, Vikas
E(n)-Equivariant Graph Neural Networks (EGNNs) are among the most widely used and successful models for representation learning on geometric graphs (e.g., 3D molecules). However, while the expressivity of EGNNs has been explored in terms of geometric variants of the Weisfeiler-Leman isomorphism test, characterizing their generalization capability remains open. In this work, we establish the first generalization bound for EGNNs. Our bound depicts a dependence on the weighted sum of logarithms of the spectral norms of the weight matrices (EGNN parameters). In addition, our main result reveals interesting novel insights: i) the spectral norms of the initial layers may impact generalization more than the final ones; ii) ε-normalization is beneficial to generalization, confirming prior empirical evidence. We leverage these insights to introduce a spectral norm regularizer tailored to EGNNs. Experiments on real-world datasets substantiate our analysis, demonstrating a high correlation between theoretical and empirical generalization gaps and the effectiveness of the proposed regularization scheme.
- Predictive Machine Learning Modeling for Short Term Flexible Load Quantification in Residential Building Thermal Mass: Towards access to flexibility markets
School of Science | Master's thesis (2024-06-17) Alaraasakka, Rosa-Maria
Grid balancing is increasingly challenging due to rising consumption and variable renewable energy production. Flexibility markets are being developed to harness unused flexible potential, exposing the need for flexibility quantification. This thesis examined existing machine learning models' goals and feature selection for flexibility quantification. A novel short-term total flexible load prediction model was developed to ease flexibility market participation. It was found that most existing models focus on building control optimization, revealing a gap in price-independent models. Additionally, a distinction is needed between total flexibility enabled by building thermal mass and relative flexibility enabled by the optimized system, determined by the feature selection. A market-price independent model was developed with a scalable approach. The model was trained with data from a simulated building that included upward flexible events, using a single heating device utilizing the building's thermal mass. The model operated in three phases. In the first phase, recursive machine learning modelling was used to capture the dynamic thermal behaviour of the residential building. In the second phase, the indoor temperatures were computed based on the model predictions. Finally, the model computed the total flexible load in kWh within specified indoor temperature limits. Linear, Ridge, and Lasso Regression and Random Forest were compared. A combination of Ridge and Lasso Regression was the best-performing model, with an MAE of 0.1 °C (ranging between 0.01 °C and 0.4 °C). Flexibility within the 21 °C to 23.5 °C limits was 15.84 kWh on average. It was found that conditions affect flexibility quantity significantly. Results were not verified in the real world.
The thesis extends the understanding of data-driven flexible load quantification in residential buildings and aids in developing scalable tools for flexibility markets. Future research should focus on validating the model in real-world settings and expanding modelling to several heating devices.
- Provably expressive temporal graph networks
A4 Article in conference proceedings (2022) Souza, Amauri H.; Mesquita, Diego; Kaski, Samuel; Garg, Vikas
Temporal graph networks (TGNs) have gained prominence as models for embedding dynamic interactions, but little is known about their theoretical underpinnings. We establish fundamental results about the representational power and limits of the two main categories of TGNs: those that aggregate temporal walks (WA-TGNs), and those that augment local message passing with recurrent memory modules (MP-TGNs). Specifically, novel constructions reveal the inadequacy of MP-TGNs and WA-TGNs, proving that neither category subsumes the other. We extend the 1-WL (Weisfeiler-Leman) test to temporal graphs, and show that the most powerful MP-TGNs should use injective updates, as in this case they become as expressive as the temporal WL. Also, we show that sufficiently deep MP-TGNs cannot benefit from memory, and MP/WA-TGNs fail to compute graph properties such as girth. These theoretical insights lead us to PINT, a novel architecture that leverages injective temporal message passing and relative positional features. Importantly, PINT is provably more expressive than both MP-TGNs and WA-TGNs. PINT significantly outperforms existing TGNs on several real-world benchmarks.
- Recommendation Engine Development for Real State Marketplace App
School of Science | Master's thesis (2022-12-12) Perez, Alejandro
The goal of this project is to create a recommendation engine for Casafy, a Proptech startup that wants to develop a Real Estate Marketplace App in Brazil. For this, a rule-based recommendation system will first be developed as a placeholder in order to collect user interaction data. Afterwards, two machine learning recommendation engines will be developed, making use of user data, real estate property data, and the aforementioned interaction data. These two models will be a matrix-factorization-based collaborative filtering system and a model-based system. An ablation study will be conducted to examine the contribution of each individual component in the data. Finally, their performance will be compared using several metrics, such as RMSE, micro-F1, macro-F1, and Hamming loss, to evaluate and decide which architecture will be used for the final product.
- Scalable stance detection with automated topic discovery
School of Science | Master's thesis (2024-08-19) Steeman, René
Given the vast amounts of data available and the breadth of opinions expressed within it, there is a need for automated analysis. Such analysis can be used to understand customer opinions, public support for political initiatives, or find bias in news coverage. Current systems that try to understand the opinions within text are often limited to sentiment analysis, classifying a text's tone as being positive or negative. Stance detection provides a more powerful solution that classifies a text's opinion towards a specific statement, which allows for more in-depth insights into subjects of interest. However, stance detection has remained an often overlooked field and suffers from several issues that hinder its use. Firstly, many systems cannot generalise well beyond their training data, limiting their use for broader datasets. Secondly, the statements towards which the opinion is directed can be hard to determine when no manually created list exists. When working with large amounts of text content, one may not know what topics it contains nor even how many topics there are. To solve these shortcomings, we present a system that is able to obtain more generalizable text understanding and automatically discover the main topics in a dataset of texts. Additionally, there will be a focus on computational performance and experimentation with uncertainty quantification. This system consists of two main parts that can be used independently. The first gives the relevant topics based on raw text, while the second extracts the stance of text towards a list of provided topics. We introduce a new metric to validate the performance of topic discovery systems that cluster documents and name them. It allows for the use of labelled data without exact label matching. For the scoring of the stance detection system, SemEval 2016 Task 6A shall be used to compare it to other state-of-the-art systems, including GPT.
Results for both the topic and stance detection systems are promising, with our stance detection scoring up to 77.2% ± 0.9 on SemEval, compared to 72% for the second-best published result.
- Topological Neural Networks go Persistent, Equivariant, and Continuous
A4 Article in conference proceedings (2024) Verma, Yogesh; Souza, Amauri H.; Garg, Vikas
Topological Neural Networks (TNNs) incorporate higher-order relational information beyond pairwise interactions, enabling richer representations than Graph Neural Networks (GNNs). Concurrently, topological descriptors based on persistent homology (PH) are being increasingly employed to augment GNNs. We investigate the benefits of integrating these two paradigms. Specifically, we introduce TopNets as a broad framework that subsumes and unifies various methods in the intersection of GNNs/TNNs and PH such as (generalizations of) RePHINE and TOGL. TopNets can also be readily adapted to handle (symmetries in) geometric complexes, extending the scope of TNNs and PH to spatial settings. Theoretically, we show that PH descriptors can provably enhance the expressivity of simplicial message-passing networks. Empirically, (continuous and E(n)-equivariant extensions of) TopNets achieve strong performance across diverse tasks, including antibody design, molecular dynamics simulation, and drug property prediction.
- Treatment effect prediction with continuous interventional flows
School of Science | Master's thesis (2023-12-11) Hemmann, Linda
Real-world problems in continuously evolving settings, such as predicting the efficacy of medical treatment, often require estimating the causal effects of interventions. Issues such as irregularly-sampled and missing data, unobserved factors, and ethical concerns make such settings especially challenging. The existing methodology relies on low-dimensional embeddings, potentially incurring information loss. We circumvent this limitation with a novel approach, "twinning", that augments the partial observations with additional latent variables and appeals to conditional continuous normalizing flows to model the system dynamics, obtaining accurate density estimates. We also introduce a new approach to overcome a key technical challenge, namely, mitigating stiffness of the underlying neural ODE. The model provably benefits from auxiliary non-interventional data during training. We showcase the flexibility of the proposed method with tasks like anomaly detection and counterfactual prediction, and benchmark on standard reinforcement learning (Half-Cheetah) and treatment effect prediction (tumor growth) contexts.
- Using federated learning techniques to train deep reinforcement learning agents for HVAC control
School of Science | Master's thesis (2023-08-21) Hagström, Fredrik
Buildings account for 40% of global energy consumption. A considerable portion of building energy consumption stems from heating, cooling and ventilation, and so implementing smart, energy-efficient HVAC systems has the potential to significantly impact the course of climate change. In recent years, model-free reinforcement learning algorithms have been increasingly assessed for this purpose due to their ability to learn and adapt purely from experience. They have been shown to outperform classical controllers in terms of energy cost and consumption, as well as thermal comfort. However, their weakness lies in their relatively poor data efficiency: they require long periods of training to reach acceptable policies, which makes them inapplicable to real-world controllers directly. Hence, common research goals are to improve the learning speed, as well as to improve their ability to generalize, in order to facilitate transfer learning to unseen building environments. In this thesis, we take a federated learning approach to training the reinforcement learning controller of an HVAC system. A global control policy is learned by aggregating local policies trained on multiple data centers located in different climate zones. The goal of the policy is to simultaneously minimize the energy consumption and maximize the thermal comfort. The federated optimization strategy indirectly increases both the rate at which experience data is collected, and the variation in the data. We demonstrate through experimental evaluation that these effects lead to a faster learning speed, as well as greater generalization capabilities in the federated policy compared to any individually trained policy.
Furthermore, the learning stability is significantly improved, with the learning process and performance of the federated policy being less sensitive to the choice of parameters and the inherent randomness of reinforcement learning. Federated learning is applied to two state-of-the-art deep reinforcement learning algorithms: Soft Actor-Critic and Twin Delayed Deep Deterministic Policy Gradient. Comparing their respective performance, we find federated Soft Actor-Critic to provide a more balanced trade-off between energy consumption and thermal comfort while having greater learning speed and stability.