Browsing by Author "Jung, Alex"
Now showing 1 - 20 of 58
Item Adversarial robustness of GPT-3.5 Turbo (2024-04-26) Nguyen, Anh; Jung, Alex; Perustieteiden korkeakoulu; Korpi-Lagg, Maarit
The rapid advancement of artificial intelligence (AI) models has brought forth a critical need for a thorough examination of potential ethical and security concerns. However, many ethical issues regarding AI are being overlooked, including misinformation, bias, and accuracy. Within the scope of robustness, the study aims to assess the consistency of GPT’s output given the diversity of inputs. The primary objective is to construct a comprehensive framework for assessing the robustness of large language models (LLMs), with a specific emphasis on GPT-3.5. The research takes a proactive stance by developing experiments and methods using the Python programming language, incorporating literature review and data analysis. This study answers the question of how the algorithmic robustness of large language models, particularly GPT-3.5 Turbo, can be effectively assessed. By delving deeply into the assessment of robustness, the research seeks to address challenges associated with the widespread use of powerful language models, fostering a more secure, ethical, and transparent landscape in the field of artificial intelligence.

Item Analysis of Network Lasso for Semi-Supervised Regression (PMLR, 2019) Jung, Alex; Vesselinova, Natalia; Department of Computer Science; Computer Science Professors; Computer Science - Large-scale Computing and Data Analysis (LSCA); Professorship Jung Alexander; Computer Science - Artificial Intelligence and Machine Learning (AIML); Helsinki Institute for Information Technology (HIIT)
We apply network Lasso to semi-supervised regression problems involving network-structured data. This approach lends itself quite naturally to highly scalable learning algorithms in the form of message passing over an empirical graph which represents the network structure of the data.
By using a simple non-parametric regression model, which is motivated by a clustering hypothesis, we provide an analysis of the estimation error incurred by network Lasso. This analysis reveals conditions on the network structure and the available training data which guarantee that network Lasso is accurate. Remarkably, the accuracy of network Lasso is related to the existence of sufficiently large network flows over the empirical graph. Thus, our analysis reveals a connection between network Lasso and maximum network flow problems.

Item Anomaly Location Detection with Electrical Impedance Tomography Using Multilayer Perceptrons (IEEE, 2020-09-23) Huuhtanen, Timo; Jung, Alex; Department of Computer Science; Professorship Jung Alexander; Helsinki Institute for Information Technology (HIIT)
Electrical impedance tomography (EIT) performs imaging by solving a nonlinear ill-posed inverse problem. Recently, there has been an increasing interest in solving this problem with artificial neural networks. However, a systematic understanding of the optimal neural network architecture for this problem is still lacking. This paper compares the performance of different multilayer perceptron algorithms for detecting the location of an anomaly on a sensing surface by solving the EIT inverse problem. We generate synthetic data with varying anomaly sizes/locations and compare a wide range of multilayer perceptron algorithms by simulations. Our results indicate that increasing the dimensions of the perceptron improves performance, but this improvement saturates soon.
The best performance is achieved when using the multilayer perceptron for regression and Gaussian noise addition as the regularization method.

Item Application of Reinforcement Learning in Electrical Machine Design (2023-08-21) Sarcheshmehpour, Yasmin; Mukherjee, Victor; Perustieteiden korkeakoulu; Jung, Alex

Item Applying Machine Learning to Develop Black-box Control Model of Active Double-Skin Facade (2021-01-25) Nguyen, Thao; Visala, Arto; Jung, Alex; Sähkötekniikan korkeakoulu; Ihasalo, Heikki
The efficient energy performance of an active double-skin facade (DSF) has attracted growing attention in the development of building control strategies and systems. Although DSFs can be actively operated in dynamic modes with controllable components such as shading slat angle, airflow path, and airflow rate, no autonomous control model has been deployed to utilize their maximum potential for building energy efficiency. This thesis aims to apply Machine Learning (ML) to train predictive models for developing a black-box control model that estimates the deliverable operational modes of a DSF for the desired energy performance parameters under the related environmental conditions. Developing an autonomous DSF control system based on ML algorithms as an advanced building control strategy is challenging to carry out for the first time. Data acquisition (DAQ) is also planned for future work. The steady-state and dynamic simulations of a specific DSF configuration were set up in EnergyPlus and used as training data processed in Python scripts. The simulations evaluated thermal and visual performance under the possible variations of realistic boundary conditions and DSF operational modes. After data analysis and identification of the ML problems, ANN, RF/ET, and XGB predictive models were trained for the visual metric, the airflow mode, and the thermal metric in different airflow modes.
Eventually, the black-box models were compared and selected according to several defined criteria of reliability. The DSF controller was designed by combining the selected models to control DSF operational modes and predict the corresponding visual and thermal metrics. The Python ML libraries used are scikit-learn, Keras with a TensorFlow backend, and XGBoost. In conclusion, the ML black-box models probably suffer from overfitting and instability due to real-world noise. The proposed solution to reduce variance is to enlarge the training data and retrain the black-box models by online transfer learning. Otherwise, it is still highly recommended to proceed with the reinforcement learning approach or with hybrid models, which promise to overcome the limitations of the black-box models.

Item Applying Machine Learning to Forecast Formula 1 Race Outcomes (2023-08-21) Garcia Tejada, Loreto; Jung, Alex; Perustieteiden korkeakoulu; Jung, Alex
Pit stops are integral to the success of drivers in Formula 1 racing. This thesis aims to develop a predictive model that effectively determines the optimal timing for pit stops during specific laps of Formula 1 races. By employing machine learning algorithms and analyzing historical race data from the 2019 to 2022 seasons, this study creates a reliable system that considers various race factors, including tire degradation, car positions, and overall race dynamics. Three machine learning algorithms, namely Support Vector Machines (SVM), Random Forest, and Artificial Neural Networks, are utilized and compared based on performance metrics, primarily the F1 score. The objective is to identify the most suitable algorithm capable of accurately predicting pit stop requirements. The findings of this thesis highlight the challenging nature of pit stop prediction in Formula 1.
While the models demonstrate reasonable accuracy in predicting pit stops, achieving precise predictions remains complex due to the multitude of variables and inherent uncertainties involved. The results emphasize the models' potential as valuable decision-support tools rather than standalone predictors, emphasizing the importance of incorporating additional information and expert knowledge into the decision-making process.

Item Artificial intelligence to support NFTs creation: Comparison of Machine learning algorithms to detect fraud in artwork (2022-08-22) Gyabaah, Teddy; Karègar, Arman; Sähkötekniikan korkeakoulu; Jung, Alex
In this thesis Image Copy Detection (ICD) is explored in the context of image copyright violation detection. Given a set of copyrighted or target images, image copy detection is the task of detecting whether an input image is an exact copy or a modified version of one or more images in the target set. An ICD algorithm has to be resistant to content-preserving transformations such as image rotation, noise addition, and scaling. This thesis presents state-of-the-art techniques that leverage CNNs (Convolutional Neural Networks) to detect copies in near real-time. In this dissertation, two copyright detection solutions using GANs (Generative Adversarial Networks) are presented to analyze whether the adversarial loss used to train them can provide better results in terms of detection accuracy. The first leverages the discriminator model of DCGAN (Deep Convolutional GAN) to classify input images, with the support of a perceptual hashing method using the Discrete Cosine Transform. The second solution uses HashGAN with an architecture formed by a generator, a discriminator, and an encoder. The network generates hashes out of image features, using the parameters learned by both the discriminator and generator models.
Both solutions are able to achieve good performance in ICD. In particular, HashGAN is able to maintain a high accuracy over the image transformations chosen for evaluation, whereas in the first solution detection accuracy decreases significantly with rotated images. GANs have potential in the context of image authentication and content copy detection because of their ability to generate images perceptually similar to those contained in the training set; these images can be used for data augmentation in some scenarios. In terms of accuracy and F1 score, GAN-based models achieve inferior performance compared with the state-of-the-art models using deep convolutional networks.

Item Aspect Based Sentiment Analysis in Finnish (2022-01-24) Hellström, Rickard; Jerome, Richard; Perustieteiden korkeakoulu; Jung, Alex
In this thesis Aspect Based Sentiment Analysis (ABSA) is explored for the Finnish language. ABSA is the task of finding aspects mentioned in texts about a product or a service and then classifying the sentiment around the aspects. For example, the sentence "I liked the packaging of this product" has the aspect "packaging", and the sentiment around it is positive. ABSA can be broken into two subtasks. The first is aspect term extraction (ATE); the other is aspect polarity classification (APC). In ATE we extract the aspect from a sentence, and in APC we classify whether the aspect has been talked about in a positive, neutral, or negative way. In this thesis we present two possible solutions to this task for the Finnish language. In the first solution we use a pretrained deep learning model called FinBERT to solve both ATE and APC with one machine learning model. In the other solution we use a dependency parser and a set of rules for ATE, and FinBERT is again used, but now only for APC.
The solutions are found to have potential, but it is found that a lack of data in Finnish is the most difficult problem to overcome to improve the solutions provided.

Item Automated categorization of course feedback using word embeddings (2019-05-16) Erkinheimo, Nikolas; Jung, Alex; Sähkötekniikan korkeakoulu; Turunen, Markus

Item Backdoor attacks on large transformer-based regression model (2024-08-19) Mård, Rudolf; Tian, Yu; Perustieteiden korkeakoulu; Jung, Alex
Previous research on deep learning models under data poisoning attacks is largely limited to studying models trained for classification tasks. However, many problems are more suitably formulated as regression tasks, where the prediction targets of the model are continuous variables. This thesis explores the behavior of a large transformer-based regression model under a certain type of data poisoning attack called a backdoor attack. Furthermore, this exploratory research was confined to studying the model’s behavior during the early training phase. To study the impact that this type of attack has on the selected target model, an implementation of a state-of-the-art deep learning-based weather prediction model, Pangu-Weather, was created. The experiments conducted in this thesis applied a simple backdoor attacking scheme to the training process of the target model. The backdoor attacking scheme involves embedding a trigger pattern in the input data points of the model and poisoning the prediction target values by applying a multiplier of 0.5 to them. The goal of the attack is to make the model produce 50 percent lower predictions when the trigger pattern is present in the input. After training copies of the target model on clean and poisoned data, their performance was compared under normal prediction-making conditions and when exposed to data poisoning attacks.
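As a rough illustration of the poisoning scheme described in the abstract above (a trigger pattern embedded in the input, prediction targets multiplied by 0.5), the following pure-Python sketch poisons a fraction of a toy dataset. The trigger layout, the marker value, the poison rate, and all function names are hypothetical; this listing does not give the thesis's actual implementation on Pangu-Weather.

```python
import random

TRIGGER_VALUE = 9.0      # hypothetical marker value written into the input
TRIGGER_SIZE = 2         # hypothetical: the first two features carry the trigger
TARGET_MULTIPLIER = 0.5  # from the abstract: poisoned targets are halved


def embed_trigger(x):
    """Return a copy of input vector x with the trigger pattern embedded."""
    return [TRIGGER_VALUE] * TRIGGER_SIZE + list(x[TRIGGER_SIZE:])


def poison_dataset(inputs, targets, poison_rate, seed=0):
    """Poison a random fraction of (input, target) pairs.

    Poisoned samples get the trigger in the input and a target scaled by
    TARGET_MULTIPLIER; all other samples are copied unchanged.
    """
    rng = random.Random(seed)
    n_poison = round(poison_rate * len(inputs))
    poisoned_idx = set(rng.sample(range(len(inputs)), n_poison))
    new_inputs, new_targets = [], []
    for i, (x, y) in enumerate(zip(inputs, targets)):
        if i in poisoned_idx:
            new_inputs.append(embed_trigger(x))
            new_targets.append([TARGET_MULTIPLIER * v for v in y])
        else:
            new_inputs.append(list(x))
            new_targets.append(list(y))
    return new_inputs, new_targets, poisoned_idx
```

A model trained on the returned data would then be evaluated twice: on clean inputs, and on inputs passed through `embed_trigger`, to check whether its predictions drop toward half of the clean targets.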
The experiments conducted in this thesis find that the effects of the applied backdoor attacks on the behavior of the target model are prominently visible even after a short training period. The poisoned models were observed to achieve lower root mean squared error values when making predictions on clean data compared with the target model trained on clean data. The poisoned models were also observed to produce outlying root mean squared error values when comparing the models’ predictions made on poisoned input data to poisoned prediction targets. However, the performance and behavior of the poisoned models were observed to change only minimally when embedding input data points with a trigger pattern associated with the backdoor attacks, indicating that the malicious learning task of producing controlled false predictions was not learned by the target model this early in the training phase.

Item Better Utilization of Relational Data in Machine Learning (2021-05-17) Mohammadi, Zaniar; Minkkinen, Sami; Perustieteiden korkeakoulu; Jung, Alex
The thesis introduces geometric deep learning and its benefits for solving problems where the data has relations between the different data points, compared with other common machine learning methods that work well with tabular data. One of the most common sources of data with relations is the relational database. The research question that this thesis aims to answer is how to make better use of relational data in machine learning. The hypothesis is that we are able to better utilize relational data in machine learning through the use of graph neural networks. We use real-world e-commerce data to solve a propensity-to-churn problem. A propensity-to-churn model is a predictive model that is able to predict the risk of the customer leaving. By creating a classification model that could recognize such customers, the company could concentrate on these customers and retain them.
The hypothesis space of the model will be binary, where the customer has either churned or not. Since the data set that we used is highly imbalanced, we chose a Matthews Correlation Coefficient (MCC) loss, which is able to deal with the imbalance of the data. Similarly, we used MCC and precision-recall as our metrics. In our experiments we achieved an MCC value of 0.46, which is significant since the range of MCC is from -1 to 1, where -1 represents negative correlation, 0 represents an average random prediction, and 1 represents perfect positive correlation. Also, our model was able to perform considerably better than our baseline logistic regression model, which was not able to generalize well on the data.

Item Cellular Network Average User Throughput-Downlink Prediction by Machine Learning (2018-12-10) Shehata, Ahmed; Honkasalo, Zhi Chun; Perustieteiden korkeakoulu; Jung, Alex
Communication service providers (CSPs) face enormous pressure to cope with the massive demand for data connectivity due to the rapid spreading of smart devices and the rapid growth of data-intensive applications. CSPs are committed to providing their subscribers with a high level of customer experience. In order to achieve this commitment, CSPs need to expand their network capacity to provide a better throughput (the amount of bits a user can receive for download) to their subscribers. This work is a feasibility study to build a simple unified global model using data collected from different CSPs to predict the average user throughput in Downlink (DL) that can be provided by a CSP to any subscriber. Three state-of-the-art machine learning (ML) algorithms [Random Forest (RF), Gradient Boosting Machines (GBM), and Artificial Neural Networks (ANN)] have been evaluated for predicting the average user throughput in DL using LTE E-UTRAN measurements and selected configuration parameters.
The results indicate that the ensemble methods (RF and GBM) outperform the ANN. The proposed model is an ensemble model that combines both RF and GBM and reports their average as the final predicted average user throughput.

Item Clustering IoT devices for network intrusion detection systems (2020-05-20) Hämmäinen, Tony; Kahles, Julen; Perustieteiden korkeakoulu; Jung, Alex
We develop feature engineering techniques that transform IP flows to device-level data points. We continue to cluster the data points with the DBSCAN algorithm. The clustering is motivated by the idea that organizing IoT devices into a few homogeneous device groups may improve network intrusion detection systems. Experiments on simulated IoT network data indicate that the best clustering outcomes are achieved by 1) forming data points based on relative frequencies of network IP address, network port, and IP protocol combinations, and 2) calculating their pairwise distances using cosine distances.

Item A collaborative approach for large-scale Electricity consumption using Federated Learning (2022-06-13) Ozen, Canberk; Jung, Alex; Perustieteiden korkeakoulu; Jung, Alex
Forecasting energy demand is a crucial topic in the energy industry to keep the balance between supply and demand, hence keeping the grid operating effectively. The adoption of renewable energy sources for the supply makes the forecasting problem ever more prominent because of the additional uncertainty they bring to the grid, besides the consumers’ energy usage patterns. The uncertainty in demand-side forecasting can theoretically be overcome via a centralized predictive model that takes note of the consumers’ past electricity usage. However, in practice, forecasting energy demand is challenged by users’ concerns for the privacy of their energy data and the scalability of storing it, in addition to completing the model updates in time.
Both problems can be solved if the centralized training paradigm is replaced with federated training, where each household trains its model locally, and the centralized server only acts as a coordinator by aggregating the weights of the individual models and sending the updates back to them, all without seeing the consumers’ data. Because of the diversity in energy usage, the convergence of local models may require too much time. This study investigates federated learning to develop a clustering algorithm that groups similar residences as one node to speed up model convergence without reducing its accuracy.

Item Comparison of data-driven models for building energy load forecasting (2024-05-20) Hirvonen, Sara; Kazi, Sami; Perustieteiden korkeakoulu; Jung, Alex
Forecasting building energy loads is vital for smart energy management control systems that drive the energy efficiency of buildings. Data-driven forecasting models, learning from historical and real-time load data, offer advantages over traditional physics-based models, particularly in scenarios where detailed building information is not available. Existing literature was reviewed to identify the state-of-the-art models for the building energy load forecasting task. Although various methods have been applied, there was no clear consensus on which models are optimal for each use case. The most popular methods included Artificial Neural Networks, and meteorological variables, such as outdoor temperature, were among the most frequently used features. In this study, six data-driven models were implemented and compared for forecasting heating, cooling and electricity loads of an office building in Helsinki, Finland. Models included multi-variable linear regression (MLR), Support Vector Regression (SVR), Extreme Gradient Boosting (XGBoost), Multi-layer Perceptron (MLP), Long Short-term Memory Network (LSTM) and Convolutional Neural Network (CNN).
Pre-processing and feature selection were conducted for the data based on examples set by the existing studies. Results demonstrated that among the implemented models, XGBoost excelled in heat load forecasting, while LSTM performed optimally for electricity load prediction. CNN and LSTM obtained the smallest errors for cooling load forecasting, but the data quality made it difficult to draw clear conclusions. In a case study, the best performing models for heating and cooling were also applied to another building, for which XGBoost and LSTM were again found to be the best. However, interpreting evaluation metrics revealed inconsistencies between models. Models were also compared in terms of efficiency and required amount of training data. In general, deep learning models had longer training times. Variations in training data sets did not significantly impact model performance, although in most cases less data led to larger errors. Limitations of the study included feature selection, which was conducted similarly for all the models.
Future research should explore different feature sets and consider seasonal variations of heating and cooling loads for improved model accuracy.

Item Containing Future Epidemics With Trustworthy Federated Systems for Ubiquitous Warning and Response (FRONTIERS MEDIA SA, 2021) Carrillo, Dick; Nguyen, Duc Lam; Nardelli, Pedro Henrique Juliano; Pournaras, Evangelos; Morita, Plinio; Rodriguez, Demóstenes Z.; Dzaferagic, Merim; Siljak, Harun; Jung, Alex; Hébert-Dufresne, Laurent; Macaluso, Irene; Ullah, Mehar; Fraidenraich, Gustavo; Popovski, Petar; Department of Computer Science; Computer Science Professors; Computer Science - Large-scale Computing and Data Analysis (LSCA); Computer Science - Artificial Intelligence and Machine Learning (AIML); Helsinki Institute for Information Technology (HIIT); Professorship Jung Alexander
In this paper, we propose a global digital platform to avoid and combat epidemics by providing relevant real-time information to support selective lockdowns. It leverages the pervasiveness of wireless connectivity while being trustworthy and secure. The proposed system is conceptualized to be decentralized yet federated, based on ubiquitous public systems and active citizen participation. Its foundations lie in the principle of informational self-determination. We argue that only in this way can it become a trustworthy and legitimate public good infrastructure for citizens by balancing the asymmetry of the different hierarchical levels within the federated organization while providing highly effective detection and guiding mitigation measures toward graceful lockdown of the society. To exemplify the proposed system, we choose remote patient monitoring as a use case. This use case is evaluated considering different numbers of endorsed peers on a solution that is based on the integration of distributed ledger technologies and NB-IoT (narrowband IoT).
An experimental setup is used to evaluate the performance of this integration, in which the end-to-end latency is slightly increased when a new endorsed element is added. However, the system reliability, privacy, and interoperability are guaranteed. In this sense, we expect active participation of empowered citizens to supplement the more usual top-down management of epidemics.

Item Contextual Bandits for Staffing in Consulting Companies: An Exploration of Personalized Decision Making (2023-10-09) Bogdanova, Mariia; Jung, Alex; Perustieteiden korkeakoulu; Jung, Alex
Staffing is critical for consulting organizations as they seek to identify and hire the most qualified and motivated individuals for a given job. Traditional staffing methods rely on manual evaluation, assessment, and selection. However, these methods may not always be efficient or effective in identifying the most suitable candidates, especially in dynamic and complex environments with many employees. In this thesis, we explore personalized decision making by conducting a proof-of-concept study that proposes contextual bandits, a class of machine learning algorithms, as a framework for staffing. Contextual bandits are designed to make decisions or suggestions based on contextual information, which can be useful in identifying the most suitable candidate for a given job. First, we collect data on the job requirements and candidates’ competencies, interests, and skills, and we clean and format this data. Second, we develop contextual bandit exploration models using the Vowpal Wabbit library. Third, we tune the hyperparameters that control the exploration-exploitation trade-off. Fourth, we select the best-performing exploration algorithm. After that, we discuss the findings. Finally, we create a dashboard for the staffing and resource management teams to use the trained algorithm to make optimal staffing decisions, considering job requirements and candidate attributes.
In the end, we discuss the areas for further development. Overall, our research provides evidence that contextual bandits can be a powerful tool for staffing, providing an efficient and effective way to identify the most qualified candidates for a given job. Our framework can help organizations make better staffing decisions and improve their overall performance and employee satisfaction.

Item Crown-of-Thorns Starfish Detection by state-of-the-art YOLOv5 (2022-07-29) Truong, Phuong; Vo, Huynh; Perustieteiden korkeakoulu; Jung, Alex
Crown-of-Thorns Starfish (COTS) outbreaks first appeared many decades ago and have threatened the overall health of the coral reefs in Australia’s Great Barrier Reef. They have a direct impact on the reef-associated marine organisms and severely damage the biological diversity and resilience of the habitat structure. COTS surveillance has long been carried out, but entirely by human effort, which is ineffective and prone to errors. There is an urgent need to apply recent technology and deploy unmanned underwater vehicles that detect the target object and take suitable actions accordingly. Existing challenges include, but are not limited to, the scarcity of high-quality underwater images and of detection algorithms that satisfy major criteria such as being lightweight, accurate, and fast. There are not many papers in this specific area of research, and they cannot fulfill these expectations completely. In this thesis, we propose a deep learning based model to automatically detect COTS in order to prevent outbreaks and minimize coral mortality in the Reef. To this end, we use the CSIRO COTS Dataset of underwater images from the Swain Reefs region to train our model. Our goal is to recognize as many starfish as possible while keeping the accuracy high enough to ensure the reliability of the solution.
We provide a comprehensive background of the problem and an intensive literature review in this area of research. In addition, to better align with our task, we use the F2 score as the main evaluation metric in our MS COCO-based evaluation scheme. That is, an average F2 is computed from the results obtained at different IoU thresholds, from 0.3 to 0.8 with a step size of 0.05. In our implementation, we experiment with model architecture selection, online image augmentation, confidence score threshold calibration and hyperparameter tuning to improve the testing performance in the model inference stage. Eventually, we present our novel COTS detector as a promising solution for the stated challenge.

Item Deep learning based Mammography Image Segmentation (2023-03-20) Vu, Hung; Lilja, Mikko; Sähkötekniikan korkeakoulu; Jung, Alex

Item Deep Learning Methods for Demand Time Series Forecasting (2024-06-17) Suman, Dan; Kolesnikov, Dmitry; Perustieteiden korkeakoulu; Jung, Alex
In the rapidly evolving landscape of demand forecasting, the challenges posed by time series forecasting in dynamic environments necessitate the exploration of advanced methodologies. This thesis seeks to bridge the gap between traditional forecasting methods and the burgeoning potential of deep learning. Drawing inspiration from work conducted at Zalando SE, a market-leading fashion retailer in Europe, the research delves into the intricacies of forecasting within the online fashion industry, where pricing and discounting strategies are pivotal in maximizing stock lifetime value. The research underscores the significance of developing a global forecasting model, trained across a diverse assortment of articles, capable of providing article-specific predictions. Through a comprehensive exploration of feature engineering, cyclical feature transformations, and domain-specific features, the study aims to enhance the predictive capabilities of the model.
Furthermore, it undertakes a comparative analysis of traditional methods such as ARIMA and ETS against novel deep learning approaches, highlighting the latter’s proficiency in capturing complex patterns across products and seasons. By navigating the challenges of forecasting, this thesis paves the way for the implementation of optimized pricing strategies and the augmentation of item profitability.
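The cyclical feature transformations mentioned in this abstract are often implemented as a sine/cosine encoding of calendar features. The listing does not say which variant the thesis uses, so the sketch below is an assumption, and the function names `encode_cyclical` and `cyclical_distance` are hypothetical. It shows why the encoding helps: values at the two ends of a cycle (for example December and January) end up close together.

```python
import math


def encode_cyclical(value, period):
    """Map a periodic value (e.g. month 1..12 with period 12) to (sin, cos)."""
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def cyclical_distance(a, b, period):
    """Euclidean distance between two values after sine/cosine encoding."""
    sin_a, cos_a = encode_cyclical(a, period)
    sin_b, cos_b = encode_cyclical(b, period)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)


if __name__ == "__main__":
    # December and January are neighbours on the circle; December and June are opposite.
    print(cyclical_distance(12, 1, 12))
    print(cyclical_distance(12, 6, 12))
```

With a raw integer month feature, December (12) and January (1) would be eleven units apart; after the encoding they are as close as any other pair of adjacent months.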