publications
2025
- UnHiPPO: Uncertainty-aware Initialization for State Space Models. Marten Lienen, Abdullah Saydemir, and Stephan Günnemann. In Proceedings of the 42nd International Conference on Machine Learning, 2025.
State space models are emerging as a dominant model class for sequence problems, with many relying on the HiPPO framework to initialize their dynamics. However, HiPPO fundamentally assumes data to be noise-free, an assumption often violated in practice. We extend the HiPPO theory with measurement noise and derive an uncertainty-aware initialization for state space model dynamics. In our analysis, we interpret HiPPO as a linear stochastic control problem where the data enter as a noise-free control signal. We then reformulate the problem so that the data become noisy outputs of a latent system and arrive at an alternative dynamics initialization that infers the posterior of this latent system from the data without increasing runtime. Our experiments show that our initialization improves the resistance of state space models to noise both at training and inference time.
- Benchmarking Contextual Understanding for In-Car Conversational Systems. Philipp Habicht, Lev Sorokin, Abdullah Saydemir, and 2 more authors. 2025.
In-Car Conversational Question Answering (ConvQA) systems significantly enhance user experience by enabling seamless voice interactions. However, assessing their accuracy and reliability remains a challenge. This paper explores the use of Large Language Models (LLMs) alongside advanced prompting techniques and agent-based methods to evaluate the extent to which ConvQA system responses adhere to user utterances. The focus lies on contextual understanding, that is, the ability to provide accurate venue recommendations considering the user constraints and situational context. To evaluate the utterance/response coherence using an LLM, we synthetically generate user utterances accompanied by both correct system responses and modified, failure-containing ones. We use input-output, chain-of-thought, and self-consistency prompting, as well as multi-agent prompting techniques, with 13 reasoning and non-reasoning LLMs, varying in model size and provider, from OpenAI, DeepSeek, Mistral AI, and Meta. We evaluate our approach on a case study that involves a user asking for restaurant recommendations. The most substantial improvements are observed for non-reasoning models when applying advanced prompting techniques, in particular, when applying multi-agent prompting. However, non-reasoning models are significantly surpassed by reasoning models, where the best result is achieved with single-agent prompting incorporating self-consistency. Notably, the DeepSeek-R1 model achieves the highest F1-score of 0.990 at a cost of 0.002 USD per request. Overall, the best tradeoff between effectiveness and cost/time efficiency is achieved with the non-reasoning model DeepSeek-V3. Our results demonstrate that LLM-based evaluations offer a scalable and accurate alternative to traditional human-based evaluations for benchmarking contextual understanding in ConvQA systems.
2024
- Unfolding Time: Generative Modeling for Turbulent Flows in 4D. Abdullah Saydemir, Marten Lienen, and Stephan Günnemann. In AI for Science Workshop at the 41st International Conference on Machine Learning, 2024.
A recent study in turbulent flow simulation demonstrated the potential of generative diffusion models for fast 3D surrogate modeling. This approach eliminates the need for specifying initial states or performing lengthy simulations, significantly accelerating the process. While adept at sampling individual frames from the learned manifold of turbulent flow states, the previous model lacks the capability to generate sequences, hindering analysis of dynamic phenomena. This work addresses this limitation by introducing a 4D generative diffusion model and a physics-informed guidance technique that enables the generation of realistic sequences of flow states. Our findings indicate that the proposed method can successfully sample entire subsequences from the turbulent manifold, even though generalizing from individual frames to sequences remains a challenging task. This advancement opens doors for the application of generative modeling in analyzing the temporal evolution of turbulent flows, providing valuable insights into their complex dynamics.
2023
- Genetic algorithms and heuristics hybridized for software architecture recovery. Milad Elyasi, M Esad Simitcioğlu, Abdullah Saydemir, and 3 more authors. Automated Software Engineering, 2023.
Large-scale software systems must be decomposed into modular units to reduce maintenance efforts. Software Architecture Recovery (SAR) approaches have been introduced to analyze dependencies among software modules and automatically cluster them to achieve high modularity. These approaches employ various types of algorithms for clustering software modules. In this paper, we discuss design decisions and variations in existing genetic algorithms devised for SAR. We present a novel hybrid genetic algorithm that introduces three major differences with respect to these algorithms. First, it employs a greedy heuristic algorithm to automatically determine the number of clusters and enrich the initial population that is generated randomly. Second, it uses a different solution representation that facilitates an arithmetic crossover operator. Third, it is hybridized with a heuristic that improves solutions in each iteration. We present an empirical evaluation with seven real systems as experimental objects. We compare the effectiveness of our algorithm with respect to a baseline and state-of-the-art hybrid genetic algorithms. Our algorithm outperforms others in maximizing the modularity of the obtained clusters.
2022
- HYGAR: a hybrid genetic algorithm for software architecture recovery. Milad Elyasi, Muhammed Esad Simitcioğlu, Abdullah Saydemir, and 2 more authors. In Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing, 2022.
Genetic algorithms have been used for clustering modules of a software system in line with the modularity principle. The goal of these algorithms is to recover an architectural view in the form of a modular structural decomposition of the system. We discuss design decisions and variations in existing genetic algorithms devised for this purpose. We introduce HYGAR, a novel hybrid variant of existing algorithms. We apply HYGAR for software architecture recovery of 5 real systems and compare its effectiveness with respect to a baseline and a state-of-the-art hybrid algorithm. Results show that HYGAR outperforms these algorithms in maximizing the modularity of the obtained clustering.
2021
- On the Use of Evolutionary Coupling for Software Architecture Recovery. Abdullah Saydemir, Muhammed Esad Simitcioğlu, and Hasan Sözer. In 15th Turkish National Software Engineering Symposium (UYMS), 2021.
Software architecture documentation can be partially obtained automatically by means of software architecture recovery tools. These tools mainly cluster software modules to provide a high-level structural organization of these modules. They use dependency graphs as input. These graphs reflect various types of coupling among software modules. In this paper, we present an empirical evaluation of using evolutionary coupling as a complementary source of information for software architecture recovery. We use 3 open-source projects as subject systems. We derive inter-module dependencies for these systems based on various levels of evolutionary coupling among their modules. We investigate the accuracy of software architecture recovery when input dependency graphs are extended with these additional dependencies. Results show that involving evolutionary coupling in the process can increase the accuracy of architecture recovery by up to 40%.