Student Projects & Theses
We have several opportunities for students to work with us on the latest research in Reinforcement Learning.
The projects are listed below, and you are also encouraged to propose your own topic and collaborate with us.
If you are interested, please email us at nrprojects@informatik.uni-freiburg.de
Current Project Openings
If none of the listed projects are of interest, feel free to contact us about new projects via the address mentioned above! Please also include your fields of interest and prior knowledge, as this helps us find a suitable supervisor.
Title | Type | Supervisor
Reinforcement Learning for RoboCup Soccer Keep Away | Master's project | Lisa Graf
Reinforcement Learning for RoboCup Soccer Keep Away
Position type: Master's project
Description: Unlike other RoboCup leagues, the Small Size League (SSL) has seen limited application of reinforcement learning (RL) techniques. This project aims to explore the potential benefits of utilizing RL methods in the SSL by focusing on a specific task: a "keep-away" scenario. We propose to demonstrate how state-of-the-art RL algorithms can be effectively employed to enhance team performance in dynamic, real-time environments.
Your task is to set up a simulation environment and train an agent on a keep-away task similar to the one in the paper referenced below. For more information, please refer to the attached PDF.
Contact: Lisa Graf
More information: PDF
References: Stone, P., Sutton, R. S., Singh, S. (2001). Reinforcement Learning for 3 vs. 2 Keepaway. In: Stone, P., Balch, T., Kraetzschmar, G. (eds) RoboCup 2000: Robot Soccer World Cup IV. Lecture Notes in Computer Science, vol 2019. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45324-5_23
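As a concrete starting point, here is a minimal training sketch using Stable-Baselines3 PPO. The environment id "KeepAway3v2-v0" is a hypothetical placeholder for the keep-away simulation you would set up (e.g. on top of an SSL simulator), not an existing package.

```python
import gymnasium as gym
from stable_baselines3 import PPO

def main():
    # Hypothetical keep-away environment: 3 keepers vs. 2 takers, with
    # observations of relative player/ball positions and a pass/hold/move
    # action set, as in Stone et al. (2001).
    env = gym.make("KeepAway3v2-v0")

    # PPO is a reasonable first baseline; any state-of-the-art algorithm
    # could be swapped in.
    model = PPO("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=1_000_000)
    model.save("keepaway_ppo")

if __name__ == "__main__":
    main()
```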
Integration of process models (biological models of forests) into neural networks
Position type: Project
Description: When studying complex phenomena in nature, a simplified version of a system is often used to describe the underlying effects and processes. In forest science, this is done through process models (PMs), which combine empirical measurements with theoretical understanding of the underlying processes. However, PMs may represent complex phenomena incompletely or in an oversimplified way. Neural networks (NNs) can outperform PMs with more flexible representations when large data sets are available. For small and sparse datasets, NNs can benefit from the integration of domain knowledge such as PMs.
Specifically for time series, LSTMs have been successfully combined with prior knowledge.
In this project, we would like to investigate whether performance can be improved by combining new model architectures such as Mamba or Kolmogorov–Arnold Networks (KANs) with PMs. Several approaches could be explored (see the sketch after this list):
- Use Mamba or another new model architecture instead of the current NN: incorporate the PM into the loss function to help guide the NN's learning process.
- Combine the PM directly with the architecture of Mamba or of other new model types, e.g. for Mamba: integrate the PM into the Mamba architecture by using Mamba's gating mechanism to weigh the PM's predictions for the time-series prediction.
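As a rough illustration of the loss-function variant (first bullet), the sketch below treats the PM's prediction as a soft regularizer on the network output. `model`, `process_model`, and `lambda_pm` are hypothetical placeholders, not parts of an existing codebase.

```python
import torch
import torch.nn as nn

def pm_informed_loss(model, process_model, x, y, lambda_pm=0.1):
    """Data loss plus a penalty for deviating from the process model (PM)."""
    y_hat = model(x)                 # NN prediction
    with torch.no_grad():
        y_pm = process_model(x)      # PM prediction on the same inputs
    data_loss = nn.functional.mse_loss(y_hat, y)
    pm_loss = nn.functional.mse_loss(y_hat, y_pm)  # stay close to the PM
    return data_loss + lambda_pm * pm_loss
```

The same loss works unchanged whether `model` is an LSTM, a Mamba block, or a KAN; only the architecture behind `model(x)` changes.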
Contact: Hanne Raum
Previous Projects
Reinforcement Learning development based on an excavator
Position type: Project
Description: Experienced excavator operators can move the working arm along desired trajectories very precisely and quickly, so the demands on automation functions are correspondingly high.
In this project, the trajectory-tracking problem for the hydraulic cylinders of an excavator will be solved using an (offline) reinforcement learning approach. Both model-based and model-free approaches will be considered and compared. In the model-based approach, either a simplified physics-based or a data-driven model of the excavator is used as a proxy model for training the agent before deploying it on the real excavator. The model-free approach, on the other hand, directly uses the data obtained from the real excavator, without the need for an intermediate model. Important aspects to be addressed are the robustness and transferability of the approaches, since the ideal solution is robust to minor parameter changes and easily transferable to different excavators.
The project includes the following essential topics:
- Research the state of the art regarding the application of offline reinforcement learning, comprising both model-based and model-free concepts [1,2]
- Improve the robustness and efficiency of current MPC methods, based on the techniques in [3]
- If the method works well in simulation, test it on a real-world excavator in cooperation with Bosch Research
Contact: Yuan Zhang
References:
[1] A General Approach for the Automation of Hydraulic Excavator Arms Using Reinforcement Learning, https://ieeexplore.ieee.org/document/9743573
[2] Learning Excavation of Rigid Objects with Offline Reinforcement Learning, https://arxiv.org/abs/2303.16427
[3] Latent Linear Quadratic Regulator for Robotic Control Tasks, https://arxiv.org/abs/2407.11107v1
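To make the model-based variant from [1,2] concrete, the sketch below fits a one-step dynamics proxy model on logged (s, a, s', r) tuples; the trained model can then serve as a simulator for training the agent. All names are illustrative, not an existing codebase; the model-free alternative would instead train, e.g., a conservative Q-learning agent directly on the logged data.

```python
import torch
import torch.nn as nn

class DynamicsModel(nn.Module):
    """Predicts next state and reward from the current state and action."""
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim + 1),  # next state + reward
        )

    def forward(self, s, a):
        out = self.net(torch.cat([s, a], dim=-1))
        return out[..., :-1], out[..., -1]

def fit_dynamics(model, batches, epochs=100, lr=1e-3):
    """Supervised one-step prediction on logged excavator transitions."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for s, a, s_next, r in batches:
            pred_s, pred_r = model(s, a)
            loss = (nn.functional.mse_loss(pred_s, s_next)
                    + nn.functional.mse_loss(pred_r, r))
            opt.zero_grad()
            loss.backward()
            opt.step()
```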
Data Evaluation on the Intraoperative Heart-Lung Machine in Pediatric and Adult Cardiac Surgery
Description: The Heart Center at the University Hospital of Freiburg is conducting research to optimize heart-lung machine (HLM) therapy, which is crucial for peri- and post-operative patient care. Since 2022, more than 600 standardized intraoperative data sets have been collected and are available for evaluation. In collaboration with cardiovascular technology, pediatric cardiology, and informatics, we are looking for a student research assistant to analyze these data. Your role will involve on-site processing at the University Hospital, including attending heart surgery alongside specialists.
More information: German, English
Contact: Lisa Graf
Status: closed
Mitigating Extrapolation Error In Offline Inverse Reinforcement Learning
Description: Offline inverse reinforcement learning (IRL) aims to learn a reward function and its corresponding policy from previously collected expert demonstrations. Offline (deep) RL algorithms use neural networks to approximate the true value functions (Q(s,a) or V(s)) and are hence prone to extrapolation error when estimating the value of out-of-distribution states (states that are not seen in the demonstrations). To combat this, two common groups of strategies are employed in the offline RL literature: 1) restrict the policy to stay close to the demonstrations; 2) restrict the value function to assign lower values to out-of-distribution states. In the IRL setting, we have control over the reward function. In this project, we would like to investigate possible modifications to the reward function such that the derived value functions and policies behave well for out-of-distribution states. This can be done by applying the offline RL techniques used for restricting the policy or value function to the reward function, as well as by developing novel methods to achieve this goal.
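As a toy illustration of transferring such restrictions to the reward function, the sketch below regularizes a learned reward network so that out-of-distribution actions receive low reward, analogous to how conservative Q-learning pushes down out-of-distribution Q-values. All names (`reward_net`, `alpha`, the uniform action sampling) are hypothetical choices, not the project's prescribed method.

```python
import torch

def conservative_reward_loss(reward_net, s, a_expert, action_dim, alpha=1.0):
    """High reward on demonstrated actions, low reward elsewhere."""
    r_expert = reward_net(s, a_expert)
    # Sample random actions as cheap stand-ins for out-of-distribution ones.
    a_random = torch.rand(s.shape[0], action_dim) * 2 - 1  # uniform [-1, 1]
    r_random = reward_net(s, a_random)
    # Minimizing this raises reward on expert actions and lowers it on
    # random (likely out-of-distribution) actions.
    return -(r_expert.mean() - alpha * r_random.mean())
```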
Contact: Erfan Azad
Status: full
Context-aware Reinforcement Learning using Time-series Transformer
Description: Generalization to different tasks is a persistent challenge in reinforcement learning. Sometimes a small change in the environment can drastically degrade the performance of a learned policy. We therefore want to train an agent that can infer the context/environment change by itself and thus solve a set of similar tasks. Transformers have been widely used in the CV and NLP domains, and a few new models have recently been designed for time-series data. Since RL also frequently involves time-series data, using a Transformer model to infer the context of the environment is potentially beneficial.
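A minimal sketch of this idea, assuming PyTorch: a Transformer encoder summarizes a window of recent transitions into a context vector, which can then be concatenated to the current observation before it is fed to the policy. All dimensions and names are illustrative.

```python
import torch
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Infers a latent context from a window of (s, a, r) transitions."""
    def __init__(self, transition_dim, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(transition_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, history):
        # history: (batch, window, transition_dim)
        h = self.encoder(self.embed(history))
        return h.mean(dim=1)  # pooled context vector for the policy
```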
Contact: Baohe Zhang
Status: full
Benchmarking Constrained Reinforcement Learning Algorithms
Description: Constrained reinforcement learning was developed for solving tasks that have not only a reward function but also a set of constraints to satisfy. As a young field, it lacks a benchmark that compares the performance of algorithms in a scientific manner, so a new benchmark may be a good way to push the community forward. In this project, you would be expected to re-implement some constrained RL algorithms and design a new environment with a set of tasks on which to compare these algorithms.
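For orientation, one family of algorithms you would likely re-implement is Lagrangian-based: the constrained problem (maximize return subject to expected cost ≤ d) is relaxed with a learnable multiplier λ. A minimal sketch, with an assumed cost limit and illustrative update rules:

```python
import torch

log_lambda = torch.zeros(1, requires_grad=True)  # exp() keeps lambda >= 0
lambda_opt = torch.optim.Adam([log_lambda], lr=1e-3)
cost_limit = 25.0  # example constraint threshold d

def lagrangian_objective(ep_return, ep_cost):
    """Policy maximizes return minus lambda-weighted constraint violation."""
    lam = log_lambda.exp().detach()
    return ep_return - lam * (ep_cost - cost_limit)

def update_lambda(mean_ep_cost):
    """Raise lambda when the constraint is violated, lower it otherwise."""
    loss = -log_lambda.exp() * (mean_ep_cost - cost_limit)
    lambda_opt.zero_grad()
    loss.backward()
    lambda_opt.step()
```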
Contact: Baohe Zhang
Status: full
Vision Transformers for efficient policy learning
Description: Learning strategies from raw videos is often infeasible in real-world robotics, as current approaches require large amounts of training data. Extracting object keypoints can make training significantly faster, unlocking a plethora of interesting tasks. However, current keypoint methods require specialized pretraining.
Using vision transformers can remove the need for specialized training and thus make the technique widely available.
In this project, the student(s) first evaluate the keypoint quality of state-of-the-art methods and then extend the technique to more challenging situations.
Hands-on policy learning on a real robot is possible and encouraged.
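As a hedged starting point for the evaluation step, the self-attention maps of a pretrained DINO vision transformer can serve as cheap keypoint candidates. The torch.hub entry point and get_last_selfattention() come from the facebookresearch/dino repository; the thresholding heuristic below is purely illustrative.

```python
import torch

# Pretrained DINO ViT-Small with 8x8 patches, loaded from the official repo.
model = torch.hub.load('facebookresearch/dino:main', 'dino_vits8')
model.eval()

img = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed video frame
with torch.no_grad():
    attn = model.get_last_selfattention(img)  # (1, heads, tokens, tokens)

# Attention of the [CLS] token over the 28x28 patch grid (224 / 8 = 28);
# peaks in this map are keypoint candidates.
cls_attn = attn[0, :, 0, 1:].mean(0).reshape(28, 28)
threshold = cls_attn.flatten().topk(10).values.min()
ys, xs = torch.where(cls_attn >= threshold)  # coarse keypoint locations
```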
Contact: Jan Ole von Hartz
Keypoints for efficient policy learning
Description: As in the project above, we use object keypoints to learn strategies more efficiently.
In this project, the student(s) combine object keypoints with the novel SAC-GMM algorithm for policy learning on a real robot.
Contact: Jan Ole von Hartz
Reinforcement Learning for Spatial Graph Design
Description: In this project, the development of a Reinforcement Learning agent for the design of spatial graphs is to be explored (more details).
Status: full
Monte Carlo Tree Search for Antibody Design
Description: In this project, we want to utilize Monte Carlo Tree Search methods for the design of antibodies in a simulation (more details).
Status: full
Uncertainty-driven offline model-based RL
Description: In this project, the development and usage of world models in combination with uncertainty estimation for offline reinforcement learning is to be explored (more details).
Status: full
Application of Recurrent Neural Network in Autonomous Driving
Description: The state observation in autonomous driving is sometimes noisy and only partially observed, which is challenging to solve with standard RL architectures. A recurrent neural network (RNN) is a simple and promising representation for handling this partial observability. In this project, students are encouraged to explore the use of RNNs in autonomous driving applications.
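A minimal sketch of the idea, assuming PyTorch: a GRU carries a hidden state across timesteps, so the agent can integrate noisy, partial observations over time instead of acting on a single frame. Dimensions and the actor head are illustrative placeholders.

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """Actor head on top of a GRU that accumulates observation history."""
    def __init__(self, obs_dim, action_dim, hidden=128):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hidden, batch_first=True)
        self.actor = nn.Linear(hidden, action_dim)

    def forward(self, obs_seq, h0=None):
        # obs_seq: (batch, time, obs_dim); h_n summarizes the history and is
        # carried over between environment steps at rollout time.
        out, h_n = self.gru(obs_seq, h0)
        return self.actor(out), h_n
```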
Status: full
Autoinflammatory Disease Treatment Recommendation
Description: In cooperation with the foundation Rhumatismes-Enfants-Suisse, we develop algorithms for autoinflammatory disease treatment recommendation. The project mainly focuses on unsupervised deep learning and, depending on the progress, on basic deep reinforcement learning (more details).
Contact: Maria Huegle
Status: full
High-Level Decision Making in Autonomous Driving
Description: We develop deep reinforcement learning algorithms for autonomous lane changes using the open-source traffic simulator SUMO. We focus on various aspects, for example mixed action spaces, constraints, and the inclusion of predictions of traffic participants.
Contact: Gabriel Kalweit and Maria Hügle
Status: full
Machine Learning for Disease Progression Prediction in Rheumatoid Arthritis
Description: In cooperation with the University Hospital in Lausanne, we develop algorithms to predict disease progression in arthritis based on the Swiss Clinical Quality Management (SCQM) database, including lab values, medication, clinical data, and patient-reported outcomes.
Contact: Maria Huegle
Status: full
Unsupervised Skill Learning from Video
Description: In his thesis, Markus Merklinger introduces a model to leverage information from multiple label-free demonstrations in order to yield a meaningful embedding for unseen tasks. A distance measure in the learned embedding space can then be used as a reward function within a reinforcement learning system.
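The reward construction can be written in a few lines; the sketch below assumes a learned `encoder` (a placeholder for the thesis model) and uses the negative Euclidean distance between the embeddings of the current observation and a demonstration goal frame.

```python
import torch

def embedding_reward(encoder, observation, goal_frame):
    """Reward = negative distance to the goal in the learned embedding space."""
    with torch.no_grad():
        z_obs = encoder(observation)
        z_goal = encoder(goal_frame)
    return -torch.norm(z_obs - z_goal, p=2).item()
```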
Contact: Oier Mees and Gabriel Kalweit
Unsupervised Learning for Early Seizure Detection
Description: In cooperation with the Epilepsy Center in Freiburg, we develop unsupervised learning algorithms to detect epileptic seizures based on intracranial EEG (ECoG) data.
Contact: Maria Huegle