
Student Projects & Theses

Openings for student projects and theses

 

We have several opportunities for students to work with us on the latest research in Reinforcement Learning.

The projects are listed below, but you are also encouraged to propose your own topic and collaborate with us.

If you are interested, please send an email to nrprojects@informatik.uni-freiburg.de with your interests, your transcript of records, and optionally your CV.

 

Current Project Openings

If none of the listed projects are of interest, feel free to contact us about new projects via the address mentioned above! Please also include your fields of interest and prior knowledge, as this helps us find a suitable supervisor.


There are currently no project openings.


Previous Projects

Reinforcement Learning for RoboCup Soccer Keep Away

Position type: Project

Description: Unlike other RoboCup leagues, the Small Size League (SSL) has seen limited application of reinforcement learning (RL) techniques. This project aims to explore the potential benefits of utilizing RL methods in the SSL by focusing on a specific task: a "keep-away" scenario. We propose to demonstrate how state-of-the-art RL algorithms can be effectively employed to enhance team performance in dynamic, real-time environments.

Your task is to set up a simulation environment and train a keep-away task similar to the one in the paper referenced below. For more information, please refer to the attached PDF.
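
As a starting point, the sketch below frames such a keep-away task in the Gymnasium API. All dynamics, dimensions, and the survival-time reward are illustrative assumptions; a real setup would replace the placeholder dynamics with the SSL simulator.

    # Minimal keep-away environment sketch (illustrative assumptions only).
    import numpy as np
    import gymnasium as gym
    from gymnasium import spaces

    class KeepAwayEnv(gym.Env):
        """3 keepers vs. 2 takers; reward is +1 per step the ball is kept."""

        def __init__(self, n_keepers=3, n_takers=2):
            super().__init__()
            self.n_agents = n_keepers + n_takers
            # Observation: normalized (x, y) of every robot plus the ball.
            self.observation_space = spaces.Box(-1.0, 1.0, shape=(2 * (self.n_agents + 1),))
            # Action: target velocity (vx, vy) for each keeper.
            self.action_space = spaces.Box(-1.0, 1.0, shape=(2 * n_keepers,))

        def reset(self, *, seed=None, options=None):
            super().reset(seed=seed)
            self.state = self.np_random.uniform(-1.0, 1.0, size=self.observation_space.shape)
            return self.state.astype(np.float32), {}

        def step(self, action):
            # Placeholder dynamics: a real setup would step the SSL simulator here.
            self.state = np.clip(self.state + 0.05 * self.np_random.normal(size=self.state.shape), -1.0, 1.0)
            lost_possession = self.np_random.random() < 0.01  # placeholder event
            reward = 0.0 if lost_possession else 1.0          # survival-time reward
            return self.state.astype(np.float32), reward, lost_possession, False, {}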

Contact: Lisa Graf

More information: PDF

References: Stone, P., Sutton, R. S., & Singh, S. (2001). Reinforcement Learning for 3 vs. 2 Keepaway. In Stone, P., Balch, T., & Kraetzschmar, G. (Eds.), RoboCup 2000: Robot Soccer World Cup IV. Lecture Notes in Computer Science, vol. 2019. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45324-5_23

Status: closed

Integration of process models (biological models of forests) into neural networks

Position type: Project

Description: When studying complex phenomena in nature, a simplified version of the system is often used to describe the underlying effects and processes. In forest science, this is done with process models (PMs), which combine empirical measurements with a theoretical understanding of the underlying processes. However, PMs may represent complex phenomena incompletely or in an oversimplified way. Neural networks (NNs), with their more flexible representations, can outperform PMs when large data sets are available; for small and sparse datasets, NNs can benefit from the integration of domain knowledge such as PMs.
Specifically for time series, LSTMs have been successfully combined with prior knowledge.
In this project, we would like to investigate whether we can improve performance by combining new model architectures such as Mamba or Kolmogorov–Arnold Networks (KAN) with PMs. Several approaches could be explored:

  • Use Mamba or other new model architectures in place of the current NN, combining the PM into the loss function to guide the NN's learning process (see the sketch after this list).

  • Combine the PM directly into the architecture of Mamba or of other new model types; e.g., for Mamba, integrate the PM by using Mamba's gating mechanism to weigh the PM's contribution to the time-series prediction.
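
As a rough illustration of the loss-based variant, the sketch below adds a PM-consistency term to a standard regression loss. Here, process_model is a hypothetical stand-in for the forest PM, model could be an LSTM, Mamba, or KAN backbone, and the weighting lam is an assumed hyperparameter.

    # PM-regularized loss sketch; all names are illustrative assumptions.
    import torch
    import torch.nn.functional as F

    def pm_regularized_loss(model, process_model, x, y, lam=0.1):
        """Data-fit loss plus a penalty for deviating from the process model."""
        y_hat = model(x)                   # NN prediction for the time series
        with torch.no_grad():
            y_pm = process_model(x)        # PM prediction acts as a prior
        data_loss = F.mse_loss(y_hat, y)
        pm_loss = F.mse_loss(y_hat, y_pm)  # pull the NN toward the PM
        return data_loss + lam * pm_loss   # lam trades data fit against the prior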

Contact: Hanne Raum

 

Reinforcement Learning development based on an excavator 

Position type: Project

Description: Experienced excavator operators can move the working arm along desired trajectories very precisely and quickly, so the demands on automation functions are correspondingly high.

In this project, the trajectory-tracking problem for the hydraulic cylinders of an excavator will be solved using an (offline) reinforcement learning approach. Both model-based and model-free approaches will be considered and compared. In the model-based setting, either a simplified physics-based or a data-driven model of the excavator serves as a proxy model for training the agent before it is deployed on the real excavator. The model-free setting, on the other hand, directly uses the data obtained from the real excavator, without an intermediate model. Important aspects to be addressed are the robustness and transferability of the approaches, since the ideal solution is robust to minor parameter changes and easily transferable to different excavators.

The project includes the following essential topics (a minimal sketch of a model-free offline update follows the list):

  • Research the state of the art in the application of offline reinforcement learning, comprising both model-based and model-free concepts [1,2]
  • Improve the robustness and efficiency of current MPC methods, based on the techniques in [3]
  • If the method works well in simulation, test it on a real-world excavator in cooperation with Bosch Research
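
For intuition, the sketch below shows one common model-free offline recipe, a TD3+BC-style update in which a behavior-cloning term keeps the policy close to the logged excavator data. Network definitions, tensor shapes, and hyperparameters are illustrative assumptions.

    # TD3+BC-style offline update on logged transitions (s, a, r, s'); sketch only.
    import torch
    import torch.nn.functional as F

    def offline_update(actor, critic, target_critic, batch, alpha=2.5, gamma=0.99):
        s, a, r, s2 = batch                    # sampled from the fixed dataset
        with torch.no_grad():
            target_q = r + gamma * target_critic(s2, actor(s2))
        critic_loss = F.mse_loss(critic(s, a), target_q)
        pi = actor(s)
        q = critic(s, pi)
        lam = alpha / q.abs().mean().detach()  # adaptive trade-off (TD3+BC)
        # The behavior-cloning term regularizes the policy toward the data,
        # mitigating extrapolation beyond the logged trajectories.
        actor_loss = -lam * q.mean() + F.mse_loss(pi, a)
        return critic_loss, actor_loss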

 

Contact: Yuan Zhang

References:

[1] A General Approach for the Automation of Hydraulic Excavator Arms Using Reinforcement Learning, https://ieeexplore.ieee.org/document/9743573

[2] Learning Excavation of Rigid Objects with Offline Reinforcement Learning, https://arxiv.org/abs/2303.16427

[3] Latent Linear Quadratic Regulator for Robotic Control Tasks, https://arxiv.org/abs/2407.11107v1

 

Data Evaluation on the Intraoperative Heart-Lung Machine in Pediatric and Adult Cardiac Surgery

Description: The Heart Center at the University Hospital of Freiburg is conducting research to optimize heart-lung machine (HLM) therapy, which is crucial for peri- and post-operative patient care. Since 2022, more than 600 standardized intraoperative data sets have become available for evaluation. In collaboration with cardiovascular technology, pediatric cardiology, and informatics, we are looking for a student research assistant to analyze these data. Your role will involve on-site processing at the University Hospital, including attending heart surgery alongside specialists.

More information: German, English

Contact: Lisa Graf

Status: closed

 

Mitigating Extrapolation Error In Offline Inverse Reinforcement Learning

Description: Offline Inverse Reinforcement Learning aims to learn a reward function and its corresponding policy from previously collected expert demonstrations. Offline (deep) RL algorithms use neural networks to approximate the true value functions (Q(s,a) or V(s)) and are hence prone to extrapolation error when estimating the value of out-of-distribution states (states not seen in the demonstrations). To combat this, two common groups of strategies are employed in the offline RL literature: 1) restrict the policy to stay close to the demonstrations; 2) restrict the value function to assign lower values to out-of-distribution states. In the inverse reinforcement learning setting, we have control over the reward function. In this project, we would like to investigate possible modifications to the reward function such that the derived value functions and policies behave well for out-of-distribution states. This can be done by applying the offline RL techniques used for restricting the policy or value function to the reward function instead, as well as by developing novel methods to achieve this goal.
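
One possible direction, sketched below under our own assumptions, is to learn an ensemble of reward networks and penalize their disagreement: state-action pairs far from the demonstrations receive pessimistic rewards, which in turn lowers the value estimates derived for them.

    # Ensemble-disagreement reward penalty; a sketch, not an established method.
    import torch

    def penalized_reward(reward_nets, s, a, beta=1.0):
        """Mean ensemble reward minus a disagreement penalty.

        High disagreement between ensemble members signals (s, a) pairs far
        from the demonstrations, so their reward is pushed down.
        """
        rs = torch.stack([r(s, a) for r in reward_nets])  # (ensemble, batch)
        return rs.mean(dim=0) - beta * rs.std(dim=0)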

Contact: Erfan Azad

Status: full

Context-aware Reinforcement Learning using Time-series Transformer

Description: Generalization to different tasks remains a challenge for reinforcement learning: sometimes a small change in the environment can drastically degrade the performance of a learned policy. We therefore want to train an agent that can infer the context/environment change by itself and thus solve a set of similar tasks. Transformers have been widely used in the CV and NLP domains, and recently a few new models have been designed specifically for time-series data. Since RL also deals with time-series data, it is potentially beneficial to use a Transformer model to infer the context of the environment.
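
A rough sketch of this idea, with illustrative dimensions and no commitment to a specific architecture, is a Transformer encoder that maps the recent interaction history to a context vector consumed by the policy:

    # Context inference from the recent (s, a, r) history; sketch only.
    import torch
    import torch.nn as nn

    class ContextEncoder(nn.Module):
        def __init__(self, obs_dim, act_dim, d_model=64, n_layers=2):
            super().__init__()
            self.embed = nn.Linear(obs_dim + act_dim + 1, d_model)  # (s, a, r)
            layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, n_layers)

        def forward(self, history):
            """history: (batch, T, obs_dim + act_dim + 1) of past transitions."""
            z = self.encoder(self.embed(history))
            return z.mean(dim=1)  # pooled context vector, concatenated to the state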

Contact: Baohe Zhang

Status: full

 

Benchmarking Constrained Reinforcement Learning Algorithms

Description: Constrained Reinforcement Learning addresses tasks that come not only with a reward function but also with a set of constraints to satisfy. As a young field, it does not yet have a benchmark that compares the performance of its algorithms in a scientific manner, so a new benchmark could help push the community forward. In this project, you would be expected to re-implement several constrained RL algorithms and design a new environment with a set of tasks on which to compare them.
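
For orientation, the sketch below shows the Lagrangian-relaxation update that many constrained RL algorithms share (for instance, PPO-Lagrangian variants); it is a generic illustration, not one of the algorithms to be benchmarked.

    # Lagrangian-relaxation building block for constrained RL; sketch only.
    import torch

    log_lam = torch.zeros(1, requires_grad=True)   # log Lagrange multiplier
    lam_optimizer = torch.optim.Adam([log_lam], lr=1e-3)

    def lagrangian_losses(reward_return, cost_return, cost_limit):
        lam = log_lam.exp().detach()
        # The policy maximizes reward minus the cost weighted by lam.
        policy_objective = (reward_return - lam * cost_return).mean()
        # Descending on lam_loss raises lam while the constraint is violated
        # (average cost above the limit) and lowers it otherwise.
        lam_loss = -(log_lam.exp() * (cost_return.detach().mean() - cost_limit))
        return policy_objective, lam_loss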

Contact: Baohe Zhang

Status: full

 

Vision Transformers for efficient policy learning

Description: Learning strategies from raw videos is often infeasible in real-world robotics, as current approaches require large amounts of training data. Extracting object keypoints can make training significantly faster, unlocking a plethora of interesting tasks; however, current keypoint extractors require specialized pretraining.
Using vision transformers can remove the need for this specialized training and thus make the technique widely available.
In this project, the student(s) first evaluate the keypoint quality of state-of-the-art methods and then extend the technique to more challenging situations.
Hands-on policy learning on a real robot is possible and encouraged.

Contact: Jan Ole von Hartz

Keypoints for efficient policy learning

Description: As in the project above, we use object keypoints to learn strategies more efficiently.
In this project, the student(s) combine object keypoints with the novel SAC-GMM algorithm for policy learning on a real robot.

Contact: Jan Ole von Hartz

 

Reinforcement Learning for Spatial Graph Design

 

Description: In this project, the development of a Reinforcement Learning agent for the design of spatial graphs is to be explored (more details).

 

Status: full

 

Monte Carlo Tree Search for Antibody Design

 

Description: In this project, we want to utilize Monte Carlo Tree Search methods for the design of antibodies in a simulation (more details).

Status: full

Uncertainty-driven offline model-based RL

Description: In this project, the development and usage of world models in combination with uncertainty estimation for offline reinforcement learning is to be explored (more details).

Status: full

Application of Recurrent Neural Network in Autonomous Driving

Description: In autonomous driving, the state observation is often noisy and only partially observed, which is challenging to handle with standard RL architectures. A recurrent neural network (RNN) is a simple and promising representation for dealing with this partial observability. In this project, students are encouraged to explore the use of RNNs in autonomous driving applications.
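
As a starting point, a recurrent policy can be as simple as the sketch below, which carries a hidden state across time steps to integrate noisy, partial observations; names and dimensions are illustrative.

    # GRU-based recurrent policy sketch; names and sizes are illustrative.
    import torch
    import torch.nn as nn

    class RecurrentPolicy(nn.Module):
        def __init__(self, obs_dim, act_dim, hidden=128):
            super().__init__()
            self.gru = nn.GRU(obs_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, act_dim)

        def forward(self, obs_seq, h=None):
            """obs_seq: (batch, T, obs_dim); h: previous hidden state or None."""
            out, h = self.gru(obs_seq, h)
            return self.head(out), h  # per-step actions and updated memory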

Status: full

Autoinflammatory Disease Treatment Recommendation

Description: In cooperation with the foundation Rhumatismes-Enfants-Suisse, we develop algorithms for autoinflammatory disease treatment recommendation. The project mainly focuses on unsupervised deep learning and, depending on progress, on basic deep reinforcement learning (more details).

Contact: Maria Huegle

Status: full

High-Level Decision Making in Autonomous Driving

Description: We develop deep reinforcement learning algorithms for autonomous lane changes using the open-source traffic simulator SUMO. We focus on various aspects, for example mixed action spaces, constraints, and incorporating predictions of traffic participants.

Contact: Gabriel Kalweit and Maria Hügle

Status: full

Machine Learning for Disease Progression Prediction in Rheumatoid Arthritis

Description: In cooperation with the University Hospital in Lausanne, we develop algorithms to predict disease progression in arthritis based on the Swiss Quality Management (SCQM) database, including lab values, medication, clinical data, and patient-reported outcomes.

Contact: Maria Huegle

Status: full

Unsupervised Skill Learning from Video

Description: In his thesis, Markus Merklinger introduces a model that leverages information from multiple label-free demonstrations in order to yield a meaningful embedding for unseen tasks. A distance measure in the learned embedding space can then be used as a reward function within a reinforcement learning system.

Contact: Oier Mees and Gabriel Kalweit

Unsupervised Learning for Early Seizure Detection

Description: In cooperation with the Epilepsy Center in Freiburg, we develop unsupervised learning algorithms to detect epileptic seizures based on intracranial EEG (ECoG) data.

Contact: Maria Huegle