Current courses

Open projects

Please contact me about open research internships and BSc and MSc thesis projects.

  • Federated Search (Data Science):
    Research approaches that combine the results from multiple independent, non-cooperative search engines (non-cooperative in the sense that they do not share their index).
    • NEW: Ranked federated search for the Clarin Virtual Language Observatory (VLO): Clarin is the European Research Infrastructure for Language Resources and Technology. The project should answer the question: “How to model ranking, and how does it improve the quality (and efficiency) of Clarin’s content search engine?”
  • Federated Learning (Data Science):
    Research approaches that divide machine learning over multiple independent and private data sources.
    • NEW: Federated learning for the Personal Health Train. Develop and evaluate machine learning approaches using data lakes of healthcare providers. The Personal Health Train provides FAIR data layers in which structured data is offered in a standard way. The data is available through federated queries and analysis. Goal: develop a federated machine learning approach using unstructured data, such as clinical notes entered by health practitioners. This project is done at the RUMC.
  • Conversational Search (Data Science):
    Can we use techniques from information retrieval or machine learning to improve open-source virtual assistants like Stanford’s Almond?
  • COVID-19 search:
    Design a search engine for researchers who work on cures and vaccines for COVID-19.
  • Ephemeral Social networking (Software Science):
    Based on the W3C standard ActivityPub, design an ephemeral social network (in which most posts are removed after some time) and compare its network, storage, memory, and CPU load with that of durable solutions like Mastodon.
  • Secure federated communication (Digital Security):
    Design/adapt an end-to-end encrypted solution for ActivityPub-based social networking: How to handle multiple devices and heterogeneous networks?
  • Transitioning the RU to self-hosted, federated solutions (Information Sciences):
    For instance, for self-hosted web analytics, social networking, or video streaming: What are the user requirements? What solutions meet these requirements? What are additional benefits (for instance, more autonomy for employees and students)? How can this be shown with a proof of concept?
  • With Nedap Healthcare, Groenlo (Machine Learning and Natural Language Processing):
    Clinical Natural Language Processing / De-identification of medical records.
  • With RUMC, Nedap Healthcare and Leiden University: MSc thesis project on Generating synthetic clinical data for shared Machine Learning tasks.

Past courses

Teaching information