PhD vacancy for software correctness

We’re hiring a PhD Candidate for Software Correctness. The work will target the correctness of high-level programming languages that are “only” strings in your host language, such as SQL and regular expressions.

Software has shaped almost every aspect of our modern lives. Ensuring that software is correct is both a major scientific challenge and an enterprise with enormous social relevance. Would you like to examine possibilities to introduce a theory of correctness levels for software? Then you have a part to play as a PhD Candidate.

The correctness of software is of major importance in computer science. Unfortunately, the significance of software correctness is not always clear. Furthermore, the automatic checking of software correctness is difficult. This leads to problems during system development projects and during the grading of software exercises.

This PhD candidate position is intended for four years. You understand the importance of correct software and know how to work with several meanings of software correctness. Your goal is to introduce a generic and formal theory of software correctness levels, in which partially correct/incorrect software can be handled in a flexible way.

You will put the generic theory into practice by experimenting with automatic grading of software exercises in the context of our courses. One application will be the automatic grading of SQL statements. You will be supervised by Patrick van Bommel and Djoerd Hiemstra.

Profile:

  • You are an enthusiastic and motivated researcher.
  • You should have a Master’s degree in computer science, or a Master’s degree in mathematics and a demonstrable interest in computer science.

[Apply On-line]

(Deadline: 6 March 2022)

Vacancy: PhD Candidate for Fairness and Non-discrimination in Machine Learning for Retrieval and Recommendation

Information retrieval and recommender systems based on machine learning can be used to make decisions about people. Government agencies can use such systems to detect welfare fraud, insurers can use them to predict risks and to set insurance premiums, and companies can use them to select the best people from a list of job applicants. Such systems can lead to more efficiency, and could improve our society in many ways. However, such AI-driven decision-making also brings risks. This project focuses on the risk that such AI systems lead to illegal discrimination, for instance harming people of a certain ethnicity, or other types of unfairness. A different type of unfairness could concern, for instance, a system that reinforces financial inequality in society. Recent machine learning work on measures of fairness has resulted in several competing approaches for measuring fairness. There is no consensus on what is the best way to measure fairness and the measures often depend on the type of machine learning that is applied. Based on the application of existing measures on real-world data, we suspect that many proposed measures are not that helpful in practice. In this project, you will study measures of fairness, answering questions such as the following. To what extent can legal non-discrimination norms be translated into fairness measures for machine learning? Can we measure fairness independently of the machine learning approach? Can we show which machine learning methods are the most appropriate to achieve non-discrimination and fairness? The project concerns primarily machine learning for information retrieval and recommendation, but is interdisciplinary, as it is also informed by legal norms. The project will be supervised by Professor Hiemstra, professor of data science and federated search, and Professor Zuiderveen Borgesius, professor of ICT and law.
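To give a feel for why fairness measures can compete, here is a minimal Python sketch of two commonly cited measures: demographic parity difference and equal opportunity difference. It is only an illustration under simplifying assumptions (binary decisions, two groups, at least one positive example per group). The function names and the toy data are hypothetical and not taken from the project.

```python
def positive_rate(decisions):
    """Fraction of decisions that are positive (1)."""
    return sum(decisions) / len(decisions)


def demographic_parity_difference(decisions, groups):
    """Gap between groups in the rate of positive decisions."""
    rates = [
        positive_rate([d for d, g in zip(decisions, groups) if g == group])
        for group in set(groups)
    ]
    return max(rates) - min(rates)


def equal_opportunity_difference(decisions, labels, groups):
    """Gap between groups in the true positive rate.

    Assumes every group has at least one example with label 1.
    """
    rates = [
        positive_rate(
            [d for d, y, g in zip(decisions, labels, groups) if g == group and y == 1]
        )
        for group in set(groups)
    ]
    return max(rates) - min(rates)


# Toy data (hypothetical): the decisions match the true labels exactly.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 0, 0, 1, 0, 0, 0]
decisions = [1, 1, 0, 0, 1, 0, 0, 0]

dp_gap = demographic_parity_difference(decisions, groups)
eo_gap = equal_opportunity_difference(decisions, labels, groups)
print(dp_gap, eo_gap)
```

On this toy data the two measures disagree: the positive-decision rates differ between the groups, while the true positive rates are identical. Which of the two is the more appropriate reading is the kind of question the project asks.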


  • You hold a completed Master’s Degree or Research Master’s degree in computer science, data science, machine learning, artificial intelligence, or a related discipline.
  • You have good programming skills.
  • You have good command of spoken and written English.
  • We encourage you to apply even if you think you do not meet all the requirements.

More information at:

Professor positions in Machine Learning for Data Science

The Data Science section of the Radboud University seeks to appoint an Assistant Professor and an Associate Professor in Machine Learning for Data Science. Deadline: 31 March.

To strengthen and expand the Data Science section’s research, we seek to appoint an Assistant Professor and an Associate Professor in Machine Learning for Data Science. These positions will also be pivotal for supporting our Bachelor’s programme and our Data Science Master’s specialisations, in particular for Master’s courses that attract many students. The main goal of Machine Learning for Data Science is to develop machine learning approaches and techniques of broader applicability outside a specific application domain. Machine Learning for Data Science involves the study, development and application of machine learning techniques in order to tackle real-life problems involving challenging learning tasks and/or types of data.

[More information]

PhD candidate vacancy: Transfer Learning for Federated Search

We are looking for a PhD candidate to join the Data Science group at Radboud University for an exciting new project on transfer learning for language modelling with an application for federated search. Transfer learning learns general purpose language models from huge datasets, such as web crawls, and then trains the models further on smaller datasets for a specific task. Transfer learning in NLP has successfully used pre-trained word-embeddings for several tasks. Although the success of word embeddings on search tasks has been limited, recently pre-trained general purpose language representations such as BERT and ELMo have been successful on several search tasks, including question answering tasks and conversational search tasks. Resource descriptions in federated search consist of samples of the full data that are sparser than full resource representations. This raises the question of how to infer vocabulary that is missing from the sampled data. A promising approach comes from transfer learning from pre-trained language representations. An open question is how to effectively and efficiently apply those pre-trained representations and how to adapt them to the domain of federated search. In this project, you will use pre-trained language models, and further train those models for a (federated) search task. You will evaluate the quality of those models as part of international evaluation conferences like the Text Retrieval Conference (TREC) and the Conference and Labs of the Evaluation Forum (CLEF).

[more information]

PhD position data-driven maintenance optimization

The Data Management and Biometrics group and the Formal Methods & Tools group at the University of Twente seek a PhD candidate for SEQUOIA: Smart maintenance optimization via big data & fault tree analysis, a project funded by NWO Applied and Engineering Sciences and the companies ProRail and NS. ProRail is responsible for the Dutch railway network, including its construction, management, maintenance, and safety; NS has the same responsibility for the Dutch train fleet. The project is led by Mariëlle Stoelinga, Joost-Pieter Katoen and Djoerd Hiemstra.

SEQUOIA aims to improve the reliability of the Dutch railroads by deploying big data analytics to predict and prevent failures. Its scientific core is a novel combination of machine learning, fault tree analysis and stochastic model checking. The key idea is that big data analytics provide the statistics on failures, their correlations, dependencies, etc., and fault trees provide the domain knowledge needed to interpret these data. The project aims to develop explainable machine learning techniques that discover causal relations instead of statistical correlations; machine learning of fault trees or of other models that are normally designed top-down by domain experts. The techniques should help ProRail to decrease train disruptions and delays, to lower maintenance cost, and to increase passenger comfort.

The project involves intense cooperation with ProRail and RWTH Aachen University. The PhD candidate will spend a portion of their time at ProRail. Key project deliverables are efficient analysis algorithms and a workable tool to be used in the ProRail context. For more information, see:!/phd-position-sequoia/134206

Job Vacancy: Scientific Programmer

Scientific programmer: folktale search and visualisation

The FACT project will investigate new possibilities for humanities researchers (folktale researchers, narratologists, documentalists, etc.) to study folktales based on annotations and relations that have been automatically assigned using data-driven methods. The Dutch Folktale Database (Nederlandse Volksverhalenbank) of the Meertens Institute is a very large and varied collection of Dutch Folktales. Within FACT, software will be developed to automatically enrich the folktales in this collection with metadata such as names, keywords, genre, a summary and type. An additional research goal is to investigate if automatic analysis of the folktale collection can reveal relations between folktales that are difficult to discover through human inspection. The annotation and clustering methods to be developed will be integrated in a user-friendly XML-based platform for the annotation and exploration of folktales, to support research on the variability of human oral and written transmission.

The University of Twente has vacancies for a PhD-student, a postdoc and a scientific programmer, who will be working together as a team to achieve the project goals. In addition there will be close cooperation with the Tunes & Tales project (funded under the Computational Humanities programme of KNAW) that is aimed at investigating sequences of motifs in, and variability of, melodies and folktales in oral transmission.

The scientific programmer will work on the development of user-friendly tools for folktale researchers that incorporate the annotation and clustering techniques developed by the postdoc and the PhD student. The annotation tool should allow for (semi) automatic annotation of folktales with language, genre, keywords, names, summary and type. The visualization tool should enable easy inspection of document clusters. In addition, the programmer will develop an XML-based search system that allows the general public to search for folktales in the Folktale Database based on their annotations.

Apply on-line (Deadline: 1 November 2011)

PhD position: Deep Web Entity Monitoring

The Database Group of the University of Twente offers a PhD student position in the Dutch national project COMMIT, a 100M Euro project involving 10 universities and 70 companies. The program brings together leading researchers in search engines, parallel computing, databases, interaction in context, embedded systems and knowledge technology.

A large part of the web, the invisible web or deep web, cannot be indexed by web crawlers, for instance dynamic web pages that are returned in response to filling in a web form, or performing a search in a search engine. Instead of crawling deep web data, the approach will monitor web pages for certain (types of) queries. The objective is to develop approaches for monitoring web data that allow users to see a page's full history of relevant/important changes by identifying entities: people, organizations, products, geographic locations, events, etc. The approach should relate changes in multiple web sites, giving the user a data-warehouse-like overview of the pages they monitor; drilling down to time periods, persons, events, etc.

The research will be done in co-operation with WCC. WCC, started in 1996, is a successful software company based in Utrecht (NL) and Reston (USA). WCC's current focus areas are the Employment and Identification Security markets. Both commercial and government customers worldwide use WCC's smart search & match solutions to support their primary processes. Both WCC and the Database Group of the University of Twente have made significant advances in entity matching and entity ranking applied to, for instance, Employment Matching and Expert Search. This project will extend this work to monitoring of deep web pages, such as social networking sites, micro-blogging sites, job sites, etc. The candidate will spend part of the time at WCC in Utrecht.

[official vacancy text] (deadline: July 3rd, 2011)

PhD-position: semantic linking of multimedia content

The digital library of the future will be a dynamic and highly networked entity, consisting of both the original documents and user-generated annotations and links to and from external resources. Among other things, the Human Media Interaction (HMI) group of the University of Twente investigates the possibilities for multimedia content analysis and information linking to support and provide facilities for navigating and exploring digital libraries with content in a variety of formats including text, audio, images and video. There is funding available for a PhD position starting from January 2010.

The PhD research will be carried out in the context of AXES, a multidisciplinary research project funded by the EU (FP7, Digital Libraries). The research will focus on deploying diverse, automatically generated, time-labeled annotations (for example, those coming from automatic speech recognition) for connecting heterogeneous data sources, and will be strongly evaluation-driven.

More information (deadline: 21 November)

Jobs: Three PhD student positions

Position: Distributed Information Retrieval

The Database Group of the University of Twente offers a job opening in the NWO Vidi Project “Distributed Information Retrieval by means of Keyword Auctions”. The project's aim is to distribute internet search functionality in such a way that communities of users and/or federations of small search systems provide search services in a collaborative way. Instead of getting all data to a centralized point and processing queries centrally, as is done by today's search systems, the project will distribute queries over many small autonomous search systems and process them locally. In this project, the PhD student will research a new approach to distribute search: distributed information retrieval by means of keyword auctions. Keyword auctions like Google's AdWords give advertisers the opportunity to provide targeted advertisements by bidding on specific keywords. Analogous to these keyword auctions, local search systems will bid for keywords at a central broker. They “pay” by serving queries for the broker. The broker will send queries to those local search systems that optimize the overall effectiveness of the system, i.e., local search systems that are willing to serve many queries, but also are able to provide high quality results. The PhD student will work within a small team of researchers that approaches the problem from three different angles: 1) modeling the local search system, including models for automatic bidding and multi-word keywords, 2) modeling the search broker's optimization using the bids, the quality of the answers, and click-through rates, and 3) integration of structured data typically available behind web forms of local search systems with text search.
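As a rough illustration of the brokering idea described above (not the project's actual model), the following Python sketch lets local search systems bid the number of queries they are willing to serve for a keyword. The broker routes each query to the bidder with the highest estimated quality that still has capacity left. The names `Bid`, `route`, `capacity` and `quality`, and the tie-breaking rule, are assumptions made only for this example.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    system: str     # name of a local search system (hypothetical)
    keyword: str    # keyword the system bids on
    capacity: int   # number of queries the system offers to serve ("payment")
    quality: float  # broker's estimate of result quality, between 0 and 1


def route(keyword, bids):
    """Pick a local search system for a single-keyword query.

    Returns the name of the chosen system, or None if nobody can serve it.
    Ties in quality are broken in favour of the larger remaining capacity.
    """
    candidates = [b for b in bids if b.keyword == keyword and b.capacity > 0]
    if not candidates:
        return None
    best = max(candidates, key=lambda b: (b.quality, b.capacity))
    best.capacity -= 1  # the system "pays" by serving this query
    return best.system


bids = [
    Bid(system="jazz-archive", keyword="jazz", capacity=1, quality=0.9),
    Bid(system="music-portal", keyword="jazz", capacity=5, quality=0.6),
]

first = route("jazz", bids)    # best quality wins while it has capacity
second = route("jazz", bids)   # falls back once that capacity is used up
missing = route("rock", bids)  # no bids for this keyword
print(first, second, missing)
```

A real broker would also have to handle multi-word keywords, learn quality from click-through data, and balance load over time, which are the three research angles listed above.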

See official announcement. (Deadline: 19 April 2009)

Two positions: PuppyIR, Information Retrieval for Children

The groups Human Media Interaction and Databases of the University of Twente offer two job openings in the European Project PuppyIR. Current Information Retrieval (IR) systems are designed for adults: they return information that is unsuitable for children, present information in lists that children find difficult to manage and make it difficult for children to ask for information. PuppyIR will create information search services that are tailored to the specific needs of children, giving children the opportunity to fully and safely exploit the power of the Internet. PuppyIR will develop new interaction paradigms to allow children to easily express their information need, to have results presented in an intuitive way and to engage children in system interaction. It will develop a set of Information Services: components to summarise textual and audiovisual content for children, to help children safely explore new information, to moderate information for children at different ages, to build new social networks and to intelligently aggregate and present information to children. PuppyIR will offer an open source platform that enables system designers to construct useful and usable information retrieval systems for children. The project will demonstrate the effectiveness of the PuppyIR modules through demonstrator systems constructed in collaboration with the Netherlands Public Library Association and the Emma Children's Hospital. At the University of Twente, a team of six senior researchers and three PhD students will cooperate in PuppyIR. One PhD student will work on user interaction design. The other two positions are described below.

Position 1: Analyzing and structuring textual information (at Human Media Interaction)

Analyzing and structuring textual information studies how natural language processing tools can assist the organization of information in a way that enables children to easily access the information. The PhD student at Human Media Interaction will focus on information extraction, text classification, and story understanding and summarization on written and spoken data, for instance for questions or comments created by children (e.g., chats, blogs) and content created explicitly for children (e.g., stories).

Position 2: Multimedia content mining (at Databases)

Multimedia content mining will develop database search technology that enables better understanding of the individual behavior of the child and consequently his/her information need. The PhD student at Databases will focus on concept retrieval, faceted search, query formulation assistance, and intuitive relevance feedback mechanisms that allow children to easily access the content of multimedia data sources, for instance for content sharing within online groups including moderated discovery.

See official announcement. (Deadline: 15 April 2009)