Goodbye everybody at U. Twente

(written for CS teaching mailing no. 16 of 11 July)

As of 1 July, I will leave the U. Twente after almost 30 years (first as student, then as PhD student, finally as staff member) for a new challenge at the Radboud University in Nijmegen. I am proud to announce that I will join Radboud University’s faculty of science as professor of Federated Search.

I was privileged to teach in a world that has changed a lot since I became an assistant professor (in 2001). Today, university-level courses are no longer taught for the privileged few at universities in developed countries. They are now freely available to anyone online via platforms like Coursera, edX, FutureLearn and on social media, such as on YouTube. Over the last 18 years, I have tried to stimulate students to find additional study material online. In return I tried to contribute to the online study material by publishing my teaching material for students to use and for colleagues to share (my Canvas courses are still entirely publicly available) and by using novel social media like UT Mastodon (https://mastodon.utwente.nl).

In my years at the UT, I enjoyed promoting critical thinking by letting students actively put theory into practice, instead of letting students passively absorb knowledge. I particularly enjoyed developing the MSc course Managing Big Data with Maarten Fokkinga and Robin Aly (later perfected by Doina Bucur) where students analysed terabytes of data on a large Hadoop cluster. I enjoyed developing the BSc module Data & Information with Klaas Sikkel, Maurice van Keulen and Luís Ferreira Pires, where we let students work in agile teams, including daily stand-ups, sprint review meetings, and sprint backlogs. I also very much liked running the MSc course Information Retrieval with Paul van der Vet, Theo Huibers and Dolf Trieschnigg, where students used open source search engines and actively contributed to our research. Some of that work was published, and in such cases, students presented their work at international workshops or conferences.

Saying goodbye to Twente is harder than I expected. But remember, Nijmegen is close by: Feel free to contact me. As for PhD students, I intend to continue to be an active contributor to the courses of the Dutch research school SIKS: I hope to see you there.

Goodbye everybody!

The influence of network structure and prosocial cultural norms on charitable giving

A multilevel analysis of Movember’s fundraising campaigns in 24 countries

by Tijs van den Broek, Ariana Need, Michel Ehrenhard, Anna Priante, and Djoerd Hiemstra

This study examines how the interplay between an online campaign’s network structure and prosocial cultural norms in a country affects charitable giving. We conducted a multilevel analysis that includes Twitter network and aggregated donation data from the 2013 Movember fundraising campaigns in 24 countries during 62 campaign days. Prosocial cultural norms did not affect the relationship between network size and average donations raised, nor did they affect the relationship between network centralization and average donation amount. Prosocial cultural norms did affect the relationship between network density and average donations raised. However, this effect was negative and contrary to our expectation.

Published in Social Networks 58, pages 128-135

[download pdf]

Beyond research and teaching: on the role of universities in our society

(a thread on Mastodon U. Twente.)

In the essay The Fragmentation of Truth danah boyd makes the following important point: To combat increasing polarisation in our society, we need to rely on organisations that actively and intentionally let people with fundamental differences work alongside one another.

Boyd mentions the military as an example of an organisation that brings together people from different social backgrounds and political views to work on a common goal: to “intentionally bridge gaps in the social graph, to intentionally connect people and communities.”

I see schools and universities as another major power to combat polarisation in our society. Our university brings together people from different backgrounds, political views and cultures. Creating a sense of common purpose and a sense of a university community is important to fight polarisation and populism in our society.

That’s why our campus, our study associations, our sport, cultural and other student associations, are so important. That’s also why we need democratic institutions and self-government. They do not only shape our university now, they shape our future society.

We need to work harder to shape our university as a community. If international students feel disconnected, then we have completely failed as a university, no matter how excellent our educational programs are. This U-Today story, International bachelors: psychological and social problems, breaks my heart: (“One in three non-European bachelors had study problems in the previous academic year due to psychological, medical or social circumstances.”)

Danah boyd discusses in depth how platforms like YouTube and Facebook harm our society; how they directly threaten the important role that schools and universities play in creating a peaceful society. From this viewpoint it is clear: YouTube should not be the primary channel for our online lectures; Facebook should not be the primary channel for our events.

Finally, services like search engines may be harmful, however well-intended and well-implemented. I find this hard to say as an Information Retrieval researcher, but search is easily manipulated, and you might not want powerful search in some applications. Boyd’s concept of ‘data voids’ is really insightful. Maybe we should teach students about search engine optimization in our courses too… #FIR

The Recent Applications of Machine Learning in Rail Track Maintenance: A Survey

by Muhammad Nakhaee, Djoerd Hiemstra, Mariëlle Stoelinga, and Martijn van Noort

Railway systems play a vital role in the world’s economy and movement of goods and people. Rail tracks are one of the most critical components needed for the uninterrupted operation of railway systems. However, environmental conditions or mechanical forces can accelerate the degradation process of rail tracks. Any fault in rail tracks can incur enormous costs or even result in disastrous incidents such as train derailment. Over the past few years, the research community has adopted the use of machine learning (ML) algorithms for diagnosis and prognosis of rail defects in order to help the railway industry to carry out timely responses to failures. In this paper, we review the existing literature on the state-of-the-art machine learning-based approaches used in different rail track maintenance tasks. As one of our main contributions, we also provide a taxonomy to classify the existing literature based on types of methods and types of data. Moreover, we present the shortcomings of current techniques and discuss what the research community and the rail industry can do to address these issues. Finally, we conclude with a list of recommended directions for future research in the field.

To be presented at the International Conference on Reliability, Safety and Security of Railway Systems: Modeling, Analysis, Verification and Certification (RSSRail 2019) on 4-6 June 2019 in Lille, France.

[download pdf]

Wim Florijn graduates on Semantically Grouping Search Query Data

Information Retrieval by Semantically Grouping Search Query Data

by Wim Florijn

Query data analysis is a time-consuming task. Currently, a method exists where word (combinations) in queries are labelled by using an information collection consisting of regular expressions. Because the information collection does not contain regular expressions from never-before seen domains, the method heavily relies on manual work, resulting in decreased scalability. Therefore, a machine-learning based method is proposed in order to automate the annotation of word (combinations) in queries. This research searches for the optimal configuration of a pre-processing method, word embedding model, additional data set and classifier variant. All configurations have been examined on multiple data sets, and appropriate performance metrics have been calculated. The results show that the optimal configuration consists of omitting pre-processing, training a fastText model and enriching word features using additional data in combination with a recurrent classifier. We found that an approach using machine learning is able to obtain excellent performance on the task of labelling word (combinations) in search queries.

[download pdf]

Anna Priante defends PhD thesis on online social movement campaigns

by Anna Priante

Social movement organizations widely use social media to organize collective action for social change, such as cancer awareness campaigns. However, little is known about how effective online social movement campaigns are at generating social change by translating online action into meaningful (offline) action. This dissertation examines the micro-mobilization dynamics at play that can explain the effectiveness of online social movement campaigns. This book comprises seven chapters presenting research based on a multidisciplinary, mixed-method approach, combining theories and methods from sociology, social psychology, communication science, and computational social science. The findings show that, with mobilization dynamics of collective action, we can gain an important understanding of the mechanisms at work during online social movement campaigns and of the effectiveness of such campaigns in fostering communication processes related to the cause, obtaining important resources for the cause, developing a collective identity, and raising awareness.

[download pdf]

Anna’s defense was the 5000th PhD defense at the UT!

Marieke Graef graduates cum laude on the Analysis of HPV discussions on Twitter

Responses to HPV Vaccination Campaigns in The Netherlands: an analysis of discussions on Twitter

by Marieke Graef

Even though the human papillomavirus (HPV) vaccine is an effective and safe instrument to decrease HPV infections and cases of several types of cancer, the Dutch HPV vaccination rate has been suboptimal from the start and has even shown a decline in the last two years. This study sought to assess the determinants of HPV vaccination uptake in the Netherlands and how the vaccine and RIVM and GGDTwente messages were discussed on Twitter from 2011 till 2016. Method: All Dutch language tweets mentioning HPV from the years 2011 till 2016 were collected from a database, amounting to a total of 17319. A content analysis of all tweets was carried out manually. The content of the GGDTwente and RIVM tweets was examined as well as responses to these tweets. Furthermore, the tweets were analyzed for specific determinants of HPV vaccination uptake and general sentiments. Results: The GGDTwente and RIVM only became truly active on Twitter regarding the HPV vaccination program in 2015. The RIVM tweets received significantly more response, though this response mostly consisted of retweets. Nearly all GGDTwente tweets concerned vaccination schedules. By far the most common determinant of low vaccination uptake in tweets from the public was the fear of side-effects, with scare stories going viral in 2015 and 2016 especially. On the other hand, publications on the high number of HPV infections among women received a lot of attention as well. Overall, the general sentiment towards the HPV vaccine on Twitter was more positive than negative in the first years, but due to stories about side-effects turned more negative in 2015. Conclusions: The results show that the fear of side-effects is something that needs to be addressed by public health authorities. Additionally, more practical measures such as a school-based vaccination program may be a great way to help increase the vaccination rate.

[download pdf]

How Twente may lead the fight against global heating

(a thread on Mastodon U. Twente)

I signed the Klimaatbrief Universiteiten. Our university does not have an ambitious climate agenda. A common approach among universities is lacking. With this letter, we call upon university management to develop and implement policies to drastically reduce the universities’ carbon emissions.

Frankly speaking, the policies that this letter calls for should not be controversial at all. Universities have a moral duty to work on the big problems of the world, and a duty to advance approaches that may solve these problems. In fact, the University of Twente can build a campus that is CO2 neutral now. Let me give a few examples.

Let’s build, on campus, state-of-the-art windmills that use generators developed at the University of Twente. The superconductors developed by Marc Dhallé and colleagues (see Lighter windmills thanks to superconductivity) replace the heavy magnets inside the generators of conventional windmills. As a result, the weight and size of the new generator are significantly reduced, while it is still capable of delivering the same output power. Another advantage is the minimal use of rare earth metals.

Let’s put solar panels on every roof and turn everyday objects on campus into solar panels using the luminescent solar concentrator (LSC) photovoltaic technologies that Angèle Reinders and colleagues experiment with. The typical material properties of LSCs (low cost, colorful, bendable, and transparent) offer a lot of design freedom.

Let’s use the additional energy generated on campus to generate solar fuels. This involves the direct conversion of energy from sunlight into a usable fuel (in this case, hydrogen). Using only earth-abundant materials, Han Gardeniers, Jurriaan Huskens and colleagues developed the most efficient conversion method to date: UT boosts efficiency of solar fuels.

The high school children that are on strike for the climate now will be our future students. Let’s give them the world — and the campus — they protested for.

Flávio Martins defends PhD thesis on Temporal Models for Microblog Search

Temporal Information Models for Real-Time Microblog Search

by Flávio Martins

Real-time search in Twitter and other social media services is often biased towards the most recent results due to the “in the moment” nature of topic trends and their ephemeral relevance to users and media in general. However, “in the moment”, it is often difficult to look at all emerging topics and single out the important ones from the rest of the social media chatter. This thesis proposes to leverage external sources to estimate the duration and burstiness of live Twitter topics. It extends preliminary research where it was shown that temporal re-ranking using external sources could indeed improve the accuracy of results. To further explore this topic we pursued three significant novel approaches:
(1) multi-source information analysis that explores behavioral dynamics of users, such as Wikipedia live edits and page view streams, to detect topic trends and estimate the topic interest over time;
(2) efficient methods for federated query expansion towards the improvement of query meaning; and
(3) exploiting multiple sources towards the detection of temporal query intent.
The thesis differs from past approaches in the sense that it works over real-time queries, leveraging live user-generated content. This approach contrasts with previous methods that require an offline preprocessing step.

(Photo by @krisztianbalog@twitter.com)