Visualization recommendation in a natural setting
by Ties de Kock
Data visualization is often the first step in data analysis. However, creating visualizations is hard: it depends on both knowledge about the data and design knowledge. While more and more data is becoming available, appropriate visualizations are needed to explore this data and extract information. Knowledge of design guidelines is needed to create useful visualizations, that are easy to understand and communicate information effectively.
Visualization recommendation systems support an analyst in choosing an appropriate visualization by providing visualizations, generated from design guidelines implemented as (design) rules. Finding these visualizations is a non-convex optimization problem where design rules are often mutually exclusive: For example, on a scatter plot, the axes can often be swapped; however, it is common to have time on the x-axis.
We propose a system where design rules are implemented as hard criteria and heuristics encoded as soft criteria that do not need to be satisfied, that guide the system toward effective chart designs. We implement this approach in a visualization recommendation system named OVERLOOK , modeled as an optimization problem implemented with the Z3 Satisfiability Modulo Theories solver. Solving this multi-objective optimization problem results in a Pareto front of visualizations balancing heuristics, of which the top results were evaluated in a user study using an evaluation scale for the quality of visualizations as well as the low-level component tasks for which they can be used. In evaluation, we did not find a difference in performance between OVERLOOK and a baseline of manually created visualizations for the same datasets.
We demonstrated OVERLOOK, a system that creates visualization prototypes based on formal rules and ranks them using the scores from both hard- and soft criteria. The visualizations from OVERLOOK were evaluated in a user study for quality. We demonstrate that the system can be used in a realistic setting. The results lead to future work on learning weights for partial scores, given a low-level component task, based on the human quality annotations for generated visualizations.