Relaxed Dot Plots: Faithful Visualization of Samples and Their Distribution

We introduce relaxed dot plots as an improvement of nonlinear dot plots for unit visualization. Our plots produce more faithful data representations and reduce moiré effects. Their contour is based on a customized kernel frequency estimation to match the shape of the distribution of underlying data values. Previous nonlinear layouts introduce column-centric nonlinear scaling of dot diameters for visualization of high-dynamic-range data with high peaks. We provide a mathematical approach to convert that column-centric scaling to our smooth envelope shape. This formalism allows us to use linear, root, and logarithmic scaling to find ideal dot sizes. Our method iteratively relaxes the dot layout for more correct and aesthetically pleasing results. To achieve this, we modified Lloyd's algorithm with additional constraints and heuristics. We evaluate the layouts of relaxed dot plots against a previously existing nonlinear variant and show that our algorithm produces less error regarding the underlying data while establishing the blue noise property that works against moiré effects. Further, we analyze the readability of our relaxed plots in three crowd-sourced experiments. The results indicate that our proposed technique surpasses traditional dot plots.


INTRODUCTION
Dot plots [31] visualize data frequency in one dimension using stacked dots, each representing one data value. This direct mapping of graphical primitives to the underlying data enables countability and precise representation: each dot is positioned using its data value. In contrast, histograms translate aggregated data values into a visual variable, e.g., bar length. Such aggregation generally leads to a less true representation of the data, which is particularly noticeable at low data frequency: a bar in a histogram is located at the midpoint of its interval, irrespective of the distribution of aggregated data. While traditional dot plots have advantages over histograms for smaller data sets, the balance shifts toward histograms when dealing with larger data sets and varying data frequencies: histograms easily scale the bar height, e.g., logarithmically, without changing the bins. Nonlinear scaling of dot plots is more complex and aggravates existing issues with moiré effects, especially at low image resolution [21]. Moreover, forcing dots into the traditional layout in columns can misrepresent true data values.
In this work, we introduce relaxed (nonlinear) dot plots that deliberately abandon the traditional column layout to place the dots more correctly and more aesthetically pleasingly within an envelope shape. To achieve the desired results, we employ kernel frequency estimation to infer dot size from value frequencies, and find a trade-off between conflicting criteria for correctness and visual appeal. Due to this optimization, our relaxed dot plots are more faithful to the underlying data and avoid moiré effects of traditional nonlinear plots without blurring the dots. The latter is possible by establishing the blue noise property (locally equidistant dot spacing).
Relaxed nonlinear dot plots combine advantages of previously disjoint visualization techniques: As an example of a unit visualization with a one-to-one mapping between data items and visual marks [16], there are the benefits of improved intuition, perception, and interaction possibilities. At the same time, we mitigate the disadvantages of typical unit visualizations by improving the perceptual and display scalability for larger data sets through nonlinear data-dependent scaling; and different from previous nonlinear dot plots [21], we do not need blurring that would impair the character of a unit visualization. Therefore, we see our technique as a good compromise between traditional unit visualization and scalable aggregate visualization of data frequencies and distributions. Fig. 1 shows an example of relaxed dot plots. Here, we can clearly see the individual data items, even enriched in the form of icons with country flags. We can also notice the scaling of the visual marks, which allows more dots in high-frequency areas (in particular, in the range close to 0 percent). The improved faithfulness of the data values (here, the percentages) manifests itself in the fine-grained placement in the horizontal direction.
Our main technical contributions are: First, we provide a mathematical description to transition from strictly aligned nonlinear dot plots to our relaxed variant. Second, we present a layout method that starts with the efficient sweep algorithm [21] and applies constrained Lloyd relaxation [12] to improve dot placement accuracy and establish the blue noise property. Third, we provide an open-source implementation of this technique online [7] and as supplemental material [20]. Further, we evaluate our proposed visualization against column-based nonlinear dot plots with respect to correctness of dot placement, blue noise property, and readability. Empirical user study results show that relaxed dot plots surpass their traditional counterpart.

RELATED WORK
Generally, all dot plots are used for statistical visualization and are known to work well for small data sets [1,19]. Historically, however, the term dot plot has been used ambiguously. For example, Cleveland understands it as (1) a visualization that replaces bar charts [4] and (2) a variant of X-Y-plots [5].
More tangible is the definition of dot plots by Wilkinson [31]: there must be a one-to-one mapping between each data value and each rendered dot (R1), all dots are of the same size (R2), lone dots are placed at the exact position of their data value along the displayed scale (R3), and colliding dots are stacked into straight columns (R4). These fundamental dot plots are intuitive to understand [3]. The straightforward layout algorithm can easily be applied for hand-drawn visualizations. The countable dots help children with no prior experience to quickly interpret the plots, allowing them to explore the characteristics of displayed population data [15]. The one-to-one mapping between data points and dots is an example of a unit visualization [16] and might be a reason for the intuitive understanding and easy adoption of the plots. For example, in a comparison between dot displays and landscape visualizations, the dots were more memorable [27]. Note that histograms that render bars with dots (so-called "histodot plots" [31]) go against rules R1 and R3 and are, therefore, not covered by Wilkinson's definition.
Linear dot plots do not visually scale well (Fig. 2, left). Rodrigues and Weiskopf [21] introduced nonlinear dot plots (Fig. 2, middle) for high-dynamic frequency range data. Their visualization is largely based on the same principles as Wilkinson's but deviates from rule R2. They use nonlinear scaling functions for the dot size: the higher the stack, the smaller the dots. While our relaxed variant also scales dot size, it is not a function of the stack height but of the value frequency. Additionally, we do not adhere to rule R4. Instead of stacking dots into rigid columns, we distribute them using relaxation to improve correctness and aesthetics (Fig. 2, right).
Dot-based Visualization. Bee swarm plots [8] are a kind of dot plot, using the same visual primitive in a similar way. They introduce larger spaces between dots to achieve correctness but do not enforce an envelope shape to represent value density (distribution shape) accurately. Instead, the direction of the bee swarm's sweep layout algorithm determines its overall appearance. Contrary to our proposed visualization, they do not target the accurate representation of data frequency by height in the diagram. They also exhibit moiré effects, which are visible in Fig. 8 and the supplemental material [20].
The recently introduced Blue Noise Plots (BNPs) by van Onzenoodt et al. [28] also use relaxation and produce a visualization similar to jittered strip plots. While our relaxed dot plots provide an envelope shape to show data frequency directly from the y-axis (also investigated in the study in Sect. 7.2), BNPs, by default, spread dots over all available height but cannot avoid overdraw in larger data sets. This leaves only the visual dot density as an indicator for frequency, which seems sufficient for users to get a rough impression of the frequency distribution as indicated by their study results. We include a comparison between our proposed technique and the public implementation of BNPs in Fig. 8. Before BNPs, Görtler et al. [9] used the blue noise property to convey 2D scalar fields using stippling. Results from their study, with stimuli very similar to BNPs, show that the performance of stippled density is comparable to that of grayscale. In contrast, our study includes aspects of the perception of the outer silhouette of dotted areas instead of the density within. There is also an option to centralize dots in BNPs to show the shape of the value distribution (similar to violin plots), but it does not provide sharp edges and aggravates the issue of overdraw. Relaxed dot plots provide tightly packed and individually distinguishable dots with minimal overdraw. Stippling has been adapted to many areas of visualization [9,13,14], including a recent multi-class approach that changes background color (inverted stippling) to increase dynamic range [26]. However, stippling aims at shading and thus generally ignores rule R1. Our technique adopts ideas from stippling to establish and analyze the blue noise property.
Blue Noise Property. The term describes any noise with minimal low-frequency components and no intense spikes in energy. In essence, blue noise samples must be distributed locally equidistant (and ideally isotropic). Since this is a useful property for many applications in sampling, various methods were developed over the past decades [32]. Historically, Poisson-disk-based and tile-based methods had been popular before research focused on relaxation-based methods. Two important representatives of this class of methods are Lloyd's algorithm [12] and the more generic Linde-Buzo-Gray algorithm [11]. The latter can infer the number of samples and reduces to Lloyd's algorithm under suitable constraints.
Since the number of samples is fixed in a dot plot, our work uses constrained Lloyd relaxation to place dots closer to their corresponding data values. Compared to existing dot plot algorithms, our approach yields a dot distribution with less overall display error, increasing faithfulness to the underlying data, while establishing anisotropic blue noise to improve appearance.
Kernel Density Estimation (KDE). Wilkinson [31] discusses the relationship between dot plots and KDE. One can assume that data density ρ for a given interval width h is analogous to a data frequency f. Our relaxed dot plots use kernel density to get a smooth representation of the underlying data density. This serves for calculating the available plotting area and dot sizes. Often, KDE [17,22] is used to compute a probability density function (PDF), which is normalized to a total area of one. However, the result of our estimation can be any nonnegative number, including values > 1, to represent data frequency. KDE employs a smoothing bandwidth, which affects the usefulness of the estimate the most. Prior work shows a connection between optimal bandwidth and optimal dot size [31]. Building upon the dot size parameter from nonlinear dot plots, we derive the additional parameters required for KDE. Also, we replace the inferred bandwidth and adapt the kernel width instead. In contrast to previous dot plots, the overall distribution shape visualized by our technique is as faithful as KDE, i.e., we have better control over, and higher reliability of, the visualization of distributions.

GOALS AND OVERVIEW
We aim at more faithful and aesthetic dot plots, by which we understand the following goals: The dots should have less positional error (1), i.e., a smaller difference between the actual display position on the x-axis and the true data value. The shape of the plot should represent (nonlinear) data frequencies well (2), i.e., the peaks, valleys, smooth and frayed features of the outline should match the true frequency distribution. The dot plot should be space-filling and reasonably well packed, with little overlap between dots, and should avoid patterns that are not backed up by data (e.g., moiré patterns) (3). To achieve these goals, our method consists of two main stages as shown in Fig. 3. Sect. 4 corresponds to the first stage, where we describe how to determine the nonlinear size of dots and the necessary plotting space and shape. Similar to KDE, we use a kernel function to obtain an accurate and smooth estimate of the data distribution because dot plots show frequency distributions. To obtain the envelope, we vertically scale the distribution to meet the space requirements of the nonlinearly sized dots. Our scaling builds upon that of nonlinear dot plots and lifts the binning-like constraints of a column-based layout.
Sect. 5 corresponds to the second stage, where we present the relaxation of dots within the envelope. Here, we elaborate on placing the dots closer to their actual data value. Our technique uses the conventional two-way sweep algorithm to obtain initial positions efficiently. Then, we perform a modified Lloyd's algorithm [12] to optimize for proximity of the data value and establish the blue noise property.

PLOTTING SPACE, SHAPE, AND DOT SIZE
To position dots accurately and aesthetically, we first need to determine the envelope in which we can move dots. We also have to compute dot sizes for nonlinear scaling.

Kernel Frequency Estimation (KFE)
We require a distribution shape that faithfully represents the underlying data frequencies. There are multiple possible definitions of the true data frequency (a common problem in statistics) and even more methods to estimate it. Our approach is based on kernel density estimation (KDE) [17,22]. Building upon the same statistical principles helps keep our visualization faithful to the underlying data, since KDE has well-known properties for approximating statistical distributions.
In a nutshell, if (x_1, ..., x_n) are the values from the data set, KDE allows us to estimate the probability of a data value being close to x with the use of a kernel K and smoothing bandwidth h. Since we require data frequency instead of probability, we do not normalize the area between the function and the x-axis. Independently from the smoothing bandwidth h, each kernel can have a varying width greater than zero. To avoid having another user-defined parameter, we mandate using an adapted kernel K_h to have the same width as the bandwidth. Thereby, every kernel function produces a non-zero value within [−h/2, +h/2]. Outside this interval, it should be zero. The resulting frequency estimation is:

$$f(x) = \sum_{i=1}^{n} K_h(x - x_i)$$

According to Wilkinson, the choice of a dot diameter is similar to the bin width in histograms [31]. It is also related to the smoothing bandwidth in KDE, for which there are rules of thumb and complex fitting approaches. Moreover, for frequency estimation, each kernel can have multiple additional parameters. Prior nonlinear dot plots [21] make the selection of an initial dot diameter d_1 intuitive by inferring it from the user-desired aspect ratio of the final plot.
Building upon that definition of diameter d_1, we determine the bandwidth and calculate the necessary kernel parameters so that kernel and bandwidth share the same size. Then, we can assume the kernel to be zero at distances greater than half the bandwidth, i.e., outside [−h/2, +h/2]. This approach limits computational effort. In our experience, a bandwidth of h = 2d_1 is a good compromise between compactness of the dots and showing the shape of the data distribution. We will discuss the implications of bandwidth for relaxation later in Sect. 5.2.
Our method supports any kernel with an integral of ≈ 1 since we use the frequency to estimate how many data values appear close to a point x in the data domain. Accordingly, each value in the underlying data should contribute approximately a value of 1 to the frequency.
Detailed equations for the kernels provided with our implementation (e.g., box kernel, Epanechnikov, and Gaussian) can be found in the supplemental material [20]. For the Gaussian kernel, we use a truncated version that reaches zero: we calculate the parameter for standard deviation with σ = h/6, as is common practice. Doing so results in an interval of ±3σ within the bandwidth and an included area of ≈ 0.997. The small difference to the required area of 1 does not cause issues.
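The estimation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the kernel formulas are the textbook box, Epanechnikov, and Gaussian kernels rescaled so that their support is [−h/2, +h/2], and the exact equations used in the paper are those in the supplemental material [20].

```python
import math

def box_kernel(u, h):
    """Box kernel of total width h: integral 1, zero outside [-h/2, +h/2]."""
    return 1.0 / h if abs(u) <= h / 2 else 0.0

def epanechnikov_kernel(u, h):
    """Epanechnikov kernel rescaled so its support is exactly [-h/2, +h/2]."""
    t = 2.0 * u / h
    return 3.0 / (2.0 * h) * (1.0 - t * t) if abs(t) <= 1.0 else 0.0

def gaussian_kernel(u, h):
    """Gaussian with sigma = h/6, truncated to [-h/2, +h/2] (i.e. +/- 3 sigma)."""
    if abs(u) > h / 2:
        return 0.0
    sigma = h / 6.0
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def kernel_frequency(x, values, h, kernel=epanechnikov_kernel):
    """Unnormalized estimate: each data value contributes an area of about 1."""
    return sum(kernel(x - xi, h) for xi in values)

def area(f, lo, hi, steps=20000):
    """Midpoint-rule integral, used only to check the areas numerically."""
    dx = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * dx) for i in range(steps)) * dx

values = [0.0, 0.1, 0.15, 2.0]   # made-up data values
d1 = 0.25                        # assumed initial dot diameter
h = 2 * d1                       # bandwidth suggested in the text
total = area(lambda x: kernel_frequency(x, values, h), -1.0, 3.0)
```

Because the estimate is not normalized, `total` comes out close to the number of data values rather than close to 1, which is the property the envelope relies on.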
When a high dot frequency falls off abruptly, we still would get many dots with a diameter much smaller than the bandwidth h = 2d_1. Our envelope shape would force these dots to spread out into the empty area covered by the kernel, leading to large errors in dot placement. To address this issue, we employ a concept from bounded KDE called reflection [10]. We mirror the frequency values back into the denser area when at the outer limits or in gaps that are at least d_1 wide (see Fig. 4). Also, we determine the radius of the dot closest to the gap or outer limit and place the reflective boundary at the position of this dot's edge (data value ± radius).
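As a rough illustration of reflection, the sketch below folds the part of the estimate that spills across a boundary back onto the dense side, following the standard reflection rule from bounded KDE. The function names, the box kernel, and the sample numbers are assumptions made for this example; how the implementation detects gaps and handles several boundaries at once is not shown.

```python
def box_kernel(u, h):
    """Box kernel of total width h with integral 1."""
    return 1.0 / h if abs(u) <= h / 2 else 0.0

def frequency(x, values, h):
    """Plain (unreflected) kernel frequency estimate."""
    return sum(box_kernel(x - xi, h) for xi in values)

def reflective_boundary(edge_value, edge_radius, dense_side="left"):
    """Boundary at the outer edge of the dot closest to the gap (value +/- radius)."""
    if dense_side == "left":
        return edge_value + edge_radius
    return edge_value - edge_radius

def reflected_frequency(x, values, h, boundary, dense_side="left"):
    """Mirror the estimate at `boundary`: zero outside, folded mass inside."""
    outside = x > boundary if dense_side == "left" else x < boundary
    if outside:
        return 0.0
    mirror = 2.0 * boundary - x
    return frequency(x, values, h) + frequency(mirror, values, h)

values = [0.9, 0.95, 1.0]            # data ends abruptly at 1.0
boundary = reflective_boundary(1.0, 0.05)
inside = reflected_frequency(1.0, values, 0.5, boundary)
beyond = reflected_frequency(1.2, values, 0.5, boundary)
```

Here `inside` is twice the unreflected estimate at x = 1.0 and `beyond` is zero, so the envelope no longer extends into the empty region to the right of the last dot.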

Nonlinear Frequency Scaling
KFE gives us an approximation of how many data values appear close to a position on the x-axis. While this approach is sufficient for linear relaxed dot plots, the envelope is too large for the smaller dots from nonlinear plots. Previous column-based approaches used the number of dots c in each column to compute the height h_c(c). To scale the height of the envelope, we need to map the previous column height h_c(c) to a new height function h_f(f(x)) that depends on the dot frequency f at any position x along the x-axis. With the index c we refer to existing column-based functions, while the index f refers to the new frequency-based functions.
The (nonlinear) sweep layout creates columns that contain as many dots as there are data values in the covered range of the x-axis. We can derive a mean local value frequency f_c(c) through the width (equal to the dot diameter d_c(c)) and dot count in each column:

$$f_c(c) = \frac{c}{d_c(c)}$$

The mean frequency f_c(c) should be equal to the estimated frequency f(x) when the distribution of the underlying data is uniform, the bandwidth is sufficiently large, and the integral of the estimation kernel is 1, as required in Sect. 4.1. Therefore, the following equalities must hold:

$$f(x) = f_c(c), \qquad h_f(f(x)) = h_c(c), \qquad d_f(f(x)) = d_c(c)$$

Using these relations, we can calculate the frequency-based height function of the traditional root dot plots [21], where d_1 is the user-specified start diameter of dots and s is a parameter that characterizes the root. In logarithmic plots, the user-selected base b is used for column height and the diameter is calculated through division by dot count. Since solving for frequency-based height is not straightforward, we get the inverse equation for frequency from height, and solve it numerically (nested intervals). In the supplemental material [20], we explain in detail how we rearranged the equations to arrive at the results in this section. We also show plots of the envelope scaling functions.

When we draw dot plots, there are only integer numbers of dots. This conflicts with continuous, smooth value frequencies. When the estimated frequencies are lower than 1/d_1, the height functions h_f would not provide enough room to place full-sized dots (see point P in function plot in supplemental material [20]). In practice, when the frequency is so low, there are so few data values to display that there is no need for stacking. Therefore, we extend our height functions to handle this special case. The piecewise functions ĥ_f(f(x)) and d̂_f(f(x)) return 0 at frequency f(x) = 0 and return d_1 when the frequency f(x) is between 0 and 1/d_1. For higher frequencies, they return the same value as the original functions h_f(f(x)) and d_f(f(x)). In the supplemental material [20], we analyze our new functions with respect to limits for the user-defined parameters.
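The mapping from column-based to frequency-based height, together with the low-frequency special case, might be sketched as follows. Note the assumptions: the root-like column height `d1 * c**s` is a stand-in chosen purely for illustration (the exact scaling functions are those of [21] and the supplemental material [20]), all function names are hypothetical, and bisection is used here as a generic form of nested intervals.

```python
def column_height(c, d1, s):
    """ASSUMED root-like column height h_c(c) = d1 * c**s (illustration only)."""
    return d1 * c ** s

def column_frequency(c, d1, s):
    """Mean frequency of a column: dot count divided by column width d_c(c)."""
    diameter = column_height(c, d1, s) / c   # c stacked dots fill the column height
    return c / diameter

def height_from_frequency(f, d1, s, c_max=1e9, tol=1e-12):
    """Nested intervals (bisection) on a continuous dot count c with f_c(c) = f."""
    lo, hi = 1.0, c_max
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if column_frequency(mid, d1, s) < f:
            lo = mid
        else:
            hi = mid
    return column_height(0.5 * (lo + hi), d1, s)

def extended_height(f, d1, s):
    """Piecewise extension: 0 at f = 0, d1 below 1/d1, scaled height above."""
    if f <= 0.0:
        return 0.0
    if f < 1.0 / d1:
        return d1
    return height_from_frequency(f, d1, s)

# With d1 = 1 and s = 0.5, a frequency of 8 corresponds to c = 4 dots per column.
example = extended_height(8.0, 1.0, 0.5)
```

The bisection only requires that the column frequency grows monotonically with the dot count, which holds for the assumed form whenever s < 2.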

Individual Dot Diameter
While we could reuse the circle diameters from the sweep layout in our relaxed variant, having circles of the same diameter in spatial proximity could work against our efforts for finding ideal dot locations and avoiding moiré effects. Also, we cannot know the number of dots c in nonexisting "relaxed columns." However, we know the height of the envelope at any point along the x-axis. Accordingly, we can use generic observations about the height of columns and the local frequency within them to get a generic function for the individual frequency-based diameter of each dot. Again, we refer the interested reader to the supplemental material [20].

DOT PLACEMENT
After having determined plotting space and dot size requirements, we optimize the actual dot positions in a two-stage process: First, we initialize the layout using the two-way sweep algorithm [21] for a good approximation. Then, we relax and swap the dot positions iteratively with the goal of more correct placement while achieving the blue noise property. Algorithm 1 shows the high-level procedure.

Alternating Vertical Order
Column-based nonlinear dot plots sort all underlying data points by their value along the x-axis as a prerequisite of the sweep algorithm. Therefore, when no second data attribute is shown as color, there is also a predetermined vertical order within each column: small values appear as dots on the bottom and larger ones at the top. Our relaxation algorithm will move dots closer to their underlying data value later on (see Sect. 5.3). Similarly to bee swarm plots [8], this would create the "leaning towers of dots" shown in Fig. 5. Therefore, as part of the initialization with the column layout, dots are vertically sorted in alternating order: first the dot representing the smallest value, then the largest one, next the second smallest, the second largest, and so on. To maintain the coherence of colored patches, we only reorder dots of the same color. As Fig. 5 shows, this approach avoids the leaning columns. We will revisit further implications of this technique when discussing faithfulness in Sect. 7.1.
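The alternating order can be illustrated with a short sketch. The function names and the representation of a column as a bottom-to-top list of (value, color) pairs are assumptions for this example; the reordering only permutes values among the slots already occupied by the same color.

```python
def alternating_order(sorted_values):
    """Interleave from both ends: smallest, largest, second smallest, ..."""
    lo, hi = 0, len(sorted_values) - 1
    order = []
    while lo <= hi:
        order.append(sorted_values[lo])
        if lo != hi:
            order.append(sorted_values[hi])
        lo += 1
        hi -= 1
    return order

def reorder_column(column):
    """column: list of (value, color) from bottom to top; colors keep their slots."""
    result = list(column)
    for color in {c for _, c in column}:
        slots = [i for i, (_, c) in enumerate(column) if c == color]
        values = sorted(column[i][0] for i in slots)
        for slot, value in zip(slots, alternating_order(values)):
            result[slot] = (value, color)
    return result

single_color = alternating_order([1, 2, 3, 4, 5])
two_colors = reorder_column([(1, "a"), (2, "a"), (3, "b"), (4, "a"), (5, "b")])
```

For the single-color column the bottom-to-top order becomes 1, 5, 2, 4, 3, so neighboring dots pull in opposite horizontal directions during relaxation instead of leaning the same way.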

Centroidal Voronoi Tessellation (CVT)
Our relaxation is based on Lloyd's algorithm [12] to establish the blue noise property. In each iteration, the algorithm calculates a Voronoi diagram of circles from the given dot positions and diameters. We use signed distance functions and the GPU-friendly jump-flooding [33] in our implementation; however, there are also other algorithms to compute the Voronoi diagram of circles [2]. The result is one cell per dot. To stay within the desired plot bounds, we clip these cells to the envelope before computing the centroids. When moving the dots toward their respective centroids, one obtains a Centroidal Voronoi Tessellation (CVT) where the distances between dots are locally equalized, and the blue noise property is observable [24,29] (see Fig. 8).
Since we clip the Voronoi cells to the envelope, outer cells might split into disjoint regions, as in Fig. 6. The centroids of such split cells could escape the envelope, disrupting the visualization of the distribution of data frequency. To avoid split cells, we use a bandwidth of twice the initial dot diameter. The rest of the algorithm has no issues with roughly shaped cells and spiky envelopes.
Iterative algorithms require a condition for termination. Previous work on stippling analyzed how the number of iterations and the type of algorithm influence the quality of produced images [6]: The quality quickly improves at the beginning but then stagnates with higher iteration counts. Our analysis of mean dot movement and positional error in units of diameters shows similar results. We chose mean dot movement as a condition for termination because it converges to more similar values across a variety of plots. In the supplemental material [20], we show that a threshold of ε = 0.015 includes most dot movement. Therefore, this value is sufficient for a fast approximation of the layout. A less time-constrained and more highly refined layout is achievable with ε = 0.003.
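The relaxation loop and its termination criterion can be sketched with a strongly simplified stand-in. Instead of a Voronoi diagram of circles computed by jump-flooding, this example assigns grid samples inside the envelope to the nearest dot center (an ordinary discrete Voronoi partition), which is enough to show the clip, centroid, move, and stop steps. The function name, the grid resolution, and the triangular envelope are assumptions made for illustration.

```python
import math

def lloyd_relax(dots, envelope, x_range, diameter, eps=0.015, grid=60, max_iter=200):
    """dots: list of (x, y); envelope(x) -> height. Returns (dots, iterations)."""
    x0, x1 = x_range
    step = (x1 - x0) / grid
    # Keep only samples inside the envelope (stands in for clipping Voronoi cells).
    samples = []
    for i in range(grid):
        x = x0 + (i + 0.5) * step
        y = 0.5 * step
        while y <= envelope(x):
            samples.append((x, y))
            y += step
    for iteration in range(1, max_iter + 1):
        sums = [[0.0, 0.0, 0] for _ in dots]
        for sx, sy in samples:
            nearest = min(range(len(dots)),
                          key=lambda k: (dots[k][0] - sx) ** 2 + (dots[k][1] - sy) ** 2)
            sums[nearest][0] += sx
            sums[nearest][1] += sy
            sums[nearest][2] += 1
        new_dots, moved = [], []
        for (x, y), (ax, ay, n) in zip(dots, sums):
            cx, cy = (ax / n, ay / n) if n else (x, y)   # empty cell: dot stays put
            moved.append(math.hypot(cx - x, cy - y) / diameter)
            new_dots.append((cx, cy))
        dots = new_dots
        # Stop once the mean movement, in units of diameters, drops to eps or below.
        if sum(moved) / len(moved) <= eps:
            break
    return dots, iteration

def envelope(x):
    """Made-up triangular envelope over [0, 4] with its peak at x = 2."""
    return max(0.0, 1.0 - abs(x - 2.0) / 2.0)

start = [(1.0, 0.1), (1.5, 0.1), (2.0, 0.1), (2.5, 0.1), (3.0, 0.1), (2.0, 0.3)]
relaxed, iterations = lloyd_relax(start, envelope, (0.0, 4.0), diameter=0.5)
```

Because the example envelope is convex, every centroid stays inside it; the split-cell problem described above only arises for non-convex envelopes.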

Placement Correction
One of our goals was to create a more faithful visualization in which dots are closer to their underlying value than in previous sweep layouts. Since a dot plot shows one-dimensional data, the error is the difference between the actual display position on the x-axis and the true data value. Think of two dots that are both offset by 5 pixels. One of them has a diameter of 20 pixels, the other 3 pixels. The latter has much more error, as it represents a value that lies outside the area of the x-axis covered by the dot. The error is smaller for the large diameter because it covers the correct data position. Additionally, the center of a large circle is more difficult to determine, making a slight offset less significant for the user's perception of the visualization. Therefore, we quantify the positional error of a dot as the offset in units of its radius.
Furthermore, a plot with many small errors is more trustworthy than one with few large ones: If a dot should be on the far right but appears on the far left, it introduces fabricated information, and the user cannot map it to the underlying data. In contrast, small errors are barely perceptible and can fall within the inaccuracy of reading the dot positions on the x-axis. Therefore, we use the squared error to favor layouts with smaller errors and heavily penalize large offsets. Thus, the overall error of a plot is the sum of squared errors (SSE). Shifting the dots toward the position of their underlying data reduces the SSE and leads to a more faithful layout (see Fig. 7). Dividing the SSE by the number of data points allows us to compare errors between plots of different data sets.
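The error measure described above is small enough to write down directly. The following sketch uses hypothetical function names and replays the two-dot example from the text (both dots offset by 5 pixels, diameters of 20 and 3 pixels); the absolute positions are made up.

```python
def positional_errors(display_x, data_x, diameters):
    """Horizontal offset of each dot in units of its own radius."""
    return [(px - vx) / (d / 2.0)
            for px, vx, d in zip(display_x, data_x, diameters)]

def sum_squared_error(display_x, data_x, diameters):
    """SSE over all dots; large offsets are penalized quadratically."""
    return sum(e * e for e in positional_errors(display_x, data_x, diameters))

def mean_squared_error(display_x, data_x, diameters):
    """SSE divided by the number of dots, comparable across data sets."""
    return sum_squared_error(display_x, data_x, diameters) / len(display_x)

errors = positional_errors([105.0, 205.0], [100.0, 200.0], [20.0, 3.0])
mse = mean_squared_error([105.0, 205.0], [100.0, 200.0], [20.0, 3.0])
```

The large dot ends up with an error of half a radius, while the small dot is more than three radii away from its data value, which matches the intuition given in the text.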
Implementation-wise, we move dots along the x-axis through weighted positions from the CVT and the actual data value, as shown in line 9 of Algorithm 1. Too much weight on the correct position leads to increased overlap between dots (see Fig. 7). Too little weight for correction, and dots will not move toward the correct data value. Therefore, users of the visualization have to decide on a trade-off between aesthetics and correctness. As Fig. 7 shows, a default value of v = 0.3 provides satisfactory results. However, users can still adjust the weight to fit their needs.
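One plausible reading of the weighted correction is a linear blend between the CVT centroid and the data value along the x-axis, with the y-coordinate taken from the centroid unchanged. Algorithm 1 is not reproduced here, so the exact update rule is an assumption, as is the function name.

```python
def correct_positions(centroids, data_values, v=0.3):
    """centroids: list of (x, y) from the CVT step; returns corrected (x, y).

    v = 0 keeps the pure CVT layout, v = 1 snaps every dot to its data value.
    """
    return [((1.0 - v) * cx + v * value, cy)
            for (cx, cy), value in zip(centroids, data_values)]

centroids = [(10.0, 1.0), (20.0, 2.0)]   # made-up centroids
data_values = [12.0, 19.0]               # made-up data values
moved = correct_positions(centroids, data_values)
```

With the default weight, the first dot moves from x = 10 to roughly 10.6, i.e., 30% of the way toward its data value in a single iteration; repeated iterations let the CVT step and the correction balance each other.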

Swaps for Tunneling
One major limitation of any partition-based relaxation is that partitions can block each other, i.e., dots cannot move closer to their ideal position at high dot density. This problem is also known as the conflict of Voronoi diagrams [18] in multi-class sampling. Inspired by quantum tunneling [23], we let dots cross an obstacle by tunneling through it instead of displacing it.
In Algorithm 2, we check whether the overall error decreases when two dots swap positions. Note that dots only swap their positions and retain their diameter and underlying data. This way, we can achieve a layout with less overall error. Notably, the dots can be stored in a sorted manner with minimal overhead in line 1 of Algorithm 2 because the input data is already sorted at the beginning of the traditional sweep algorithm. This order enables us to restrict the search for swap partners to only a small neighborhood of each dot (line 3).
If the dots are indistinguishable, e.g., by color, the influence of swapping is hardly noticeable. However, swapping can change the vertical order of dots, which interferes with the display of a second data dimension, e.g., categorical data. Without vertical coherence of categories, the dot plot would quickly look variegated and cluttered. Therefore, we only swap dots that have the same image or whose colors are sufficiently similar.
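A swap pass along these lines might look as follows. Algorithm 2 is not reproduced in this text, so this is a sketch under assumptions: dots are dictionaries with hypothetical keys, the list is sorted by data value, the neighborhood is a fixed window in that order, and "sufficiently similar colors" is reduced to an equality test on a category label.

```python
def squared_error(position_x, data_x, radius):
    """Squared horizontal offset in units of the dot's radius."""
    return ((position_x - data_x) / radius) ** 2

def tunnel_swaps(dots, window=3):
    """Swap positions (not data or size) of nearby dots if the error decreases.

    dots: list of dicts with keys x, y, value, radius, category, sorted by value.
    Returns the number of swaps performed.
    """
    swaps = 0
    for i, a in enumerate(dots):
        for b in dots[i + 1:i + 1 + window]:
            if a["category"] != b["category"]:
                continue   # keep categories vertically coherent
            before = (squared_error(a["x"], a["value"], a["radius"])
                      + squared_error(b["x"], b["value"], b["radius"]))
            after = (squared_error(b["x"], a["value"], a["radius"])
                     + squared_error(a["x"], b["value"], b["radius"]))
            if after < before:
                a["x"], b["x"] = b["x"], a["x"]
                a["y"], b["y"] = b["y"], a["y"]
                swaps += 1
    return swaps

dots = [
    {"x": 2.0, "y": 0.5, "value": 1.0, "radius": 0.5, "category": "red"},
    {"x": 1.5, "y": 1.5, "value": 1.5, "radius": 0.5, "category": "blue"},
    {"x": 1.0, "y": 0.5, "value": 2.0, "radius": 0.5, "category": "red"},
]
count = tunnel_swaps(dots)
```

In this example the two red dots sit on the wrong sides of each other and exchange positions, while the blue dot between them is never considered as a swap partner for either.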

EXAMPLES
In this section, we showcase our technique using two real-world datasets that touch upon the topic of global warming in a broader sense and have high dynamic range, i.e., nonlinearity is worthwhile. Fig. 8 shows global air temperature measurements retrieved from the website of the German meteorological service. The original data was rounded to one decimal place, which led to various regular patterns in nonrelaxed plots. Since we want to compare the visualizations themselves without stray effects from the granularity of the data, we jittered the temperatures to get data points from a continuous spectrum.
The column-based, traditional nonlinear dot plot in Fig. 8 cannot represent the data frequency well. The sharp spikes between the temperatures of 20 and 30°C are not present in the data. Notably, there are no spikes for similar frequency fluctuation between 0 and 10°C, which is inconsistent. Similarly, the intermittent gaps from -20 to -5°C show that the quantized widths of columns are not a good fit for the data frequency. In contrast, our result shows a consistent and appropriately smooth contour. Moreover, our relaxed dot plot shows a lower mean squared positional error of the dots. It is most noticeable at the edges of the x-axis that points with different values are not placed at the same x-coordinate. Besides improving correctness, our result appears more aesthetically pleasing as dots are more evenly distributed and the blue noise property avoids strong moiré effects.
[Fig. 8 panels: our relaxed dot plot (RDP), MSE ≈ 0.0212; nonlinear dot plot (NDP) [21], MSE ≈ 0.4347; centralized blue noise plot (BNP) [28]; bee swarm plot (BSP) [8]; each shown with its DDA.]
We also provide visualizations of the data in two plots from related work. Note that both related implementations are not designed to separate differently colored dots within a single layout, making it difficult to extract information from the dot colors.
Blue noise plots by van Onzenoodt et al. [28] have an option to centralize dots and limit their positions to an area that represents the underlying value frequency (similar to violin plots). As Fig. 8 shows, the visualization we created with the publicly available implementation places many overdrawn dots on a horizontal line. We also did not expect the line-like structures at the outer bounds of the shape that is supposed to indicate value frequency. We do not know whether these two artifacts are caused by conceptual faults in the algorithm or errors in the public implementation. Plots of smaller data sets are consistent with the expected outcome of the technique and the figures in the original publication. The obtained layout was calculated in over 7 minutes with the regular CPU implementation as the graphics hardware was running out of memory (Intel Core i5-6600 3.3 GHz, NVIDIA GeForce GTX 1060 3 GB). All other layouts were computed in less than 10 seconds.
Eklund's bee swarm plot [8] has no overdraw but exhibits moiré effects. Fig. 8 shows how the visualization appears inside the plotting pane of RStudio when using the current public R implementation of the algorithm. Dots close to the x-axis form straight stacks before branching out. In the supplemental material [20], we provide monochrome plots in which the spaces between dots merge to form worm-like shapes in mostly diagonal directions. The color noise in Fig. 8 helps to mask such effects.
Neither visualization from related work adapts dot sizes; both provide only linear plots. The small dot sizes required to visualize the entire temperature data set are problematic when trying to analyze outliers. This is where nonlinear dot plots are most suited.
Fig. 1 shows the percentage of renewables in electricity production of various countries. Here, we show each country's flag and encode the continent in a colored outer ring to emphasize possibilities of unit visualization. Most countries are between 0 and 10% renewable energy. There are two clusters around 60 and 100%, but only a few countries in between. Through color, one can estimate each continent's distribution in the context of the overall distribution.
This example is also well suited for broad audiences: While only a few individuals know all countries and their flags, most people know the flags of their own and neighboring countries. Therefore, such a visualization in a public place could be used for nudging and increasing awareness of the need for renewable energy in times of global warming. Passersby might try to find their country's flag, read off the data value, and compare it to other known countries. While it has been shown that unit visualizations with column-based dot plots are simple enough for children to grasp and interpret quickly [15], we argue that relaxed dot plots lower cognitive barriers: There is no need to understand the binning into columns and the values are represented with less hidden uncertainty. Viewers are only required to grasp how a number line works to extract basic and more correct information than with column-based layouts.

EVALUATION
In this section, we report on computational experiments and the results of our crowd-sourced user study.

Correctness
In comparison to column-oriented layouts, the weighted placement correction in the relaxation algorithm strongly decreases horizontal error. This can be seen in Fig. 8 and in the synthetic case of Fig. 7. The relaxed plot in Fig. 8 has over 95% less error in dot positions.
The envelope shape uses KFE to determine the underlying data distribution and is completely filled in the relaxation step. Therefore, the overall shape of our visualization shows a better approximation of the frequency distribution than the column-based dot plots. Additionally, our method also provides a more faithful display of local variance. Previous techniques for dot plots align dots in perfectly straight stacks, depriving the user of a notion of variance within each column. If such a vertical column of dots appears in a relaxed dot plot, it strongly indicates a low variance of the underlying data values. Fig. 9 illustrates such a case, in which our layout's alternating order spreads the dots in the column with higher variance. This is especially useful when analyzing uncertain and noisy data: a vertical line pattern will call for attention toward an anomaly.

Crowd-Sourced User Study
We conducted a crowd-sourced user study to check whether users can actually benefit from the reduced positional error, the depiction of local variance (or lack thereof in columns), and the smooth approximation of global data distribution (KFE). Thus, our hypotheses are centered around the comparison to traditional nonlinear dot plots. As mentioned in Sect. 8, column-based layouts exhibit auxiliary lines that might help viewers to read the display position of a dot. These lines are missing in our relaxed dot layout and, therefore, we hypothesize that the accuracy for reading the dot positions in the relaxed layout is lower (H1). As our layout indicates local variance, we hypothesize that comparisons between close underlying data values are more accurate (H2). The original column-based layout is ragged and contains high and low peaks close to each other. This might cause issues with the perception of the density in the underlying data distribution. When focusing on local minima of value frequency, it is our hypothesis that the apparent depth in the relaxed layout is closer to the KFE of the underlying data distribution than with the column layout (H3). Each hypothesis relates to a specific aspect of both dot plots and is tested through one experiment. All these experiments are designed as two-alternative forced-choice (2AFC) [25] to avoid bias and provide easy input methods for participants. The 2AFC design is sufficient to provide evidence for our hypotheses without going into overly complex studies to extract just noticeable differences. Additionally, there are no missing answers with "no choice" because participants have to decide between the two alternatives. The study has a within-subject design.
Procedure We started with a short introduction to explain how dot plots are read. At the beginning of each task, we provided additional explanations and a test question with immediate feedback (more details in supplemental material [20]). Further training does not seem to have been necessary, confirming the low cognitive threshold to access information in dot plots [15]. In the experiment for H1, we showed a plot and highlighted one dot, as in Fig. 10. Participants were asked to read off the displayed dot position on the x-axis. They were presented with a correct choice that represented the exact display position of a dot and an incorrect value that was offset by our independent variable δ 1 (Fig. 11). This task was designed to compare the influence of vertically aligned points (column-based) and evenly distributed points (relaxed) on value-reading performance. Participants were asked not to use additional tools. Besides the reading strategy, the resolution of the tick marks could also affect reading accuracy. Therefore, the wrong choice was offset in units of distance between tick marks on the x-axis. The range of δ 1 ∈ {±0.005, ±0.25, ±0.5} ensured easier and very difficult tasks (determined in pilot studies).
For H2, we highlighted two dots and asked which one represented the higher value (see Fig. 10). The distance δ 2 between the underlying values of these dots was the independent variable (see Fig. 11). With this setup, we examined which layout was better suited to decide which of two data values was higher. Differences in lateral position between small dots are easier to notice than with large ones (same reasoning as with error metric in Sect. 5.3). Therefore, the values of the two dots that represented the choices in this experiment were spaced apart in units of displayed radii with δ 2 ∈ {0, ±2/3, ±4/3}. Regarding H3, we showed a plot in the center and two smooth silhouette shapes on the sides (see Fig. 10). Participants were tasked with selecting the shape that best matched the plot. With this experiment, we wanted to compare the perception of minima in both plots' silhouettes. To this end, we used the envelope of the underlying data distribution and generated Hermite splines to connect the two local maxima that surrounded a local minimum. Fig. 11 shows how the independent variable δ 3 controls attraction between the Hermite spline and the envelope when positive, filling the valley, or repulsion when negative, creating a "deeper valley". The range of δ 3 ∈ {±0.2, ±0.4, ±0.6} was set in units of distance between the envelope and spline.
For crowd-sourcing, we created 360 2AFC questions in batches (3 tasks × 3 data set sizes × 5 values for δ × 2 plot types × 4 batches). Additionally, we asked test questions with stimuli adjusted to have very obvious answers. Participants with an accuracy rate below 80% on these questions were removed from the analysis.
Analysis In total, we collected 90 responses from each of the 100 participants. Fig. 12 shows the distribution of correct responses for each task and δ. In the value-reading task, we checked the distribution of heights of selected dots because it could be a confounding variable. The distributions did not differ significantly between both plot types (two-sample Kolmogorov-Smirnov test, D = 0.2, p = 0.182). The selected data values for the task were drawn randomly and are the same in both variants; therefore, possible influences of horizontal anchoring effects are rooted in the layout algorithm and actually part of the dependent variable. The distributions of correct answers do not differ considerably. The percentage of correct answers for traditional dot plots (Mdn = 0.78) is not significantly different from our relaxed ones (Mdn = 0.75), which we confirm with a Mann-Whitney test, U(N column = 100, N relaxed = 100) = 5415.5, p = 0.309. Therefore, we reject our hypothesis H1 and assume that users can read the dot positions with the same accuracy in both visualization types. Note, however, that relaxed plots place the dots more correctly in the layout and, therefore, the user-read values should be closer to the underlying data.
For the comparison task, there is a large difference in variance and median (Mdn column = 0.60, Mdn relaxed = 0.86) of correctness. When there is no difference in data value, the traditional dot plots produce less variance without being more accurate. In all other cases, responses based on relaxed dot plots were consistently much more correct. We can confirm H2 with a Mann-Whitney test, U = 771.0, p ≪ 0.001.
Results for the perceived shape of the plot are consistently correct when estimating a higher limit: there is no confusion when we fill in the valleys. However, participants are very erratic when choosing a lower limit for minima in dot distributions of column-based plots (overall Mdn = 0.72). Our relaxed variant shows the drop in frequency more clearly, allowing participants to distinguish between the real depth and the artificial lower offset (overall Mdn = 0.92). We can confirm H3 with a Mann-Whitney test, U = 1876.5, p ≪ 0.001.
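For readers unfamiliar with the U statistic reported above, the following minimal sketch computes it from two samples using midranks for ties. The function name mann_whitney_u and the toy accuracy values are illustrative assumptions and are not the study data; p-values are not computed here. With 100 participants per group, U ranges from 0 to 10,000, and a value near 5,000 indicates no systematic difference between the groups.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two samples, using midranks for tied values."""
    # Pool both samples and remember the group of each value (0 = a, 1 = b).
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    rank_sum_a = 0.0
    i, n = 0, len(pooled)
    while i < n:
        # Find the run of tied values pooled[i:j].
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_a, n_b = len(sample_a), len(sample_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a


# Toy per-participant accuracies (hypothetical values for illustration only).
column = [0.6, 0.5, 0.7]
relaxed = [0.9, 0.8, 0.7]
print(mann_whitney_u(column, relaxed))  # (0.5, 8.5)
```

A small U for one group, as in this toy case, means that its values rank almost entirely below those of the other group.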

Blue Noise Property and Moiré Effect
An important advantage of our relaxed dot plots is that they observe the blue noise property. It is a community standard in computer graphics to show this quality using Fourier analysis [24] (FA) and Differential Domain Analysis [29] (DDA). Dot plots are inherently non-uniform distributions (except for artificial corner cases that are evenly dotted with 1:1 aspect ratio). Since FA is not applicable to non-uniform point distributions without introducing strong assumptions (e.g., through window functions), we use DDA to assess the quality of dot plots instead. The main mathematical difference between those methods is that FA uses a cosine kernel, whereas DDA uses a Gaussian kernel. For the interested reader, we refer to the original publication [29] for a detailed derivation of DDA from FA. Accordingly, DDA does not show direction-dependent energy of frequencies but direction-dependent distribution of distances. Note that, before applying DDA, we transform our signed distances sd() to positive distances using max(0, sd(s i, s j) − 2r min), with s i and s j being dot pairs and 2r min being the minimum dot diameter. The DDA plots in Fig. 8 quantitatively show the influence of relaxation regarding blue noise. The traditional nonlinear dot plot shows regular patterns induced by the static column layout, whereas the relaxed dot plot shows the typical eclipse-like pattern of blue noise. The peak energy drops from 7475 to 178, i.e., about 2.4% of the peak energy remains. As expected, the relaxed dot plot does not establish isotropic blue noise, as indicated by minor patterns around the eclipse. However, there are almost no low-frequency components in the center and the rest of the energy is distributed in a noise-like manner.
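The clamping of signed distances described above can be sketched as follows. The pairwise function center_gap is a hypothetical stand-in for sd(), which is defined elsewhere in the paper; the tuple layout (x, y, r) and the name positive_pair_distances are likewise assumptions made for illustration.

```python
from itertools import combinations
from math import hypot


def center_gap(a, b):
    """Hypothetical stand-in for sd(): center distance minus both radii."""
    (ax, ay, ar), (bx, by, br) = a, b
    return hypot(ax - bx, ay - by) - ar - br


def positive_pair_distances(dots, r_min, sd=center_gap):
    """Apply max(0, sd(s_i, s_j) - 2 * r_min) to every unordered dot pair."""
    return [max(0.0, sd(a, b) - 2.0 * r_min) for a, b in combinations(dots, 2)]


# Three dots given as (x, y, radius); the smallest radius in the plot is 0.25.
dots = [(0.0, 0.0, 1.0), (3.0, 0.0, 1.0), (0.0, 4.0, 1.0)]
print(positive_pair_distances(dots, r_min=0.25))  # [0.5, 1.5, 2.5]
```

Pairs that are closer than one minimum dot diameter are clamped to zero, so no negative distances enter the subsequent analysis.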
Blue noise plots (BNPs) by van Onzenoodt et al. [28] placed approximately half the dots on a horizontal center line. This leads to a strong line in the DDA and many weaker horizontal stripes. In the supplemental material [20], we provide a DDA of the same BNP as in Fig. 8 but exclude the dots on the center line. It shows an oval eclipse and a noisy distribution of energy around it. As the bee swarm plot (BSP) creates a tight packing of same-sized circles, its DDA has mostly diagonal patterns. The sharp edges of the center eclipse are due to the layout where dots are all of the same size and can touch but will not overlap. In contrast, the eclipse in the DDA of our relaxed layout is blurred because the dots are of different sizes.
Fig. 13: Dot plots with overlap marked in red (panel labels: MOD ≈ 0.0547, MOD ≈ 0.1083, MOD ≈ 0.0177). The columnar plot (left) draws dots 5% smaller than the diameter during the layout to create separation. Relaxed variants with the same size reduction would have too much overdraw (center). Therefore, we shrink dots by 20% to avoid most overlap (right).
In the context of dot plots, a moiré pattern can emerge when the sampling scheme of the viewer becomes too similar to the dot pattern. For traditional dot plots, this effect usually occurs when reducing the plot size (e.g., using bilinear interpolation for downsampling) or increasing the viewing distance (which has a similar effect). In such a scenario, the columns of dots align with columns of pixels. Since every column might have a slightly different width, it aligns more or less perfectly with the pixels and can create virtual tilted lines. Our relaxed dot plots, however, show sufficiently good blue noise, so a moiré effect is less likely to occur: there are no intense frequencies to "catch on."

DISCUSSION
Overdraw and Dot Distinguishability When two dots overlap, the circle area becomes ambiguous and difficult to recognize (see Fig. 13). Dots rarely overlap in column-oriented plots-usually only when the underlying value frequency changes rapidly. Lloyd relaxation with signed distance functions also tends to keep the dots apart, but the horizontal placement correction is not aware of circle boundaries and increases overlap. The traditional nonlinear plots used circle padding, i.e., they reduced the display size of dots (typically by 5%) to gain vertical separation and neutralize minute horizontal overlay. In comparison, the relaxed layout with v = 0.3 (Fig. 13, center) has more overdraw. With the same padding, densely packed dots fuse and become difficult to distinguish. As a compromise between faithful data presentation and readability, we apply a padding of 20% to decrease the mean dot overlap in display diameters (MOD) to a negligible amount.
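The effect of padding on overlap can be illustrated with a small sketch. The normalization by the smaller displayed diameter is an assumption chosen for illustration and is not necessarily the MOD definition used in this paper; the name overlap_depth and the tuple layout (x, y, r) are hypothetical as well.

```python
from math import hypot


def overlap_depth(a, b, padding=0.0):
    """Overlap of two dots as a fraction of the smaller displayed diameter.

    Each dot is (x, y, layout_radius); padding shrinks the displayed radius,
    e.g., padding=0.05 draws the dot 5% smaller than its layout size.
    """
    (ax, ay, ar), (bx, by, br) = a, b
    ra, rb = ar * (1.0 - padding), br * (1.0 - padding)
    depth = max(0.0, ra + rb - hypot(ax - bx, ay - by))
    return depth / (2.0 * min(ra, rb))


# Two unit dots whose centers are 1.8 apart, i.e., slightly too close.
a, b = (0.0, 0.0, 1.0), (1.8, 0.0, 1.0)
for padding in (0.0, 0.05, 0.2):
    print(padding, round(overlap_depth(a, b, padding), 4))
```

In this toy case, the 5% padding roughly halves the overlap, whereas the 20% padding removes it entirely.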
Computational Complexity and Performance For this discussion, we assume that the size of the dataset is n ≪ 100,000 since the number of distinguishable dots is limited.
The practical runtime of sweep algorithms for dot plots is small. They run in linear time after sorting the input data, usually with a sorting algorithm that has an average computational complexity of O(n log n). Our approach is different. The complexity for the envelope (see Sect. 4) depends on both n and the resolution r for sampling the KFE. However, it is linear in both variables.
The runtime of GPU-based jump-flooding for the calculation of the Voronoi diagram mainly depends on the texture size, which has a lower bound defined by the ratio between the largest and smallest diameter of the dots in the plot. Therefore, it scales with the square of the value frequency range, not the size of the dataset. The complexity for determining the centroids, which are used for moving the dots in our relaxation step, also depends on the resolution of the texture. A CPU-based implementation would have O(n log n) complexity to compute the Voronoi diagram, followed by linear complexity to compute the centroids.
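To make the dependence on texture resolution concrete, the following sketch performs one plain Lloyd iteration on a pixel grid. It is a simplification under stated assumptions: the brute-force nearest-site search stands in for GPU jump flooding, and the additional constraints and heuristics of the relaxed layout are omitted. The name lloyd_step and the grid setup are illustrative.

```python
def lloyd_step(sites, width, height):
    """One plain Lloyd iteration on a width x height pixel grid.

    Every pixel center is assigned to its nearest site (a discrete Voronoi
    diagram); each site then moves to the centroid of its assigned pixels.
    """
    sums = [[0.0, 0.0, 0] for _ in sites]  # x sum, y sum, pixel count per site
    for py in range(height):
        for px in range(width):
            cx, cy = px + 0.5, py + 0.5  # pixel center
            nearest = min(
                range(len(sites)),
                key=lambda i: (sites[i][0] - cx) ** 2 + (sites[i][1] - cy) ** 2,
            )
            acc = sums[nearest]
            acc[0] += cx
            acc[1] += cy
            acc[2] += 1
    # Sites without any pixel keep their position.
    return [
        (sx / n, sy / n) if n else site
        for (sx, sy, n), site in zip(sums, sites)
    ]


sites = [(1.0, 2.0), (7.0, 2.0)]
print(lloyd_step(sites, width=8, height=4))  # [(2.0, 2.0), (6.0, 2.0)]
```

The centroid accumulation visits every pixel once, which is why its cost follows the texture resolution; only the brute-force search in this sketch additionally depends on the number of sites.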
Our tunneling swaps algorithm makes use of the almost-sorted dot positions. Thus, we can obtain a correct order in linear time, e.g., using the insertion sort algorithm. Then, we iterate the sorted list of dots and limit the search to a neighborhood because only dots that are closer to the ideal position are possible candidates for a swap. Accordingly, this algorithm has linear complexity O(n) with respect to the size of the underlying data set with a frequency-dependent factor for neighborhood search.
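The claim that almost-sorted positions can be ordered in linear time can be checked with a short sketch of insertion sort that counts element shifts. It covers only the sorting step, not the tunneling swaps themselves, and the name insertion_sort_count is an illustrative assumption.

```python
def insertion_sort_count(values):
    """Sort a copy of values and count how many element shifts were needed.

    The shift count equals the number of inversions in the input, so the
    total work is O(n + inversions): linear for almost-sorted data.
    """
    a = list(values)
    shifts = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Move larger elements one slot to the right to make room for key.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
            shifts += 1
        a[j + 1] = key
    return a, shifts


print(insertion_sort_count([1, 2, 4, 3, 5, 7, 6]))  # ([1, 2, 3, 4, 5, 6, 7], 2)
print(insertion_sort_count([5, 4, 3, 2, 1]))        # ([1, 2, 3, 4, 5], 10)
```

When relaxation moves each dot only slightly per iteration, few neighbors change order, and the shift count stays small compared to the quadratic worst case shown in the second call.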
Our open-source implementation is a web-based D3.js plugin for maximum portability. Notably, this technology choice introduces considerable GPU-CPU communication overhead in each iteration (WebGL readPixels()), which will be entirely avoidable as more compute-oriented technologies become available (WebGPU). Plots with 1,000 points achieve interactive responses (≤ 1 s). Larger examples with high dynamic range require more time due to the above-mentioned technical constraints (≈ 5 s for Fig. 8).
Readability Our proposed visualization works with both linear and nonlinear scaling. Traditional column-based linear dot plots provide a regular grid that gives viewers implicit auxiliary lines to track the vertical and horizontal position of dots. The introduction of nonlinear scaling to traditional dot plots enables the visualization of larger and high-dynamic-range data. At the same time, it sacrifices the ease with which the vertical position of dots can be read. Additionally, the small and regularly arranged graphics primitives lead to unwanted moiré effects. Advantages of having guides for the horizontal position remain.
Are implicit vertical auxiliary lines in column-based dot plots actually a benefit? The ease with which users can read the value of a dot instills unjustified confidence because the display position itself is flawed: all dots in a column share the same x coordinate. Our relaxed layout does not align dots, neither vertically nor horizontally. This leads to a more correct positioning and avoids unwanted moiré effects.

CONCLUSION AND FUTURE WORK
We analyzed the dot-count-based scaling of traditional nonlinear dot plots (which required strict columns) and derived frequency-based functions for scaling of value frequencies in underlying data. This enabled us to place dots more freely to create relaxed dot plots. The proposed visualization is more faithful to the underlying data than previous variants: The dot placement has less horizontal error. The lack of columns better shows local variance and allows interpretation of perfectly straight columns. The envelope shape represents the data distribution more accurately. Also, our result is more aesthetically pleasing with no moiré effect to speak of or any need for vertical blurring of columns. The lack of strict columns might lower cognitive barriers, making our relaxed variant of dot plots better suited for visualization for the masses. In a user study, we were unable to find differences in the accuracy of reading the numerical value of the dots' display positions despite the lack of vertical aids. The study also confirmed that comparisons between the values of dot pairs are more accurate with our proposed plot type. We were able to confirm that, in contrast to column-based layouts, the perceived silhouette of relaxed dot plots matches the underlying frequency distribution more closely.
Future work might entail custom envelope scaling functions that are not connected to previous nonlinear dot plots or avoid their intrinsic quadratic distortion. The incremental nature of the relaxation algorithm allows for efficient animated plots of time-varying data where dots move, appear, and vanish. Animated plots do not require high accuracy and could reuse previous Voronoi diagrams and dot positions, meaning that each frame would only require a single iteration of the relaxation. An important aspect we have planned for future open-source development is a WebGPU port to eliminate GPU-CPU communication during iteration. Instead of relying on circles, relaxed dot plots might generalize to arbitrarily shaped glyphs via generalized Voronoi diagrams. For example, lines or ellipses could encode fuzzy and uncertain data points.