Interactive Visual Analysis of Complex Scientific Data as Families of Data Surfaces

The widespread use of computational simulation in science and engineering provides challenging research opportunities. Multiple independent variables are considered and large and complex data are computed, especially in the case of multi-run simulation. Classical visualization techniques deal well with 2D or 3D data and also with time-dependent data. Additional independent dimensions, however, provide interesting new challenges. We present an advanced visual analysis approach that enables a thorough investigation of families of data surfaces, i.e., datasets, with respect to pairs of independent dimensions. While it is almost trivial to visualize one such data surface, the visual exploration and analysis of many such data surfaces is a grand challenge, stressing the users' perception and cognition. We propose an approach that integrates projections and aggregations of the data surfaces at different levels (one scalar aggregate per surface, a 1D profile per surface, or the surface as such). We demonstrate the necessity for a flexible visual analysis system that integrates many different (linked) views for making sense of this highly complex data. To demonstrate its usefulness, we exemplify our approach in the context of a meteorological multi-run simulation data case and in the context of the engineering domain, where our collaborators are working with the simulation of elastohydrodynamic (EHD) lubrication bearing in the automotive industry.

SECTION 1

Introduction

Simulation is used in science and engineering to study a wide range of problems and to understand underlying models and investigated phenomena. Interactive visual analysis helps professionals to explore simulation results and to understand and explain the data. The readily available computing power also makes it possible to investigate multiple simulation runs for a given case scenario. The input parameters (independent variables) are varied and the values of the output parameters (dependent variables) are computed for each combination of the input parameters. The resulting collection of simulation runs needs to be analyzed by exploring individual simulations and how they relate, i.e., what patterns emerge.

One important motivation for the study of multi-run simulation data is to perform a sensitivity analysis of the computation. To do so, new ways of visual data exploration and analysis are needed — data from multiple simulation runs are very complex to study (compared to conventional simulations where just time and/or space are considered as independent variables). Traditional scientific data, and corresponding visualization and analysis methods, are usually tuned for this more traditional data model.

We present an advanced visual analysis approach that supports the analysis of families of data surfaces, i.e., datasets that are seen with respect to pairs of independent variables (dimensions). While it is straightforward and easy to visualize one such data surface, the visual exploration and analysis of a large number of such data surfaces poses significant stress to the user's perception and cognition. We propose an approach that carefully integrates different projections and aggregations of the data surfaces at three different levels (one scalar aggregate per surface, a 1D profile per surface, or the surface as such).

There is an increasing number of application domains, including industrial simulation and meteorology, for example, in which it becomes natural to automatically consider multiple simulation runs for analysis, and accordingly there is increased need for appropriate visualization solutions. Konyha et al. [12] showed how to analyze data from multiple simulations of 1D CFD in an automotive injection system. They limited their approach to the investigation of families of curves, i.e., the consideration of the data with respect to one variable. A large number of important real-world problems can benefit from this approach. Other problems, however, have complex data sets that should be considered as families of surfaces. Examples include avalanche warning systems in mountains and the simulation/measurement of the seabed as used in modern tsunami warning systems.

In the following, we first use one illustrative example, a study of a historic climate scenario, to introduce the proposed methodology. Although this meteorological example is not a real case study, it is useful as an easily understandable illustration. Later, we discuss a more detailed case study from the engineering domain, based on the simulation of elastohydrodynamic (EHD) lubrication bearing in the automotive industry, which stems from an actual inter-disciplinary collaboration amongst the authors of this paper. Along with these two cases, we demonstrate the necessity for a flexible system that integrates a large number of different (linked) views for making sense of this complex data scenario, and we propose new interaction and analysis techniques that make it possible to deal well with such data.

SECTION 2

Related Work

The body of literature about the visualization of large, high-dimensional, and time-dependent data sets is impressive and the field is still an area of active research [1], [3], [19], [20].

The exploration of large data sets [11] is based on the idea to present the data in a visual form that would allow analysts to interact with it. Data visualization techniques, when suited for the given data set, reduce the cognitive load while performing analysis tasks. A visualization technique should have limited visual overlap, fast learning, and good recall [11]. Furthermore, good integration with traditional techniques (including simulation) improves the data exploration process.

Time-dependent data is a very important category of data sets. Brushing the time axis to display details of the selected time frame is one very common and useful interaction technique used with static representations [9], [12], [15].

Aigner et al. [1] provide an overview of visual methods for analyzing time-oriented data and discuss general aspects of time-dependent data. The time factor requires special treatment during visual exploration. Two cases are distinguished based on the time dependence of the visual representations: time-dependent (dynamic) and time-independent (static) representations. Examples of visualization techniques for multivariate time-dependent data include the ThemeRiver [8] and the Spiral Graph [22].

Time-dependent (serial) data often exhibit some periodic behavior. Such serial periodic data are of special interest. For example, time continues forward serially but includes recurring periods (weeks, months, and years) [4]. The challenge is how to simultaneously display serial and periodic attributes of a data set.

All of these methods consider each dimension in a multidimensional space to be a scalar value (numeric, categorical, nominal, or the like). In the case of time-dependent data they handle it as an isolated case, or aggregate the data in order to get scalar values.

The challenges that result from a complex internal data structure can be tackled, for example, by the interactive visual analysis of families of curves [12]. That approach provides analysis procedures and practical aspects of interactive visual analysis that are specific to this type of data. Multiple linked views combined with advanced interactive methodology support iterative visual analysis by providing means to create complex, composite brushes that span multiple views and that are constructed using different combination schemes and that respect the 1D data series (curves) as a data (sub-)structure [5].

Time-independent representations are well explored but still have room for new innovations. Some of the recent findings include Lexis pencils [6] that map various time-dependent variables to the faces of a pencil. Another interesting approach uses extruded parallel coordinates, linking with wings, and three-dimensional parallel coordinates, integrated in a single rendering system, that visualize trajectories of higher-dimensional dynamical systems [23].

Interaction techniques allow the user to better understand the data by interacting with it. One of the well-established techniques is Focus+Context (F+C) visualization [7]. When the amount of data is too large to be displayed, the user should be able to focus on specific data sets of interest while keeping track of the context (the entire dataset). There are four groups of F+C techniques: distortion-oriented, overview methods, filtering, and in-place techniques [3]. Focus+Context visualization is often used in a multiple linked view setup that supports linking and brushing.

The display of surfaces from volume data is standard in visualization [13]. Surfaces can succinctly represent complex three- and multidimensional data. These kinds of surfaces are created from sampled scalar data in three spatial dimensions.

The user's perception and cognition of surfaces can be improved by designing perceptually near-optimal visualizations [21]. Such a design is achieved by collecting perceptual characteristics of visualization methods, and exploring them to discover principles and insights to guide the design of visualizations [10].

SECTION 3

Illustrative Example and Proposed Methodology

We first describe an example from the climate research field to illustrate the challenges and then introduce the newly proposed technology.

3.1 A Sample Analysis of Multi-run Climate Data

Meteorological data provide a prime example of a collection of long-term multi-dimensional data sets. The relevance and broader impact of the results gathered from meteorological data are tremendous. The time scale ranges from hourly and daily weather forecasts to long-term climate change. There are many efforts to facilitate collection, storage and exchange of meteorological and related environmental data.

One such effort is the Potsdam Institute for Climate Impact Research (PIK — http://www.pik-potsdam.de/), where researchers in the natural and social sciences work together to study global change and its impacts on ecological, economic, and social systems. A combination of data analysis, computer simulations, and models is used to study meteorological and related data. The researchers at PIK collaborate with researchers worldwide in order to be able to predict future climate changes. Relevant progress with respect to better predictions can only be made if the past is well understood. As part of this research they investigate − amongst other cases, of course − climate scenarios around several meltwater outbreak events of proglacial Lake Agassiz, i.e., an immense glacial lake located in the center of North America and formed about 12,000 years ago [2].

Figure 1
Fig. 1. a. Two linked views, parallel coordinates depicting three simulated values and a scatterplot showing control parameters. A gradient brush over Greenland temperatures shows how it is related to diffv (Diff_V) and diffh (Diff_H) control parameters, and negative correlation between Greenland temperature and tropical temperature, and respectively with global precipitation. A plain table with 50,000 rows was analyzed. b. A curve view showing the Arctic water parameter. Each curve represents 500 time steps of a simulation run with a particular set of control parameters. There are 100 curves depicted, one for each combination of diffv and diffh.

The investigated data describes the simulated climate response to one of these outbreak events. The more than 4,000 year long lifespan of this Lake Agassiz provides an exciting case for simulation and data analysis. Approximately 8,000 years ago, the lake drained due to climate warming and melting of the surrounding Laurentide Ice Sheet (LIS). The here investigated multi-run climate simulation is based on the PIK Climber 2.3 model and simulates a cooling of about 3.6 K over the North Atlantic induced by this meltwater pulse from Lake Agassiz, routed through the Hudson strait.

To get a better understanding of the variability of their climate simulations with respect to external parameters of the simulation, they compute multiple runs of the simulation with varied parameters. They varied two diffusivity parameters (one horizontal, diffh, and one vertical, diffv, both with respect to the ocean part of this model, 10 variations each), a total of 100 (10 × 10) runs. Each simulation run spans 500 years of annual data.

As a result, they aggregated 35 different values from the more detailed raw simulation data. The aggregates include CO2 concentration, global surface air temperature, surface air temperatures for both hemispheres, land surface air temperature, ocean surface air temperature, Greenland temperature, global precipitation, ice areas for both hemispheres, salinity information, various differential measures (heat transport, ice transport), etc. For each of the 100 simulation runs all of these result values are available for 500 years (one time step per year).

The conventional approach would be to represent the data corresponding to one run as a table with 500 rows, each corresponding to one time step. There would be columns for the time step (an independent variable differentiating the rows in one simulation run) and output dimensions like surface air temperature. As we have 100 simulation runs, a large table consisting of 100 × 500 = 50,000 rows can be created. In this case, there are three independent parameters: diffh, diffv, and time. Such a large table can be explored and analyzed using coordinated multiple views so that experts gain deeper understanding. Some interesting relations between different climate descriptors can be seen instantly, e.g., that the average Greenland temperature is negatively correlated with the average temperature in the tropical regions, as depicted in Figure 1a. There is also a strong correlation between Greenland temperature and the simulation control parameter diffv.
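The flat table described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `greenland_temperature` is a made-up placeholder for one simulated output, and the parameter values are plain indices rather than real diffusivities.

```python
from itertools import product

# Sizes taken from the text: 10 x 10 parameter variations, 500 annual steps.
N_DIFFH, N_DIFFV, N_YEARS = 10, 10, 500


def greenland_temperature(diffh, diffv, year):
    """Hypothetical stand-in for a simulated value (NOT the climate model)."""
    return -30.0 + 0.1 * diffh - 0.2 * diffv + 0.001 * year


def build_flat_table():
    """Return one row (dict) per combination of diffh, diffv, and year."""
    rows = []
    for diffh, diffv, year in product(
        range(N_DIFFH), range(N_DIFFV), range(N_YEARS)
    ):
        rows.append(
            {
                "diffh": diffh,
                "diffv": diffv,
                "year": year,
                "greenland_temp": greenland_temperature(diffh, diffv, year),
            }
        )
    return rows


flat_table = build_flat_table()
print(len(flat_table))  # 100 runs x 500 time steps = 50000
```

In such a table every row is independent, which is exactly why the membership of rows in a run or a time step is no longer visible in the structure itself.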

If analysts are interested in the development of various results over time, such an approach would be complicated to use (since coherency is lost in the sense of which data rows belong to which time step). All time steps belonging to the same set of control parameters should also show up in the visualization coherently. Konyha et al. [12] showed how an advanced data model with explicit support for time series in the data can be exploited for advanced visual analysis. In our case, this would mean merging all 500 rows from one simulation run into one row containing time series in the dependent dimensions. There would be 100 rows in such a table, but each would contain many time series dimensions as advanced data types. Figure 1b shows an example of 100 time series (one per simulation run), representing the Arctic water parameter, visualized using a function graphs view. This view shows multiple function graphs simultaneously. If there are many curves, a density transfer function can be used to depict areas with fewer curves more transparently. A set of outliers that deserves further investigation (possibly indicating an error in the model) is shown in Figure 1b.
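Merging the rows of one run into a single time series can be sketched as a grouping step. The function name `merge_runs`, the column names, and the tiny table are assumptions made for illustration only.

```python
from collections import defaultdict


def merge_runs(flat_rows, value_key):
    """Merge all rows of one run (same diffh, diffv) into one time series."""
    by_run = defaultdict(dict)
    for row in flat_rows:
        by_run[(row["diffh"], row["diffv"])][row["year"]] = row[value_key]
    # Order each series by time step so the curve shape is preserved.
    return {
        run: [steps[year] for year in sorted(steps)]
        for run, steps in by_run.items()
    }


# Made-up table: 2 runs x 3 time steps, deliberately shuffled.
rows = [
    {"diffh": 0, "diffv": 0, "year": 2, "arctic_water": 1.2},
    {"diffh": 0, "diffv": 1, "year": 0, "arctic_water": 2.0},
    {"diffh": 0, "diffv": 0, "year": 0, "arctic_water": 1.0},
    {"diffh": 0, "diffv": 1, "year": 1, "arctic_water": 2.1},
    {"diffh": 0, "diffv": 0, "year": 1, "arctic_water": 1.1},
    {"diffh": 0, "diffv": 1, "year": 2, "arctic_water": 2.2},
]

curves = merge_runs(rows, "arctic_water")
print(curves[(0, 0)])  # [1.0, 1.1, 1.2]
```

With the sizes from the climate example, the same grouping would turn 50,000 rows into 100 curves of 500 values each.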

Generally, we can interpret such data as a collection of substructures, i.e., a collection of data subsets that form time series as addressed above (one per climate descriptor and run, and with 500 time steps each). For other application questions, we consider different substructures: they can be small 2D data tables of their own (one per climate descriptor and time step, and with 10 × 10 = 100 values in the table, i.e., one per instance pair of the two varied diffusivity parameters). We call these small subsets data surfaces, or just surfaces, and all surfaces representing one climate descriptor a family of surfaces. We merge all original data items which have the same time step, and organize the 35 simulated dimensions as 35 families of 500 surfaces each. For each time step, a data attribute such as Greenland temperature, for example, is now a function of two variables f(x,y), in our case f(diffh,diffv). This is true for each simulated value. We have 500 rows in the table now, each containing one scalar value − the time step, and 35 surfaces in the form surface = f(diffh,diffv).
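The regrouping into surfaces can be sketched in the same spirit. This is only an illustration under stated assumptions: parameter values are taken to be grid indices, `build_surfaces` is a hypothetical helper, and the data is synthetic.

```python
def build_surfaces(flat_rows, value_key, n_diffh, n_diffv):
    """Return {year: surface} with surface[h][v] = value at (diffh=h, diffv=v)."""
    surfaces = {}
    for row in flat_rows:
        surface = surfaces.setdefault(
            row["year"], [[None] * n_diffv for _ in range(n_diffh)]
        )
        surface[row["diffh"]][row["diffv"]] = row[value_key]
    return surfaces


# Synthetic data: 2 x 2 parameter grid, 3 time steps.
# Each value encodes its own coordinates as 100 * year + 10 * diffh + diffv.
rows = [
    {"diffh": h, "diffv": v, "year": t, "greenland_temp": 100 * t + 10 * h + v}
    for t in range(3)
    for h in range(2)
    for v in range(2)
]

family = build_surfaces(rows, "greenland_temp", n_diffh=2, n_diffv=2)
print(family[1])  # [[100, 101], [110, 111]]
```

Applied to the climate data, one such family would hold 500 surfaces of 10 × 10 values for a single descriptor.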

Data organized in this way offers new and unique analysis opportunities. However, existing interactive visual analysis technology does not support it. Visualizing one data surface is trivial; visualizing and exploring an entire family is a challenge. We introduce new methodology for the visual analysis of such data.

Analysts want to gain deep insight into the data and to understand both the data and the underlying simulation model. They want to differentiate the surfaces according to overall characteristics, or by considering the variation of climate descriptor values in surfaces along their horizontal (or vertical) diffusivity axis. In order to support such procedures we propose to use various levels of aggregation of the surfaces. The simplest one is to capture one surface as one aggregation scalar, such as the surface's maximum, minimum, median, mean, or span. These scalars can be easily visualized using parallel coordinates, for example. Figure 2a shows parallel coordinates visualizing five standard Greenland temperature aggregates (for 500 surfaces each). In the next step, experts want to investigate these surfaces with respect to each of the two axes of the surfaces. They might be interested in how a climate descriptor value changes along the horizontal (or vertical) diffusivity axis. They want to compare how this changes along the time steps. As the view of the surface along one axis can be considered as a collection of curves, we use the curve view to depict the surfaces at this level. Figure 2b shows Greenland temperature as seen along each of the axes. We depict all respective (curve-typed) cross-sections through the surfaces in this case. Only one surface was selected; the rest are shown as context in Figure 2b. Note the different shapes which the surfaces from the collection have. This single surface is represented with a single polyline in the parallel coordinates view. Finally, professionals want to see the surface itself at some point of the analysis. Although we cannot efficiently visualize whole families of surfaces at once, we can visualize one surface (or a few of them) using a real 3D view or a 2D height map. Figure 2c shows the selected surface from the last step as a 3D surface and a height map.

Figure 2
Fig. 2. Different ways of depicting a family of surfaces. a. Standard scalar aggregates of Greenland temperatures can give a first overview of the surface family. b. Only one surface is selected here. It is shown as a single polyline in the parallel coordinates, or as a collection of function graphs as seen along each of the axes. All respective (curve-typed) cross-sections through the surfaces are shown. c. The 3D surface view and the 2D height map view of the selected surface.

3.2 Interactive Visual Analysis of Families of Surfaces

The analysis of families of surfaces is a very complex task and depends on the application domain. However, we can identify some standard analysis steps supported by interactive visual analysis. Furthermore, interactive visual analysis is the only way a domain expert can cope with the significant complexity of such data, especially if doing a cross-analysis of several families of surfaces. There are three levels in this process. At the top level, the user is interested in overall trends and in high-level correlations. The most efficient way to perform such tasks is to represent one surface in the family by one aggregated scalar (or several scalars). Once this high-level analysis is done, the user is ready to drill down. However, more data is needed; scalar aggregates do not suffice. We have identified various profiles of the surfaces, which, when used simultaneously, support the cross-surface analysis at this level. Finally, as we do have surfaces, all our collaborators needed them in order to understand the profiles better. It is much easier to understand a surface if it is visualized as a surface.

For multivariate, multidimensional data we use a coordinated multiple views system to pursue the analysis. The system is capable of depicting scalars using various standard views, function graphs which will originate from various profiles at the second level in our case, and real 3D surfaces as 3D surfaces or 2D height maps.

3.2.1 Analysis Through Single Scalar Aggregates (Top Level)

At the early stage we want to get familiar with the data and to explore high level relations between various surface families and other dimensions in the data set. It is very convenient to have scalar aggregates. We have identified five most often used aggregates, i.e., the minimum, maximum, mean, median, and data span (max–min). All of these aggregates are automatically created for each family of surfaces, and all of them are available automatically for analysis. The analysis uses coordinated multiple views with multiple scalar dimensions at this stage.
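The five aggregates named above can be computed per surface as in the following sketch. It assumes a surface is stored as a nested list of numbers; the function name and the two tiny surfaces are illustrative only.

```python
from statistics import mean, median


def scalar_aggregates(surface):
    """Reduce one surface (nested list of numbers) to five scalar aggregates."""
    values = [value for row in surface for value in row]
    lowest, highest = min(values), max(values)
    return {
        "min": lowest,
        "max": highest,
        "mean": mean(values),
        "median": median(values),
        "span": highest - lowest,  # data span (max - min)
    }


# Made-up family of two 2 x 2 surfaces.
family = [
    [[1, 2], [3, 6]],
    [[0, 0], [4, 8]],
]

aggregates = [scalar_aggregates(surface) for surface in family]
print(aggregates[0]["span"])  # 5
```

Each resulting dictionary corresponds to one polyline in a parallel coordinates view with five axes.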

Figure 3
Fig. 3. a. The parallel coordinates depict scalar aggregates of Greenland temperature. Low maximum values of Greenland temperature are selected and corresponding profiles along each axis are highlighted. The unusual shape might indicate an unstable phase of simulation (first 40 years). b. The same surface for 200 time steps.

We have identified several analysis patterns. Brushing through the parameter space (independent variables) in order to understand its influence and to perform a sensitivity analysis is often done first. The next step is to select either wanted or undesired output values in order to eventually detect a pattern in the input parameters causing these outputs. In our meteorological example we have often selected cases with low maximum and high minimum values. Sometimes, as in the EHD lubrication bearing case described in Section 5, the engineers are interested in the highest maxima, since, e.g., high pressure is undesirable and they want to see which cases cause the extremes. Figure 1a shows an example of analysis at this level for a single surface. An example consisting of more views and values for more families can be constructed in a similar way. Not all of the aggregates were used in the same way. Minimum and maximum are used most actively: the user interacts with them directly and selects ranges there. The span is also sometimes used in an active way. The mean and median aggregates, on the other hand, are rarely brushed actively; they serve as an overview and provide important visual feedback (as a passive visualization with focus–context discrimination).
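Selecting surfaces with a low maximum and a high minimum amounts to range brushes on two aggregate axes. The sketch below assumes the brushes are combined with a logical AND; the thresholds, values, and the `brush` helper are hypothetical.

```python
def brush(aggregates, key, low, high):
    """Return indices of surfaces whose aggregate `key` lies in [low, high]."""
    return [
        index
        for index, aggregate in enumerate(aggregates)
        if low <= aggregate[key] <= high
    ]


# Hypothetical per-surface aggregates, one entry per time step.
aggregates = [
    {"min": -35.0, "max": -31.0},
    {"min": -30.8, "max": -30.5},
    {"min": -31.0, "max": -28.0},
    {"min": -30.5, "max": -27.5},
]

low_max = brush(aggregates, "max", -40.0, -30.0)
high_min = brush(aggregates, "min", -31.0, 0.0)

# Assumed AND-combination of the two brushes.
selected = sorted(set(low_max) & set(high_min))
print(selected)  # [1]
```

In a linked-view setup the resulting index list would drive the highlighting in every other view.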

Figure 4
Fig. 4. Automatically configured multiple profiles of a surface family used at the second level of analysis. Several such views were usually used, in addition to other views depicting scalars and surfaces.

3.2.2 Analysis Through Aggregated Profiles

The complex data often requires deeper analysis and the exploration of complex cross-family relations. Scalar aggregates are certainly not sufficient for this level. As surfaces are dependent on two variables, users are often interested in surface behavior with respect to one of the independent variables. Furthermore, a common task identified was to explore how various surface parameters behave with respect to each of the independent variables. Simple aggregates, like the overall maximum, are not enough: users want to see the maximum with respect to one variable, while keeping the other independent. As we use various profiles along one axis, the resulting data is a function graph per surface. Just as we have identified standard scalar aggregates, we have seen that users most often use standard profiles. We suggest using the maximum, minimum, mean, median, and all curves profiles. If we want to create curve profiles, we have to select either x or y in f(x,y) first. Once the variable is selected, we can use a cutting plane parallel to one axis (dependent on the selection of the independent variable in focus) and create a collection of cuts. The intersection of the cutting plane and the surface defines a curve. We can use all possible cuts along one axis (since the data is discretized we get a finite number of cuts), or choose one particular cut from a predefined set. In order to depict curves at this stage we use the curve view.
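For a discretized surface, the collection of cuts is simply the set of rows or columns of the value grid. The following sketch assumes the storage convention surface[i][j] = f(x_i, y_j); the `cuts` helper and the sample values are illustrative.

```python
def cuts(surface, keep_axis):
    """Return all cross-sections of surface[i][j] = f(x_i, y_j) along one axis.

    keep_axis=0 keeps x as the independent variable: one curve over x per y_j.
    keep_axis=1 keeps y as the independent variable: one curve over y per x_i.
    """
    if keep_axis == 0:
        return [list(column) for column in zip(*surface)]
    if keep_axis == 1:
        return [list(row) for row in surface]
    raise ValueError("keep_axis must be 0 or 1")


# Made-up surface with 2 samples along x and 3 samples along y.
surface = [
    [1, 5, 2],
    [4, 0, 3],
]

curves_over_x = cuts(surface, keep_axis=0)
curves_over_y = cuts(surface, keep_axis=1)
print(curves_over_x)  # [[1, 4], [5, 0], [2, 3]]
```

For a 10 × 10 surface this yields ten curves of ten points per axis, which is what an "all curves" profile would display.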

The profiles are based on the surface value and comparison with other values on the curve. For example, the user selects to keep the x axis and, for each $x_i$, we can choose the y value dependent on where the surface has the maximum. Therefore, the surface is represented by the curve where values represent the feature along the cut. In the case of the maximum, a surface $f(x_{i1}, x_{i2})$ is replaced by a curve $f(x_{i1}) = \max_{x_{i2}} f(x_{i1}, x_{i2})$. The same is true for other profiles.
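The reduction of a surface to a profile curve can be sketched as follows. The storage convention surface[i][j] = f(x_i, y_j), the `profile` function, and the sample values are assumptions for illustration.

```python
from statistics import mean, median

# Reducers for the standard profiles named in the text.
REDUCERS = {"max": max, "min": min, "mean": mean, "median": median}


def profile(surface, keep_axis, kind):
    """Reduce surface[i][j] = f(x_i, y_j) to one curve over the kept axis."""
    if keep_axis == 0:
        # For each x_i, collect the values over all y_j.
        lines = surface
    elif keep_axis == 1:
        # For each y_j, collect the values over all x_i.
        lines = [list(column) for column in zip(*surface)]
    else:
        raise ValueError("keep_axis must be 0 or 1")
    return [REDUCERS[kind](line) for line in lines]


surface = [
    [1, 5, 2],
    [4, 0, 3],
]

max_keep_x = profile(surface, keep_axis=0, kind="max")
max_keep_y = profile(surface, keep_axis=1, kind="max")
print(max_keep_x, max_keep_y)  # [5, 4] [4, 5, 3]
```

The maximum profile is the pointwise upper envelope of all cuts taken along the kept axis, and the minimum profile is the lower envelope.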

Figure 3a illustrates a simple case. We have used parallel coordinates to depict scalar aggregates of Greenland temperature (in the upper left). Low maximum values of Greenland temperature are selected and we can observe interesting relations.

First, we see that all selected runs fall within the first 40 years of the simulation (histogram in Figure 3a). A possible explanation for this result might be an instability of the simulation in the early phase. Although this might seem to be of limited importance, it can help the simulation team to detect errors in the model and to estimate when the simulation becomes stable. The surface family (depicted as all curves on the right) also shows that the first simulations behave differently from the later ones. The blue curves in the upper right corner (they correspond to low maxima) have a reversed trend as well. They all correspond to early time steps. As time goes by the curves start to rise, and they take on a more typical shape. Figure 3b shows the surfaces for the last 200 time steps.

The users often use more than one profile at this stage of analysis. We have added a pre-configured profiles view which depicts ten profiles per surface next to each other. The users simply need all the views on the family of surfaces to understand what is going on. Figure 4 illustrates an automatically configured view depicting the Greenland temperature family of surfaces using five standard profiles along each of the two axes. It is useful to have several such views, one for each surface family in the data, during the analysis. Here we see an argument for large-scale high-resolution information displays: this is a clear case where more pixels mean more value in the analysis.

3.2.3 Analysis of Data Surfaces as Such (Lowest Level)

The scalar aggregates and curve profiles, as introduced above, offer a powerful tool for exploration and analysis of complex data. However, at some point, the users want to see the surface itself. It helps them significantly in forming a mental image of the shape, and they can interpret profiles much more easily once they also see the surface. Since we cannot visualize the whole family with all surfaces simultaneously, we use a 3D surface view and a 2D surface view in the advanced phases of the exploration process, where we are able to narrow the focus to just a few surfaces from the family.

Figure 5
Fig. 5. Generic data tuple. Each item can be scalar, but can also be a mapping. The data set which contains more tuples, contains a family or families of mappings.
Figure 6
Fig. 6. a. EXCITE Power Unit Main Bearing EHD Model used in the simulation. Topology view on the left showing main bearing wall, EHD joint, and journal and geometry view on the right. b. Parameters which were varied in the simulation. Front view of the bearing on the left, and side view on the right. The groove is used for oil supply.
Figure 7
Fig. 7. Bearing loads, acting on main bearing journal, simulated using a complex multi body simulation model which includes all moving and supporting engine parts. The valleys (red circles) correspond to cylinder firings.

The idea of the 2D surface view is to represent a surface as a rectangle (2D surface) where a discrete point (x,y) is assigned a color value based on z = f(x,y). Due to a large number of surfaces in a family, providing rectangles for all surfaces is prohibitive, except for very small families. Once the number of selected surfaces is small enough (e.g., after brushing and drill-down), the 2D surface view supports an in-depth analysis of individual surfaces. The pixel count is a limiting factor in such a display. If we use a reasonable size for an overview (something like thumbnail view for images) we can simultaneously depict up to 100 2D surfaces in a usual working environment.
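The color assignment of the 2D surface view can be illustrated with a simple linear mapping. The grayscale range 0 to 255 and the per-surface normalization are assumptions of this sketch; the text does not specify the color scale actually used.

```python
def to_gray_levels(surface):
    """Map every z = f(x, y) of one surface linearly to a gray level 0..255."""
    values = [value for row in surface for value in row]
    lowest, highest = min(values), max(values)
    # A constant surface has no range; map it to level 0 everywhere.
    scale = 255.0 / (highest - lowest) if highest > lowest else 0.0
    return [[round((value - lowest) * scale) for value in row] for row in surface]


# Made-up 2 x 2 surface.
surface = [
    [0, 1],
    [3, 5],
]

pixels = to_gray_levels(surface)
print(pixels)  # [[0, 51], [153, 255]]
```

Normalizing each surface on its own emphasizes shape; comparing absolute values across a family would instead call for one shared minimum and maximum.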

If we reduce the number of surfaces even further, the 3D surface view offers the most intuitive representation of a data surface. Interestingly, all our collaborators really needed the 3D surface view at some point of the analysis. They often switched back to it to understand the family of surfaces better.

SECTION 4

Data Model

The data set described in Section 3 is an example of data sets that are collections of data points (tuples), so that the data set under consideration is $D = \{x_1, \ldots, x_i, \ldots, x_n\}$, where $n$ is the size of the data set (the number of data points) and each data point $x_i = (x_{i1}, \ldots, x_{ij}, \ldots, x_{id})$ is a collection of attributes, one for each dimension. A tuple attribute $x_{ij}$ can be categorical, numerical, or a data series itself.

In the data set from Section 3, $x_{i1}$ is the value of diffh for the data point $i$, $x_{i2}$ is the value of diffv for the data point $i$, $x_{i3}$ is the year, and so on for the 35 further attributes. Data points can be aggregated based on the same combination of diffh and diffv values. We can then refine our data model so that for each combination of diffh and diffv values, the attribute values are data series over 500 years. The data model now has a two-level structure (Figure 5).

While these model refinements are rather trivial in this simple example, they illustrate the rationale for a two-level data model that allows us to aggregate data points based on the values in a selected dimension(s) thus restructuring the data set to have a relatively small number of data points while preserving the information content. That provides new opportunities for visualization.

More formally, in our approach we consider a two-level data set that consists of data points (tuple values) of $d$ dimensions. For each tuple $x_i$ and each data series attribute $x_{ij}$ in a data tuple, we have a separate set of "sub-tuples" with its own cardinality and dimension. The set of sub-tuples is defined as $D_{ij} = \{y_1, \ldots, y_k, \ldots, y_{n_{ij}}\}$, where $n_{ij}$ is the number of sub-tuples. A sub-tuple in $D_{ij}$ has the form $(y_1, \ldots, y_{d_{ij}})$, where $d_{ij}$ is the sub-tuple size and each sub-tuple attribute is either categorical or numerical. The sub-tuple $y_k$ is then $(y_{k1}, \ldots, y_{k d_{ij}})$.

The dimensions of data series need not overlap with the top level dimensions. While each data series has its own cardinality and dimensions, our discussion is limited to three-tuples and less ($d_{ij} \le 3$), i.e., a data series can be a sequence of numbers, a sequence of pairs of numbers, or a sequence of three-tuples. Data series values can be represented using curves or surfaces. In the case of a sequence of numbers $(y_1)$ we can use the sequence position as the function domain (independent variable) and the numbers as function values (dependent variable).

In the case of a sequence of pairs of numbers $(y_1, y_2)$, one number is used as the independent variable and the other as the dependent variable. We can use a function graph (curve) to represent the data series. In the case of a sequence of three-tuples $(y_1, y_2, y_3)$, we can select one of the data series dimensions as the dependent variable and the remaining two dimensions as independent variables, thus defining a function of two variables that can be visualized as a surface.
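The three cases can be summarized in a short sketch that turns a data series into a function table. Treating the last component of each sub-tuple as the dependent variable is an assumption of this example (the text leaves the choice open), and `as_function` is a hypothetical name.

```python
def as_function(series):
    """Interpret a data series (list of equal-size sub-tuples) as a function.

    Returns a pair (kind, mapping) where kind is "curve" or "surface" and
    mapping sends the independent variable(s) to the dependent value.
    """
    size = len(series[0])
    if any(len(sub_tuple) != size for sub_tuple in series):
        raise ValueError("all sub-tuples must have the same size")
    if size == 1:
        # Sequence of numbers: the sequence position is the function domain.
        return "curve", {pos: sub[0] for pos, sub in enumerate(series)}
    if size == 2:
        # Pairs: first component independent, second dependent (assumed).
        return "curve", {sub[0]: sub[1] for sub in series}
    if size == 3:
        # Three-tuples: first two independent, last dependent (assumed).
        return "surface", {(sub[0], sub[1]): sub[2] for sub in series}
    raise ValueError("only sub-tuple sizes up to 3 are supported")


kind_a, curve_a = as_function([(7.0,), (8.0,), (9.0,)])
kind_b, curve_b = as_function([(0.5, 1.0), (1.5, 2.0)])
kind_c, surface_c = as_function([(0, 0, 3.0), (0, 1, 4.0), (1, 0, 5.0), (1, 1, 6.0)])
print(kind_a, kind_b, kind_c)  # curve curve surface
```

A top-level tuple would then hold scalar attributes next to attributes whose values are such mappings, mirroring the two-level structure described above.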

SECTION 5

Case Study — Analysis of a Slider Bearing

For the design of internal combustion (IC) engines, the reliability of the crank train slider and thrust bearings and of the piston-to-liner contact is of central importance. Their design affects key functions such as durability, performance, wear, and engine noise. Due to increasing specific loads, all physical effects have become important and have to be considered by an advanced simulation tool: structural elasticity and dynamics, energy flow, mixed friction, and the influence of temperature and pressure upon oil viscosity. The case study is based on an interactive visual analysis of the IC engine main slider bearing, simulated with the AVL EXCITE Power Unit solver [17], [18]. The corresponding model is shown in Figure 6.

5.1 Elasto-Hydrodynamic (EHD) Bearing Model

In the hydrodynamic bearing simulation, the physical behavior of the structural parts is described by the dynamics of the elastic bodies. The resources needed for the advanced slider bearing analysis are extensive and thus the analysis is very demanding. To allow an efficient sensitivity analysis in this case, two separate strands of analysis have been performed using two different modeling depths. In the first analysis, a model of the entire four-cylinder inline IC engine was built using a simplified main bearing model and a complex MBS model which includes all moving and supporting engine parts. With this model we calculated the bearing loads acting on each bearing with sufficient accuracy. Afterwards, the most loaded bearing was selected for a detailed second analysis. Figure 7 shows the computed load. The valleys (marked with red circles) correspond to the firing of cylinders.

TABLE 1 Control Parameters
TABLE 2 Output Values

We have performed an elasto-hydrodynamic (EHD) analysis for this more detailed investigation, using an advanced numerical slider bearing model. It is represented by one main bearing wall section which carries the loads applied to the journal, as shown in Figure 6a. The aim of the analysis is to investigate a design space by varying several parameters in order to reduce bearing loads and damage, to increase bearing lifetime, and to reduce noise generation as well as friction losses.

5.2 Simulation Parameters and Results

Numerous control parameters can be defined for a simulation. We varied three design parameters: length and width of the oil groove (used for oil supply in the slider bearing), and height of the gap in the barrel shape of the bearing profile (Figure 6b). Table 1 shows the parameters used and their units. There are 9 × 5 × 5 = 225 possible parameter combinations, resulting in 225 simulation runs. A simulation run has a simulation period of two engine cycles which, for a four-stroke engine, results in four complete revolutions of the crankshaft, i.e., a rotation of 1,440 degrees around the rotational axis. Due to numerical instabilities in the first calculated engine cycle (because of imperfections of the predefined boundary conditions), only the second cycle is used for results evaluation (720 to 1,440 degrees of crankshaft rotation). In this cycle the numerical model is more stable, and afterwards the results are periodic and repeatable even if the simulation time is extended for more cycles. Seven response parameters were computed for each of the 225 simulation runs. Table 2 shows the computed parameters, the abbreviations used in the figures, and the units used.
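The arithmetic of the design space and of the evaluated crankshaft range can be checked with a short sketch. The parameter names and the assignment of the nine levels to the groove length are assumptions, and the parameters are enumerated by level index because the actual values of Table 1 are not reproduced here.

```python
from itertools import product

# Level counts from the text: 9 x 5 x 5. Which parameter has nine levels
# is an assumption made for this illustration.
LEVELS = {"groove_length": 9, "groove_width": 5, "barrel_gap": 5}

# One simulation run per combination of level indices.
runs = list(product(*(range(n) for n in LEVELS.values())))

DEGREES_PER_CYCLE = 720   # four-stroke engine: two crankshaft revolutions
SIMULATED_CYCLES = 2

total_degrees = DEGREES_PER_CYCLE * SIMULATED_CYCLES
# Only the second engine cycle is evaluated.
evaluated_range = (DEGREES_PER_CYCLE, total_degrees)
```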

The simulation tool computes the distribution of the values from Table 2 over the entire bearing shell surface. The values are computed for every degree of crankshaft revolution using regularly spaced points across the bearing shell surface. Therefore, each value in Table 2 can be seen as a data surface, spanned by two independent variables, bearing shell angle and bearing width. Each surface is discretized with 85 points over the bearing shell angle and 11 equidistant nodes over the bearing width. To reduce the amount of data and to speed up the study, the surfaces extracted from the results data have 29 values of the crankshaft revolution regularly sampled within one engine cycle. The complete simulation (225 runs) takes 4–5 days on a typical PC.
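To give a feeling for the resulting data volume, the counts stated above can be multiplied out. This is a back-of-the-envelope sketch that assumes all seven response parameters are stored at every sampled crankangle of every run. The constant names are hypothetical.

```python
RUNS = 225                # simulation runs
RESPONSES = 7             # response parameters per run (Table 2)
CRANK_SAMPLES = 29        # sampled crankangles within one engine cycle
SHELL_ANGLE_POINTS = 85   # points over the bearing shell angle
WIDTH_NODES = 11          # nodes over the bearing width

points_per_surface = SHELL_ANGLE_POINTS * WIDTH_NODES
surfaces_total = RUNS * RESPONSES * CRANK_SAMPLES
values_total = surfaces_total * points_per_surface
```

Under these assumptions the family consists of 45,675 surfaces with 935 points each, i.e., roughly 42.7 million scalar values.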

5.3 Interactive Visual Analysis

Once the data was computed, we started the analysis. The approach with multiple simulations is a new trend here, and our domain experts did not have a tool which would effectively support interactive visual analysis of multiple runs and of such complex data. Visualization of individual runs is done using various 2D and 3D charts of output parameters. Usually, a domain expert then compares results from various runs, but cannot interactively explore them. Furthermore, the experts are used to treating only crankangle (equivalent to time) as an independent variable. Since we actually have the output values distributed over the entire surface of the bearing shell, it was natural to apply our surface analysis methodology. This way new insights emerged, and our collaborators adapted to the newly proposed method very quickly.

Fig. 8. The first task was to identify cases with high asperity contact pressure, as this leads to wearing of the bearing. We selected the PRSA maximum values in the maximum projection of the PRSA surfaces and identified regions of control parameters which result in the desired low maximum asperity pressure. Note that such a high pressure appears only at certain crankangles, and this information would have been lost in a conventional analysis. Note also the interesting shapes of the PRES, CLEA, and FILL surface slices in the lower row of the figure.

The first task was to explore the distributions of asperity contact pressure. High asperity contact pressure leads to an increased load on the bearing and to wearing of the engine. Reducing the asperity contact can contribute the most to the slider bearing optimization.

We have used a scatterplot to depict groove width and height, and two histograms to depict barrel gap and crankangle (Figure 8; the barrel gap histogram is not shown). We have many simulation runs, one for each combination of groove width, height, barrel gap, and crankangle, and therefore each point in the scatterplot represents multiple runs. The function graph views in Figure 8 show maxima of the PRSA and PRES data surfaces, as well as minima of the CLEA and FILL data surfaces. The shape of the surfaces is completely invisible if only aggregates are used.
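The difference between a scalar aggregate and a maximum projection can be made concrete with a small sketch. The nested-list layout (`surface[i][j]`, with `i` indexing the bearing shell angle and `j` the bearing width), the function names, and the sample values are all assumptions for illustration.

```python
def scalar_max(surface):
    """One scalar aggregate per surface: the overall maximum."""
    return max(max(row) for row in surface)


def profile_max(surface, collapse="width"):
    """Maximum projection: collapse one independent variable and keep
    a 1D profile over the other one."""
    if collapse == "width":
        # One value per shell-angle position.
        return [max(row) for row in surface]
    if collapse == "angle":
        # One value per width node.
        return [max(column) for column in zip(*surface)]
    raise ValueError("collapse must be 'width' or 'angle'")


# Made-up 3 x 2 surface (3 shell-angle positions, 2 width nodes).
toy_surface = [[0.0, 1.0],
               [5.0, 2.0],
               [3.0, 4.0]]

overall = scalar_max(toy_surface)
over_angle = profile_max(toy_surface, collapse="width")
over_width = profile_max(toy_surface, collapse="angle")
```

The scalar keeps only the extreme value, while the profile still shows where along the remaining variable the extreme occurs. The full surface is needed to see its shape.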

The engineer selected maximum PRSA (unwanted behavior) and identified possible values of the control parameters which lead to lower maximum PRSA (gray points in the scatterplot in Figure 8). By analyzing the results from the simulation runs, we saw that asperity contact is generally reduced by increasing the oil groove length. At the same time, asperity pressure can be reduced by decreasing the oil groove width. The most interesting region was identified: a groove length of 180 degrees, which is preferable from a production point of view, and a small groove width (the green rectangle in Figure 8). This was set as the starting condition for the following analysis.

In the next step of the analysis we look at the simulation runs near the interesting region (the green rectangle in Figure 8). We use the same setup for control parameters as in the previous step. Figure 9 shows four stages of the analysis. We first brush the region of interest in the scatterplot. This brush in the left scatterplot in Figure 9a remains the same throughout the analysis. In the first steps (Figures 9b–d) we also focus on one value of the barrel gap. Later we refine the selection using a histogram brush on the barrel gap histogram (second histogram in Figure 9a, active only for Figure 9e). Results for those six combinations of groove size (as selected in the scatterplot) and always the same barrel gap (as selected in the histogram) are shown for the complete engine cycle, i.e., for all crankangles. To look at the results in more detail, the distribution of asperity contact and hydrodynamic pressure is also shown as a 3D surface, plotted over the bearing shell angle and the bearing width for all crank angles throughout Figure 9.
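The successive refinement of selections can be thought of as combining brushes. The sketch below models each brush as a set of record indices and combines brushes by intersection. That combination rule, the record fields, and all values are assumptions for illustration and are not taken from the actual system.

```python
def brush(items, predicate):
    """Return the set of indices of the items selected by one brush."""
    return {i for i, item in enumerate(items) if predicate(item)}


# Hypothetical records with a precomputed PRSA maximum (made-up values).
records = [
    {"groove_length": 180, "groove_width": 4, "barrel_gap": 2, "prsa_max": 12.0},
    {"groove_length": 180, "groove_width": 4, "barrel_gap": 6, "prsa_max": 30.0},
    {"groove_length": 180, "groove_width": 12, "barrel_gap": 2, "prsa_max": 25.0},
    {"groove_length": 90, "groove_width": 4, "barrel_gap": 2, "prsa_max": 40.0},
]

# Scatterplot brush: region of interest in groove size.
region = brush(records, lambda r: r["groove_length"] == 180 and r["groove_width"] <= 6)
# Histogram brush: a single barrel gap value.
one_gap = brush(records, lambda r: r["barrel_gap"] == 2)
# Function graph brush: high maximum asperity contact pressure.
high_prsa = brush(records, lambda r: r["prsa_max"] >= 20.0)

focus_single_gap = region & one_gap     # fixed groove region, one barrel gap
focus_high_prsa = region & high_prsa    # fixed groove region, high PRSA only
```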

Fig. 9. a. Groove dimensions are fixed using the scatterplot. The barrel gap is the same for b, c, and d; the extended range is used for e. Three different peaks in PRSA were identified and explored. The selection was refined by selecting high pressure, and the influence of the barrel gap was studied. Note the use of the 3D surface view and the 3D scatterplot, which were almost always used by the domain expert, a mechanical engineer. b. The interesting region of groove size was selected for an in-depth analysis. c. Selections were refined by selecting high PRES-MAX. d. Selections were refined by selecting high PRSA-MAX. e. Finally, the barrel gap was extended using the histogram selection and high PRSA-MAX was selected.

Figure 9b shows that the maximum asperity contact pressure for the selected cases is less than 50% of the maximum pressure in all simulation runs (PRSA-MAX graph), and that the maximum hydrodynamic pressure is also reduced by approximately 10% (PRES-MAX graph). Our domain expert likes the 3D surface view in wire-frame mode. He always uses it in order to see whether there is some strange behavior. As this view depicts many data surfaces simultaneously, all we see is a kind of envelope. He can hardly see the interior of the family of surfaces, where many data surfaces are possible (and can be clearly seen in the function graph views). Since he is often interested in extremes, he nevertheless uses this view often. By looking at the 3D surface distribution one can see that asperity contact appears at the bearing edges and mainly at one side of the bearing, due to the distribution of applied bearing moments (Figure 9b). The PRSA 3D surface view shows all peaks at one side. At the same time, the distribution of hydrodynamic pressure has a peak in the central bearing region (PRES 3D surface view and function graph view). When we apply the third brush (Figure 9c) and show only the maximum values of hydrodynamic pressure, one can see that the maximum appears at only one crank angle near top dead center, where the maximum bearing forces are acting due to the combustion in the particular cylinder. At this crank angle, the distribution of hydrodynamic pressure (PRES) is regular and the asperity contact pressure is near zero (PRSA). Therefore, a further decrease of the hydrodynamic pressure is probably not feasible with the selected parameters alone, or not without increasing the asperity contact pressure.
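Two of the quantities read off the views above, the crank angle at which a maximum occurs and the ratio of a selected maximum to the global maximum, are simple to compute once the per-crankangle maxima are available. The sketch below uses made-up numbers and hypothetical function names. It does not reproduce the values from the case study.

```python
def peak(angles, values):
    """Return (angle, value) at which the series attains its maximum."""
    index = max(range(len(values)), key=values.__getitem__)
    return angles[index], values[index]


def relative_to_global(selected_values, all_values):
    """Ratio of the maximum in the selection to the maximum over all runs."""
    return max(selected_values) / max(all_values)


# Hypothetical crankangle samples (degrees) and PRES maxima per sample.
crank_angles = [720, 900, 1080, 1260, 1440]
pres_max = [40.0, 95.0, 60.0, 55.0, 42.0]

angle_at_peak, peak_value = peak(crank_angles, pres_max)
ratio = relative_to_global([10.0, 20.0], [50.0, 20.0, 10.0])
```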

On the other hand, when we look at the maximum asperity contact pressure in Figure 9b (PRSA-MAX views), one can see three groups of characteristic peaks over the bearing shell angle. We select the highest (third) peak (brush 3 in Figure 9d) and explore the corresponding surfaces. It can be seen that the maximum asperity contacts (the PRSA graph in Figure 9d) appear at one side of the bearing but are neither the minimum nor the maximum values across all simulation runs. We can also see in the crankangle histogram that those peaks appear immediately after the maximum hydrodynamic pressure following combustion, with maximum values shifted to the bearing edge where solid contact is detected (3D surface views). Typically, the asperity contact pressure at the bearing edges can be reduced by using a barrel profile of the bearing shell. This means increasing the barrel gap as defined in Figure 6b. We wanted to see what happens as we increase the gap, and we expected the asperity contact to decrease. We extended the barrel gap histogram brush over larger values of the barrel gap; the rightmost histogram in Figure 9a shows this selection. The corresponding graphs are depicted in Figure 9e. Our results show that, contrary to our first expectations, asperity contact even increases, due to the increase of hydrodynamic pressure and the changed distribution.

Finally, it is important to see that the sensitivity analysis presented here can also serve as a first step in slider bearing optimization. We have efficiently and interactively identified the parameters with the most influence on the bearing behavior. Using our new methodology it is possible to check large amounts of results data and to identify correlations between the main design parameters and the dynamic behavior of the slider bearing. The newly proposed data model makes it possible to explore inter-surface relations efficiently. The paradigm shift from a crankangle-based concept to surface width/height based series was very fast and, once achieved, also intuitive. Next steps would be to make the simulation model more complex, for example by varying bearing shell surface properties and oil quality. The same methodology can be used to explore future designs and automatic optimization results.

SECTION 6

Conclusion

We introduce a new interactive visual analysis methodology to support the analysis and exploration of complex data originating from multi-run simulations. Such data is becoming increasingly common, and our approach significantly improves our ability to cope with the increased complexity. We introduce a novel way of considering two independent variables in the data. Such data can be understood as families of surfaces. Here we focus on how to deal with the complexity of the data. We can easily depict a million data items when using a scatter plot, but only about 10 000 items (in a legible and expressive way) using parallel coordinates (without advanced techniques such as reverting to a frequency-based representation [16]), and only a handful of surfaces when using the 3D surface view. The proposed approach is certainly not trivial and requires some learning. Domain experts were puzzled at the beginning, but then they appreciated the new technique. They realized quickly that we need to keep data surfaces coherent in order to understand complex relations and the interplay of parameters.

We identify three levels of analysis: aggregation using scalars, profiling with respect to one variable, and finally the last stage where individual surfaces are used. Due to occlusion-related problems, the final level can be used at late stages of analysis, when the user drills down to a single (or just a few) surface(s) in a family. This approach is widely applicable, here illustrated with two examples: meteorological data and the analysis of an EHD bearing from the automotive industry.

Here we deal with cases where the surfaces are regularly sampled. Surfaces with irregular sampling represent an interesting research challenge. The extension to alternative sampling strategies is an important direction of future work. In both cases discussed here (meteorology and bearing), the analysis is an a posteriori process, i.e., with no further influence on the computed simulation data. We illustrate the newly proposed technology on cases with just a few control parameters. In other cases, one could think about a coupled setup (as in the case of computational steering [14]) where new parameters are varied throughout the analysis. Then it would be necessary to re-load the analysis with respect to new surface structures in the data.

SECTION 7

Acknowledgments

The authors thank Thomas Nocke and colleagues from PIK (www.PIK-Potsdam.de) for providing the climate simulation data, Stian Eikeland for helping with the data conversion, and Johannes Kehrer for fruitful discussions (both from the visualization group in Bergen, www.ii.UiB.no/vis). The EHD simulation data is courtesy of AVL List GmbH (www.AVL.com). Part of this work was done in the scope of the VSOE VCV program at the VRVis Research Center (www.VRVis.at) and at the Center for HCI at Virginia Tech (www.HCI.VT.edu).

Footnotes

Krešimir Matković is with the VRVis Research Center in Vienna, Austria, E-mail: Matkovic@VRVis.at.

Denis Gračanin is with Virginia Tech, USA, E-mail: gracanin@vt.edu.

Borislav Klarin is with AVL AST d.o.o in Zagreb, Croatia, E-mail: Borislav.Klarin@avl.com.

Helwig Hauser is with the University of Bergen, Norway, E-mail: Helwig.Hauser@UiB.no.

Manuscript received 31 March 2009; accepted 27 July 2009; posted online 11 October 2009; mailed on 5 October 2009.

For information on obtaining reprints of this article, please send email to: tvcg@computer.org.

References

1. Visual methods for analyzing time-oriented data.

W. Aigner, S. Miksch, W. Müller, H. Schumann and C. Tominski

IEEE Transactions on Visualization and Computer Graphics, 14 (1): 47–60, 2008.

2. Simulation of the cold climate event 8200 years ago by meltwater outburst from Lake Agassiz.

E. Bauer and A. Ganopolski

Paleoceanography, 19(PA3014): 1–13, 2004.

3. Visualization of multi-variate scientific data.

R. Bürger and H. Hauser

In EuroGraphics State of the Art Reports (STARs), pages 117–134, 2007.

4. Interactive visualization of serial periodic data.

J. V. Carlis and J. A. Konstan

In UIST '98: Proceedings of the 11th Annual ACM Symposium on User Interface Software and Technology, pages 29–38. ACM Press, 1998.

5. Interactive feature specification for focus+context visualization of complex simulation data.

H. Doleisch, M. Gasser and H. Hauser

In G.-P. Bonneau, S. Hahmann and C. D. Hansen, editors, Proc. of the Joint EUROGRAPHICS - IEEE TCVG Symp. on Vis., 2003.

6. Visualisation of historical events using Lexis pencils.

B. Francis and J. Pritchard

Advisory Group on Computer Graphics, 1997.

7. Generalizing focus+context visualization.

H. Hauser

In Scientific Visualization: The Visual Extraction of Knowledge from Data, pages 305–327. Springer, 2006.

8. ThemeRiver: Visualizing thematic changes in large documents collections.

S. Havre, E. Hetzler, P. Whitney and L. Nowell

IEEE Transactions on Visualization and Computer Graphics, 8 (1): 9–20, 2002.

9. Dynamic query tools for time series data sets: timebox widgets for interactive exploration.

H. Hochheiser and B. Shneiderman

Information Visualization, 3 (1): 1–18, 2004.

10. An approach to the perceptual optimization of complex visualizations.

D. H. House, A. S. Bair and C. Ware

IEEE Transactions on Visualization and Computer Graphics, 12 (4): 509–521, 2006.

11. Visual exploration of large data sets.

D. A. Keim

Communications of the ACM, 44 (8): 38–44, 2001.

12. Interactive visual analysis of families of function graphs.

Z. Konyha, K. Matković, D. Gračanin, M. Jelović and H. Hauser

IEEE Transactions on Visualization and Computer Graphics, 12 (6): 1373–1385, 2006.

13. Display of surfaces from volume data.

M. Levoy

IEEE Computer Graphics and Applications, 8 (3): 29–37, 1988.

14. Interactive visual steering - rapid visual prototyping of a common rail injection system.

K. Matkovic, D. Gracanin, M. Jelovic and H. Hauser

IEEE Transactions on Visualization and Computer Graphics, 14 (6): 1699–1706, 2008.

15. Strategies for the visualization of geographic time-series data.

M. Monmonier

Cartographica, 27 (1): 30–45, 1990.

16. Outlier-preserving focus+context visualization in parallel coordinates.

M. Novotný and H. Hauser

IEEE Transactions on Visualization and Computer Graphics, 12 (5): 893–900, 2006.

17. Quality and Validation of Cranktrain Vibration Prediction - Effect of Hydrodynamic Journal Bearing Models.

G. Offner

In Multi-body Dynamics - Monitoring and Simulation Techniques III, 2004.

18. Simulation of Vibration and Structure Borne Noise of Engines - A Combined Technique of FEM and Multi Body Dynamics.

H. H. Priebsch and J. Krasser

1998.

19. State of the Art: Coordinated & Multiple Views in Exploratory Visualization.

J. C. Roberts

In G. Andrienko, J. C. Roberts and C. Weaver, editors, Proc. of the 5th International Conference on Coordinated & Multiple Views in Exploratory Visualization. IEEE CS Press, 2007.

20. The Visual Display of Quantitative Information

E. R. Tufte

Graphics Press, Cheshire, Connecticut, second edition, 2001.

21. Information Visualization: Perception for Design

C. Ware

Morgan Kaufmann Publishers, second edition, 2004.

22. Visualizing time-series on spirals.

M. Weber, M. Alexa and W. Müller

In Proc. of the IEEE Symp. on Information Visualization, pages 7–13, 2001.

23. Visualizing the behavior of higher dimensional dynamical systems.

R. Wegenkittl, H. Löffelmann and E. Gröller

In Proceedings of the IEEE Visualization (VIS '97), pages 119–125, 1997.

Authors

Krešimir Matković, Member, IEEE CS

Denis Gračanin, Member, IEEE

Borislav Klarin

Helwig Hauser, Member, IEEE


© Copyright 2011 IEEE – All Rights Reserved