We explore the effects of applying the various hierarchical display configurations supported by HiVE to a dataset of property transactions for addressing research questions. We then propose guidelines (section 7) from the issues identified. The exploration process is reflected in the sequence of figures in this paper and the accompanying video.
6.2 Layouts for ordinal data
Most research questions benefit from using 1D orders that are independent of rectangle size. For these, slice-and-dice (VT and HZ) and ordered-squarified (OS) layouts are suitable. The choice of layout partly depends on the number of ordinal values and the aspect ratio of the space available. Ordered squarified is particularly suitable where there are many values (e.g. the 108 months in each borough shown in Fig. 3B). Slice-and-dice may be more suitable where there are fewer categories. Alternating VT and HZ through the hierarchy can produce layouts similar to mosaic plots (and matrix diagrams if sizes are fixed). Such alternating layouts are particularly suitable where variables have hierarchical dependencies, such as our calendar views (sHier($yr,$mn)).
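The slice-and-dice idea can be sketched in a few lines of Python. This is a minimal illustration rather than HiVE's implementation: the function name, the rectangle representation and the reading of HZ as left-to-right and VT as top-to-bottom are assumptions made for the example.

```python
from typing import List, Tuple

# A rectangle as (x, y, width, height); this representation is an assumption.
Rect = Tuple[float, float, float, float]


def slice_and_dice(rect: Rect, sizes: List[float], direction: str) -> List[Rect]:
    """Split rect into strips, in the given order, proportional to sizes.

    direction "HZ" places children left to right and "VT" places them top
    to bottom (assumed meanings of the two layout codes).
    """
    x, y, w, h = rect
    total = sum(sizes)
    if total <= 0:
        raise ValueError("sizes must sum to a positive value")
    strips: List[Rect] = []
    offset = 0.0  # fraction of the parent already consumed
    for size in sizes:
        frac = size / total
        if direction == "HZ":
            strips.append((x + offset * w, y, frac * w, h))
        elif direction == "VT":
            strips.append((x, y + offset * h, w, frac * h))
        else:
            raise ValueError("direction must be 'HZ' or 'VT'")
        offset += frac
    return strips


# A fixed-size calendar view: nine years stacked vertically,
# each year split into twelve months horizontally.
years = slice_and_dice((0.0, 0.0, 120.0, 90.0), [1.0] * 9, "VT")
calendar = [slice_and_dice(year, [1.0] * 12, "HZ") for year in years]
```

Because the order of children is taken directly from the input list, position carries the ordinal value regardless of rectangle size, which is the property the text relies on.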
Fig. 2. A: Size-based ordering, coloured by average price: sHier(/,$br,$ty,$yr,$mn); sLayout(/,SQ); sSize(/,$sal); sColor(/,Ø,Ø,Ø,$prc). B: Reconfigure to a spatial and temporal layout: oLayout(/,1,SP); oLayout(/,2,OS); oLayout(/,3,VT); oLayout(/,4,HZ). C: Fix the size: oSize(/,1,FIX); oSize(/,2,FIX); oSize(/,3,FIX); oSize(/,4,FIX). D: Remove time, and colour by deviation from expected sales: oCut(/,4); oCut(/,3); oColor(/,2,$xsl).
6.3 Layouts for time-based data and questions
Temporal data can be considered as ordinal. In Fig. 1A, years are not arranged temporally; as such, temporal trends are difficult to detect. Rearranging the years into a time-based order using an ordered space-filling layout (Fig. 1B) makes the increase in annual house price easier to detect. In Fig. 1C, we have added month to the hierarchy, producing calendar views coloured by the number of sales.
Seasonal variations in the numbers of sales are apparent for flats and terraced housing. However, colour rescaling (using oColorMap) or colour schemes that are local to individual parts of the hierarchy are required to detect these patterns where property types have low sales. Alternatively, colour can be used to show values as a proportion of, or deviation from, a baseline. Appropriate baselines include those that reflect the values expected from hypotheses that we might then accept or reject on the basis of the display. For example, in Fig. 4A (calendar views), our null hypothesis is that the number of sales does not vary monthly (expected or baseline values are a twelfth of the sales for each year). The geographically-consistent seasonal trends that are apparent might cause us to reject our null hypothesis. Identifying the elements with statistically-significant levels of variation might help us make that choice. Fig. 4B shows the deviation of price from the yearly average (accounting for inflation). Whilst prices rise steadily through most years, 2008 is an exception: prices dropped markedly in the final quarter, a trend not observed in Westminster.
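The monthly baseline described above is simple to compute. The following Python sketch assumes that the deviation is the plain difference between observed and expected sales; the text does not give the exact definition of $xsl, and the function name and sample figures are invented for illustration.

```python
from typing import List


def monthly_deviation(sales_by_month: List[float]) -> List[float]:
    """Return each month's sales minus the null-hypothesis expectation.

    The expectation is a twelfth of the year's total, i.e. no monthly
    variation. Using a plain difference is an assumption of this sketch.
    """
    if len(sales_by_month) != 12:
        raise ValueError("expected exactly twelve monthly values")
    expected = sum(sales_by_month) / 12.0
    return [sales - expected for sales in sales_by_month]


# Invented figures for one year in one borough (January to December).
example_year = [60, 70, 90, 100, 110, 130, 140, 130, 110, 100, 90, 70]
deviations = monthly_deviation(example_year)
```

Positive values would map to one end of a diverging colour scheme and negative values to the other, with zero (no deviation from the baseline) at the neutral midpoint.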
Nesting the two temporal resolutions of year and month to produce calendar views is appropriate where we are expecting yearly and monthly patterns. However, this may obscure other temporal patterns. In Fig. 3B, we use an ordered squarified layout of all 108 months in the period, ordered from the top left to the bottom right (compare with the calendar views in Fig. 3A). Although both graphics show exactly the same data, the use of $my and the associated OS layout in Fig. 3B makes the upward trend in prices and the subsequent slump more apparent, as it is a continuous trend over the entire period. The result is a more appropriate layout for research questions that relate to ongoing rather than periodic change. The additional hierarchical level used in Fig. 3A and alternative layouts are more appropriate for comparing annual patterns, which are overshadowed by the longer-term trend in the case of this attribute. Again, interactive colour rescaling or colouring on the basis of relative values is required to detect relative rises and falls in different boroughs.
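A single month-in-period variable such as $my can be derived from the separate year and month values. The sketch below is one possible derivation, not the authors' stated method; the function name and the choice of 2000 as the first year are assumptions for illustration.

```python
def month_index(year: int, month: int, first_year: int) -> int:
    """Zero-based position of (year, month) counted from January of first_year."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    if year < first_year:
        raise ValueError("year precedes the start of the period")
    return (year - first_year) * 12 + (month - 1)


# Assumed nine-year period starting in 2000, giving 108 consecutive months.
first = month_index(2000, 1, first_year=2000)
last = month_index(2008, 12, first_year=2000)
```

Ordering rectangles by this single index removes the year boundary from the layout, which is why a continuous trend reads more easily than in the nested calendar view.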
6.4 Geographical layouts
Spatially-ordered layouts (SP) have rectangles that are arranged according to their geographical locations. The effect of this layout can be seen by comparing the non-spatial layout in Fig. 2A with the spatial layout in Fig. 2B, in which flats overwhelmingly dominate sales near Central London whereas sales of other types are proportionally higher in peripheral areas, sometimes exceeding those of flats. Fig. 3 also uses a spatial layout, facilitating the detection of spatial patterns in average price trends – south and east London have the lowest prices and central and southwestern areas have the highest prices.
Fig. 3. Sales by borough and month, sized by the number of sales and coloured by the average price. A: Uses calendar views of time: sHier(/,$br,$yr,$mn); sLayout(/,SP,VT,HZ); sSize(/,$sal); sColor(/,Ø,Ø,$prc). B: Uses all 108 months in the period ordered from the top left: oCut(/,3); oCut(/,2); oInsert(/,2,$my); oLayout(/,2,OS); oColor(/,2,$prc).
Fig. 4. Boroughs containing calendar views, coloured by deviation from 'expected'. A: Red indicates higher sales than the yearly average; blue indicates fewer sales: sHier(/,$br,$yr,$mn); sLayout(/,SP,VT,HZ); sSize(/,$sal); sColor(/,Ø,Ø,$xsl). B: Brown indicates higher prices than average for the year; turquoise indicates lower prices: oColor(/,3,$xpr).
Spatially-ordered layouts can also apply to multiple levels of a hierarchy. In Fig. 5B, two spatial units of increasing granularity are nested and spatially arranged. High spatial variation is apparent within boroughs. For example, in Lambeth, wards with the highest average price are closer to Central London, whereas the converse is true in Camden. The space-filling nature of these cartograms often results in positional inaccuracies, which can be conveyed using displacement vectors. Where absolute locations are required for research questions, these can be encoded using a perceptually-constant 2D colour-space or by using a different layout.
We use animated transitions to relate the layouts in Figs. 5C, 5D and 5E (this method for relating layouts has been found to be effective) – see video. The layouts that use absolute space show more of the spatial subtleties of the patterns, e.g. the high average house prices linearly arranged from the centre to the southwest. However, occluding layouts such as Fig. 5C are difficult to interpret on their own, although they may be useful when animated transitions to other layouts are provided. Layouts whose geometrical elements do not fill space completely produce less data-dense graphics when dimensionally stacked.
oSwap is a useful operator for OD-maps, which are raster-based origin-destination maps – sHier(/,$oc,$dc); sLayout(/,SP); sSize(/,FIX); sColor(/,$fl) – in which $oc is the originating grid cell, $dc is the destination grid cell and $fl is the volume of flow between the given origin and destination cells. oSwap enables directionality in the origins and destinations to be explored. This example also illustrates that datasets may have multiple locations, both of which may be added to the hierarchy, in this case producing raster maps of destinations embedded in raster maps of origins.
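The hierarchy operators can be pictured as edits to an ordered list of conditioning variables. The Python sketch below is not HiVE's implementation: it models only the hierarchy, ignores the path argument (/), and assumes that oSwap takes two 1-based level numbers, which the text does not state. The function names o_cut, o_insert and o_swap are hypothetical.

```python
from typing import List


def o_cut(hier: List[str], level: int) -> List[str]:
    """Return a copy of the hierarchy with the given 1-based level removed."""
    if not 1 <= level <= len(hier):
        raise IndexError("level out of range")
    return hier[: level - 1] + hier[level:]


def o_insert(hier: List[str], level: int, variable: str) -> List[str]:
    """Return a copy with variable inserted so that it occupies the given level."""
    if not 1 <= level <= len(hier) + 1:
        raise IndexError("level out of range")
    return hier[: level - 1] + [variable] + hier[level - 1:]


def o_swap(hier: List[str], a: int, b: int) -> List[str]:
    """Return a copy with the variables at 1-based levels a and b exchanged."""
    if not (1 <= a <= len(hier) and 1 <= b <= len(hier)):
        raise IndexError("level out of range")
    out = list(hier)
    out[a - 1], out[b - 1] = out[b - 1], out[a - 1]
    return out


# OD-map: swapping levels embeds origin maps inside destination maps.
od_map = o_swap(["$oc", "$dc"], 1, 2)

# The operator sequence from the Fig. 3B caption, applied to Fig. 3A's hierarchy.
fig3 = ["$br", "$yr", "$mn"]
fig3 = o_cut(fig3, 3)
fig3 = o_cut(fig3, 2)
fig3 = o_insert(fig3, 2, "$my")
```

Returning new lists rather than mutating in place keeps each intermediate state available, which mirrors the way successive figure panels record each reconfiguration step.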
Comparing layouts where space is discretised differently is one way of studying the effect of the modifiable areal unit problem. Fig. 6 shows a spatial arrangement where, instead of conditioning the data by administrative unit, we use 4 km² grid squares, in which we embed calendar views (sLayout(/,VT,HZ); sHier(/,$yr,$mn)). Fixing the sizes of both the spatial units and the rectangles and using a spatial arrangement results in a layout that imposes a regular tessellated grid on absolute geographical space (at the $gd level), upon which geographical boundaries can be drawn.
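Conditioning by grid square rather than by administrative unit amounts to binning each transaction's coordinates. The Python sketch below shows one way to do this, assuming projected coordinates in metres; the function names, the origin and the cell size used in the example are illustrative assumptions, not values taken from the study.

```python
import math
from collections import Counter
from typing import Iterable, Tuple


def grid_cell(easting: float, northing: float,
              origin_e: float, origin_n: float,
              cell_size: float) -> Tuple[int, int]:
    """Return the (column, row) of the square grid cell containing a point.

    Rows count upwards from the origin, following northings; a display
    would normally flip them so that north is at the top.
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    col = math.floor((easting - origin_e) / cell_size)
    row = math.floor((northing - origin_n) / cell_size)
    return (col, row)


def sales_per_cell(points: Iterable[Tuple[float, float]],
                   origin_e: float, origin_n: float,
                   cell_size: float) -> Counter:
    """Count transactions falling in each grid cell."""
    return Counter(grid_cell(e, n, origin_e, origin_n, cell_size)
                   for e, n in points)


# Three invented transaction locations and an invented grid definition.
counts = sales_per_cell(
    [(530500.0, 180250.0), (531900.0, 181999.0), (532100.0, 180250.0)],
    origin_e=500000.0, origin_n=150000.0, cell_size=2000.0)
```

Changing cell_size or the origin and comparing the resulting displays is a direct way of probing how sensitive an observed pattern is to the choice of areal unit.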
Fig. 5. Cartograms and maps. A: Rectangular cartogram: sHier(/,$br); sLayout(/,SP); sSize(/,$sal); sColor(/,$prc). B: Hierarchical rectangular cartogram: oInsert(/,2,$wd); oLayout(/,2,SP); oColor(/,1,Ø); oColor(/,2,$prc). C: As B, but using absolute positioning: oCut(/,2); oLayout(/,1,SA). D: Gastner cartogram (polygon layout; sized by sales): oLayout(/,1,PG). E: Map (as D, but using geographical shape): oSize(/,$abr). $abr is the borough area.
Fixing the sizes of rectangles reduces their individual information-carrying capacity but facilitates more consistent overall layouts. It also reduces the cartogram effect, resulting in data of lower significance (low sales, therefore low sample sizes) being displayed with equal prominence. The average prices shown in row 5, col 9 of Fig. 6B correspond to low sales (see the corresponding cell in Fig. 6A) but they are given more prominence than in layouts where rectangles are sized by sales. As such, this (equally valid) view of the data must be interpreted slightly differently – perhaps in conjunction with a version that is coloured by the number of sales. We suggest side-by-side comparison or animated transition to help relate views such as these.
Geography does not necessarily have to be at the base of the hierarchy. In Fig. 7, we place boroughs at the second level of the hierarchy, apply the oSize(FIX) operator to fix the size of rectangles, remove the final two hierarchical levels and reconfigure level 2 to map-based layouts (Fig. 7C). This small multiple map layout allows the recognisable shapes of boroughs to be preserved, but at the expense of space-efficiency and space-efficient dimensional stacking.
6.5 Layouts for nominal data
We recommend that a consistent ordering be used for nominal values. In Figs. 2B, 2C and 2D, we consistently order flats, terraced, semi-detached and detached types. The ordering used should be selected to reflect some ordinal sequence to encourage comparison (unlike in Fig. 2A – see Redbridge). We have ordered these by likely floor-space.
The numbers of sales vary markedly between the property types, resulting in some rectangles (e.g. detached houses in the centre) being too small to be easily resolvable. In Fig. 2C, we fix the size of each rectangle (grey shows no data; there are few detached house sales in the City of London). Fixing the rectangle size may draw more attention to these than warranted, so these displays should be used in conjunction with a version that is coloured by sales, either using a fade transition or placing them side-by-side (as is the case in Fig. 6).
To investigate how relative sales of different property types vary spatially, we can form a null hypothesis that the ratio of sales between the property types is spatially invariant. To test this hypothesis, we use the average sales proportions of flats (49%), terraced (31%), semi-detached (16%) and detached (4%) for the whole area to establish a baseline and then show the deviation from this. Fig. 2D (this uses a linear and symmetrical diverging colour scheme) shows that we can probably reject our null hypothesis. Sales of flats are higher than the London average in the centre (the consistent ordering ensures flats are always in the top left), more semi-detached housing than average exists towards the periphery and no borough has the average proportion. By modifying the hierarchy (with the oCut, oInsert and oSwap operators), reconfiguring the layouts (oLayout and oSize), changing the colour (oColor and oColorMap) and establishing alternative baselines, alternative hypotheses can be investigated to address different research questions.
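The baseline comparison for property types can be sketched as follows. The proportions are those quoted above; everything else is an assumption of the sketch, including the function name, the use of a plain difference between observed and expected counts, and the invented borough figures.

```python
from typing import Dict

# London-wide sales proportions by property type, as quoted in the text.
BASELINE: Dict[str, float] = {
    "flat": 0.49,
    "terraced": 0.31,
    "semi-detached": 0.16,
    "detached": 0.04,
}


def deviation_from_expected(sales: Dict[str, float]) -> Dict[str, float]:
    """Return observed minus expected sales for each property type.

    Expected sales are the borough's total split according to BASELINE,
    i.e. the null hypothesis that the type mix is spatially invariant.
    """
    total = sum(sales.get(ptype, 0.0) for ptype in BASELINE)
    return {ptype: sales.get(ptype, 0.0) - total * share
            for ptype, share in BASELINE.items()}


# Invented counts for a hypothetical central borough.
central = {"flat": 700, "terraced": 200, "semi-detached": 80, "detached": 20}
central_deviation = deviation_from_expected(central)
```

Because the deviations within a borough sum to zero, an excess in one type is always matched by a deficit elsewhere, which suits a diverging colour scheme that is symmetrical about the baseline.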
In Fig. 7, we study the consistency of price by type, space and time by colouring layouts by the coefficient of variation of price. The instability of colour suggests that many of the sample sizes are too small to give reliable estimates of price variation; nevertheless, colour is relatively consistent by borough and different spatial patterns can be detected for each property type. In Fig. 7C, we fix the size of the rectangles, remove the temporal attributes from the hierarchy and switch the layout to polygons. This results in small multiple choropleth maps conditioned by type (sHier(/,$ty,$br); sLayout(/,OS,PG); sSize(/,FIX,$abr)).
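The coefficient of variation is the standard deviation divided by the mean. The Python sketch below uses the sample standard deviation; whether the figures use the sample or the population form is not stated, so that choice, the function name and the sample prices are assumptions.

```python
import statistics
from typing import Optional, Sequence


def coefficient_of_variation(prices: Sequence[float]) -> Optional[float]:
    """Return the sample standard deviation of prices divided by their mean.

    Returns None when there are fewer than two prices or the mean is zero,
    since the measure is undefined in those cases.
    """
    if len(prices) < 2:
        return None
    mean = statistics.mean(prices)
    if mean == 0:
        return None
    return statistics.stdev(prices) / mean


# Invented prices for one property type, borough and month.
cv = coefficient_of_variation([100.0, 200.0, 300.0])
undefined = coefficient_of_variation([250000.0])
```

With very few sales in a cell, a single unusual price can move the result substantially, which is consistent with the unstable colouring noted above.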