Characterizing and subsetting big data workloads | IEEE Conference Publication | IEEE Xplore