To understand the dependencies in the power system, we try to derive state information by combining high-rate voltage time series captured at different locations with data analysis at different scales. This may enable large-scale simulation and modeling of the grid. Data captured by our recently introduced Electrical Data Recorders (EDR) and power grid simulation data are stored in the Large Scale Data Facility (LSDF) at Karlsruhe Institute of Technology (KIT) and are growing rapidly in size. In this article, we compare classic sequential multithreaded time series data processing with distributed processing using Pig on a Hadoop cluster. Furthermore, we present our ideas for a better organization of our raw data and metadata that is indexable, searchable, and suitable for big data.