Because recommendation systems tackle the problem of information overload, processing huge datasets cannot be avoided. When these datasets no longer fit into the RAM of a computing node, a scalable data storage approach is required. Although database systems are frequently used for this purpose, they have disadvantages and, when not properly designed, may slow down the recommendation process. In this paper we propose an alternative file-based data storage approach that is particularly well suited to high-performance computing environments, where using a database may not always be an option. By breaking the recommendation process down into separate phases and carefully structuring the input and output of each phase, we have built a file-based recommendation system that scales proportionally with the number of computing nodes and the number of processor cores available in each node.
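To make the idea of phases connected only through files more concrete, the following is a minimal Python sketch, not the system described in this paper. It assumes a hypothetical two-phase pipeline: phase 1 partitions (user, item, rating) rows into shard files by user id, and phase 2 reads each shard independently and writes a per-user mean rating, which stands in for a real recommendation computation. The function names, the CSV layout, and the file naming scheme are all illustrative assumptions.

```python
import csv
from collections import defaultdict
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def partition_ratings(ratings, out_dir, n_shards):
    """Phase 1 (hypothetical): write (user, item, rating) rows to shard files.

    All rows of one user land in the same shard, so later phases can
    process shards without talking to each other.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    buckets = defaultdict(list)
    for user, item, rating in ratings:
        buckets[user % n_shards].append((user, item, rating))
    paths = []
    for shard in range(n_shards):
        path = out_dir / f"ratings_{shard}.csv"
        with path.open("w", newline="") as f:
            csv.writer(f).writerows(buckets[shard])
        paths.append(str(path))
    return paths


def profile_shard(in_path):
    """Phase 2 (hypothetical): read one shard, write per-user mean ratings."""
    sums = defaultdict(lambda: [0.0, 0])  # user -> [rating total, rating count]
    with open(in_path, newline="") as f:
        for user, _item, rating in csv.reader(f):
            acc = sums[int(user)]
            acc[0] += float(rating)
            acc[1] += 1
    src = Path(in_path)
    out_path = src.with_name(src.name.replace("ratings_", "profiles_"))
    with out_path.open("w", newline="") as f:
        writer = csv.writer(f)
        for user in sorted(sums):
            total, count = sums[user]
            writer.writerow([user, total / count])
    return str(out_path)


def run_pipeline(ratings, work_dir, n_shards, workers=1):
    """Run both phases; shards of phase 2 are independent units of work."""
    shard_paths = partition_ratings(ratings, work_dir, n_shards)
    if workers <= 1:
        return [profile_shard(p) for p in shard_paths]
    # One process per core; on a cluster, each node could instead be
    # handed its own subset of shard files.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(profile_shard, shard_paths))
```

Because each phase reads and writes only its own files, no shard depends on another within a phase. In this sketch that independence is what would allow the work to be spread over processor cores (the `workers` argument) or over nodes that share a file system. When calling `run_pipeline` with `workers` greater than 1 from a script, the call should sit under an `if __name__ == "__main__":` guard, as is usual for process pools.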