For years, businesses have used ad-hoc technologies to analyze huge amounts of data related to their domain of interest, aiming to extract relevant information for elaborating successful company strategies. These technologies have focused essentially on structured data. In particular, Data Warehousing systems are the decision support systems on which academia and industry have focused their attention. It is believed that "about 80% of the information of any organization is contained in unstructured and semi-structured documents", so limiting the analysis to structured data alone, as has been done so far, is likely to lose a high percentage of potentially useful knowledge. Since text is the primary means of disseminating information and knowledge, it is necessary to introduce concepts related to text-oriented Business Intelligence and Document Warehousing systems, which could have many useful applications in industry and other large domains. In this paper we present a prototype application of a Document Warehousing system, highlighting challenges and solutions for each phase of its lifecycle. The prototype addresses the Security and Prevention domain and is built with a set of open-source tools whose features and limitations are highlighted. To the best of our knowledge, the organization and configuration of the fundamental elements of a Document Warehouse system lifecycle have not yet been explored in depth. Furthermore, we have not yet found a Document Warehousing application implemented by integrating the open-source tools that we use in our prototype.