Categorising AWS Common Crawl Dataset using MapReduce | IEEE Conference Publication | IEEE Xplore