Recently, the existence of considerable amount of redundancy in the Internet traffic has stimulated the deployment of several redundancy elimination techniques within the network. These techniques are often based on either packet-level Redundancy Elimination (RE) or Content-Centric Networking (CCN). However, these techniques cannot exploit sub-packet redundancies. Further, other alternative techniques such as the end-to-end universal compression solutions would not perform well either over the Internet traffic, as such techniques require infinite length traffic to effectively remove redundancy. This paper proposes a memory-assisted universal compression technique that holds a significant promise for reducing the amount of traffic in the networks. The proposed work is based on the observation that if a source is to be compressed and sent over a network, the associated universal code entails a substantial overhead in transmission due to finite length traffic. However, intermediate nodes can learn the source statistics and this can be used to reduce the cost of describing the source statistics, reducing the transmission overhead for such traffics. We present two algorithms (statistical and dictionary-based) for the memory-assisted universal lossless compression of information sources. These schemes are universal in the sense that they do not require any prior knowledge of the traffic's statistical distribution. We demonstrate the effectiveness of both algorithms and characterize the memorization gain using the real Internet traces. Furthermore, we apply these compression schemes to Internet-like power-law graphs and solve the routing problem for compressed flows. We characterize the network-wide gain of the memorization from the information theoretic point of view. In particular, through our analysis on power-law graphs, we show that non-vanishing network-wide gain of memorization is obtained even when the number of memory units is a tiny fraction of the total number- of nodes in the network. Finally, we validate our predictions of the memorization gain by simulation on real traffic traces.