Efficient Methods on Reducing Data Redundancy in the Internet

School of Science | Doctoral thesis (article-based) | Defence date: 2015-10-30
90 + app. 58
Aalto University publication series DOCTORAL DISSERTATIONS, 153/2015
The transformation of the Internet from a client-server paradigm to a content-based one has left many fundamental network designs outdated. The growth of user-generated content, instant sharing, flash popularity, and similar phenomena calls for an Internet that is prepared for them and can serve the needs of small-scale content providers. Today's Internet carries and stores a large amount of duplicate, redundant data, primarily because it lacks duplicate detection mechanisms and caching principles. This redundancy costs the network in several ways: it consumes energy in the network elements that must process the extra data; it makes network caches store duplicate data, so that the tail of the data distribution is swapped out of the caches; and it increases the load on content servers, which must always serve the less popular content themselves. In this dissertation, we analyze these phenomena and propose several methods to reduce redundancy in the network at low cost. The proposals take different approaches, including data chunk level redundancy detection and elimination, rerouting-based caching mechanisms in information-centric networks, and energy-aware content distribution techniques. Using these approaches, we demonstrate how redundancy elimination can be performed with low overhead and low processing power. We also demonstrate that local or global cooperation methods can increase the storage efficiency of existing caches many-fold. In addition, this work shows that collaborative content download mechanisms can remove a sizable amount of traffic from the core network while simultaneously reducing the energy consumption of client devices.
Supervising professor
Ylä-Jääski, Antti, Prof., Aalto University, Department of Computer Science, Finland
Thesis advisor
Lukyanenko, Andrey, Dr., Aalto University, Department of Computer Science, Finland
cache, redundancy, energy, ICN, Internet
Other note
  • [Publication 1]: Sumanta Saha, Andrey Lukyanenko, Antti Ylä-Jääski. CombiHeader: Minimizing the number of shim headers in redundancy elimination systems. In IEEE INFOCOM Workshops, Shanghai, China, 798-803, doi: 10.1109/INFCOMW.2011.5928920, April 2011.
  • [Publication 2]: Sumanta Saha. On Reducing the Processing Load of Redundancy Elimination Algorithms. In IEEE GLOBECOM Workshops, Texas, USA, 1106-1110, doi: 10.1109/GLOCOMW.2011.6162349, December 2011.
  • [Publication 3]: Sumanta Saha, Andrey Lukyanenko, Antti Ylä-Jääski. Cooperative Caching through Routing Control in Information-Centric Networks. In IEEE INFOCOM, Turin, Italy, 100-104, doi: 10.1109/INFCOM.2013.6566743, April 2013.
  • [Publication 4]: Sumanta Saha, Andrey Lukyanenko, Antti Ylä-Jääski. Efficient Cache Management in Information-Centric Networks. Computer Networks, vol:84, issn:1389-1286, 32-45, doi: http://dx.doi.org/10.1016/j.comnet.2015.04.005, July 2015.
  • [Publication 5]: Sumanta Saha, Maneesh Chauhan, Andrey Lukyanenko. Beyond the Limits: Maximization of ICN Caching Capabilities with Global Detour Algorithm. In IEEE Symposium on Computers and Communications, Larnaca, Cyprus, July 2015.
  • [Publication 6]: Sumanta Saha, Mohammad Hoque, Andrey Lukyanenko. Analyzing Energy Efficiency of a Cooperative Content Distribution Technique. In IEEE GLOBECOM, Texas, USA, 1-6, doi: 10.1109/GLOCOM.2011.6133841, December 2011.