Benchmarking Hadoop performance on different distributed storage systems
| dc.contributor | Aalto-yliopisto | fi |
| dc.contributor | Aalto University | en |
| dc.contributor.advisor | Döngelci, Ridvan | |
| dc.contributor.author | Mukherjee, Alapan | |
| dc.contributor.school | Perustieteiden korkeakoulu | fi |
| dc.contributor.supervisor | Heljanko, Keijo | |
| dc.date.accessioned | 2015-09-18T08:27:44Z | |
| dc.date.available | 2015-09-18T08:27:44Z | |
| dc.date.issued | 2015-08-24 | |
| dc.description.abstract | Distributed storage systems have been in place for years, and have undergone significant changes in architecture to ensure reliable storage of data in a cost-effective manner. With the demand for data increasing, there has been a shift from disk-centric to memory-centric computing - the focus is on saving data in memory rather than on the disk. The primary motivation for this is the increased speed of data processing. This could, however, mean a change in the approach to providing the necessary fault-tolerance - instead of data replication, other techniques may be considered. One example of an in-memory distributed storage system is Tachyon. Instead of replicating data files in memory, Tachyon provides fault-tolerance by maintaining a record of the operations needed to generate the data files. These operations are replayed if the files are lost. This approach is termed lineage. Tachyon is already deployed by many well-known companies. This thesis work compares the storage performance of Tachyon with that of the on-disk storage systems HDFS and Ceph. After studying the architectures of well-known distributed storage systems, the major contribution of the work is to integrate Tachyon with Ceph as an underlayer storage system, and understand how this affects its performance, and how to tune Tachyon to extract maximum performance out of it. | en |
| dc.format.extent | 99+11 | |
| dc.format.mimetype | application/pdf | en |
| dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/17713 | |
| dc.identifier.urn | URN:NBN:fi:aalto-201509184328 | |
| dc.language.iso | en | en |
| dc.programme | Master's Programme in Mobile Computing - Services and Security | fi |
| dc.programme.major | Mobile Computing | fi |
| dc.programme.mcode | T-110 | fi |
| dc.rights.accesslevel | openAccess | |
| dc.subject.keyword | Tachyon | en |
| dc.subject.keyword | HDFS | en |
| dc.subject.keyword | Ceph | en |
| dc.subject.keyword | benchmarks | en |
| dc.title | Benchmarking Hadoop performance on different distributed storage systems | en |
| dc.type | G2 Pro gradu, diplomityö | en |
| dc.type.okm | G2 Pro gradu, diplomityö | |
| dc.type.ontasot | Master's thesis | en |
| dc.type.ontasot | Diplomityö | fi |
| dc.type.publication | masterThesis | |
| local.aalto.idinssi | 52047 | |
| local.aalto.inssiarchivenr | 3020 | |
| local.aalto.inssilocation | P1 Ark Aalto | |
| local.aalto.openaccess | yes | |
Files
Original bundle
- Name: master_Mukherjee_Alapan_2015.pdf
- Size: 1.94 MB
- Format: Adobe Portable Document Format