Fault-Tolerance of a Primary-Backup Replication Database
Perustieteiden korkeakoulu (School of Science)
Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for Your own personal use. Commercial use is prohibited.
Date
2018-10-08
Major/Subject
Embedded Systems
Mcode
SCI3024
Degree programme
Master's Programme in ICT Innovation
Language
en
Pages
56+6
Abstract
This thesis evaluates a primary-backup replication design for a proprietary database whose architecture is being redesigned from a monolith into a distributed system. We surveyed distributed systems testing practices in academia and industry. Based on the literature review and the needs of the case company, we implemented three test methods: automatic tests suitable for a continuous integration pipeline, manual validation of production components, and formal specification with model checking. Every method uncovered issues in the initial design or in the underlying components. For example, a liveness issue was discovered in the cluster management, safety violations were identified in the remote storage functionality, and a consistency violation was reported in a commercial object storage service. These findings suggest that a broad range of testing methods is necessary to bring a distributed system to production. This thesis compares the applied testing methods and recommends solutions to the discovered issues.
Supervisor
Hirvisalo, Vesa
Thesis advisor
Pitkäranta, Tapio
Keywords
model checking, distributed system testing, fault injection, replication