TY - GEN
T1 - Long live the image: container-native data persistence in production
AU - Li, Zheng
PY - 2021/5/10
Y1 - 2021/5/10
N2 - Containerization plays a crucial role in the de facto technology stack for implementing microservices architecture (each microservice has its own database in most cases). Nevertheless, there are still fierce debates on containerizing production databases, mainly due to data persistence issues and concerns. Driven by a project of refactoring an Automated Machine Learning system, this research proposes container-native data persistence as a conditional solution to running database containers in production. In essence, the proposed solution distinguishes stateless data access (i.e., reading) from stateful data processing (i.e., creating, updating, and deleting) in databases. A master database handles the stateful data processing and dumps database copies for building container images, while the database containers remain stateless at runtime, based on the preloaded dump in the image. Although there are delays in the state/image update propagation, this solution is particularly suitable for read-only, eventual consistency, and asynchronous processing scenarios. Moreover, with optimal tuning (e.g., disabling locking), the portability and performance gains of a read-only database container would outweigh the performance loss in accessing data across the underlying image layers.
AB - Containerization plays a crucial role in the de facto technology stack for implementing microservices architecture (each microservice has its own database in most cases). Nevertheless, there are still fierce debates on containerizing production databases, mainly due to data persistence issues and concerns. Driven by a project of refactoring an Automated Machine Learning system, this research proposes container-native data persistence as a conditional solution to running database containers in production. In essence, the proposed solution distinguishes stateless data access (i.e., reading) from stateful data processing (i.e., creating, updating, and deleting) in databases. A master database handles the stateful data processing and dumps database copies for building container images, while the database containers remain stateless at runtime, based on the preloaded dump in the image. Although there are delays in the state/image update propagation, this solution is particularly suitable for read-only, eventual consistency, and asynchronous processing scenarios. Moreover, with optimal tuning (e.g., disabling locking), the portability and performance gains of a read-only database container would outweigh the performance loss in accessing data across the underlying image layers.
KW - container
KW - data persistence
KW - database
KW - microservice
KW - microservices architecture
U2 - 10.1109/ICSA-C52384.2021.00020
DO - 10.1109/ICSA-C52384.2021.00020
M3 - Conference contribution
AN - SCOPUS:85106574189
SN - 9780738133560
T3 - Proceedings: IEEE International Conference on Software Architecture Companion, ICSA-C
SP - 82
EP - 85
BT - Proceedings of the 2021 IEEE 18th International Conference on Software Architecture Companion, ICSA-C 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 18th IEEE International Conference on Software Architecture Companion, ICSA-C 2021
Y2 - 22 March 2021 through 26 March 2021
ER -