Persistence is a somewhat relevant topic for Node.1
When I say persistence I’m talking about keeping the state of files between images. And the two aspects I want to speak about is how files are persisted and what files are persisted.
Node is delivered as an image (now and in future).
An update (a new rootfs) is also delivered as an image.
Migrating changes between the original image, and the updated image is what I call persistence.
In the current Node persistence is implemented by keeping changed files on a writable partition, and bind mounting them at boot time into the target.
The persistence works with a white-list, so a user (or package) needs to name all files which need to be persisted.
A drawback of this approach is, that the bind-mounts are not available in the early boto stages.
The problem can be illustrated if you think about persisting a modified
/etc/fstab or new (3rd party) kernel modules.
Additionally someone needs to determine the files which define the state of a component.
For the future we are thinking about an approach where the whole root filesystem is writable2.
In addition to it beeing writable, we will be also booting into that filesystem. This makes it possible to
- Modify any file
- (which implies) Modify files which are relevant for the early boot process
The key difference to the old approach is, that we can now modify the filesystem we use to boot.
Persistence then works by migrating a file from the previous to the next image.
The migration can be a simple copy or similar to a 3-way merge. Maybe even both. Or more?
Independent of what strategy is choosen, he key fact remains: Persisting works by migrating the old state, to the new image.
Now we know how we persist files, but it’s still the question what files we need to be persist.
As said above, currently this works by whitelisting files.
This has the drawback that the user (or package/consumer) needs to be aware that it#s running on Node, and that is cumbersome to maintain, and not obvious to a user, why files need to be persisted.
I anticipate something like “transparent” persistence for the next-gen Node. It should be transparent to the user/consumer that persistence is necessary, so Node should decide what files need to be persisted which not.
Because Node will keep the original image, and a layer with the changed filesystem tree around, we can easily determine the changed files.
A clear option is to migrate all of the changed files to a new (updated) image. This approach is greedy: Each file that has been touched is copied from one image to the next. This will result in many changes which are moved from one image to another.
We need something smarter. And there is more than one option.
We could use file-level white-listing like we do today.
$ perists /etc/passwd
It’s very selective, but needs a lot of maintenance.
Besides doing it on a file level, we can also do it on the package level.
Two approaches: We can either name the package (and all of it’s files) to be persisted, or we can determine the package of a file, when the user tries to persist a file.
$ persist pkg vdsm
$ persist file /etc/passwd
Do you want to perists all changes to files of the package 'core-utils' instead of the single file '/etc/passwd'?
Or we could just come up with a blacklist of paths which should not be persisted. Except when explicitly requested.
These are a couple of options that come to my mind, I’m not yet sure how the final solution will look like. But I see use cases for all this ways. And the good is, the current infrastructure we plan, will support all of these approaches.