Getting a writable filesystem on Node

The diagram illustrates how we plan to bring a writable root filesystem to Node, without any limitations. No limitations, because the fs is just a regular filesystem, nothing special about it; the special handling happens below, on the block layer, and is provided by LVM thin volumes (thanks to dm-thin):

      . . . . . . . . . . . . . . . . 
      :               :             :   
      :         +--------------+--------------+
      :         | layer-1 (rw) | layer-2 (rw) |
      :         +--------------+--------------+
      :         | base-1 (ro)  | base-2 (ro)  |
+------------+  +--------------+--------------+-----------+
| bootloader |  | lvm vg                                  |   
+------------+--+-----------------------------------------+
| disk                                                    |   
+---------------------------------------------------------+

Each image (called base-N) we publish will be stored as a read-only (thin) logical volume in the volume group (lvm vg). For each base (base-N) a writable layer named layer-N will be created atop it. To boot into the writable layers, boot entries pointing to all available (writable) layers will be added to the bootloader. There is no way to boot into the read-only bases.
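
A minimal sketch of how such a stack could be set up with LVM thin volumes; the volume group, pool and volume names (vg, pool00, base-1, layer-1) are purely illustrative, not a fixed naming scheme:

# thin pool inside the volume group
$ lvcreate --size 10G --thinpool pool00 vg
# thin volume holding the published image, marked read-only afterwards
$ lvcreate --virtualsize 5G --thin vg/pool00 --name base-1
$ lvchange --permission r vg/base-1
# writable layer as a thin snapshot of the base
$ lvcreate --snapshot vg/base-1 --name layer-1 --setactivationskip n
$ lvchange --activate y vg/layer-1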

State

If modifying/customizing a layer is interpreted as the state of a base, then migrating the custom configuration between layers (i.e. migrating the changes from an old to a new layer) can be called persisting the state of a base.

Apparently there are many ways in which the state of a base can be persisted. It has been discussed and will be discussed further, yet it is not clear which way is the best one.
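
Purely to illustrate one of those many ways (this is not a decided design): the configuration delta of an old layer could simply be carried over into a new one, e.g. just /etc:

# illustrative only - mount an old and a new layer and copy /etc over
$ mkdir -p /mnt/old /mnt/new
$ mount -o ro /dev/vg/layer-1 /mnt/old
$ mount /dev/vg/layer-2 /mnt/new
$ rsync -a /mnt/old/etc/ /mnt/new/etc/
$ umount /mnt/old /mnt/new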

And it’s obvious that base-N is intended to be an ancestor of base-(N+1).

Block layer

Hiding the sparseness in the block layer has the big advantage that it is completely transparent to file based stuff, like permissions and (especially) SELinux.

It currently has the drawback that deduplication is harder - or impossible.

Testing pxeboot with qemu

(Image: the view from my hotel room in Washington DC)

Ever wondered if it was possible to test PXE booting with qemu? It turns out it is.

Basic idea: Extract an iso using Fedora’s livecd-iso-to-pxeboot tool and point qemu to that directory.

$ livecd-iso-to-pxeboot <isoname>
$ qemu \
    -hda hda.qcow2 \
    -net user,tftp=tftpboot,bootfile=pxelinux.0 -net nic

Nifty?

It will probably be re-used in some test automation context.

Nested virtualization on Intel

Just because I needed it today, a reminder.

If you need to do nested virtualization on some Intel CPU:

# Configure kmod
cat /etc/modprobe.d/kvm.conf 
options kvm_intel nested=Y

# Unload kmod
rmmod kvm-intel && rmmod kvm

# Load kmod
modprobe kvm && modprobe kvm-intel

In virt-manager or similar tools, remember to copy the host CPU flags to the guest.
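
To check that the module actually picked up the option, the parameter can be read back from sysfs (depending on the kernel it reports Y or 1):

$ cat /sys/module/kvm_intel/parameters/nested
Y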

Suspend and wake-on-lan

To save energy resources it’s best to - save energy. In light of this I’ve written a small script which can easily be used to suspend my spare machines to RAM, and wake them up again using wake-on-lan.

Nice to have this easy command to suspend a Fedora host:

systemctl suspend
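
A sketch of what such a script could look like (not the actual script; the hostname, MAC address and the wol client are placeholders):

#!/bin/bash
# suspend a spare machine over ssh, or wake it again via wake-on-lan
# HOST and MAC are placeholders for the real machine
HOST=spare.example.com
MAC=00:11:22:33:44:55

case "$1" in
  suspend) ssh root@"$HOST" systemctl suspend ;;
  wake)    wol "$MAC" ;;    # assumes a wake-on-lan client like 'wol'
  *)       echo "usage: $0 suspend|wake"; exit 1 ;;
esac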

Isolated test runner for functional tests (qemu + 9p + serial)

I’ve written about this before: functional testing of operating system features, e.g. messing with storage or network devices.

In the last post I used gherkin, pexpect and qemu to do this. It works, but was still too cumbersome.

This time I’ve further reduced the dependencies and made the process simpler and more mature.

The flow is now roughly as follows:

  • Write testcases on the host-side
  • Run a VM, pass in some host-side dir (e.g. $PWD) using 9pfs over virtio, and attach the serial console of the VM to stdio
  • Wait for some keyword to turn up, then send the necessary client-side commands to mount the host-side path into some dir (see the sketch below the ASCII art)
  • Change into the mounted path and run the host-side testcases inside the VM

ASCII art:

Job          Workspace                        VM

Init   -->   Populated

             Spawn a VM                  -->  Boot

             Pass workspace using 9pfs   -->  Mount workspace over
                                              9pfs

             Init test through serial io -->  Test runner runs
                                         <--  Write results to
                                              workspace through
                                              9pfs

                                              Shutdown
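
A minimal sketch of the two sides, assuming the host-side dir is exported with the mount tag workspace (paths and names are illustrative):

# host side: serial console on stdio, $PWD exported over 9pfs,
# image untouched thanks to -snapshot
$ qemu-system-x86_64 -enable-kvm -m 1024 -snapshot \
    -hda image.qcow2 \
    -virtfs local,path=$PWD,mount_tag=workspace,security_model=none \
    -nographic

# guest side (sent over the serial console): mount the workspace and
# run the host-side testcases
# mount -t 9p -o trans=virtio workspace /mnt/workspace
# cd /mnt/workspace && ./run-tests.sh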

This doesn’t sound fancy, but it has some nice aspects:

  • No modification of the VM needed, as long as the serial console is used
  • Reduces the interaction with the VM to a minimum (yet is general enough to make few assumptions about the VM)
  • Works nicely with Jenkins, because libvirt is not involved
  • Doesn’t touch the image, because snapshots are used
  • The tests can do whatever they want, complete isolation
  • Simple to understand and maintain
  • Few constraints on the host
  • Works with docker too?

To highlight the key achievements:

  • 9pfs is used to easily exchange data between host and guest.
  • Kick off testing within the VM using the virtio serial console

An example of this approach can be found here - it is used to test some Node experiments.

The actual tests and test-runner can be found here.

Testing multipath with qemu

Ever wondered if it was possible to test multipathing with qemu? It turns out it is.

Basic idea: Create two devices which point to the same backing image. Important: Let qemu know that it is the same disk, by using the same serial for both.

$ qemu \
    -drive file=hda.qcow2,media=disk,bus=0,unit=0,if=ide,cache=none,serial=abcde \
    -drive file=hda.qcow2,media=disk,bus=1,unit=1,if=ide,cache=none,serial=abcde \
    -cdrom boot.iso

And in your guest:

# multipath -ll
QEMU_HARDDISK_abcde dm-0 ATA     ,QEMU HARDDISK   
size=30G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 0:0:0:0 sda 8:0  active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 1:0:1:0 sdb 8:16 active ready running
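
If multipath -ll comes back empty, multipathing is probably just not enabled in the guest yet; on Fedora something along these lines should do (assuming the device-mapper-multipath package is installed):

# in the guest: create a default /etc/multipath.conf and start multipathd
# mpathconf --enable --with_multipathd y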

Nifty?

It will probably be re-used in some test automation context.

Nice paintings from daala - intra prediction and complexity

(Image: Intra-Paint)

daala was actually the reason why I started packaging OpenCL for Fedora. daala planned to provide an OpenCL based reference implementation, but there was a lack of platforms supporting OpenCL.

Anyhow, Jean-Marc Valin wrote about an algorithm yesterday which is visually quite impressive, but also complex. On the pro side, it is massively parallel - which points directly to OpenCL. Sadly I won’t be able to contribute anything, but at least that is a niche where OpenCL can help.

The results of the algorithm are quite nice (see the bottom of the post), and it might be practical if the complexity can be addressed.

custom storage layout in anaconda

A custom installer class implementation carried in a product.img can be used to provide a custom storage configuration to anaconda.
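
A rough sketch of how such a product.img could be put together; the path inside the archive is an assumption and depends on the anaconda version, only the cpio/gzip packaging is the usual updates.img/product.img format:

# pack a custom install class into a product.img
# (the installclasses path is an assumption, check your anaconda release)
$ mkdir -p product/usr/share/anaconda/installclasses
$ cp custom_installclass.py product/usr/share/anaconda/installclasses/
$ ( cd product && find . | cpio -c -o --quiet | gzip -9 ) > product.img
# product.img then goes into the images/ directory of the install tree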

weston -Brdp-backend.so — Wayland’s RDP backend

kalev mentioned that weston’s RDP backend is already part of Fedora’s weston builds. I just gave it a quick shot:

(VM with weston running - to the right, remmina as an RDP client — to the left)

This basically means that weston is acting as an RDP server (without session management).

$ pkcon install xkeyboard-config weston

This installs weston (and a dependency). To then start weston with the RDP backend use:

$ weston -Brdp-backend.so

Finally you can use remmina to connect to weston.
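
Any RDP client should do; for example from the command line, assuming the freerdp package is installed and the VM answers on 192.168.122.42 (weston’s RDP backend listens on the default RDP port, 3389):

$ xfreerdp /v:192.168.122.42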