tl;dr: Xen PVH is the perfect upgrade path from PV and in combination
with grub2 support, it's the Xen "killer feature" we really should have
in Buster.
Background info about Xen PVH:
https://wiki.xen.org/wiki/Virtualization_Spectrum#Almost_fully_PV:_PVH_mode
PVH mode in Xen, a.k.a. "HVM without having to run qemu" is a Xen guest
type best supported since Xen 4.11 and Linux kernel 4.17. Just like when
using PV mode, the guest does not have an emulated BIOS and the guest
kernel is directly started by the dom0. Buster will ship with Xen 4.11.
Why is PVH interesting?
1. When the whole Meltdown/Spectre story started, it quickly became
apparent that 64-bit PV is the most problematic virtualization mode to
protect and to protect from, since address space from the hypervisor and
other guests (including dom0) is reachable from a 64-bit PV domU. To
mitigate this, XPTI (the Xen variant of PTI) has been implemented in the
hypervisor, but with a performance hit. HVM (so, also PVH) guests are
better isolated from the hypervisor and other guests. Inside the guest a
choice can be made about which mitigations to enable or not. Also see
https://xenbits.xen.org/xsa/advisory-254.html
2. Unlike HVM, it's not needed to have a boot loader/sector, partitions,
and a qemu process in the dom0 (using cpu and memory and having an
attack surface). Also, when running a largeish amount of domUs on a
physical server, not having all the qemu processes is an advantage.
3. Unlike PV, PVH makes use of all hardware features that accelerate
virtualization.
The upgrade path from PV to PVH is super optimal. It's just setting
type='pvh' in the guest file and doing a full restart of the domU!
Unless... (insert Monty Python's Dramatic Chord!)
Unless... grub2 was used to boot the PV guests.
Why is it interesting to be able to use grub?
Without using grub in between, the guest kernel and initrd have to be
copied out of the guest onto the dom0 filesystem, because the guest has
to be booted with them directly. Currently, we already have the
grub-xen packages in Debian, which provide grub images which can be used
as kernel for a PV guest, after which it can load the actual linux
kernel that is symlinked from /vmlinuz on the guest filesystem at that
moment.
The final changes to the Linux kernel for grub+PVH are in Linux 4.20.
This request, to carry a few patches from Linux 4.20, provides one half
of the dots that need to be connected to make the full thing happen for
Buster.
Since we'll have Xen 4.11 in Buster, PVH is supported. The related grub2
patchset was committed to the grub master branch on Dec 12 2018 (yup,
today). So, I'll also start contacting the debian grub team soon to ask
(and help) to get the current grub-xen functionality in Debian to be
extended with PVH capabilities as well.
Test reports:
https://lists.xenproject.org/archives/html/xen-devel/2018-10/msg01913.htmlhttps://lists.xenproject.org/archives/html/xen-devel/2018-11/msg03312.html
Permit overlayfs mounts within user namespaces to allow utilisation of e.g.
unprivileged LXC overlay snapshots.
Except by the Ubuntu community [1], overlayfs mounts in user namespaces are
expected to be a security risk [2] and thus are not enabled on upstream
Linux kernels. For the non-Ubuntu users that have to stick to unprivileged
overlay-based LXCs, this meant to patch and compile the kernel manually.
Instead, adding the kernel tainting 'permit_mounts_in_userns' module
parameter allows a kind of a user-friendly way to enable the feature.
Testable with:
sudo modprobe overlay permit_mounts_in_userns=1
sudo sysctl -w kernel.unprivileged_userns_clone=1
mkdir -p lower upper work mnt
unshare --map-root-user --mount \
mount -t overlay none mnt \
-o lowerdir=lower,upperdir=upper,workdir=work
[1]: Ubuntu allows unprivileged mounting of overlay filesystem
https://lists.ubuntu.com/archives/kernel-team/2014-February/038091.html
[2]: User namespaces + overlayfs = root privileges
https://lwn.net/Articles/671641/
Signed-off-by: Nicolas Schier <nicolas@fjasle.eu>
Split the rules in d/rules.real so that the [un]versioned_tools
knobs can be used to avoid building them.
This is necessary since the build-dependency were moved to be
conditional on those knobs, so the build fails when the
unversioned tools are set to disabled as libpci-dev is not
installed but the tools are built and fail due to it missing.
Signed-off-by: Luca Boccassi <bluca@debian.org>