
 # Portable Services Introduction

 This systemd version includes a preview of the "portable service"
 concept. "Portable Services" are supposed to be an incremental improvement over
 traditional system services, making two specific facets of container management
 available to system services more readily. Specifically:

 1. The bundling of applications, i.e. packing up multiple services, their
    binaries and all their dependencies in a single image, and running them
    directly from it.

 2. Stricter default security policies, i.e. sand-boxing of applications.

 The primary tool for interfacing with "portable services" is the new
 "portablectl" program. It's currently shipped in /usr/lib/systemd/portablectl
 (i.e. not in the `$PATH`), since it's not yet considered part of the officially
 supported systemd interfaces; it is still a preview, after all.

 Portable services don't bring anything inherently new to the table. All they do
 is put together known concepts in a slightly nicer way to cover a specific set
 of use-cases.

 ## So, what *is* a "Portable Service"?

 A portable service is ultimately just an OS tree, either inside of a directory
 tree, or inside a raw disk image containing a Linux file system. This tree is
 called the "image". It can be "attached" or "detached" from the system. When
 "attached", specific systemd units from the image are made available on the host
 system and then behave pretty much exactly like locally installed system
 services. When "detached", these units are removed again from the host, leaving
 no artifacts around (except maybe messages they might have logged).

 The OS tree/image can be created with any tool of your choice. For example, you
 can use `dnf --installroot=` if you like, or `debootstrap`. The image format is
 entirely generic, and doesn't have to carry any specific metadata beyond what
 distribution images carry anyway. Or to say this differently: the image format
 doesn't define any new metadata as unit files and OS tree directories or disk
 images are already sufficient, and pretty universally available these days. One
 particularly nice tool for creating suitable images is
 [mkosi](https://github.com/systemd/mkosi), but many other existing tools will
 do too.

 If you will, "Portable Services" are a nicer way to manage chroot()
 environments, with better security, tooling and behavior.

 ## Where's the difference to a "Container"?

 "Container" is a very vague term; after all, it is used for
 systemd-nspawn/LXC-type OS containers, for Docker/rkt-like micro service
 containers, and even certain 'lightweight' VM runtimes.

 The "portable service" concept ultimately will not provide a fully isolated
 environment to the payload, like containers mostly intend to. Instead, portable
 services are from the beginning more like regular system services: they can be
 controlled with the same tools, are exposed the same way in all infrastructure,
 and so on. Their main difference is that they use a different root directory
 than the rest of the system. Hence, the intention is not to run code in a
 different, isolated world from the host (as most containers would), but to run
 it in the same world, with stricter access controls on what the service can see
 and do.

 As one point of differentiation: as programs run as "portable services" are
 pretty much regular system services, they won't run as PID 1 (like Docker would
 do it), but as normal processes. A corollary of that is that they aren't supposed
 to manage anything in their own environment (such as the network) as the
 execution environment is mostly shared with the rest of the system.

 The primary use-case of "portable services" is to extend the host system
 with encapsulated extensions, but provide almost full integration with the rest
 of the system, though possibly restricted by effective security knobs. This
 focus includes system extensions otherwise sometimes called "super-privileged
 containers".

 Note that portable services are only available for system services, not for
 user services, i.e. the functionality cannot be used for the stuff
 bubblewrap/flatpak is focusing on.

 ## Mode of Operation

 If you have a portable service image, maybe in a raw disk image called
 `foobar_0.7.23.raw`, then attaching the services to the host is as easy as:

 ```
 # /usr/lib/systemd/portablectl attach foobar_0.7.23.raw
 ```

 This command does the following:

 1. It dissects the image, checks and validates the `/etc/os-release` data of
    the image, and looks for all included unit files.

 2. It copies out all unit files with a suffix of `.service`, `.socket`,
    `.target`, `.timer` and `.path` whose name begins with the image's name
    (with the .raw removed), truncated at the first underscore (if there is
    one). This prefix name generated from the image name must be followed by a
    ".", "-" or "@" character in the unit name. Or in other words, given the
    image name of `foobar_0.7.23.raw` all unit files matching
    `foobar-*.{service|socket|target|timer|path}`,
    `foobar@.{service|socket|target|timer|path}` as well as
    `foobar.*.{service|socket|target|timer|path}` and
    `foobar.{service|socket|target|timer|path}` are copied out. These unit files
    are placed in `/etc/systemd/system.attached/` (which is part of the normal
    unit file search path of PID 1, and thus loaded exactly like regular unit
    files). Within the images the unit files are looked for at the usual
    locations, i.e. in `/usr/lib/systemd/system/` and `/etc/systemd/system/` and
    so on, relative to the image's root.

 3. For each such unit file a drop-in file is created. Let's say
    `foobar-waldo.service` was one of the unit files copied to
    `/etc/systemd/system.attached/`, then a drop-in file
    `/etc/systemd/system.attached/foobar-waldo.service.d/20-portable.conf` is
    created, containing a few lines of additional configuration:

    ```
    [Service]
    RootImage=/path/to/foobar_0.7.23.raw
    Environment=PORTABLE=foobar
    LogExtraFields=PORTABLE=foobar
    ```

 4. For each such unit a "profile" drop-in is linked in. This "profile" drop-in
    generally contains security options that lock down the service. By default
    the `default` profile is used, which provides a medium level of
    security. There's also `trusted` which runs the service at the highest
    privileges, i.e. host's root and everything. The `strict` profile comes with
    the toughest security restrictions. Finally, `nonetwork` is like `default`
    but without network access. Users may define their own profiles too (or
    modify the existing ones).

 And that's already it.
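
The name matching described in step 2 can be sketched in plain shell. This is only an illustration of the rule as stated above, not the code `portablectl` itself runs; the function names `portable_prefix` and `portable_unit_matches` are made up for this example.

```sh
# Derive the unit-name prefix from an image file name: strip the directory,
# a trailing ".raw", and everything from the first underscore on.
portable_prefix() {
    name=${1##*/}                  # drop leading directories
    name=${name%.raw}              # drop the .raw suffix, if present
    printf '%s\n' "${name%%_*}"    # truncate at the first underscore
}

# Succeed if unit file name $2 would be copied out for image $1.
portable_unit_matches() {
    prefix=$(portable_prefix "$1")
    # The unit type must be one of the five supported suffixes.
    case $2 in
        *.service|*.socket|*.target|*.timer|*.path) ;;
        *) return 1 ;;
    esac
    # The prefix must be followed by ".", "-" or "@".
    case $2 in
        "$prefix".*|"$prefix"-*|"$prefix"@*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `portable_unit_matches foobar_0.7.23.raw foobar-waldo.service` succeeds, while `foobarbaz.service` (no separator after the prefix) and `foobar-waldo.mount` (unsupported suffix) are rejected.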

 Note that the images need to stay around (and in the same location) as long as the
 portable service is attached. If an image is moved, the `RootImage=` line
 written to the unit drop-in would point to a non-existent place, and break the
 logic.
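
If you want to notice a moved image before a service fails to start, a small check along the following lines may help. It is a sketch that assumes the drop-in contains a single `RootImage=` line as shown above; the function name `check_root_image` is hypothetical.

```sh
# Usage: check_root_image DROPIN
# Fails with a warning if the RootImage= path named in the drop-in is gone.
check_root_image() {
    image=$(sed -n 's/^RootImage=//p' "$1" | head -n 1)
    [ -n "$image" ] || return 0    # no RootImage= line: nothing to check
    if [ ! -e "$image" ]; then
        printf 'missing image: %s (referenced by %s)\n' "$image" "$1" >&2
        return 1
    fi
}
```

It could be run over every `20-portable.conf` below `/etc/systemd/system.attached/`.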

 The `portablectl detach` command executes the reverse operation: it looks for
 the drop-ins and the unit files associated with the image, and removes them
 again.

 Note that `portablectl attach` won't enable or start any of the units it copies
 out. This still has to take place in a second, separate step. (That said, we
 might add options to do this automatically later on.)

 ## Requirements on Images

 Note that portable services don't introduce any new image format, but most OS
 images should just work the way they are. Specifically, the following
 requirements are made for an image that can be attached/detached with
 `portablectl`.

 1. It must contain the binary that shall be invoked, including all its
    dependencies. If it is binary code, it needs to be compiled for an
    architecture compatible with the host.

 2. The image must either be a plain sub-directory (or btrfs subvolume)
    containing the binaries and their dependencies in a classic Linux OS tree, or
    must be a raw disk image either containing only one, naked file system, or
    an image with a partition table understood by the Linux kernel with only a
    single partition defined, or alternatively, a GPT partition table with a set
    of properly marked partitions following the [Discoverable Partitions
    Specification](https://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/).

 3. The image must contain at least one matching unit file, with the right name
    prefix and suffix (see above). The unit file is searched in the usual paths,
    i.e. primarily /etc/systemd/system/ and /usr/lib/systemd/system/ within the
    image. (The implementation will check a couple of other paths too, but it's
    recommended to use these two paths.)

 4. The image must contain an os-release file, either in /etc/os-release or
    /usr/lib/os-release. The file should follow the standard format.
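
For directory-tree images, requirements 3 and 4 lend themselves to a quick sanity check before attaching. The following is a rough sketch only: the function name `check_portable_tree` is made up, it looks only in the two recommended unit paths, and it derives the prefix from the directory name by truncating at the first underscore, as described earlier.

```sh
# Usage: check_portable_tree DIR
# Reports a missing os-release file or a missing matching unit file.
check_portable_tree() {
    root=${1%/}
    prefix=${root##*/}       # last path component ...
    prefix=${prefix%%_*}     # ... truncated at the first underscore
    ok=0

    if [ ! -f "$root/etc/os-release" ] && [ ! -f "$root/usr/lib/os-release" ]; then
        echo "no os-release file found" >&2
        ok=1
    fi

    found=
    for dir in "$root/etc/systemd/system" "$root/usr/lib/systemd/system"; do
        for unit in "$dir/$prefix".* "$dir/$prefix"-* "$dir/$prefix"@*; do
            [ -f "$unit" ] || continue    # skip unmatched glob patterns
            case $unit in
                *.service|*.socket|*.target|*.timer|*.path) found=$unit ;;
            esac
        done
    done
    if [ -z "$found" ]; then
        echo "no unit file with prefix '$prefix' found" >&2
        ok=1
    fi
    return "$ok"
}
```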

 Note that generally images created by tools such as `debootstrap`, `dnf
 --installroot=` or `mkosi` qualify for all of the above in one way or
 another. If you wonder what the most minimal image would be that complies with
 the requirements above, it could consist of this:

 ```
 /usr/bin/minimald                             # a statically compiled binary
 /usr/lib/systemd/system/minimal-test.service  # the unit file for the service, with ExecStart=/usr/bin/minimald
 /usr/lib/os-release                           # an os-release file explaining what this is
 ```

 And that's it.
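
A tree like this can be put together with a few lines of shell. The sketch below assumes you already have a statically linked binary at hand; the function name `make_minimal_tree` and the os-release values are placeholders chosen for this example. The unit file goes into `/usr/lib/systemd/system/`, one of the search paths named in the requirements, and the directory should be named `minimal` (or `minimal_…`) so that the `minimal-test.service` prefix matches.

```sh
# Usage: make_minimal_tree DIR BINARY
# Populates DIR with the three files of a minimal portable service image.
make_minimal_tree() {
    root=$1 binary=$2
    mkdir -p "$root/usr/bin" "$root/usr/lib/systemd/system" || return
    cp "$binary" "$root/usr/bin/minimald" || return
    chmod 0755 "$root/usr/bin/minimald"
    # The unit file, pointing at the binary inside the image.
    printf '%s\n' \
        '[Unit]' \
        'Description=Minimal portable service (example)' \
        '' \
        '[Service]' \
        'ExecStart=/usr/bin/minimald' \
        > "$root/usr/lib/systemd/system/minimal-test.service"
    # Placeholder os-release contents; adjust to describe your image.
    printf '%s\n' \
        'ID=minimal-example' \
        'NAME="Minimal portable service image"' \
        > "$root/usr/lib/os-release"
}
```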

 Note that qualifying images do not have to contain an init system of their
 own. If they do, it's fine, it will be ignored by the portable service logic,
 but they generally don't have to, and it might make sense to avoid any, to keep
 images minimal.

 Note that as no new image format or metadata is defined, it's very
 straight-forward to define images that can be made use of in a number of
 different ways. For example, by using `mkosi -b` you can trivially build a
 single, unified image that:

 1. Can be attached as portable service, to run any container services natively
    on the host.

 2. Can be run as OS container, using `systemd-nspawn`, by booting the image
    with `systemd-nspawn -i -b`.

 3. Can be booted directly as VM image, using a generic VM executor such as
    `virtualbox`/`qemu`/`kvm`.

 4. Can be booted directly on bare-metal systems.

 Of course, to facilitate 2, 3 and 4 you need to include an init system in the
 image. To facilitate 3 and 4 you also need to include a boot loader in the
 image. As mentioned `mkosi -b` takes care of all of that for you, but any other
 image generator should work too.

 ## Execution Environment

 Note that the code in portable service images is run exactly like regular
 services. Hence there's no new execution environment to consider. And, unlike
 what Docker would do, as these are regular system services they aren't run as PID
 1 either, but with regular PID values.

 ## Access to host resources

 If services shipped with this mechanism shall be able to access host resources
 (such as files or AF_UNIX sockets for IPC), use the normal `BindPaths=` and
 `BindReadOnlyPaths=` settings in unit files to mount them in. In fact the
 `default` profile mentioned above makes use of this to ensure
 `/etc/resolv.conf`, the D-Bus system bus socket and write access to the logging
 subsystem are available to the service.
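
As an illustration, the helper below prints a `[Service]` fragment that mounts one host path read-only and one read-write into the service. The function name `bind_fragment` and the two paths in the usage line are hypothetical; only the `BindReadOnlyPaths=` and `BindPaths=` settings themselves come from systemd.

```sh
# Usage: bind_fragment READ_ONLY_PATH READ_WRITE_PATH
# Prints unit file settings that make two host paths visible to the service.
bind_fragment() {
    printf '%s\n' \
        '[Service]' \
        "BindReadOnlyPaths=$1" \
        "BindPaths=$2"
}
```

For example, `bind_fragment /run/foobar-host/control.sock /var/lib/foobar-shared` produces a fragment that could be added to a unit file in the image or placed in a drop-in for the attached unit.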

 ## Instantiation

 Sometimes it makes sense to instantiate the same set of services multiple
 times. The portable service concept does not introduce any new logic for this. It
 is recommended to use the regular unit templating of systemd for this, i.e. to
 include template units such as `foobar@.service`, so that instantiation is as
 simple as:

 ```
 # /usr/lib/systemd/portablectl attach foobar_0.7.23.raw
 # systemctl enable --now foobar@instancea.service
 # systemctl enable --now foobar@instanceb.service
 …
 ```

 The benefit of this approach is that templating works exactly the same for
 units shipped with the OS itself as for attached portable services.
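
What such a template unit might look like inside the image is sketched below. The helper name `write_template_unit`, the `/usr/bin/foobard` binary and its `--instance=` option are invented for this example; `%i` is the regular systemd specifier that expands to the instance name (`instancea`, `instanceb`, …).

```sh
# Usage: write_template_unit IMAGE_ROOT
# Writes a template unit foobar@.service into the image tree.
write_template_unit() {
    dir=$1/usr/lib/systemd/system
    mkdir -p "$dir" || return
    # %i is passed through literally here; systemd expands it per instance.
    printf '%s\n' \
        '[Unit]' \
        'Description=foobar instance %i' \
        '' \
        '[Service]' \
        'ExecStart=/usr/bin/foobard --instance=%i' \
        > "$dir/foobar@.service"
}
```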

 ## Immutable images with local data

 It's a good idea to keep portable service images read-only during normal
 operation. In fact all but the `trusted` profile will default to this kind of
 behaviour, by setting the `ProtectSystem=strict` option. In this case writable
 service data may be placed on the host file system. Use `StateDirectory=` in
 the unit files to enable such behaviour and add a local data directory to the
 services copied onto the host.
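
A unit using this could look roughly like the output of the helper below. The function name `print_stateful_unit`, the `/usr/bin/foobard` binary and its `--data-dir=` option are assumptions made for the sake of the example; for system services `StateDirectory=NAME` refers to `/var/lib/NAME`.

```sh
# Usage: print_stateful_unit NAME
# Prints a service section whose writable data lives in /var/lib/NAME.
print_stateful_unit() {
    printf '%s\n' \
        '[Service]' \
        "ExecStart=/usr/bin/foobard --data-dir=/var/lib/$1" \
        "StateDirectory=$1"
}
```

Calling `print_stateful_unit foobar` yields a fragment with `StateDirectory=foobar`, so the image itself can stay read-only while the service writes below `/var/lib/foobar`.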