 A few hints on supporting kdbus as a backend in your favorite D-Bus library.

 ~~~

 Before you read this, have a look at the DIFFERENCES and
 GVARIANT_SERIALIZATION texts you find in the same directory where you
 found this.

 We invite you to port your favorite D-Bus protocol implementation
 over to kdbus. However, there are a couple of complexities
 involved. On kdbus we only speak GVariant marshaling; kdbus clients
 ignore traffic in dbus1 marshaling. Thus, you need to add a second,
 GVariant-compatible marshaler to your library first.

 After you have done that, here is the basic principle of how kdbus works:

 You connect to a bus by opening its bus node in /sys/fs/kdbus/. All
 buses have a device node there. Its name starts with the numeric UID of
 the owner of the bus, followed by a dash and a string identifying the
 bus. The system bus is thus called /sys/fs/kdbus/0-system, and for user
 buses the device node is /sys/fs/kdbus/1000-user (if 1000 is your user
 id).

 (Before we proceed, please always keep a copy of libsystemd next
 to you. Ultimately that's where the details are; this document is
 simply a rough overview to help you grok things.)

 CONNECTING

 To connect to a bus, simply open() its device node and issue the
 KDBUS_CMD_HELLO call. That's it. Now you are connected. Do not send
 a Hello message (as you would on dbus1); that does not exist on
 kdbus.

 The structure you pass to the ioctl contains a couple of
 parameters that you need to know in order to operate on the bus.

 There are two flags fields, one indicating features of the kdbus
 kernel side ("conn_flags"), the other one ("bus_flags") indicating
 features of the bus owner (i.e. systemd). Both flags fields are 64bit
 in width.

 When calling into the ioctl, you need to place your own supported
 feature bits into these fields. This tells the kernel about the
 features you support. When the ioctl returns, the fields will contain
 the features the kernel supports.

 If any of the higher 32bit are set on the two flags fields and your
 client does not know what they mean, it must disconnect. The upper
 32bit are used to indicate "incompatible" feature additions on the bus
 system, the lower 32bit indicate "compatible" feature additions. A
 client that does not support a "compatible" feature addition can go on
 communicating with the bus, however a client that does not support an
 "incompatible" feature must not proceed with the connection. When a
 client encounters such an "incompatible" feature it should immediately
 try the next bus address configured in the bus address string.
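
 The compatible/incompatible split described above can be sketched as
 a simple mask test. This is only an illustration: the helper name
 can_proceed() and the idea of passing in a "known_flags" value are
 assumptions made for this example; only the meaning of the upper and
 lower 32 bits is taken from the text.

```python
# Upper 32 bits: "incompatible" additions. Lower 32 bits: "compatible" ones.
INCOMPATIBLE_MASK = 0xFFFFFFFF00000000
COMPATIBLE_MASK = 0x00000000FFFFFFFF


def can_proceed(returned_flags: int, known_flags: int) -> bool:
    """Return True if the client may keep using the connection.

    returned_flags: a 64bit flags field as returned by the hello ioctl.
    known_flags:    every flag bit this (hypothetical) client understands.
    """
    unknown = returned_flags & ~known_flags
    # Unknown "compatible" bits are harmless; any unknown
    # "incompatible" bit means the client must disconnect.
    return (unknown & INCOMPATIBLE_MASK) == 0
```

 A client would run this check on both flags fields and, if either one
 fails, move on to the next address in the bus address string.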

 The hello structure also contains another flags field "attach_flags"
 which indicates metadata that is optionally attached to all incoming
 messages. You probably want to set KDBUS_ATTACH_NAMES unconditionally
 in it. This has the effect that all well-known names of a sender are
 attached to all incoming messages. You need this information to
 implement matches that match on a message sender name correctly. Of
 course, you should only request the attachment of as few metadata
 fields as you need.

 The kernel will return your unique id in the "id" field. This is a
 simple numeric value. For compatibility with classic dbus1 simply
 format this as a string and prefix it with ":1.".
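
 As a small illustration of this mapping, the following sketch converts
 between the numeric id and the dbus1-style unique name. The function
 names are made up for this example.

```python
def unique_name(kernel_id: int) -> str:
    """Format a numeric kdbus id as a dbus1-style unique name."""
    return ":1.%d" % kernel_id


def parse_unique_name(name: str) -> int:
    """Inverse of unique_name(); rejects names without the ":1." prefix."""
    if not name.startswith(":1."):
        raise ValueError("not a kdbus-style unique name: %r" % name)
    return int(name[3:])
```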

 The kernel will also return the bloom filter size and bloom filter
 hash function number used for the signal broadcast bloom filter (see
 below).

 The kernel will also return the bus ID of the bus in a 128bit field.

 The pool size field specifies the size of the memory mapped buffer.
 After calling the hello ioctl, you should memory map the kdbus
 fd. In this memory mapped region, the kernel will place all your incoming
 messages.

 SENDING MESSAGES

 Use the MSG_SEND ioctl to send a message to another peer. The ioctl
 takes a structure that contains a variety of fields:

 The flags field corresponds closely to the old dbus1 message header
 flags field, though the DONT_EXPECT_REPLY flag got inverted into
 EXPECT_REPLY.

 The dst_id/src_id fields contain the unique ids of the destination and
 the sender. The sender field is usually overridden by the kernel, hence
 you shouldn't fill it in. The destination field can also take the
 special value KDBUS_DST_ID_BROADCAST for broadcast messages. For
 messages intended for a well-known name set the field to
 KDBUS_DST_ID_NAME, and attach the name in a special "items" entry to
 the message (see below).

 The payload field indicates the type of the payload. For all dbus
 traffic it should carry the value 0x4442757344427573ULL (which encodes
 'DBusDBus').
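
 To see how the constant relates to the string, read the 64bit value
 as eight ASCII bytes, most significant byte first. The constant name
 and the helper below are chosen for this example only.

```python
# Name chosen for this sketch; the value is the one given in the text.
PAYLOAD_DBUS = 0x4442757344427573


def payload_tag(text: str) -> int:
    """Pack an 8 character ASCII tag into a 64bit integer, MSB first."""
    raw = text.encode("ascii")
    if len(raw) != 8:
        raise ValueError("tag must be exactly 8 ASCII characters")
    return int.from_bytes(raw, "big")
```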

 The cookie field corresponds with the "serial" field of classic
 dbus1. We simply renamed it here (and extended it to 64bit) since we
 didn't want to imply the monotonicity of the assignment the way the
 word "serial" indicates it.

 When sending a message that expects a reply, you need to set the
 EXPECT_REPLY flag in the message flag field. In this case you should
 also fill out the "timeout_ns" value which indicates the timeout in
 nsec for this call. If the peer does not respond in this time you will
 get a notification of a timeout. Note that this is also used for
 security purposes: a single reply message is only allowed through the
 bus as long as the timeout has not ended. With this timeout value you
 hence "open a time window" in which the peer might respond to your
 request and the policy allows the response to go through.

 When sending a message that is a reply, you need to fill in the
 cookie_reply field, which is similar to the reply_serial field of
 dbus1. Note that a message cannot have EXPECT_REPLY and a cookie_reply
 at the same time!

 This pretty much explains the ioctl header. The actual payload of the
 data is now referenced in additional items that are attached to this
 ioctl header structure at the end. When sending a message, you attach
 items of the type PAYLOAD_VEC, PAYLOAD_MEMFD, FDS, BLOOM_FILTER,
 DST_NAME to it:

    KDBUS_ITEM_PAYLOAD_VEC: contains a pointer + length pair for
    referencing arbitrary user memory. This is how you reference most
    of your data. It's a lot like the good old iovec structure of glibc.

    KDBUS_ITEM_PAYLOAD_MEMFD: for large data blocks it is preferable
    to send prepared "memfds" (see below) over. This item contains an
    fd for a memfd plus a size.

    KDBUS_ITEM_FDS: for sending over fds attach an item of this type with
    an array of fds.

    KDBUS_ITEM_BLOOM_FILTER: the calculated bloom filter of this message,
    only for undirected (broadcast) messages.

    KDBUS_ITEM_DST_NAME: for messages that are directed to a well-known
    name (instead of a unique name), this item contains the well-known
    name field.

 A single message may consist of zero, one or more payload items of type
 PAYLOAD_VEC or PAYLOAD_MEMFD. D-Bus protocol implementations should
 treat them as a single block that just happens to be split up into
 multiple items. Some restrictions apply however:

    The message header in its entirety must be contained in a single
    PAYLOAD_VEC item.

    You may only split your message up right in front of each GVariant
    contained in the payload, immediately before the framing of a
    GVariant, and after any padding bytes, if there are any. The
    padding bytes must be wholly contained in the preceding
    PAYLOAD_VEC/PAYLOAD_MEMFD item. You may not split up basic types
    nor arrays of fixed types. The latter is necessary to allow APIs
    to return direct pointers to linear arrays of numeric
    values. Examples: The basic types "u", "s", "t" have to be in the
    same payload item. The arrays of fixed types "ay", "ai" have to be
    fully contained in the same payload item. For an array "as" or
    "a(si)", however, the only restriction is to keep each string
    individually in an uninterrupted item, and to keep the framing of
    each element and of the array in a single uninterrupted item; the
    various strings might end up in different items.

 Note again, that splitting up messages into separate items is up to the
 implementation. Also note that the kdbus kernel side might merge
 separate items if it deems this to be useful. However, the order in
 which items are contained in the message is left untouched.

 PAYLOAD_MEMFD items allow zero-copy data transfer (see below regarding
 the memfd concept). Note however that the overhead of mapping these
 makes them relatively expensive, and only worth the trouble for memory
 blocks > 512K (this value appears to be quite universal across
 architectures, as we tested). Thus we recommend sending PAYLOAD_VEC
 items for small messages and resorting to PAYLOAD_MEMFD items for
 messages > 512K. Since you might not know yet, while building up the
 message, whether it will grow beyond this boundary, a good approach is
 to simply build the message unconditionally in a memfd
 object. When the message is sealed to be sent away, check for
 the size limit. If the size of the message is < 512K, then simply send
 the data as PAYLOAD_VEC and reuse the memfd. If it is >= 512K, seal
 the memfd and send it as PAYLOAD_MEMFD, and allocate a new memfd for
 the next message.
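
 The decision at sealing time can be sketched as follows. The function
 name and the string return values are placeholders for this example;
 only the 512K boundary and the "< versus >=" rule come from the text.

```python
MEMFD_THRESHOLD = 512 * 1024  # 512K, as recommended above


def choose_payload_item(message_size: int) -> str:
    """Decide how a finished message body should be sent.

    Returns "PAYLOAD_VEC" when the data should be referenced from the
    memfd mapping and the memfd reused for the next message, or
    "PAYLOAD_MEMFD" when the memfd should be sealed and sent itself.
    """
    if message_size < 0:
        raise ValueError("message size cannot be negative")
    if message_size >= MEMFD_THRESHOLD:
        return "PAYLOAD_MEMFD"
    return "PAYLOAD_VEC"
```

 In the PAYLOAD_MEMFD case the caller also has to allocate a fresh
 memfd for the next message, since the sealed one is handed over.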

 RECEIVING MESSAGES

 Use the MSG_RECV ioctl to read a message from kdbus. This will return
 an offset into the pool memory map, relative to its beginning.

 The received message structure more or less follows the structure of
 the message originally sent. However, certain changes have been
 made. In the header the src_id field will be filled in.

 The payload items might have gotten merged and PAYLOAD_VEC items are
 not used. Instead, you will only find PAYLOAD_OFF and PAYLOAD_MEMFD
 items. The former contain an offset and size into your memory mapped
 pool where you find the payload.

 If during the HELLO ioctl you asked for getting metadata attached to
 your message, you will find additional KDBUS_ITEM_CREDS,
 KDBUS_ITEM_PID_COMM, KDBUS_ITEM_TID_COMM, KDBUS_ITEM_TIMESTAMP,
 KDBUS_ITEM_EXE, KDBUS_ITEM_CMDLINE, KDBUS_ITEM_CGROUP,
 KDBUS_ITEM_CAPS, KDBUS_ITEM_SECLABEL, KDBUS_ITEM_AUDIT items that
 contain this metadata. This metadata will be gathered from the sender
 at the point in time it sends the message. This information is
 uncached, and since it is appended by the kernel, trustable. The
 KDBUS_ITEM_SECLABEL item usually contains the SELinux security label,
 if it is used.

 After processing the message you need to call the KDBUS_CMD_FREE
 ioctl, which releases the message from the pool, and allows the kernel
 to store another message there. Note that the memory used by the pool
 is ordinary anonymous, swappable memory that is backed by tmpfs. Hence
 there is no need to copy the message out of it quickly, instead you
 can just leave it there as long as you need it and release it via the
 FREE ioctl only after that's done.

 BLOOM FILTERS

 The kernel does not understand dbus marshaling; it will not look into
 the message payload. To allow clients to subscribe to specific subsets
 of the broadcast messages we employ bloom filters.

 When broadcasting messages, a bloom filter needs to be attached to the
 message in a KDBUS_ITEM_BLOOM item (and only for broadcast
 messages!). If you don't know what bloom filters are, read up now on
 Wikipedia. In short: they are a very efficient way to
 probabilistically check whether a certain word is contained in a
 vocabulary. They know no false negatives, but they do know false
 positives.

 The parameters for the bloom filters that need to be included in
 broadcast messages are communicated to userspace as part of the hello
 response structure (see above). By default the filter has the
 parameters m=512 (bits in the filter), k=8 (nr of hash functions). Note
 however, that
 this is subject to change in later versions, and userspace
 implementations must be capable of handling m values between at least
 m=8 and m=2^32, and k values between at least k=1 and k=32. The
 underlying hash function is SipHash-2-4. It is used with a number of
 constant (yet originally randomly generated) 128bit hash keys, more
 specifically:

    b9,66,0b,f0,46,70,47,c1,88,75,c4,9c,54,b9,bd,15,
    aa,a1,54,a2,e0,71,4b,39,bf,e1,dd,2e,9f,c5,4a,3b,
    63,fd,ae,be,cd,82,48,12,a1,6e,41,26,cb,fa,a0,c8,
    23,be,45,29,32,d2,46,2d,82,03,52,28,fe,37,17,f5,
    56,3b,bf,ee,5a,4f,43,39,af,aa,94,08,df,f0,fc,10,
    31,80,c8,73,c7,ea,46,d3,aa,25,75,0f,9e,4c,09,29,
    7d,f7,18,4b,7b,a4,44,d5,85,3c,06,e0,65,53,96,6d,
    f2,77,e9,6f,93,b5,4e,71,9a,0c,34,88,39,25,bf,35

 When calculating the first bit index into the bloom filter, the
 SipHash-2-4 hash value is calculated for the input data and the first
 16 bytes of the array above as hash key. Of the resulting 8 bytes of
 output, as many full bytes are taken for the bit index as necessary,
 starting from the output's first byte. For the second bit index the
 same hash value is used, continuing with the next unused output byte,
 and so on. Each time the bytes returned by the hash function are
 depleted it is recalculated with the next 16 byte hash key from the
 array above and the same input data.
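
 The byte-consumption scheme just described can be sketched like this.
 Several things here are assumptions made for the example and should be
 checked against libsystemd: m is taken to be a power of two, the
 collected bytes are combined most significant byte first, and the
 result is reduced to the filter size by masking with m-1. The hash
 function is passed in as a parameter; stand_in_hash() below is NOT
 SipHash-2-4, it only stands in for it so that the sketch runs with
 nothing but the standard library.

```python
import hashlib


def bloom_bit_indexes(data, m_bits, k, keys, hash8):
    """Derive k bit indexes for data in a filter of m_bits bits.

    keys:  list of 16 byte hash keys, used in order.
    hash8: callable (data, key) -> 8 bytes; SipHash-2-4 in a real port.
    """
    if m_bits < 8 or m_bits & (m_bits - 1):
        raise ValueError("this sketch assumes m is a power of two >= 8")
    # Number of full bytes needed to address every bit of the filter.
    width = ((m_bits - 1).bit_length() + 7) // 8
    indexes = []
    unused = b""   # output bytes of the current hash not consumed yet
    key_no = 0
    for _ in range(k):
        value = 0
        for _ in range(width):
            if not unused:
                # Output depleted: rehash the same data with the next key.
                unused = hash8(data, keys[key_no])
                key_no += 1
            value = (value << 8) | unused[0]
            unused = unused[1:]
        indexes.append(value & (m_bits - 1))  # assumed reduction step
    return indexes


def stand_in_hash(data, key):
    """Placeholder keyed hash with 8 bytes of output (not SipHash-2-4)."""
    return hashlib.blake2b(data, key=key, digest_size=8).digest()
```

 With the default m=512 each index consumes two output bytes, so k=8
 indexes need 16 bytes, i.e. the first two keys of the array above.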

 For each message to send across the bus we populate the bloom filter
 with all possible matchable strings. If a client then wants to
 subscribe to messages of this type, it simply tells the kernel to test
 its own calculated bit mask against the bloom filter of each message.

 More specifically, the following strings are added to the bloom filter
 of each message that is broadcast:

   The string "interface:" suffixed by the interface name

   The string "member:" suffixed by the member name

   The string "path:" suffixed by the path name

   The string "path-slash-prefix:" suffixed with the path name, and
   also all prefixes of the path name (cut off at "/"), also prefixed
   with "path-slash-prefix".

   The string "message-type:" suffixed with the strings "signal",
   "method_call", "error" or "method_return" for the respective message
   type of the message.

   If the first argument of the message is a string, "arg0:" suffixed
   with the first argument.

   If the first argument of the message is a string, "arg0-dot-prefix"
   suffixed with the first argument, and also all prefixes of the
   argument (cut off at "."), also prefixed with "arg0-dot-prefix".

   If the first argument of the message is a string,
   "arg0-slash-prefix" suffixed with the first argument, and also all
   prefixes of the argument (cut off at "/"), also prefixed with
   "arg0-slash-prefix".

   Similar for all further arguments that are strings up to 63, for the
   arguments and their "dot" and "slash" prefixes. On the first
   argument that is not a string, addition to the bloom filter should be
   stopped however.
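
 The list above can be sketched as a function that collects all the
 strings for one message. Treat this as an illustration only: the
 function names are invented, a ":" separator is assumed for every
 prefix form (as in "path-slash-prefix:"), and the sketch skips the
 empty prefix, so for "/org/example" it yields "/org" but not a bare
 "/". Check libsystemd for the exact edge cases.

```python
def cut_prefixes(value, sep):
    """Return the shorter prefixes of value, cut off at each sep."""
    result = []
    while sep in value:
        value = value[: value.rindex(sep)]
        if value:
            result.append(value)
    return result


def bloom_strings(msg_type, interface, member, path, args):
    """Collect the strings to add to the bloom filter of one broadcast."""
    out = [
        "message-type:" + msg_type,
        "interface:" + interface,
        "member:" + member,
        "path:" + path,
        "path-slash-prefix:" + path,
    ]
    out += ["path-slash-prefix:" + p for p in cut_prefixes(path, "/")]
    for i, arg in enumerate(args[:64]):      # arguments 0 up to 63
        if not isinstance(arg, str):
            break                            # stop at first non-string
        out.append("arg%d:%s" % (i, arg))
        for label, sep in (("dot", "."), ("slash", "/")):
            tag = "arg%d-%s-prefix:" % (i, label)
            out.append(tag + arg)
            out += [tag + p for p in cut_prefixes(arg, sep)]
    return out
```

 Each returned string would then be fed, as bytes, into the bit index
 calculation described earlier and the resulting bits set in the filter.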

 (Note that the bloom filter does not contain sender nor receiver
 names!)

 When a client wants to subscribe to messages matching a certain
 expression, it should calculate the bloom mask following the same
 algorithm. The kernel will then simply test the mask against the
 attached bloom filters.

 Note that bloom filters are probabilistic, which means that clients
 might get messages they did not expect. Your bus protocol
 implementation must be capable of dealing with these unexpected
 messages (which it needs to anyway, given that transfers are
 relatively unrestricted on kdbus and people can send you all kinds of
 nonsense).

 If a client connects to a bus whose bloom filter metrics (i.e. filter
 size and number of hash functions) are outside of the range the client
 supports, it must immediately disconnect and continue with the next
 bus address of the bus connection string.

 INSTALLING MATCHES

 To install matches for broadcast messages, use the KDBUS_CMD_ADD_MATCH
 ioctl. It takes a structure that contains an encoded match expression,
 and that is followed by one or more items, which are combined in an
 AND way. (Meaning: a message is matched exactly when all items
 attached to the original ioctl struct match).

 To match against other user messages add a KDBUS_ITEM_BLOOM item in
 the match (see above). Note that the bloom filter does not include
 matches to the sender names. To additionally check against sender
 names, use the KDBUS_ITEM_ID (for unique id matches) and
 KDBUS_ITEM_NAME (for well-known name matches) item types.

 To match against kernel generated messages (see below) you should add
 items of the same type as the kernel messages include,
 i.e. KDBUS_ITEM_NAME_ADD, KDBUS_ITEM_NAME_REMOVE,
 KDBUS_ITEM_NAME_CHANGE, KDBUS_ITEM_ID_ADD, KDBUS_ITEM_ID_REMOVE and
 fill them out. Note however, that you have some wildcards in this
 case, for example the .id field of KDBUS_ITEM_ID_ADD/KDBUS_ITEM_ID_REMOVE
 structures may be set to 0 to match against any id addition/removal.

 Note that dbus match strings do not map 1:1 to these ioctl() calls. In
 many cases (where the match string is "underspecified") you might need
 to issue up to six different ioctl() calls for the same match. For
 example, the empty match (which matches against all messages), would
 translate into one KDBUS_ITEM_BLOOM ioctl, one KDBUS_ITEM_NAME_ADD,
 one KDBUS_ITEM_NAME_CHANGE, one KDBUS_ITEM_NAME_REMOVE, one
 KDBUS_ITEM_ID_ADD and one KDBUS_ITEM_ID_REMOVE.

 When creating a match, you may attach a "cookie" value to it, which
 is used for deleting this match again. The cookie can be selected freely
 by the client. When issuing KDBUS_CMD_REMOVE_MATCH, simply pass the
 same cookie as before and all matches with the same "cookie" value
 will be removed. This is particularly handy for the case where multiple
 ioctl()s are issued for a single match string.

 MEMFDS

 memfds may be sent across kdbus via KDBUS_ITEM_PAYLOAD_MEMFD items
 attached to messages. If this is done, the data included in the memfd
 is considered part of the payload stream of a message, and is treated
 the same way as KDBUS_ITEM_PAYLOAD_VEC by the receiving side. It is
 possible to interleave KDBUS_ITEM_PAYLOAD_MEMFD and
 KDBUS_ITEM_PAYLOAD_VEC items freely. The reader will consider them
 a single stream of bytes in the order these items appear in
 the message, that just happens to be split up at various places
 (regarding rules how they may be split up, see above). The kernel will
 refuse taking KDBUS_ITEM_PAYLOAD_MEMFD items that refer to memfds that
 are not sealed.

 Note that sealed memfds may be unsealed again if they are not mapped
 and you have the only fd reference to them.

 As an alternative to sending memfds as KDBUS_ITEM_PAYLOAD_MEMFD items
 (where they are just a part of the payload stream of a message) you can
 also simply attach any memfd to a message using
 KDBUS_ITEM_FDS. In this case, the memfd contents are not
 considered part of the payload stream of the message; the memfds are
 simply fds like any other that happen to be attached to the message.

 MESSAGES FROM THE KERNEL

 A couple of messages previously generated by the dbus1 bus driver are
 now generated by the kernel. Since the kernel does not understand the
 payload marshaling, they are generated by the kernel in a different
 format. This is indicated by the "payload type" field of the
 messages being set to 0. Library implementations should take these
 messages and synthesize traditional driver messages for them on
 reception.

 More specifically:

    Instead of the NameOwnerChanged, NameLost and NameAcquired signals,
    kernel messages containing KDBUS_ITEM_NAME_ADD,
    KDBUS_ITEM_NAME_REMOVE, KDBUS_ITEM_NAME_CHANGE, KDBUS_ITEM_ID_ADD,
    KDBUS_ITEM_ID_REMOVE items are generated (each message will contain
    exactly one of these items). Note that in libsystemd we have
    obsoleted NameLost/NameAcquired messages, since they are entirely
    redundant to NameOwnerChanged. This library will hence only
    synthesize NameOwnerChanged messages from these kernel messages,
    and never generate NameLost/NameAcquired. If your library needs to
    stay compatible with the old dbus1 userspace, you might need
    to synthesize both a NameOwnerChanged and a NameLost/NameAcquired
    message from the same kernel message.

    When a method call times out, a KDBUS_ITEM_REPLY_TIMEOUT message is
    generated. This should be synthesized into a method error reply
    message to the original call.

    When a method call fails because the peer terminated the connection
    before responding, a KDBUS_ITEM_REPLY_DEAD message is
    generated. Similarly, it should be synthesized into a method error
    reply message.

 For synthesized messages we recommend setting the cookie field to
 (uint32_t) -1 (and not (uint64_t) -1!), so that the cookie is not 0
 (which the dbus1 spec does not allow), but clearly recognizable as
 synthetic.
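
 The difference between the two casts is easy to get wrong, so here is
 a small sketch of it. The constant and helper names are made up for
 this example; ctypes is only used to show what the C casts evaluate to.

```python
import ctypes

# (uint32_t) -1, stored in the 64bit cookie field.
SYNTHETIC_COOKIE = ctypes.c_uint32(-1).value

# (uint64_t) -1, which is NOT the recommended value.
NOT_RECOMMENDED = ctypes.c_uint64(-1).value


def is_synthetic(cookie: int) -> bool:
    """True if a message cookie marks a locally synthesized message."""
    return cookie == SYNTHETIC_COOKIE
```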

 Note that the KDBUS_ITEM_NAME_XYZ messages will actually inform you
 about all kinds of names, including activatable ones. Classic dbus1
 NameOwnerChanged messages OTOH are only generated when a name is
 really acquired on the bus and not just simply activatable. This means
 you must explicitly check for the case where an activatable name
 becomes acquired or an acquired name is lost and returns to be
 activatable.

 NAME REGISTRY

 To acquire names on the bus, use the KDBUS_CMD_NAME_ACQUIRE ioctl(). It
 takes a flags field similar to dbus1's RequestName() bus driver call,
 however the NO_QUEUE flag got inverted into a QUEUE flag instead.

 To release a previously acquired name use the KDBUS_CMD_NAME_RELEASE
 ioctl().

 To list acquired names use the KDBUS_CMD_CONN_INFO ioctl. It may be
 used to list unique names, well-known names as well as activatable
 names and clients currently queuing for ownership of a well-known
 name. The ioctl will return an offset into the memory pool. After
 reading all the data you need, you need to release this via the
 KDBUS_CMD_FREE ioctl(), similar to how you release a received message.

 CREDENTIALS

 kdbus can optionally attach various kinds of metadata about the sender at
 the point of time of sending ("credentials") to messages, on request
 of the receiver. This is supported on both directed and undirected
 (broadcast) messages. The metadata to attach is selected at time of
 the HELLO ioctl of the receiver via a flags field (see above). Note
 that clients must be able to handle that messages contain more
 metadata than they asked for themselves, to simplify implementation of
 broadcasting in the kernel. The receiver should not rely on this data
 to be around though, even though it will be correct if it happens to
 be attached. In order to avoid programming errors in applications, we
 recommend though not passing this data on to clients that did not
 explicitly ask for it.

 Credentials may also be queried for a well-known or unique name. Use
 the KDBUS_CMD_CONN_INFO for this. It will return an offset to the pool
 area again, which will contain the same credential items as messages
 have attached. Note that when issuing the ioctl, you can select a
 different set of credentials to gather, than what was originally requested
 for being attached to incoming messages.

 Credentials are always specific to the sender's domain that was
 current at the time of sending, and of the process that opened the
 bus connection at the time of opening it. Note that this latter data
 is cached!

 POLICY

 The kernel enforces only very limited policy on names. It will not do
 access filtering by userspace payload, and thus not by interface or
 method name.

 This ultimately means that most fine-grained policy enforcement needs
 to be done by the receiving process. We recommend using PolicyKit for
 any more complex checks. However, libraries should make simple static
 policy decisions regarding privileged/unprivileged method calls
 easy. We recommend doing this by enabling KDBUS_ATTACH_CAPS and
 KDBUS_ATTACH_CREDS for incoming messages, and then discerning client
 access by some capability, or if sender and receiver UIDs match.
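
 A simple static check of this kind might look like the sketch below.
 It assumes the library has already extracted the sender UID and the
 sender's capability set from the KDBUS_ITEM_CREDS and KDBUS_ITEM_CAPS
 items of the incoming message; the function name, the parameter names
 and the use of capability names as strings are all illustrative.

```python
def call_allowed(sender_uid, receiver_uid, sender_caps, required_cap):
    """Static policy check for a privileged method call.

    sender_caps:  set of capability names held by the sender, taken
                  from the metadata attached to the message.
    required_cap: the capability that grants access to this method.
    """
    if sender_uid == receiver_uid:
        return True                      # same user as the service
    return required_cap in sender_caps   # or suitably privileged
```

 Anything more complex than this kind of test is better delegated to
 PolicyKit, as recommended above.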

 BUS ADDRESSES

 When connecting to kdbus use the "kernel:" protocol prefix in DBus
 address strings. The device node path is encoded in its "path="
 parameter.

 Client libraries should use the following connection string when
 connecting to the system bus:

    kernel:path=/sys/fs/kdbus/0-system/bus;unix:path=/var/run/dbus/system_bus_socket

 This will ensure that kdbus is preferred over the legacy AF_UNIX
 socket, but compatibility is kept. For the user bus use:

    kernel:path=/sys/fs/kdbus/$UID-user/bus;unix:path=$XDG_RUNTIME_DIR/bus

 With $UID replaced by the caller's numeric user ID, and $XDG_RUNTIME_DIR
 following the XDG basedir spec.

 Of course the $DBUS_SYSTEM_BUS_ADDRESS and $DBUS_SESSION_BUS_ADDRESS
 variables should still take precedence.
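
 Putting the address rules together, a client library could pick and
 split its address string roughly as follows. This is a sketch: the
 function names are invented, the environment is passed in as a plain
 mapping, and the parser ignores the escaping rules of real D-Bus
 address strings.

```python
DEFAULT_SYSTEM_ADDRESS = (
    "kernel:path=/sys/fs/kdbus/0-system/bus;"
    "unix:path=/var/run/dbus/system_bus_socket"
)


def system_bus_address(environ):
    """The environment variable wins over the built-in default."""
    return environ.get("DBUS_SYSTEM_BUS_ADDRESS") or DEFAULT_SYSTEM_ADDRESS


def user_bus_address(environ, uid):
    """Same for the user bus; needs XDG_RUNTIME_DIR for the default."""
    override = environ.get("DBUS_SESSION_BUS_ADDRESS")
    if override:
        return override
    runtime_dir = environ["XDG_RUNTIME_DIR"]
    return "kernel:path=/sys/fs/kdbus/%d-user/bus;unix:path=%s/bus" % (
        uid, runtime_dir)


def parse_address(address):
    """Split an address string into (transport, parameters) entries,
    in the order in which they should be tried."""
    entries = []
    for entry in address.split(";"):
        if not entry:
            continue
        transport, _, rest = entry.partition(":")
        params = dict(item.split("=", 1) for item in rest.split(",") if item)
        entries.append((transport, params))
    return entries
```

 A caller would typically pass os.environ and os.getuid(), then try the
 parsed entries one after the other until a connection succeeds.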

 DBUS SERVICE FILES

 Activatable services for kdbus may not use classic dbus1 service
 activation files. Instead, programs should drop in native systemd
 .service and .busname unit files, so that they are treated uniformly
 with other types of units and activation of the system.

 Note that this results in a major difference to classic dbus1:
 activatable bus names can be established at any time in the boot process.
 This is unlike dbus1 where activatable names are unconditionally available
 as long as dbus-daemon is running. Being able to control when
 activatable names are established is essential to allow usage of kdbus
 during early boot and in initrds, without the risk of triggering
 services too early.

 DISCLAIMER

 All of this is so far just the status quo. We are putting this together
 because we are quite confident that further API changes will be smaller,
 but to make this very clear: this is all subject to change, still!

 We invite you to port over your favorite dbus library to this new
 scheme, but please be prepared to make minor changes when we still
 change these interfaces!