Writing systemd Units 2016-06-23


Systemd has become the de facto standard init for Linux-based systems. While not everyone has made the switch yet, pretty much all the major distros have decided to switch.

For most people this has not meant all that much yet, other than a lot of controversy. Systemd has built-in SysV init compatibility, and so it's possible to avoid dealing with it quite well.

But there is much to be gained from picking up some basics. Systemd is very powerful.

I'm not going to deal with the basics of interacting with systemd as that's well covered elsewhere. You can find a number of basic tips and tricks here.

Instead I want to talk about how to write systemd units.


(Click here to read about my consulting services - Devops, Linux, Systemd, Cloud/AWS/GCE, Software development and more)


What is a systemd unit?

A systemd unit is pretty much the "superclass" of a bunch of other systemd concepts.

Systemd lets you manage dependencies between services (your applications, daemons etc.), sockets, mountpoints and a number of other things. Each of these is a unit.

All units share a number of possible configuration options.

Systemd schedules the activation and deactivation of such units based on a dependency graph and various events.

Here is an example of displaying an installed unit file (dbus.service) using systemctl:


    $ systemctl cat dbus
    # /usr/lib64/systemd/system/dbus.service
    [Unit]
    Description=D-Bus System Message Bus
    Documentation=man:dbus-daemon(1)
    Requires=dbus.socket
    
    [Service]
    ExecStart=/usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
    ExecReload=/usr/bin/dbus-send --print-reply --system --type=method_call --dest=org.freedesktop.DBus / org.
    OOMScoreAdjust=-900

A couple of observations:

  1. Using systemctl for this is quite useful, as the unit files can be installed in any number of places. Systemctl both saves you from having to know where, and shows you which file each part of the aggregate systemd config for a given unit comes from.
  2. As you can see, the format is an ".ini" style (or ".desktop") text format.

The [Unit] section is generic for all unit files. [Service] is specific to service files.

The "Unit" section above is fairly standard and self-explanatory. "Requires" creates a dependency on the "dbus.socket" unit. Systemd provides a rich set of ways of indicating dependencies to allow it to parallelise startup as much as possible while also making it easy to ensure units are brought up and down in the pre-requisite order.

Systemd units can also specify conditions indicating whether or not the actual activation of the unit should be skipped. For example you may choose to only activate a unit if a given file or directory exists.
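
As a minimal sketch of such a condition, the unit below is only activated if a marker file exists. The path /etc/myapp/enabled and the description are made up for illustration; the ConditionPathExists= directive itself is documented in the systemd.unit man-page.

```ini
[Unit]
Description=Example unit that only runs when a marker file exists
# Hypothetical path. If it is missing, systemd skips activation
# without treating the unit as failed.
ConditionPathExists=/etc/myapp/enabled

[Service]
ExecStart=/bin/echo "Marker file found, running"
```

There are related checks, such as ConditionDirectoryNotEmpty=, for other kinds of tests.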

Systemd services

The first thing you may want to actually use is likely a systemd service. A service is any process that should be controlled by systemd.

This can be a "oneshot" service: a process that's executed, runs and then exits, and is not meant to be restarted other than when something explicitly asks for it (via dependencies, or user action). This might be, for example, a process that reads a config file and updates various files on the server on every boot.

Services can also be long-running processes, such as e.g. a web server.

Unlike SysV init, systemd can and typically will handle automatic restarts, and comes with a rich set of configuration options for determining how it should deal with errors, cleanup, notifications etc.

With the right setup it can also handle cleanup of child-processes etc. pretty much automatically.

Let's write a simple one-shot service first.

For the purposes of testing, if you have a systemd system handy (a VM works fine; recent versions of Debian can run Systemd, and CoreOS is another good alternative), you can put this in /etc/systemd/system/oneshot.service:


    [Unit]
    Description=Our test oneshot service
    
    [Service]
    Type=oneshot
    ExecStartPre=/bin/echo "Starting"
    ExecStart=/bin/echo "Hello World"
    ExecStopPost=/bin/echo "Stopping"

Then try this:


    $ systemctl start oneshot
    $ systemctl status oneshot
    ● oneshot.service - Our test oneshot service
       Loaded: loaded (/etc/systemd/system/oneshot.service; static)
       Active: inactive (dead)
          
    Oct 27 04:02:21 example.com echo[3570]: Starting
    Oct 27 04:02:21 example.com echo[3575]: Hello World
    Oct 27 04:02:21 example.com echo[3578]: Stopping

It should seem reasonably self-explanatory, I hope.

The ExecStartPre command is executed before the ExecStart command. ExecStopPost is executed after the service has been stopped. ExecStart represents the command that will start the service. As long as this command is running, the service is running.

There can be more than one ExecStartPre and ExecStopPost. The ExecStartPre commands have to finish before ExecStart is run, but for the purposes of systemd's dependency resolution it is ExecStart that determines when the service counts as started: for the default service type, a unit ordered after ours will not be scheduled until the ExecStart command has started executing, while for Type=oneshot, as here, systemd waits until the ExecStart command has exited.

In this case, since Type=oneshot, the service instantly stops without any attempt to restart it.
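
To illustrate the point about multiple commands, here is a sketch of the same kind of oneshot service with two ExecStartPre and two ExecStopPost lines. The echo messages are made up; the only thing being shown is that repeated directives are allowed and run in the order they are listed.

```ini
[Unit]
Description=Oneshot service with several pre and post commands

[Service]
Type=oneshot
# Commands of the same kind run one after another, in the order listed.
ExecStartPre=/bin/echo "Preparing step 1"
ExecStartPre=/bin/echo "Preparing step 2"
ExecStart=/bin/echo "Hello World"
ExecStopPost=/bin/echo "Cleaning up step 1"
ExecStopPost=/bin/echo "Cleaning up step 2"
```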

Notice how I wrote /bin/echo instead of echo? systemd is picky about the syntax for the "Exec*" commands. You can read more about the details of that on the systemd.service man-page.

You could, if you wanted to, use this to package up stuff to execute on your server with systemctl start, but in itself it would not buy you anything over just putting the commands in a script.

But if you want to execute something that needs to be executed based on dependencies, or that needs to pull in other dependencies, you now have a starting point. The missing piece to make the above execute on boot is to add an [Install] section:


    [Install]
    WantedBy=multi-user.target

Basically the "Install" section lists directives you want systemd to add to another unit when you call "systemctl enable myservicefile" (instead of systemctl start myservicefile).

(NOTE: If you want to test changes to an existing service file, you need to do systemctl daemon-reload in between to make systemd pick up the changes, unless you reboot or call systemctl enable)

Starting a long-running service

A long-running service is very similar to a oneshot command.


    [Unit]
    Description=Foo
    
    [Service]
    ExecStart=/bin/sh -c 'while true; do echo "Running"; sleep 5; done'
    
    [Install]
    WantedBy=multi-user.target

You'll note the lack of Type=oneshot. And I've omitted the ExecStartPre / ExecStopPost directives, though it's perfectly valid to include them for long-running services.

And after copying this to /etc/systemd/system as foo.service and running systemctl enable foo and systemctl start foo, systemctl status foo now gives (after waiting a few seconds to show it runs through more than once):


    ● foo.service - Foo
       Loaded: loaded (/etc/systemd/system/foo.service; enabled)
       Active: active (running) since Tue 2015-10-27 04:24:43 GMT; 5s ago
     Main PID: 7818 (sh)
       CGroup: /system.slice/foo.service
               ├─7818 /bin/sh -c while true; do echo "Running"; sleep 5; done
               └─7923 sleep 5
    
    Oct 27 04:24:43 example.com sh[7818]: Running
    Oct 27 04:24:48 example.com sh[7818]: Running
    

The biggest difference you'll note is that it now shows information about the processes that have been started (and compared to SysV init systems, notice how awesome it is to have this information so readily available). Systemd tries very hard to keep track of all processes started for your service, by using cgroups to isolate them so that it can clean up even quite misbehaved services.

If you now do systemctl stop foo and systemctl status foo again you'll note it gives you the exit code etc.

Stopping, reloading etc

Using ExecStop and ExecReload you can determine which commands systemd will execute to try to stop or reload your process gracefully. In the absence of ExecStop, systemd will send SIGTERM and then SIGKILL to your processes when asked to stop or restart them, so for many applications you can leave out explicit stop instructions. If your app supports a graceful "reload" operation, then you should specify ExecReload so people can reload it without looking up your specific mechanism.
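
A minimal sketch of what this can look like, assuming a daemon that re-reads its config when it receives SIGHUP. The binary /usr/local/bin/mydaemon and its --foreground flag are hypothetical; $MAINPID is a variable systemd sets to the PID of the main service process.

```ini
[Unit]
Description=Hypothetical daemon that reloads its config on SIGHUP

[Service]
# Made-up binary and flag; substitute your own.
ExecStart=/usr/local/bin/mydaemon --foreground
# "systemctl reload" will run this command.
ExecReload=/bin/kill -HUP $MAINPID
# No ExecStop= here: systemd falls back to SIGTERM, then SIGKILL.
```

With this in place, systemctl reload followed by the unit name sends the signal for you, so nobody needs to know how this particular daemon wants to be told to reload.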

Adding dependencies

But just the simple examples above do not give much over SysV init and derivatives, so let us do something more advanced.

Let's say you have a service that needs another service to run before it will work. E.g. before starting your web server, you'll probably want the network to be up.

You can do this easily by specifying scheduling dependencies.

Let's say our "oneshot" service requires foo.service to be running first:


    [Unit]
    Description=Our test oneshot service
    After=foo.service
    Requires=foo.service
    
    [Service]
    Type=oneshot
    ExecStartPre=/bin/echo "Starting"
    ExecStart=/bin/echo "Hello World"
    ExecStopPost=/bin/echo "Stopping"

Notice how we've added both After and Requires:

After specifies an ordering dependency. In other words, which unit gets started first assuming both get scheduled to start. But without other directives, it does not specify that foo.service needs to be started.

Requires on the other hand specifies that foo.service must be started when oneshot.service is started, but does not explicitly require an order.

Without the After, they could have been started at the same time. In this case it wouldn't have mattered, but try to enforce ordering where there is a genuine dependency.

To test this, copy the new oneshot.service file into place and don't forget to do systemctl daemon-reload. Then try systemctl start oneshot followed by systemctl status foo (I'm assuming you still have it in place from earlier).

You should see that "foo" has been started.

See the man-page for more on the dependencies.
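
Returning to the web server case mentioned at the start of this section, a sketch of the same idea might look like the following. The binary /usr/local/bin/mywebserver is hypothetical. network-online.target is a standard systemd target, and Wants= is a weaker form of Requires= that does not fail this unit if the wanted unit cannot be started.

```ini
[Unit]
Description=Hypothetical web server
# Pull network-online.target in when this unit is started...
Wants=network-online.target
# ...and do not start this unit until that target has been reached.
After=network-online.target

[Service]
# Made-up binary; substitute your own.
ExecStart=/usr/local/bin/mywebserver

[Install]
WantedBy=multi-user.target
```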

Automatically stopping units

You may have thought "hmm, but foo.service isn't needed anymore". If foo.service is only ever needed by services managed by systemd on this system, you may want to have it automatically shut down when no other unit requires it to be running. You can do that like this:


    [Unit]
    Description=Foo
    StopWhenUnneeded=true
    
    [Service]
    ExecStart=/bin/sh -c 'while true; do echo "Running"; sleep 5; done'

Copy this into place in /etc/systemd/system/foo.service. Run systemctl stop foo to make sure the previous instance was stopped, and systemctl daemon-reload to load the modified service file.

Now try systemctl start oneshot again. If you do systemctl status foo it should be stopped, but there should be a single new "Running" line in the log output, proving that it was started, then stopped again when there were no longer any dependencies requiring it.

Ensuring services are automatically restarted

Systemd supports a wide range of mechanisms to determine how to restart failed services.

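The most basic is to add a Restart=something line to the [Service] section.

"something" can be a number of things, but the ones you are most likely to use are no (the default: never restart automatically), on-failure (restart when the service exits with a non-zero exit code, is killed by a signal, or times out) and always (restart no matter how the service exited).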

When restarting the service you may want to add a delay to prevent overloading the system if there's a real problem. You can do this with RestartSec=time-value where "time-value" can be e.g. 5s or 200ms or 5min 20s.

One gotcha is that systemd will enforce this interval even if you manually run systemctl restart servicename. While this can be useful to enforce e.g. safe restart intervals for some processes, don't overdo it.

You may also want to consider defining TimeoutStartSec= which specifies how long systemd should wait for the startup to complete before it considers the startup to have failed and tries to restart the unit.

If this is set too low, your service may end up just restarting repeatedly.

(Note that I specifically said it will wait for the startup to complete. For a plain long-running service like foo.service, this is until ExecStart has started execution (but not until it has exited), whereas for Type=oneshot it is until ExecStart has exited. Systemd also has a mechanism to let the starting process notify systemd when it has fully initialized, which can be very useful if you have a process that takes a while before it is usefully accessible, as this makes for better dependencies. We won't be looking into that here.)

Also relevant is StartLimitInterval=. If a unit is started too often within this interval (more than StartLimitBurst= times), systemd will mark the unit as failed and stop trying to restart it. If this is causing problems, either change the interval or specify StartLimitInterval=0, in which case systemd will never stop trying to restart the unit.

Be careful: if your process consumes a lot of resources when starting, and you set StartLimitInterval to 0 and set RestartSec too low, systemd will happily do what you've told it and proceed to put your system under a lot of load.
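
Putting the restart-related directives from this section together, a sketch of a fairly conservative setup could look like this. The binary path and all the numbers are assumptions chosen for illustration, not recommendations. Also note that newer systemd versions document the rate limit as StartLimitIntervalSec= in the [Unit] section; the older spelling used in this article is, as far as I know, still accepted for compatibility.

```ini
[Unit]
Description=Hypothetical service with a restart policy

[Service]
# Made-up binary; substitute your own.
ExecStart=/usr/local/bin/mydaemon
# Restart only after an unclean exit (non-zero status, signal, timeout).
Restart=on-failure
# Wait five seconds between restart attempts.
RestartSec=5s
# Treat startup as failed if it takes longer than 30 seconds.
TimeoutStartSec=30s
# Give up if the service is started more than 5 times within 60 seconds.
StartLimitInterval=60s
StartLimitBurst=5
```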

Handling failure

If your unit fails, systemd provides two mechanisms (at least...) to take corrective action; for example by notifying you:

OnFailure= in the [Unit] section lets you specify a unit to activate if the current unit fails.

For services, a FailureAction= lets you tell systemd to reboot or shut down your system if a service fails.

OnFailure is perhaps the most likely one to be something you might use.

Let's break oneshot.service and add something to get run on failure:


    [Unit]
    Description=Our test oneshot service
    OnFailure=failure.service
    
    [Service]
    Type=oneshot
    ExecStart=/bin/false

And failure.service:


    [Unit]
    Description=Our failure handler
    
    [Service]
    Type=oneshot
    ExecStart=/bin/echo "You failed"

After copying these into place, and issuing a "systemctl daemon-reload", you can do this:


    $ systemctl status failure
    ● failure.service - Our failure handler
       Loaded: loaded (/etc/systemd/system/failure.service; static)
       Active: inactive (dead)
    $ systemctl start oneshot
    Job for oneshot.service failed. See 'systemctl status oneshot.service' and 'journalctl -xn' for details.
    $ systemctl status failure
    ● failure.service - Our failure handler
       Loaded: loaded (/etc/systemd/system/failure.service; static)
       Active: inactive (dead) since Tue 2015-10-27 05:11:31 GMT; 3s ago
      Process: 19420 ExecStart=/bin/echo You failed (code=exited, status=0/SUCCESS)
     Main PID: 19420 (code=exited, status=0/SUCCESS)
    
    Oct 27 05:11:31 example.com echo[19420]: You failed

You can trigger e-mails; log more details; wake someone up with a page; reboot the system. Anything you want. And since this is handled by systemd, it will execute as long as systemd is still running and able to start it, regardless of how broken your service is, so it works even in e.g. cases where your service gets killed without being able to attempt to handle errors.

Template units

Here's a neat trick: Systemd supports "template" units. These have names like foo@.service. If you try to start foo@bar.service, systemd first looks for foo@bar.service. But if it doesn't exist, it will look for foo@.service. This is great for spinning up units with different arguments. In this case, let us quickly parameterise failure.service so that it can tell us what failed. First let's update oneshot.service:


    [Unit]
    Description=Our test oneshot service
    OnFailure=failure@oneshot.service
    
    [Service]
    Type=oneshot
    ExecStart=/bin/false

And replace failure.service with failure@.service:


    [Unit]
    Description=Our failure handler
    
    [Service]
    Type=oneshot
    ExecStart=/bin/echo "Service '%i' failed"

Reload as usual, and try systemctl start oneshot followed by systemctl status failure@oneshot:


    $ systemctl status failure@oneshot
    ● failure@oneshot.service - Our failure handler
       Loaded: loaded (/etc/systemd/system/failure@.service; static; vendor preset: disabled)
       Active: inactive (dead) since Tue 2016-01-12 02:57:54 UTC; 1min 6s ago
      Process: 9053 ExecStart=/bin/echo Service '%i' failed (code=exited, status=0/SUCCESS)
     Main PID: 9053 (code=exited, status=0/SUCCESS)
    
    Jan 12 02:57:54 example.com systemd[1]: Starting Our failure handler...
    Jan 12 02:57:54 example.com echo[9053]: Service 'oneshot' failed
    Jan 12 02:57:54 example.com systemd[1]: Started Our failure handler.

(There's nothing specific to failure handling about this, you can use it for anything. '%i' in your Exec* commands will be replaced with the part after '@' in the unit name - use it as you please)
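
As a sketch of a non-failure use of templates, imagine a worker program that takes the name of a queue to process. The file name worker@.service, the binary /usr/local/bin/worker and its --queue flag are all hypothetical; only the %i substitution is the systemd feature being shown.

```ini
# Hypothetical file: /etc/systemd/system/worker@.service
[Unit]
Description=Worker for queue %i

[Service]
# Starting worker@emails.service runs: /usr/local/bin/worker --queue emails
ExecStart=/usr/local/bin/worker --queue %i
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

With this one file in place, systemctl start worker@emails worker@reports would give you two independent instances, each with its own status, logs and restart handling.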

A slight taste of more advanced features

The above is just a tiny little sliver of what systemd provides. The real power comes once you start combining the above with conditions, timers, paths, mounts, sockets and more. This allows you to do fun things like wait for changes to paths, wait for a specific time, wait until a specific drive has been mounted, or until someone tries to connect to a socket, and much more.

But if you don't feel like deciphering those man-pages, here's a little (contrived) example: A poor man's message queue.

Let's replace oneshot.service again:


    [Unit]
    Description=Our test oneshot service
    
    [Service]
    Type=oneshot
    ExecStart=/bin/bash -c 'F=/tmp/oneshot-queue/$(ls /tmp/oneshot-queue/ | head -n1) && cat $F && rm $F'

The above ExecStart just picks the first filename in /tmp/oneshot-queue, cats it so its content ends up in the journal, and then removes the file. It's just a crude example.

Now create a oneshot.path that will monitor a path for us:


    [Unit]
    Description=Triggers oneshot when a message arrives in /tmp/oneshot-queue
    
    [Path]
    DirectoryNotEmpty=/tmp/oneshot-queue
    MakeDirectory=true
    Unit=oneshot.service

What this does is tell systemd to monitor /tmp/oneshot-queue (creating it if it doesn't exist), and as long as there are files in there, keep restarting oneshot.service repeatedly.

Copy them in place in /etc/systemd/system and issue systemctl daemon-reload as before, and you can test it like this:


    $ systemctl start oneshot.path
    $ systemctl status oneshot.path
    ● oneshot.path - Triggers oneshot when a message arrives in /tmp/oneshot-queue
       Loaded: loaded (/etc/systemd/system/oneshot.path; enabled)
       Active: active (waiting) since Tue 2015-10-27 05:34:38 GMT; 4s ago
    $ echo "hi there" >/tmp/oneshot-queue/x
    $ systemctl status oneshot
    ● oneshot.service - Our test oneshot service
       Loaded: loaded (/etc/systemd/system/oneshot.service; static)
       Active: inactive (dead) since Tue 2015-10-27 05:34:52 GMT; 4s ago
      Process: 25414 ExecStart=/bin/bash -c F=/tmp/oneshot-queue/$(ls /tmp/oneshot-queue/ | head -n1) && cat $F && rm $F (code=exited, status=0/SUCCESS)
     Main PID: 25414 (code=exited, status=0/SUCCESS)
    
    Oct 27 05:34:52 example.com bash[25414]: hi there

Systemd paths use inotify and can check for the presence of a specific path or pattern, or writes or changes to a file or directory too. E.g. imagine you have config files that depend on having the right nameserver details: you can use systemd paths to watch /etc/resolv.conf for changes and trigger a script to update your config accordingly.
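
A rough sketch of that resolv.conf idea follows. The file name resolv-watch.path and the triggered unit update-config.service are made up; you would still need to write that service yourself, for example as a oneshot that runs your update script. PathChanged= is one of the documented [Path] directives.

```ini
# Hypothetical file: /etc/systemd/system/resolv-watch.path
[Unit]
Description=Watch /etc/resolv.conf for changes

[Path]
# Trigger when the file is closed after having been written to.
PathChanged=/etc/resolv.conf
# Made-up unit that would regenerate your config files.
Unit=update-config.service

[Install]
WantedBy=multi-user.target
```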

That's it for this time. Please comment if you'd like examples of more advanced systemd usage.




