+++
class = "post"
date = "2022-12-02T00:00:00-05:00"
tags = ["support", "floss", "debian", "packaging"]
title = "Building a Debian Package 101"
description = "It's not as confusing or complicated as you think"
type = "post"
weight = 1
draft = false
+++

One of the most oft-repeated reasons I've heard for software not being packaged for Debian and its derivatives is that Debian packaging is complicated. Now, the thing is, it can be. If you look at the manual or a reasonably complicated program from the Debian repositories, it sure seems like it is. But I'm here today to show you that it can be easy with the right guide!

My target audience for this post is anyone who has software they want to build, but who currently thinks that making a .deb is too complex, difficult, or not worth the effort. Hopefully, by the end of this post, you'll understand exactly how to do it and be able to implement your own Debian package in under 30 minutes.

If that sounds good, read on!

For simplicity's sake, I assume you're doing all this on a Debian system, or one of its derivatives like Ubuntu. Note that things like cross-architecture building are well outside our scope here, but such things are possible. Your package will match what you build it under, so if you want an Ubuntu 22.04 package, be sure to build it on an Ubuntu 22.04 system, etc.

Prerequisites

Before starting, you'll need a few dependencies. First and foremost is anything you need to actually build your program; for a lot of things that's build-essential plus a few supplemental libraries, but it could include anything else.

Keep track of what build dependencies you need, because we'll need that list later on when creating the control file.

Next, install the dpkg-dev, debhelper, and devscripts packages, which provide the main Debian packaging tools and some helper programs. You might also want quilt if you plan to make package-specific patches to the code, but I don't cover quilt here.
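
As a rough sketch, the following checks whether the main tools are already on your PATH. It assumes that dpkg-buildpackage, dh, and dch are reasonable stand-ins for the dpkg-dev, debhelper, and devscripts packages respectively; check_tools is a made-up helper name, not part of any package.

```shell
# Report which packaging tools are available (check_tools is a hypothetical helper)
check_tools() {
    for tool in dpkg-buildpackage dh dch; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
        fi
    done
}

check_tools

# Anything missing can be installed with, for example:
#   sudo apt install build-essential dpkg-dev debhelper devscripts
```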

The Basics: Creating your initial debian/ folder

Start with your source code in a directory, Git repo, etc. To start you'll want all your code in the root level, so that you can build it right from there. This helps keep the complexity down.

Our first step is to build a basic, boilerplate debian/ folder, which is a sub-directory at the root of the source code repository that provides the Debian packaging instructions. So run mkdir debian and continue.

Within that debian folder are a few key files that every build needs. I'll go through each one in turn, explaining what it does and how to write one. At the end, you'll be able to run dpkg-buildpackage to get your binary package.

Boilerplate files (compat, source/format, and source/options)

These files define some basic configuration for the build system. Given how simple and boilerplate they are, I've collected all 3 under this heading.

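compat defines the Debian packaging compatibility version, i.e. what version of debhelper the package supports, as a single number. What version you support depends on how old the releases of Debian you want to support are. 8 or 9 work as baselines for very old releases, but current debhelper marks them as deprecated and recommends 13, so pick the highest level that your oldest target release provides.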

The next two entries are under the sub-directory source within the debian folder.

source/format defines the package layout format, and is normally just 1.0 with no other content in the file.

source/options defines some additional options that will be passed to dpkg-source when it builds your package. There's two main categories of entries here that I have used in my packages, though there are many more:

  • tar-ignore='<pattern>': One or more entries will define file patterns (shell-style globs, which are handed on to tar's --exclude option) to ignore when creating the source tar archive. It's usually a good practice to ignore things like .git*, *.deb, and any temporary files or directories your build might produce.

  • extend-diff-ignore='<pattern>': One or more entries will define file patterns (Perl regular expressions) to ignore when creating the diff of your source code. Generally you want to ignore any binary files in your source tree.

A good, safe default would be something like:

tar-ignore='*.deb'
tar-ignore='.git*'
extend-diff-ignore='.git*'
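
To make that concrete, here is a minimal sketch that creates all three boilerplate files from the shell. It assumes compat level 9 and the 1.0 source format described above, and it works in a throwaway directory from mktemp so nothing real gets touched; in practice you would run the equivalent from the root of your own repository.

```shell
# Work in a scratch directory standing in for your repository root
repo=$(mktemp -d)
cd "$repo"

mkdir -p debian/source

# debhelper compatibility level (an assumed value; pick what your targets support)
echo 9 > debian/compat

# Source package format
echo 1.0 > debian/source/format

# Options passed to dpkg-source; the quoted heredoc keeps the patterns literal
cat > debian/source/options <<'EOF'
tar-ignore='*.deb'
tar-ignore='.git*'
extend-diff-ignore='.git*'
EOF

# Show what was created
ls -R debian
```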

The copyright file

The copyright file defines the copyright information for your package. Usually, for simple programs, this will just match your project's license.

The file is structured as follows:

Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/

This line defines the copyright format. The 1.0 format specified here is usually sufficient. The URL also leads to the full manual for the contents of the copyright file, so for more advanced situations it is worth the read.

Upstream-Name: myprogram

This line specifies the upstream name of the program. It should match your program's name and the name of the source package.

Source: https://github.com/aperson/myproject

This line provides a link to your source. It can be any URL you want, but you should provide something here.

Next is a newline, followed by one or more blocks:

Files: *

This line defines what file(s) this copyright entry belongs to. For a simple project all under one license, this can just be *. The * block should always be the last block; that is, define any more specific blocks first. If not *, this should be the relative path to the file(s) under the source repository.

Copyright: 2022 A. Person <aperson@email.tld>

This line defines the copyright year and name of the copyright owner (including email address in angle brackets). This is probably you unless you're packaging up someone else's code. While this email doesn't have to be valid, it should be in case a user wants to reach you about a copyright question, and will be shown in the information about the package.

License: GPL-3

This line, and subsequent lines prefixed with a single space, define the actual license of the files. The license name should be one of those found under /usr/share/common-licenses (e.g. GPL-3, or Apache-2.0). The subsequent lines should include the short version of the license text, i.e. what you would put at the top of your source files (not the full license text). Within this block, paragraph breaks should be delineated with . characters. At the bottom it's usually best to reference the aforementioned directory as a source of license text as these contain the full version of each license.

A complete example

Here is a complete example of a copyright file for a GPL v3 program:

Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: myprogram
Source: https://github.com/aperson/myproject

Files: *
Copyright: 2022 A. Person <aperson@email.tld>
License: GPL-3
 This package is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
 the Free Software Foundation, version 3.
 .
 This package is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU General Public License for more details.
 .
 You should have received a copy of the GNU General Public License
 along with this program. If not, see <https://www.gnu.org/licenses/>
 .
 On Debian systems, the complete text of the GNU General
 Public License version 3 can be found in "/usr/share/common-licenses/GPL-3".

The control file

Now we're getting into the meat of the package. The control file defines your package, both the source component and the binary component(s). There are many available options here, but I'll provide only the most basic ones needed to build a functional package.

The "source" package section

These entries define the source package information. The entries are structured as follows:

Source: myprogram

This line defines the name of the source package, and will usually match the name of the program.

Section: misc

This line defines the section of the repository that your application goes into. What you put here is pretty arbitrary unless you want your package to be included in the official Debian repositories, so go with misc.

Priority: optional

This line defines the priority of the package. Like the above entry, this only really matters if you're making an official package, so go with optional.

Maintainer: A. Person <aperson@email.tld>

This line defines who maintains the package (and thus, who an end user should reach out to if help is needed). This uses the same format as the person entry from copyright above, and this format will be used again later as well.

Build-Depends:  debhelper (>= 8),
                libssl-dev,
                somebuilddep

These lines define any build dependencies your package requires, i.e. what you installed in the very first section. You can safely exclude dpkg-dev (as this is implied), as well as build-essential (for the same reason), but include here any specific development libraries, additional programs, etc. that you might need to build the program. Note too the first line, which should usually be debhelper at >= the version you specified in compat above. This entry also demonstrates how to define specific version(s) of dependencies; >=/<= (greater than/less than or equal) are the most common, to specify minimum dependency versions, though other comparisons are possible in more advanced cases.

Entries in this list can be placed on one line, comma separated, or on separate lines as shown here. The final entry should not have a comma after it.

Standards-Version: 3.9.4

The version of the package standards that the package uses. I usually use 3.9.4 as a baseline for my own packages; the latest version as of writing is 4.6.1.

Homepage: https://myproject.org

This line defines a URL to the homepage of your project.

The "binary" package section(s)

These entries define the output binary package information. There should be one block for each binary package you produce from the single source package, though for a simple project there is a 1-to-1 relationship here. The entries are structured as follows:

Package: myprogram

This line defines the name of the package, usually the name of your program.

Architecture: any

This line defines the architecture that the package will support. For simple packaging, this should be any (the program can be built against any architecture that Debian supports) or all (for architecture-independent packages like Python code or documentation).

Depends: mypackagedependency (>= 1.0),
         someotherdependency,
         afinaldependency
Recommends: asoftdependency

These lines define any package relationships that the final package will have, formatted like the Build-Depends entry in the "source" section above. These are optional: if your program doesn't depend on any other (binary) packages at runtime, just leave them out, but usually you'll depend on something.

The Depends entries are strict: the package will refuse to install if any of these are missing (when using dpkg --install), and will pull them in automatically when using the package manager (e.g. apt install). Use this for any hard dependencies the program has.

The Recommends entries are malleable: the package will still install if these are missing, but this relationship exists to define anything that might be "nice to have" alongside your program. By default, apt will install recommended packages along with yours (users can opt out with --no-install-recommends), while dpkg --install ignores them entirely.

Description: The one-line description of your program for 'apt search'
 Some additional lines that will describe the program in more depth.
 .
 You may have multiple paragraphs here with . delimiters.

These lines provide a description of your package so users know what they're installing. The first line (along with the Description: label) is a short version that will be shown as output when running apt search and the like. Any additional lines provide more detail for use with apt info.

A complete example

Here is a complete example of a basic control file for a simple program:

Source: myprogram
Section: misc
Priority: optional
Maintainer: A. Person <aperson@email.tld>
Build-Depends:  debhelper (>= 8),
                libssl-dev,
                somebuilddep
Standards-Version: 3.9.4
Homepage: https://myproject.org

Package: myprogram
Architecture: any
Depends: mypackagedependency (>= 1.0),
         someotherdependency,
         afinaldependency
Recommends: asoftdependency
Description: The one-line description of your program for 'apt search'
 Some additional lines that will describe the program in more depth.
 .
 You may have multiple paragraphs here with . delimiters.

The changelog file

The changelog file defines the current, and any past, versions of your package, along with a (generally brief) changelog, as the name implies.

This file is important when releasing new versions: whatever entry is at the top of this file is the "current version" of your program, and will determine the version of the output package. Thus you will have to add a new entry to the top of this file each time you release a new version of your package.

It is required to have at least one entry here (to define the current version of the package), but also good practice to keep older versions in descending order for as long as feasible, so people can compare what changed between various versions of your program.

The entries are structured as follows:

myprogram (1.0-1) unstable; urgency=medium

The first line defines the values for the changelog entry, and is in a very specific format.

First is the name of the program, which must match the Source: entry in the control file.

Next is the version of the package enclosed in parentheses. This should be the real version of the program that you are building. The first part (before the -) defines the "upstream" version, so in this case, we're building version 1.0 of the program, corresponding to a hypothetical Git tag of v1.0. The second part (after the -) is the version of the package. This can be used to define multiple versions of the package that use the same underlying upstream version; unless you're doing complicated stuff involving delegating packaging, just set this to -1 or -0, or leave it out altogether.

Next is the code-name of the release of the package; just set this to unstable. Note the semicolon after this.

Finally is the "urgency" of the package. In the official Debian archive this influences how quickly a new version migrates from unstable to testing; for your own packages it is pretty arbitrary. I usually use urgency=medium as a safe default.
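
To see how the pieces of that first line fit together, here is a small sketch that pulls them apart again using nothing but POSIX shell parameter expansion. It is only an illustration of the layout, not how dpkg itself parses the file, and the variable names are my own.

```shell
line='myprogram (1.0-1) unstable; urgency=medium'

name=${line%% *}              # everything before the first space
version=${line#*\(}           # drop up to and including "("
version=${version%%\)*}       # drop ")" and everything after it
rest=${line#*\) }             # what follows ") "
distribution=${rest%%;*}      # up to the semicolon
urgency=${rest#*urgency=}     # after "urgency="

# The package revision is whatever follows the last "-", if there is one
case "$version" in
    *-*) upstream=${version%-*}; revision=${version##*-} ;;
    *)   upstream=$version;      revision= ;;
esac

echo "name=$name upstream=$upstream revision=$revision"
echo "distribution=$distribution urgency=$urgency"
```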

  * Here is a changelog entry
  * Here is another changelog entry

The next section, separated from the first line by an extra newline, contains individual changelog entries. You must provide at least one explaining what's changed, but you can specify several as shown here. Each entry must be prefixed by two spaces then an asterisk (*) character before starting the entry. Standard formatting is to capitalize the first letter, keep it short and sweet, and end without a full stop (.); if you're using Git and are writing good Git commit messages, you can just use your Git commit titles here! What you put in each line is up to you, and you can include any metadata or information you might want. Finally note the trailing newline before the final line.

 -- A. Person <aperson@email.tld>  Fri, 02 Dec 2022 14:28:01 -0500

The final line of the changelog entry specifies who wrote the entry, again in a very specific format. The line begins with a single space followed by two dashes (--) then another space, followed by author in the standard name + email format (I did say it would come up again!), then two spaces, and finally an RFC Email date (i.e. the output of date --rfc-email) defining when the entry was written.
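
Since the spacing in that line is easy to get wrong by hand, one option is to let the shell build it. This is a minimal sketch, assuming the example maintainer identity used throughout this post; date -R is the short form of date --rfc-email.

```shell
maintainer='A. Person <aperson@email.tld>'

# One leading space, two dashes, a space, the maintainer,
# two spaces, then the RFC email-style date
trailer=$(printf ' -- %s  %s' "$maintainer" "$(date -R)")

echo "$trailer"
```

The output can be pasted in as the last line of a new changelog entry.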

A complete example

Here is a complete example of a single changelog file entry for version 1.0 of our simple program:

myprogram (1.0-1) unstable; urgency=medium

  * Here is a changelog entry
  * Here is another changelog entry

 -- A. Person <aperson@email.tld>  Fri, 02 Dec 2022 14:28:01 -0500

If we were to add version 1.1 of the program in the future, we would add it to the top, and the file would thus look like this (note the extra line between entries):

myprogram (1.1-1) unstable; urgency=medium

  * This is a newer version after fixing a bug (GitHub #123)

 -- A. Person <aperson@email.tld>  Sat, 03 Dec 2022 18:28:01 -0500

myprogram (1.0-1) unstable; urgency=medium

  * Here is a changelog entry
  * Here is another changelog entry

 -- A. Person <aperson@email.tld>  Fri, 02 Dec 2022 14:28:01 -0500
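
Because the name at the top of the changelog has to match the Source: entry in control, a quick check before building can save a confusing failure. The sketch below writes two cut-down sample files into a scratch directory (their contents are abbreviated stand-ins, not complete files) and compares the two names with standard tools.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/debian"

# Abbreviated sample files, just enough for the comparison
cat > "$workdir/debian/control" <<'EOF'
Source: myprogram
Section: misc

Package: myprogram
Architecture: any
EOF

cat > "$workdir/debian/changelog" <<'EOF'
myprogram (1.0-1) unstable; urgency=medium

  * Here is a changelog entry

 -- A. Person <aperson@email.tld>  Fri, 02 Dec 2022 14:28:01 -0500
EOF

# Name from the Source: field, and name from the first changelog line
control_source=$(sed -n 's/^Source: *//p' "$workdir/debian/control" | head -n 1)
changelog_source=$(head -n 1 "$workdir/debian/changelog" | cut -d' ' -f1)

if [ "$control_source" = "$changelog_source" ]; then
    result="match: $control_source"
else
    result="MISMATCH: control=$control_source changelog=$changelog_source"
fi
echo "$result"
```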

The dch helper program

The devscripts package provides a helper program named dch to assist in automating changelog entries. In my experience, you have to change so much of the generated content (or set so many environment variables) that it isn't worthwhile, but it is something to consider if you do a lot of packaging.

The rules file

The rules file is a make script that defines how to build your package. This is the part that usually trips a lot of people up, because this file can get very complicated. However, for most simple programs using standard build tools, dh - the Debian build helper - automates a lot of the grunt work for you, and this file can thus be very simple.

The file is structured as follows; note that this is make format, so indentations must be a tab (\t) character, not spaces, and the file must be executable to work:

#!/usr/bin/make -f

The first line is a shebang line defining that this is a make script with the -f option.

export DH_VERBOSE = 1

This line sets verbosity when building the package, useful for troubleshooting.

MY_FILE := binary.out

This line defines a variable that can be used later in the script. I show this example here only to specify the format (note the :=); a simple program likely won't need any variables.

%:
	dh $@

This section defines the basic rules for the build. The %: heading is "any stage"; there are about two dozen stages in a normal package build that can be defined, and % is the "wildcard" for all of them.

Next, the tab-indented line(s) specifies what commands happen during this stage. Note that each line here is executed in its own shell context, so if you were to e.g. cd, that would get lost on the next line. In this basic example though, all we do is pass all of the arguments for the stage on to the dh program.

And that's it! Really! If your program uses ./configure && make && make install style installation, or cmake, or is a properly-formatted Python module, or really any "standard" build type, this is all you need to do. dh takes care of it all, automatically determining how to build the program, putting it in the right places, and giving you a package out the other side.

Overriding build stages

Now, of course, you can do some more advanced things in this file as well. Any stage can be overridden by using an override_dh_<stage> section, which will replace the command that dh would normally run for that stage with whatever you specify. For example, let's say that make clean doesn't actually clean up all of our artifacts, so we want to define some custom cleanup that will happen as well. We can override the default dh_auto_clean step with the following to achieve this:

override_dh_auto_clean:
	rm -f artifacts/out/$(MY_FILE)
	dh_auto_clean

Note here that we also use the variable we defined above as an example; variable references in make are conventionally surrounded by parentheses (i.e. $(MY_FILE)) rather than written bare like $MY_FILE in Bash. Note too that the override calls dh_auto_clean directly rather than dh $@: inside an override section, $@ expands to the name of the override itself, which is not something dh knows how to run.

Another common example is overriding dh_auto_configure to run a ./configure script with special options. For example:

override_dh_auto_configure:
	./configure --my-option-1 --my-option-2 \
		--newlined-option

Note that this example doesn't call dh_auto_configure, so the default configure step will not be executed for it. You can use this for completely manual control of a build stage if appropriate.

You have a lot of flexibility here, which is why rules files seem so complex. But don't be scared: start simple, see if it works, and only override if you find you really need it.

Handling the pesky shell context

As mentioned above, each line runs in its own shell context. This is mostly relevant if you're moving around directories. So for example, this is not valid:

override_dh_auto_clean:
	cd artifacts/out/
	rm -f $(MY_FILE)
	cd ../..
	dh_auto_clean

Because that first cd artifacts/out/ runs in its own shell context, the next line (rm -f $(MY_FILE)), in another context, is actually relative to the base directory, not the artifacts/out/ directory! You can work around this by putting both commands on one line with a command separator (e.g. && or ;) like so:

override_dh_auto_clean:
	cd artifacts/out/ && rm -f $(MY_FILE)
	dh_auto_clean

And since context is discarded, you don't even need to worry about the cd ../.. part; you will always be back at the root of the repository on the next line.
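
You can see the same effect outside of make. The sketch below imitates it by running each "recipe line" through its own sh -c, which is roughly what make does with the lines of a rule; the variable names are only for illustration.

```shell
start=$(pwd -P)

# Two separate "recipe lines": the cd happens in a shell that exits right away
sh -c 'cd /'
separate=$(sh -c 'pwd -P')

# One "recipe line": cd and the command after it share a single shell
joined=$(sh -c 'cd / && pwd -P')

echo "started in:            $start"
echo "separate lines ran in: $separate"
echo "joined line ran in:    $joined"
```

The second shell is still in the starting directory, while the joined version really does run in /.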

In-built variables

One final note is a special variable that can be used, $(CURDIR). This variable is a full path to the current directory (usually the root of the repository) and can be used for commands that need a full path, for example:

override_dh_auto_clean:
	cd $(CURDIR)/artifacts/out/ && rm -f $(MY_FILE)
	dh_auto_clean

There are several other in-built variables that you can use as well, but for simplicity, I won't cover them here.

A complete example

Here is a complete example of the basic rules file, with some comments:

#!/usr/bin/make -f

# Be verbose during the build
export DH_VERBOSE = 1

# This variable contains a pesky file that 'make clean' won't remove
MY_FILE := binary.out

# Main debhelper entry
%:
	dh $@

# Override dh_auto_clean to clean up MY_FILE
override_dh_auto_clean:
	cd $(CURDIR)/artifacts/out/ && rm -f $(MY_FILE)
	dh_auto_clean
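
Editors that silently convert tabs to spaces are a common source of grief with this file. As a hedged sketch, here is one way to generate the minimal rules file from the shell so the tab is guaranteed, and then to verify it; it works in a scratch directory and writes only the bare %: rule, not the overrides.

```shell
repo=$(mktemp -d)
cd "$repo"
mkdir -p debian

# printf turns \t into a real tab and %% into a literal %;
# the single quotes keep $@ away from the shell
printf '#!/usr/bin/make -f\n\n%%:\n\tdh $@\n' > debian/rules

# The rules file must be executable
chmod +x debian/rules

# Confirm that the recipe line really starts with a tab
tab=$(printf '\t')
if grep -q "^${tab}dh " debian/rules; then
    echo "recipe line is tab-indented"
else
    echo "recipe line is NOT tab-indented"
fi
```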

Installing files manually with install

Sometimes, and in fact quite often, you will have some static files that will need to be manually installed into the package, i.e. that your build process doesn't take care of automatically. For example, you might have a systemd service unit file called myprogram.service that needs to be installed.

These custom files can be defined in the install file, which tells the package build to add the files to the resulting package after the build is completed.

Each line in the file is structured as a source and then a destination directory, much like a cp command.

The source is always relative to the root of the repository, while the destination is always relative to / on the target system. So using our myprogram.service example, we might put that file in debian/conf/ and then have an entry in install like so:

debian/conf/myprogram.service lib/systemd/system/

This will ensure that myprogram.service is installed to /lib/systemd/system/myprogram.service. Note that the destination is always treated as a directory, with or without the trailing / (though adding it makes that clear); the file keeps its original name, and renaming it on the way in needs extra tooling such as dh-exec.
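
As a mental model, the final path is the destination directory plus the source's file name. The helper below is a simplified sketch of that rule, based on my understanding that dh_install treats the last field as a directory; install_target is a hypothetical name, and it ignores wildcards and the one-argument form.

```shell
# install_target SOURCE DEST: print where SOURCE would end up (hypothetical helper)
install_target() {
    dest=${2#/}      # tolerate a leading slash
    dest=${dest%/}   # and a trailing one
    printf '/%s/%s\n' "$dest" "$(basename "$1")"
}

install_target debian/conf/myprogram.service lib/systemd/system/
# prints /lib/systemd/system/myprogram.service
```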

install shenanigans: a build-less package

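This file also allows shenanigans if you want to create a "source" package that doesn't actually do any "building", just moves files around. The rules file still has to hand control to dh, since dh is what reads install and assembles the package, but if there is no build system in the repository for it to detect, the automatic build steps simply do nothing. If there is one that you want to skip, you can switch the build step off with an empty override:

#!/usr/bin/make -f

%:
	dh $@

# Nothing to compile, so skip the automatic build step
override_dh_auto_build:

And then use install to just copy a bunch of files into place:

src/myprogram.py usr/bin/

This installs the script as /usr/bin/myprogram.py. This can be useful for things like pure documentation or a collection of scripts that are entirely static.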

An install per package

While not explicitly covered here, control lets you make multiple binary packages out of one source package. It can thus be useful to have separate install lists for each binary package. To do this, you simply start the filename with the name of the binary package (i.e. what is defined in Package: in the control file) followed by .install. For example, you could have myprogram.install and myprogram-docs.install which install different sets of files.

The conffiles file

Sometimes, you might have configuration files shipped with your program that you want users to be able to edit themselves, and that won't be (automatically) overwritten by a new version of your package. You can handle this with the conffiles file.

By default debhelper will mark any file your package installs under /etc as a conffile, so you don't need to explicitly define these. Thus, if your program follows the Filesystem Hierarchy Standard, you don't need this file.

However, if you have configuration files elsewhere on the system, you should define them in this file, one absolute path per line.

The conffiles of a program are treated specially during a package removal. apt remove will not remove them by default, in order to preserve the configuration of a package; you must use apt purge to remove any defined conffiles, so keep this in mind if you want to define them.

Controlling installation and removal with maintainer scripts

When your package is installed on a user system, it can often be useful to do "things" to the system. A canonical example would be creating a service user and enabling our example myprogram.service unit on install, then deleting the service unit and user when the package is removed.

There are 4 types of maintainer scripts that can be specified. Each script is a /bin/sh script (starting with a #!/bin/sh shebang) which can then do arbitrary things to the system. They do not need to be executable in the source repository, but will be once installed by the package.

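Each script should enable set -e (errexit) right after the shebang, as Debian Policy asks; it is not turned on for you. With it, any failure of any step will be a fatal error, and will terminate the configuration (and, for pre scripts, the remaining installation) of the package, so be careful to explicitly "catch" errors with || as needed. Note as well that these scripts also run during upgrades, with the action passed as the first argument ($1), so check it before doing anything destructive. Finally, the scripts run as root, so be very careful here!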

  • preinst runs during package installation, before the actual files of the program are installed. You can use it to check the sanity of the system or other similar tasks, though this file is likely the least-used.

  • postinst runs during package installation, after the actual files of the program are installed. This is the most common maintainer script, often used to configure services, add users, chown directories, etc.

  • prerm runs during package removal, before the actual files of the program are removed. This is the second most common maintainer script, often used to de-configure services, remove users, remove created directories, etc.

  • postrm runs during package removal, after the actual files of the program are removed. Some tasks in prerm could likely also go in postrm, but where you put tasks depends on the specifics of your program and what the script is doing, e.g. stop services in prerm but remove directories in postrm.

In very simple programs, you might not need any of these scripts, or might only need one or two of them. For our example we'll only need postinst and prerm to handle our service and user.

Thus we would have a postinst as follows:

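#!/bin/sh
set -e

# Create the user and set their home to /var/lib/myprogram, shell /usr/sbin/nologin to prevent login.
# postinst also runs on upgrades, so skip this if the user already exists.
if ! getent passwd myprogram >/dev/null; then
	useradd \
		--no-user-group \
		--create-home \
		--home-dir /var/lib/myprogram \
		--shell /usr/sbin/nologin \
		--gid daemon \
		--system \
		myprogram
fi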

# Enable and start the service
systemctl enable --now myprogram.service

# Explicitly exit 0
exit 0

And a prerm as follows:

#!/bin/sh
set -e

# prerm also runs on upgrades; only tear things down on a real removal
if [ "$1" = "remove" ]; then
	# Disable and stop the service
	systemctl disable --now myprogram.service

	# Remove the user
	userdel myprogram

	# Clean up the data directory (don't worry about program files, 'dpkg' handles that!)
	rm -rf /var/lib/myprogram
fi

# Explicitly exit 0
exit 0
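
One thing worth knowing is that dpkg passes the reason a script is being run as its first argument, and prerm is also invoked on upgrades, not just removals. The following is a sketch of how a script might branch on that; prerm_action is a hypothetical helper that only prints a decision, so it is safe to experiment with.

```shell
# prerm_action ACTION: decide what a prerm should do (hypothetical helper)
prerm_action() {
    case "$1" in
        remove)
            # A real removal: safe to stop services, remove users, etc.
            echo "teardown" ;;
        upgrade|failed-upgrade|deconfigure)
            # The package is staying on the system: leave things alone
            echo "keep" ;;
        *)
            echo "prerm called with unknown argument: $1" >&2
            return 1 ;;
    esac
}

prerm_action remove     # prints: teardown
prerm_action upgrade    # prints: keep
```

In a real prerm the same case statement would sit directly in the script and switch on "$1".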

Maintainer scripts per package

Like the install file above, these maintainer scripts can be defined per-binary-package, using the same <package name>.<script> format, if your package requires it.

Don't do sketchy things in maintainer scripts!

Finally, I want to stress that you should not do sketchy things in maintainer scripts. In early 2021, the Raspberry Pi Foundation abused their maintainer scripts in a critical package to install a completely unrelated repository for Microsoft VS Code without any obvious traces in the usual Debian places (i.e. anywhere visible with dpkg -L/apt-file search/etc.).

DO NOT do this, EVER. Maintainer scripts are NOT for adding files to the system; that's what install and the build process are for, which allow the files installed by packages to be tracked by the dpkg system. You could perhaps make a case for modifying files in maintainer scripts, but adding new files or trying to do anything "trixy" is verboten, and certainly do not do what the RPF did. Abuse of maintainer scripts like this not only destroys user trust, but it actively hides changes to the system from the package manager, and prevents these entries from being managed and modified in the future by new package versions. It's a horrible practice all around. Use maintainer scripts only to do the bare minimum tasks needed to ensure your package will work and to clean up after it, nothing more.

Building your package

Now that you've prepared your debian folder and package configuration, it's time to actually build your new package! In the root of your source repository, run the following command:

dpkg-buildpackage

This will build the package for you. You should get 5 files out of the build, one level higher than your current directory (i.e. at ../):

  • myprogram_1.0-1_amd64.deb: The actual binary package. The version and architecture are auto-populated based on the build.

  • myprogram_1.0-1_amd64.buildinfo: A file containing information on the build, including checksums, dependencies, environment, etc.

  • myprogram_1.0-1_amd64.changes: A file containing information about the package including changelog, checksums, and the description.

  • myprogram_1.0-1.dsc: The Debian source package information.

  • myprogram_1.0-1.tar.gz: An archive of the source for use with the .dsc file.
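
The file names follow a fixed pattern built from the package name in control, the version at the top of changelog, and the build architecture. As a small sketch under the assumptions of this post (a single binary package and the 1.0 source format), the hypothetical helper below lists what to expect.

```shell
# expected_outputs NAME VERSION ARCH: list the files a build should leave in ../
# (hypothetical helper)
expected_outputs() {
    printf '%s\n' \
        "${1}_${2}_${3}.deb" \
        "${1}_${2}_${3}.buildinfo" \
        "${1}_${2}_${3}.changes" \
        "${1}_${2}.dsc" \
        "${1}_${2}.tar.gz"
}

# On a real system the architecture comes from: dpkg --print-architecture
expected_outputs myprogram 1.0-1 amd64
```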

You can then install your .deb or add it to a repository manager like reprepro.

If something went wrong, that's OK! It's common to have errors the first time you try to build a package: errors in rules, parts that don't build right, typos, etc. Luckily the dpkg-buildpackage command is very verbose and shows, in real-time, all the build steps that are occurring. Pay close attention to what failed and tweak your scripts or configuration to match, and try again. Once you're at this stage, and assuming that your dh_auto_clean is actually cleaning everything up properly, it's safe to re-run the build as many times as needed to get it working - and if it isn't, the command will complain and tell you about it, so you're getting plenty of feedback.

Happy building!