
Signing ELF binaries, or not?! Lessons learned.

Jens Reimann
Maintainer

Trying to figure out what went into a binary can be a tricky thing. And once you've figured it out, how do you transport this information? True, it all starts simple: Java, NodeJS, Go, or Rust: all languages¹ bring their own dependency management, which defines what the final command line tool you create is made of. Or does it?

But let's take a step back: a typical use case today is downloading a command line application from the internet. Take helm, for example. You navigate to its GitHub releases page, download the binary, unzip it into a local folder, and run it. But what exactly is inside that binary?

SBOMs

Software Bills of Materials (SBOMs) already exist. Yet, someone needs to create them, and creating an accurate SBOM can be tricky.

There are tools to create SBOMs from the source code of a project, but in most cases that doesn't tell the whole story. Those tools simply analyze the dependency information declared in e.g. a package-lock.json file, or a Maven pom.xml. Should be enough, right? Well, no. Maven projects, for example, have "profiles", and you need to generate the SBOM for exactly the profile that gets enabled when creating the final artifact. Also, with Maven mirrors and proxies, it's not always 100% clear what gets into a "binary" and where it comes from.
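
As an example, with the CycloneDX Maven plugin the generated SBOM depends on which profiles are active, so you have to invoke it accordingly (the "release" profile name below is just a placeholder):

```bash
# SBOM for the default build
mvn org.cyclonedx:cyclonedx-maven-plugin:makeBom

# SBOM for the build that actually produces the release artifact,
# with the (hypothetical) "release" profile enabled
mvn -Prelease org.cyclonedx:cyclonedx-maven-plugin:makeBom
```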

What really goes in

Then again, JAR files aren't real binaries, are they? They are ZIP files containing compiled .class files, so it actually is pretty simple to understand what is in there, assuming you trust your build process (which is a topic of its own). Even if you create a "fat" or "shaded" JAR, it is possible to understand what really ended up in your "binary".

And others can do the same. Rust, for example, lets you use cargo auditable to tap into the compilation process and record the actual dependencies that go into a binary. On Linux, the binary will be an ELF file, which then contains the dependency information from the compilation process. And Go can do the same.
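
A quick sketch of what that looks like in practice (assuming cargo-auditable is installed, and cargo-audit was built with binary scanning support; the binary name is a placeholder):

```bash
# Rust: record the dependency tree inside the ELF binary at build time
cargo auditable build --release

# ...and read it back from the binary, checking for known vulnerabilities
cargo audit bin target/release/my-tool

# Go: module information is embedded by default, and can be listed with
go version -m ./my-tool
```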

Null and void

But if someone can write dependency information into a binary, then someone else can also overwrite it. So unless you protect the binary against modification, this information isn't really trustworthy. Taking a look at the Helm release page, you will find SHA-based checksums. Isn't that enough?

No, not really. If someone can alter the binary on the download page, the attacker can also swap out the SHA checksum file. A checksum, or digest, only proves that the file arrived intact. On its own, it doesn't protect much.

If, however, you encrypt the digest with a private key that only the author knows, and publish the public key, then this becomes a "signature". That is good enough to prove that only the person who knew the private key could have signed the binary. And if we can trust this person to create the correct SBOM and binary, and to keep the private key secure, we are good.
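
The difference is easy to demonstrate with openssl (key and file names are placeholders):

```bash
# A checksum: anyone who can replace the binary can recompute it
sha256sum helm > helm.sha256

# A signature: only the holder of private.pem can produce it...
openssl dgst -sha256 -sign private.pem -out helm.sig helm

# ...but anyone with the public key can verify it
openssl dgst -sha256 -verify public.pem -signature helm.sig helm
```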

Usability

Or, not! In the Maven world, JARs have been signed for quite a while: everyone uploading JARs to Maven Central needs to sign them with GPG. And yet, I guess most people have never validated a JAR signature during a build.

On the other hand, the Eclipse IDE (as well as some other Java applications) has been validating JAR files for quite a while. Whenever you install a plugin, it cryptographically validates the JAR. And it's easy, because the JAR file itself is signed and the signature is part of the JAR file. As part of the Eclipse Foundation build system, JARs created by the build can automatically get signed. No additional files are needed, and only the actual build output is considered; no guessing of dependencies.
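
Both models are easy to try out (file names are placeholders): Maven Central uses a detached GPG signature next to the JAR, while the Eclipse model embeds the signature in the JAR itself, where jarsigner can check it:

```bash
# Maven Central style: a detached .asc signature file next to the JAR
gpg --armor --detach-sign my-library.jar

# Eclipse style: the signature travels inside the JAR itself
jarsigner -verify my-plugin.jar
```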

From a user perspective, the IDE automatically checks signatures and only bothers the user if an issue is found. The user can override the decision, because the idea is to give the final authority to the user.

Back to binaries

Now, just assume we could do the same with (ELF) binaries. In fact, this was possible a while back: Solaris had tools to sign ELF binaries. That doesn't help on modern Linux systems, though. But with a bit of Rust code, it was possible to create elfsign. The idea is simple:

  • Create a digest of all relevant ELF sections and headers
  • Sign this digest
  • Add the signature to the ELF binary as a new section
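
Assuming the signature lands in a dedicated section (the name .signature below is made up), a signed binary could then be inspected with standard binutils:

```bash
# List all sections; a signed binary should show an extra signature section
readelf -S ./my-binary

# Dump the raw contents of the (hypothetical) signature section
readelf -x .signature ./my-binary
```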

Solvable downsides

The first downside might be that people are afraid of signed binaries. Windows and macOS have been doing this for a while, and it happens that signatures get in the way of the user running a binary. Well, actually, it is the operating system that gets in the user's way, with protecting the user as the argument. That may be true, but it is also true that some users do know better than the operating system and want to have the final word on what they can run.

This problem can easily be addressed. Checking a signature and deciding whether a binary may run are two different things. Even if a system ships a default rule/policy set that rejects invalid signatures, it could still let the user customize the behavior and override it, just like the Eclipse IDE does.

Another issue is the handling of keys and certificates. Prices for code signing certificates can be quite high; especially for open source projects, this can become a truly limiting factor. Handling a private key properly also takes a bit of care.

Luckily, we now have sigstore. Sigstore can help us with two things: short-lived certificates for ephemeral signing keys (Fulcio), and a tamper-resistant transparency log (Rekor). We already talked a bit about both in the context of [gitsign]({% post_url 2022-12-02-sign-commits-with-sigstore %}).

Adding Fulcio and Rekor to elfsign, we gain a bunch of cool things:

  • Short-lived, disposable private keys: You don't need to store them, they are only valid for a few minutes.
  • X.509 certificates: Alongside the key, we get an X.509 certificate, bound to our identity, which we can use for signing.
  • An attestation that we provided the certificate and signature to Rekor, at a time when the certificate was still valid

And with that, we can easily implement elfsign sign to sign a binary, and elfsign verify to validate one. We could also create something like elfsign execute to verify and execute, but that's just a variant of verify.
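
A sketch of that workflow; the subcommands are the ones named above, while the rest of the invocation is illustrative only:

```bash
# Sign: obtain a short-lived key and certificate via Fulcio (OIDC login),
# sign the normalized digest, record it in Rekor, embed the signature
elfsign sign ./helm

# Verify: recompute the normalized digest and validate the embedded
# signature, certificate, and Rekor record
elfsign verify ./helm
```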

As we can prove, using Rekor, that we owned the private key during the time the certificate was valid, and that we provided the signature and digest at that same time, we now only need to decide if we want to trust the issuer and the subject the certificate was issued for. And by storing [the Rekor bundle]({% post_url 2023-01-13-sigstore-bundle-format %}), we can do this offline too.

Too good to be true?

So where's the catch?

Signing ELF binaries adds a bit of extra complexity. Creating a digest of an ELF binary isn't as trivial as just running sha256sum on a file. Storing an additional "signature" section in the ELF binary will actually alter the file. So it is necessary to have a normalized view of the ELF binary, one that creates a reproducible digest: it must include all important information, exclude the signature information itself, and still describe a valid ELF binary.
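
Standard binutils demonstrate the problem nicely (section and file names are just examples):

```bash
# Plain file digest before embedding anything
sha256sum ./my-binary

# Embed a (dummy) signature as a new section
objcopy --add-section .sig=signature.bin ./my-binary ./my-binary.signed

# The plain digest no longer matches; this is why signing needs a
# normalized digest that excludes the signature section itself
sha256sum ./my-binary.signed
```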

It works, but it is a bit complex. And more complexity might lead to more bugs, which isn't a good thing when it comes to cryptography. But if this allows us to drop the handling of additional files (like SBOMs or checksum files) and increases usability, it may actually be worth it.

The problem is that the tooling which creates the dependency information for ELF binaries isn't accurate enough.

In many cases it works, but as soon as you include a C library, add some JavaScript for an embedded frontend, or deviate from "standard artifact repositories", many of those tools just fall short. And I am not even talking about all those little hacks in build systems, or the mess called the "vendor" folder in Go.

SBOM formats like CycloneDX allow you to compensate and fix up generated SBOMs. But that's another step in the build process, and the output doesn't go into the ELF binary, as Go only cares about Go dependencies, and Cargo only about Cargo ones.

So adding all the complexity isn't good enough in the end. You still need to handle an extra file, and validate it.

Happy end?

The truth is that cosign, which is intended to sign containers, can actually sign any blob, in just the same way: using Fulcio to get a short-lived key and certificate, and storing the signature in Rekor. So if we can make our peace with handling an extra file, we can just use cosign sign-blob and cosign verify-blob to sign anything we want. Using cosign attest-blob, we can even "attach" an SBOM to the Rekor log entry.
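
A minimal sketch of that workflow; the subcommands are real, while file names and the identity/issuer values are placeholders:

```bash
# Sign any file, capturing the Rekor bundle for (offline) verification
cosign sign-blob --bundle helm.bundle helm

# Verify the blob against the bundle, pinning the expected identity
cosign verify-blob \
  --bundle helm.bundle \
  --certificate-identity releases@example.com \
  --certificate-oidc-issuer https://accounts.google.com \
  helm

# "Attach" an SBOM by recording a CycloneDX attestation in Rekor
cosign attest-blob --predicate sbom.cdx.json --type cyclonedx helm
```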

Yes, we need to handle the extra "bundle" and "signature" files. Or we can accept the fact that we need to rely on the uptime of the Rekor instance (or our ISP). But it definitely improves the situation over the status quo.

So what?

elfsign was a nice experiment. And while it didn't work out, I still learned a lot. I also still believe that the idea works in general; it just needs more work for a more specialized solution. The "cosign blob" approach is a more generic one. However, through that, it is also a less user-friendly one.

But this situation could also be improved. Just assume someone created a more convenient wrapper around cosign which "downloads and verifies", or "verifies and executes". That would definitely lead to a similar user experience, and help with adoption.
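
Such a wrapper could start out as a small script; a sketch, where the download URL and identity values are made up:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical "download, verify, and execute" wrapper around cosign
URL="https://example.com/releases/helm"
curl -sSLO "$URL"
curl -sSLO "$URL.bundle"

# Abort (set -e) unless the signature and identity check out
cosign verify-blob \
  --bundle helm.bundle \
  --certificate-identity releases@example.com \
  --certificate-oidc-issuer https://accounts.google.com \
  helm

chmod +x ./helm
exec ./helm "$@"
```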

And with a policy engine like Seedwing, you could actually define checks like: only run this binary if it is signed and does not contain a dependency with an active CVE.

If you are interested in things like this, maybe this blog post gave you a few insights and ideas.

Footnotes

  1. Yes, C/C++ is missing here. Let's not talk about build systems and dependency management for C/C++ 😉