Monday, November 23, 2015

An overview of GnuTLS 3.4.x

This April GnuTLS 3.4.0 was released as our stable-next branch, i.e., the branch to replace the current stable branch of GnuTLS (which as of today is the 3.3.x branch). During that time, we also moved our development infrastructure from gitorious to gitlab, a move that has improved the project management significantly. We now have automated builds (on servers running on an Openstack system supplied by Red Hat), continuous integration results integrated into the development pages, as well as simple to use milestone and issue trackers. For a project which as it stands is mostly managed by me, that move was a godsend, as we gain several new development and management features with considerably less effort. That move unfortunately was not without opposition, but in the end a decision had to be taken, and it was driven by pragmatism.

Assessing the last 12 months, we had quite a productive year. Several people offered fixes and enhancements, and I still remain the main contributor, though I would really like for that to change.

Anyway, let me get to the overview of the GnuTLS 3.4.x feature set. For its development we started with a basic plan describing the desired features (it can still be seen at this wiki page), and we also allowed features which did not threaten backwards compatibility to be included in the first releases. These were tracked using the milestone tracker.

As we are very close to tagging the current release as stable, I'll attempt to summarize the new features and enhancements in the following paragraphs, and describe our vision for the future of the library.
  •  New features:
    • Added a new and simple API for authenticated encryption. The new API description can be seen in the manual. The idea behind that change was to make programming errors in encryption/decryption routine usage as difficult as possible. That was done by combining encryption and tagging, as well as decryption and verification of the tag (MAC). That of course takes advantage of AEAD cipher constructions, and can currently be used with the following AEAD ciphers: AES-GCM, AES-CCM, CAMELLIA-GCM, and CHACHA20-POLY1305.
    • Simplified the verification API. One of the most common complaints about TLS library APIs is the complexity involved in performing a simple connection. In our defense, part of that complexity is due to specifications like PKIX which forward the security decisions to application code, but part is also because the APIs provided are simply too complicated. In this particular case we needed to get rid of the need for elaborate callbacks to perform certificate verification checks, and include the certificate verification in the handshake process. That means moving the key purpose and hostname verification into the handshake process. For that we simplified the current API by introducing gnutls_session_set_verify_cert(), and gnutls_session_set_verify_cert2() for when a key purpose is required. Calling the former function instructs GnuTLS to verify the peer's certificate hostname as part of the handshake, hence eliminating the need for callbacks. An example of a session setup with the new API can be seen in the manual. If we compare the old method with the new, we can see that it reduces the minimum client size by roughly 50 lines of code.
    • Added an API to access system keys. Several operating systems provide a system key store where private keys and certificates may be present. The idea of this feature was to make these keys readily accessible to GnuTLS applications, by using a URL as a key identifier (something like system:uuid=xxx). Currently we support the Windows key store. More information can be seen at the system keys manual section.
    • Added an API for PKCS #7 structure generation and signing. Given that PKCS #7 seems to be the format of choice for signing kernel modules and firmware it was deemed important to provide such an API. It is currently described in gnutls/pkcs7.h and can be seen in action in the modsign tool.
    • Added transparent support for internationalized DNS names. Internationalized DNS names require some canonicalization prior to being compared or used. Previous versions of GnuTLS didn't support that, disallowing the comparison of such names; this is the first version which adds support for the RFC 6125 recommendations.
  •  TLS protocol updates:
    • Added new AEAD cipher suites. There are new ciphersuites being defined for the TLS protocol, some for the Internet of Things profile of TLS, and some which involve stream ciphers like Chacha20. This version of GnuTLS adds support for both of these kinds of ciphersuites, i.e., AES-CCM, AES-CCM-8, and CHACHA20-POLY1305.
    • Added support for Encrypt-then-authenticate mode for CBC. During the last decade we have seen several attacks on the CBC mode, as used by the TLS ciphersuites. Each time there were mitigations for the issues at hand, but there was never a proper fix. This version of GnuTLS implements the "proper" fix as defined in RFC 7366. That fix extends the lifetime of the CBC ciphersuites.
    • Included a fix for the triple-handshake attack. Another attack on the TLS protocol which required a protocol fix was the triple handshake attack. This attack took advantage of clients not verifying the validity of the peer's certificate on renegotiation. This release of GnuTLS implements the protocol fix, which ensures that applications relying on renegotiation are not vulnerable to the attack.
    • Introduced support for the fallback SCSV. That is, we provide the signaling mechanism from RFC 7507 so that applications which are forced to use insecure negotiation can reasonably prevent downgrade attacks.
    • Disabled SSL 3.0 and RC4 in default settings. Given the recent attacks on both the SSL 3.0 protocol and the RC4 cipher, it became apparent that these are no longer a reasonable choice for the default protocol and ciphersuite set. As such, they have been disabled.

That list summarizes the major enhancements in the 3.4.x branch. As you can see, our focus was not only to add new security related features, but also to simplify the usage of the features already present. The latter is very important, as we see in practice that TLS and its related technologies (like PKIX) are hard to use even by experts. That is partially due to the complexity of the protocols involved (like PKIX), but also due to very flexible APIs which strive to handle 99% of the possible options allowed by the protocols, ignoring the fact that 99% of the use cases found on the Internet utilize a single protocol option. That was our main mistake with the verification API, and I believe with the changes in 3.4.x we have finally addressed it.

Furthermore, a long time concern of mine was that the stated protocol complexity of PKIX and TLS translates to tens of thousands of lines of code, hence to a considerable probability for the presence of a software error. We strive to reduce that probability, for example by using fuzzing, static analyzers like Coverity and clang, and audits, but it cannot be eliminated. No matter how low the probability of an error per line of code is, it will become a certainty in large code bases. Thus my recommendation for software which can trade performance for security would be to consider designs where the TLS and PKIX protocols are confined in sandboxed environments like seccomp. We have tested such a design on the openconnect VPN server project and we have noticed no considerable obstacles or significant performance reduction. The main requirement in such an environment is IPC, and being able to pack structures in an efficient way (see, for example, protocol buffers). For such environments, we have added a short text in our manual describing the system calls used by GnuTLS, to assist with setting up filters, and we welcome comments for improving that text or the approach.

In addition, to allow applications to offer multi-layer security involving private key isolation, we continue to enhance the strong point of GnuTLS 3.3.x: transparent support for PKCS #11. Whilst it is sometimes overlooked as a security feature, PKCS #11 can provide an additional layer of security by allowing private keys to be separated from the applications using them. When GnuTLS is combined with a Hardware Security Module (HSM), or a software security module like caml-crush, it will prevent the leak of the server's private key even in the presence of a catastrophic attack which reveals the server's memory (e.g., an attack like heartbleed).

Hence, a summary of our vision for the future would be to be able to operate under and contribute to multi-layer security designs. That is, designs where the TLS library isn't the Maginot line, after which there is no defense, but rather where the TLS library is considered to be one of the many defense layers.

So, overall, with this release we add new features to simplify the library, modernize its protocol support, but also enhance GnuTLS in a way that makes it useful in security designs where multiple layers of protection are implemented (seccomp, key isolation via PKCS #11). If that is something that you are interested in, we welcome you to join our development mailing list, visit our development pages at gitlab, and help us make GnuTLS better.

Monday, June 15, 2015

Software Isolation in Linux

Starting from the assumption that software will always have bugs, we need a way to isolate and neutralize the effects of the bugs. One approach is to isolate components of the software such that a breach in one component doesn't compromise another. That became popular in the software engineering field with the principle of privilege separation used by the openssh server, and is the application of an old idea to a new field. Isolation for protection of assets is widespread in several other aspects of human activity; the most prominent example throughout history is the process of quarantine, which isolates suspected carriers of epidemic diseases to protect the rest of the population. In this post, we briefly overview the tools available in the Linux kernel to offer isolation between software components. It is based on my FOSDEM 2015 presentation on software isolation in the security devroom, slightly enhanced.

Note that this text is intended for developers and software engineers looking for methods to isolate components in their software.

An important aspect of such an analysis is to clarify the threats that it targets. In this text I describe methods which protect against code injection attacks. Code injection provides the attacker a very strong tool to execute code, read arbitrary memory, etc.

In a typical Linux kernel based system, the available tools we can use for such protection are the following.
  1. fork() + setuid() + exec()
  2. chroot()
  3. seccomp()
  4. prctl()
  5. SELinux
  6. Namespaces

The first allows for memory isolation by using different processes, ensuring that a forked process has no access to the parent's memory address space. The second allows a process to restrict itself to a part of the file system available in the operating system. These two are available in almost all POSIX systems, and are the oldest tools available, dating to the first UNIX releases. The focus of this post is methods 3-6, which are fairly recent and Linux kernel specific.

Before proceeding, let's make a brief overview of what a process is in a UNIX-like operating system. It is some code in memory which is scheduled to have a time slot on the CPU. It can access the virtual memory assigned to it, make calculations using the CPU user-mode instructions, and that's pretty much all. To access anything else, e.g., get additional memory, access files, or read/write the memory of other processes, system calls provided by the operating system have to be used.

Let's now proceed and describe the available isolation methods.


After that introduction to processes, seccomp comes as the natural method to list, since it is essentially a filter for the system calls available to a process. For example, a process can install a filter to allow read() and write() but nothing else. After such a filter is applied, any code which is potentially injected into that process will not be able to execute any other system calls, reducing the attack impact to the allowed calls. In our particular example with read() and write(), only the data written and read by that process will be affected.

The simplest way to access seccomp is via the libseccomp library which has a quite intuitive API. An example using that library, which creates a whitelist of three system calls is shown below.

    #include <assert.h>
    #include <errno.h>
    #include <sys/ioctl.h>   /* SIOCGIFMTU */
    #include <seccomp.h>

    scmp_filter_ctx ctx;
    /* Default action: fail any non-whitelisted system call with EPERM. */
    ctx = seccomp_init(SCMP_ACT_ERRNO(EPERM));
    assert(ctx != NULL);
    assert(seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(read), 0) == 0);
    assert(seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 0) == 0);
    /* Allow ioctl() only when its second argument equals SIOCGIFMTU. */
    assert(seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(ioctl), 1,
                            SCMP_A1(SCMP_CMP_EQ, (int)SIOCGIFMTU)) == 0);
    assert(seccomp_load(ctx) == 0);

The example above installs a filter which allows the read(), write() and ioctl() system calls. The latter is only allowed if the second argument of ioctl() is SIOCGIFMTU. The first line of the filter setup instructs seccomp to return -1 and set errno to EPERM when a system call outside this filter is made.

The drawback of seccomp is the tedious process a developer has to go through in order to figure out the system calls used in his application. Manual inspection of the code is required to discover the system calls, as well as inspection of run traces using the 'strace' tool. Manual inspection will allow making a rough list which provides a starting point, but it will not be entirely accurate. The issue is that calls to libc functions may not correspond to the expected system call. For example, a call to exit() often results in an exit_group() system call.

The trace obtained using 'strace' will help clarify, restrict or extend the initial list. Note however, that using traces alone may miss the system calls used in error condition handling, and different versions of libc may use different system calls for the same function. For example, the libc call select() uses the system call select() on the x86-64 architecture, but _newselect() on x86.
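To make the trace-based approach concrete, here is one possible way to reduce an strace log to a draft whitelist of unique system call names ("./myapp" is a placeholder for the program under test; the sed pattern is one sketch, not the only way):

```shell
# Trace the program and all of its children into a log file.
strace -f -o app.trace ./myapp

# Each trace line begins with an optional PID, then "name(args) = ret";
# keep only the unique system call names as a starting point for a filter.
sed -n 's/^[0-9 ]*\([a-z0-9_]*\)(.*/\1/p' app.trace | sort -u
```

The resulting list still needs manual review, for the error-path and libc-version caveats mentioned above.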

The performance cost of seccomp is the cost of executing the filter, and for most cases it is a fixed cost per system call. In the openconnect VPN server I estimated the impact of seccomp on a worker process to 2% slowdown in transfer speed (from 624 to 607Mbps). The measured server worker process is executing read/send and recv/write in a tight loop.


The PR_SET_DUMPABLE flag of the prctl() system call protects a process from other processes accessing its memory. That is, it will prevent processes with the same privilege as the protected one from reading its memory via ptrace(). A usage example is shown below.

    #include <sys/prctl.h>

    /* Mark the process non-dumpable: blocks ptrace() attachment by
       equally-privileged processes and suppresses core dumps. */
    prctl(PR_SET_DUMPABLE, 0);

While this approach doesn't protect against code injection in general, it may prove a useful tool with low-performance cost in several setups.


SELinux is an operating system mechanism to prevent processes from accessing various "objects" of the OS. The objects can be files, other processes, pipes, network interfaces, etc. It may be used to enhance seccomp with more fine-grained access control. For example, one may set up a rule with seccomp to only allow read(), and enhance the rule with SELinux so that read() only accepts certain file descriptors.

On the other hand, a software engineer may not be able to rely too much on SELinux to provide isolation, because it is often not within the developer's control. It is typically used as an administrative tool, and the administrator may decide to turn it off, set it to non-enforcing mode, etc.

The way for a process to transition to a different SELinux ruleset is via exec() or via the setcon() call, and its cost is perceived to be high. However, I have no performance tests with software relying on it.

The drawbacks of this approach are the centralized nature of the system policy (individual applications can only apply the existing system policy, not update it), the obscure language (m4) a system policy needs to be written in, and the fact that any policy written will not be portable across different Linux based systems.

Linux Namespaces

One of the most recent additions to the Linux kernel is the Namespaces feature. It allows "virtualizing" certain Linux kernel subsystems per process, and the result is often referred to as containers. It is available via the clone() and unshare() system calls, and is documented in the unshare(2) and clone(2) man pages, but let's see some examples of the subsystems that can be restricted.

  • NEWPID: Prevents processes from "seeing" and accessing process IDs (PIDs) outside their namespace. That is, the first isolated process will believe it has PID 1 and will see only the processes it has forked.
  • NEWIPC: Prevents processes from accessing the main IPC subsystem (shared memory segments, message queues, etc.). The processes will have access to their own IPC subsystem.
  • NEWNS: Provides filesystem isolation, in a way a feature-rich chroot(). It allows, for example, creating isolated mount points which exist only within a process.
  • NEWNET: Isolates processes from the main network subsystem. That is, it provides them with a separate networking stack, device interfaces, routing tables, etc.
Let's see an example of an isolated process being created in its own PID namespace. The following code operates as fork() would.

    #include <sched.h>
    #include <signal.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* The raw clone() argument order is architecture-specific,
       hence the guard below. */
    #if defined(__i386__) || defined(__arm__) || defined(__x86_64__) || defined(__mips__)
        long ret;
        int flags = SIGCHLD|CLONE_NEWPID;
        ret = syscall(SYS_clone, flags, 0, 0, 0);
        /* In the child, getpid() must report 1 in the new namespace. */
        if (ret == 0 && syscall(SYS_getpid) != 1)
                return -1;
        return ret;
    #endif

This approach of course has a performance penalty in the time needed to create a new process. In my experiments with the openconnect VPN server, creating a process with the NEWPID, NEWNET and NEWIPC flags took roughly 10 times longer than a call to fork().

Note however, that the isolation subsystems available in clone() are by default reversible using the setns() system call. To ensure that these subsystems remain isolated even after code injection, seccomp must be used to eliminate calls to setns() (many thanks to the FOSDEM participant who brought that to my attention).

Furthermore, the Namespaces approach follows the principle of blacklisting, allowing a developer to isolate from certain subsystems but not from every one available in the system. That is, one may enable all the isolation subsystems available in the clone() call, only to realize that the kernel keyring is still available to the isolated process. That is because no such isolation mechanism has been implemented for the kernel keyring so far.


In the following table I attempt to summarize the protection offered by each of the described methods.

                          Prevent killing   Prevent access to     Prevent access     Prevent exploitation of an
                          other processes   other procs' memory   to shared memory   unused system call bug
    Seccomp               Yes               Yes                   Yes                Yes
    prctl(SET_DUMPABLE)   No                Yes                   No                 No
    SELinux               Yes               Yes                   Yes                No
    Namespaces            Yes               Yes                   Yes                No

In my opinion, seccomp seems to be the best option to consider as an isolation mechanism when designing new software. Together with Linux Namespaces it can prevent access to other processes, shared memory and the filesystem, but the main distinguisher I see is the fact that it can restrict access to unused system calls.

That is an important point, given the number of available system calls in a modern kernel. Not only does it reduce the overall attack surface by limiting them, but it also denies access to functionality that was not intended to be exposed; see the setns() and kernel keyring issues above.