OpenZFS on Linux encryption – Now testing on a Linux kernel near you

20171029_openzfs

One of the holy grails for OpenZFS has been at rest encryption. (The bigger one is block pointer re-write, a type of de-fragmentation, which could also be used for re-striping data.) Encryption has been an often requested ZFS feature, since Oracle Solaris ZFS has had it for years, maybe even 7 years. Several people have written about OpenZFS encryption, but I have my own thoughts on it.


A word about history.

ZFS was developed by the old Sun Microsystems beginning about 2001. It was released to Solaris production in June, 2006, though the open source version of Solaris had ZFS before then in a preview mode.

When Oracle bought Sun Microsystems about 2009, there was concern about losing access to open source code, including Solaris and ZFS, (plus Open Office, but that is another story). So various open source repositories were made for Open Solaris. As part of that, the ZFS source was also copied, up to pool version 28. ZFS encryption came to Oracle Solaris ZFS as pool version 30, after Oracle closed the Solaris source.

So, other than some preliminary work, no fully functional and tested encryption existed for the open source version of ZFS.

The closing of the Open Solaris project by Oracle fragmented the various open source versions of Solaris and ZFS. By then, ZFS was being used on FreeBSD, (and probably other BSDs), Linux, MacOS and various open source Solaris work-alikes.

Lawrence Livermore National Laboratory wanted to use an open source version of ZFS on Linux for their own purposes. Around 2013 it became quite stable as an out-of-tree kernel module. This brought lots of changes and more fragmentation to open source ZFS, so an overall project called OpenZFS was created. Its purpose was to act as a focal point for changes occurring in the open source ZFS from the various OSes that used ZFS, like FreeBSD and Linux.

Since then, the pace of change has increased and code sharing works, though an individual project may get the features it develops first. Those features may remain local to that project, (like ZFS on Linux), until the feature is deemed stable and the source code is vetted for both correctness and ability to be supported.


Now we need to clarify at rest encryption.

Encryption has been around for thousands of years, in forms like soldiers using code words to describe secret activities. Computers simply automated some of the aspects of encryption. Web pages that use HTTPS have SSL, (Secure Sockets Layer), or its successor TLS, for encrypting data transfers over the network. In some cases this is to prevent man in the middle attacks and replacement of some or all of the web page’s data.

At rest encryption is different. There are generally 2 reasons to use file system at rest encryption: theft of hardware and disposal of hardware. Let’s take the second reason first.

Disposal of disks can lead to loss of proprietary information, various secret information or outright theft of money. Getting rid of current or old disks can be for many reasons. Bad blocks, so you send it back to the vendor for replacement. Too slow, so you upgrade. Too small, so you get bigger ones. Not compatible with your fancy new disk array, so the old one needs to go away. This leads to recycling issues.

First class militaries with secret data on the disks may simply destroy the old disks. I’ve seen it, and they are serious about it. Vendors taking back disks for hardware replacement contracts are usually required by the contract to wipe the disks, in case they try to repair them. Some disposal companies are supposed to wipe and, if that is not possible, destroy. Does this occur in all cases? Not likely, especially with fallible humans around. So there is potential for loss of data. But at rest encryption dramatically reduces the risk, (as long as the keys or passphrases are not available).

Back to the first case, theft. Criminals steal hardware, (servers with disks, or just the disks themselves), because they want something, generally money. Valuable financial information is a good motivator for a criminal. For lighter weight criminals, (not after the information, but the tech), just using a less common OS and/or file system may block access to the data. But, can we be sure?

So, at rest encryption of file system data is good, but limited. It does nothing for applications that have bugs which leak private data. Nor does it prevent criminals using OS, network or application exploits to access the data. At rest simply means data on the disk is encrypted. But once the file system is mounted, the encryption appears not to exist to the application(s).

Of course, careful planning can add a third reason to use at rest file system encryption. Let’s say your computers are under constant attack because you have lots of valuable financial data. Further, your IDS, (Intrusion Detection System), can signal when something weird is happening, but the IDS software is not really clear on what happened, and thus does not block it. So, as a proactive step, (which most companies DON’T do), you stop the applications using the encrypted file systems, and drop the encryption keys. Thus, any attacker that gets in can’t read the valuable financial data.
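As a rough sketch of that proactive step, the little Python helper below only builds the commands, it does not run them. It assumes the `zfs unmount` and `zfs unload-key` subcommands from the encryption work, and the dataset name `tank/secure` is made up for illustration. Stopping the applications is site specific, so it is left as a comment.

```python
def lockdown_commands(datasets):
    """Build, in order, the commands that take encrypted datasets offline.

    A key cannot be unloaded while its dataset is mounted, so each
    dataset is unmounted first. Nothing is executed here; the caller
    decides whether to run the commands, (for example via subprocess.run).
    """
    commands = []
    for dataset in datasets:
        # Assumption: the applications using this dataset are already stopped.
        commands.append(["zfs", "unmount", dataset])
        commands.append(["zfs", "unload-key", dataset])
    return commands


if __name__ == "__main__":
    # "tank/secure" is a hypothetical dataset name.
    for argv in lockdown_commands(["tank/secure"]):
        print(" ".join(argv))
```

Returning the commands instead of running them keeps the sketch safe to try on a machine without ZFS, and makes it easy to review what would happen before wiring it to an IDS alert.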


Back to the future, (or really now).

Last year Tom Caputi from Datto announced at rest OpenZFS encryption at the 2016 OpenZFS Developer Summit. He explained how this would work, as well as giving details on limitations. Quite interesting.

Much of the existing OpenZFS source code puts limits on what could be done easily for at rest encryption. A total re-write is out of the question. As are some more complex encryption features, like 2 stage authentication.

After the OpenZFS encryption source code was put in GitHub, I decided to test it out. Since I use Gentoo Linux, with full root ZFS, it was pretty straightforward. (Not changing my root pool, simply adding new partitions to play with encryption.) I found a few odd things, some known, but not well documented. Some not thought through yet. I’ve reported them.

So, is OpenZFS at rest encryption useful?
Absolutely. Its use cases are much the same as those of all the other file systems with at rest encryption out there.

May I convert a ZFS dataset to encryption?
Not really. A dataset or Zvol has to be created as encrypted. You would have to copy an existing ZFS dataset to an encrypted one.
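One way to do that copy is to create an encrypted parent and then send a snapshot of the old dataset into it. The Python sketch below just prints the commands for review. It assumes the `encryption` and `keyformat` properties from the encryption work, and that a plain (non-raw) received stream is encrypted under the encrypted parent. The names `tank/data`, `tank/secure` and the snapshot name `migrate` are made up.

```python
import shlex


def encrypted_copy_commands(source, enc_parent, snapshot="migrate"):
    """Return shell command lines that copy `source` under an encrypted parent."""
    snap = f"{source}@{snapshot}"
    target = f"{enc_parent}/{source.split('/')[-1]}"
    create = ["zfs", "create", "-o", "encryption=on",
              "-o", "keyformat=passphrase", enc_parent]
    return [
        # 1. New encrypted dataset; zfs prompts for the passphrase.
        shlex.join(create),
        # 2. Freeze the existing data so the copy is consistent.
        shlex.join(["zfs", "snapshot", snap]),
        # 3. Copy; the received child sits under the encrypted parent.
        shlex.join(["zfs", "send", snap]) + " | "
        + shlex.join(["zfs", "receive", target]),
    ]


if __name__ == "__main__":
    # Hypothetical dataset names, for illustration only.
    for line in encrypted_copy_commands("tank/data", "tank/secure"):
        print(line)
```

After checking that the copy is complete, the old unencrypted dataset still has to be destroyed, and its blocks remain readable on disk until they are overwritten.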

Are the passphrases or encryption keys changeable?
Yes, they are simply wrapper keys to the master key used to encrypt the data. Thus, changing the passphrase or key does not cause or require any data to be re-encrypted.
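To see why no data is re-encrypted, here is a toy Python model of key wrapping. It is NOT how ZFS does its cryptography: the XOR "cipher" is a stand-in for a real authenticated cipher, and every function name is invented for this sketch. The only point is the structure, where the data is tied to a master key and the passphrase merely protects that master key.

```python
import hashlib
import os


def derive_wrapping_key(passphrase, salt):
    """Stretch a passphrase into a 32 byte wrapping key with PBKDF2."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, 100_000, dklen=32)


def xor_bytes(a, b):
    """XOR two byte strings; a toy stand-in for real key wrapping."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_crypt(key, data):
    """Toy stream cipher (encrypt == decrypt). Never use for real data."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return xor_bytes(data, bytes(stream))


# The master key encrypts the data and never changes.
master_key = os.urandom(32)
ciphertext = toy_crypt(master_key, b"valuable financial data")

# Only the wrapped master key is stored, protected by the passphrase.
old_salt = os.urandom(16)
wrapped = xor_bytes(master_key, derive_wrapping_key("old passphrase", old_salt))

# Changing the passphrase: unwrap with the old key, re-wrap with the new one.
new_salt = os.urandom(16)
unwrapped = xor_bytes(wrapped, derive_wrapping_key("old passphrase", old_salt))
rewrapped = xor_bytes(unwrapped, derive_wrapping_key("new passphrase", new_salt))

# The ciphertext was never touched, yet the new passphrase still opens it.
recovered = xor_bytes(rewrapped, derive_wrapping_key("new passphrase", new_salt))
assert toy_crypt(recovered, ciphertext) == b"valuable financial data"
```

The passphrase change rewrites 32 bytes of wrapped key, no matter how much data the master key protects.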

Is OpenZFS at rest encryption ready for production environments?
No. Lots of minor work is needed. Then some production issues need to be resolved, like where to store the encryption passphrases or key files, and how to use them.

Which OSes support OpenZFS at rest encryption today?
Only Linux. And then only certain distros, like Gentoo. Once many of the issues are resolved, like getting ZFS send & receive working well with encryption, the BSDs, illumos and MacOS will get at rest encryption through the OpenZFS project.

Can I use an encrypted OpenZFS pool?
Yes. But it has to be created that way. And all child datasets will be encrypted.

What are some of the limitations of OpenZFS at rest encryption?

  • You have to decide on passphrase or key file, can’t use both.
  • There is only one slot for passphrase or key. Linux LUKS has 8 slots.
  • All children, snapshots and clones of an OpenZFS encrypted dataset are encrypted.
  • Sending and receiving encrypted datasets has some quirks.

 

Hello encrypted datasets and Zvols for OpenZFS. And welcome to the nightmare of people over-using a feature without understanding it, (then losing data).

Edit 2019/05/23: Native ZFS encryption arrived for ZFS on Linux with:
GitHub for ZFS on Linux, zfs-0.8.0
A bit longer to release than expected, but it does seem initially feature complete and reasonably reliable.