Saturday, December 17, 2011

Linux KVM virtualization and PCI-DSS

With the release of Red Hat Enterprise Linux 6, KVM is now the default virtualization solution in both the RHEL and Ubuntu worlds.  With KVM, the Linux kernel itself acts as a hypervisor and manages the host hardware, allocating resources to the guest virtual machines.  This is quite different from VMware, where a small, purpose-built hypervisor manages the host hardware and the management software runs in a Linux-like environment on top of that hypervisor.

This move to a general-purpose OS as the hypervisor has some significant advantages: the full capabilities of Linux (e.g. software RAID, encrypted storage, broad hardware support) can be leveraged when building a solution, and relative to VMware there can be significant cost savings.

However, in a high-security environment, moving to a general-purpose OS as the hypervisor can introduce additional risks which need to be mitigated.  A purpose-built hypervisor like VMware's is designed to do one thing: run VMs.  In principle, as long as secure account management policies are followed, patches are installed in a timely manner, and management access is restricted to secure hosts, the hypervisor is likely to be 'secure'.  Host environment security is then mostly a matter of securing the guest virtual machines themselves.

With a Linux KVM hypervisor, the situation can be very different.  Modern Linux distributions provide all sorts of software that is invaluable when deployed appropriately, but which makes a poor candidate for installation on a host intended to be a dedicated hypervisor.  In this environment, every unnecessary service is a potential vulnerability that could be exploited to gain unauthorized access to the host.  And once an intruder has access to the hypervisor, there are many tools that can be used to extract information from a running VM without any security software inside the guest being aware that anything is happening.

To illustrate this, I've created the following scenario:

1) a host running Ubuntu 11.10 as a KVM hypervisor called 'kvm-host'
2) a VM running Ubuntu 11.10 called 'iso8583', simulating a transaction processor
3) a VM running Ubuntu 11.10 called 'switch' that will connect to iso8583 and send messages

On iso8583, the role of the processing software is simulated by the 'echo' service in inetd.  This is essentially the most trivial network server imaginable: you create a TCP connection to the service, and any data that you send is echoed back to you.  The data is not logged or processed in any other way, just received by the server and echoed back.
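
For reference, the echo service is one of inetd's built-in 'internal' services; it can be enabled with an entry like the following in /etc/inetd.conf (exact formatting varies between inetd implementations, and the restart command assumes Ubuntu's openbsd-inetd package):

# /etc/inetd.conf - enable inetd's built-in TCP echo service
echo    stream  tcp     nowait  root    internal

# restart inetd to pick up the change
sudo service openbsd-inetd restart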

For this example, I'm assuming that our processing BIN is 412356, so all PANs (card numbers) will be of the form '412356' + 10 more digits.

We start by connecting from switch to iso8583 and sending a couple of messages (just fake PANs, in this case).  The 'netcat' utility is used to connect to the remote service; each PAN sent to the processor is echoed straight back:

18:05:43 switch:~> echo 4123560123456789 | nc iso8583 echo
4123560123456789
18:05:47 switch:~> echo 4123569876543210 | nc iso8583 echo
4123569876543210

Now, on kvm-host (the hypervisor), we dump a copy of the full memory of the virtual machine using the gdb utility 'gcore'.  Note that gcore produces a core dump of any running process (and a KVM guest is just another process, whose address space includes the guest's RAM) without terminating it:

# Get the PID of the VM called iso8583
18:06:05 kvm-host:~> pgrep -f iso8583
18170
# Now get a copy of the in-memory process
18:06:09 kvm-host:~> sudo gcore 18170
[Thread debugging using libthread_db enabled]
[New Thread 0x7f5b8542e700 (LWP 18244)]
[New Thread 0x7f5b89436700 (LWP 18241)]
[New Thread 0x7f5b8ac39700 (LWP 18239)]
[New Thread 0x7f5b87c33700 (LWP 18238)]
[New Thread 0x7f5b89c37700 (LWP 18236)]
[New Thread 0x7f5b86430700 (LWP 18216)]
[New Thread 0x7f5b87432700 (LWP 18214)]
[New Thread 0x7f5b88c35700 (LWP 18205)]
[New Thread 0x7f5b9d9d4700 (LWP 18180)]
0x00007f5ba2213913 in select () from /lib/x86_64-linux-gnu/libc.so.6
Saved corefile core.18170

The file core.18170 now contains a copy of the memory from within the VM; it's as if we had lifted the DRAM chips out of a live system and copied their contents to a file.  We now perform a trivial analysis of the core: the strings tool extracts all ASCII text from the dump, and egrep looks for anything that could be one of our PANs, i.e. anything of the form '412356' followed by 10 digits:

18:06:14 kvm-host:~> strings core.18170 | egrep '412356[[:digit:]]{10}'
4123569876543210
>4123560123456789
4123569876543210
4123569876543210

Sure enough, both PANs are there, even though the server software never attempted to log them to disk, and even though the process which handled them exited the moment we disconnected.  No software running inside the guest VM could detect this exposure, because it occurs entirely outside the VM.  The only way to catch it is to monitor all actions taken on the hypervisor itself, and the only way to prevent it is to securely lock down the hypervisor.
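
As one concrete (and by no means complete) sketch of such monitoring, auditd on the hypervisor can be set to record any use of the tools that read guest memory.  The rules below use standard auditctl syntax, but the specific paths and the 'guest-mem-access' key are just my own illustrative choices:

# watch execution of tools commonly used to read process memory
sudo auditctl -w /usr/bin/gcore -p x -k guest-mem-access
sudo auditctl -w /usr/bin/gdb   -p x -k guest-mem-access

# log every use of the ptrace() syscall (which gdb/gcore rely on) on a 64-bit host
sudo auditctl -a always,exit -F arch=b64 -S ptrace -k guest-mem-access

# review the resulting events
sudo ausearch -k guest-mem-access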

Worse, if those had been real ISO8583 messages, then the full content of each message would likely be recoverable.  That includes what the PCI SSC classifies as 'Sensitive Authentication Data': full track data, the PIN block, and CAV2/CVC2/CVV2/CID.  This is data you are never allowed to store, and which this echo server (rightly) makes no attempt to save to disk.  But it still sits in memory for some period of time until it is overwritten, and it can be pulled silently from the hypervisor environment.

In a similar vein, any keys used to perform software encryption within a VM would be present in the dump.  Finding them would be trickier than grepping for a text string, but it is entirely possible.  Even in the worst case, an attacker could walk through the image looking for key-sized, aligned blocks of data that could plausibly be keys (for example, a randomly generated key is unlikely to contain a NULL byte) and then test each candidate.  That is still many orders of magnitude easier than brute-forcing the key.
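
As a rough illustration (a naive sketch, not a real key-recovery tool), something along these lines would enumerate the 16-byte aligned blocks in the dump that contain no zero bytes, i.e. candidate AES-128 keys, which an attacker would then test against known ciphertext:

# dump the core 16 bytes per line and keep only full blocks with no 00 byte;
# the surviving offsets are candidate key locations (with many false positives)
od -A d -t x1 -v core.18170 | awk 'NF == 17 && $0 !~ / 00/ { print $1 }' > candidates.txt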

I consider myself a strong proponent of Linux, and it is not my intention to put anybody off using Linux as a hypervisor in a secure environment.  I am hoping to draw attention to the fact that a standard Linux distribution cannot and should not be treated as just another 'appliance' hypervisor.  The hypervisor is more critical to your security posture than most other infrastructure components, since a hypervisor compromise allows every system running on top of it to be silently compromised.  The hypervisor should be deployed as a first-class security citizen and secured as any other sensitive host would be: hardened configuration standards, file integrity monitoring (FIM), logging of administrative actions, and all the rest.
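
To give a flavour of the 'logging of administrative actions' piece, two small and fairly standard controls are a dedicated sudo log and forwarding of authentication events to a separate log host (the hostname below is a placeholder):

# /etc/sudoers (edit with visudo): keep a dedicated record of all sudo use
Defaults   logfile=/var/log/sudo.log

# /etc/rsyslog.d/60-remote.conf: ship auth events off the hypervisor
auth,authpriv.*   @loghost.example.com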

If in doubt, ask your QSA (auditor) for an opinion.  Contrary to what some people believe, they are there to help!
