Privacy, Virtually Speaking

January 9, 2017  |  Grant Foster

Privacy is something everyone should be concerned about, regardless of your political affiliation or threat model / perceived level of risk. Whether a malicious actor is foreign or domestic, they generally want to know as much about you as possible, and aren’t very likely to ask your permission first.

It would be great if common security measures could protect us completely, but unfortunately exploits happen even with the best security technology and training. Even when our machines aren’t actively compromised, we have to worry about being tracked. If you’ve ever maximized your browser window when using Tor, you’ve seen a warning about not having your browser size match the full size of your screen to avoid being tracked via that metric. There are many other ways to use metadata available to any site you visit to track your activities.

Our first desire is probably to not be tracked at all, but another option is to make the information that malicious actors retrieve as useless as possible. For that we can turn to a technology that specializes in abstracting, or hiding, physical details: virtualization.

Virtualization

Virtualization is a technology, now widely used in data centers, that allows you to create an isolated machine, or virtual machine, that runs on your physical computer, or host.

There are many different vendors offering virtualization solutions for the enterprise, but there are also virtualization options available for the individual desktop, usable on either a Windows or Linux machine, which is what we’re going to focus on.

Generally, any desktop virtualization solution will have software known as a hypervisor that will take some input files and turn them into a running virtual machine, or VM for short. Those files generally fall into the following categories:

  • Storage
  • Configuration
  • Logs

[Figure: the hypervisor takes storage, configuration, and log files to make a VM]

Storage and Logs are fairly straightforward: the former are files interpreted as the virtual hard drives or the volatile RAM of a machine, and the latter record the actions the hypervisor takes in running the VM.

Configuration is the most critical input to the hypervisor, because these files tell the hypervisor how the virtual machine should be constructed. The number of virtual CPUs, the network card configuration, and much more are found in the configuration files.
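To make the idea of a configuration file concrete, here is a minimal sketch that reads simple key = "value" lines into a dictionary. The key names and values are illustrative assumptions, loosely modeled on the plain-text style some desktop hypervisors use; your hypervisor’s actual format and keys may differ.

```python
def parse_vm_config(text: str) -> dict:
    """Parse simple `key = "value"` lines into a dictionary."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            continue  # Not a key/value line; ignore it.
        settings[key.strip()] = value.strip().strip('"')
    return settings


# Hypothetical configuration; keys and values are for illustration only.
SAMPLE_CONFIG = '''
# virtual hardware definition
numvcpus = "2"
memsize = "4096"
ethernet0.connectionType = "nat"
'''

config = parse_vm_config(SAMPLE_CONFIG)
print(config["numvcpus"], config["memsize"], config["ethernet0.connectionType"])
```

Everything the guest operating system believes about its hardware comes from settings like these, which is why the virtual machine looks the same regardless of the physical host underneath.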

How does this help our privacy? To answer that, we have to look at a surveillance technique known as fingerprinting.

Fingerprinting

Much like the pattern of swirls on our fingertips identifies us, fingerprinting aims to establish as many unique elements of information about someone as possible, to enable tracking a trail of activity.

As it relates to virtualization, we’re going to focus on hardware fingerprinting. A collection of all the hardware installed in your physical machine like your video card, CPU type, motherboard model and so on constitutes your unique hardware fingerprint. Much like a web server can use User-Agent metadata, it is possible to collect information about your hardware and use that as a tracking mechanism.

This is of the most concern to those who build computers themselves rather than purchase pre-built machines, as the hardware fingerprint of a particular configuration of a Dell laptop is far less unique than that of a computer built with a specific video card, motherboard, and so forth. In both cases, however, it allows whoever is able to collect that information to potentially associate your activity with a particular piece of physical hardware.

[Figure: an unwanted party can fingerprint our hardware and break privacy]

Virtualization can help us defeat this type of fingerprinting by reducing the information available to an external actor to only which vendor’s virtualization software we happen to be using. Knowing only that someone is using a VMware or VirtualBox virtual machine is far less useful than knowing what specific video card, physical CPU, or physical motherboard they have.

For example, if we were to go to an online fingerprinting tool such as https://amiunique.org with a physical machine, we would see something like the following:

[Figure: online fingerprinting tool output for a physical machine]

However, if we were to go to that same site with a virtual machine, we might see something like the following:

[Figure: less useful information is available to the fingerprinting tool in a virtualized environment]

The only thing we did that resulted in disclosing our video card was visit a website, so malicious programs are by no means the only way your hardware fingerprint can leak. By using a VM, however, we’ve greatly reduced the value of the fingerprint information that was disclosed. If a malicious entity were able to access this information, they would know only that we were using a VMware virtual machine, and would learn no hardware details that directly translate to something in the physical realm.
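The way distinct hardware collapses into one indistinct virtual fingerprint can be sketched in a few lines. This is a toy model, not how any particular tracking script works: the attribute names and values are hypothetical, and a real fingerprint would combine many more signals.

```python
import hashlib
import json


def hardware_fingerprint(attributes: dict) -> str:
    """Hash a set of hardware attributes into a short identifier."""
    # Sort keys so the same attributes always produce the same hash.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical physical machines with different components.
custom_build = {"gpu": "Example GPU A", "cpu": "Example CPU X", "board": "Example Board 1"}
laptop = {"gpu": "Example GPU B", "cpu": "Example CPU Y", "board": "Example Board 2"}

# Hypothetical view from inside a VM: the same virtual hardware on either host.
vm_on_custom_build = {"gpu": "Vendor Virtual GPU", "cpu": "Virtual CPU", "board": "Virtual Board"}
vm_on_laptop = dict(vm_on_custom_build)

print(hardware_fingerprint(custom_build) == hardware_fingerprint(laptop))
print(hardware_fingerprint(vm_on_custom_build) == hardware_fingerprint(vm_on_laptop))
```

The two physical machines hash to different identifiers, while the two VMs are indistinguishable from each other and from every other VM presenting the same virtual hardware.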

A specific piece of information, possibly of more value than the rest, is your MAC address. This is something unique to every physical and virtual network card, and of particular interest is how its 6 octets are divided:

01-23-45-67-89-0A

The first 3 pairs of characters are known as an Organizationally Unique Identifier, or OUI. The OUI references the manufacturer, a detail which could be useful for tracking purposes. The last 3 pairs are known as the Network Interface Controller Specific data, and are unique to the NIC itself. When combined, this information could allow a machine, or at the very least a network card to be traced back to its point of origin.
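As a small illustration of that split, the sketch below separates a MAC address into its OUI and NIC-specific halves. The function name is made up for this example, and the address is the placeholder used above, not a real device.

```python
import string


def split_mac(mac: str) -> tuple:
    """Return (OUI, NIC-specific part) for a 6-octet MAC address."""
    # Accept either '-' or ':' as the separator and normalize case.
    octets = mac.strip().replace("-", ":").upper().split(":")
    valid = len(octets) == 6 and all(
        len(octet) == 2 and all(ch in string.hexdigits for ch in octet)
        for octet in octets
    )
    if not valid:
        raise ValueError(f"not a 6-octet MAC address: {mac!r}")
    # First three octets identify the manufacturer, last three the card.
    return ":".join(octets[:3]), ":".join(octets[3:])


oui, nic_specific = split_mac("01-23-45-67-89-0A")
print(oui, nic_specific)  # 01:23:45 67:89:0A
```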

Virtualization also helps us here because hypervisors generate their own MAC addresses, and this information is local to the machine the hypervisor runs on (and not recorded anywhere else, as would often be the case in a physical scenario).

MAC addresses are one of the harder bits of information to obtain, but if one were compromised from a VM, it would again only disclose which vendor we were using for virtualization, and would give no additional information beyond what was disclosed in our WebGL example above.

Since virtualization largely reduces the telemetry that external actors can gather to just a vendor (with notable exceptions such as RAM and virtual CPU count), it also makes any behavioural information gathered less useful.
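To see why a leaked virtual MAC address reveals little, consider what an observer could actually do with it: look up the OUI. The sketch below uses a deliberately tiny lookup table with a few prefixes commonly associated with VMware and VirtualBox; it is an illustrative subset, not a complete or authoritative registry, and the function name is hypothetical.

```python
from typing import Optional

# Small illustrative subset of OUIs associated with virtualization vendors.
VIRTUAL_OUIS = {
    "00:0C:29": "VMware",
    "00:50:56": "VMware",
    "08:00:27": "VirtualBox",
}


def virtualization_vendor(mac: str) -> Optional[str]:
    """Return the virtualization vendor for a MAC address, if recognized."""
    normalized = mac.strip().replace("-", ":").upper()
    # The OUI is the first three octets: 'XX:XX:XX' is 8 characters.
    return VIRTUAL_OUIS.get(normalized[:8])


print(virtualization_vendor("08:00:27:12:34:56"))
print(virtualization_vendor("01-23-45-67-89-0A"))
```

The vendor name is all the lookup yields; the remaining octets were generated locally by the hypervisor and do not point back to a manufactured network card.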

Behavioural Fingerprinting

Let’s consider a hypothetical user, Jack. Jack uses anonymizing technology such as Tor and VPNs from different locations, so the IP address recorded for his traffic does not depend on where he actually is, assuming no de-anonymization occurs.

If Jack were to be hardware-fingerprinted, though, it’s possible there would be telemetry that would roughly associate him with a location, time of day, and possibly more.

[Figure: behavioural fingerprinting with physical hardware]

Were Jack to use a VM, however, activity from any location would look the same (as long as Jack keeps in mind to use the same vendor at all locations), denying another piece of possibly personally identifying information.

[Figure: in a virtualized environment, fingerprinting yields less data]

Further, since VMs are at their core a collection of files, they are also inherently portable. With advances in flash storage technology, large capacity USB drives are now plentiful and inexpensive, making it possible for us to use a VM at home, copy it to a flash drive, and use it elsewhere. One caveat here is that incompatibilities between differing processor types or processor families may occur, so testing functionality on each machine you wish to run a VM on is critical.

Many hypervisors also allow for encryption of a VM, which allows easy addition of 2-factor authentication to an entire machine: The VM files are something you have, and the encryption password is something you know.

Operational Considerations

As with any technology, there are also operational considerations. If we’re going to go through the effort of creating and maintaining a VM, we want to make sure that we verify nothing malicious goes into the VM, which means doing things like checking file SHA checksums on both the hypervisor software and operating system installation media.
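A checksum comparison of this kind can be sketched as follows. SHA-256 is assumed here because it is a commonly published checksum type; use whichever algorithm the vendor actually publishes. The function names and the file name in the usage note are placeholders.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large installation images do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: str, published_hex: str) -> bool:
    """Compare a file's digest with the checksum published by the vendor."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

For example, `matches_published_checksum("installer.iso", "<checksum from the vendor site>")` returns True only if the downloaded file matches. The published checksum should be obtained over a trusted channel, since a checksum served alongside a tampered download can be tampered with too.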

Most hypervisors allow for cloning, or quickly creating a duplicate of a VM. This gives us a way to create a machine that we know to be in a good state, often referred to as a golden master. This is useful because we can quickly create a portable copy, or even a new local copy in case we suspect a compromise of our current VM.

Clones are usually generated with a new MAC address, but we can also keep the old one if we wish to present the same complete hardware fingerprint at multiple locations. As mentioned above, it is difficult, but not impossible, for a malicious website to harvest your MAC address, so this is something to consider depending on your threat model.

For users with high risk profiles, it is recommended to create golden master machines on a system that is not connected to the internet, to curtail surveillance and/or tampering as much as possible.

Using Your VM

Based on your threat model, you may wish to utilize the VM in different ways. If all of your activity is of a highly sensitive nature, you may wish to utilize the VM for all activity, and leave your physical machine acting only as a platform for the hypervisor. Otherwise, you may want to use your VM only for certain sensitive activities, and your physical machine for more mundane tasks that you aren’t concerned about being tracked. Taken to the extreme, protecting sensitive activity would mean completely isolating a VM from networking by not adding a virtual network card to the machine configuration.

A reverse of the approach mentioned above is also possible. By confining ‘high-risk’ activities to a VM, any possible compromise and associated C2 (Command and Control) activities will only occur on the VM, isolating it from other areas of the physical machine. This is the recommended approach for most cases, as a compromise of a physical host is far worse than that of any individual VM, unless individual circumstances warrant a different strategy.

We also have to keep in mind the hypervisor configuration for networking, specifically whether we use a bridged configuration or a NAT (Network Address Translation) configuration. With the former, the VM will truly appear as an independent machine on the network, whereas with the latter, requests from VMs will appear to endpoints as though they originated from the physical host.

[Figure: NAT or bridged configuration determines whether the VM will use the physical host machine’s IP]
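The difference between the two modes can be reduced to a toy model of which source address another machine on the local network observes. The function name and the addresses are hypothetical, and the sketch ignores anything beyond the local network (behind a typical home router, both cases would share the router’s public address).

```python
import ipaddress


def address_seen_by_endpoint(mode: str, host_ip: str, vm_ip: str) -> str:
    """Return the source address a local endpoint sees for VM traffic."""
    # Validate both addresses; ip_address raises ValueError if malformed.
    host = ipaddress.ip_address(host_ip)
    vm = ipaddress.ip_address(vm_ip)
    if mode == "nat":
        # The hypervisor rewrites VM traffic to come from the host.
        return str(host)
    if mode == "bridged":
        # The VM sits on the network with its own address.
        return str(vm)
    raise ValueError(f"unknown network mode: {mode!r}")


print(address_seen_by_endpoint("nat", "192.0.2.10", "10.0.2.15"))
print(address_seen_by_endpoint("bridged", "192.0.2.10", "192.0.2.55"))
```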

Wrap-Up

Privacy and anonymity go hand in hand, and the first step to dismantling privacy is to compromise anonymity. With measures like the Snoopers’ Charter now a fact of life in the UK, protecting anonymity has become even more important than before. Using anonymization products like Tor and VPNs is one part of protecting your privacy, but it is not enough in the face of determined adversaries.

Through virtualization, we can pull the proverbial shade over a part of our online presence, and make sure we are not just another stop on the path of least resistance to gathering information. Your threat model may not call for using all of the measures outlined above, or even any of them, but you now have another tool in your repertoire to protect your privacy.
