Personal Data in the Cloud


The topic of personal data in the cloud can be viewed from at least two perspectives. There is the end-user perspective, where we as users ask ourselves whether our data in the cloud are safe. And there is the enterprise perspective: suppose you want to start a cloud-based service and store personal data of end users in the cloud. What issues do you have to take care of? In this article we take the enterprise perspective. Of course we can hardly answer that question in full. Instead we focus on some legal aspects that apply when your company resides in the EU.

One of the key ideas behind cloud computing is that end users do not need to care where their data reside. Servers hosting these data may be located anywhere on the globe. Since cloud computing saw its strongest early development in the US, most companies offering cloud services are US-based and many data centers are located in the US. We currently see more and more data centers being opened in Asia and Europe as well.

If the data stored on these servers are personal data, there may be an issue, because there is no uniform agreement about how personal data may be used in different countries around the world. There is not even a uniform understanding of the term personal data. The European Union is known to have the strictest regulations about how personal data may be handled by authorities, administrations, and private companies. The basis for this is the EU data protection directive.

Part of this directive is a ban on exporting any personal data to countries or organisations that do not offer the same level of data protection as the directive establishes in the EU. So how can we ensure that we do not violate the directive when storing personal data in the cloud? To answer the question we first have to understand what personal data are.

Personal Data

The EU directive deliberately takes a wide notion of the term personal data. According to the directive, personal data are “any information relating to an identified or identifiable natural person (“data subject”); an identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity;” (art. 2 a). Typical examples are addresses, bank account or credit card information, medical records, criminal records, company employee data and the like.

The use of personal data is governed by three principles: transparency, legitimate purpose, and proportionality. Transparency requires that the persons whose data are used give their consent and know what is used, when it is used, and for what purpose. Legitimate purpose means that the data collected may only be used for the purpose the persons agreed to and not for any other purpose. Proportionality requires that only those data are collected and processed that are needed to perform the legitimate task, and that these data are kept up to date and correct.

Now that we understand the notion of personal data a bit better let’s look at some solutions to the data protection problem when storing data in the cloud.

No Solution: The Safe Harbor Agreement

When introducing the directive, the European Commission was aware that there was a regular exchange of personal data between EU member states and third countries, in particular the US. Therefore the so-called Safe Harbor data privacy program was initiated. The program basically allows US companies to audit their own adherence to the EU data protection directive and to self-certify it. A couple of reviews initiated by the EU to assess the quality of this self-certification process revealed that the quality of the certification is extremely poor. The last survey, performed in 2008, showed that out of the then up-to-date list of about 1,600 US companies claiming compliance, as few as 3% were in fact compliant. So, we cannot rely on such a certificate.

The situation for German companies that intend to store data in a cloud is even worse, because the German authorities decided to draw consequences from the review results. In particular the Düsseldorfer Kreis, the convention of Germany’s governmental data protection agencies, declared in April 2010 that data-exporting companies cannot rely on Safe Harbor certificates but instead have to verify that the companies receiving the data comply with the directive. In effect this means that the exporting company has to perform and document an audit of the receiving company’s compliance. In other words, it is reasonable to assume that the exporting company takes legal responsibility for the correct storage and use of personal data on the side of the importing company. This is plainly impossible in practice. How should a mid-size company in Germany audit an internet giant like Microsoft or Google, whose legal departments are bigger than the German company’s entire workforce? So we see here that

  • We can’t consider the Safe Harbor agreement as a solution to the original problem.
  • Nor does it seem wise to transfer any personal data to companies that have merely certified themselves under Safe Harbor.

Simple Solution: Stay at Home

A number of US-based cloud service providers have understood that EU-based companies have problems exporting personal data. In order to foster their business in the EU they decided to open data centers in the EU. Companies like Amazon, Google, and Microsoft nowadays let their customers choose the geographical location of the data center(s) where their data should reside. This is more or less the simplest solution. Some companies, including Amazon, Google, and Microsoft, even run more than one data center in the EU in case one needs a geographically remote backup center.

Technical Solution: Depersonalise your Data

There may yet be another solution in cases where the cloud service provider does not operate data centers in the EU. The EU directive deliberately refrained from defining an explicit catalogue of what counts as personal data and rather defined personal data as any set of data that can be used to identify an individual person. This abstract definition allows for a technical solution to the problem. The technical solution consists in using data encryption. The following conditions have to be fulfilled for a valid solution.

  • All personal data have to be encrypted.
  • The encryption keys have to be individual to each customer.
    A general solution where the cloud service provider encrypts all stored data with one and the same key for all customers does not provide sufficient protection.
  • The customer and no one else must be capable of decrypting the data.
  • The decryption keys have to reside within the European Union.

Methods following these guidelines effectively depersonalise the data. Now that the data are no longer personal, they are no longer subject to the EU data protection directive and may hence be stored at any data center around the world.
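
The four conditions above can be illustrated with a small Python sketch. It is only a sketch under stated assumptions, not a recommended implementation: the class EuKeyStore and the functions encrypt and decrypt are hypothetical names invented for this illustration, and the cipher is a simple keystream built from HMAC-SHA256 so that the example runs with the standard library alone. It provides no integrity protection, and a real system would presumably use a vetted authenticated cipher from an established cryptography library instead.

```python
import hashlib
import hmac
import secrets


class EuKeyStore:
    """Hypothetical key store that never leaves the EU; one key per customer."""

    def __init__(self):
        self._keys = {}

    def key_for(self, customer_id: str) -> bytes:
        # Create an individual 256-bit key the first time a customer is seen.
        if customer_id not in self._keys:
            self._keys[customer_id] = secrets.token_bytes(32)
        return self._keys[customer_id]


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # HMAC-SHA256 in counter mode: an illustrative stand-in for a vetted cipher.
    blocks = []
    counter = 0
    while 32 * len(blocks) < length:
        message = nonce + counter.to_bytes(8, "big")
        blocks.append(hmac.new(key, message, hashlib.sha256).digest())
        counter += 1
    return b"".join(blocks)[:length]


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    # A fresh random nonce per record keeps keystreams from repeating.
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, blob: bytes) -> bytes:
    # The first 16 bytes are the nonce, the rest is the encrypted record.
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


# Usage: only ciphertext is handed to the (possibly non-EU) cloud storage,
# represented here by a plain dictionary.
key_store = EuKeyStore()   # stays inside the EU
cloud_storage = {}         # may be located anywhere

record = "Erika Mustermann, Musterstr. 1, Köln".encode("utf-8")
cloud_storage["customer-42"] = encrypt(key_store.key_for("customer-42"), record)

restored = decrypt(key_store.key_for("customer-42"), cloud_storage["customer-42"])
```

The point of the design is the separation of roles: the key store holds one key per customer and stays in the EU, while the storage side only ever sees the nonce and the encrypted bytes.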


Since the Safe Harbor agreement should currently be regarded as a failure, the EU data protection directive basically prohibits the export of personal data out of the EU. As a consequence, care has to be taken when thinking about transferring personal data into the cloud, because the principle of data storage virtualisation, a key concept of the cloud, stands in direct contrast with the data export ban.

Fortunately there are two solutions to this problem which allow us to adhere to the data protection directive and still use the cloud for the storage of personal data. One consists in choosing a provider that lets customers store their data in data centers within the EU. The second consists in depersonalising the data so that they may legally be exported anywhere.

Post by Stephan Kepser


  • Stephan Kepser

    This post deserves an important postscript. Microsoft recently admitted that they will hand over customer data on European servers to the FBI and other US agencies under the USA Patriot Act and other US acts. And the affected customers may perhaps never be informed about this. It is to be expected that all US-based cloud providers act the same way.

    As a consequence customers may either choose EU based cloud service providers or depersonalise their data.

  • Zvi

    Anyway, there is data about us on the web, isn’t there?
    So it is better that I publish my personal data myself than that others do it, and then I have no reason to ask whether my data in the cloud is safe…

    • Stephan Kepser

      True, there are large amounts of data about most of us on the web. And it is of course an important question how we deal with this.

      But this isn’t really the perspective of the blog post. My question was what EU-based companies, German companies in particular, have to take care of in their day-to-day business if they intend to store personal data about their customers in the cloud. The EU imposes, via legislation, pretty strong restrictions on collecting, storing and processing personal data by private companies. And I think companies intending to store customer data in the cloud should be aware of that.

      • Zvi

        Yes, I agree,
        but at the same time we have to find a way to present personal information with no need of protection by legislation.

        • Stephan Kepser

          I’m not sure what you mean by the word “present”. The limits imposed on us in Europe by law cannot simply be overcome. There’s now some discussion that the data protection directive of the EU – and its transfer into national laws – may be too restrictive for many modern web applications. But as long as the national laws aren’t changed we have to follow them. For example, German law does not even give end users the option to tell providers they may use their data for any purpose they want. For storage and processing, the currently most promising way is data anonymisation or pseudonymisation by encryption and distribution.

          • Zvi

            It is interesting. The best is to publish open data; for example, see open personal page (only for those older than 18 years!).
            Do you still think it should be under some restrictions?

