Big Data analysis and anonymisation techniques under the EU General Data Protection Regulation



Big Data techniques have been widely adopted by financial institutions, payment institutions and other players in the financial sector. They have proven very useful for analysing credit risk, granting loans, running know-your-customer (KYC) processes and personalising offers for customers, as well as for attracting new customers by helping them to use their financial information or by developing other value-added services.

Within the financial sector, Big Data projects often involve various entities contributing very diverse sources of information towards a common purpose. This is the case for ‘smart cities’ projects involving profiling activities, which may bring benefits at both a private level (retail trade, financial entities, health-related issues or transport services) and a public level (municipalities, public companies delivering community services). Fintech-related businesses are also growing rapidly around Big Data techniques and blockchain technologies.

Consequently, databases containing, for instance, credit scores or information on transactions executed through payment instruments hold a wide range of details about individuals (such as first name, last name, age, wage, expenses and so on). Analysing them is likely to involve the processing of personal data and is therefore subject to data protection law.

Enterprise resource planning (ERP) systems, management software and more generic databases also hold a wide variety of information to which Big Data techniques are commonly applied. What these sources have in common is that, although they are used as statistical tools, their contents might be traced back to an individual should they fall into the wrong hands. In an information society in which consumers’ electronic devices integrate internet access, GPS, speedometers, accelerometers and other location tools, even the most basic daily activities can give rise to a significant amount of actual and inferred data about a single individual (personal data). Buying groceries, for example, exposes the individual’s health habits, tastes and financial data.

From May 2018 onwards, the processing of personal data described above will have to comply with Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation, or GDPR). One of the main changes brought about by this legal overhaul is that the addressees of the regulation must decide for themselves, through their own business decisions, how to meet its objectives and remain in compliance with the following principles: (i) lawfulness, fairness and transparency; (ii) purpose limitation; (iii) data minimisation; (iv) accuracy; (v) storage limitation; and (vi) integrity and confidentiality. All Big Data projects should therefore be assessed with these legal principles in mind.

In a code of conduct released in May 2017 by the Spanish Data Protection Agency (DPA) titled ‘Code of Good Practice in Data Protection for Projects of Big Data’, the Spanish DPA states that in Big Data projects it is essential to fulfil the data minimisation and purpose limitation requirements by providing detailed and transparent information to the affected data subjects, so that they can validly consent to the processing of their personal data. Meeting these requirements can be very challenging for the entities behind a Big Data project, as they may not always be in direct contact with the data subject.

Since collecting individuals’ consent is usually a burdensome task, ‘legitimate interest’ (another legal ground for the processing of personal data contemplated in the GDPR, which can be relied on instead of the data subject’s consent) may become an alternative worth considering in the context of Big Data projects. However, legitimate interest may only be validly used as a legal ground for data processing when a prior, individualised assessment has been carried out on whether the fundamental rights and freedoms of the data subject should prevail over the legitimate interest of the data controller. The Court of Justice of the EU ruled on the use of legitimate interest, which was also contemplated under Directive 95/46/EC, in joined cases C-468/10 and C-469/10, underlining the importance of carrying out a proper assessment of the interests at stake.

In this context, anonymisation (understood as the process of rendering a set of personal data “anonymous”) has become a real option for entities carrying out Big Data projects, given that Recital 26 of the GDPR sets out that data protection principles do not apply when the information processed cannot be linked to an identified or identifiable person. It should be pointed out, however, that the anonymisation process is itself a personal data processing activity and, depending on the circumstances of the case at hand, is also subject to the applicable legal provisions.

Thus, anonymisation requires ensuring, to the maximum extent possible, the complete dissociation of the personal data from the data subject. It also requires preventing re-identification, in the sense that the effort needed to re-identify the relevant data subjects must be considerable in comparison with the benefits that may be obtained from doing so. In short, the dissociation must be complete and irreversible once the anonymisation process has taken place, not only for third parties but also for the entity carrying out the anonymisation process, as consistently upheld by the Spanish DPA in various opinions and decisions.

Prior to the adoption of the GDPR, the Article 29 Working Party (which will be replaced by the European Data Protection Board) issued Opinion 5/2014 on anonymisation techniques. To assess the robustness of the anonymisation techniques used, the Working Party proposes testing the result against three criteria: singling out (the possibility of isolating some or all records which identify an individual in the dataset), linkability (the ability to link at least two records concerning the same data subject or a group of data subjects) and inference (the possibility of deducing, with significant probability, the value of an attribute from the values of a set of other attributes). In this Opinion, the Working Party also stresses that the more anonymisation techniques are combined, the more likely it is that the result will be robust.
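By way of illustration only, the singling-out criterion can be approximated with a simple group-size count over a dataset, in the spirit of the k-anonymity approach. The sketch below is not a method prescribed by the Working Party or the Spanish DPA; the records, the field names and the choice of attributes treated as quasi-identifiers are all hypothetical assumptions. It also says nothing about linkability or inference, which would need separate checks.

```python
from collections import Counter

# Hypothetical toy records; field names and values are illustrative assumptions.
records = [
    {"age_band": "30-39", "postcode_prefix": "280", "score": 610},
    {"age_band": "30-39", "postcode_prefix": "280", "score": 720},
    {"age_band": "40-49", "postcode_prefix": "080", "score": 655},
]


def group_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def singled_out(rows, quasi_identifiers):
    """Return the rows whose quasi-identifier combination is unique in the dataset."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [
        row for row in rows
        if sizes[tuple(row[q] for q in quasi_identifiers)] == 1
    ]


def smallest_group(rows, quasi_identifiers):
    """Size of the smallest group; a value of 1 means someone can be isolated."""
    sizes = group_sizes(rows, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


at_risk = singled_out(records, ["age_band", "postcode_prefix"])
k = smallest_group(records, ["age_band", "postcode_prefix"])
```

In this toy dataset the third record is the only one in its age band and postcode area, so it can be isolated even though no name appears anywhere. A real assessment would have to consider every attribute that an outsider could plausibly combine with other sources.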

Unlike Directive 95/46/EC, the GDPR regulates the concept of pseudonymisation, which should not be confused with anonymisation. The GDPR sets out a twofold requirement for companies using pseudonymisation techniques. First, the personal data must be processed in such a way that it can no longer be attributed to a specific individual without the use of additional information. Second, that additional information must be kept separately and be subject to specific technical and organisational security measures which guarantee that no link can be made with an identified or identifiable individual. It is noteworthy that, although pseudonymisation can be regarded as a useful security measure in the process of anonymising personal data, it cannot by itself be considered an anonymisation process unless further measures are implemented to properly prevent the identification of individuals.
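The twofold requirement can be pictured with a minimal sketch in which a direct identifier is replaced by a keyed hash and the key plays the role of the ‘additional information’ that has to be kept apart. This is one common technique, offered here as an assumption rather than as the approach required by the GDPR; the function names, the record layout and the sample identifier are hypothetical.

```python
import hashlib
import hmac
import secrets


def new_secret_key() -> bytes:
    """Generate the 'additional information'. In practice it would be stored
    separately from the dataset, under its own access controls (assumption)."""
    return secrets.token_bytes(32)


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


key = new_secret_key()

# Hypothetical source record; the identifier format is illustrative only.
record = {"customer_id": "ES-000123", "monthly_spend": 842.50}

pseudonymised = {
    "customer_ref": pseudonymise(record["customer_id"], key),
    "monthly_spend": record["monthly_spend"],
}
```

Anyone holding the key can recompute the reference for a known customer and so re-attribute the record, which is why data treated in this way remains personal data and why pseudonymisation alone does not amount to anonymisation.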

In late 2016, the Spanish DPA published a document containing guidelines and guarantees for procedures used to anonymise personal data. In this guidance, the Spanish DPA states that anonymisation processes should take as their starting point the ‘privacy by design’ principle, which is one of the new data protection concepts introduced by the GDPR. This principle states that data protection shall be observed from the very beginning of the design of a product or technology. In the field of anonymisation, this implies that the information systems, products or techniques used for these purposes shall ensure the confidentiality of individuals. It is important to emphasise that the GDPR does not state the precise technical and organisational measures that legal entities shall adopt. State-of-the-art techniques and standardisation rules are therefore now key for entities that need to demonstrate to authorities and national courts, should the need arise, a sufficient degree of due diligence in their personal data processing activities.

Prior to each anonymisation process, the Spanish DPA recommends carrying out a risk analysis or even a ‘data protection impact assessment’ so that the risks of re-identifying individuals are duly managed by adopting technical, organisational and other types of security measures. The Spanish authority’s view is that this risk analysis or data protection impact assessment should be periodically reviewed, particularly taking into account that the risk of re-identification may increase over time.

In the financial industry in particular, sector-specific legislation at EU level requires the observance of data protection rules in relation to, among other things, the setting up of centralised anti-money laundering/KYC databases with relevant beneficial ownership information made available to national Financial Intelligence Units (FIUs) by those entities legally obliged to do so by their national laws. In this sense, Directive (EU) 2015/2366 (PSD II) provides that payment systems and payment service providers shall solely process personal data “when necessary to safeguard the prevention, investigation and detection of payment fraud”. PSD II also makes a general reference to the application of Directive 95/46/EC and limits the access, processing and retention of personal data by payment service providers to that “necessary for the provision of their payment services, with the explicit consent of the payment service user”. Payment service providers, as well as other financial players, will have to be particularly cautious regarding Big Data projects. It is highly likely that entities will encounter ‘purpose limitation’ constraints, which should be carefully assessed with appropriate advice from legal experts.

In our opinion, the implementation of lawful Big Data projects and anonymisation techniques will very much depend on how quickly technology evolves and on how difficult it becomes for legislators and other stakeholders to keep pace and provide safe and reliable requirements, regardless of the efforts made by data protection authorities to set out effective rules and guidelines.


Rafael García del Poyo is a partner, Roger Segarra is an associate and Samuel Martínez is an associate director at Osborne Clarke. Mr García del Poyo can be contacted on +34 60 884 8406, Mr Segarra on +34 68 637 0932 and Mr Martínez on +34 62 014 4377.

© Financier Worldwide