Since the EU’s General Data Protection Regulation (GDPR) came into effect just over two years ago, data privacy has rarely been out of the headlines. From accidental leaks to criminal cyber attacks and nation-state espionage, breaches have become unnervingly common.
As a database administrator, you are entrusted with consumers’ personal information. But are you thinking about that information in light of privacy regulations, and are you certain you have the tools to protect the databases that hold it?
What is the state of data privacy regulation?
Under GDPR, the maximum penalty for non-compliance is four percent of annual worldwide turnover or €20m, whichever is higher. GDPR is widely considered the most comprehensive regulation for protecting personal data. Given that similar regulations could eventually be enacted in the US, we will consider the requirements of GDPR and how they are likely to affect DBAs.
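To make the "whichever is higher" rule concrete, here is a minimal sketch of the arithmetic. The function name and the turnover figures are hypothetical, chosen only for illustration, and the result is the ceiling described above rather than the fine a regulator would actually impose.

```python
def max_gdpr_fine_eur(annual_worldwide_turnover_eur):
    """Ceiling on the penalty: 4% of turnover or EUR 20m, whichever is higher."""
    four_percent = annual_worldwide_turnover_eur * 4 / 100
    return max(four_percent, 20_000_000)

# Hypothetical turnover figures, for illustration only.
small_company = max_gdpr_fine_eur(100_000_000)    # 4% is below EUR 20m, so EUR 20m applies
large_company = max_gdpr_fine_eur(2_000_000_000)  # 4% exceeds EUR 20m, so 4% applies
print(small_company, large_company)
```

For smaller companies the €20m figure dominates, which is why the exposure is significant regardless of company size.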
Are we storing personal data?
Article 4 of GDPR defines personal data broadly: “‘Personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.”
Common examples of personal data include, but are not limited to, biographical, physical, cultural and health data. If your organization collects and stores personal information about people in the EU, then it is subject to GDPR.
I’m a DBA. How do data privacy and protection affect my job?
GDPR specifies multiple roles to ensure a company complies with all regulatory requirements. The roles apply even if the company engages contractors or agencies to process personal data.
• Controller: a natural or legal person, public authority, agency or other body which, alone or jointly with others, determines the purposes and means of the processing of personal data.
• Processor: a natural or legal person, public authority, agency or other body which processes personal data on behalf of the controller.
• Third-party processor: an external data processor which processes personal data on behalf of the controller.
• Data subject: an identified or identifiable natural person whose personal data is processed and must be protected under GDPR.
Does your company have a dedicated “Controller,” such as a data protection officer or manager? If not, then this is typically the starting point for most DBAs.
One thing you need to consider to ensure your business is protected is an aspect of GDPR called Privacy by Design. It requires a fresh look at how you design your systems. To put such a policy in place, you must first know where sensitive data exists in your databases.
What are the main GDPR security requirements?
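The main GDPR data security requirements fall into three categories: assessing and discovering security risks, preventing personal data breaches, and detecting and monitoring potential breaches.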
GDPR also requires compliance with principles to enhance the quality and rigor of protection of the data.
How do I assess and discover security risks?
GDPR mandates that Controllers perform data protection impact assessments when certain types of processing of personal data are likely to present a “high risk” to the data subject. Each assessment must include a systematic and extensive evaluation of the organization’s processes and profiles, and how they safeguard the personal data.
Why should you, as a DBA, care about data privacy? What does assessment and discovery of security risks mean for you?
• Databases across the enterprise likely contain personal and other sensitive data.
• Databases are a primary target for malicious actors attempting a data breach.
• Most regulations specifically prescribe methods and techniques that must be used for databases.
• The DBA is often primarily responsible for implementing compliance controls and technical measures for protecting data.
Look for GDPR solutions that automate the process of discovering sensitive data in all your databases and running reports for you. Your goal is to monitor that data in real time as your database developers change objects like procedural code, and notify them of potential breaches before they deploy changes.
How do I prevent personal data breaches? How protected does my data need to be?
There are two main ways to protect personal data: pseudonymization and anonymization.
Pseudonymization enhances privacy by replacing most identifying fields within a data record with one or more artificial identifiers, or pseudonyms.
Anonymization obscures personal data, for example by masking it. But if the masking can be reversed, the data is still personal data. Only when the method of anonymization is irreversible is the data truly anonymized.
If you employ data masking with reversible encryption, you need to make the masked data harder to attack. To accomplish that, store the encryption key used to mask the data in a table whose access controls are tighter than those governing who can see the SQL code behind the view that exposes the masked data.
Better still, consider using a one-way hashing algorithm to create the masked value. Then, maintain a lookup table somewhere secure inside your data warehouse, with the hashed values and their corresponding clear text values. Add a secret salt to the input before hashing so that identical or easily guessed values cannot be matched against precomputed hashes.
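The hashing approach can be sketched as follows. This is a minimal illustration, not a production design: it assumes a keyed SHA-256 digest (HMAC) as the salted one-way hash, and the names SALT_KEY, pseudonymize, mask_record and lookup_table are hypothetical. In practice the key and the lookup table would live in separately secured storage, not in process memory next to the data.

```python
import hashlib
import hmac
import secrets

# Hypothetical secret salt key; in practice it would be loaded from a secrets
# store with tighter access controls than the masked data itself.
SALT_KEY = secrets.token_bytes(32)

# Hypothetical lookup table: hashed value -> clear text. It must be kept
# somewhere more restricted than the masked data.
lookup_table = {}

def pseudonymize(value, key=SALT_KEY):
    """Return a one-way, keyed SHA-256 digest of a clear-text value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def mask_record(record, sensitive_fields):
    """Return a copy of the record with sensitive fields replaced by hashes."""
    masked = dict(record)
    for field in sensitive_fields:
        if masked.get(field) is not None:
            clear_text = str(masked[field])
            token = pseudonymize(clear_text)
            lookup_table[token] = clear_text  # allows authorized re-identification
            masked[field] = token
    return masked

# Made-up sample record.
customer = {"id": 42, "email": "jane.doe@example.com", "country": "DE"}
masked = mask_record(customer, {"email"})
print(masked)
```

Because the lookup table allows the original values to be recovered, this is pseudonymization rather than anonymization, so the masked data remains personal data.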
The best tools for this purpose take advantage of the database vendor’s data protection features such as masking and encryption.
How do I detect and monitor potential data breaches when the data is constantly moving?
Traditionally, data has been stored in one place — the database — with backup copies on physical media. That media is usually in a different location from which it can be restored in the event of a data loss.
But in this era of data protection strategies that include high availability (HA) and disaster recovery (DR) systems, data is continuously replicated to other locations and to the cloud (DBaaS or IaaS). That continuous movement makes it more difficult to identify and protect personal data. This and other trends, including the rise of DevOps, put additional pressure on DBAs to protect personal data before moving it.
HA, DR, cloud and DevOps are ripe for monitoring tools that can alert you to the use of sensitive data. Most database vendors provide database security and auditing tools, but those tools can be resource-intensive if not configured properly.
With database activity monitoring tools, however, you can monitor important aspects of user behavior.
As a DBA, where should I start?
First, research the data privacy regulations to which your company must adhere.
Next, consult the information your database vendor already provides.
Then, roll up your sleeves and begin the work of figuring out where sensitive data exists in all of your databases.
Many products help you understand where potentially sensitive data is located in your databases. But most of those tools merely rely on metadata like the column names of your tables to infer locations of sensitive data. They assume that your tables and columns follow strict naming conventions, which is not normally the case.
It’s better to search with a tool that performs data sampling across all the tables, then uses a range of regular expressions encapsulated in a set of pre-defined rules to judge what is sensitive data. When you can customize and create your own rules, you can refine the search parameters for your databases.
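The sampling approach can be sketched as follows. This is a minimal illustration using an in-memory SQLite database so that it runs on its own; the rule set, the 50 percent match threshold and the function name find_sensitive_columns are all assumptions, and a real tool would use your database vendor's catalog views and a much broader, customizable set of rules.

```python
import re
import sqlite3

# Illustrative rules only; real rule sets are broader and tuned per database.
RULES = {
    "email": re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$"),
    "phone": re.compile(r"^\+?\d[\d\s().-]{7,}\d$"),
}

def find_sensitive_columns(conn, sample_size=100, threshold=0.5):
    """Sample rows from every table and flag columns whose values match a rule."""
    findings = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        # Random sample of rows; table names come from the catalog, not user input.
        cursor = conn.execute(
            f'SELECT * FROM "{table}" ORDER BY RANDOM() LIMIT ?', (sample_size,))
        columns = [description[0] for description in cursor.description]
        rows = cursor.fetchall()
        for index, column in enumerate(columns):
            values = [str(row[index]) for row in rows if row[index] is not None]
            if not values:
                continue
            for rule_name, pattern in RULES.items():
                hits = sum(1 for value in values if pattern.match(value))
                if hits / len(values) >= threshold:
                    findings.append((table, column, rule_name))
    return findings

# Made-up table whose column names give no hint about their contents.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (c1 INTEGER, c2 TEXT, c3 TEXT)")
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)", [
    (1, "jane.doe@example.com", "+49 30 1234567"),
    (2, "john@example.org", "030-555-0199"),
])
findings = find_sensitive_columns(conn)
print(findings)
```

Here the columns c2 and c3 are flagged as email and phone data even though their names reveal nothing, which is exactly the case a metadata-only search would miss. Adding an entry to RULES is the equivalent of creating your own rule.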
Finally, once you have located sensitive data in a table, apply some of the techniques described above to prevent breaches.
The role of data controller brings a new dimension to database administration, given the number and variety of databases you manage. Look for tools that simplify and automate the identification and reporting of sensitive data, based on regular expressions and data sampling. Then, quickly apply the necessary data protection measures.
Above all, the protection of sensitive data needs to be an integral and enforceable part of every company’s data privacy policies, adopted by all employees.