Protecting Sensitive Data - 2

Tuesday, 20 June 2006 14:07 by RanjanBanerji

Everyone must have heard the news on how data for 26 million veterans was lost when an analyst's laptop and external drive were stolen from his house.  The latest news on this theft is that information on all active military personnel in the United States was on that disk.  As surprising as this may sound, this is not new.  Having worked for many years in both the government and the private/commercial sector, I have seen plenty of situations where data has been taken home and has therefore been susceptible to theft.  In fact I have heard of many cases where the theft has occurred, but discussing those is not the purpose of this article.

This is part 2 of my article on this issue.  In part 2, I hope to discuss alternate ways of designing and architecting databases to ensure security of personal and sensitive data.  Before doing so I would like to discuss why new architectures are required.

  • Use of personal and sensitive data is rampant.  Past designers of databases and applications have exhibited complete disregard for this issue.  Policy makers are even more responsible for this problem.

  • The low cost of storage has resulted in the mentality of "let's store all data".  The problem is that we now end up storing information that we should not be storing.

  • Absolute apathy on the part of data architects, business analysts and policy makers.  Nearly every system designed will by default use a SS# for identifying people rather than creating its own mechanism for this identification.

Given the prevalence of data such as SS# in databases all across the government, private institutions, banks, etc., what can be done to start protecting this data?

Several database and application design concepts can be put in place.  Each, however, requires detailed analysis.  There is no point in repeating mistakes of the past by hastily trying to protect your data merely to satisfy some auditor.  Unfortunately, in most cases this is exactly what will be done and more cases of lost or stolen data will be in the news.  But I digress.  To remedy this problem the following steps need to be taken immediately:

  • Database and application audit.
  • Data obfuscation and obfuscation policy and process.
  • Service Oriented Architecture.  Separate sensitive data from your day-to-day transaction database and application.

1. Database and Application Audit

The first step is to identify all databases, all copies of these databases and all applications in your organization.  Use subversive methods if necessary to sniff out these systems.  In large organizations, especially in the government, there are many systems that are built under the covers.  I know, I have built many of them.  For all these systems the following must be collected:

  • Where are they located - Which server, which data center, which building, which room etc.  Determine security of location.
  • What do they do.  What is the purpose of this application?
  • Who do they serve.  Who in the organization uses the system?
  • Who are the end users.  Are there other end users?  Who are they?
  • What data is stored.  Where is the database?  What DBMS is used?
  • What data is transmitted.  What data leaves this application?
  • How is the data transmitted.  Is it encrypted or obfuscated?
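One way to keep the answers to these questions consistent across systems is to record them in a fixed structure.  What follows is only a minimal sketch: the class name, the field names and the list of sensitive fields are all assumptions made for illustration, not part of any standard audit format.

```python
from dataclasses import dataclass
from typing import List

# Assumed list of field names treated as sensitive; adjust per organization.
SENSITIVE_FIELDS = {"ssn", "date_of_birth"}


@dataclass
class SystemAuditRecord:
    """One entry in the audit inventory (hypothetical layout)."""
    name: str
    location: str                  # server, data center, building, room
    purpose: str                   # what the application does
    internal_users: List[str]      # who in the organization uses it
    end_users: List[str]           # any other end users
    dbms: str                      # which DBMS holds the data
    stored_fields: List[str]       # what data is stored
    transmitted_fields: List[str]  # what data leaves the application
    transport_encrypted: bool      # is transmitted data encrypted/obfuscated


def audit_findings(record: SystemAuditRecord) -> List[str]:
    """Return plain-text findings about sensitive data in one system."""
    findings = []
    stored = SENSITIVE_FIELDS & set(record.stored_fields)
    if stored:
        findings.append(f"stores sensitive fields: {sorted(stored)}")
    sent = SENSITIVE_FIELDS & set(record.transmitted_fields)
    if sent and not record.transport_encrypted:
        findings.append(f"transmits sensitive fields unencrypted: {sorted(sent)}")
    return findings


example = SystemAuditRecord(
    name="hr-reports",
    location="desktop under someone's desk",
    purpose="monthly headcount reports",
    internal_users=["HR"],
    end_users=[],
    dbms="unknown",
    stored_fields=["name", "ssn"],
    transmitted_fields=["name", "ssn"],
    transport_encrypted=False,
)
for finding in audit_findings(example):
    print(example.name, "-", finding)
```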

2. Data obfuscation and obfuscation policy and process

Establish organization-wide IT policies for obfuscation of data.  The policies should be supported with well documented processes which in turn need to be audited on a routine basis.  See my other post on this subject, Securing Sensitive Data.

Using data obfuscation lets you achieve the following:

  • Immediate protection of sensitive data.
  • Flexibility in your current environment.
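As an illustration only, here is one possible obfuscation technique: replacing each SS# with a keyed hash (HMAC-SHA-256) before a copy of the data leaves the secure environment.  This is a sketch of one option, not necessarily the method described in the other post; the function names and the row layout are assumptions, and the secret key is assumed to be stored somewhere other than alongside the data.

```python
import hashlib
import hmac


def obfuscate_ssn(ssn: str, secret_key: bytes) -> str:
    """Replace an SS# with a keyed hash so copies of the data do not expose it."""
    digits = "".join(ch for ch in ssn if ch in "0123456789")
    if len(digits) != 9:
        raise ValueError("expected a 9-digit SS#")
    return hmac.new(secret_key, digits.encode("ascii"), hashlib.sha256).hexdigest()


def obfuscate_rows(rows, secret_key: bytes, field: str = "ssn"):
    """Return copies of the rows with the sensitive field obfuscated."""
    return [{**row, field: obfuscate_ssn(row[field], secret_key)} for row in rows]
```

The same SS# always maps to the same value under a given key, so records can still be joined and counted, while anyone holding only the obfuscated copy cannot read the original number without the key.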

3.  Service Oriented Architecture

Based on the information collected in step one you should now know what data your organization is collecting, what data is actually needed, and what data is being inappropriately stored and used.  Based on this you need to create the following categorization of systems/applications:

  1. Applications or databases that genuinely require sensitive data such as SS#.  These applications require this data as a part of every transaction, and absolutely no other substitute is available.
  2. Applications or databases that have limited requirements for sensitive data such as SS#.  These applications require this data as a part of a very limited set of transactions, and absolutely no other substitute is available.
  3. Applications or databases that require sensitive data such as SS# but could do without them.  These applications require this data as a part of every transaction, however, the decision to use sensitive data was a matter of convenience and not need.
  4. Applications or databases  that do not even require sensitive data, but store them because of the "it can't hurt to store as much data as possible" mentality.
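The four categories can be written down as a small decision rule, which helps when many systems have to be classified the same way.  This is a rough sketch: the enum names and the two inputs are assumptions, and the case of a system that uses sensitive data in only some transactions even though a substitute exists is not in the list above, so it is folded into category 3 here.

```python
from enum import Enum


class SensitiveDataNeed(Enum):
    REQUIRED_ALWAYS = 1     # every transaction, no substitute
    REQUIRED_SOMETIMES = 2  # limited set of transactions, no substitute
    CONVENIENCE_ONLY = 3    # used, but a substitute would do
    NOT_REQUIRED = 4        # stored "because it can't hurt"


def categorize(usage: str, substitute_available: bool) -> SensitiveDataNeed:
    """usage is 'every', 'some' or 'none': how many transactions use the data."""
    if usage not in ("every", "some", "none"):
        raise ValueError(f"unknown usage: {usage!r}")
    if usage == "none":
        return SensitiveDataNeed.NOT_REQUIRED
    if substitute_available:
        return SensitiveDataNeed.CONVENIENCE_ONLY
    if usage == "every":
        return SensitiveDataNeed.REQUIRED_ALWAYS
    return SensitiveDataNeed.REQUIRED_SOMETIMES
```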

By the time this information is gathered your obfuscation policy and process should already be well underway. 

Applications that fall under category one above will require immense protection and further analysis.  In most cases the claim that sensitive data is always required turns out to be false.  But clearly there are systems that do require such information.

The Service Oriented Architecture (SOA) approach to data security attempts to move as much of the sensitive data as possible away from the main transaction database.  Access to the sensitive data is possible only through secure channels and services.  The transaction database merely holds a proxy for the sensitive data that is being protected.  This enables most business processes such as reporting, analysis, statistics, etc. to proceed without any need for sensitive data such as SS#.  Now even if your transaction data falls into the wrong hands it does not reveal anything of value pertaining to people's individual data.

The argument against this approach is that it may impede performance, i.e., each time you need sensitive data for a transaction you need to communicate between multiple databases using a service that will do a security check.  My counter to this argument is that except in some very extreme cases sensitive data should not be required as a part of transactions.  If it is, then you must evaluate the design and the purpose of your system.

Let's look at this from a real life example.  I recently applied for a credit card from my bank.  I already have a checking and a savings account with the bank.  In the credit card application, despite the fact that I have two accounts with the bank, I had to provide my SS#.  Upon receiving the card, I was asked if I wanted a photocard; if so, I had to send my photograph along with a form that also requested my SS#.  My first reaction was WTF????

So what should the bank have done?  Well, the moment I created an account with them, they should have created a customer number for me.  This number should uniquely identify me to the bank.  If needed they could link this number to my SS# and keep that relationship extremely secure and very seldom used.  From this point on all transactions should be based on this new unique identifier.  Now the bank knows who I am, and in the event of any data theft, the thief has information only with regard to this one bank rather than all information on me.

Let's take another example.  "Name, Rank, Serial Number" is what we hear in movies: POWs not revealing any information other than their name, rank and serial number.  DoD (Department of Defense) in the United States figured, oh! what the hell.  Let's make people reveal a lot more.  How did they manage that?  Well, the serial number, which once upon a time was just a "serial number", is now the Social Security number.  Hence the fiasco when a laptop carrying a veterans database was stolen.  Had the DoD not used SS# and instead used a truly synthetic serial number, the loss of this data might not have been all that damaging to 26 million veterans who now face the risk of identity theft.

The SOA approach that I am proposing (though not truly SOA by the book) requires that you:

  • Understand what data is truly required for what parts of your application.

  • Isolate all sensitive customer data into a secure database and provide a service to access this data.  No direct access to this database from other systems.

  • Create a unique identifier that identifies your customer for your organization only.  DO NOT USE SS# or any other natural key.  Create a new synthetic key.

  • If necessary, associate this new key with any sensitive data identifier, for example SS#, but the association or mapping table must be in the secure database, not the transaction database.
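The points above can be sketched in a few lines.  This is a minimal illustration under stated assumptions, not a production design: the class, table and caller names are hypothetical, two in-memory SQLite databases stand in for the secure and transaction databases, and a simple set of caller names stands in for real authentication over a secure channel.

```python
import sqlite3
import uuid


class SecureIdentityService:
    """The only component that opens the secure database (hypothetical design)."""

    def __init__(self, authorized_callers):
        self._db = sqlite3.connect(":memory:")
        # The mapping table lives in the secure database only.
        self._db.execute(
            "CREATE TABLE identity_map ("
            "customer_id TEXT PRIMARY KEY, ssn TEXT UNIQUE NOT NULL)"
        )
        # Stand-in for real authentication and authorization.
        self._authorized = set(authorized_callers)

    def register(self, ssn: str) -> str:
        """Return the synthetic key for this SS#, creating one if needed."""
        row = self._db.execute(
            "SELECT customer_id FROM identity_map WHERE ssn = ?", (ssn,)
        ).fetchone()
        if row:
            return row[0]
        customer_id = uuid.uuid4().hex  # random, carries no personal meaning
        self._db.execute(
            "INSERT INTO identity_map VALUES (?, ?)", (customer_id, ssn)
        )
        self._db.commit()
        return customer_id

    def lookup_ssn(self, customer_id: str, caller: str) -> str:
        """Rarely used path: only named callers may resolve a key to an SS#."""
        if caller not in self._authorized:
            raise PermissionError(f"{caller} may not read sensitive data")
        row = self._db.execute(
            "SELECT ssn FROM identity_map WHERE customer_id = ?", (customer_id,)
        ).fetchone()
        if row is None:
            raise KeyError(customer_id)
        return row[0]


# The transaction database stores only the synthetic key, never the SS#.
transactions = sqlite3.connect(":memory:")
transactions.execute(
    "CREATE TABLE payments (customer_id TEXT, amount_cents INTEGER)"
)

service = SecureIdentityService(authorized_callers={"tax-reporting"})
customer = service.register("123-45-6789")
transactions.execute("INSERT INTO payments VALUES (?, ?)", (customer, 2500))
```

Everyday reporting can now run against the payments table alone; only a job that genuinely needs the SS#, such as the hypothetical "tax-reporting" caller here, goes through the service.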

Now you have a system where sensitive data is protected and you do not face performance problems on most transactions as sensitive data is no longer required for these transactions.  In addition, now your data obfuscation requirements are also minimized.


If you want to ensure that your customers' or employees' information is secure then now is the time to act.  Starting immediately, take the following steps:

  1. Database and application audit.

  2. For each database you find, immediately establish an obfuscation process and policy.

  3. Start evaluating and analyzing each database and application and start establishing policies for creating synthetic unique identifiers for your customers, employees etc.  Create a separate database with all sensitive data and protect the database offering access only via a secured service.

Categories:   IT