It’s All About The
It often starts with the sales and marketing folks. You build a public-facing, web-based application and collect visitor information (name, address, phone number, etc.). Then you get requests from marketing to capture email contacts, demographics and user activity information. You don’t really know why they want this info or what they plan to do with it, but it’s easy enough to collect and they assure you there’s a sound business purpose. After all, the more information you put in your data lake, the better.
Historically, this kind of data collection hasn’t been a significant problem in the United States. Most companies were able to focus data protection efforts such as encryption on SSNs and credit card numbers. However, in light of recent laws, including GDPR, LGPD and CCPA (along with a growing number of additional privacy and data security laws being enacted at the state level), you need to honestly answer some questions:
• What data do you have?
• How is the data being used?
• Which partners do you share data with or sell data to?
• How long are you retaining the data?
• Do you have consent of the customer/consumer to retain the data?
• If requested, could you identify, modify or delete specific customer/consumer data?
The data landscape is becoming ever more complicated to navigate. Something that once seemed simple is growing to be the monster under the bed—and there’s good reason for concern. Imagine sitting on the witness stand having to explain the reasons (or lack thereof) for having vast amounts of arbitrary data after a breach. Managing vast amounts of data, arbitrary or not, isn’t easy, and it’s remarkably tedious. If the job were easy, and the lines of engagement were clear, we wouldn’t be subject to growing regulation.
Software tools are available to aid in data discovery and classification of databases and data flow. But until you get to a point of solid data discovery and classification, consider taking the following manual steps to get started:
1. Talk to your product owners/developers about what data exists, where it is stored, why and for how long.
2. Review your application’s input screen to verify that the type of information being requested is the minimum necessary. Effectively, don’t collect info if you can’t absolutely justify the need for it. After all, in Europe you will have to state the legal basis for collecting. Defaulting to “legitimate interest” isn’t good enough.
3. Classify and categorize your data.
4. Draw out the flow. This may resemble a cracked windshield and that is ok. Use these questions to guide your drawing:
a. What type of data is collected (PII, SPII, PHI)?
b. What applications do you send it to internally?
c. Who do you send it to externally?
5. Once the picture is drawn, stand back and ask WHY, then ask WHY again. Keep asking until you are confident that you understand and can explain the data flow and why it flows that way.
6. Ask who holds that data, and how it is protected. Is it encrypted, at rest and in transit (internally and externally)?
7. Once you pick up your jaw from off the floor, it’s time to get to work.
Admittedly, this manual process isn’t scalable long-term, and the result can be stale as soon as it’s on paper. Hey, you have to start somewhere, right? As you think about how to move forward, plan to build in privacy- or security-by-design practices and to conduct data protection impact assessments (DPIAs) when building or updating applications. These are not just trendy terms; they are descriptions of how to effectively manage and protect your data. In the same way that you build disaster recovery plans for your apps when you build them, you need to include an understanding of the data and how it will be used and managed. Including the foundational elements (what data do you have, why, where is it, who do you send it to, and how do you protect it?) as early and as regularly as possible will lead to big benefits down the road.
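To make those foundational elements concrete, here is a minimal sketch of what one entry in a data inventory might look like. Everything in it is an assumption made for illustration: the `DataElement` fields, the `find_gaps` helper, and the sample systems and partners are hypothetical, not a standard schema or the output of any particular tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataElement:
    """One row of a hypothetical data inventory."""
    name: str                       # what data do you have?
    classification: str             # e.g. "PII", "SPII", "PHI"
    purpose: str                    # why is it collected? ("" if nobody can say)
    stored_in: str                  # where is it?
    retention_days: Optional[int]   # how long is it kept? (None = never defined)
    shared_with: List[str] = field(default_factory=list)  # who do you send it to?
    encrypted_at_rest: bool = False
    encrypted_in_transit: bool = False


def find_gaps(element: DataElement) -> List[str]:
    """Return the questions this inventory entry cannot yet answer."""
    gaps = []
    if not element.purpose.strip():
        gaps.append("no documented purpose")
    if element.retention_days is None:
        gaps.append("no retention period")
    if not element.encrypted_at_rest:
        gaps.append("not encrypted at rest")
    if not element.encrypted_in_transit:
        gaps.append("not encrypted in transit")
    return gaps


# Sample entries; the names, systems and partners are made up.
inventory = [
    DataElement("email", "PII", "account login", "users_db", 730,
                ["mail_vendor"], True, True),
    DataElement("household_income", "PII", "", "data_lake", None,
                ["ad_partner"]),
]

for element in inventory:
    print(element.name, find_gaps(element))
```

An entry that comes back with a long list of gaps is a good candidate for the "ask WHY again" conversation with its product owner. A spreadsheet with the same columns would serve just as well; the point is that the questions get asked for every data element.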
Collecting and tracking this information up front and during the application update process provides intelligence that you will need:
• It starts to answer the regulatory questions by establishing what data is held and why. These questions may not be in play for you yet, but count on them coming.
• It provides the basis for deciding how to protect your data. Ideally, you would encrypt everything. If you don’t, this information shows you what you most need to protect.
• It gives you a head start when a breach occurs at your company or a third party. Wouldn’t it be great to know whether you need to worry about a third-party breach and what your potential exposure might be?
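As a rough illustration of the third-party breach question, the sketch below groups recorded outbound data flows by recipient so you can look up what a given partner holds. The `DATA_FLOWS` list, the partner names and both helper functions are hypothetical; in practice the flows would come from whatever inventory or diagram you built in the manual steps.

```python
from collections import defaultdict

# Hypothetical record of outbound flows: (data element, classification, recipient).
DATA_FLOWS = [
    ("email", "PII", "mail_vendor"),
    ("email", "PII", "ad_partner"),
    ("phone_number", "PII", "ad_partner"),
    ("diagnosis_code", "PHI", "claims_processor"),
]


def exposure_by_recipient(flows):
    """Group the data elements you share by the party that receives them."""
    exposure = defaultdict(list)
    for element, classification, recipient in flows:
        exposure[recipient].append((element, classification))
    return dict(exposure)


def exposure_for(recipient, flows=DATA_FLOWS):
    """List what a given third party holds; empty if no flow is recorded."""
    return exposure_by_recipient(flows).get(recipient, [])


# When "ad_partner" announces a breach, this is the first question to answer.
print(exposure_for("ad_partner"))
```

An empty result only means that no flow was recorded, not that none exists, which is one more reason to keep the inventory current.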
Much of IT, security and privacy work isn’t as glamorous as it is made out to be. Analyzing hundreds of gigabytes or terabytes of logs, reviewing spreadsheets of vulnerability data, and checking the latest entries in the National Vulnerability Database (NVD) can be tedious and monotonous. Mostly it’s just plain hard work. So, roll up your sleeves and start doing the work. Data is currency, and you have to think about what you must do to protect it. You can bet that more regulation and pressure will come, so now is the time to get a solid understanding of the data you hold.