Jan
26
2019
--

Has the fight over privacy changed at all in 2019?

Few issues divide the tech community quite like privacy. Much of Silicon Valley’s wealth has been built on data-driven advertising platforms, and yet, there remain constant concerns about the invasiveness of those platforms.

Such concerns have intensified in just the last few weeks as France’s privacy regulator imposed a record fine on Google under Europe’s General Data Protection Regulation (GDPR), a penalty the company now plans to appeal. Yet with global platform usage and service sales continuing to tick up, we asked a panel of eight privacy experts: “Has anything fundamentally changed around privacy in tech in 2019? What is the state of privacy and has the outlook changed?”

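This week’s participants include: Albert Gidari (Stanford Center for Internet and Society), Gabriel Weinberg (DuckDuckGo), Melika Carroll (Internet Association), Johnny Ryan (Brave), John Miller (Information Technology Industry Council), Nuala O’Connor (Center for Democracy & Technology), Chris Baker (Box) and Christopher Wolf (Future of Privacy Forum).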

TechCrunch is experimenting with new content forms. Consider this a recurring venue for debate, where leading experts – with a diverse range of vantage points and opinions – provide us with thoughts on some of the biggest issues currently in tech, startups and venture. If you have any feedback, please reach out: Arman.Tabatabai@techcrunch.com.


Thoughts & Responses:


Albert Gidari

Albert Gidari is the Consulting Director of Privacy at the Stanford Center for Internet and Society. He was a partner for over 20 years at Perkins Coie LLP, achieving a top-ranking in privacy law by Chambers, before retiring to consult with CIS on its privacy program. He negotiated the first-ever “privacy by design” consent decree with the Federal Trade Commission. A recognized expert on electronic surveillance law, he brought the first public lawsuit before the Foreign Intelligence Surveillance Court, seeking the right of providers to disclose the volume of national security demands received and the number of affected user accounts, ultimately resulting in greater public disclosure of such requests.

There is no doubt that the privacy environment changed in 2018 with the passage of California’s Consumer Privacy Act (CCPA), implementation of the European Union’s General Data Protection Regulation (GDPR), and new privacy laws enacted around the globe.

For one thing, large tech companies have grown huge privacy compliance organizations to meet their new regulatory obligations. For another, the major platforms now are lobbying for passage of a federal privacy law in the U.S. This is not surprising after a year of privacy miscues, breaches and negative privacy news. But does all of this mean a fundamental change is in store for privacy? I think not.

The fundamental model sustaining the Internet is based upon the exchange of user data for free service. As long as advertising dollars drive the growth of the Internet, regulation simply will tinker around the edges, setting sideboards to dictate the terms of the exchange. The tech companies may be more accountable for how they handle data and to whom they disclose it, but the fact is that data will continue to be collected from all manner of people, places and things.

Indeed, if the past year has shown anything it is that two rules are fundamental: (1) everything that can be connected to the Internet will be connected; and (2) everything that can be collected, will be collected, analyzed, used and monetized. It is inexorable.

While privacy regulation seeks to make tech companies better stewards of the data they collect and their practices more transparent, in the end, it is a deception to think that users will have more “privacy.” No one even knows what “more privacy” means. If it means that users will have more control over the data they share, that is laudable but not achievable in a world where people have no idea how many times or with whom they have shared their information already. Can you name all the places over your lifetime where you provided your SSN and other identifying information? And given that the largest data collector (and likely least secure) is government, what does control really mean?

All this is not to say that privacy regulation is futile. But it is to recognize that nothing proposed today will result in a fundamental shift in privacy policy or provide a panacea of consumer protection. Better privacy hygiene and more accountability on the part of tech companies is a good thing, but it doesn’t solve the privacy paradox that those same users who want more privacy broadly share their information with others who are less trustworthy on social media (ask Jeff Bezos), or that the government hoovers up data at a rate that makes tech companies look like pikers (visit a smart city near you).

Many years ago, I used to practice environmental law. I watched companies strive to comply with new laws intended to control pollution by creating compliance infrastructures and teams aimed at preventing, detecting and deterring violations. Today, I see the same thing at the large tech companies – hundreds of employees have been hired to do “privacy” compliance. The language is the same too: cradle to grave privacy documentation of data flows for a product or service; audits and assessments of privacy practices; data mapping; sustainable privacy practices. In short, privacy has become corporatized and industrialized.

True, we have cleaner air and cleaner water as a result of environmental law, but we also have made it lawful and built businesses around acceptable levels of pollution. Companies still lawfully dump arsenic in the water and belch volatile organic compounds in the air. And we still get environmental catastrophes. So don’t expect today’s “Clean Privacy Law” to eliminate data breaches or profiling or abuses.

The privacy world is complicated and few people truly understand the number and variety of companies involved in data collection and processing, and none of them are in Congress. The power to fundamentally change the privacy equation is in the hands of the people who use the technology (or choose not to) and in the hands of those who design it, and maybe that’s where it should be.


Gabriel Weinberg

Gabriel Weinberg is the Founder and CEO of privacy-focused search engine DuckDuckGo.

Coming into 2019, interest in privacy solutions is truly mainstream. There are signs of this everywhere (media, politics, books, etc.) and also in DuckDuckGo’s growth, which has never been faster. With solid majorities now seeking out private alternatives and other ways to be tracked less online, we expect governments to continue to step up their regulatory scrutiny and for privacy companies like DuckDuckGo to continue to help more people take back their privacy.

We’re also seeing companies take action beyond mere regulatory compliance, reflecting this new majority will of the people and its tangible effect on the market. Just this month we’ve seen Apple’s Tim Cook call for stronger privacy regulation and the New York Times report strong ad revenue in Europe after stopping the use of ad exchanges and behavioral targeting.

At its core, this groundswell is driven by the negative effects that stem from the surveillance business model. The percentage of people who have noticed ads following them around the Internet, or who have had their data exposed in a breach, or who have had a family member or friend experience some kind of credit card fraud or identity theft issue, reached a boiling point in 2018. On top of that, people learned of the extent to which the big platforms like Google and Facebook that collect the most data are used to propagate misinformation, discrimination, and polarization. Consumers don’t necessarily feel they have anything to hide – but they just don’t want corporations to profit off their personal information, or be manipulated, or unfairly treated through misuse of that information. Fortunately, there are alternatives to the surveillance business model and more companies are setting a new standard of trust online by showcasing alternative models.


Melika Carroll

Melika Carroll is Senior Vice President, Global Government Affairs at Internet Association, which represents over 45 of the world’s leading internet companies, including Google, Facebook, Amazon, Twitter, Uber, Airbnb and others.

We support a modern, national privacy law that provides people meaningful control over the data they provide to companies so they can make the most informed choices about how that data is used, seen, and shared.

Internet companies believe all Americans should have the ability to access, correct, delete, and download the data they provide to companies.

Americans will benefit most from a federal approach to privacy – as opposed to a patchwork of state laws – that protects their privacy regardless of where they live. If someone in New York is video chatting with their grandmother in Florida, they should both benefit from the same privacy protections.

It’s also important to consider that all companies – both online and offline – use and collect data. Any national privacy framework should provide the same protections for people’s data across industries, regardless of whether it is gathered offline or online.

Two other important pieces of any federal privacy law include user expectations and the context in which data is shared with third parties. Expectations may vary based on a person’s relationship with a company, the service they expect to receive, and the sensitivity of the data they’re sharing. For example, you expect a car rental company to be able to track the location of a rented vehicle that doesn’t get returned. You don’t expect the car rental company to track your real-time location and sell that data to the highest bidder. Additionally, the same piece of data can have different sensitivities depending on the context in which it’s used or shared. For example, your name on a business card may not be as sensitive as your name on the sign-in sheet at an addiction support group meeting.

This is a unique time in Washington as there is bipartisan support in both chambers of Congress as well as in the administration for a federal privacy law. Our industry is committed to working with policymakers and other stakeholders to find an American approach to privacy that protects individuals’ privacy and allows companies to innovate and develop products people love.


Johnny Ryan

Dr. Johnny Ryan FRHistS is Chief Policy & Industry Relations Officer at Brave. His previous roles include Head of Ecosystem at PageFair, and Chief Innovation Officer of The Irish Times. He has a PhD from the University of Cambridge, and is a Fellow of the Royal Historical Society.

Tech companies will probably have to adapt to two privacy trends.

First, the GDPR is emerging as a de facto international standard.

In the coming years, the application of GDPR-like laws for commercial use of consumers’ personal data in the EU, Britain (post-EU), Japan, India, Brazil, South Korea, Malaysia, Argentina, and China will bring more than half of global GDP under a similar standard.

Whether this emerging standard helps or harms United States firms will be determined by whether the United States enacts and actively enforces robust federal privacy laws. Unless there is a federal GDPR-like law in the United States, there may be a degree of friction and the potential of isolation for United States companies.

However, there is an opportunity in this trend. The United States can assume the global lead by doing two things. First, enact a federal law that borrows from the GDPR, including a comprehensive definition of “personal data”, and robust “purpose specification”. Second, invest in world-leading regulation that pursues test cases, and defines practical standards. Cutting-edge enforcement of common principles-based standards is de facto leadership.

Second, privacy and antitrust law are moving closer to each other, and might squeeze big tech companies very tightly indeed.

Big tech companies “cross-use” user data from one part of their business to prop up others. The result is that a company can leverage all the personal information accumulated from its users in one line of business, and for one purpose, to dominate other lines of business too.

This is likely to have anti-competitive effects. Rather than competing on the merits, the company can enjoy the unfair advantage of massive network effects even though it may be starting from scratch in a new line of business. This stifles competition and hurts innovation and consumer choice.

Antitrust authorities in other jurisdictions have addressed this. In 2015, the Belgian National Lottery was fined for re-using personal information acquired through its monopoly for a different, and incompatible, line of business.

As lawmakers and regulators in Europe and in the United States start to think of “purpose specification” as a tool for antitrust enforcement, tech giants should beware.


John Miller

John Miller is the VP for Global Policy and Law at the Information Technology Industry Council (ITI), a D.C.-based advocacy group for the high-tech sector. Miller leads ITI’s work on cybersecurity, privacy, surveillance, and other technology and digital policy issues.

Data has long been the lifeblood of innovation. And protecting that data remains a priority for individuals, companies and governments alike. However, as times change and innovation progresses at a rapid rate, it’s clear the laws protecting consumers’ data and privacy must evolve as well.

As the global regulatory landscape shifts, there is now widespread agreement among business, government, and consumers that we must modernize our privacy laws, and create an approach to protecting consumer privacy that works in today’s data-driven reality, while still delivering the innovations consumers and businesses demand.

More and more, lawmakers and stakeholders acknowledge that an effective privacy regime provides meaningful privacy protections for consumers regardless of where they live. Approaches, like the framework ITI released last fall, must offer an interoperable solution that can serve as a model for governments worldwide, providing an alternative to a patchwork of laws that could create confusion and uncertainty over what protections individuals have.

Companies are also increasingly aware of the critical role they play in protecting privacy. Looking ahead, the tech industry will continue to develop mechanisms to hold us accountable, including recommendations that any privacy law mandate companies identify, monitor, and document uses of known personal data, while ensuring the existence of meaningful enforcement mechanisms.


Nuala O’Connor

Nuala O’Connor is president and CEO of the Center for Democracy & Technology, a global nonprofit committed to the advancement of digital human rights and civil liberties, including privacy, freedom of expression, and human agency. O’Connor has served in a number of presidentially appointed positions, including as the first statutorily mandated chief privacy officer in the U.S. federal government when she served at the U.S. Department of Homeland Security. O’Connor has held senior corporate leadership positions on privacy, data, and customer trust at Amazon, General Electric, and DoubleClick. She has practiced at several global law firms including Sidley Austin and Venable. She is an advocate for the use of data and internet-enabled technologies to improve equity and amplify marginalized voices.

For too long, Americans’ digital privacy has varied widely, depending on the technologies and services we use, the companies that provide those services, and our capacity to navigate confusing notices and settings.

We are burdened with trying to make informed choices that align with our personal privacy preferences on hundreds of devices and thousands of apps, and reading and parsing as many different policies and settings. No individual has the time or the capacity to manage their privacy in this way, nor is it a good use of time in our increasingly busy lives. These notices and choices and checkboxes have become privacy theater, but not privacy reality.

In 2019, the legal landscape for data privacy is changing, and so is the public perception of how companies handle data. As more information comes to light about the effects of companies’ data practices and myriad stewardship missteps, Americans are surprised and shocked about what they’re learning. They’re increasingly paying attention, and questioning why they are still overburdened and unprotected. And with intensifying scrutiny by the media, as well as state and local lawmakers, companies are recognizing the need for a clear and nationally consistent set of rules.

Personal privacy is the cornerstone of the digital future people want. Americans deserve comprehensive protections for personal information – protections that can’t be signed, or check-boxed, away. The Center for Democracy & Technology wants to help craft those legal principles to solidify Americans’ digital privacy rights for the first time.


Chris Baker

Chris Baker is Senior Vice President and General Manager of EMEA at Box.

Last year saw data privacy hit the headlines as businesses and consumers alike were forced to navigate the implementation of GDPR. But it’s far from over.

2019 will be the year that the rest of the world catches up to the legislative example set by Europe, as similar data regulations come to the forefront. Organizations must ensure they are compliant with regional data privacy regulations, and more GDPR-like policies will start to have an impact. This can present a headache when it comes to data management, especially if you’re operating internationally. However, customers will have trust in a business when they are given more control over how their data is used and processed, and customers can rest assured knowing that no matter where they are in the world, businesses must meet the highest bar possible when it comes to data security.

Starting with the U.S., 2019 will see larger corporations opt in to GDPR to support global business practices. At the same time, local data regulators will lift large sections of the EU legislative framework and implement these rules in their own countries. 2018 was the year of GDPR in Europe, and 2019 will be the year of GDPR globally.


Christopher Wolf

Christopher Wolf is the Founder and Chair of the Future of Privacy Forum think tank, and is senior counsel at Hogan Lovells focusing on internet law, privacy and data protection policy.

With the EU GDPR in effect since last May (setting a standard other nations are emulating), with the adoption of a highly regulatory and broadly applicable state privacy law in California last summer (and similar laws adopted or proposed in other states), and with intense focus on the data collection and sharing practices of large tech companies, the time may have come for Congress to adopt a comprehensive federal privacy law. Complicating the adoption of a federal law will be the issue of preemption of state laws and what to do with the highly developed sectoral laws like HIPAA and Gramm-Leach-Bliley. Also to be determined is the expansion of FTC regulatory powers. Regardless of the outcome of the debate over a new federal privacy law, the issue of the privacy and protection of personal data is unlikely to recede.

Jul
23
2018
--

SessionM customer loyalty data aggregator snags $23.8M investment

SessionM announced a $23.8 million Series E investment led by Salesforce Ventures. A bushel of existing investors including Causeway Media Partners, CRV, General Atlantic, Highland Capital and Kleiner Perkins Caufield & Byers also contributed to the round. The company has now raised over $97 million.

At its core, SessionM aggregates loyalty data for brands to help them understand their customer better, says company co-founder and CEO Lars Albright. “We are a customer data and engagement platform that helps companies build more loyal and profitable relationships with their consumers,” he explained.

Essentially, that means they are pulling data from a variety of sources and helping brands offer customers more targeted incentives, offers and product recommendations. “We give [our users] a holistic view of that customer and what motivates them,” he said.

To achieve this, SessionM takes advantage of machine learning to analyze the data stream and integrates with partner platforms like Salesforce, Adobe and others. This certainly fits in with Adobe’s goal to build a customer service experience system of record and Salesforce’s acquisition of Mulesoft in March to integrate data from across an organization, all in the interest of better understanding the customer.

When it comes to using data like this, especially with the advent of GDPR in the EU in May, Albright recognizes that companies need to be more careful with data, and that it has really enhanced the sensitivity around stewardship for all data-driven businesses like his.

“We’ve been at the forefront of adopting the right product requirements and features that allow our clients and businesses to give their consumers the necessary control to be sure we’re complying with all the GDPR regulations,” he explained.

The company was not discussing valuation or revenue. Its most recent round prior to today’s announcement was a Series D in 2016 for $35 million, also led by Salesforce Ventures.

SessionM, which was founded in 2011, has around 200 employees with headquarters in downtown Boston. Customers include Coca-Cola, L’Oreal and Barney’s.

Jun
25
2018
--

BigID scores $30 million Series B months after closing A round

BigID announced a big $30 million Series B round today, which comes on the heels of closing their $14M A investment in January. It’s been a whirlwind year for the NYC data security startup as GDPR kicked in and companies came calling for their products.

The round was led by Scale Venture Partners with participation from previous investors ClearSky Security, Comcast Ventures, Boldstart Ventures, Information Venture Partners and SAP.io.

BigID has a product that helps companies inventory their data, even extremely large data stores, and identify the most sensitive information, a convenient feature at a time when GDPR data privacy rules, which went into effect at the end of May, require that companies doing business in the EU have a grip on their customer data.

That’s certainly something that caught the eye of Ariel Tseitlin from Scale Venture Partners. “We talked to a lot of companies, how they feel more specifically about GDPR, and more broadly about how they think about data within their organizations, and we got a very strong signal that there is a lot of concern around the regulation and how to prepare for that, but also more fundamentally, that CIOs and chief data officers don’t have a good sense of where data resides within their organizations,” he explained.

Dimitri Sirota, CEO and co-founder, says that GDPR is a nice business driver, but he sees the potential to grow the data security market much more broadly than simply as a way to comply with one regulatory ruling or another. He says that American companies are calling, even some without operations in Europe, because they see getting a grip on their customer data as a fundamental business imperative.

BigID product collage. Graphic: BigID

The company plans to expand its partner go-to-market strategy in the coming months, another approach that could translate to increased sales. That will include global systems integrators. Sirota says to expect announcements involving the usual suspects in the coming months. “You’ll see over the next little bit, several announcements with many of the names that you’re familiar with in terms of go-to market and global relationships,” he said.

Finally there are the strategic investors in this deal, including Comcast and SAP, which Sirota thinks will also ultimately help them get enterprise deals they might not have landed up until now. The $30 million runway also gives customers who might have been skittish about dealing with a young-ish startup, more confidence to make the deal.

BigID seems to have the right product at the right time. Scale’s Tseitlin, who will join the board as part of the deal, certainly sees the potential of this company to scale far beyond its current state.

“The area where we tend to spend a lot of time, and I think is what attracted Dimitri to having us as an investor, is that we really help with the scaling phase of company growth,” he said. True to their name, Scale tries to get the company to that next level beyond product/market fit to where they can deliver consistently and continually grow revenue. They have done this with Box and DocuSign and others and hope that BigID is next.

Jun
04
2018
--

Egnyte releases one-step GDPR compliance solution

Egnyte has always had the goal of protecting data and files wherever they live, whether on-premises or in the cloud. Today, the company announced a new feature to help customers comply, in a straightforward fashion, with the GDPR privacy regulations that went into effect in Europe last week.

You can start by simply telling Egnyte that you want to turn on “Identify sensitive content.” You then select which sets of rules you want to check for compliance including GDPR. Once you do this, the system goes and scans all of your repositories to find content deemed sensitive under GDPR rules (or whichever other rules you have selected).

It then gives you a list of files and marks them with a risk factor from 1 to 9, with 1 being the lowest level of risk and 9 the highest. You can configure the program to expose whichever files you wish based on your own level of compliance tolerance. So for instance, you could ask to see any files with a risk level of 7 or higher.
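
The threshold idea can be illustrated with a minimal Python sketch. It assumes a hypothetical list of scan results, each with a risk score from 1 to 9; the `ScannedFile` type, the `files_at_or_above` function and the sample data are illustrative names invented here, not part of Egnyte’s product or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScannedFile:
    """Hypothetical scan result: a file path and a risk score from 1 (lowest) to 9 (highest)."""
    path: str
    risk: int


def files_at_or_above(files, threshold):
    """Return the files whose risk score meets or exceeds the threshold, riskiest first."""
    if not 1 <= threshold <= 9:
        raise ValueError("threshold must be between 1 and 9")
    flagged = [f for f in files if f.risk >= threshold]
    return sorted(flagged, key=lambda f: f.risk, reverse=True)


# Made-up sample data, for illustration only.
scan_results = [
    ScannedFile("hr/payroll.xlsx", 9),
    ScannedFile("marketing/brochure.pdf", 2),
    ScannedFile("sales/contacts.csv", 7),
]

# Show only files with a risk level of 7 or higher.
for f in files_at_or_above(scan_results, threshold=7):
    print(f.risk, f.path)
```

Lowering the threshold surfaces more files for review; raising it narrows the list to the riskiest content, which is the tolerance trade-off the configuration option describes.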

“In essence, it’s a data security and governance solution for unstructured data, and we are approaching that at the repository levels. The goal is to provide visibility, control and protection of that information in any unstructured repository,” Jeff Sizemore, VP of governance for Egnyte Protect, told TechCrunch.

Sizemore says that Egnyte weighs the sensitivity of the data against the danger it could be exposed and leave a customer in violation of GDPR rules. “We look at things like public links into groups, which is basically just governance of the data, making sure nothing is wide open from a file share perspective. We also look at how the information is being shared,” Sizemore said. A social security number being shared internally is a lot less risky than a thousand social security numbers being shared in a public link.

The service covers 28 nations and 24 languages and it’s pre-configured to understand what data is considered sensitive by country and language. “We already have all the mapping and all the languages sitting underneath these policies. We are literally going into the data and actually scanning through and looking for GDPR-relevant data that’s in the scope of Article 40.”

The new service is generally available on Tuesday morning. The company will be making an announcement at the InfoSecurity Conference in London. It has had the service in beta prior to this.

May
24
2018
--

Box expands Zones to manage content in multiple regions

When Box announced Zones a couple of years ago, it was providing a way for customers to store data outside the U.S., but there were some limits. Each customer could choose the U.S. and one additional zone. Customers wanted more flexibility, and today the company announced it was allowing them to choose multiple zones.

The new feature gives a company the ability to store content across any of the seven zones (plus the U.S.) that Box currently supports across the world. A zone is essentially a Box co-location datacenter partner in various locations. The customer can now choose a default zone and then manage multiple zones from a single customer ID in the Box admin console, according to Jeetu Patel, chief product officer at Box.

Current Box Zones. Photo: Box

Content will go to a defined default zone unless the admin creates rules specifying another location. In terms of data sovereignty, the file will always live in the country of record, even if an employee outside that country has access to it. From an end user perspective, they won’t know where the content lives if the administrators allow access to it.
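
As a rough illustration of the default-plus-rules behavior described above, here is a minimal Python sketch. It assumes rules are expressed as folder prefixes mapped to zone labels and that the first matching rule wins; the `resolve_zone` function, the rule format and the zone labels are hypothetical and are not Box’s actual API or configuration.

```python
def resolve_zone(path, rules, default_zone):
    """Pick the storage zone for a path: the first matching rule wins, else the default zone."""
    for prefix, zone in rules:
        if path.startswith(prefix):
            return zone
    return default_zone


# Hypothetical admin rules as (folder prefix, zone label) pairs. Labels are illustrative.
rules = [
    ("/hr/germany/", "DE"),
    ("/finance/japan/", "JP"),
]

# A file under a ruled folder goes to that rule's zone.
print(resolve_zone("/hr/germany/contracts.pdf", rules, default_zone="US"))

# Anything else falls through to the default zone.
print(resolve_zone("/marketing/plan.docx", rules, default_zone="US"))
```

In this sketch the zone is decided by where the content is filed, not by who opens it, which mirrors the point that a file stays in its country of record even when an employee elsewhere has access.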

This may not seem like a huge deal on its face, but from a content management standpoint, it presented some challenges. Patel says the company designed the product with this ability in mind from the start, but it took some development time to get there.

“When we launched Zones we knew we would [eventually require] multi-zone capability, and we had to make sure the architecture could handle that,” Patel explained. They did this by abstracting the architecture to separate the storage and business logic tiers. Creating this modular approach allowed them to increase the capabilities as they built out Zones.

It doesn’t hurt that this feature is being made available just days before the EU’s GDPR data privacy rules are going into effect. “Zones is not just for GDPR, but it does help customers meet their GDPR obligations,” Patel said.

Overall, Zones is part of Box’s strategy to provide content management services in the cloud and give customers, even regulated industries, the ability to control how that content is used. This expansion is one more step on that journey.

May
05
2018
--

Data Protection and GDPR

The cat picture (Bella) is just to soothe you because this isn't a thrilling post, but I feel this is important information you should be aware of.

Data Protection and Privacy has been in the news a lot in recent years, what with the Facebook/Cambridge Analytica scandal, web site breaches galore, and now the introduction of the GDPR in Europe. For those that care, that's the General Data Protection Regulation that governs the gathering, use and security of data.

Yeah, boring stuff. Stop yawning at the back! If you want to waste a weekend, go and Google GDPR.

Meantime, and this is where I want you to start reading:

For most of you, the only personal information I store is your email address, but I take data privacy and security extremely seriously.

I never sell or share your email address or ANY personal information.

If you want to know more about the privacy and protection of your data, please read my privacy policy.

Apr
21
2018
--

BigID lands in the right place at the right time with GDPR

Every startup needs a little skill and a little luck. BigID, a NYC-based data governance solution, has been blessed with both. The company, which helps customers identify sensitive data in big data stores, launched at just about the same time that the EU announced the GDPR data privacy regulations. Today, the company is having trouble keeping up with the business.

While you can’t discount that timing element, you have to have a product that actually solves a problem and BigID appears to meet that criteria. “This how the market is changing by having and demanding more technology-based controls over how data is being used,” company CEO and co-founder Dimitri Sirota told TechCrunch.

Sirota’s company enables customers to identify the most sensitive data from among vast stores of data. In fact, he says some customers have hundreds of millions of users, but their unique advantage is having built the solution more recently. That provides a modern architecture that can scale to meet these big data requirements, while identifying the data that requires your attention in a way that legacy systems just aren’t prepared to do.

“When we first started talking about this [in 2016] people didn’t grok it. They didn’t understand why you would need a privacy-centric approach. Even after 2016 when GDPR passed, most people didn’t see this. [Today] we are seeing a secular change. The assets they collect are valuable, but also incredibly toxic,” he said. It is the responsibility of the data owner to identify and protect the personal data under their purview under the GDPR rules, and that creates a data double-edged sword because you don’t want to be fined for failing to comply.

GDPR is a set of data privacy regulations that are set to take effect in the European Union at the end of May. Companies have to comply with these rules or could face stiff fines. The thing is GDPR could be just the beginning. The company is seeing similar data privacy regulations in Canada, Australia, China and Japan. Something akin to this could also be coming to the United States after Facebook CEO Mark Zuckerberg appeared before Congress earlier this month. At the very least we could see state-level privacy laws in the US, Sirota said.

Sirota says there are challenges getting funded as a NYC startup because there hadn’t been a strong big enterprise ecosystem in place until recently, but that’s changing. “Starting an enterprise company in New York is challenging. Ed Sim from Boldstart [A New York City early stage VC firm that invests in enterprise startups] has helped educate through investment and partnerships. More challenging, but it’s reaching a new level now,” he said.

The company launched in 2016 and has raised $16.1 million to date. It scored the bulk of that in a $14 million round at the end of January. Just this week at the RSAC Sandbox competition at the RSA Conference in San Francisco, BigID was named the Most Innovative Startup in a big recognition of the work they are doing around GDPR.

Mar
13
2018
--

Don’t Get Hit with a Database Disaster: Database Security Compliance


In this post, we discuss database security compliance, what you should be looking at and where to get more information.

As Percona’s Chief Customer Officer, I get the opportunity to talk with a lot of customers. Hearing first-hand about the problems their technical teams face, as well as the business challenges their companies experience, is incredibly valuable for understanding what the market is facing in general. Not every problem you see has a purely technical solution, and not every good technical solution solves the core business problem.

Matt Yonkovit, Percona CCO

As database technology advances and data continues to be the lifeblood of most modern applications, DBAs will have a say in business level strategic planning more than ever. This coincides with the advances in technology and automation that make many classic manual “DBA” jobs and tasks obsolete. Traditional DBAs are evolving into a blend of system architect, data strategist and master database architect. I want to talk about the business problems that not only the C-Suite care about, but DBAs as a whole need to care about in the near future.

Let’s start with one topic everyone should have near the top of their list: security.

We did a recent survey of our customers, and their biggest concern right now is security and compliance.

Not long ago, most DBAs I knew dismissed this topic as “someone else’s problem” (I remember being told that the database is only as secure as the network, so fix the network!). Long gone are the days when network security was enough. Even the DBAs who did worry about security only did so within the limited scope of what the database system could provide out of the box. Again, not enough.

So let me run an experiment:

Raise your hand if your company has some bigger security initiative this year. 

I’m betting a lot of you raised your hand!

Security is not new to the enterprise. It’s been a priority for years now. However, it has not been receiving a hyper-focus in the open source database space until the last three years or so. Why? There have been a number of high profile database security breaches in the last year, all highlighting a need for better database security. This series of serious data breaches has exposed how fragile some companies’ security protocols are. If that was not enough, new government regulations and laws have made data protection non-optional. This means you have to take the security of your database seriously, or there could be fines and penalties.

Government regulations are nothing new, but the breadth and depth of these are growing and are opening up a whole new challenge for database systems and administrators. GDPR was signed into law two years ago (you can read more here: https://en.wikipedia.org/wiki/General_Data_Protection_Regulation and https://www.dataiq.co.uk/blog/summary-eu-general-data-protection-regulation) and is scheduled to take effect on May 25, 2018. This has many businesses scrambling not only to understand the impact, but to figure out how they need to comply. These regulations redefine simple things, like what constitutes “personal data” (for instance, your anonymous buying preferences or location history even without your name).

New requirements also mean some areas get a bit more complicated as they approach the gray area of definition. For instance, GDPR guarantees the right to be forgotten. What does this mean? In theory, it means end-users can request that all their personal information is removed from your systems as if they did not exist. Seems simple, but in reality, you can go as far down the rabbit hole as you want. Does your application support this already? What about legacy applications? Even if the apps can handle it, does this mean previously taken database backups have to forget you as well? There is a lot to process for sure.
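To make the application-level part of that question concrete, here is a minimal sketch of an erasure routine. It assumes a hypothetical two-table schema (`users` and `orders`) in an in-memory SQLite database; the table names, columns, and the `forget_user` helper are illustrative assumptions, not taken from any real application. It deliberately does not touch backups, replicas, or logs, which is exactly where the rabbit hole starts.

```python
import sqlite3

# Hypothetical schema for illustration only: one table of users and one
# table of rows that reference them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, item TEXT NOT NULL);
INSERT INTO users VALUES (1, 'alice@example.com'), (2, 'bob@example.com');
INSERT INTO orders VALUES (10, 1, 'book'), (11, 2, 'lamp');
""")


def forget_user(conn, user_id):
    """Delete one user's rows from every table that holds them.

    Both deletes run in a single transaction, so the user is either
    fully removed from the live database or not removed at all.
    """
    with conn:  # commits on success, rolls back on an exception
        conn.execute("DELETE FROM orders WHERE user_id = ?", (user_id,))
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))


forget_user(conn, 1)
remaining_users = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
remaining_orders = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE user_id = 1"
).fetchone()[0]
print(remaining_users, remaining_orders)  # 1 0
```

Even in this toy case, the routine has to know every table that carries the user's data, which is why an inventory of where personal information lives tends to come before any deletion code.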

So what are the things you can do?

  1. Educate yourself and understand expectations, even if you weren’t involved in compliance discussions before.
  2. Start working on incremental improvements now on your data security. This is especially true in the areas where you have some control, without massive changes to the application. Encryption at rest is a great place to start if you don’t have it.
  3. Start talking with others in the organization about how to identify and protect personal information.
  4. Look to increase security by default by getting involved in new applications early in the design phase.

The good news is you are not alone in tackling this challenge. Every company must address it. Because of this focus on security, we felt strongly about ensuring we had a security track at Percona Live 2018 this year. These talks from Fastly, Facebook, Percona, and others provide information on how companies around the globe are tackling these security issues. In true open source fashion, we are better when we learn and grow from one another.

What are the Percona Live 2018 security talks?

We have a ton of great security content this year at Percona Live, across a bunch of technologies and open source software. Some of the more interesting Percona Live 2018 security talks are:

Want to attend Percona Live 2018 security talks? Register for Percona Live 2018. Register now to get the best price! Use the discount code SeeMeSpeakPL18 for 10% off.

Percona Live Open Source Database Conference 2018 is the premier open source event for the data performance ecosystem. It is the place to be for the open source community. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference will be April 23-25, 2018 at the Hyatt Regency Santa Clara & The Santa Clara Convention Center.

Mar
09
2018
--

InfoSum’s first product touts decentralized big data insights

Nick Halstead’s new startup, InfoSum, is launching its first product today — moving one step closer to his founding vision of a data platform that can help businesses and organizations unlock insights from big data silos without compromising user privacy, data security or data protection law. So a pretty high bar then.

If the underlying tech lives up to the promises being made for it, the timing for this business looks very good indeed, with the European Union’s new General Data Protection Regulation (GDPR) mere months away from applying across the region — ushering in a new regime of eye-wateringly large penalties to incentivize data handling best practice.

InfoSum bills its approach to collaboration around personal data as fully GDPR compliant — because it says it doesn’t rely on sharing the actual raw data with any third parties.

Rather a mathematical model is used to make a statistical comparison, and the platform delivers aggregated — but still, says Halstead — useful insights. Though he says the regulatory angle is fortuitous, rather than the full inspiration for the product.

“Two years ago, I saw that the world definitely needed a different way to think about working on knowledge about people,” he tells TechCrunch. “Both for privacy [reasons] — there isn’t a week where we don’t see some kind of data breach… they happen all the time — but also privacy isn’t enough by itself. There has to be a commercial reason to change things.”

The commercial imperative he reckons he’s spied is around how “unmanageable” big data can become when it’s pooled for collaborative purposes.

Datasets invariably need a lot of cleaning up to make different databases align and overlap. And the process of cleaning and structuring data so it can be usefully compared can run to multiple weeks. Yet that effort has to be put in before you really know if it will be worth your while doing so.

That snag of time + effort is a major barrier preventing even large companies from doing more interesting things with their data holdings, argues Halstead.

So InfoSum’s first product — called Link — is intended to give businesses a glimpse of the “art of the possible”, as he puts it — in just a couple of hours, rather than the “nine, ten weeks” he says it might otherwise take them.

“I set myself a challenge… could I get through the barriers that companies have around privacy, security, and the commercial risks when they handle consumer data. And, more importantly, when they need to work with third parties or need to work across their corporation where they’ve got numbers of consumer data and they want to be able to look at that data and look at the combined knowledge across those.

“That’s really where I came up with this idea of non-movement of data. And that’s the core principle of what’s behind InfoSum… I can connect knowledge across two data sets, as if they’ve been pooled.”

Halstead says that the problem with the traditional data pooling route — so copying and sharing raw data with all sorts of partners (or even internally, thereby expanding the risk vector surface area) — is that it’s risky. The myriad data breaches that regularly make headlines nowadays are a testament to that.

But that’s not the only commercial consideration in play, as he points out that raw data which has been shared is immediately less valuable — because it can’t be sold again.

“If I give you a data set in its raw form, I can’t sell that to you again — you can take it away, you can slice it and dice it as many ways as you want. You won’t need to come back to me for another three or four years for that same data,” he argues. “From a commercial point of view [what we’re doing] makes the data more valuable. In that data is never actually having to be handed over to the other party.”

Not blockchain for privacy

Decentralization, as a technology approach, is also of course having a major moment right now — thanks to blockchain hype. But InfoSum is definitely not blockchain. Which is a good thing. No sensible person should be trying to put personal data on a blockchain.

“The reality is that all the companies that say they’re doing blockchain for privacy aren’t using blockchain for the privacy part, they’re just using it for a trust model, or recording the transactions that occur,” says Halstead, discussing why blockchain is terrible for privacy.

“Because you can’t use the blockchain and say it’s GDPR compliant or privacy safe. Because the whole transparency part of it and the fact that it’s immutable. You can’t have an immutable database where you can’t then delete users from it. It just doesn’t work.”

Instead he describes InfoSum’s technology as “blockchain-esque” — because “everyone stays holding their data”. “The trust is then that because everyone holds their data, no one needs to give their data to everyone else. But you can still crucially, through our technology, combine the knowledge across those different data sets.”

So what exactly is InfoSum doing to the raw personal data to make it “privacy safe”? Halstead claims it goes “beyond hashing” or encrypting it. “Our solution goes beyond that — there is no way to re-identify any of our data because it’s not ever represented in that way,” he says, further claiming: “It is absolutely 100 per cent data isolation, and we are the only company doing this in this way.

“There are solutions out there where traditional models are pooling it but with encryption on top of it. But again if the encryption gets broken the data is still ending up being in a single silo.”

InfoSum’s approach is based on mathematically modeling users, using a “one way model”, and using that to make statistical comparisons and serve up aggregated insights.

“You can’t read things out of it, you can only test things against it,” he says of how it’s transforming the data. “So it’s only useful if you actually knew who those users were beforehand — which obviously you’re not going to. And you wouldn’t be able to do that unless you had access to our underlying code-base. Everyone else either uses encryption or hashing or a combination of both of those.”

This one-way modeling technique is in the process of being patented — so Halstead says he can’t discuss the “fine details” — but he does mention a long standing technique for optimizing database communications, called bloom filters, saying those sorts of “principles” underpin InfoSum’s approach.

Although he also says it’s using those kind of techniques differently. Here’s how InfoSum’s website describes this process (which it calls Quantum):

InfoSum Quantum irreversibly anonymises data and creates a mathematical model that enables isolated datasets to be statistically compared. Identities are matched at an individual level and results are collated at an aggregate level – without bringing the datasets together.
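For readers unfamiliar with the Bloom filters Halstead mentions, the sketch below shows the general idea: a structure you can test values against but cannot read values out of. This is a textbook Bloom filter, not InfoSum's patent-pending method, and the class name, bit-array size, and hash count are arbitrary assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """A tiny Bloom filter: supports add() and membership tests only."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely not added"; True means "probably added".
        return all(self.bits[pos] for pos in self._positions(item))


bloom = BloomFilter()
for email in ["alice@example.com", "bob@example.com"]:
    bloom.add(email)

print(bloom.might_contain("alice@example.com"))  # True: added items always match
```

The filter stores only set bits, so the original strings cannot be listed back out of it. Someone who already holds a candidate value can still test it, and a small rate of false positives is possible.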

On the surface, the approach shares a similar structure to Facebook’s Custom Audiences Product, where advertisers’ customer lists are locally hashed and then uploaded to Facebook for matching against its own list of hashed customer IDs — with any matches used to create a custom audience for ad targeting purposes.
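The hashed-list matching described above can be sketched in a few lines. This is a simplified illustration of the general technique, not Facebook's actual pipeline: the normalization rule (trim and lowercase), the choice of SHA-256, and the function names are assumptions made for the example.

```python
import hashlib


def normalize(email):
    # Both sides must normalize identically or equal addresses won't match.
    return email.strip().lower()


def hash_email(email):
    return hashlib.sha256(normalize(email).encode("utf-8")).hexdigest()


def hashed_list(emails):
    return {hash_email(e) for e in emails}


# The advertiser hashes its customer list locally before uploading it.
advertiser_hashes = hashed_list(["Alice@Example.com ", "carol@example.com"])
# The platform holds hashes of its own users' identifiers.
platform_hashes = hashed_list(["alice@example.com", "bob@example.com"])

# Matching is a set intersection over hashes, never over raw emails.
matches = advertiser_hashes & platform_hashes
print(len(matches))  # 1
```

Because the same input always produces the same hash, anyone who can guess an email address can check whether it is in the list, which is one reason plain hashing is weaker than it first appears.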

Though Halstead argues InfoSum’s platform offers more for even this kind of audience building marketing scenario, because its users can use “much more valuable knowledge” to model on — knowledge they would not comfortably share with Facebook “because of the commercial risks of handing over that first person valuable data”.

“For instance if you had an attribute that defined which were your most valuable customers, you would be very unlikely to share that valuable knowledge — yet if you could safely then it would be one of the most potent indicators to model upon,” he suggests.

He also argues that InfoSum users will be able to achieve greater marketing insights via collaborations with other users of the platform vs being a customer of Facebook Custom Audiences — because Facebook simply “does not open up its knowledge”.

“You send them your customer lists, but they don’t then let you have the data they have,” he adds. “InfoSum for many DMPs [data management platforms] will allow them to collaborate with customers so the whole purchasing of marketing can be much more transparent.”

He also emphasizes that marketing is just one of the use-cases InfoSum’s platform can address.

Decentralized bunkers of data

One important clarification: InfoSum customers’ data does get moved — but it’s moved into a “private isolated bunker” of their choosing, rather than being uploaded to a third party.

“The easiest one to use is where we basically create you a 100 per cent isolated instance in Amazon [Web Services],” says Halstead. “We’ve worked with Amazon on this so that we’ve used a whole number of techniques so that once we create this for you, you put your data into it — we don’t have access to it. And when you connect it to the other part we use this data modeling so that no data then moves between them.”

“The ‘bunker’ is… an isolated instance,” he adds, elaborating on how communications with these bunkers are secured. “It has its own firewall, a private VPN, and of course uses standard SSL security. And once you have finished normalising the data it is turned into a form in which all PII [personally identifiable information] is deleted.

“And of course like any other security related company we have had independent security companies penetration test our solution and look at our architecture design.”

Other key pieces of InfoSum’s technology are around data integration and identity mapping — aimed at tackling the (inevitable) problem of data in different databases/datasets being stored in different formats. Which again is one of the commercial reasons why big data silos often stay just that: Silos.

Halstead gave TechCrunch a demo showing how the platform ingests and connects data, with users able to use “simple steps” to teach the system what is meant by data types stored in different formats — such as that ‘f’ means the same as ‘female’ for gender category purposes — to smooth the data mapping and “try to get it as clean as possible”.
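The kind of mapping described here, where ‘f’ is taught to mean the same as ‘female’, amounts to normalizing each dataset's values to a shared vocabulary before comparing them. The sketch below is a hypothetical illustration of that step only; the alias table and function name are assumptions, not InfoSum's implementation.

```python
# Hypothetical alias table: maps raw spellings to one canonical label.
GENDER_ALIASES = {
    "f": "female",
    "female": "female",
    "m": "male",
    "male": "male",
}


def normalize_gender(raw):
    """Return the canonical label, or None when the value is unrecognized."""
    return GENDER_ALIASES.get(raw.strip().lower())


dataset_a = ["F", "m", "female"]
dataset_b = ["Female", "MALE", "x"]

print([normalize_gender(v) for v in dataset_a])  # ['female', 'male', 'female']
print([normalize_gender(v) for v in dataset_b])  # ['female', 'male', None]
```

Returning None for unknown values, rather than guessing, leaves it to the user to decide how unmapped entries should be handled.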

Once that step has been completed, the user (or collaborating users) are able to get a view on how well linked their data sets are — and thus to glimpse “the start of the art of the possible”.

In practice this means they can choose to run different reports atop their linked datasets — such as if they want to enrich their data holdings by linking their own users across different products to gain new insights, such as for internal research purposes.

Or, where there’s two InfoSum users linking different data sets, they could use it for propensity modeling or lookalike modeling of customers, says Halstead. So, for example, a company could link models of their users with models of the users of a third party that holds richer data on its users to identify potential new customer types to target marketing at.

“Because I’ve asked to look at the overlap I can literally say I only know the gender of these people but I would also like to know what their income is,” he says, fleshing out another possible usage scenario. “You can’t drill into this, you can’t do really deep analytics — that’s what we’ll be launching later. But Link allows you to get this idea of what would it look like if I combine our datasets.

“The key here is it’s opening up a whole load of industries where sensitivity around doing this — and where, even in industries that share a lot of data already but where GDPR is going to be a massive barrier to it in the future.”

Halstead says he expects big demand from the marketing industry which is of course having to scramble to rework its processes to ensure they don’t fall foul of GDPR.

“Within marketing there is going to be a whole load of new challenges for companies where they were currently enhancing their databases, buying up large raw datasets and bringing their data into their own CRM. That world’s gone once we’ve got GDPR.

“Our model is safer, faster, and actually still really lets people do all the things they did before but while protecting the customers.”

But it’s not just marketing exciting him. Halstead believes InfoSum’s approach to lifting insights from personal data could be very widely applicable — arguing, for example, that it’s only a minority of use-cases, such as credit risk and fraud within banking, where companies actually need to look at data at an individual level.

One area he says he’s “very passionate” about InfoSum’s potential is in the healthcare space.

“We believe that this model isn’t just about helping marketing and helping a whole load of others — healthcare especially for us I think is going to be huge. Because [this affords] the ability to do research against health data where health data has never been actually shared,” he says.

“In the UK especially we’ve had a number of massive false starts where companies have, for very good reasons, wanted to be able to look at health records and combine data — which can turn into vital research to help people. But actually their way of doing it has been about giving out large datasets. And that’s just not acceptable.”

He even suggests the platform could be used for training AIs within the isolated bunkers — flagging a developer interface that will be launching after Link which will let users query the data as a traditional SQL query.

Though he says he sees most initial healthcare-related demand coming from analytics that need “one or two additional attributes” — such as, for example, comparing health records of people with diabetes with activity tracker data to look at outcomes for different activity levels.

“You don’t need to drill down into individuals to know that the research capabilities could give you incredible results to understand behavior,” he adds. “When you do medical research you need bodies of data to be able to prove things so the fact that we can only work at an aggregate level is not, I don’t think, any barrier to being able to do the kind of health research required.”

Another area he believes could really benefit is M&A — saying InfoSum’s platform could offer companies a way to understand how their user bases overlap before they sign on the line. (It is also of course handling and thus simplifying the legal side of multiple entities collaborating over data sets.)

“There hasn’t been the technology to allow them to look at whether there’s an overlap before,” he claims. “It puts the power in the hands of the buyer to be able to say we’d like to be able to look at what your user base looks like in comparison to ours.

“The problem right now is you could do that manually but if they then backed out there’s all kinds of legal problems because I’ve had to hand the raw data over… so no one does it. So we’re going to change the M&A market for allowing people to discover whether I should acquire someone before they go through to the data room process.”

While Link is something of a taster of what InfoSum’s platform aims to ultimately offer (with this first product priced low but not freemium), the SaaS business it’s intending to get into is data matchmaking — whereby, once it has a pipeline of users, it can start to suggest links that might be interesting for its customers to explore.

“There is no point in us reinventing the wheel of being the best visualization company because there’s plenty that have done that,” he says. “So we are working on data connectors for all of the most popular BI tools that plug in to then visualize the actual data.

“The long term vision for us moves more into being more of an introductory service — i.e. once we’ve got 100 companies in this how do we help those companies work out what other companies that they should be working with.”

“We’ve got some very good systems for — in a fully anonymized way — helping you understand what the intersection is from your data to all of the other datasets, obviously with their permission if they want us to calculate that for them,” he adds.

“The way our investors looked at this, this is the big opportunity going forward. There is no limit, in a decentralized world… imagine 1,000 bunkers around the world in these different corporates who all can start to collaborate. And that’s our ultimate goal — that all of them are still holding onto their own knowledge, 100% privacy safe, but then they have that opportunity to work with each other, which they don’t right now.”

Engineering around privacy risks?

But does he not see any risks to privacy of enabling the linking of so many separate datasets — even with limits in place to avoid individuals being directly outed as connected across different services?

“However many data sets there are the only thing it can reveal extra is whether every extra data has an extra bit of knowledge,” he responds on that. “And every party has the ability to define  what bit of data they would then want to be open to others to then work on.

“There are obviously sensitivities around certain combinations of attributes, around religion, gender and things like that. Where we already have a very clever permission system where the owners can define what combinations are acceptable and what aren’t.”

“My experience of working with all the social networks has meant — I hope — that we are ahead of the game of thinking about those,” he adds, saying that the matchmaking stage is also six months out at this point.

“I don’t see any down sides to it, as long as the controls are there to be able to limit it. It’s not like it’s going to be a sudden free for all. It’s an introductory service, rather than an open platform so everyone can see everything else.”

The permission system is clearly going to be important. But InfoSum does essentially appear to be heading down the platform route of offloading responsibility for ethical considerations — in its case around dataset linkages — to its customers.

Which does open the door to problematic data linkages down the line, and all sorts of unintended dots being joined.

Say, for example, a health clinic decides to match people with particular medical conditions to users of different dating apps — and the relative proportions of HIV rates across straight and gay dating apps in the local area gets published. What unintended consequences might spring from that linkage being made?

Other equally problematic linkages aren’t hard to imagine. And we’ve seen the appetite businesses have for making creepy observations about their users public.

“Combining two sets of aggregate data meaningfully is not easy,” says Eerke Boiten, professor of cyber security at De Montfort University, discussing InfoSum’s approach. “If they can make this all work out in a way that makes sense, preserves privacy, and is GDPR compliant, then they deserve a patent I suppose.”

On data linkages, Boiten points to the problems Facebook has had with racial profiling as illustrative of the potential pitfalls.

He also says there may also be GDPR-specific risks around customer profiling enabled by the platform. In an edge case scenario, for example, where two overlapped datasets are linked and found to have a 100% user match, that would mean people’s personal data had been processed by default — so that processing would have required a legal basis to be in place beforehand.

And there may be wider legal risks around profiling too. If, for example, linkages are used to deny services or vary pricing to certain types or blocks of customers, is that legal or ethical?

“From a company’s perspective, if it already has either consent or a legitimate purpose (under GDPR) to use customer data for analytical/statistical purposes then it can use our products,” says InfoSum’s COO Danvers Baillieu, on data processing consent. “Where a company has an issue using InfoSum as a sub-processor, then… we can set up the system differently so that we simply supply the software and they run it on their own machines (so we are not a data processor) –- but this is not yet available in Link.”

Baillieu also notes that the bin sizes InfoSum’s platform aggregates individuals into are configurable in its first product. “The default bin size is 10, and the absolute minimum is three,” he adds.

“The other key point around disclosure control is that our system never needs to publish the raw data table. All the famous breaches from Netflix onwards are because datasets have been pseudonymised badly and researchers have been able to run analysis across the visible fields and then figure out who the individuals are — this is simply not possible with our system as this data is never revealed.”
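One common way to apply a minimum bin size is to report a group only when enough individuals fall into it. The sketch below assumes that reading of “bin size”; how InfoSum's platform actually applies the setting is not described in detail. The constants mirror the default of 10 and floor of three that Baillieu cites, while the function and variable names are hypothetical.

```python
from collections import Counter

DEFAULT_BIN_SIZE = 10  # default cited by Baillieu
MIN_BIN_SIZE = 3       # absolute minimum cited by Baillieu


def aggregate(values, bin_size=DEFAULT_BIN_SIZE):
    """Count individuals per group, dropping groups smaller than bin_size."""
    if bin_size < MIN_BIN_SIZE:
        raise ValueError(f"bin size must be at least {MIN_BIN_SIZE}")
    counts = Counter(values)
    return {group: n for group, n in counts.items() if n >= bin_size}


# One attribute value per matched individual (made-up data).
income_bands = ["high"] * 12 + ["low"] * 4

print(aggregate(income_bands))              # {'high': 12}
print(aggregate(income_bands, bin_size=3))  # {'high': 12, 'low': 4}
```

Suppressing small groups means a report never singles out a handful of people, at the cost of hiding rare but possibly interesting segments.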

‘Fully GDPR compliant’ is certainly a big claim — and one that is going to have a lot of slings and arrows thrown at it as data gets ingested by InfoSum’s platform.

It’s also fair to say that a whole library of books could be written about technology’s unintended consequences.

Indeed, InfoSum’s own website credits Halstead as the inventor of the embedded retweet button, noting the technology is “something that is now ubiquitous on almost every website in the world”.

Those ubiquitous social plugins are also of course a core part of the infrastructure used to track Internet users wherever and almost everywhere they browse. So does he have any regrets about the invention, given how that bit of innovation has ended up being so devastating for digital privacy?

“When I invented it, the driving force for the retweet button was only really as a single number to count engagement. It was never to do with tracking. Our version of the retweet button never had any trackers in it,” he responds on that. “It was the number that drove our algorithms for delivering news in a very transparent way.

“I don’t need to add my voice to all the US pundits of the regrets of the beast that’s been unleashed. All of us feel that desire to unhook from some of these networks now because they aren’t being healthy for us in certain ways. And I certainly feel that what we’re now doing for improving the world of data is going to be good for everyone.”

When we first covered the UK-based startup it was going under the name CognitiveLogic — a placeholder name, as three weeks in Halstead says he was still figuring out exactly how to take his idea to market.

The founder of DataSift has not had difficulties raising funding for his new venture. There was an initial $3M from Upfront Ventures and IA Ventures, with the seed topped up by a further $5M last year, with new investors including Saul Klein (formerly Index Ventures) and Mike Chalfen of Mosaic Ventures. Halstead says he’ll be looking to raise “a very large Series A” over the summer.

In the meanwhile he says he has a “very long list” of hundreds of customers wanting to get their hands on the platform to kick its tires. “The last three months has been a whirlwind of me going back to all of the major brands, all of the big data companies, there’s no large corporate that doesn’t have these kinds of challenges,” he adds.

“I saw a very big client this morning… they’re a large multinational, they’ve got three major brands where the three customer sets had never been joined together. So they don’t even know what the overlap of those brands are at the moment. So even giving them that insight would be massively valuable to them.”

Jan
29
2018
--

BigID pulls in $14 million Series A to help identify private customer data across big data stores

As data privacy becomes an increasingly important notion, especially with the EU’s GDPR privacy laws coming online in May, companies need to find ways to understand their customers’ private data. BigID thinks it has a solution and it landed a $14 million Series A investment today to help grow the idea.
Comcast Ventures, SAP (via SAP.io), ClearSky Security Fund and one of the…
