State of Change, Chapter 20: The Demand for Ubiquitous Security
We pretend that the so-called “consumerization of IT” is the most important unresolved technological issue facing us in this still-young 21st century. It isn’t... although that’s certainly the easiest issue with which one may craft headlines. Certainly, it’s the more obvious issue, which we encounter whenever we watch a movie made five years or so ago, where someone uses a clamshell phone that doesn’t get Facebook. It’s like peering into an historical archive, or an issue of Look Magazine.
The real technological issue of our time is tucked away in the background, in the grey area where right and wrong are not so black and white, and the fruit for juicy headlines doesn’t hang so low. It’s the unanswered question of the ownership of data. Put bluntly, to whom does data belong?
It was a simpler question back when databases were stored files locked away in data centers. Not any longer.
“With the cloud, there is no perimeter,” declares Eric Chiu, president of access control systems provider HyTrust. “There is no ability to say, ‘Okay, this person is inside of my data center, he’s got physical card-key access, he has the key for the lock on the rack, and he can manage this set of boxes in this rack enclosure.’ All of that is out the window with virtualization and cloud. The entire environment is remotely administered, which also means that it can be remotely compromised and potentially copied, and it can also be remotely destroyed.”
“In” Put
Access control is perhaps not as explanatory a phrase as it should be. For the manufacturers of doors, a doorknob might be an “access control gateway monitor.” When network security was based on physical oversight of physical resources, and those resources were all self-contained, the most reliable form of access control was local identity. A company should be expected to know its own employees, and even if that list is composed of a few thousand people, it’s easy enough to maintain. Someplace on the user’s PC was a digital certificate that represented her local identity. And someplace in that network was a grouping or organizational unit that said, the people on this short list have the rights to use these programs, access these files, and print to these printers.
The system developed to confirm that identity was called Kerberos, named after a myth of a three-headed dog. Kerberos was functional and reliable because it relied on some outside source — the third head of the proverbial dog — to corroborate the user’s claim to her identity. This is where the certificate authority (CA) first entered the picture.
The validity of identity providers was established through a network of trust: CAs and other providers were able to corroborate that the providers themselves were who they said they were. So for a while, when networks began being networked themselves, there seemed to be a reasonable path toward a system of sharing trust. If one system validated the identity of a user and that system was trusted by another system, that identity could travel to that other system without necessarily being challenged. These networks of trust were for a time called trees, and Microsoft suggested that a federation of such trees be called, naturally, a forest. It seemed logical enough at the time.
But in 2010, it suddenly became difficult to see the trees on account of the forest. For a federation to work on a cross-network scale, it needed a common language. Literally dozens of companies and coalitions sought to provide that language, and all of them gave it a shot. The result, if you can imagine it on audio, sounded something like lunch at the UN Commissary.
The Web and the cloud differ from each other in one important respect: While the Web may have millions of users at any one time, no single system anywhere is responsible for identifying each of them. Individual Web servers may keep track of their own respective lists of users, and more servers these days rely on social services like Facebook, Twitter, and Gmail to validate their logons. But the Web’s core protocol, HTTP, is based on the principle of anonymous access. Even the services that authenticate users and provide security are anonymously accessed. Even encrypted sessions don’t really require access control.
But the cloud is a system of pooled resources, all of which belong to their respective data centers. Cloud computing without some form of access control, however weak it may be under testing, doesn’t work.
So cloud service providers need some common mechanism for learning to trust one another. One reason is that, as I described in the previous article in this series, what may appear to a user to be a single database may actually be a product of several interoperable platforms. For that user to be authenticated and for access to all of that data to be permitted, all of those platforms need to validate that user. If these platforms utilize different methods for authenticating identity, then there needs to be at least one mechanism for resolving all those methods. Otherwise, the user would have to somehow log in to all the platforms individually. Imagine having to do that hundreds of times per day.
“The cloud” must look, feel, and act like one cloud, otherwise it’s just a Web hosting service. Accomplishing this requires every service to appear to belong to one system. Keeping up this appearance requires a service called single sign-on (SSO). The ideal for SSO is that the user signs on once to begin her session, establishing a root of trust for her identity. Every other service (storage, e-mail, ERP, EDI, etc.) would then utilize identity federation to resolve that identity, rather than making her log on successive times.
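As a rough illustration of the sign-on-once idea, the following Python sketch has a hypothetical identity provider sign an assertion at logon, and two hypothetical services accept that assertion rather than challenging the user again. Every name here (issue_assertion, verify_assertion, SHARED_KEY, the two services) is an assumption made for illustration. Real federation protocols generally rely on public-key signatures and much stricter validation than this shared-secret shortcut.

```python
import hashlib
import hmac
import json
import time

# Hypothetical secret shared by the identity provider and the services that
# trust it. This is an illustrative shortcut, not how production SSO works.
SHARED_KEY = b"demo-key-not-for-production"


def issue_assertion(user, ttl_seconds=3600, now=None):
    """Identity provider: sign a statement that `user` has logged on."""
    now = time.time() if now is None else now
    payload = json.dumps({"user": user, "expires": now + ttl_seconds}, sort_keys=True)
    signature = hmac.new(SHARED_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_assertion(assertion, now=None):
    """Service: return the user name if the assertion is genuine and unexpired, else None."""
    now = time.time() if now is None else now
    expected = hmac.new(SHARED_KEY, assertion["payload"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, assertion["signature"]):
        return None  # payload was altered or signed by someone else
    claims = json.loads(assertion["payload"])
    if claims["expires"] < now:
        return None  # the logon session has lapsed
    return claims["user"]


def storage_service(assertion):
    """Hypothetical service that trusts the identity provider's assertion."""
    user = verify_assertion(assertion)
    return f"storage opened for {user}" if user else "storage: please log on"


def mail_service(assertion):
    """A second hypothetical service; no second logon is needed."""
    user = verify_assertion(assertion)
    return f"mailbox opened for {user}" if user else "mail: please log on"
```

The user logs on once by calling issue_assertion, then presents the same assertion to each service. A tampered or expired assertion sends her back to the logon prompt.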
Now the race is on, because where this single sign-on takes place has become a jump ball. A few years ago, this didn’t seem like it would even be an issue. Most office PC users logged onto Windows, and typed their passwords or entered their fingerprints there. This is no longer a given, as many cloud services — including, rather importantly, virtual desktops — can be accessed from tablets. And you don’t normally log onto your tablet.
But you might log onto Facebook, at least at some point, perhaps in the background when you’re not paying attention. This is why social networks are now in competition with identity providers, such as Ping Identity, to serve as the universal checkpoint for the cloud. For social networks, there’s extra incentive to win this race: If they know when their users have logged on, they might be able to learn why. And these facts might be of use to the marketing firms that have become social networks’ biggest clients.
“The whole vendor management circle — from selection to negotiation to implementation, all the way around — you have to think about all these aspects as you move to the cloud,” advises Jared Hamilton, senior manager for security at IT consulting firm Crowe Horwath. With respect to how attendees at a recent security conference should assess the security of a cloud services provider, Hamilton advised:
From an identity and access standpoint, it’s all about account management — how the accounts are set up, how they’re terminated, what are the options for particular vendors. Can you routinely view the accounts, or have the ability to? I’ve seen some systems where it’s confusing to look at who’s set up for what — there’s this big matrix with all these check boxes. How can you get a real report showing, who still has access, how do they use it, what type of logging and monitoring is in play? There’s so much that’s tied up into this issue. It’s fifty percent of assessing a cloud computing environment.
Single sign-on solutions are the other part. The more services that you have, these start becoming even more intriguing — Ping Identity and OneLogin are cloud computing services that tie you to other cloud computing services. Nice, centralized management platforms which solve some of these issues – or at least, that’s the marketing ploy. “Manage your accounts here; we’ll tie them into all these crazy vendors, and how they do user accounts. And you’ll understand who has roles to do what.” Fairly early on, even when you start getting into 20+ user accounts, you’re going to want to start looking at something that makes this a little bit more helpful.
Cloud dynamics, as I’ve stated frequently over the course of this series, is essentially about pooling resources. This produces the appearance of one and the same resource. While at first this would appear to simplify the whole problem of authentication, you start to realize that in a pool of pools — in “the cloud,” as opposed to “a cloud” — a collective, single directory of identities may not only be impossible to manage but easier to spoof.
So easy, in fact, that the people who have stepped up to the plate most often to resolve this issue sometimes seem like they want to throw in the towel. Appearing at the same security conference as Hamilton, the co-founder of the Cloud Security Alliance, Jeff Bardin, shared a good chunk of his frustration, telling attendees:
Yes, we need a sophisticated capability to defend our environments; and more and more, that sophistication needs to increase. The typical toolsets, firewalls, and [intrusion detection systems] are not going to work. They’re basically “see, detect, and arrest” types of things, after the fact. And with some of the things we’ve built... we’ve become cyber-janitors. All’s we are is a clean-up crew after we’ve been hacked. In my view, we need to start preventing these things – not only prevent them, but eventually try to predict them. And that means intelligence collection.
Then there is the other problem that should be obvious, and will likely become more so in due course: In a world full of cloud platforms, you don’t have to “break in” to steal anything. So an authentication mechanism ceases to be about letting users “in.” Technically speaking, they’re already “in.” What can they do, now that literally everyone is “here?”
HyTrust’s Eric Chiu recognizes that business managers are leveraging the self-provisioning and easy maintenance capabilities of cloud platforms to exert more influence over their departments and the information they use. That leaves IT, quite possibly, out in the cold without much of a job, unless perhaps they’d like to join Jeff Bardin’s cyber intelligence rebellion. Says Chiu:
The idea has been that centralized IT has always been there to support the business. Technically, it’s always been that way. But more and more, the business unit has power and has the ability to make decisions... A lot of these decisions are not being made by central IT. It’s the user going out and building their own data centers, or going out to Amazon. The CIO has to support it, because the CIO has to support agility. And that definitely puts a lot of challenges on the organization, especially in terms of security, compliance, and governance.
The CSA’s Bardin is a vocal proponent of platforms that actually conduct counter-intelligence operations on malicious actors, pooling together resources that would, under careful analysis, shed light on potential espionage activity that can, in turn, be shared with all participants. But not all insecurity is a result of conspiracy or cyber warfare. Much of it is simply ensuring that resources are protected from accidental misuse, so that risk can be reduced and costs can be controlled.
HyTrust’s Chiu continues:
Now I’ve got all these pools of resources: I’ve got virtual machines running internally within this private cloud, and within a public cloud environment. And I might be moving systems back and forth. I might be developing in a public cloud environment, but then presumably I’m pulling those same systems back when they’re going to go into staging and production. Giving access to those resources is a big deal, and making sure that you can secure, control, and also monitor that activity is extremely crucial to the business.
Fall Out
When vital network resources were kept behind a perimeter, it was simple enough to design network security around the principle of sealing or opening the perimeter. Now that networks have no perimeters, so-called endpoint security fails, with consequences that would be hilarious if they weren’t so staggering.
So the topic of cloud computing typically enters the discussion as a way of compounding the existing problem, of making the consequences we’ve faced thus far seem magnified. Yet this may not be the case.
“I actually don’t believe that cloud solutions are less secure than on-premise solutions,” remarks Lubor Ptacek, vice president of strategic marketing for enterprise information management platform provider OpenText. Ptacek continues:
You could make the argument that a lot of cloud vendors, because they are so vitally dependent on their security reputation, could probably put much more security measures in place, and hire much better security experts, than a lot of agencies could have done themselves. It cuts both ways, of course, because being that kind of vendor, they also represent an interesting target, so they attract more of the kooks who will try to do bad things. But in general, we can assume that cloud solutions are as secure as, if not more secure than, on-premise solutions.
The problem lies actually somewhere else, and that is, the consumerization — the way our organizations embrace consumer-grade technologies to do enterprise-class work... If you have a file-sharing solution that is being designed for consumers, and your consumers are simply embracing and adopting it on their own, they create their own private accounts, and then start using them to exchange your corporate documents and files. That’s a huge concern.
Global networking is about accessibility; security is about placing reasonable limits on that accessibility. As we learned at the turn of the century, though, systems tend to be insecure by default, when their default setting for accessibility is “everyone and everything.” (For more, see “Snowden, Edward.”) Bolt-on security, as a rule, fails. Ptacek goes on:
A lot of the security measures that have been put in place over the last two decades have been focused on perimeter security — basically keeping the bad actors, not allowing them to penetrate the system, to snoop on the wires, to decrypt our data. That was all the security we were putting in place. But the fundamental flaw of all of this was, there was always a number of people who were entrusted with the “master keys.” They were given not necessarily admin-level privileges, but certainly privileges; and they were authorized to access the data. If those actors actually represented a threat of a breach, whether voluntarily or even unwillingly (it probably happens more often unwillingly than not), that’s a different story. But ultimately, we had to trust the people that they were doing the right thing. If the people have access to everything, then that breach, if it happens, will have very severe consequences.
While the Web’s basic security setting is “everyone and everything,” the Internet (the carrier of the Web) runs differently. It can be engineered to deny access by default and permit access when challenges are met. But the first set of challenges such a system has to meet are the political kind, and so far it hasn’t had much luck with that.
So up to now, either through concerted skill or dumb luck, we’ve managed to avoid facing the enormous ethical, legal, moral, and even philosophical questions about the nature of how we do business in this world, and how our governments conduct their business with us. We’ve accepted anonymous access as a fact of life on the Web, until we protest how our lives are being accessed anonymously. We’ve supported the construction of colossal social networks — global exchanges of all the data facts that comprise our everyday existence, so rich with exploitable data that an algorithm could probably ascertain, based on the patterns of our sharing, how often and when we go to the bathroom.
Then we demand that Facebook give us our privacy, as if social media is the repository for rights now. And we’re shocked, shocked that an intelligence gathering agency would be the first to gather the tools and resources required to make such judgments.
The Web we built — the first one — is a kind of yard sale of our lives, careers, and businesses, where everything is strewn out on the lawn, and an amateur detective could ascertain our habits in seconds with a drive-by. It’s difficult to conduct business on the Web, because business requires confidentiality and confidentiality requires confidence.
These emerging issues are so delicate that the various stakeholders in their outcome have yet to agree upon a basic terminology. Where laws and regulations make attempts to settle the arguments, disputes erupt as to whether such new rules are even applicable in a modern society. And in the absence of laws, there arise the inevitable arguments over rights — for instance, whether privacy is a basic human right, compared to whether data qualifies as property.
Ask yourself this: If DNA is information, and your DNA belongs to you, then is any use of your genetic code without your consent a violation of your privacy? Or is information only yours when you understand it personally? Face it: Would you know what all those G’s, A’s, T’s, and C’s would mean if you read your own genome? If it is not information to you, then how can you claim a right to it? This question is made all the more pressing by the revelation that research into the nature of diseases such as cancer, once presumed to require decades, could potentially be conducted in mere weeks.
Breaking “In”
Up until a few years ago, the most effective form of access control anyone had ever designed for a database was based around containment. Back when the largest database could not possibly exceed the capacity of the largest hard drive, the easiest form of containment was to turn the hard drive off. Access involved being in the same room. When file systems first became networked, the first level of stealth was created. You no longer had to be in the same room to operate the right account. Every Hollywood-style break-in involves some version of this scenario: Someone you don’t see having stolen the password, or configured an account that doesn’t need a password, gets past the access control system and brazenly poaches the goods.
When a massive breach does occur, and its profile matches the ordinary Hollywood scenario to the letter, it’s usually someplace that’s still stuck in the 20th century.
Almost any effort by Hollywood to characterize the problem of data ownership reverts to a 20th century context, perhaps because only in that old world can the players be portrayed as black and white, good and evil. The most sophisticated special effects bring shock and awe to the typical Hollywood scenario, where the villain uses clever, but unseen, techniques to break into the good guys’ network. The bad guy’s objective is always the “core database,” “central nexus,” “main complex,” or some equally euphemistic grand prize. And for the first few minutes, the good guys are powerless to stop the fiendish, real-time plots of the anonymous antagonists.
Even James Bond, the world’s most recognizable “good guy,” in his most recent portrayal stood and watched as the villain infiltrated MI6 on the big screen... then, in the least characteristic Bond move ever portrayed on film, turned and ran. “Can someone tell me how the hell he got into our system?” asked Q helplessly. You expect one of the secret agents to pull out his shoe phone and plead for help.
Our lack of even a basic comprehension of the technological and cultural change that’s impacting us now is only compounded by the bewildered fiction produced by people experiencing the disruption at a distance. We kind of understand the principles behind America’s debt crisis, almost enough to justify our inability to suggest solutions to it. But when a former NSA contractor just three months on his job as an all-access admin released secret documents to a journalist, there was more uproar over how a massive data collection mechanism could possibly work than there was over the fact that it actually doesn’t and, even more importantly, can’t.
As a society, we don’t really get this problem, and we’re almost powerless to explain it to ourselves. Its roots are evolutionary. Like a fairy tale giant who grows beyond the boundaries of his steel cage, data has burst through the boundaries of databases and firewalls. The principles of how data is contained and how it is accessed are being fundamentally redrawn.
Strangely enough, almost any effort by academia to explain the dilemma of big data as a business asset reverts to a 20th century Hollywood context, complete with searchlights, fanfare, and sci-fi. For no less than the Stanford Law Review, Professor Neil M. Richards and Savvis consultancy vice president Jonathan H. King predicted that, well before we as a society come up with a set of best practices for how corporations and cloud providers (especially Google) will apply discretion to the use of personal data in databases, we will all quite literally lose our minds:
Without developing big data identity protections now, “you are” and “you will like” risk becoming “you cannot” and “you will not.” The power of Big Data is thus the power to use information to nudge, to persuade, to influence, and even to restrict our identities. Such influence over our individual and collective identities risks eroding the vigor and quality of our democracy. If we lack the power to individually say who “I am,” if filters and nudges and personalized recommendations undermine our intellectual choices, we will have become identified but lose our identities as we have defined and cherished them in the past.
They’re not wrong to be concerned. “Personally identifiable information” (PII) consists of any data that can be used — perhaps with leverage, perhaps as leverage — to ascertain some fact about a person. As such, it’s what folks normally refer to when they discuss “identity theft,” except that folks don’t usually extend that definition to include information inside the brain.
Outside of brains and inside databases, there exist records of data which, in and of themselves, may have been cleansed of the key identifiers necessary for an observer to attribute them to specific individuals. But it doesn’t take rocket science to create unions or joins of multiple tables of these records and, by isolating the matches, retrieve more complete records that obviously pertain to specific individuals. It’s by means of such joins that a table of bank transactions listing numbered bank accounts may be matched against, say, a log of Web transactions registered to numbered IP addresses. Because databases are now more easily sharable than ever before, the entire notion of “cleansed data” is fairly preposterous.
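To make the join argument concrete, here is a minimal Python sketch using two invented tables, neither of which contains a name. The account numbers, IP addresses, and the choice of timestamp and amount as the matching columns are all hypothetical; the point is only that rows which agree on a few innocuous fields can be stitched into a fuller record.

```python
# Two hypothetical "cleansed" tables: neither one names a person.
bank_transactions = [
    {"account": "ACCT-001", "timestamp": "2013-06-01T10:15", "amount": 42.50},
    {"account": "ACCT-002", "timestamp": "2013-06-01T11:30", "amount": 19.99},
]
web_log = [
    {"ip": "203.0.113.7", "timestamp": "2013-06-01T10:15", "amount": 42.50},
    {"ip": "198.51.100.4", "timestamp": "2013-06-01T11:30", "amount": 19.99},
    {"ip": "203.0.113.9", "timestamp": "2013-06-01T12:00", "amount": 5.00},
]


def join_on(left, right, keys):
    """Inner join: merge rows from both tables that agree on every column in `keys`."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    joined = []
    for row in left:
        for match in index.get(tuple(row[k] for k in keys), []):
            joined.append({**row, **match})
    return joined


# Each joined row now ties a numbered account to a numbered IP address.
linked = join_on(bank_transactions, web_log, keys=("timestamp", "amount"))
for record in linked:
    print(record["account"], "<->", record["ip"])
```

One more join against any table that maps IP addresses to subscribers would finish the job, which is why stripping the obvious identifiers from each table offers so little protection.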
If you’re noticing the presence of wool, and you’re wondering whether someone or something is trying to pull it over your eyes, you’re not mistaken. The biggest users of big data are ourselves. If that were not so, how would readers possibly have located Richards’ and King’s article?
Building On
The dramatic picture that is emerging of a kind of overlord over all big data, whether it be Google or Facebook or the NSA, is not only impractical but surprisingly fictional. Even the NSA’s goals, as revealed by a whistleblower, were never actually completed, and could probably never have worked. Centralized repositories of identity may be what everyday users most fear, but they are also what practically minded security architects don’t even want in the first place.
Just as big data does not need to be replicated and centralized to be made useful, there does not need to be — nor, in fact, should there be — one central database of identity, to resolve the associations of individuals with the roles they play in organizations or in social networks or on Web sites. Roles are attributes, and as the modern economy teaches us the hard way, these attributes are temporary. As the rather disrupted picture of the modern cloud starts to settle, what you’ll see is an authentication mechanism that borrows the best ideas from Kerberos plus those of Hadoop.
Here, I firmly believe, “identity” will not be a fixed thing, like a Social Security number but with more digits. Rather, it will be the sum of all attributes that a person may claim in order to be granted access to resources. Think of a key/value pairing chain, with limitless links. When a person plays a role on a Web site or a social network or with an organization, these unique attributes are collected together. And each attribute is an assertion of the association, made by someone else whom a third party mutually trusts. Thus, I may be given access to files not because I am an administrator, or someone with a high privilege level, but rather because one or more other parties attest to my having administered.
This is a different thing altogether than an identity code. It’s a testament to what rights and privileges a person uses, made by someone else who is a witness to that use. This results in the attribution of responsibility to identity — not just that some branded certificate authority, like VeriSign, has rolled a bunch of dice and given people their own (hopefully) random numbers.
These attributes may exist in any number of places. But conceivably, just as unstructured data no longer needs to be indexed to be analyzed, a system may be discovered where no single “key” or unique ID code is required to link them together. Rather, perhaps they may be chained to one another; perhaps the sources of these chain links themselves can serve as the unique forest of trust for each set of attributes. Individual attributes may be revocable (for example, when an IT admin is laid off, he’s no longer privileged to certain cloud resources), and some may be challenged. But no one database is the source of every component or attribute that someone else would require to spoof anyone’s identity.
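No such system exists in the form described here, so the following Python sketch is purely speculative: one way the idea of identity as a set of witnessed, revocable attributes might be modeled. The class names, the rule that two trusted witnesses are required, and the sample attribute are all assumptions. For simplicity the links sit in one list, whereas the text imagines them scattered across many sources.

```python
from dataclasses import dataclass, field


@dataclass
class Attestation:
    """One link in the chain: `witness` asserts that `subject` holds `attribute`."""
    subject: str
    attribute: str
    witness: str
    revoked: bool = False


@dataclass
class AttributeLedger:
    """Hypothetical collection of attestations about many subjects."""
    links: list = field(default_factory=list)

    def attest(self, subject, attribute, witness):
        """Record a new attestation and return it so it can later be revoked."""
        link = Attestation(subject, attribute, witness)
        self.links.append(link)
        return link

    def witnesses_for(self, subject, attribute):
        """Distinct witnesses whose unrevoked links back this claim."""
        return {
            link.witness
            for link in self.links
            if link.subject == subject
            and link.attribute == attribute
            and not link.revoked
        }

    def may_access(self, subject, attribute, trusted, required=2):
        """Grant access only if enough trusted witnesses attest to the attribute."""
        backing = self.witnesses_for(subject, attribute) & set(trusted)
        return len(backing) >= required


ledger = AttributeLedger()
first = ledger.attest("pat", "administers-payroll-db", witness="hr-system")
ledger.attest("pat", "administers-payroll-db", witness="ops-team")
trusted = {"hr-system", "ops-team"}

print(ledger.may_access("pat", "administers-payroll-db", trusted))  # True
first.revoked = True  # for example, the admin has been laid off
print(ledger.may_access("pat", "administers-payroll-db", trusted))  # False
```

Access here depends on who vouches for the claim rather than on a single identity code, and revoking one link is enough to withdraw the privilege.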
The cloud identity issue may be a dilemma, but it is not the screeching, monolithic oracle from Kubrick’s 2001. Its solution involves the creation of policy, but not just for ourselves. Kerberos shared trust between certificate authorities in order to enable users to cross the bridges between networks without being challenged each and every time. Similarly, role-based access can also be a two-way street. It can not only specify how resources are to be used within the data center under its purview, but also how data centers elsewhere in the cloud should apply similar restrictions to the data under their purviews.
The idea is called policy enforcement. This way, the data “out there” can be guaranteed to be as well-maintained as the data “in here.” And neither you nor I will forget who we are.
For the past few years, HyTrust’s Eric Chiu has offered a thought experiment to prospective customers. First he quizzes them about whether they’ve applied the same access controls to their virtual servers as they had applied when those servers were physical. He makes sure that access to virtual resources is under guard. Then he notes that virtualization, by definition, means the entire server has been collapsed onto a software layer, which resides somewhere on a physical system. Even if the best access controls and monitoring tools are defined in that software, what’s to stop someone from stealing the VM?
Role-based access, in which HyTrust specializes, is conceptually, deceptively simple. Rather than creating an administrator tier or organizational unit for the express purpose of removing restrictions and forgoing monitoring for people you trust because of what they do, role-based access applies controls that kick in specifically for administrators. This way, the users who claim to be the ones responsible for maintaining resources are the ones most closely watched instead of the ones most frequently ignored. Chiu explains:
The only way that you can secure against breaches and data center disasters is by making sure that you limit access to sensitive operations and sensitive data, and you make that automated in technology. You add the right level of what we call role-based monitoring, that tracks all of the administrator activity and compares it against what they should be doing, and alerts you when potentially anomalous or bad things are happening.
This idea that you can do background checks, and then trust your admins — “I trust my guys” — all of that is out the window. You can’t trust a thousand admins at the NSA. It’s impossible... Access management has to be purpose-built for the cloud. It can’t be, “Oh, I’ve got [Active Directory], so that’s great.” Does AD govern who can make copies of VMs? Who can delete data center resources? No? Then that’s not access management at that point; that’s centralized authentication.
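The distinction Chiu draws, between authenticating an administrator and governing what that administrator does, can be sketched in a few lines of Python. This is not HyTrust’s product or API; the role names, operation names, and the allow/alert/deny outcomes are hypothetical, chosen only to show activity being compared against what a role should be doing.

```python
# Hypothetical policy: the operations each administrative role is expected to perform.
ROLE_POLICY = {
    "vm-operator": {"start_vm", "stop_vm"},
    "backup-admin": {"snapshot_vm", "restore_vm"},
}

# Hypothetical operations treated as sensitive no matter who attempts them.
SENSITIVE = {"copy_vm", "delete_datastore"}


def review(event):
    """Classify one administrator action as 'allow', 'alert', or 'deny'."""
    allowed = ROLE_POLICY.get(event["role"], set())
    if event["operation"] in allowed:
        return "allow"
    if event["operation"] in SENSITIVE:
        return "deny"   # sensitive and outside the role: block it
    return "alert"      # unexpected for this role: flag it for a human


def monitor(events):
    """Return (decision, event) pairs for every action that was not a plain allow."""
    findings = []
    for event in events:
        decision = review(event)
        if decision != "allow":
            findings.append((decision, event))
    return findings


activity = [
    {"admin": "kim", "role": "vm-operator", "operation": "start_vm"},
    {"admin": "kim", "role": "vm-operator", "operation": "copy_vm"},
    {"admin": "lee", "role": "backup-admin", "operation": "stop_vm"},
]
for decision, event in monitor(activity):
    print(decision, event["admin"], event["operation"])
```

Both administrators in the sample are properly authenticated. What the check adds is a judgment about each operation, which is the part a directory logon alone does not supply.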
Our mindset going forward has to be inverted from the 20th century, Hollywood context in which we usually find ourselves. Cloud services are not fortresses with walls that can be hardened or reinforced. Anything that can misuse a resource, whether maliciously or accidentally, is already on a network somewhere.
Access control focuses on the resource, not the perimeter. As a practice, it addresses the pertinent questions: If a resource can be used by someone claiming to have the authority on account of his role, then how is it being used? Does its use follow the general pattern? Or is it peculiar? Apply business intelligence to access management, and the problem of cloud security may not seem so much like the end of a Twilight Zone episode.