Password Managers Relieve Password Headaches

Passwords Are a Hassle

I’ll be the first to admit I can’t remember all my passwords. Most of us can’t, so we pick a few passwords that are easy to remember and then use them with multiple sites. This results in two immediate problems. First, passwords that are easy to remember are typically also easy to guess. Second, a compromised password is a risk to every site where it has been reused. A password manager helps with both of these problems, since it can generate a secure and unique password for each site but only requires that you remember a single password to unlock the database. While it is possible to create passwords that are both secure and memorable, it is much harder to do so for the large number of passwords we use in modern life. I detailed some additional problems with passwords in two previous articles, Your NYE Resolution–Pick Better Passwords and Data Evaporation and the Security of Recycled Accounts. I find that a password manager with solid browser integration is well worth the initial setup time and expense.

While there are many good options, my password manager of choice is 1Password from AgileBits, which is available for Mac OS X, Windows, and the iPhone, iPad, and iPod Touch. I consider it an indispensable tool and I use it daily on both my desktop and my phone. 1Password integrates with many popular browsers, which makes logging into web sites faster and more convenient. The application allows me to easily switch between multiple browsers and multiple devices without worrying about which browser I might have saved a particular password in.

When I first looked at 1Password in 2006, I thought there was no way I would be willing to pay for it, since all modern browsers ship with password management functionality. Shortly after I started testing the application, I found it so convenient that I changed my mind and purchased it. Nearly six years and many major upgrades later, I have no regrets. I have nearly eight hundred logins saved in 1Password. Even though I regularly clean out duplicates and entries for dead services, this is still a ridiculous number of accounts. Look at it this way: I test services so you don’t have to.

We All Forget Passwords

A 2007 paper, A Large-Scale Study Of Web Password Habits, which covered more than half a million users, found that about 1.5% of all Yahoo! users forgot their password each month. Yahoo Mail alone has more than 200 million accounts, so this is a significant number. The authors found that the “average user has 6.5 passwords, each of which is shared across 3.9 different sites. Each user has about 25 accounts that require passwords, and types an average of 8 passwords per day.”
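
To put those figures in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: it assumes the 1.5% monthly rate applies evenly to the 200 million Yahoo Mail accounts mentioned above, and it treats accounts and users as interchangeable, which they are not. The variable names are mine, not the paper's.

```python
# Figures quoted from the study and the surrounding text.
monthly_forget_rate = 0.015          # about 1.5% of users per month
yahoo_mail_accounts = 200_000_000    # "more than 200 million accounts"

# Assumption: the rate applies uniformly to every account.
forgotten_per_month = monthly_forget_rate * yahoo_mail_accounts

# Cross-check the quoted averages: passwords per user times sites per password
# should land near the "about 25 accounts" figure.
passwords_per_user = 6.5
sites_per_password = 3.9
implied_accounts = passwords_per_user * sites_per_password

print(f"Forgotten passwords per month: {forgotten_per_month:,.0f}")
print(f"Implied accounts per user: {implied_accounts:.2f}")
```

Under those assumptions the estimate comes to roughly three million forgotten passwords a month, and the two averages multiply out to a little over 25 accounts, which is consistent with the authors' summary.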

Complicated Passwords and Compact Keyboards Don’t Mix

The current crop of smartphones ship with highly capable browsers, but entering lengthy passwords on a phone keyboard is even more error prone and frustrating than on the desktop. Here again, a password manager can reduce the complexities of entering many different password strings on a mobile device. The application lets you choose a master password that is optimized for a mobile keyboard, and possibly simplified, to protect your longer, more complex passwords and notes. This is of course a security tradeoff.

Mobile Safari on the iPhone and iPad does not permit plugins, so the 1Password application on iOS devices embeds a browser that is able to offer the automatic login feature. I prefer the default browser, but unfortunately there is no option for direct integration. The 1Password bookmarklet makes it relatively quick to look up an entry in the database and then copy and paste long passwords, which is far easier than trying to type them in by hand.

Other Advantages of 1Password

I regularly use multiple browsers. I also frequently delete my cookies and browser settings when I test services. This would typically cause a nightmare of needing to re-authenticate to each web site where I deleted the cookies. Since all of my login information is stored in 1Password rather than the browser, I don’t have to care about which browser I am currently using or even if my cookies still exist.

Since 1Password is also a general form filler, it can cope with login forms that have partial entries or multiple stages. For example, many services require that users re-enter their password to access account management features even if they are already logged in. This is to prevent another person from simply walking up to your unattended computer and viewing or changing billing information, email forwarding, and passwords. In most cases, 1Password is able to treat the re-authentication sign in forms exactly like a standard sign in form.

Some sign in forms are multi-stage, where the login process is split across several forms. For example, many online banks use multi-stage sign in forms. In the first stage, the user enters a username and their browser must acquire a cookie from the bank. If the user does not already have a cookie from a previous session, the user must enter a second authentication factor, such as responding to a text message with a unique code or entering the code from a hardware token. Next, on a second form on a separate page, the user enters a password.

In cases where 1Password is confused by multiple stage forms, the workaround is simply to make two separately named entries in 1Password. For example, the first entry would contain the username and the second entry would contain the password. The user must go through the full sign in process the first time to receive a cookie from the bank by completing the two-factor authentication process, and has to create a 1Password entry for each step in the form. Each subsequent login to the bank will be treated like all other sites and can be automated with the auto-login and auto-submit features.

Here is a small laundry list of other features I regularly use and appreciate about 1Password.

  • General form saving support. 1Password can save and replay many kinds of web forms, which is a useful feature if you find yourself filling out the same information over and over again.
  • Support for “identities” where the application stores commonly used bits of information such as name, email, phone number and can populate this information into many types of forms with little effort.
  • Basic anti-phishing protection since by default 1Password will only post usernames, passwords, and other forms back to the same domain name as the original.
  • The application can generate random passwords with several different templates that will satisfy most password requirements.
  • In addition to usernames, passwords, forms and identities, 1Password also supports encrypted notes.
  • The Mac OS X desktop application will sync with iOS devices over the local wired network and WiFi.
  • 1Password will sync with Dropbox for all desktop and mobile applications, including Windows and Android.
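
The anti-phishing item in the list above comes down to a host comparison before any credentials are filled in. The following is a minimal sketch of that idea, not 1Password's actual implementation, which I have not seen. The function name `same_site` is hypothetical, and the rule it applies (exact host match or a subdomain of the saved host) is a simplifying assumption. A production implementation would more likely compare registrable domains using the Public Suffix List.

```python
from urllib.parse import urlsplit


def same_site(saved_url: str, current_url: str) -> bool:
    """Return True if current_url's host is the saved host or one of its subdomains.

    Hypothetical helper for illustration only.
    """
    saved = (urlsplit(saved_url).hostname or "").lower()
    current = (urlsplit(current_url).hostname or "").lower()
    if not saved or not current:
        return False
    # The leading dot prevents "notexample.com" from matching "example.com".
    return current == saved or current.endswith("." + saved)


# A login saved for example.com should be offered on its own subdomain...
print(same_site("https://example.com/login", "https://www.example.com/account"))
# ...but not on a look-alike host that merely starts with the saved name.
print(same_site("https://example.com/login", "https://example.com.evil.test/login"))
```

The point of the check is that a phishing page can copy a site's appearance, but it cannot copy the host name the browser reports.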

Limitations of 1Password

There are several important limitations with 1Password. The application cannot handle login forms built with Adobe Flash. Previous generations of 1Password supported login forms with HTTP basic authentication; however, the new plugin architecture for Safari and Chrome does not offer support for HTTP basic authentication. AgileBits says it is working on a solution for Firefox.

The features of the Windows version of 1Password are not quite yet on par with the Mac version. For example, it only supports 32-bit Internet Explorer, 32-bit Firefox, Chrome, and Safari. That said, this covers the browsers most users need.


1Password for Mac and 1Password for Windows are $49.99 each. 1Password Pro, which is available for iPhone, iPad, and iPod touch, is $14.95.

1Password Bookmarklet Gone Missing

If you are a frequent 1Password user, particularly on iOS devices, you may have noticed that AgileBits discontinued support for the 1Password bookmarklet, which was the best option for integrating with Mobile Safari rather than the integrated browser in the application. Fortunately, Kevin Yank and * have produced a working 1Password bookmarklet. I have reproduced it here:


You should follow me on Twitter.

Your New Year’s Resolution–Pick Better Passwords

As we near the end of 2011, I can’t help but think this is the year I had the most trouble telling the difference between actual news stories and pieces from “America’s Finest News Source”, The Onion. As I write this article, details are still unfolding from the data breach at the private intelligence firm Stratfor.

According to reports, the Stratfor hackers found a weakly protected database of usernames and passwords and an unencrypted database of credit card information. The hackers proceeded to make donations to charitable organizations with the credit cards in the database. As any story benefits from more absurdity, there were claims and counter claims of whether or not the attack was associated with Anonymous, the discerning hacker’s first choice of affiliation.

According to Identity Finder, the Stratfor database contained approximately 44,000 hashed passwords, roughly half of which have already been exposed. Unfortunately, another 20,000 or so passwords on pastebin would not even be newsworthy if it were not for the notoriety of Stratfor. Note: if you think you might have been on the list of compromised accounts in the Stratfor database, you can check at Dazzlepod.

There is plenty of blame to go around. First, Stratfor stored user passwords as basic unsalted MD5 hashes, which is simply irresponsible. There are well-regarded and widely available solutions for storing passwords such as bcrypt, which is nicely summarized in Coda Hale’s How To Safely Store A Password. Secondly, and more importantly, storing customers’ credit cards in clear text is unconscionable. Never mind the question of why on earth Stratfor stored CCVs in their database, which is never OK.
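
To make the difference concrete, here is a minimal sketch that contrasts an unsalted MD5 hash with a salted, deliberately slow key derivation. Note the substitution: bcrypt is not in the Python standard library, so this sketch uses PBKDF2-HMAC-SHA256 from `hashlib` as a stand-in with the same two properties that matter here, a per-password salt and a tunable work factor. The function names and the iteration count are my own assumptions, not a recommendation from any of the sources above.

```python
import hashlib
import hmac
import os


def md5_unsalted(password: str) -> str:
    """What Stratfor reportedly did: identical passwords give identical hashes."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def hash_password(password: str, iterations: int = 200_000) -> tuple[bytes, bytes, int]:
    """Salted, slow hash (PBKDF2 as a stand-in for bcrypt). Iteration count is an assumption."""
    salt = os.urandom(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest, iterations


def verify_password(password: str, salt: bytes, digest: bytes, iterations: int) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)


# Two users with the same weak password are indistinguishable under unsalted MD5...
print(md5_unsalted("stratfor") == md5_unsalted("stratfor"))
# ...but get different stored values once each hash has its own salt.
first = hash_password("stratfor")
second = hash_password("stratfor")
print(first[1] == second[1])
print(verify_password("stratfor", *first))
```

With unsalted hashes, cracking one copy of a common password cracks every account that used it. The salt removes that shortcut, and the work factor makes each individual guess more expensive.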

Given the recent attacks against Sony, Gawker, HBGary Federal, and Infragard Atlanta, one could reasonably expect that Stratfor would pay more attention to the operational security side of their business. To put the Stratfor hack in a more global context, the 2011 Verizon Data Breach Investigations Report aggregates data from Verizon RISK, the U.S. Secret Service and the Dutch High Tech Crime Unit. DataLossDB Statistics collected data from open sources including news reports, Freedom of Information Act (FOIA) requests, and public records. These reports give a more nuanced breakdown of the types of breaches and data exposed across many industries.

As much as it pains me to blame the victim, a great many of the subscribers to Stratfor’s service clearly could and should have picked better passwords. According to Stratfor Confidential Customer’s passwords analysis, we could start with the 418 users who picked “stratfor” as their password or even the 71 users who picked “123456.” The database was full of weak passwords, which is why the clear text of nearly half the passwords followed in a post shortly after the original password hashes appeared online.

In Data Evaporation and the Security of Recycled Accounts, I described how passwords for email accounts are frequently the weak link in the security chain. It is common for sites to allow users to reset their passwords to the email address listed on the account. This means that a compromised email account may be the only method an attacker needs to gain access to other accounts.

In my dissertation interviews, I talked with people about how they managed their accounts and passwords. Many of my interviewees told me they effectively had 2-3 passwords they used for most accounts with some minor variations due to password complexity rules. The interviewees frequently reported using a set of low, medium, and high security passwords. Unfortunately, the email accounts were often given the low security passwords.

It pains me to think how many of the customers in Stratfor’s database likely reuse the same password on multiple sites. In Measuring password re-use empirically, Joseph Bonneau analyzed the overlap between leaked password databases, in addition to reviewing other studies, and found a wide spread, ranging from 10% to 50% overlap. Even with 10% overlap, there are significant benefits from leveraging one exploited password database to compromise another. As always, XKCD keeps track of the pulse of the internet and has informative comics for both Password Reuse and Password Strength.

Realistically, it’s getting to the point where unless you have a pretty fantastic password, if your password is in a database of poorly hashed passwords then someone with a bit of time can discover it. Why is that, you might ask? Whitepixel, the purveyors of fine open source GPU accelerated password hashing software, report that it currently achieves 33.1 billion passwords/sec on 4 x AMD Radeon HD 5970 for MD5 hashes. This is fast enough to make rainbow tables (pre-computed hashes for a dictionary attack) much less compelling. If the attacker has any additional personal information, this significantly increases the chance of a successful attack, since so many people use bits of personal information in their passwords. Bruce Schneier describes commercial software that exploits personal information when attempting to compromise password hashes in Secure Passwords Keep You Safer.
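
To get a feel for what 33.1 billion guesses per second means, here is a rough worst-case calculation. It is a sketch under stated assumptions: the attacker tries every possible string of a fixed length, the rate quoted above holds for the whole run, and the hashes are unsalted MD5. The function name and the example alphabets are mine. Real attacks use dictionaries and personal information and are usually much faster than exhaustive search.

```python
HASHES_PER_SECOND = 33.1e9  # Whitepixel's reported MD5 rate on 4 x Radeon HD 5970


def seconds_to_exhaust(alphabet_size: int, length: int,
                       rate: float = HASHES_PER_SECOND) -> float:
    """Worst-case time to try every password of exactly `length` characters."""
    return alphabet_size ** length / rate


# (description, alphabet size, password length); illustrative choices only.
examples = [
    ("8 lowercase letters", 26, 8),
    ("8 printable ASCII characters", 95, 8),
    ("12 printable ASCII characters", 95, 12),
]

for label, alphabet_size, length in examples:
    seconds = seconds_to_exhaust(alphabet_size, length)
    print(f"{label}: {seconds:,.0f} seconds (about {seconds / 86_400:,.1f} days)")
```

Under these assumptions, every eight-letter lowercase password falls in a matter of seconds and every eight-character printable ASCII password in a few days, while twelve random printable characters push the worst case out to hundreds of thousands of years. Length and randomness matter far more than clever substitutions.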

In general, unless your password or pass phrase is quite long, you are far better off with a long randomly generated string that you manage with a password manager. There are many good options, including my personal favorite 1Password, UsableLogin, LastPass, RoboForm, or the open source projects PwdHash or Password Safe. PasswordCard is a nice alternative if you would prefer a solution you can always carry with you that does not require any dependencies besides what you can carry in your wallet.
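
If you are curious what "a long randomly generated string" looks like in practice, here is a minimal sketch using Python's `secrets` module, which draws from the operating system's cryptographic random source. This is not how any particular password manager generates passwords. The function name, the default length of 20, and the default alphabet are my own assumptions.

```python
import secrets
import string

# Assumed default alphabet: letters, digits, and ASCII punctuation.
DEFAULT_ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20, alphabet: str = DEFAULT_ALPHABET) -> str:
    """Return a random password with each character chosen independently."""
    if length < 1:
        raise ValueError("length must be at least 1")
    if not alphabet:
        raise ValueError("alphabet must not be empty")
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(generate_password())
# Some sites reject punctuation, so a restricted alphabet is often needed.
print(generate_password(24, string.ascii_letters + string.digits))
```

The alphabet parameter is there because, as noted below, many services impose arbitrary restrictions on which characters a password may contain.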

Unfortunately, none of the password managers are magic. You will still have to deal with a depressingly large number of services that force you to choose poor passwords with arbitrary restrictions. Troy Hunt names some offenders in the Who’s who of bad password practices – banks, airlines and more. Still, if you simply use a password manager and a different password with each service, you will dramatically limit any potential damage, as an attacker cannot reuse your password on another service.

Security, Productivity, and Usability in the Enterprise

During interviews I conducted for my dissertation research, I asked individuals how security policies and systems affected their daily life in terms of productivity and work and personal communication. Interviewees gave many examples of tradeoffs between security and usability. People understood the reasoning behind many of the security restrictions. However, these implementations often significantly reduced productivity and frustrated employees’ everyday work practices and basic personal communications needs. Many implementations actively motivated employees to subvert security protections. The lengths to which people went to “work around” what they perceived as overly restrictive security and compliance implementations led to distinctly counterproductive measures in terms of overall security.

Security implementations in systems and security policies vary widely across the enterprise. These systems can help prevent unauthorized access, dissemination of proprietary business information, and confidential customer data. Security and compliance systems are also essential to passing an audit. The effectiveness of a system’s security is directly related to the overall user experience of the system. Security implementations that do not adequately consider a range of factors including existing work practices, the overall usability of the system, and basic social communication requirements may have serious negative consequences for morale, productivity, and information security.

Unsurprisingly, interviewees often responded that they were more concerned with job performance and completing the tasks at hand than with complying with corporate security policies. In short, they were far more worried about losing a job or missing a promotion from not getting their work done than they were about violating security policies. Don Norman summarized the problem nicely as “The more secure you make something, the less secure it becomes.”

People did not distinguish between the technology failing, not understanding how the technology works, and not realizing that a task was technically infeasible. In one example, an employee had tried to work from home over the weekend. This employee was not able to access the corporate network, because the VPN was inoperable over the weekend and the situation was possibly complicated due to a user misconfiguration. The following Monday morning, the employee was rebuked for not completing the project by the deadline.

Institutions that do not pay attention to employees’ perceptions of whether they can be productive and efficient when implementing security policies may find their employees at odds with those policies. In the VPN example, the employee perceived the situation as a technological failure that prevented the work from being completed. This had significant consequences, as the employee began to regularly copy data to an external device or send it via a personal email account to ensure they would be able to work. It is easy to criticize employees who violate security policies and argue they should be reprimanded or fired. However, in nearly every case in my interviews, the employees who violated policies did so to work around situations the company could have avoided through a more nuanced implementation that took productivity into account. In the particular case of the VPN, it was clear there were widespread problems with remote access that led to undesirable methods of replicating data.

Companies would be rewarded with higher levels of job satisfaction and productivity if they made greater efforts both to explain security policies and to ensure that users, especially mobile users, were not regularly prevented from communicating or managing documents. In these cases, employees were appreciative of how productive the system allowed them to be while still mindful of the risks involved. Explaining the reasoning behind the policies and implementations goes a long way to improve compliance. In the now classic paper “Users Are Not the Enemy,” Adams and Sasse found that individuals did not have adequate understanding of security issues and that security mechanisms were not adequately explained to them. In addition, the authors found that security departments did not understand their users’ perceptions of security or their needs. The lack of understanding combined with lack of communication resulted in reduced security overall.

Many businesses could reduce the risk of compliance violations by taking into consideration their employees’ everyday communications needs and practices. Internal needs assessments, possibly including surveys and interviews, can be used to determine how well corporate needs for security and compliance align with employees’ work practices and other communications needs. Security policies and compliance systems that take social factors, work practices, and overall understanding of the reasoning behind the requirements into consideration will be far more effective than those that do not. Unfortunately, it seems that this is the exception and not the rule.


A. Adams and M. A. Sasse. Users are not the enemy. Communications of the ACM, 42(12):40–46, 1999.

D. Norman. When Security Gets in the Way.

The World is Not Flat and Neither Are Social Networks

Now that I and the rest of the Internet have grown accustomed to Google Plus and Facebook’s most recent friend categorization features, I thought it was time to revisit and revise a previously unpublished piece of mine. Take a moment and think about your friends, family, colleagues, friends of friends, acquaintances, and members of the same social club. These six groups could comprise a large part, but certainly not all, of the people that you know. You may also have extended family, classmates, fellow members of sports teams, religious associations, and the familiar strangers you recognize but whose names you don’t know. To further complicate matters, the people in these groups often change over time as we move through life. How we conduct ourselves depends on the situation. It is highly unlikely that you act the same way around your grandmother as you do at a party with your friends, and people do not expect you to act the same way. Your friends, work colleagues, and extended family do not all know each other, and I suspect that in many cases you would like to keep it that way. For this reason, it seems odd to expect that our interactions in online social networks would be any different.

I had the final word in Erica Naone’s Technology Review article Can Google Get Social Networking Right?. Naone’s piece argues that Google needed to dramatically improve its social offerings to compete against Facebook. She asked me to comment on Google’s social services such as Buzz and Profiles and how they might interact with users’ search history. It is interesting to see how much the discussion has changed since the article appeared. Disclosure: I worked as an engineering intern on Google Accounts during 2005-2006, but this was well before any of Google’s social options existed. I responded with a discussion of broad problems I saw with social network services. The following quote in Naone’s article mostly reflects my statements, although the quote makes it appear that I am singling out Facebook for criticism, which misses the point that I think this is a fundamental problem across many social networks.

“Facebook, meanwhile, has its own problems, and some of these could turn out to be opportunities for Google. Ben Gross, an expert in online identity, notes that Facebook and other social networks don’t accurately differentiate between people’s social connections, making their social graph information less valuable to users and advertisers. For example, social networks tend to put all of a user’s connections into a single group of “friends,” and expect users to manage complex privacy settings to sort out family, work connections, and bar buddies. “Social network services should not assume that networks are flat, or that people are willing to put in the effort to articulate these networks or that they even want to,” he says.”

My full response from which the quote was taken follows below. I fixed a few typos, but it is otherwise unedited.

“I see several consistent problems with many of the social network services. First, they often unify disparate social networks in ways that do not match people’s actual experience and may not even make sense to them. In order to have a real representation of people’s social networks, they would have to fully articulate these networks to the service, which is a pretty unnatural thing to do. For many people the edges of the network shift regularly. Most social network services do not make it easy to maintain multiple independent networks on the service. It is common for people to maintain independent social networks, where individuals may not want the networks unified and people may not even care or wish to know about the other networks. For example, one’s extended family vs. one’s work colleagues vs. one’s friends they have brunch with on the weekend. The idea that there is a single flat network is sort of ridiculous.

I often hear people say that people who want to maintain independent identities or networks are somehow up to no good. I have interviewed quite a few people about this topic for my dissertation. It’s clear that people’s lives are complicated and their identifiers and networks reflect this. If you think about it, it is not at all strange for someone to want to separate their work life, from their family life, from their friend, or all manner of combinations. The boundaries of these relationships shift and behaviors vary widely. Social network services should not assume that networks are flat, that people are willing to put in the effort to articulate these networks, or that they even want to. Also for many people, they may have portions of their network that they are connected to online and therefore the online representation of their network may be very skewed. Even if people are connected to multiple networks online, they may use different social network services for different social networks. For example, it is not unusual for people to primarily have email conversations with some connections, use AIM for others, Google Talk for others, SMS for another group, and Facebook for yet another. Each service would be missing the chunk of connections for the other service.”

You need context to create a meaningful representation of a person’s social network. To make matters worse, that context shifts constantly, as do people’s social relations, particularly those with whom we have weak connections. This is why people often see online social network representations as a cartoonish view of their own complex and ever changing social worlds. This is not a new revelation about social relations. William James published the following in 1890.

Properly speaking, a man has as many social selves as there are individuals who recognize him and carry an image of him in their mind. To wound any one of these his images is to wound him. But as the individuals who carry the images fall naturally into classes, we may practically say that he has as many different social selves as there are distinct groups of persons about whose opinion he cares. He generally shows a different side of himself to each of these different groups. Many a youth who is demure enough before his parents and teachers, swears and swaggers like a pirate among his ‘tough’ young friends. We do not show ourselves to our children as to our club-companions, to our customers as to the laborers we employ, to our own masters and employers as to our intimate friends. From this there results what practically is a division of the man into several selves; and this may be a discordant splitting, as where one is afraid to let one set of his acquaintances know him as he is elsewhere; or it may be a perfectly harmonious division of labor, as where one tender to his children is stern to the soldiers or prisoners under his command.

It is important to recognize that forcing people to interact with their social relations as a flat network has many undesirable consequences. Figuring out how to restore a more natural balance to social relations is a grand challenge for social networks. People we think of as friends, enemies, and acquaintances change over time as friendships intensify and cool and we move through life phases. Also, complete visibility in networks is not always desirable or healthy. When we remove people’s choice to disclose their relationships and group memberships, we strip them of something that is fundamentally human. Providing people with only one option for presenting themselves at a time denies them an important means of self-expression that is also fundamentally human.

I find it heartening to see how much has improved over the last year as both Google Plus and Facebook have dramatically improved the situation in allowing us more options to interact naturally with different social spheres. Framing choices about self presentation as choices about privacy misses the point that the issue is usually about context. When social networks lack context, it forces people to articulate everyone that should be included or excluded from a particular interaction. In these cases, the cognitive overhead of potentially making this judgement for each interaction is staggeringly high. Unless you are a public figure, you likely never need to decide if what you say is appropriate or even remotely interesting to someone you went to grade school with, someone you went to college with, a work colleague, your aunt, your next door neighbor, and a dear friend. We should not force people to work this hard unnecessarily.


danah michele boyd. Friendster and publicly articulated social networking. In CHI ‘04 extended abstracts on Human factors in computing systems, pages 1279–1282, New York, NY, USA, 2004. ACM. Articulated Social Networks: An Ethnographic Study of Friendster

Erving Goffman. Presentation of Self in Everyday Life. Anchor Books, New York, 1959.

Francesca Grippa, Antonio Zilli, Robert Laubacher, and Peter A. Gloor. E-mail may not reflect the social network. In Proceedings of the North American Association for Computational Social and Organizational Science Conference, 2006.

Ido Guy, Michal Jacovi, Noga Meshulam, Inbal Ronen, and Elad Shahar. Public vs. private: Comparing public social network information with email. In CSCW ‘08: Proceedings of the ACM 2008 conference on Computer supported cooperative work, pages 393–402, New York, NY, USA, 2008. ACM.

Kai Fischbach, Peter A. Gloor, and Detlef Schoder. Analysis of informal communication networks – a case study. Business & Information Systems Engineering, 1:140–149, 2009.

William James. The Principles of Psychology, volume 1. Henry Holt & Co., 1890.

Hat tip to Gaurav Mishra, whose similarly titled 2008 article The World is Not Flat and Neither is the Social Web (site is currently offline) I found after I finished writing this post.

Tracking, Geolocation and Digital Exhaust

You are unique… In so many ways…

The accounting systems on which modern society depends are surveillance systems when viewed with another lens. All administrative, financial, logistics, public health, and intelligence systems rely on the ability to track people, objects, and data. Efficiency and effectiveness in tracking have been greatly aided by improvements in data analysis, computational capabilities, and greater aggregations of data.

Advances in social network analysis, traffic analysis, fingerprinting, profiling, de-anonymization/re-identification, and behavioral modeling techniques have all contributed to better tracking capabilities. In addition, modern technological artifacts typically contain one or more unique hardware device identifiers. These identifiers (particularly in mobile devices, but also in RFIDs and soon in Intelligent Vehicle-Highway Systems) are widespread, but also effectively unmodifiable and relatively unknown to most of their owners. For example, with mobile devices, each network interface (cellular, Bluetooth, WiFi) requires a minimum of one unique hardware identifier, each of which is uniquely trackable. On one hand, aggregating these unique identifiers allows services like Google, Skyhook, and others to associate geolocation data with WiFi access points and provide useful services. On the other hand, Samy Kamkar’s work described in Hack pinpoints where you live: How I met your girlfriend shows the potentially awkward and invasive side effects.

Individuals generate transactional data from common offline interactions, such as card key systems, and from nearly every online transaction. Improved techniques for correlating disparate data and for analyzing the unique characteristics of software, hardware, and network traffic can form a fingerprint that is frequently unique. For example, a large-scale analysis of web browsers from the Panopticlick project showed that over 90% of seemingly common consumer configurations were effectively unique. IP geolocation data can be used to increase security, as with Detecting Malice with ModSecurity: GeoLocation Data, or it can be used in ways that are quite Creepy.
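
The basic mechanics of fingerprinting are simple enough to sketch. The example below is not Panopticlick's method or code. It is a toy illustration, with made-up browser attributes and hypothetical function names, of how a handful of individually unremarkable settings combine into a value that can single out one browser among many.

```python
import hashlib
from collections import Counter


def fingerprint(attributes: dict[str, str]) -> str:
    """Hash a browser's attributes in a stable order to get one identifier."""
    canonical = "\n".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def unique_fraction(browsers: list[dict[str, str]]) -> float:
    """Fraction of browsers whose fingerprint is shared with no other browser."""
    prints = [fingerprint(browser) for browser in browsers]
    counts = Counter(prints)
    return sum(1 for value in prints if counts[value] == 1) / len(prints)


# Made-up sample data: the first two browsers are configured identically.
SAMPLE = [
    {"user_agent": "Firefox/8.0", "timezone": "-480", "screen": "1280x800", "fonts": "Arial,Helvetica"},
    {"user_agent": "Firefox/8.0", "timezone": "-480", "screen": "1280x800", "fonts": "Arial,Helvetica"},
    {"user_agent": "Firefox/8.0", "timezone": "-480", "screen": "1440x900", "fonts": "Arial,Helvetica"},
    {"user_agent": "Chrome/16.0", "timezone": "-300", "screen": "1280x800", "fonts": "Arial,Verdana"},
]

print(f"Uniquely identifiable: {unique_fraction(SAMPLE):.0%}")
```

Even in this tiny sample, a difference in a single attribute such as screen size is enough to separate one browser from an otherwise identical configuration. Real browsers expose far more attributes than the four used here.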

Another major shift is the widespread collection and aggregation of geolocation information from mobile devices. Location can be a highly unique identifier, even if the mobile device changes. Philippe Golle and Kurt Partridge show that two data points sampled during the day, one at home and one at work, are enough to uniquely identify many individuals, even in anonymized data. Geolocation data can also reveal significant information about the people someone spends time with and provide a view of their social network. Jeff Jonas sums this up well in Your Movements Speak for Themselves: Space-Time Travel Data is Analytic Super-Food! In a sense, the mobile phone has caused an enormous increase in uniquely identifiable data that can be used for tracking.
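
The home and work result is easy to illustrate with a toy example. This is not Golle and Partridge's analysis or data. The records, the block labels, and the function names below are all made up. The sketch simply counts how many people share each (home, work) pair, which is the size of the crowd each person can hide in.

```python
from collections import Counter

# Made-up records: (pseudonym, home area, work area). No real data.
RECORDS = [
    ("user-1", "block-A", "block-X"),
    ("user-2", "block-A", "block-X"),
    ("user-3", "block-A", "block-Y"),
    ("user-4", "block-B", "block-X"),
    ("user-5", "block-C", "block-Z"),
]


def anonymity_set_sizes(records):
    """Map each pseudonym to the number of people sharing its (home, work) pair."""
    pair_counts = Counter((home, work) for _, home, work in records)
    return {pseudonym: pair_counts[(home, work)] for pseudonym, home, work in records}


def uniquely_identified(records):
    """Pseudonyms whose (home, work) pair belongs to nobody else."""
    sizes = anonymity_set_sizes(records)
    return sorted(pseudonym for pseudonym, size in sizes.items() if size == 1)


print(anonymity_set_sizes(RECORDS))
print(uniquely_identified(RECORDS))
```

Three of the five made-up users share a home area, and three share a work area, yet only two of them share both. The pair is far more identifying than either location alone, and swapping the pseudonym for a new one does nothing to change that.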

An average person now generates a constant stream of geolocation data that is collected by mobile carriers. Geolocation information is generated from cellular triangulation, geolocated IP addresses, and integrated GPS units, which deliver accuracy down to 10 meters. Geolocated mobile transaction data aggregated across multiple carriers is increasingly available for commercial use. It is possible to accurately track large numbers of individuals in constrained environments simply by sniffing the TMSI (a temporary ID), as Path Intelligence does in malls. They could sniff the IMEI just as easily, but they say they do not in order to protect privacy. Still, large-scale analysis of geolocation data is in its infancy. ReadWriteWeb describes how Developers Can Now Access Locations of 250 Million Phones Across U.S. Carriers.

Tracking technologies—particularly when combined with geolocation information—have matured far beyond tracking individuals and are rapidly becoming capable of tracking groups and larger populations, which could be applied to entire enterprises or political organizations. Tools and techniques have made it feasible to correlate geolocation information, commercially aggregated profiles of online use, digital fingerprints, and offline transactional data. In addition, analysis of current anonymization techniques has repeatedly shown that simply adding another source of data is enough to re-identify a large percentage of the population. The Spatial Law and Policy blog is doing a nice job of tracking the policy implications of geolocation data.

The immense potential value of geolocation and other tracking data may well provide enough incentive for it to be used in ways counter to our own interests. Potential threats for misuse of the data need to be taken into account when designing systems. For example, what is the value of highly accurate logistical data about a US corporation derived from geolocation data and social network analysis to a foreign industrial competitor? Even a small amount of data that allowed a rudimentary analysis of external individuals meeting with internal high-level executives would be a worthwhile target. Similarly, both foreign industrial interests and foreign states may be willing to spend significant resources to acquire details on the movements and meetings of political parties.

More broadly, I have been thinking about the question: what does it mean for a third party to acquire better logistics about an organization than the organization has itself? What are the policy implications when and if these tracking tools are deployed in places without the rule of law, stable transitions of government, and low levels of corruption that we assume in the US? Could changes in the design and implementation of these systems mitigate the risks outlined? For example, should these design changes include internal controls, data scrubbing capabilities, and user interfaces that more clearly indicate a big picture of what data is being given off? Are there behavioral strategies that would reduce risks? To what extent can user education reduce risk?

SSL Is Critical Infrastructure at Risk

Problem Areas for SSL

The security of the transactions for much of the consumer Internet relies on the Secure Sockets Layer (SSL) protocol. SSL and its Public Key Infrastructure (PKI) are critical Internet infrastructure. Most consumer Web, email, and VoIP traffic relies on SSL for security, as do substantial portions of enterprise Internet traffic, both from SSL-enabled Web applications and SSL-based VPNs.

Fundamental problems increasingly put this infrastructure at risk. Significant risks include flawed implementations of the SSL protocol and PKI, inadequate verification mechanisms for certificate issuance, limited implementation of revocation mechanisms, and involvement by state actors in the issuance process. There are no viable alternatives to the mainstream use of SSL that are currently widely accepted or deployed.

Cryptographic Flaws

The first analyses of problems with the protocol focused on the cryptographic aspects of the implementations, which largely stabilized with the release of TLS 1.0/SSL 3.1 in 1999. Netscape released the last version of SSL, version 3.0, in 1996, and the IETF (Internet Engineering Task Force) superseded it with the Transport Layer Security (TLS) protocol in 1999. Still, the protocol is primarily referenced as SSL.

TLS versions 1.1 and 1.2 added further security refinements, although they are not yet widely implemented or deployed. Recent flaws target weaknesses in the SSL framework and not the encryption itself. One notable exception is the 2008 discovery of a weakness in the MD5 cryptographic hash function that allowed security researchers to create a false Certificate Authority certificate that could sign other valid SSL certificates.

User Interface Problems

The second phase focused on user interface and user experience aspects of SSL. In particular, people simply ignored the large number of security warnings about SSL certificate problems, no matter what their severity. Users are more vulnerable to both hijacking and phishing attacks when they become desensitized to certificate warnings. The Mozilla Foundation investigated usability problems and experimented with multiple user interfaces to discourage users from navigating to sites with invalid SSL certificates.

Implementation Flaws

The OpenSSL toolkit is widely used to generate cryptographic keys for SSL certificates and SSH keys. In 2006, a developer on the Debian Linux distribution team modified the OpenSSL source to eliminate errors generated by a debugging tool. The change had an unintended side effect: it eliminated most of the entropy destined to seed the pseudo-random number generator, which caused the Debian version of OpenSSL to produce weak cryptographic keys. Another Debian developer discovered the flaw in 2008. In the intervening time, flawed versions of OpenSSL created an estimated 25,000 weak and easily compromised SSL keys.
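
A toy model shows why starving the generator of entropy is so damaging. As widely reported at the time, the main varying input left after the Debian change was the process ID, which by default on Linux was below 32,768. The sketch below is not OpenSSL code; Python's random module stands in for the real generator, and weak_key and recover_pid are hypothetical names. The point is only that a key derived from so few possible seeds can be found by trying them all.

```python
import random

MAX_PID = 32768  # default upper bound for Linux process IDs at the time

def weak_key(pid, nbytes=16):
    """Toy key generator whose only seed material is the process ID."""
    rng = random.Random(pid)  # stand-in for a PRNG starved of entropy
    return bytes(rng.randrange(256) for _ in range(nbytes))

def recover_pid(target_key):
    """Brute-force every possible PID until the generated key matches."""
    for pid in range(1, MAX_PID):
        if weak_key(pid) == target_key:
            return pid
    return None

victim_key = weak_key(4242)
print(recover_pid(victim_key))
```

A 128-bit key should have 2^128 possibilities. In this model it has at most 32,767, which a laptop can enumerate in well under a second.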

In 2009, researchers discovered the potential for man-in-the-middle type attacks by targeting the renegotiation feature of SSL, which allowed changes to keys in-connection to accomplish tasks such as upgrading the key strength. I described the problem in “A Practical Attack and Fixes for Current SSL/TLS Vulnerabilities.”

Moxie Marlinspike published a series of man-in-the-middle-based attacks on SSL starting in 2002 with the sslsniff tool, which exploited a vulnerability that allowed leaf certificates to act as signing certificates. In 2009, Marlinspike published a new tool called sslstrip, which could forcibly downgrade HTTPS connections to insecure HTTP connections. He also published a “null prefix attack” that could trick some browsers such as Firefox into accepting specially crafted certificates as wildcard certificates. Finally, he published an attack on the Online Certificate Status Protocol (OCSP), which allowed him to present revoked certificates as valid. Marlinspike and others have created widely available software and techniques to compromise the security of SSL via man-in-the-middle attacks.
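
The null prefix idea can be illustrated with a small sketch. It is a simplified model, not browser or certificate-parsing code: c_string imitates how C string routines stop reading at the first NUL byte, and the hostnames are hypothetical. A certificate authority that validates only the registered domain sees a subdomain of attacker.example, while a client that truncates at the NUL sees the bank.

```python
def c_string(value):
    """Model how C string functions read a value: stop at the first NUL byte."""
    return value.split("\x00", 1)[0]

def naive_match(cert_name, hostname):
    """Flawed check: compares only the text before any embedded NUL."""
    return c_string(cert_name).lower() == hostname.lower()

def safer_match(cert_name, hostname):
    """Reject any certificate name that contains a NUL, then compare in full."""
    if "\x00" in cert_name:
        return False
    return cert_name.lower() == hostname.lower()

# Hypothetical crafted name embedded in a certificate.
crafted = "www.bank.example\x00.attacker.example"

print(naive_match(crafted, "www.bank.example"))  # flawed check accepts it
print(safer_match(crafted, "www.bank.example"))  # stricter check rejects it
```

The wildcard variant mentioned above relies on the same truncation, with a wildcard in front of the NUL instead of a specific hostname.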

Infrastructure Constraints

The implementation flaws highlight the problem that the SSL and PKI infrastructure is both distributed and constructed from many different implementations of SSL, which can be difficult to patch or upgrade quickly. The large number of SSL implementations for embedded devices further compounds the problem.

The tools to verify the integrity of digital certificates, certificate authority roots, and the chain of trust between them are not widely deployed. While modern browsers increasingly include support for certificate revocation, the support is uneven. Many non-browser implementations of SSL do not check for revoked certificates. Recent large-scale surveys of SSL certificates have found substantial numbers of certificates with intentional and unintentional errors, including a significant number of possibly malicious certificates.

Problems with Certificate Issuance

There are a limited number of root certificates that are widely accepted by nearly every browser, which can be highly profitable for the certificate authorities that own them. At the same time, there is a financial incentive to offer certificates with the least possible overhead. Because of this, many certificate authorities require only limited verification to issue certificates.

This type of limited validation, called domain validation, typically only requires that the certificate requestor be able to receive email at certain administrative email addresses. Limited validation periodically results in attackers devising ways to inappropriately request certificates for domains that may not be legitimate.

Extended Validation certificates are an attempt by certificate authorities to offer higher cost certificates with substantially higher verification requirements to ensure that only legitimate requests receive certificates. Still, the process of purchasing certificates is overly complex and many sites do not have SSL certificates, even when they would be well served by them. I discussed some of the difficulties in purchasing certificates in “No Frills SSL Certificates Are Inexpensive and Useful.”

Root Certificate Bundles

Root certificate bundles or root certificate stores contain the collection of root certificates that the browser or other SSL-enabled service will automatically accept as trusted. However, root certificate bundles often contain many certificates without detailed provenance information. In April 2010, the Mozilla project discovered a root certificate that had been included in the root certificate bundle for many years, but whose owner was unknown. Eventually, Mozilla determined there was a miscommunication and that the root certificate belonged to RSA, but the situation underscored the tenuous provenance of some of the certificates in the bundles.

There are a number of widely used certificate stores on a single machine that are controlled by multiple entities. For example, while Microsoft Windows and Mac OS X offer system-wide root certificate stores, Firefox uses a certificate bundle maintained by the Mozilla Corporation. Server applications, especially on UNIX systems, may contain their own root certificate bundle.

The policies for inclusion in certificate stores vary widely and the influence of payment is unclear. The Microsoft Windows root store may load new certificates on demand, meaning that there is no precise list of valid root certificates.

Influence by State Actors

There is growing and widespread awareness of the policy and political dimensions of SSL certificates, especially as we find that state actors may have undue influence over some certificate authorities. State actors may compel vendors, carriers, or paid attackers to insert additional certificates into the root certificate stores either openly or surreptitiously. Christopher Soghoian and Sid Stamm published an analysis of what they call a “compelled certificate creation attack” in their paper “Certified Lies: Detecting and Defeating Government Interception Attacks Against SSL” (PDF).

Root certificates are high-value targets because they can produce certificates that can be used to decrypt communications and that effectively vouch for the identities of individuals (with client certificates) and entities (with host certificates).

In 2010, the EFF petitioned the Cybertrust division of Verizon to revoke the certificate for Etisalat in the United Arab Emirates after the telecommunications company issued a BlackBerry firmware update that included surveillance software. Also in 2010, there was a significant debate on the Mozilla policy list about the inclusion of a root certificate for the China Internet Network Information Center (CNNIC) certificate authority in the Firefox certificate store. The argument was that while CNNIC was affiliated with an academic institution, it was not free of government influence.

The problem is that any certificate authority may issue a certificate for any domain on the Internet. The problem is further complicated by the fact that each browser, operating system, and a great many server applications may use independent root certificate stores that may contain an unknown collection of root certificates, which will automatically trust any SSL certificate signed by that root.
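
A toy model makes the weakness easy to see. This sketch deliberately ignores signatures, certificate chains, expiry, and revocation, and the root names are hypothetical; it only shows that the domain never enters the trust decision, so every root in the store is equally able to vouch for every site.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cert:
    domain: str
    issuer: str

def is_trusted(cert, root_store):
    """Toy validation: accept any certificate signed by any root in the store.

    Note that cert.domain plays no part in the decision, which is the point.
    """
    return cert.issuer in root_store

root_store = {"Root A", "Root B", "Root C"}  # hypothetical root names

legit = Cert("mail.example.com", "Root A")    # issued by the site's chosen CA
rogue = Cert("mail.example.com", "Root C")    # issued by an unrelated root
unknown = Cert("mail.example.com", "Root Z")  # issuer not in the store

for cert in (legit, rogue, unknown):
    print(cert.issuer, is_trusted(cert, root_store))
```

The overall trust in this model is only as strong as the least careful, or most easily compelled, root in the store.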

ForeverSave Prevents Lost Work on the Mac

It’s happened to all of us. You are busy writing, entering data, or working on a slide deck and all of a sudden something freezes and then the application crashes. If you recently saved the document, all is well; otherwise the inevitable expletive follows. It is 2011 and there is no excuse for not having autosave, but there are still a depressing number of applications that do not automatically save documents. Blaming the user who lost work to an application or operating system crash is blaming the victim. People are far better served by applications that automatically name, save, and version their files without requiring manual intervention. This way users can easily undo or revert to an older version after application crashes, machine hangs, and power outages, no swearing like a sailor necessary.

Tool Force Software’s ForeverSave ($15) largely solves this problem for Mac OS X applications. ForeverSave allows you to configure the application to automatically save documents from many applications including Apple’s iWork, Microsoft Office, and most Adobe products. The configuration process is quick and straightforward. You simply select the applications that you want to enable autosave. There are options to save after a fixed time interval or when switching to another application.

ForeverSave can also automatically create backup copies of your documents. You can set the maximum number of backup copies and a maximum size for the backups overall. One advantage of multiple backup copies is that you can quickly preview old versions of the document with QuickLook. Restoring an old version is a one-click operation. One interesting feature is database sharing. This allows you to share all the historical versions of a document, which is useful to show a colleague how a project evolved over time.

If you use any of Apple’s iWork applications, including Keynote, Pages, and Numbers, then you absolutely want to use ForeverSave. The applications in iWork are well designed and I use them often, but unfortunately, as of the most recent version, iWork ’09, Apple has not seen fit to include an autosave feature. Each of the applications crashes periodically, which means that you lose any work done since the last time you remembered to manually save. If you have not named and saved the document at all yet, then everything is gone.

When an iWork application crashes, all remnants of unsaved work are gone. After a recent crash with Keynote, I decided to experiment to see if I could find any traces on my file system. I scanned my temp files and the swap files and found nothing other than the images in the document. This is a terrible oversight and I expect better from some of Apple’s high-profile applications. Judging from the many complaints I found on the Apple discussion boards and elsewhere online, I’m not remotely alone.

Overall I highly recommend ForeverSave; the price is well worth the insurance against lost work. I experience two annoyances when using the application. First, saving is a blocking operation in the iWork applications, so if you have a large document such as a Keynote slide deck with many slides, it will force you to wait each time it saves the document. This is technically the fault of iWork and not ForeverSave, but it is still a drawback. The second annoyance is that ForeverSave requires you to name the document the first time. This typically comes up when I start to work on a document: right when I get into a flow, the save window pops up asking me to name the file so it can save. I would rather the application not interrupt me and simply pick a reasonable name and let me rename it later.

ForeverSave is $15 and has a 30-day trial. ForeverSave Lite is a stripped down version that offers autosaving only, without backups, versions, QuickLook, or database sharing.

Time Machine vs. CrashPlan for Backups

Trouble in Time Machine Land

In my recent article, A Simple and Effective Backup Strategy for Mac OS X, I recommended a three-part backup system: 1) a full disk clone, 2) local incremental backups with Apple’s Time Machine, and 3) networked incremental backups with CrashPlan. I found Time Machine problematic for my own setup, for reasons I explain below, so I now use CrashPlan for both local and networked backups.

For most people with configurations that are not highly customized or complicated, Time Machine is a great “set and forget backup” solution. The primary interface is a single on or off toggle switch. Its ease of use can make the difference between having backups and not having backups for many. At the same time, Time Machine has some notable quirks and limitations that can make it far less desirable in some circumstances. In these cases CrashPlan provides a solid alternative for local backups in addition to network backups. CrashPlan also has the advantage that it works equally well on Windows and Linux.

Clones are Key to Fast Recovery Time

Let me emphasize that maintaining a recent clone is the key for you to rapidly recover your data in the case of a disk failure or theft. Most incremental backup solutions, including Time Machine and CrashPlan, do not back up your entire computer, including all the system files and boot records. This means that you must first reinstall your operating system and then restore your files from the incremental backup onto the newly installed operating system.

The process of recovering from a disk failure with a clone is much faster and more efficient since you can connect your cloned disk and boot from it. Your computer will be in the same state as it was when you made the clone. You will only have to restore files that have changed since you last made the clone. No other recovery process is nearly as quick as a recent clone combined with an incremental backup. The difference is substantial.

Advantages of Time Machine

  • It’s free, supported by Apple and ships with every copy of Mac OS X
  • The setup is impressively simple and it generally just works after that
  • The overall user experience for backup and recovery is substantially better than most alternatives
  • You can manually mount a Time Machine disk on any computer and copy files from it

Disadvantages of Time Machine

  • When you restore from a Time Machine disk, the backup is invalidated and you must start your backups anew
  • Time Machine only backs up changes to your files once an hour, so there is always a potential lag in your backups
  • If you use FileVault, Time Machine will only back up your home directory when you log out
  • If you use FileVault, you can only restore your entire home directory (missing out on the great restore interface) unless your home directory is on Mac OS X Server
  • Time Machine can get confused if you plug more than one Time Machine backup disk into the computer
  • Moving a backup to a new computer is a complicated process and typically requires editing system files

Personal Observations About Time Machine

  • The combination of FileVault and Time Machine makes logging out very slow
  • I found the Time Machine volume occasionally got corrupted and I would have to start over
  • Time Machine would sometimes cause large amounts of disk IO with high memory usage that substantially slowed my machine down. This would typically happen after longer periods of not backing up due to travel etc.

Advantages of CrashPlan

  • Backups are continuous and files are backed up as soon as they change (note while CrashPlan can be used in local mode for free, continuous backups require a subscription to CrashPlan Central)
  • All backups are encrypted by default
  • Straightforward to configure multiple local and networked backup destinations

Disadvantages of CrashPlan

  • You must use the CrashPlan software to restore a backup, so it needs to be installed first for recovery
  • Higher memory usage with 64-bit Java on Snow Leopard (see note below)
  • User interface is functional, but not nearly as nice as Time Machine’s; it’s also a bit slow to start up
  • If you use FileVault, you must be logged in as the FileVault user for backups to happen

Personal Observations About CrashPlan

  • Simple fix improves memory usage
  • Appears to have much smaller impact on my system resources once memory is reduced
  • FileVault complicates install process

Notes on Reducing CrashPlan Memory Usage

I found that CrashPlan could use up significant amounts of memory with the 64-bit Java on Snow Leopard. The most recent version of CrashPlan places a 512 MB memory limit on the process, but that is still quite large. I limit my CrashPlan process to 150 MB and it has not caused any problems, although this is lower than you will generally see recommended and you will want to carefully monitor your logs to look for memory errors if you set it this low. The post CrashPlan using too much memory on Mac OS X from offTheHill explains how to reduce the memory footprint of CrashPlan.

A Simple and Effective Backup Strategy for Mac OS X

Disk is inexpensive compared to the value of your time and data. My personal backup configuration consists of three types of backups. The following combination has proven itself over the last several years and I recommend it. It includes 1) a full disk clone, 2) an incremental backup, and 3) an online backup service. This setup is redundant, quick to configure, needs little maintenance, and allows for rapid recovery of data, even with a catastrophic failure.

Details of the three part backup strategy:

  1. A clone is a replica of your disk. One great feature of Mac OS X is that you can boot directly from a clone. This means if your hard drive dies, you can reboot from a clone on an external drive and be back to work in minutes rather than hours. I recommend SuperDuper ($28) as the user interface is very well done. Carbon Copy Cloner is an excellent alternative that is free to use, although the author encourages donations. Both applications support scheduling backups for a time when your system is not in use. Both applications also support incremental updates to substantially reduce the amount of time needed for subsequent backups. The hard drive for your clone must be as large as the amount of data you wish to back up.
  2. Time Machine is an incremental backup application that ships with every copy of Mac OS X and archives any file changes every hour. Time Machine has a unique time-based interface that allows you to easily find and restore previous versions of files. Overall, Time Machine is simple to use and works well unattended, but it does have several drawbacks. First, if you have a hard disk crash, you must manually reinstall the base operating system from the DVD and then use Time Machine to restore the rest of your data. This makes Time Machine most useful in cases of accidental file deletion or data corruption. Time Machine works very well when combined with a clone as you can quickly restore from a clone and use Time Machine to restore any files more recent than the clone version. Time Machine is far less useful on drives with FileVault enabled. I recommend giving Time Machine at least two times as much hard drive space as the amount of data you want to back up.
  3. An online backup service allows you to have offsite backups for cases of theft, natural disaster, or large mugs of coffee. Online services also allow laptop users to continue to make backups in any place that has a network connection. I have used the CrashPlan service for about 18 months and I find the service reasonably priced and reliable. CrashPlan automatically archives file changes in real-time and encrypts all backups. This is nice if you use it on a laptop because it means that you have backups even when you travel. CrashPlan also allows online restores from a web-based interface. The service is $25 a year for 10GB, $50 a year for unlimited service for one computer, and $120 a year for a family unlimited plan for up to ten computers. Multiyear subscriptions are discounted.

CrashPlan has a backup seeding service for $125 where they send you a 1TB drive. You then run the initial backup locally and ship the drive back to CrashPlan. Depending on the size of your disk and the speed of your network connection, the initial backup can easily take weeks. Companion emergency recovery services are also $125. Expedited shipping is extra. CrashPlan also offers a computer-to-computer backup mode. This means you could back up to another machine in your house or to a computer in a friend’s house. The computer-to-computer backup feature is free. The paid version provides real-time versioning with fine-grained control over the versioning settings, stronger encryption, the ability to restore from the web, and the client is ad-free. CrashPlan works with Mac OS X, Microsoft Windows, and Linux operating systems.
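
To see why the initial online backup can take weeks, a rough estimate helps. The sketch below is back-of-the-envelope arithmetic, not anything CrashPlan provides; the function name and the efficiency parameter are hypothetical, and the example figures (500 GB of data, a 1 Mbps uplink) are only illustrative.

```python
def initial_backup_days(data_gb, upload_mbps, efficiency=1.0):
    """Estimate days to upload data_gb gigabytes at upload_mbps megabits/sec.

    efficiency, in the range (0, 1], is a hypothetical fudge factor for
    protocol overhead and time the link is unavailable.
    """
    if data_gb < 0 or upload_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be non-negative; speed and efficiency must be positive")
    bits = data_gb * 1e9 * 8                      # decimal gigabytes to bits
    seconds = bits / (upload_mbps * 1e6 * efficiency)
    return seconds / 86400                        # 86,400 seconds per day

# 500 GB over a 1 Mbps uplink running flat out
print(round(initial_backup_days(500, 1.0), 1))
```

That works out to roughly 46 days of continuous uploading, which is the situation a seeding service is meant to avoid.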

I last wrote about backup options in We Need Simple Backup Solutions for Complicated Data.

Data Evaporation and the Security of Online Identities

Disappearing Data

What happens to our data when we are gone? What happens to us, when our data is gone? Does any of this missing data make us vulnerable? These questions that once seemed theoretical are increasingly relevant to our everyday lives. The consequences include not only the potential for lost communications, but also lost data in cloud services, and risk for security breaches for individuals and businesses alike.

We all understand that data deteriorates along with the physical media it is stored on–photographs fade and hard disks crash. This is why we have backups, or at least should have them. The problem is, unfortunately, not so simple these days as much of our data in the cloud depends on multiple systems and services acting in concert to exist. This means that data may disappear for reasons independent of the physical media, even with backups and replication.

I think evaporation is a useful analogy for describing the complex array of factors that cause data to disappear–including services going out of business, enforced retention policies, missed subscription payments, malicious deletion, and loss due to system migrations. One new problem is that the loss of modern data often includes not only documents and media on file systems, but also accounts and online identities.

Lost Data = Lost Access = Lost Identities

It is not a stretch to say our online identities are now essential for daily communication. As part of my dissertation research, I began to investigate the lifecycle–selection, increased use, decreased use, discontinuation, and points in between–of online identifiers including email addresses, instant messenger IDs, and social network services. I was particularly interested in what caused people to stop using their identifiers and if it was by choice. I found that often people lost access to identifiers for reasons out of their control, such as account lockouts, account inactivity, and failure to renew subscriptions. There is often a limited window of time before that data begins to evaporate due to account inactivity or missed payments for a service.

I began to look at the policies from major service providers related to inactive accounts. The policies I found were conflicting, inconsistently presented and followed, and are evolving rapidly. Email services tend to mark accounts inactive, while social networks do not. Paid email accounts do not have activity requirements.

Here are some of the policies from large providers of webmail and other services:

  • AOL: May mark free email account as inactive after 30 days and data may be deleted.
  • Gmail: Marks account as inactive after six months. Inactive accounts may still receive email. After nine months of inactivity, addresses may be deleted. Deleted addresses are not recycled or recoverable.
  • Hotmail: Microsoft says free Hotmail accounts will become inactive after 270 days or if you do not log in for 10 days after creating the account. Inactive accounts will not receive email. Account names may be deleted after 360 days of inactivity and Windows Live IDs may be deleted after 365 days of inactivity. I also found conflicting documents on the Microsoft site that said Hotmail accounts might be marked inactive after 30 days or 120 days of not logging in.
  • Yahoo: Deactivates free email accounts after four months. After this time, accounts may be reactivated, however any existing email is deleted and cannot be recovered.
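
The policies above can be turned into a simple reminder check. This is a minimal sketch: the thresholds are copied from the list above with months approximated as 30 days (an assumption), the function names and the 14-day warning window are hypothetical, and providers may change their policies at any time.

```python
from datetime import date, timedelta

# Days of inactivity before an account may be marked inactive, from the list
# above. Months are approximated as 30 days, which is an assumption.
INACTIVE_AFTER_DAYS = {
    "AOL": 30,
    "Gmail": 6 * 30,
    "Hotmail": 270,
    "Yahoo": 4 * 30,
}

def inactive_on(provider, last_login):
    """Date on which the account may be marked inactive."""
    return last_login + timedelta(days=INACTIVE_AFTER_DAYS[provider])

def needs_login(provider, last_login, today, warn_days=14):
    """True if the account is within warn_days of its inactivity deadline."""
    return today >= inactive_on(provider, last_login) - timedelta(days=warn_days)

print(inactive_on("Yahoo", date(2011, 1, 1)))
print(needs_login("Yahoo", date(2011, 1, 1), today=date(2011, 4, 20)))
```

A check like this is most useful for addresses that exist mainly as recovery addresses, since those are the ones least likely to be logged into by habit.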

Security and Recycled Identifiers

Depending on the circumstances, services may recycle expired accounts. This means that old identifiers may have new owners. The consequences may be much more than needing a new email address after forgetting to renew a domain name or the loss of a loved one’s letters after an account becomes inactive. There are serious security and privacy implications ranging from potential identity theft to corporate espionage.

If your old email address ends up with a new owner, that new owner will receive any email that was once destined for you. Why is this a problem? Suppose that email address was listed as the primary address or the recovery address for another account. Most systems send either one-time links to reset passwords, or worse, the password in plain text, to the primary or recovery email address. Unfortunately, people tend to reuse passwords across accounts. It is also not uncommon for people to list the older email address as the recovery address for a newer email account, meaning it would be possible to reset the password for a new account as well. Gaining access to an individual’s primary email account is the key to gaining access to most other accounts.

This is not a theoretical problem. In 2009, Twitter’s internal systems were compromised when an attacker systematically evaluated Twitter employees’ personal accounts looking for potential points of access. The attacker realized that one employee registered a Gmail account using a Hotmail account that had since been marked inactive.

Hotmail recycled the Twitter employee’s account because it had been inactive for more than a year, so the attacker simply registered the old username and used it to reset the current Gmail password. The attacker then found messages in the Gmail account that contained plain text passwords, correctly guessed that one of them had also been the original Gmail password, and set the Gmail password back to it to remain unnoticed. The attacker then used his access to the Gmail account and passwords to compromise other personal accounts of the employee and then those of other employees. One compromise led to another and eventually the attacker gained access to internal Twitter systems. He downloaded hundreds of internal documents, posted screenshots proving his exploits, and released more than 300 internal documents to TechCrunch.
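
The chain of compromises described above can be modeled as a graph in which an edge means "controlling this account lets you reset that one." The sketch below uses hypothetical account names and is only a way to audit your own recovery addresses; it is not a reconstruction of the actual attack.

```python
from collections import deque

def reachable_accounts(recovery_for, start):
    """Breadth-first search over "X can reset Y" edges.

    recovery_for maps an account to the accounts that list it as their
    recovery address. Returns every account that someone controlling
    start could take over by chaining password resets.
    """
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        current = queue.popleft()
        for dependent in recovery_for.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order

# Hypothetical accounts: an expired webmail address is the recovery address
# for a newer mailbox, which in turn can reset several other services.
recovery_for = {
    "old@webmail.example": ["new@mail.example"],
    "new@mail.example": ["bank.example", "work-sso.example"],
    "work-sso.example": ["internal-wiki.example"],
}

print(reachable_accounts(recovery_for, "old@webmail.example"))
```

In this example, one forgotten address at the root of the graph is enough to reach every other account.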

Domain Names

The rules and policies under which domain names expire and may be transferred to other parties are complex and vary widely by registrar, TLD, and ccTLD. In general, the owner has not much more than two months to renew, and after two to three months the domain will be resold. Here is a brief overview to give you a sense of the time frame and the complications related to expiring domain names.

When the owner of a domain fails to pay, the domain is typically assigned an “Expired” status, usually lasting between 30 and 45 days. During this time the domain is usually renewable, but may not be accessible or transferable. Afterwards the domain enters what is known as the Redemption Grace Period (RGP), which is 30 days. Individual details are removed from the WHOIS database and the DNS records are deleted so the domain is inaccessible. During the RGP, no edits or transfers are allowed, although the domain may be restored by paying the registrar a fee of $100-$250 USD. After this time, the domain is assigned a “Pending Delete” status, which lasts for five days. At the end of this period, the domain is generally either placed up for auction or released to the general registration pool.
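
The phases above can be laid out as dates with a short calculation. This is a sketch using the durations given in the text as defaults; the function name is hypothetical, and actual periods vary by registrar and TLD.

```python
from datetime import date, timedelta

def expiry_timeline(expired_on, expired_days=45, rgp_days=30, pending_days=5):
    """Return the date each phase begins after a missed renewal.

    Defaults follow the ranges in the text: an Expired status of 30 to 45
    days, a 30-day Redemption Grace Period, and a 5-day Pending Delete.
    """
    rgp_start = expired_on + timedelta(days=expired_days)
    pending_start = rgp_start + timedelta(days=rgp_days)
    released = pending_start + timedelta(days=pending_days)
    return {
        "expired": expired_on,
        "redemption_grace_period": rgp_start,
        "pending_delete": pending_start,
        "released": released,
    }

timeline = expiry_timeline(date(2011, 1, 1))
for phase, starts in timeline.items():
    print(phase, starts)
```

With these durations, a domain that lapses on January 1 is back on the market 65 to 80 days later, depending on how long the registrar keeps it in the Expired status.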

Once a domain is reregistered, the new domain owner may create addresses and Web pages that match the old ones. Domains of defunct businesses may have potentially hosted many email accounts. As with the Twitter breach, these accounts could potentially lead to the compromise of other accounts.

Risk Analysis

The following are some risks to consider, and a few thoughts on how to mitigate those risks.

Potential Risks

  • A complex web of interlocking accounts and systems may affect your risk of a security breach.
  • Do not disregard the risk of “low value” accounts, as they may allow access to more sensitive accounts.
  • Inactive accounts may introduce as much liability as accounts with weak passwords.
  • Best practices may demand a clear separation of business and personal accounts and data, but there are often lapses in the real world.

Suggestions to Mitigate Risk

  • Document usernames and recovery addresses for each account.
  • Set recurring calendar tasks for account renewal payments and to log into infrequently used accounts.
  • Consider purchasing a subscription for infrequently used email accounts used as recovery addresses.
  • Consider using a password manager to generate and store unique strong passwords for each site.
  • Services should never send passwords in plain text.
  • Services should not allow password changes to recently used passwords.
  • Services should offer more notification options about accounts with a pending inactive or deleted status.
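
The suggestion that services should not allow changes to recently used passwords can be sketched as follows. This is one possible approach under stated assumptions, not a description of how any particular service works: the history size, the iteration count, and the function names are all hypothetical. Only salted hashes are stored, so the history itself never contains a plain text password.

```python
import hashlib
import hmac
import os

HISTORY_SIZE = 5        # hypothetical policy: remember the last five passwords
ITERATIONS = 100_000    # illustrative PBKDF2 work factor

def hash_password(password, salt=None):
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256."""
    salt = salt or os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key

def is_reused(candidate, history):
    """True if candidate matches any stored (salt, key) pair in history."""
    for salt, key in history:
        _, attempt = hash_password(candidate, salt)
        if hmac.compare_digest(attempt, key):
            return True
    return False

def change_password(candidate, history):
    """Reject recently used passwords; otherwise record the new one."""
    if is_reused(candidate, history):
        raise ValueError("password was used recently")
    history.append(hash_password(candidate))
    del history[:-HISTORY_SIZE]   # keep only the most recent entries
```

A check like this would have blocked the step in the Twitter incident where the attacker set the Gmail password back to its previous value, provided the reset flow enforced it.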