Personal Digital Archiving 2011 Call for Participation

Personal Digital Archiving 2011

February 24 & 25, 2011
The Internet Archive, San Francisco

We are pleased to announce that the Personal Digital Archiving 2011 Conference is now open for participation. We welcome proposals for session topics and speakers, as well as volunteers to help us organize and serve on site.

Conference sessions will be selected by an international peer review panel that includes:

  • Ben Gross, Highlands Group
  • Brewster Kahle, The Internet Archive
  • Cal Lee, University of North Carolina
  • Cathy Marshall, Microsoft Research
  • Clifford Lynch, Coalition for Networked Information
  • Elizabeth Churchill, Yahoo! Research
  • Jeff Ubois, The Bassetti Foundation
  • Jeremy John, The British Library

Relevant themes include but are not limited to family photographs and home movies; personal health and financial data; interface design for archives; scrapbooking; social network data; institutional practices; genealogy; email, blogs and correspondence; and funding models.

Conference presentations will be 15-20 minutes in length. If you wish to submit an abstract for the conference, please email us with:

  • title of your project, paper or presentation
  • a 150-300 word abstract
  • a brief biography (a few sentences)

Deadline for abstracts: 24 December, 2010.
Notification of acceptance: 5 January, 2011.

Late submissions will be considered on an individual basis.

Topics for discussion

From family photographs and personal papers to health and financial information, vital personal records are becoming digital. Creation and capture of digital information has become a part of the daily routine for hundreds of millions of people. But what are the long-term prospects for this data?

The combination of new capture devices (more than 1 billion camera phones will be sold in 2010) and new types of media are reshaping both our personal and collective memories. Personal collections are growing in size and complexity. As these collections spread across different media (including film and paper!), we are redrawing the lines between personal and professional data, and published and unpublished information.

For individuals, institutions, investors, entrepreneurs, and funding agencies thinking about how best to address these issues, Personal Digital Archiving 2011 will clarify the technical, social, and economic questions around personal archiving. Presentations will include contemporary solutions to archiving problems that attendees may replicate for their own collections, and will address questions such as:

  • What new social norms around preservation, access, and disclosure are emerging?
  • Do libraries, museums, and archives have a new responsibility to collect digital personal materials?
  • What is the relationship of personal health information and quantified self data to personal archives?
  • How can we cope with the intersection between personal data and collective or social data that is personal?
  • How can we manage the shift from simple text-based data to rich media such as movies in personal collections?
  • What tools and services are needed to better enable self-archiving?
  • What are viable existing economic models that can support personal archives? What new economic models should we evaluate?
  • What are the long-term rights management issues? Are there unrecognized stakeholders we should begin to account for now?
  • Can we better anticipate (and measure) losses of personal material?
  • What are the options for cultural heritage institutions that want to preserve the personal collections of citizens and scholars, creators and actors?
  • What are the projects we can commit to in the coming year?

Whether the answers to these questions are framed in terms of personal archiving, lifestreams, personal digital heritage, preserving digital lives, scrapbooking, or managing intellectual estates, they present major challenges for both individuals and institutions: data loss is a nearly universal experience, whether it is due to hardware failure, obsolescence, user error, or lack of institutional support. Some of these losses may not matter; but the early work of the Nobel prize winners of the 2030s is likely to be digital today, and therefore at risk in ways that previous scientific and literary creations were not. And it isn’t just Nobel winners that matter: the lives of all of us will be preserved in ways not previously possible.


In February, 2010, more than 60 people met at the Internet Archive to explore common concerns about personal digital archiving. Attendees included representatives from UC Berkeley, Stanford, UNC, UT Austin, the University of Illinois, and Oxford University; Microsoft, Yahoo (Labs, and Flickr), Google, and Amazon (S3); the Smithsonian, the Magnes Museum; Xerox PARC; the Center for Home Movies, the California Digital Library, Family Search, and the Coalition for Networked Information. The Internet Archive, the Bassetti Foundation, and the Netherlands Institute for Sound and Vision provided support for the conference.

Several projects discussed in 2010 have progressed, and we’ll have some reports on these:

  • a showcase of interface designs for personal collections
  • cost modeling for personal archives
  • guidelines for AV archives interested in preserving amateur film
  • small scale endowments for storage that can allow individuals to preserve their materials inside leading institutions

The conference fee is $95 for attendees from non-commercial institutions and $195 for attendees from commercial organizations. Scholarships and early bird discounts are available.

Registration and conference information is available at

How and Why to Sniff Smartphone Network Traffic

Smartphone Network Connection Monitoring

Tools for monitoring and modifying connections between web browsers and web servers are essential for debugging, testing, optimizing performance, and assessing vulnerabilities of web-based applications and native applications. Developers, security professionals, and anyone with an interest in gaining insight into the lower levels of web traffic commonly use these tools.

There are many mature options for monitoring connections from desktop machines. Unfortunately, there are fewer tools to monitor connections on smartphones, and these tools often require more complex configurations, as the monitoring software must run on a separate device. In this article, I present an overview of tools and methods for monitoring network connections on smartphones, including devices based on Apple’s iOS (iPhone, iPod Touch, iPad), Google’s Android OS, BlackBerry OS, and Symbian. This article focuses on inspecting HTTP and HTTPS traffic, although many of the tools and techniques described work equally well to analyze other protocols.

This article is the first in a series. The articles in the series include:

  • An overview of the tools and techniques for monitoring smartphone network connections
  • Pros, cons, and limitations for monitoring smartphone network connections
  • Network monitoring for security analysis and self-defense

Why Monitoring is Useful

Potential use cases for monitoring HTTP and HTTPS traffic, the two primary protocols of the Web, include:

  • Inspecting network traffic often simplifies debugging AJAX XMLHttpRequest requests, compressed content encoding, and cookies.
  • Network connection details, such as the number of HTTP requests, DNS lookups, and cache hits, are also valuable for optimizing web application performance.
  • In addition to monitoring, many tools allow modifying requests and responses to simulate valid and invalid user input when testing applications for vulnerabilities.
  • Network monitoring is an effective way to verify that a smartphone application securely handles user authentication and identify any inappropriate transmission of personally identifiable information such as unique identifiers and location.
  • Inspecting and modifying network traffic is essential for security analysis, for example when searching for Cross-Site Scripting (XSS), SQL injection, and path traversal vulnerabilities.
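To make the header-inspection use case concrete, here is a minimal sketch of the kind of detail a sniffer surfaces from a single captured request. The raw bytes below are illustrative, not taken from a real capture:

```python
# A made-up captured HTTP request, as it would appear on the wire.
raw_request = (
    b"GET /api/items?id=7 HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Cookie: session=abc123; theme=dark\r\n"
    b"Accept-Encoding: gzip\r\n"
    b"\r\n"
)

def parse_request(raw):
    """Split a raw HTTP request into method, path, headers, and cookies."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # The Cookie header packs name=value pairs separated by semicolons.
    cookies = dict(
        pair.strip().split("=", 1)
        for pair in headers.get("cookie", "").split(";")
        if "=" in pair
    )
    return method, path, headers, cookies

method, path, headers, cookies = parse_request(raw_request)
print(method, path, cookies)
```

Dedicated sniffers and debugging proxies do exactly this kind of decoding, plus decompression, HTTPS handling, and timing, across thousands of requests.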

Types of Monitoring Tools

Common network monitoring tools come in four major varieties: browser-based development tools, general purpose packet sniffers and network protocol analyzers, specialized HTTP/HTTPS sniffers, and specialized web proxies for debugging and security analysis.

Each type of tool has advantages and disadvantages, but there is no requirement to use a single type, and combinations of tools may offer more power and flexibility. This list is in no way comprehensive; there are many specialized and hybrid tools for monitoring connections.

Two LiveCD Linux distributions contain a large number of tools optimized for penetration testing, a subset of which is useful for network connection monitoring. BackTrack Linux is a very well-regarded distribution. AppSecLive, the OWASP Live CD Project (soon to be known as the OWASP Web Testing Environment (WTE)), is another respected collection.

The Top 100 Network Security Tools list provides a larger selection.

Configurations for Monitoring

I’ll cover the constraints, pros, and cons of each option in the second article in the series, but briefly, here are several potential configurations for monitoring:

  • Simulators allow the simplest configurations where the simulator and the monitoring software run on the same machine and share a common network interface.
  • Web proxies are a convenient option, as all modern browsers support them and they require only a small change in the browser settings rather than a change in the network configuration.
  • Ad-hoc networks combined with internet connection sharing are one method to gain access to traffic. If the network monitoring host is located between the mobile device and the internet, it will typically require two network interfaces, usually one wired and one wireless.
  • Network hubs are one method to work around the problems with common switched network configurations.
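To give a feel for the web-proxy configuration, here is a toy logging forward proxy built on Python's standard library. It is a sketch, not a substitute for a real debugging proxy: it handles only plain-HTTP GET requests and simply logs each URL it relays:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class LoggingProxy(BaseHTTPRequestHandler):
    """A toy logging forward proxy (GET only); real tools do far more."""

    def do_GET(self):
        # When a client is configured to use a proxy, the request line
        # carries the absolute URL, so the proxy can fetch it directly.
        print("proxied:", self.path)
        with urllib.request.urlopen(self.path) as upstream:
            status = upstream.status
            body = upstream.read()
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence the base class's per-request logging

def start_proxy(port=0):
    """Start the proxy on a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), LoggingProxy)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

In practice you would run something like this (or, more realistically, a mature proxy tool) on a desktop machine and point the smartphone's WiFi proxy settings at that machine's address and port.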

Limitations for Monitoring

There are significant constraints for monitoring network connections. I’m specifically talking about WiFi-based traffic and not cellular traffic. Monitoring cellular traffic is substantially more complicated and requires specialized equipment. In nearly every case, all important web-related traffic will travel over WiFi if the cellular data connection is disabled on the device.

Limited software is one constraint. For example, there is currently no way to run WebKit Web Inspector, Firebug, or LiveHTTPHeaders directly on a smartphone. Limited networking options add another constraint, as well as complexity, to the monitoring configuration. Typically, smartphones must communicate over wireless connections rather than wired connections, which eliminates some options for monitoring network traffic. Most modern network hardware is switched, which further limits the ability to access the traffic, even when an access point is plugged into a wired network. Additionally, wireless access points protected by WPA/WPA2 encryption employ per-user keys, so the difficulties in sniffing are similar to those on switched networks.

Finally, monitoring connections encrypted with SSL/TLS also requires more complex configurations. The most straightforward option involves adding a new Certificate Authority to the trusted list in the browser. This effectively creates a man-in-the-middle attack for the browser that allows decryption of the HTTPS traffic. The browser will produce a series of warning messages, but it will be possible to view the encrypted traffic.

No Frills SSL Certificates are Inexpensive and Useful

SSL De Facto for Securing Connections

SSL, short for Secure Sockets Layer, is a cryptographic protocol for securing network traffic that is the de facto mechanism for securing transactions on the web and many other protocols, including email (SMTP/IMAP/POP), IM (Jabber/XMPP), VoIP (SIP), and SSL-based VPNs. The topic of SSL certificates is a bit arcane, but much of the security of our everyday online purchases depends on SSL. Yet, fewer services use SSL than one might hope. It is possible to buy a basic no-frills SSL certificate from a universally accepted certificate authority very inexpensively, less than $15 a year, if you shop around. In most cases, it makes no sense to use a self-signed certificate, to purchase a certificate from a second-tier provider, or to purchase a chained certificate. This article is a substantial revision of an article in Messaging News from a few years ago. I received some requests for an update and have also found an even more inexpensive provider in the meantime, which makes the update worthwhile.

Securing a connection requires, at a minimum, that both the client and server application support SSL and that the server application have a digital certificate with a digital signature from a Certificate Authority (CA). This is the most basic and the most common form of SSL Public Key Infrastructure (PKI), which allows a client to securely authenticate a server. Nearly every online shopping transaction uses this form of SSL to secure the payment details from the user’s browser to the merchant’s servers. One quick aside: the Transport Layer Security (TLS) protocol released in 1999 superseded the last version of SSL released in 1996, but nearly everyone still calls the protocol SSL.

The January 2009 Netcraft SSL Server Survey found nearly 2.1 million sites that responded to a request for an SSL certificate, but only about 40% of those had valid third-party certificates. Netcraft has been collecting SSL certificates since 1996 and reports that in recent years, use of SSL has been growing at a rate of 30% a year. Still, the August 2010 Netcraft Web Server Survey found over 210 million sites, which means SSL-enabled sites remain a small percentage overall.

Why Is Server-Side Adoption of SSL So Low?

Given that nearly every consumer web browser and email client is SSL-enabled, why is server-side adoption of SSL so low? In addition, there are many reasons why businesses and even technically inclined individuals would want SSL certificates. There is substantial debate around the efficacy of the security provided by SSL in many common configurations, especially its ability to prevent phishing and man-in-the-middle attacks. Still, the security of countless services such as small webmail providers, dashboards for managing blogs, and web-based router configuration consoles would benefit from SSL. The majority of high-volume e-commerce vendors use SSL, but I regularly see services that ask for credit card numbers over (shudder) unencrypted connections.

The relatively low use of SSL is due in part to the expense and difficulty of purchasing SSL certificates, the complexity of installing them, and the need for a static IP address. For small and medium businesses and individuals, no-frills SSL certificates are affordable, especially if you are willing to shop around. The inexpensive certificates provide the same level of functional security for network traffic as the expensive certificates. The no-frills certificates are typically domain validated, meaning someone just needs to be able to receive an email or possibly respond to an automated phone call in order to validate the domain, which makes the process fast but does not offer any particular assurance that the certificate owner is who they say they are.

Other features beyond the level of security provided to network traffic are important for some businesses. For example, a business handling large numbers of consumer transactions may consider the branding of the certificate or the site seal important, or it may want the green bar shown by sites with Extended Validation (EV) certificates, or a Unified Communications (UC) certificate for an Exchange server. In these cases, the no-frills route is probably not the best one. No matter what kind of SSL certificate you want, the process of purchasing one is frustrating, and it is difficult to make any sense of the actual differences between the certificates by reading the marketing literature.

Certificate authority certificates, any intermediate certificates, and server certificates form a certificate chain that is verifiable through the SSL Public Key Infrastructure (PKI). It is possible for anyone to set up a private certificate authority and produce a “self-signed certificate.” This is often done for personal use or development purposes.
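Producing a self-signed certificate takes one command. The sketch below drives the openssl command-line tool from Python; it assumes the openssl binary is installed, and the common name is a made-up placeholder:

```python
import os
import subprocess
import tempfile

# Create a throwaway self-signed certificate and key in a temp directory.
# "/CN=test.example" is an illustrative placeholder subject.
workdir = tempfile.mkdtemp()
key_path = os.path.join(workdir, "server.key")
crt_path = os.path.join(workdir, "server.crt")

subprocess.run(
    ["openssl", "req", "-x509", "-newkey", "rsa:2048", "-nodes",
     "-keyout", key_path, "-out", crt_path,
     "-days", "365", "-subj", "/CN=test.example"],
    check=True, capture_output=True)

# Ask openssl to print the subject back out of the new certificate.
subject = subprocess.run(
    ["openssl", "x509", "-in", crt_path, "-noout", "-subject"],
    check=True, capture_output=True, text=True).stdout
print(subject)
```

A certificate made this way secures the connection just as well cryptographically, but because no trusted authority signed it, browsers will warn until the user (or an administrator) loads the private CA into each trust store.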

Inexpensive Certificates

Self-signed certificates require the same amount of effort to install and configure as a commercial certificate; they also require the additional work of installing and configuring a local certificate authority to sign the certificate. Self-signed certificates are not verifiable through the public PKI chain, and most applications will produce warning messages that the certificate is not valid unless the user explicitly loads the credentials for the private certificate authority into each browser. Many second-tier SSL providers offer chained SSL certificates, which are more complicated to install in many configurations and are typically less compatible with older browsers and mobile browsers. That said, chained certificates theoretically offer the certificate authority more security, as it may revoke a compromised intermediate certificate with far less disruption than revoking the root certificate.

RapidSSL is one of the most economical of the top-tier SSL certificates. RapidSSL has a bit of a convoluted history, but it is part of the GeoTrust family of certificate authorities, which is far and away the largest digital certificate vendor. GeoTrust was purchased by VeriSign in 2006, and in May 2010 VeriSign sold its certificate authority business to Symantec. Luckily, for the purposes of my argument the history is not important. What is important is that the GeoTrust family of certificates is recognized by nearly every browser.

For example, I most recently purchased certificates from a reseller called Revolution Hosting. Their pricing for SSL certificates follows:

Type                 1 Yr   2 Yrs   3 Yrs   5 Yrs
RapidSSL              $14     $24     $33     $50
RapidSSL Wildcard    $135    $260    $360    $550
QuickSSL              $45     $86    $126    $300
QuickSSL Premium      $75    $140    $195    $300
True BusinessID      $105    $190    $270    $425

Problems Purchasing Certificates

For many organizations, SSL certificates are moderately expensive, complicated to purchase, and even more complicated to install. In my own personal experience, the process of purchasing certificates has not improved greatly over the last decade. Going through the process, it is easy to see why so few sites, especially smaller ones, use SSL certificates. Clearly, there is great room for improvement in the user experience of the purchasing process. Unfortunately, I don’t see the process improving any time soon.

It can be surprisingly difficult to get a list of the certificate authority roots (often called a CA bundle) included in specific browsers, and even more difficult to get the root certificate bundles included in most mobile devices. Unless the vendor provides a public list of included certificates, it is difficult to determine which CAs are supported without extracting the CA bundle and analyzing it, which is a major pain. The lack of detailed information about root certificates substantially complicates the problem for businesses that wish to determine which certificate may meet the needs of their users.

Because there is effectively no standard CA bundle for applications, operating systems, or mobile devices, each vendor has its own bundle of “trusted” certificates. This means, every application that employs SSL may use a different bundle, even if they are on the same machine. For example, both Windows and Mac OS X have a system-wide list of root certificates, but Firefox will use its own list of root certificates regardless of the platform.
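You can see one of these per-application bundles directly. The sketch below lists the roots that a Python installation trusts by default; the set it reports will generally differ from what Firefox, the OS keychain, or a phone ships with:

```python
import ssl

# Load this Python installation's default trust store and inspect it.
# Other applications on the same machine may trust a different set.
ctx = ssl.create_default_context()
roots = ctx.get_ca_certs()
print(len(roots), "trusted roots loaded")

for cert in roots[:3]:
    # Each entry is a dict; the subject is a tuple of relative
    # distinguished names, flattened here into a plain dict.
    subject = dict(rdn[0] for rdn in cert["subject"])
    print(subject.get("organizationName"))
```

On a minimal system the list may even be empty, which is exactly the kind of surprise that makes cross-device certificate compatibility hard to reason about.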

To make matters worse, many certificate authorities offer multiple types of certificates that may be signed with different roots. I looked at the GeoTrust, Comodo, GoDaddy, and Network Solutions web sites. Only GeoTrust clearly listed which root certificate signed each type of certificate on the main part of its site rather than buried in a support document. The situation with GeoTrust was not always so simple; when I checked a bit more than a year ago, I had to do quite a bit of digging around the site to determine which root would sign the certificate I wanted to purchase.

Previously, a quick side project to SSL-enable an IMAP server turned into an annoying extended detour after I realized that one of the older smartphones did not include the root certificate used on the IMAP server. While it was possible to load the certificate manually, the process is too complicated for multiple users, although it could be handled in a bulk provisioning process. I ended up spending a significant amount of time searching for certificate authority lists and extracting certificate bundles from several smartphones to figure out which certificate to purchase that would cover them all.

Some Improvements in Purchasing Certificates

SSL certificate compatibility is gradually improving as applications, systems, and devices with out-of-date certificate bundles are gradually retired, and as root and intermediate certificates expire and certificate authorities issue new root certificates. This means that if you have a server with a multi-year SSL certificate issued several years ago, its root certificate may differ from the current one. This matters if you are trying to connect to your SSL server from machines or devices with out-of-date certificate bundles.

Unfortunately, a market for automatic certificate installation in common machine configurations never developed. Both Microsoft and Apple have made strides with better GUI administration tools for SSL certificates. A number of web hosting services sell SSL certificates with installation for users who pay for the certificate and a static IP address. Another improvement on the horizon is RFC 3546, the Server Name Indication (SNI) extension for TLS. SNI will effectively allow name-based virtual hosting to use SSL, similar to name-based virtual hosts in HTTP 1.1. One major benefit is that this will allow multiple SSL-enabled hosts on the same IP address. These are welcome improvements, but we still have a long way to go.

Appendix: A Brief History of RapidSSL and GeoTrust

GeoTrust became a certificate authority in 2001 when it purchased Equifax Digital Certificate Services from Equifax, which is why many of the GeoTrust root certificates are Equifax roots. FreeSSL launched in 2001 and offered free SSL certificates with its own single root certificate. These were popular, but only had 92% browser compatibility. In 2002, FreeSSL began to offer chained SSL certificates under the ChainedSSL brand for $35 a year, which was a very low price at the time. In 2003, FreeSSL relaunched and temporarily offered free one-year ChainedSSL certificates and ChainedSSL wildcard certificates. In February 2004, FreeSSL launched a new brand called StarterSSL, which was a single-root certificate. Also in February 2004, FreeSSL relaunched the FreeSSL brand as a 30-day free trial certificate. The FreeSSL root certificate signed both the FreeSSL and StarterSSL certificates. Later in 2004, FreeSSL launched another brand called RapidSSL, which combined the StarterSSL single-root certificate with included support.

In 2005, FreeSSL formally changed its name to RapidSSL. VeriSign purchased Thawte in 1999 and GeoTrust in 2006. At this point some of the details are fuzzy and involve a number of subsidiaries in Europe and Japan, but GeoTrust now apparently owns RapidSSL. In May 2010, Symantec purchased VeriSign’s security certificate business and now controls all roots from all the prior acquisitions.


OpenID Trends: Improved Usability and Increased Centralization

The OpenID authentication framework is the best known of the federated user-centric identity systems. OpenID has effectively become the first commonplace single sign-on option for the Internet at large. Most sizeable Web-based service providers, such as AOL, Google, Facebook, Microsoft, MySpace, and Yahoo!, have integrated at least limited support for OpenID. Services often run OpenID authentication side-by-side with their in-house authentication or as an alternate method of authentication. Once the user has authenticated via their OpenID provider, their credentials can be used to automatically sign the user into other services previously linked to their OpenID. Widespread support has made OpenID the de facto authentication mechanism for low-value transactions on the Web.

Two quick and somewhat loose definitions. An OpenID Provider is part of the backend of an identity system that offers authentication services to other systems, known as OpenID Relying Parties. Say your favorite blog requires that you log into Google to verify your identity to comment on a post. In this case, Google would be the OpenID Provider (Identity Provider is the generic term) and your favorite blog would be the Relying Party, since it depends on Google to handle the details of authenticating you so you can post.


OpenID has made great improvements in usability in the last several years. Many people found early OpenID implementations confusing. Users needed to first enter the URL that served as their OpenID identifier. Without an existing cookie, users would then have to enter their email address and password to complete the authentication. In addition, the user’s browser window was typically redirected to the OpenID provider’s site and then redirected back to the service they were trying to log into, resulting in further confusion. Service providers found that the combination of URL-based identifiers and a login sequence that differed from the entrenched standard of a username and password confused many people.

Each of these factors significantly reduced the usability of OpenID. However, OpenID specifications and implementations have evolved to mitigate or eliminate many of the usability problems. In many current deployments, users simply click on the logo of their OpenID Provider (e.g., Google or Yahoo!) and then log in with familiar credentials without realizing the authentication is OpenID-based. One significant unsolved usability problem is that OpenID offers no support for Single Log Out. On public or shared computers this is a significant security risk, as well as a usability problem, as subsequent users may find themselves signed in under the wrong user name when navigating to new sites.

User-centric identity theoretically offers end-users more control over their own identifiers; in practice, however, the amount of control depends on how much control the user has over the domain name or service behind the OpenID URL. Users may maintain multiple OpenIDs, and OpenIDs may be delegated. For example, an individual may wish to use a personal domain as an OpenID URL. The problem is that this requires the skills to run an OpenID server, as well as the overhead of maintaining and securing it. There are two straightforward solutions to OpenID delegation, both of which require some technical facility. The first, and most common, requires inserting a block of HTML containing the delegation commands on a Web page of the site being delegated to the OpenID Provider. The second requires adding a DNS CNAME for a host on the site being delegated to the OpenID Provider. Most individuals are highly unlikely to have this knowledge, the desire to obtain it, or even the awareness that it exists.
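The HTML delegation block is just a pair of link tags in the page head. A sketch follows; the provider and identifier URLs are placeholders for whatever your OpenID Provider actually documents:

```html
<!-- OpenID 1.1 delegation (URLs are illustrative placeholders) -->
<link rel="openid.server" href="https://openid.example-provider.com/server">
<link rel="openid.delegate" href="https://openid.example-provider.com/user/alice">
<!-- OpenID 2.0 equivalents -->
<link rel="openid2.provider" href="https://openid.example-provider.com/server">
<link rel="openid2.local_id" href="https://openid.example-provider.com/user/alice">
```

With these tags in place on your personal domain's page, a Relying Party discovers the provider from the page and sends the authentication there, while your identifier remains the URL you control.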

Centralizing the Decentralized

OpenID was designed as a decentralized, federated, user-centric identity system. The OpenID infrastructure as a whole is decentralized. There are no dependencies on any single piece of hardware, software, service, individual, or company. The independent OpenID Foundation holds the intellectual property for the OpenID standards. The lack of dependencies removes the vulnerability of a catastrophic single point of failure.

I would argue that the common use cases for OpenID are increasingly centralized and that realistic options for individuals to have any real control over their OpenIDs are decreasing. I recognize that some may argue with the last statement, but I would like to use a simple metric, which is the answer to this question: Can you take it with you? In the vast majority of common use cases, the answer is no. I would argue that the only viable way to have a truly user-centric OpenID is to own a domain name and to have control over its DNS. The lack of end-user control does not mean the system functions any less efficiently, quite likely the opposite, but it does mean that it is not particularly user-centric.

In practice, OpenID appears to be heading towards greater centralization for Web-based authentication. Many services that offer OpenID authentication only accept authentication from a very limited set of OpenID providers. Services that accept OpenID authentication from any OpenID provider often place the general authentication in a less prominent location. Service providers have an incentive to limit authentication services they accept as it can significantly reduce risk and complexity and most users already have credentials from one of the major service providers. I believe this situation is not inherent to OpenID and would likely occur with any successful user-centric identity system. For example, Twitter does not support OpenID, rather it uses OAuth for both external authorization as well as authentication. Many services offer support for authentication via Twitter OAuth in the same interface as other providers that use OpenID.

Furthermore, most large OpenID-enabled services are Identity Providers, meaning they offer an authentication mechanism to other services. Most smaller OpenID-enabled sites are OpenID Relying Parties, meaning they accept authentication from OpenID Providers. OpenID Providers typically offer authentication services but do not accept outside OpenID authentication themselves. Effectively, a few OpenID Providers serve many OpenID Relying Parties. Delegating the development and maintenance of user account management systems and password reset flows is a benefit of accepting authentication as an OpenID Relying Party. In addition, these services gain the benefit of any advances in OpenID security and usability.

OpenID Increasingly Popular

In the close of my 2008 article: “The Promise and Problems of OpenID,” I wrote: “OpenID is clearly gaining in adoption and importance. Currently, OpenID is both too lightweight for enterprise identity management and too insecure for sites with financial or other highly sensitive data. Some of the current problems will be mitigated by OpenID extensions and new more secure mechanisms for OpenID authentication and improved phishing protection. Businesses, especially those with consumer Web-based services, would do well to familiarize themselves with the technology and pay attention to its progress.”

When people authenticate to popular services via OpenID without even having to know they are using it, this indicates OpenID is becoming a mainstream authentication infrastructure. The protocol is evolving rapidly, and it appears that common implementations in the future may be hybrids of OpenID and the OAuth authorization protocol. Still, there are substantial costs to implementing, managing, securing, and supporting user account management systems. Offering authentication as an OpenID Relying Party can significantly reduce these costs and the friction of new account signups for people with existing OpenIDs. However, this reduction in cost comes with a loss of control over user account information that must be weighed against the benefits. Even though long-term stability for OpenID may be a ways off, it is clearly a critical technology to monitor.

* This article originally appeared as OpenID Trends: Improved Usability and Increased Centralization in the August 2010 issue of Messaging News.

How to Email a Complete Web Page From Any Browser

Email is still one of the most convenient ways to quickly share links with friends and colleagues. Unfortunately, there are two major problems. First, many people’s browsers are not configured to work correctly with their email client, especially for webmail. Second, many browsers only support emailing a link to a web page, not the entire page. Furthermore, native support for emailing links is inconsistent and often formatted in a way that may break links for the recipient. In my Messaging News article A Better Way to Share Links in Email, I described these problems as well as a solution based on the free Readability bookmarklet that should work in nearly any browser and typically produces better results.

Native Options

This article looks at your options for emailing full web pages from nearly any browser. Unfortunately, there are few native options. If your primary email client is Outlook 2007, you can select View -> Toolbars -> Web, open your web page in the built-in browser, and then select “Send Webpage by Email” from the Actions menu. In Internet Explorer version 6 and higher, you can click the “Send Page by Email” button. If you use both Apple Mail and the Safari browser, you can select “Mail Contents of This Page” from the File menu.


The next simplest option is to use the EmailTheWeb service. The service requires that you sign in with a Google Account and uses your Gmail account to send the message. The service is free for up to 25 messages a day. EmailTheWeb will also archive your pages for a limited time and mirror the original web page for the recipient in cases where the HTML was too difficult for the application to send correctly. Paid plans range from $20 to $80 a year and include longer archiving and mirroring periods. You can use the service by entering your URL on the web site, with a browser bookmarklet, as a Google Toolbar button in IE, or as a Firefox extension.

Limitations of Emailing Web Pages

All of the above methods of emailing a full HTML page have limitations. In particular, complex HTML pages will likely look different to the recipient, as the application sending the web page may modify the contents when sending, and the recipient’s email client may further modify the page when rendering it. Webmail clients typically place strict limitations on style sheets in email, and many block images by default. The Campaign Monitor Guide to CSS support in email clients is an excellent overview of the limitations, and Campaign Monitor has more details on other aspects of HTML in email in its resources on designing and building emails. In some cases it is possible to simply copy and paste the entire page into an email message, but the results are typically far from satisfactory, especially since the style sheet is often not copied along with the HTML. Some pages have a print link that produces a simplified version that works better with cut and paste.
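These packaging constraints come from how HTML mail is structured. As an illustrative sketch (my own example, not how any of the services above are implemented), Python’s standard library can assemble the kind of multipart/alternative message that a “send page by email” feature produces, with a plain-text fallback for clients that restrict HTML:

```python
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def build_page_email(sender, recipient, subject, html, plain_fallback):
    """Build a multipart/alternative message: a plain-text fallback
    plus the HTML page body, the structure most mail clients expect."""
    msg = MIMEMultipart("alternative")
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # Clients render the last alternative they support, so the plain
    # text part goes first and the HTML part last.
    msg.attach(MIMEText(plain_fallback, "plain"))
    msg.attach(MIMEText(html, "html"))
    return msg

# Sending is one call once you have an SMTP server to hand
# (server details are deployment-specific):
# with smtplib.SMTP("localhost") as s:
#     s.send_message(build_page_email(...))
```

Note that this sends only the page markup; images and style sheets referenced by URL are still subject to the blocking and CSS limitations described above.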

Readability Offers a Better Solution

In general, I recommend that people first use the Readability bookmarklet to clean up the page and then send the new version via email. Unmodified web pages will often not look like the original and may in fact be far less readable if an essential element is modified or removed. I regularly see pages whose text becomes mashed together, hidden beneath images, or otherwise unreadable. The page may also contain many unnecessary elements, such as page navigation, and embedded items, such as Flash, that will not typically arrive correctly. Web pages processed by Readability often fare much better.

Readability is an excellent tool from Arc90 that reformats web pages, strips out extraneous elements and ads, turns the text into a single column, and generally improves the typography. I find it makes nearly any web page significantly easier and more pleasant to read. There are several advantages to forwarding pages processed by Readability. First, Readability inserts a reload button into each page, so the recipient only needs to click the button to see the original in the browser. Second, Readability includes a print link with a style sheet customized for printing. Third, the pages are greatly simplified, easier to read, and contain less HTML for an email client to mangle. From all reports, it is also very helpful for people with limited vision, as it increases accessibility. Pages processed with Readability are far easier for recipients on mobile phones to read and typically load faster; I tested reading emailed pages on both iPhone and Android devices. Finally, since you are mailing the entire page, the recipient will be able to read it offline.

To use Readability, just drag the bookmarklet to your toolbar and click it on any page you want to improve. Readability offers a selection of fonts, including two licensed from TypeKit, along with options to change the size of the text, modify the width of the margins, and optionally convert all links to footnotes. You can find more information about Readability in the Arc90 blog posts Introducing: Readability 1.5 and Readability Updated: An End To The Yank Of The Hyperlink. The most recent update to Readability includes the long-awaited ability to automatically stitch together multi-page articles, a feature that none of the native clients offer. The service is free, and the Readability source code is available under the Apache license. For users of Safari 5 on the Mac, Safari Reader is based on Readability and offers much of the same functionality, but it does not have any customization options. The “Mail Contents of This Page” option works from Safari Reader.

There are a few limitations. First, Readability will not work on every web page; it is specifically designed for longer articles and does not fare well on complex home pages. Second, the process adds an extra step, which is decidedly less convenient. Finally, in testing I found that ad blockers caused Readability to over-block images in some cases. Where Readability fails, I find the Instapaper Mobilizer service a good alternative, but it is not designed for high-volume use.

Federal Digital Identity Proposal Lacking in Usability

The White House announced The National Strategy for Trusted Identities in Cyberspace (NSTIC) proposal and a NSTIC Fact Sheet on The White House blog. The NSTIC proposal (PDF) describes a plan to implement a federated online identity system with strong authentication. The document states the President expects to sign a final version in October 2010 and the strategy will likely significantly influence the government’s identity management efforts. In this post I will discuss the usability aspects of the proposal.

One of my primary concerns is that the proposal barely mentions usability factors within the identity system, even though they will be crucial for gaining public acceptance and critical to its effectiveness. Researchers studying usability and security have repeatedly shown that people are likely to resist or circumvent security in a system with poor usability. One of the guiding principles for the strategy is that “Identity Solutions will be Cost-Effective and Easy To Use.” However, the section is only half a page long and largely discusses the potential benefit of reducing the number of username and password combinations individuals must remember. The section includes a few sentences stating that the new identity system should take advantage of as much existing, widely used infrastructure as possible and that service providers should conduct usability studies. The section leaves the reader with the impression that usability is actually unimportant, even though the proposal lists ease of use as a major goal.

I would argue that most modern identity systems have been overly complicated for individuals to use and have required too much cognitive overhead for routine transactions. This is in no small part why it has been so difficult to move beyond the much-criticized username and password combination for user authentication. In order for a new identity system to provide significant improvements in reliability, assurance, security, and privacy, we must make significant improvements in usability. This is not a new problem. In his 1992 paper Observing Reusable Password Choices, Eugene Spafford published research detailing the problems of reusing weak passwords on multiple sites (Spafford 1992). In their 1999 paper Users Are Not the Enemy, Adams and Sasse investigated compliance with security policies, in particular password management policies, at several companies and found that compliance rates were substantially lower when policies conflicted with or prevented common work practices. In their 2006 paper Why Phishing Works, Rachna Dhamija and colleagues showed that individuals consistently fail to detect fraudulent web sites even when security indicators provide notification that something is amiss.

Another component of usability is accessibility. The proposal makes no mention of how the new identity systems will accommodate the less technically savvy and less able-bodied segments of the population. The strategy should consider those with limited vision, limited mobility, or other disabilities. The American Foundation for the Blind provides the following statistics on adult Americans with limited vision: 8.0 million aged 18-44, 10.7 million aged 45-64, 2.8 million aged 65-74, and 3.7 million aged 75 and older, a total of 25.2 million adults who have trouble seeing even with glasses or contact lenses.

The proposal promotes a federated and user-centric identity system. The common definition of a federated identity system is one that allows one service to accept authentication from another service. User-centric identity systems allow individuals some measure of control over their identities (typically a username or other unique identifier) and the attributes (age, email address, citizenship) attached to that identity. The usability problems for federated identities, user-centric identities, and attribute exchange are neither trivial nor solved. OpenID is arguably the first widely adopted federated authentication mechanism on the internet with a user-centric model.

The history of OpenID is an excellent illustration of the usability challenges. Early incarnations required that users enter their OpenID URL to begin the authentication process. Their browser session was then redirected to the OpenID Provider they used for authentication, which was often on a different domain than the one they were attempting to log in to. Finally, after a successful authentication, the user was redirected back to the original site. The change from the traditional username and password combination, combined with a confusing authentication flow involving multiple redirects, left many users confused. OpenID specifications and implementations have evolved to mitigate or eliminate many of these usability problems. In many current deployments, most users will not even realize they are using OpenID for authentication: they simply click on a Google or Yahoo logo and then log in with familiar credentials.
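The redirect at the heart of that flow is just a URL the Relying Party constructs for the user’s browser. A minimal sketch, using parameter names from the OpenID 2.0 Authentication specification (discovery, association, and response verification are omitted, and all hostnames are hypothetical):

```python
from urllib.parse import urlencode

def build_auth_redirect(provider_endpoint, claimed_id, return_to, realm):
    """Build the checkid_setup redirect URL a Relying Party sends the
    user's browser to; the Provider authenticates the user and then
    redirects back to return_to."""
    params = urlencode({
        "openid.ns": "http://specs.openid.net/auth/2.0",
        "openid.mode": "checkid_setup",
        "openid.claimed_id": claimed_id,
        "openid.identity": claimed_id,
        "openid.return_to": return_to,  # where the Provider sends the user back
        "openid.realm": realm,          # the scope of the authentication request
    })
    return f"{provider_endpoint}?{params}"
```

The cross-domain hop this URL represents, from the site the user typed into their address bar to an unfamiliar Provider domain and back, is exactly the step that confused early users.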

This post is a revised version of the usability portion of the comments I submitted to the official NSTIC submission site. I based the critique on research from my dissertation, Online Identifiers in Everyday Life, in which I examined the ways that social, technical, and policy factors affect individuals’ behavior with online identifiers.

Why Pinboard is My Favorite Bookmarking Service

Pinboard is a bookmarking service that allows you to easily save, tag, annotate, share, and archive bookmarks independent of your browser. Pinboard describes itself as “antisocial bookmarking,” which highlights its capabilities as a private and personal archiving tool compared to the social features offered by Yahoo’s Delicious service. I find Pinboard a simple, fast, and reliable way to save bookmarks and archive web pages for future reference. I have been happily using the service for nearly five months (update: a year) and recommend it highly.

Pinboard has become a part of my everyday online reading experience, as I use it to archive both a bookmark and the full text of any article I find interesting or plan to read later. My primary use of Pinboard is as a personal archive rather than a public bookmark sharing service, and I prefer it to Yahoo’s Delicious bookmarking service, although Pinboard has fewer options for sharing and tag management. For example, it does not support the Delicious style of aggregating multiple tags in tag bundles or the ability to share a bookmark with a specific user.

To start using the service, simply drag one of the Pinboard bookmarklets into your browser bookmark bar. The first style of bookmarklet opens either a new page or a popup window that allows you to edit the URL, title, description, and tags, and optionally mark the bookmark as private or “to read”. I use the second style of bookmarklet, which Pinboard calls “read later.” This bookmarklet saves the page, automatically marks it as “to read”, and returns you to the place on the page where you left off without opening a new window or a popup. The “to read” status allows you to quickly build up a reading list without interrupting your workflow.

You can aggregate links posted to multiple services by configuring Pinboard to watch for links in your Twitter posts, Twitter favorites, or pages saved to Instapaper, Read It Later, Delicious, and Google Reader. You can easily save links from a BlackBerry or iPhone using a private email address from Pinboard. I find the ability to centralize my bookmarks from multiple services very convenient. Pinboard automatically expands any shortened links and stores the original URL. Full text search on Pinboard includes the title, description, tags, and notes, but not the text contained in the pages themselves. Pinboard also allows you to narrow query results by public vs. private status, starred status, and source, e.g. Twitter.

Pinboard offers a single paid add-on that archives the entire page (HTML, CSS, and images) for each bookmark you save. You can then view the snapshot of the page even if the original disappears. The cost is $25 a year minus your original sign-up price. Pinboard recently introduced a feature that lets all users download an offline copy of the last 25 URLs saved. The developer says that he plans to eventually allow users to download their entire archives.

Pinboard offers multiple ways to import and export data, including a format compatible with Delicious. Pinboard provides both public and private RSS feeds of bookmark data, including tag-based feeds. The Pinboard API is compatible with the Delicious API, which means that any application that uses the Delicious API should work with Pinboard simply by changing the API endpoint URL. Unfortunately, most bookmarking applications do not allow end users to change the API endpoint URL, and few directly support Pinboard. On the Mac, both the Delibar ($18) and Pukka ($17) desktop applications support Pinboard. The best solution for mobile devices is to use the mobile web version of Pinboard. Update: The Delibar Touch application for the iPhone and iPad ($1.99) works with both Pinboard and Delicious. I recommend it.
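The API compatibility is concrete: Pinboard answers the same calls, with the same parameter names, as the Delicious v1 API. A minimal sketch of building a posts/add request URL (request construction only; authentication details are omitted, and the helper function name is my own):

```python
from urllib.parse import urlencode

# Pinboard mirrors the Delicious v1 API, so a Delicious client can
# usually be repointed at Pinboard by swapping this base URL.
PINBOARD_ENDPOINT = "https://api.pinboard.in/v1"

def build_add_bookmark_url(url, title, tags="", shared=False):
    """Build a posts/add request URL, the same call a Delicious
    client would make against its own endpoint."""
    params = urlencode({
        "url": url,
        "description": title,  # the v1 API calls the title "description"
        "tags": tags,          # space-separated tag list
        "shared": "yes" if shared else "no",
    })
    return f"{PINBOARD_ENDPOINT}/posts/add?{params}"
```

Swapping `PINBOARD_ENDPOINT` for a Delicious endpoint would produce the equivalent Delicious request, which is exactly why Delicious-aware applications can work with Pinboard unchanged.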

Overall, Pinboard is an excellent option for storing and archiving bookmarks, and I recommend it highly. The service is not free: currently the price to join is $6.38 (update: $7.41), and the cost increases by a fraction of a cent for each new user. I like this pricing model, as it is inexpensive and allows the developer to support the service without ads and without taking external funding. This leaves the service with a smaller but more active user base and, more importantly, almost no spam. Recent Pinboard releases have improved bulk editing capabilities, but it is not currently possible to add or remove tags on a set of items returned from a search of your own bookmarks. Hopefully, the developers will eventually add this feature, as it would make it possible to quickly and easily organize large numbers of uncategorized bookmarks. Update: The developers added this functionality. Tag management is now far more flexible.

If the idea of social bookmarking seems foreign or the benefits do not seem clear, I highly recommend taking three minutes to watch the short and entertaining animated video Social Bookmarking in Plain English by Common Craft. What is Antisocial Bookmarking? is a nice post on the Pinboard blog by Maciej Ceglowski, the founder of Pinboard, explaining his reasons for creating the service.

* This article originally appeared as Why Pinboard is My Favorite Bookmarking Service in my Messaging News “On Message” column.

Update 2010-12-16 Mentioned feature additions, Delibar touch support, and price update.

You should follow me on Twitter.

iPhone Screenshot and Photo Smart Album Hack

I take a lot of screenshots when I research products, both on the desktop and on the iPhone, so having some way to automate organizing my collection is important. The problem is that screenshot images taken with the iPhone have no EXIF metadata, which means there is no straightforward way to produce a list of all your screenshots.

After a little experimentation, I found a workable but not ideal solution: you can use the lack of EXIF metadata as conditions to group all the images. Screenshots are saved as PNG files on the original iPhone and the iPhone 3GS (the two models I had access to) and have no EXIF records. The only other metadata fields available are the filename, the file size, and the modified and imported dates. The PNG extension in the filename is the one existing feature you can search for; all others have to be unknown. I selected two features, Aperture and ISO, even though one would work, in the hope that this would reduce any false positives.

A Smart Folder recipe for iPhone Screenshots

  • Match all of the following conditions
  • Aperture is Unknown
  • ISO is Unknown
  • Filename contains PNG

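The same screenshot test can be scripted directly. As an illustrative stdlib sketch (my own helper, unrelated to iPhoto or Spotlight), the width and height can be read straight out of a PNG’s IHDR chunk and matched against the 320x480 screen of these iPhone models:

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_dimensions(data):
    """Return (width, height) for a PNG byte stream, or None otherwise."""
    # A PNG starts with an 8-byte signature; the IHDR chunk follows
    # immediately: a 4-byte length, the type "IHDR", then width and
    # height as big-endian 32-bit integers.
    if not data.startswith(PNG_SIGNATURE) or data[12:16] != b"IHDR":
        return None
    return struct.unpack(">II", data[16:24])

def is_iphone_screenshot(data):
    """Screenshots from the original iPhone and iPhone 3GS are
    320x480 PNGs, per the Smart Folder recipe above."""
    return png_dimensions(data) == (320, 480)
```

This checks only the file format and dimensions, so like the Smart Folder it can produce false positives for any other 320x480 PNG.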

Photos taken on the iPhone are saved as JPEGs and contain EXIF metadata; the iPhone 3GS embeds many more fields than the original iPhone. The easiest feature to select is “Camera Model.” The field type must be “is” or “is not”; there is no option for “contains”, so you will have to specify each phone model separately.

A Smart Folder recipe for iPhone Pictures

  • Match any of the following conditions
  • Camera Model is Apple iPhone
  • Camera Model is Apple iPhone 3GS


Searching for Screenshots from the command line

All iPhone screenshot images have a width of 320 pixels and a height of 480 pixels, in portrait or landscape. It is possible to search for these files using the Spotlight command line tool mdfind and integrate them into other scripts. There are many other options for searching for images with the full Spotlight syntax, and it is possible to execute these as Raw Queries in the Finder or to use a Spotlight front end such as HoudahSpot, but that is a topic for another post.

mdfind -onlyin $HOME/Pictures \
  'kMDItemKind == "Portable Network Graphics image" &&
   kMDItemPixelHeight == 480 && kMDItemPixelWidth == 320'
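To integrate the search into a larger script, the command can be wrapped with subprocess. A hedged sketch (macOS only, since mdfind is part of Spotlight; the function name is my own):

```python
import subprocess

# The same Spotlight query as the mdfind command above.
SCREENSHOT_QUERY = (
    'kMDItemKind == "Portable Network Graphics image" && '
    'kMDItemPixelHeight == 480 && kMDItemPixelWidth == 320'
)

def find_iphone_screenshots(folder):
    """Run mdfind over the given folder and return matching file
    paths, one per line of mdfind's output."""
    result = subprocess.run(
        ["mdfind", "-onlyin", folder, SCREENSHOT_QUERY],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]
```

Passing the query as a single argv element avoids the shell quoting needed on the command line.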

Notational Velocity – Elegant Note Taking for the Mac

Notational Velocity is a free and open source note taking application for Mac OS X that is extremely simple, fast, and stable. I find the minimalist interface very functional and pleasant to use. It is one of my favorite applications.

I mentioned Notational Velocity’s ability to sync with the Simplenote iPhone note taking application in my Messaging News Magazine column Great iPhone and iPad Apps for Reading and Sharing Docs. The combination of Notational Velocity and Simplenote allows me to create, edit, and manage notes that are seamlessly synchronized between my desktop and iPhone without worrying that I will have the latest version on the other device.

Dropbox also allows for synchronizing Notational Velocity across multiple machines, and the author provides source code so that you can run your own private server on Google App Engine.

Aside from its ease of use and speed, some of the Notational Velocity features I like are:

  • Makes no distinction between searching notes and creating new notes
  • Displays search results incrementally to help rapidly filter documents
  • Saves automatically, no save button needed
  • Allows data export with a single click
  • Preserves creation and modification timestamps for both import and export
  • Optionally stores notes as plain text, rich text, or HTML
  • Optionally stores notes as a single database or as plain text files in a directory
  • Optionally encrypts the database and provides secure text entry mechanism
  • All commands have keyboard equivalents

* This article originally appeared as Notational Velocity – Elegant Note Taking for the Mac in my Messaging News “On Message” column.

Great iPhone and iPad Apps for Reading and Sharing Docs

Instapaper, Dropbox, GoodReader, and Simplenote are my favorite applications for reading, writing, and sharing documents on the iPhone and the iPad. I have used each application for more than six months and I highly recommend all of them.

Instapaper

The Instapaper application makes it simple and pleasant to read lengthy articles on your mobile device. Instapaper is optimized for the type of articles where you find yourself starting in your browser and thinking, “I’d rather read this later”. The application automatically loads any new content from the Instapaper Web service, which reformats Web pages for small screens and strips away unnecessary elements. The service provides an experimental option to save pages formatted for the Kindle as well.

There are multiple ways to save content to the Instapaper service including a bookmarklet, email, or applications that integrate Instapaper directly. The “Read Later” bookmarklet is compatible with most desktop browsers, mobile browsers and Google Reader. Each Instapaper user receives a unique email address that will import included links and text. Many iPhone and iPad RSS feed readers, Twitter clients, and social bookmark clients support saving links to Instapaper directly. The Instapaper service allows sharing of individual articles via email, Tumblr, and several Twitter clients.

Instapaper is available in two versions, a free ad-supported version with a limit of 10 articles, and a $5 (USD) pro version with a 250-article limit. The pro version includes additional features, such as background updating, folders, remembering the last read position, tilt scrolling, multiple font options, and disabling rotation. I find that the pro version is well worth the price.

Dropbox

In a crowded market of Web-based consumer storage services, Dropbox is popular and widely praised. The minimal user interface of the desktop application is one reason for its popularity. When I say minimal user interface, in most cases I mean non-existent. This is the beauty of Dropbox. After installing the application, Dropbox appears as a folder on your desktop. The folder is essentially magic. Any files in the folder are automatically synchronized to all other machines where you have Dropbox installed. Mobile Dropbox clients synchronize with the server upon launch. In my experience, it just works, and this is high praise. Dropbox is fully accessible via a Web interface for devices without an installed Dropbox client. Dropbox saves any revisions to your files for 30 days by default. These revisions are only available via the Web interface and do not count against your storage quota.

Working with shared files on Dropbox is as easy as working with files on the desktop. Shared files and folders are synchronized with all authorized users’ accounts. Access control for sharing is based on email addresses. My only real complaint is that sharing must be configured from the Dropbox Web interface rather than a Dropbox client, which is not intuitive. It is important to recognize that any shared files count against the storage quota for all shared accounts. Each user’s Dropbox folder has a public directory; any files placed in that directory become publicly accessible without access control. The mobile Dropbox client can generate links to share individual files with any email address. Be careful: these mobile links can also share private files, and currently there is no way to revoke access.

Another reason for Dropbox’s popularity is its broad platform support. Mobile clients for Dropbox are available for the iPhone, iPad, and Android devices. A BlackBerry version is in development. Desktop clients are available for Mac, Windows, and Linux. All Dropbox clients are free. The mobile Dropbox client takes advantage of the document viewers that are part of iPhone OS to open files directly. Supported formats include plain text, RTF, Microsoft Office documents, iWork documents, PDFs, Web pages, images, music files, and videos. Dropbox only supports viewing files; files must be edited with another application.

Some mobile applications such as GoodReader can read and write files from the Dropbox service, although the process is a little convoluted. Dropbox recently added a new mobile API to allow iPad applications to easily save files to a Dropbox account. Saving files to Dropbox is far easier with the most recent version of the GoodReader iPad application due to the new APIs. Even better, the Dropbox iPad application allows you to open files directly in other applications. Hopefully the iPhone Dropbox application will gain this functionality with the next major version of iPhone OS.

Dropbox is a subscription service that uses the Amazon Simple Storage Service (S3) for the backend store. A free account is available with 2 GB of storage. There are two paid upgrade options—a 50 GB option for $10 (USD) a month or $100 (USD) a year and a 100 GB option for $20 (USD) a month or $200 (USD) a year. Paid accounts can optionally save file revision history forever.

GoodReader

GoodReader works well with long and complex PDF documents. I have used it to read PDFs that are several hundred pages long without a problem. The iPhone and iPad support PDF files natively, but navigating long documents is cumbersome as there is no support for jumping to a specific page, for using PDF bookmarks and outlines, or for searching PDF files. GoodReader supports navigation to specific page numbers, PDF bookmarks and outlines, and full text and bookmark-based search. The application includes a night mode for reading in the dark and an autoscroll mode for reading long files without having to manually select the next page.

GoodReader’s support for text files includes a number of features not available in the native viewer, including the ability to edit text files and reflow text when the font size changes or the device is reoriented. One feature, called PDF Reflow, extracts plain text from PDF files and displays it in GoodReader’s text file viewer so it can be reflowed, copied to the clipboard, or edited. PDF Reflow should not be confused with accessible PDFs, which are sometimes called reflowable PDFs.

GoodReader supports file transfer over WiFi in addition to many storage services, including POP and IMAP email servers, WebDAV servers, Apple’s MobileMe, Dropbox, Google Docs, and FTP servers. There are two versions of GoodReader for the iPhone. The standard version is $0.99 (USD); access to POP and IMAP email servers, Google Docs, and FTP servers each requires a $0.99 (USD) in-app upgrade purchase. GoodReader Light includes all available types of server access and is available for free on the iPhone; however, it is limited to storing five files. GoodReader for the iPad is currently on sale for $0.99 (USD) and includes all available types of server access.

Simplenote

Simplenote is a note taking application for the iPhone and iPad that automatically synchronizes with the Simplenote Web service. The application has a basic feature set, but it works very well and is easy to use. Notes are stored as plain text and can be forwarded as email messages or deleted individually. Unfortunately, there is no mechanism to work with multiple notes at once. The built-in search is fast and searches incrementally as you type, quickly narrowing the list of notes matching the search term. Options include changing the sort order, preview, link detection, and display of file modification dates. If the user has installed TextExpander, snippets will expand automatically. All notes can also be viewed or edited in any Web browser using the Simplenote Web service.

Currently, support for the iPad is limited to the same feature set as the iPhone aside from running in full screen mode. The developer plans to add additional iPad specific features shortly. The Simplenote API enables synchronization with multiple desktop applications including Notational Velocity—a simple, fast, stable note taking application for Mac OS X. This means I can create notes, make changes or additions either on the desktop or my iPhone and they are automatically synchronized. I am very happy with the setup.

Simplenote is free and ad-supported. A $9 (USD) a year premium add-on removes ads, provides an automatic backup, an RSS feed, the ability to create notes by email, access to beta versions, and prioritized support.

* This article originally appeared as Great iPhone and iPad Apps for Reading and Sharing Docs in the May 2010 issue of Messaging News.