Tracking, Geolocation and Digital Exhaust

You are unique… In so many ways…

The accounting systems on which modern society depends are surveillance systems when viewed with another lens. All administrative, financial, logistics, public heath, and intelligence systems rely on the ability to track people, objects, and data. Efficiency and effectiveness in tracking have been greatly aided by improvements in data analysis, computational capabilities, and greater aggregations of data.

Advances in social network analysis, traffic analysis, fingerprinting, profiling, de-anonymization/re-identification, and behavioral modeling techniques have all contributed to better tracking capabilities. In addition, modern technological artifacts typically contain one or more unique hardware device identifiers. These identifiers—particularly in mobile devices, but also RFIDs, and soon Intelligent Vehicle-Highway Systems—are widespread, but also effectively unmodifiable and relatively unknown to most of their owners. For example, with mobile devices, each network interface (cellular, Bluetooth, WiFi) requires a minimum of one unique hardware identifier—all uniquely trackable. One hand, aggregating these unique identifiers allows services like Google, Skyhook, and others to associate geolocation data with WiFi access points and provide useful services. On the other hand, Samy Kamkar’s work described in Hack pinpoints where you live: How I met your girlfriend shows the potentially awkward and invasive side effects.

Individuals generate transactional data from common interactions offline such as card key systems and nearly every online transaction. Improvements in techniques to correlate disparate data as well as techniques to analyze the unique characteristics of software, hardware, network traffic to form a fingerprint is frequently unique. For example, a large-scale analysis of web browsers from the Panopticlick project showed that over 90% of seemingly common consumer configurations were effectively unique. IP geolocation data can be used to increase security as with Detecting Malice with ModSecurity: GeoLocation Data or it can be used in ways that are quite Creepy.

Another major shift is the widespread collection and aggregation of geolocation information from mobile devices. Location can be a highly unique identifier, even if the mobile device changes. Philippe Golle and Kurt Partridge show that two data points sampled during the day—one at home and one at work are enough to uniquely identify many individuals, even in anonymized data. Geolocation data can also reveal significant information about the people spend time with and a view of their social network. Jeff Jonas sums this up well in Your Movements Speak for Themselves: Space-Time Travel Data is Analytic Super-Food! In a sense the mobile phone has caused an enormous increase in uniquely identifiable data that can be used for tracking.

An average person now generates a constant stream of geolocation data that is collected by mobile carriers. Geolocation information is generated from cellular triangulation, geolocated IP addresses, and integrated GPS units, which deliver down to 10 meter accuracy. Geolocated mobile transaction data aggregated across multiple carriers is increasingly available for commercial use. It is possible to accurately track large numbers of individuals in constrained environments simply by sniffing the ITMI (temporary ID) as Path Intelligence does in mall, although they could sniff the IMEI just as easily, but they say they do not to protect privacy. Still, large-scale analysis of geolocation data is in its infancy. ReadWriteWeb describes how Developers Can Now Access Locations of 250 Million Phones Across U.S. Carriers

Tracking technologies—particularly when combined with geolocation information—have matured far beyond tracking individuals and are rapidly becoming capable of tracking groups and larger populations, which could be applied to entire enterprises or political organizations. Tools and techniques have made it feasible to correlate geolocation information, commercially aggregated profiles of online use, digital fingerprints, and offline transactional data. In addition, analysis of current anonymization techniques has repeatedly shown that simply adding another source of data is enough to re-identify a large percentage of the population. The Spatial Law and Policy blog is doing a nice job of tracking the policy implications of geolocation data.

The immense potential value of geolocation and other tracking data may well provide enough incentive for it to be used in ways counter to our own interests. Potential threats for misuse of the data need to be taken into account when designing systems. For example, what is the value of highly accurate logistical data about a US corporation derived from geolocation data and social network analysis to a foreign industrial competitor? Even a small amount of data that allowed a rudimentary analysis of external individuals meeting with internal high-level executives would be a worthwhile target. Similarly, both foreign industrial interests and foreign states may be willing to spend significant resources to acquire details on the movements and meetings of political parties.

More broadly I have been thinking about the question—What does it mean for a third-party to acquire better logistics about an organization than the organization has itself? What are the policy implications when and if these tracking tools are deployed in places without the rule of law, stable transitions of government, and low levels of corruption that we assume in the US? Could changes in the design and implementation of these systems mitigate the risks outlined? For example, should these design changes include internal controls, data scrubbing capabilities, and user interfaces that more clearly indicate a big picture of what data is being given off. Are there behavioral strategies that would reduce risks? To what extent can user education reduce risk?

You should follow me on Twitter.

How and Why to Sniff Smartphone Network Traffic

Smartphone Network Connection Monitoring

Tools for monitoring and modifying connections between web browsers and web servers are essential for debugging, testing, optimizing performance, and assessing vulnerabilities of web-based applications and native applications. Developers, security professionals, and anyone with an interest in gaining insight into the lower levels of web traffic commonly use these tools.

There are many mature options for monitoring connections from desktop machines. Unfortunately, there are fewer tools to monitor connections on smartphones and these tools often require more complex configurations, as the monitoring software must run on a separate device. In this article, I present an overview of tools and methods for monitoring network connections on Smartphones including devices based on Apple’s iOS–iPhone, iPod Touch, iPad), Google’s Android OS, BlackBerry OS, and Symbian. This article focuses on inspecting HTTP and HTTPS traffic, although many of the tools and techniques described work equally well to analyze other protocols.

This article is the first part in a series: The articles in the series include:

  • An overview of the tools and techniques for monitoring smartphone network connection
  • Pros, cons, and limitations for monitoring smartphone network connections
  • Network monitoring for security analysis and self-defense

Why Monitoring is Useful

Potential use cases for monitoring HTTP and HTTPS traffic–the two primary protocols of the Web:

  • Inspecting network traffic often simplifies debugging AJAX XMLHttpRequest requests, compressed content encoding, and cookies.
  • Network connection details such as number of HTTP requests, DNS lookups, cache hits are also valuable for optimizing web application performance.
  • Many tools allow modifying requests and responses to simulate valid and invalid user input when testing applications for vulnerability analysis in addition to monitoring.
  • Network monitoring is an effective way to verify that a smartphone application securely handles user authentication and identify any inappropriate transmission of personally identifiable information such as unique identifiers and location.
  • Inspecting and modifying network traffic is essential for security analysis. For example, searching for Cross Site Scripting (XSS), SQL injection, and path traversal vulnerabilities.

Types of Monitoring Tools

Common network monitoring tools come in four major varieties: browser-based development tools, general purpose packet sniffers and network protocol analyzers, specialized HTTP/HTTPS sniffers, and specialized web proxies for debugging and security analysis.

Each type of tool has advantages and disadvantages, but there is no requirement to use a single type and combinations of tools may offer more power and flexibility. This list is in no way comprehensive, there are many specialized and hybrid tools for monitoring connections.

Two LiveCD Linux distributions contain a large number of tools optimized for penetration testing a subset of which is useful for network connection monitoring. BackTrack Linux is a very well-regarded distribution. AppSecLive the OWASP Live CD Project–soon to be known as the OWASP Web Testing Environment (WTE)–is another respected collection.

See the Top 100 Network Security Tools from SecTools.org provides a larger list.

Configurations for Monitoring

I’ll talk more about the constraints and pros and cons for each option in the second piece of this article, but briefly here are several potential configurations for monitoring.

  • Simulators allow the simplest configurations where the simulator and the monitoring software run on the same machine and share a common network interface.
  • Web proxies are a convenient option as all modern browsers supported them and only require a small change in the browser settings rather than a change in the network configuration.
  • Ad-hoc networks combined with internet connection sharing are one method to gain access to traffic. If the network monitoring host is located between the mobile device and the internet, it will typically require two network interfaces, usually one wired and one wireless.
  • Network hubs are one method to work around the problems with common switched network configurations.

Limitations for Monitoring

There are significant constraints for monitoring network connections. I’m specifically talking about WiFi-based traffic and not cellular traffic. Monitoring cellular traffic is substantially more complicated and requires specialized equipment. In nearly every case, all important web-related traffic will travel over WiFi if the cellular data connection is disabled on the device.

Limited software is one constraint. For example, there is currently no way to run Webkit Web Inspector, Firebug or LiveHTTPHeaders directly on a Smartphone. Limited networking options is adds another constraint as well as added complexity to the monitoring configuration. Typically, smartphones must communicate over wireless connections rather than wired connections, which eliminates some options for monitoring network traffic. Most modern network hardware is switched, which further limits the ability to access the traffic, even when an access point is plugged into a wired network. Additionally, wireless access points protected by WPA/WPA2 encryption employ per-user keys difficulties in sniffing are similar to switched networks.

Finally, monitoring connections encrypted with SSL/TLS also requires more complex configurations. The most straightforward option involves adding a new Certificate Authority to the trusted list in the browser. This effectively creates a man-in-the-middle attack for the browser that allows decryption of the HTTPS traffic. The browser will produce a series of warning messages, but it will be possible to view the encrypted traffic.

No Frills SSL Certificates are Inexpensive and Useful

SSL De Facto for Securing Connections

SSL, short for Secure Socket Layer, is a cryptographic protocol for securing network traffic that is the de facto mechanism for securing transactions on the web and many other protocols including email (SMTP/IMAP/POP), IM (Jabber/XMPP), VoIP (SIP), and SSL-based VPNs. The topic of SSL certificates is a bit arcane, but the much of security of our everyday online purchases depends on SSL. Yet, fewer services use SSL than one might hope. It is possible to buy a basic no-frills SSL certificates from a universally accepted certificate authority very inexpensively–less than $15 a year–if you shop around. In most cases, it makes no sense to use a self-signed certificate, to purchase a certificate from a second tier provider, or to purchase a chained certificate. This article is a substantial revision of an article in Messaging News from a few years ago. I receive some requests for an update and have also found an even more inexpensive provider in the meantime, which make the update worthwhile.

Securing a connection requires that at a minimum both the client and server application support SSL and that the server application must have a digital certificate with a digital signature from a Certificate Authority (CA). This is the most basic and the most common form of SSL Public Key Infrastructure (PKI), which a client to securely authenticate a server. Nearly every online shopping transaction uses this form of SSL to secure the payment details from the user’s browser to the merchants servers. One quick aside, the Transport Layer Security (TLS) protocol released in 1999 superseded the last version of SSL released in 1996, but nearly everyone still calls the protocol SSL.

The January 2009 Netcraft SSL Server Survey found nearly 2.1 million sites that responded to a request for a SSL certificate, but only about 40% of those were valid third-party certificates. Netcraft has been collecting SSL certificates since 1996 and reports that in recent years, use SSL has been growing at a rate of 30% a year. Still the August 2010 Netcraft Web Server Survey found over 210 million sites, which means the number of SSL enabled sites is a small percentage overall.

Why Is Server-Side Adoption of SSL So Low?

Given that nearly every consumer web browser and email client is SSL-enabled, why is server side adoption of SSL so low? In addition there are many reasons why businesses and even technically inclined individuals would want SSL certificates. There is substantial debate around the efficacy of the security provided by SSL for many common configurations, especially with its ability to prevent phishing and man in the middle attacks. Still, the security of an endless number of services such as small webmail providers, dashboards for managing blogs, and web-based router configuration consoles would all benefit from SSL. The majority of high volume ecommerce vendors use SSL, but I regularly see services that ask for credit card numbers over (shudder) unencrypted connections.

The relatively low use of SSL is due in part to the expense and difficulty of purchasing SSL certificates, the complexity of installing them, and the need for a static IP address. For small and medium businesses and individuals no-frills SSL certificates are affordable, especially if you are willing to shop around. The inexpensive certificates provide the same level of functional security for network traffic as the inexpensive certificates. The no-frills certificates are typically domain validated meaning someone just needs to be able to receive and email or possibly respond to an automated phone call in order to validate the domain, which makes the process fast but does not offer any particular assurance the certificate owner is who they say they are.

Other features beyond the level of security provided to network traffic are important for some business. For example, a business handling large numbers of consumer transactions may consider the branding of the certificate or the site seal important, or they may want the green bar shown by sites with Extended Validation (EV) certificates, or a Unified Communications (UC) certificates for an Exchange server. In these cases, then the no-frills route is probably the best one. No matter what kind of SSL certificate you want the process of purchasing them is frustrating and it is difficult to make any sense of the actual differences between the certificates by reading the marketing literature.

Certificate authority certificates, any intermediate certificates, and server certificates form a certificate chain that are verifiable through the SSL Public Key Infrastructure (PKI). It is possible for anyone to set up a private certificate authority and produce a “self-signed certificate.” This is often done for personal use or development purposes.

Inexpensive Certificates

Self-signed certificates require the same amount of effort to install and configure as a commercial certificate, they also require additional work to install and configuring a local certificate authority to sign the certificate. Self-signed certificates are not verifiable through the public PKI chain and most applications will produce warning messages that the certificate is not valid unless the user explicitly loads the credentials for the private certificate authority into each browser. Many second tier SSL providers offer chained SSL certificates, which are more complicated to install in many configurations and are typically less compatible on older browsers and mobile browsers. This said, chained certificates theoretically offer the certificate authority more security as they may revoke a compromised intermediate certificate with far less disruption than the root certificate.

RapidSSL is one of most economical of the top tier SSL certificates. RapidSSL has a bit of a convoluted history, but it is part of the GeoTrust family of certificate authorities, which is far and away the largest digital certificate vendor. GeoTrust was purchased by Verisgin in 2006 and in May 2010 VeriSign’s sold its certificate authority business to Symantec. Luckily, for the purposes of my argument the history is not important. What is important is that the GeoTrust family of certificates is recognized by nearly every browser.

For example, most recently I purchased certificates from a reseller called Revolution Hosting Pricing, Their pricing SSL certificates follows:

Type 1 Yr  2 Yrs 3 Yrs 5 Yrs
RapidSSL  $14  $24  $33  $50
RapidSSL Wildcard $135 $260 $360 $550
QuickSSL  $45  $86 $126 $300
QuickSSL Premium  $75 $140 $195 $300
True BusinessID $105 $190 $270 $425

Problems Purchasing Certificates

For many organizations, SSL certificates are moderately expensive, complicated to purchase, and even more complicated to install. In my own personal experience, the process of purchasing certificates has not improved greatly over the last decade. Going through the process, it is easy to see why so few sites, especially smaller ones, use SSL certificates. Clearly, there is great room for improvement in the user experience of the purchasing process. Unfortunately, I don’t see the process improving any time soon.

It can be surprisingly difficult to get a list of the certificate authority roots (often called a CA bundle) included in specific browsers and even more difficult to get the root certificate bundles included in most mobile devices. Unless the vendor provides a public list of included certificates, it is difficult to determine what CA’s are supported without extracting the CA bundle and analyzing it, which is a major pain. The lack of detailed information about the root certificates substantially complicates the problem for businesses that wish to determine which certificate may meet the needs of their users.

Because there is effectively no standard CA bundle for applications, operating systems, or mobile devices, each vendor has its own bundle of “trusted” certificates. This means, every application that employs SSL may use a different bundle, even if they are on the same machine. For example, both Windows and Mac OS X have a system-wide list of root certificates, but Firefox will use its own list of root certificates regardless of the platform.

To make matters worse many certificate authorities offer multiple types of certificates that may be signed with different roots. I looked at GeoTrust, Comodo, and GoDaddy, and Network Solutions web sites. Only GeoTrust clearly listed which root certificate signed each type of certificate on the main part of their site and not buried in a support document. The situation with GeoTrust was not always so simple, last time I checked a bit more than a year ago, I had to do quite a bit of work digging around the site to determine which root would sign the certificate I wanted to purchase.

Previously, a quick side project to SSL enable and IMAP server turned into an annoying extended detour after I realized that one of the older smartphones did not include the root certificate used on the IMAP server. While, it was possible to load the certificate manually, the process is too complicated for multiple users, although it could be handled in a bulk provisioning process. I ended up spending a significant amount of time searching for certificate authority lists and extracting certificate bundles for several smartphones to figure out which certificate to purchase that would cover them all.

Some Improvements in Purchasing Certificates

SSL certificate compatibility is gradually improving as applications, systems, and devices with out of date certificate bundles are gradually retired. As root certificates and intermediate certificates begin to time out and certificate authorities issue new root certificates. This means that if you have a server with a multi-year SSL certificate issued several years ago, its root certificate may differ from the current one. This is important if you are trying to connect to your SSL server from machines or devices with out of date certificate bundles.

Unfortunately, a market for automatic certificate installation in common machine configurations never developed. Both Microsoft and Apple have made strides with better GUI administration tools for SSL certificates. A number of web hosting services sell SSL certificates with installation for users who pay for the certificate and a static IP address. Another improvement on the horizon is RFC 3546–the Server Name Indication (SNI) extension for TLS. SNI will effectively allow name-based virtual hosting to use SSL similar to the name-based virtual hosts in HTTP 1.1. One major benefit is that this will allow multiple SSL enabled hosts on the same IP address. These are welcome improvements, but we still have a long way to go.

Appendix: A Brief History of RapidSSL and GeoTrust

GeoTrust became a certificate authority in 2001 when it purchased Equifax Digital Certificate Services from Equifax, which is why many of the GeoTrust root certificates are Equifax. FreeSSL launched in 2001 and offered free SSL certificates with its own single root certificate. These were popular, but only had 92% browser compatibility. In 2002, FreeSSL began to offer chained SSL certificates under the ChainedSSL brand for $35 a year, which was a very low price at the time. In 2003, FreeSSL relaunched and temporarily offered free one year ChainedSSL certificates and ChainedSSL wildcard certificates. In February 2004, FreeSSL launched a new brand called StarterSSL, which was a single root certificate. Also February 2004, FreeSSL relaunched the FreeSSL brand as a 30-day free trial certificate. The FreeSSL root certificate signed both the FreeSSL and StarterSSL certificates. Later in 2004 FreeSSL launched another brand called RapidSSL, which combined the StarterSSL single root certificate and included support.

In 2005 FreeSSL formally changed it’s name to RapidSSL. VeriSign purchased Thawte in 2003 and GeoTrust in 2006. At this point some of the details are fuzzy and involve a number of subsidiaries in Europe and Japan, but GeoTrust now apparently owns RapidSSL. In May 2010 Symantec purchased VeriSign’s Security Certificate Business and now controls all roots from all the prior acquisitions.

You should follow me on Twitter.

Why Pinboard is My Favorite Bookmarking Service

Pinboard is a bookmarking service that allows you to easily save, tag, annotate, share, and archive bookmarks independent of your browser. Pinboard describes itself as “antisocial bookmarking,” which highlights its capabilities as a private and personal archiving tool compared to the social features offered by Yahoo’s Delicious service. I find Pinboard a simple, fast, and reliable way for me to save bookmarks and archive web pages for future reference. I have been happily using the service for nearly five months (Update a year) and recommend it highly.

Pinboard has become a part of my everyday online reading experience as I use it archive both a bookmark and the full text of any article I found interesting or that I plan to read later. My primary use of Pinboard is as a personal archive rather than a public bookmark sharing service, and I prefer it to Yahoo’s Delicious bookmarking service, although Pinboard has fewer options for sharing and tag management. For example, it does not support the Delicious style of aggregating multiple tags in tag bundles or the ability to share a bookmark with a specific user.

To start using the service, simply drag one of the Pinboard bookmarklets into your browser bookmark bar. The first style of bookmarklet can either open a new page or a popup window allows you to edit the URL, title, description, tags, and optionally mark the bookmark as private or “to read”. I use the send style of bookmarklet that Pinboard calls “read later.” This bookmarklet saves the page, automatically marks it as read later, and returns you to the place on the page where you left off without opening a new window or a popup. The “to read” status allows you to quickly build up a reading list without interrupting your workflow.

You can aggregate links posted to multiple services by configuring Pinboard to watch for links in your Twitter posts, Twitter favorites, or pages saved to Instapaper, Read It Later, Delicious, and Google Reader. You can easily save links from a BlackBerry or iPhone using a private email address from Pinboard. I find the ability to centralize my bookmarks from multiple services very convenient. Pinboard automatically expands any shortened links and stores the original URL. Full text search on Pinboard include the title, description, tags, and notes, but not the text contained in the pages themselves. Pinboard also allows you to narrow the results of queries with public vs. private status, starred status, and the source e.g. Twitter.

Pinboard offers a single paid add-on, that will archive the entire page, HTML, CSS, and images for each bookmark you save. You can then view the snapshot of the page, even if the original disappears. The cost for this is $25 a year minus your original sign-up price. Pinboard recently introduced a feature where all users can download an offline copy of the last 25 URLs saved. The developer says that he plans to eventually allow users to download their entire archive.

Pinboard offers multiple ways to import and export data including including a format compatible with that is compatible with Delicious. Pinboard offers both public and private RSS feeds of bookmark data including tag-based feeds. The Pinboard API is compatible with the Delicious API. This means that any application that uses the Delicious API should work with Pinboard by simply changing the URL to the API endpoint. Unfortunately, most bookmarking applications do not allow end users to change the API endpoint URL and few directly support Pinboard. On the Mac, both Delibar ($18) and Pukka ($17) desktop applications support Pinboard. The best solution for mobile devices is to use the Mobile web version of Pinboard. Update The Delibar touch application for the iPhone and iPad ($1.99) works with both Pinboard and Delicious. I recommend it.

Overall, Pinboard is an excellent option for storing and archiving bookmarks and I recommend it highly. The service is not free. Currently the price to join is $6.38 (Update $7.41) and the cost increases by a fraction of a cent for each new user. I like this pricing model as it is inexpensive and allows the developer to support the service without ads and without taking external funding. This leaves the service with a smaller, but more active user-base, and more importantly almost no spam. Recent Pinboard releases have improved bulk editing capabilities, but it is not currently possible to add or remove tags on a set of items returned from a search of your own bookmarks. Hopefully, the developers will eventually add this feature as it would make it possible to quickly and easily organize large numbers of uncategorized bookmarks. Update The developers added this functionality. Tag management is now far more flexible.

If the idea of social bookmarking seems foreign or the benefits do not seem clear, I highly recommend taking three minutes to watch the short and entertaining animated video Social Bookmarking in Plain English by Common Craft. What is Antisocial Bookmarking? is a nice post on the Pinboard blog by, Maciej Ceglowski, the founder of Pinboard explaining his reasons for creating the service.

* This article originally appeared as Why Pinboard.in Is My Favorite Bookmarking Service in my Messaging News “On Message Column.”

Update 2010-12-16 Mentioned feature additions, Delibar touch support, and price update.

You should follow me on Twitter.

iPhone Screenshot and Photo Smart Album Hack

I take a lot of screenshots when I research products, both on the desktop and on the iPhone, so having some way to automate organizing my collection is important. The problem is that screenshots images taken with the iPhone have no EXIF metadata. This means there is no straightforward way to produce a list of all your screenshots.

After a little bit of experimentation, I found a workable but not ideal solution. You can use the lack of EXIF metadata as conditions to group all the images. Screenshots are saved as PNG files on the original iPhone and the iPhone 3GS (the two models I had access to) and have no EXIF records. The only other metadata fields available are filename, file size, and modified, and imported dates. The PNG extension for the filename is the one existing feature you can search for, all others have to be unknown. I selected two features aperture and ISO, even though one would work in the hopes that this would reduce any false positives.

A Smart Folder recipe for iPhone Screenshots

  • Match all of the following conditions
  • Aperture is Unknown
  • ISO is Unknown
  • Filename contains PNG

iPhone Screenshot 3 Item Smart Folder.png

Photos taken on the iPhone are saved as JPEGs and contain EXIF metadata. The iPhone 3GS embeds many more fields than the original iPhone. The easiest feature to select is “Camera Model.” The field type must be is or is not, there is no option for contains, so you will have to specify each phone separately.

A Smart Folder recipe for iPhone Pictures

  • Match any of the following conditions
  • Camera Model is Apple iPhone
  • Camera Model is Apple iPhone 3GS

iPhone Pictures Smart Folder.png

Searching for Screenshots from the command line

All iPhone screenshot images have a width 320 pixels and height 480 pixels in portrait or landscape. It is possible search for these files using the Spotlight command line tool mdls to integrate them into other scripts. There are many other options for searching for images with the full Spotlight syntax and it is possibly to execute these as Raw Querys in the Finder or use a Spotlight front end such as HoudahSpot, but that is a topic for another post.

mdfind -onlyin $HOME/Pictures 
  'kMDItemKind == "Portable Network Graphics image" && 
  kMDItemPixelHeight == 480 && kMDItemPixelWidth == 320'

Great iPhone and iPad Apps for Reading and Sharing Docs

Instapaper, Dropbox, GoodReader, and Simplenote are my favorite applications for reading, writing, and sharing documents on the iPhone and the iPad. I have used each application for more than six months and I highly recommend all of them.

Instapaper

The Instapaper application makes it simple and pleasant to read lengthy articles on your mobile device. Instapaper is optimized for the type of articles where you find yourself starting in your browser and thinking, “I’d rather read this later”. The application automatically loads any new content from the Instapaper Web service, which reformats Web pages for small screens and strips away unnecessary elements. The service provides an experimental option to save pages formatted for the Kindle as well.

There are multiple ways to save content to the Instapaper service including a bookmarklet, email, or applications that integrate Instapaper directly. The “Read Later” bookmarklet is compatible with most desktop browsers, mobile browsers and Google Reader. Each Instapaper user receives a unique email address that will import included links and text. Many iPhone and iPad RSS feed readers, Twitter clients, and social bookmark clients support saving links to Instapaper directly. The Instapaper service allows sharing of individual articles via email, Tumblr, and several Twitter clients.

Instapaper is available in two versions, a free ad-supported version with a limit of 10 articles, and a $5 (USD) pro version with a 250-article limit. The pro version includes additional features, such as background updating, folders, remembering the last read position, tilt scrolling, multiple font options, and disabling rotation. I find that the pro version is well worth the price.

Dropbox

In a crowded market of Web-based consumer storage services, Dropbox is popular and widely praised. The minimal user interface of the desktop application is one reason for its popularity. When I say minimal user interface, in most cases I mean non-existent. This is the beauty of Dropbox. After installing the application, Dropbox appears as a folder on your desktop. The folder is essentially magic. Any files in the folder are automatically synchronized to all other machines where you have Dropbox installed. Mobile Dropbox clients synchronize with the server upon launch. In my experience, it just works, and this is high praise. Dropbox is fully accessible via a Web interface for devices without an installed Dropbox client. Dropbox saves any revisions to your files for 30 days by default. These revisions are only available via the Web interface and do not count against your storage quota.

Working with shared files on Dropbox is as easy as working with files on the desktop. Shared files and folders are synchronized with all authorized users’ accounts. My only real complaint is that sharing must be configured from the Dropbox Web interface rather than a Dropbox client, which is not intuitive. Access control for sharing is based on email addresses and can only be configured via the Dropbox Web interface. It is important to recognize that any shared files count against the storage quota for all shared accounts. Each user’s Dropbox folder has a public directory—any files placed in that directory become publicly accessible without access control. The mobile Dropbox can generate links and can be used to share individual files with any email address. Be careful, the mobile links can also share private files and currently there is no way to revoke access.

Another reason for Dropbox’s popularity is its broad platform support. Mobile clients for Dropbox are available for the iPhone, iPad, and Android devices. A BlackBerry version is in development. Desktop clients are available for Mac, Windows, and Linux. All Dropbox clients are free. The mobile Dropbox client takes advantage of the document viewers built that are part of iPhone OS to open files directly. Supported formats include plain text, RTF, Microsoft Office documents, iWork documents, PDFs, Web pages, images, music files, and videos. Dropbox only supports viewing files; files must be edited with another application.

Some mobile applications such as GoodReader can read and write files from the Dropbox service, although the process is a little convoluted. Dropbox recently added a new mobile API to allow iPad applications to easily save files to a Dropbox account. Saving files to Dropbox is far easier with the most recent version of the GoodReader iPad application due to the new APIs. Even better, the Dropbox iPad application allows you to open files directly in other applications. Hopefully the iPhone Dropbox application will gain this functionality with the next major version of iPhone OS.

Dropbox is a subscription service that uses the Amazon Simple Storage Service (S3) for the backend store. A free account is available with 2 GB of storage. There are two paid upgrade options—a 50 GB option for $10 (USD) a month or $100 (USD) a year and a 100 GB option for $20 (USD) a month or $200 (USD) a year. Paid accounts can optionally save file revision history forever.

GoodReader

GoodReader works well with long and complex PDF documents. I have used it to read PDFs that are several hundred pages long without a problem. The iPhone and iPad support PDF files natively, but navigating long documents is cumbersome as there is no support for jumping to a specific page, for using PDF bookmarks and outlines, or for searching PDF files. GoodReader supports navigation to specific page numbers, PDF bookmarks and outlines, and full text and bookmark-based search. The application includes a night mode for reading in the dark and an autoscroll mode for reading long files without having to manually select the next page.

GoodReader’s support for text files includes a number of features not available in the native viewer, including the ability to edit text files and reflow text when the font size changes or the device is reoriented. One feature, called PDF Reflow, extracts plain text from PDF files and displays it in GoodReader text file viewer so it can be reflowed, copied to the clipboard, or edited. PDF Reflow should not be confused with accessible PDFs that are sometimes called reflowable PDFs.

GoodReader supports file transfer over WiFi in addition to many storage services including POP and IMAP email servers, WebDAV servers, Apple’s MobileMe, Dropbox, Box.net, Google Docs, and FTP servers. There are two versions of GoodReader for the iPhone. The standard version is $0.99 (USD). Access to POP and IMAP email servers, Google Docs, and FTP servers require a $0.99 (USD) in-app upgrade purchase each. GoodReader Light includes all available types of server access and is available for free on the iPhone, however it is limited to storing five files. GoodReader for the iPad is currently on sale for $0.99 (USD) and includes all available types of server access.

Simplenote

Simplenote is a note taking application for the iPhone and iPad that automatically synchronizes with the Simplenote Web service. The application has a basic feature set, but it works very well and is easy to use. Notes are stored as plain text and can be forwarded as email messages or deleted individually. Unfortunately, there is no mechanism to work with multiple notes at once. The built in search is fast and searches incrementally as you type, to quickly narrow down the list of notes with the search term. Options include changing the sort order, preview, link detection, and display of file modification dates. If the user has installed TextExpander, snippets will expand automatically. All notes can also be viewed or edited in any Web browser using the Simplenote Web service.

Currently, support for the iPad is limited to the same feature set as the iPhone aside from running in full screen mode. The developer plans to add additional iPad specific features shortly. The Simplenote API enables synchronization with multiple desktop applications including Notational Velocity—a simple, fast, stable note taking application for Mac OS X. This means I can create notes, make changes or additions either on the desktop or my iPhone and they are automatically synchronized. I am very happy with the setup.

Simplenote is free and ad-supported. A $9 (USD) a year premium add-on removes ads, provides an automatic backup, an RSS feed, the ability to create notes by email, access to beta versions, and prioritized support.

* This article originally appeared as Great iPhone and iPad Apps for Reading and Sharing Docs in the May 2010 issue of Messaging News.

Preparing Your Site for the iPad

The Apple iPad does an excellent job of displaying most web sites. However, there are a few obstacles you may want to avoid. There are also a few customizations that will make your site look even better on the iPad. I will summarize the most important issues you should start to plan for and the differences between the iPad browser, the iPhone browser, and desktop browsers. As an added benefit, most improvements made for the iPad will also benefit users with an iPhone or an iPod Touch. There is list of resources to find more information and a list of tools to help you test your site at the end of the article.

Differences in Mobile Safari on the iPad

The primary differences you should account for first are:

  • No support for plugins such as Adobe’s Flash or Sun’s Java for ads, navigation, and multimedia
  • The fixed viewable screen size (viewport) may affect your layout
  • The touch screen is the primary means of interaction and offers different modes of user control

Unlike most desktop browsers, the iPad does not support plugins such as Flash or Java. Any navigation elements, embedded audio and video, or banner ads written in Flash or Java will not appear. Based on public statements, Apple is unlikely to support either language in the future. This means you will need to provide alternative or fallback navigation elements and multimedia embedding options. Apple’s official recommendation is to avoid plugins entirely and use HTML5 elements across your site. Navigation elements may be implemented with standard AJAX techniques. If your revenue depends on banner advertising delivered via Flash or Java, you will need to need to make some changes. If your ad server supports mobile devices, you can turn this on for iPad users. An alternative is to treat mobile users the same as email campaign advertisements. Today at the iPhone OS 4.0 press event, apple announced its own mobile ad platform and ad network called iAd, implemented entirely in HTML5. The mobiThinking Guide to Mobile Advertising Networks in the references surveys most of the available mobile ad network options.

The standards and implementations of HTML5 audio and video tags are still evolving and making your content available in all browsers is still complicated. Supporting HTML5 H.264 encoded video with a fallback to Flash for browsers that do not support it is likely your most straightforward solution. In the references, I have linked to some of John Gruber’s articles on H.264 and Flash that explain the problem in more detail. Video for Everybody from Camen Design and the upcoming SublimeVideo from Jilion are two options for hosting HTML5 friendly video on your site.

The iPad has a 9.7-inch touch-sensitive screen, a fast processor, and fast network connectivity. It provides a web browser experience that is much closer to the desktop experience than a smartphone. This means you should avoid sending iPad users to versions of your site optimized for mobile phones if you are sniffing for iPhone or mobile user agents. If you look at the user-agent strings for the iPad and the iPhone, you will notice that the iPad user-agent lists “like Mac OS X” rather than “iPhone OS.” Both browsers include the “Mobile” in the user-agent string. Most browsers have mechanisms to change the user agent string. I’ve listed some of these in the references.

The current version of iPhone OS (version 3.1.3) uses the following user agent string (line artificially wrapped):

Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_1_3 like Mac OS X; en-us)
    AppleWebKit/528.18 (KHTML, like Gecko)
    Version/4.0 Mobile/7E18 Safari/528.16

While the iPad with iPhone OS 3.2 uses the following user agent string (line artificially wrapped):

Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us)
    AppleWebKit/531.21.10 (KHTML, like Gecko)
    Version/4.0.4 Mobile/7B367 Safari/531.21.10

The iPad viewport is set to 980 pixels wide, in portrait mode the iPad is 768 pixels wide, but the content will scale to 980 pixels. If you have content that wider than the viewport that uses fixed CSS positioning, that content may end up off screen and your users will not see it since they can not resize the window in Mobile Safari.

Users control the iPad with a multi-touch interface and a touch screen keyboard. The “Apple iPhone Human Interface Guidelines: Introduction” is a great document for starting to think about multi-touch user interaction as the metaphors and modes of physical interaction differ. For example, a flick action rather than a mouse controls scrolling and a pinching action controls how a page scales up and down.

There are other issues, some of which Apple may resolve in a future update. In John Gruber’s review of the iPad, he points out that often only a single page is held in memory at one time, subsequent pages often take all the memory available for web pages. This means that if you could loose form data on a page that you have not submitted if you open another page. The memory problem could also appear on AJAX heavy pages.

iPhone OS User Base

Apple announced the iPad at then end of January and released specifications, documentation, and a software development kit (SDK) for those paid members of the iPhone developer program under an non-disclosure agreement. The WiFi only model of iPad began shipping this week and Apple released the SDK to everyone registered in the Apple Developer Program. Apple announced that it sold more than 300,000 iPads on the first day and more than 450,000 as of April 8th. The iPhone OS platform user base is significant. Steve Jobs announced that there were 75 Million iPhones and iPad Touch devices running iPhone OS at the iPad launch in January. The Apple’s 2010 Q1 filing said that it had sold more than 42 million iPhones total. Today at the iPhone OS 4.0 launch Jobs announced that there were 85 million iPhone OS devices.

Mobile Safari on the iPad uses the open source WebKit rendering engine as do iPhone, and iPod Touch devices. Testing your site with the WebKit rendering engine is now essential. Desktop versions of the Safari browser, Google’s Chrome browser, all iPad, iPhone, and iPod Touch devices, Android devices, Palm webOS devices, Symbian Series 60 (S60) devices all use WebKit. RIM has stated that future BlackBerry devices will use WebKit. This means that every major smartphone browser aside from Windows Mobile will be WebKit-based in 2010.

Testing Your Site on the iPad

Testing your site directly on an iPad is the only way to guarantee that your experience will match your visitors with iPads. There are numerous reports by developers of minor differences between the iPad and the iPad in a simulator.

However, next to owning an iPad, the iPhone simulator comes closest to rendering your site as an iPad would. The iPhone simulator that ships with the iPhone SDK 3.2 has an iPad mode under the device option. Anyone can register as an Apple Developer for free and then download the SDK. The iPhone SDK includes the XCode development environment and is nearly a 2.5 gig download, it also only works on Mac OS X 10.6.2 (Snow Leopard) or higher.

The paid iPhone Developer Program is $99 a year. The subscription allows developers to submit native iPhone and iPad applications to Apple’s App Store. Apple also allows paid developers early access to upcoming versions of its SDK such as the iPhone OS 4.0 SDK announced today.

iPad Peek by Pavol Rusnak is a web service that allows you to see what your web site will look like on an iPad. It is free and the source code is available under an open source license. Three things will make your experience with iPad Peek closer to than of an actual iPad.

  • Use a browser with a WebKit-based rendering engine, preferably Safari, since it is the most similar to the iPad browser. Chrome will works too.
  • Disable all plugins in your browser. Otherwise your browser will still load the plugins even though an iPad would not.
  • Change your user agent string in your browser to match the iPad one listed earlier.

Apple’s Official Developer Documentation

Other Resources

Tools

The easiest way to change your user agent in Safari is to use the option in the developer menu. The easiest way to change the user agent in Chrome and Firefox (uses the Gecko rendering engine, not WebKit) is to use an extension.

Further Reading

John Gruber at Daring Fireball has written a series of posts about Flash, HTML5, and H.264 video. They are really worth reading for background on the technical and political issues related to HTML5.

* This article originally appeared as Preparing Your Site for the iPad in my Messaging News “On Message Column.”