Privacy and open government: conversations with EPIC and others about OpenID

By Andy Oram
August 3, 2009

Ideas about privacy policies, anonymity, and technical impacts, springing from a discussion with a director from the Electronic Privacy Information Center and from comments on an earlier blog posting. This article was originally published on the O’Reilly Media web site.

A few days ago I proposed a way to offer more privacy to people visiting government web sites. This blog posting builds on that proposal, which was largely technical, by examining the policy and organizational issues that swirl around it.

My ideas are informed by a discussion I had with Lillie Coney, Associate Director of the Electronic Privacy Information Center. The posting is also inspired by two comments on the earlier posting and brief email I exchanged with one commenter, which intertwine with Coney’s in intriguing ways.

As I said in the first posting, my proposal focused on a very narrow question driven by the Obama Administration’s interest in revising a memorandum from 2000 concerning the use of cookies in web browsers. The proposal suggested a way to better approach anonymity, but didn’t look at the related social and political issues:

The kinds of privacy and the degree of privacy people want
When it’s appropriate to make visitors identify themselves, or at least to provide some persistent identity
Whom people trust to maintain identity information

This posting offers a number of points about those issues. The sections are:

Can the government be your friend?
Anonymity, pseudonymity, and participation
Who should run an OpenID server?
Thought experiment: could federal agencies offer anonymous authentication for whistle-blowers?

Can the government be your friend?

The kinds of government/public collaboration pursued by CIO Vivek Kundra and others in the Obama administration sees people doing much more than submitting ideas. The administration wants information sharing and an exchange of ideas that allow both sides to reveal vulnerabilities.

But as one commenter pointed out on my previous posting, the government has a lot of power that should make us hesitate before sharing too much. Coney, whose work at EPIC includes a focus on domestic surveillance, pointed to an incident where the Las Vegas Review-Journal was served a subpoena requiring it identify readers who had posted online comments about an article involving a case with the Internal Revenue Service. (The newspaper is fighting the subpoena.) Some agencies have enough power to be scary. And some agency heads may take heavy-handed measures without even being malicious or vindictive—just out of a concern for security.

So you may be living it up like Obama, Gates, and Crowley on one agency’s web site, forming great relationships and having an extremely productive discussion, only to discover that your comments come back to bite you when you tussle with an entirely different agency. And of course, the data you give these sites lasts forever.

Such promiscuous information sharing is supposedly outlawed by the 1974 Federal Privacy Act. This oft-cited law, along with the 1966 Freedom of Information Act, remain centerpieces in the armory of those protecting personal privacy in the U.S. However, the Federal Privacy Act creates many exceptions for agencies that want to opt out from its rules, and fails to cover private contractors. Coney says, “EPIC’s goal is to develop fair information practices that are enforceable and transparent to protect users of government information.”

Having studied the privacy policies requested by different agencies, Coney finds them in two camps. Agencies whose mission is to reach and out and help people, such as the Department of Health and Human Services, favored as much privacy as possible—the same goal Kundra has expressed many times. On the other hand, law enforcement and other agencies concerned with protecting the public would like to log all accesses and try to attach personal information to all visits—even access to public information.

That last policy puzzles me. If the government offers information freely, Carl Malamud or I or anybody can grab it and put it on another web site. There is no way to track who accesses free and open information. Tracking access in the hope of preventing criminal use is not only obnoxious but futile.

In short, forming a partnership with government takes a bit more consideration than friending someone on Facebook. The new age of government participation we’re hearing about, then, rests on some assurances to the public. Personal information should not be collected unless absolutely necessary, and should not be used for purposes unrelated to the reason for capturing it, especially by other government entities.

We’re all excited about the expanding collaboration between government and citizens, but the historic change intensifies the need to take a fresh look at laws and policies on a regular basis, just as the OMB has done in requesting comments on their cookie policy.

Anonymity, pseudonymity, and participation

People phone anonymous tips in to the police all the time. To allow the same kind of anonymity online would be just an invitation to spam. In fact, anyone with something to hide would make sure to flood the system with irresponsible accusations just to drown out the people who have legitimate crimes to report. (The FBI tip site asks you to identify yourself.)

The proposal in my previous posting delivers pretty good pseudonymity, allowing someone to submit repeated comments with the assurance that they all come from the same person, but without surrendering personal information.

One commenter on my posting asked whether we can really trust the government to protect pseudonymity. Well, of course they can always trace you if they want to. Even non-government actors can do that, as we’ve just seen from the recording industry’s testimony at Joel Tenenbaum’s trial. Privacy is a cat-and-mouse game in which both sides have escalating levels of attacks and parries.

Tracking you through contact information

Attack: My proposal let people leave a contact, such as an email address or Twitter account, where the government could report information about their account. Although the government should promise not to misuse the contact, it could be used to identify a visitor.

Parry: Leaving a contact is optional, and you can manage your account without leaving one. You can also use a free email address from popular providers.

Tracking by IP address and time

Attack: The government can require your ISP to provide your identity based on the time you were logged in and the dynamic IP address they assigned you.

Parry: Find an open wireless access point or use an onion routing network.

Who should run an OpenID server?

From this point on, I’ll assume that OpenID will be used by federal agencies in some configuration, because that’s the only technology with a widespread implementation that can provide the protections discussed in this posting.

One of the central policy questions we have to deal with, then, is whom we should trust with our OpenID account. My proposal called on the federal government to run an OpenID server for all its agencies, mostly because I want the government to kick the habit of using commercial services for such essential information-age functions.

Coney and I discussed several options for ensuring reliable servers. There’s no reason not to allow multiple options. Running an OpenID server is pretty easy. If EPIC had a hankering to serve up privacy directly, this is its chance. The problem is whether visitors can trust any particular server 1) to stay up, 2) not to go out of business, 3) not to leak information, 4) not to abuse the information for private gain, and 5) not to cave in to government pressure and release information outside of the scope of the law.

Here are a few options.

The federal government runs its own dedicated server

Pros: The government can probably do the best job of guaranteeing that the server stays up and is not broken into. The government is not depending on outside entities for this essential function.

Cons: A central OpenID server offers a compelling target, and a stream of recent news reports shows that government agencies suffer from the same security lapses as private companies. Furthermore, many people don’t trust the government to protect their privacy and feel more secure with a private server.

The federal government regulates the organizations that provide servers

Pros: Personal data is stored in a variety of private servers, complicating attacks, while the government ensures they are run professionally.

Cons: Defining service-level agreements and quality control is difficult, and legislating or regulating it is even more difficult.

The organizations that provide servers define a code of conduct and monitor compliance

Pros: Self-regulation is much lighter-weight than laws and regulations, and the experts who know the technical and business issues the best will be in charge of ensuring quality.

Cons: Self-regulating privacy agreements—we’ve seen that before! The failures of TRUSTe and P3P leave us twice-scarred and reluctant to try again. (See my article Promises, Promises, Promises.) Still, TRUSTe and P3P provided no protection because the organizations creating privacy policies were disingenuous and lacked an interest in truly protecting privacy. A sincere self-regulatory effort by new organizations committed to privacy might succeed.

The federal government sets up a dedicated agency that is monitored by a private firm or non-profit

Pros: This was suggested by Coney. It combines the reliability of the government with the disinterested independence of an outside observer.

Cons: Malicious actors in the government agency may succeed in hiding bad behavior from the monitors, whose inspection would quickly settle into an uninspired routine. Moreover, the requirements that the monitor has to enforce are just as complex as in the previous solutions.

Free market: let each visitor choose a server and take his chances

Pros: This leads to the most diversity, which is a strength in the area of security. And if a server goes down, how much is lost? The visitor can open a new account elsewhere and rebuild the lost personal information.

Cons: No one can evaluate the competence and reliability of another organization’s server, and weaknesses don’t become apparent until disaster strikes.

As usual, the policy, organizational, and social issues in deploying a technology are thornier than the technology itself. I still think the architecture I offered in my proposal to OMB provides a good basis for building any of the systems considered in this posting.

Thought experiment: could federal agencies offer anonymous authentication for whistle-blowers?

I’ll end this posting by exploring an identity system that would allow an agency to authenticate a pseudonymous whistle-blower by verifying “Yes, this is a current employee” or “Yes, this is a former employee” without giving further information about that individual.

I believe that any such authentication system would have to be based on a two-tier approach such as I laid out in my OpenID proposal. The system I lay out in this section is too complex, organizationally and technically, for the government to implement at this point, but it shows the tools available to privacy advocates.

Each government agency participating in the authentication program sets up a server to digitally sign IDs. Another server hosts accounts for every employee.
Each employee is encouraged to create an account on the OpenID server and to keep it secret.
When logged in at his or her agency account, the employee submits his or her account name with the agency’s digital signature. This produces an unforgeable string that combines verification of the employee with verification of the agency.
At any time, the employee can post information to any web site that accepts an OpenID login, using the employee’s secret OpenID account. The employee includes the string obtained in the previous step. By checking the signature, anyone can verify that the employee had an account at one time on the agency server. Because the text revealed underneath the signature is the account name, it proves that the person who obtained the signature is the same person posting information currently.

In order to masquerade as an agency employee, someone would have to obtain both the employee’s signed string and access to the employee’s secret account on the OpenID server. This might be possible if the employee is lax in protecting the information (for instance, by putting it unencrypted on a cell phone and losing it). Other problems with this system include:

There is no way to revoke the signature, unless the agency revokes all signatures at once. Thus, there is no way to tell whether the employee is still employed. This may be acceptable.
This system would allow any employee to leak any agency information without personal repercussions. That’s probably not a policy we want to foster.

Technology confers power, and so does anonymity. Technical, legal, and policy experts are all needed to study the implications of the systems we have for participation, and the systems that are proposed to replace them.

“So what’s this conference you’re going to?” asked my friends, not braced for an explanation that usually took me more than ten minutes. Ultimately, though, they all expressed excitement about the ideas driving Personal Democracy Forum.

These friends care about politics. They argue over all the issues, and at some level they take note of the processes that often matter more than any arguments. But although some know what an API was and a few even understood the concept of mash-ups, it’s remarkable how completely they had been bypassed by the current movement toward open government, whose importance to the Obama administration was signaled by his release of a memorandum on transparency and open government on his first full day in office.

I hooked my friends through the idea of an irreversible political shift. Not a regulatory regime that could be dismantled like the agencies responsible for civil rights, or a mandate that could be defunded like federal housing initiatives—no, in this case a movement integrating the public into government functioning, and that therefore creates an external constituency that helps to perpetuate the system; an ecosystem of non-governmental organizations that will react precipitously and aggressively if the government tries to shut them out.

Digging for themes

PDF is appropriately held in New York City, a culturally open megalopolis that is ethnically and politically uncategorizeable. Free speech holds forth on the subways where the exhortations of the homeless prove that the great art of oratory is still alive.

A thousand people signed up for the conference (leading, of course, to more than a thousand Twitterers). At the gorgeous Jazz at Lincoln Center location, the Rose auditorium was totally filled, and the hallway was choked as attendees strove to reach pitifully undersized rooms for breakout sessions.

As a conference with a contemporary, tech-oriented bent, PDF ripples off into all kinds of online resources. At several points the keynotes were held against a real-time twitter feed, goading on the feeding frenzy by showing the accounts of the people who tweeted the most. This focus on immediate response—and on quantity of response—had a specific effect on the consciousness of the audience. The twitter feed reinforced through highlighting and repetition the most provocative sound bites and the statements most clearly relating to current issues at the top of attendees’ minds

This is a useful function to play, but the provocative utterance and timely issue is only one superficial level of conference engagement. We all need to take away what we’ve experienced, sit with it a bit, and look for underlying themes that represent a significant trends that can guide us.

Give a few hours for reflection, I’ll use this blog to synthesize three recurring themes I heard during the first day. I’m sure more ideas will settle out as I spend even more time thinking through these two days of meetings.

The prerequisite: the power for change lies with the public
The platform for democracy: infrastructure we all need
Time to tune in: we can’t tolerate static
Miscellaneous insights from speakers and participants

The prerequisite: the power for change lies with the public

It’s scary being a politician, let alone the an agency head. These people may seem indescribably powerful to the rest of us, but they live in fear of public pillory triggered by their own missteps.

Jeff Jarvis listed, as one of his four key elements of change, the ability for government to fail without risk of recrimination. David Weinberger approached the same theme from a different direction, talking about how all wisdom is provisional, emerging, and scattered. Vivek Kundra and Beth Noveck—who will be speaking tomorrow—have repeatedly made similar statements in the context of bringing the innovation culture of the Silicon Valley to the area around Foggy Bottom.

In my first ramp-up blog for PDF I talked about a four-part cycle for successful public/government collaboration. Perhaps we need to start the cycle earlier, or add some kind of parallel cycle, to recognize that the public has to make the commitment asked by Jarvis: the promise to show forbearance when the government fails and to grant it a mandate to do innovation.

The platform for democracy: infrastructure we all need

If one engages in some deep listening, you can hear beneath all the celebrations of transparency a recognition that success depends on several elements of infrastructure. Early experiments in open government may produce exemplary and even spectacular successes, but the culture won’t take hold until this platform is in place.

Computing networking and computer technology are the most obvious requirement. Mark McKinnon, a Republican communications strategist, called for universal broadband during his keynote.

But as audience members pointed out, literacy is another requirement: basic literacy as well as media-savvy literacy and knowledge of the tools that let one participate.

Ethnologist dana boyd took the discussion to the next level by pointing out that even when people do go online and do use social media, they self-segregate by race, class, and educational status. Her case study for this claim was limited (the demographics of MySpace users versus Facebook users) but the statements she culled from young people showed that the digital divide is possibly even deeper online than these social divisions are offline.

I believe that a predilection for different forums and ways of interacting online doesn’t have to prevent different races and classes from coming together on issues of common interest, such as health care. But boyd’s point that people set up online barriers that make it harder for them communicate across these barriers is salient. She pointed out that we need to recognize that the sites we visit are not the same sites everyone visits, to spend time on the sites of people we want to influence or collaborate with, and to embrace different modes of interaction among different social groups.

Finally, open discussion requires a tolerant environment. Recent events in Iran, as well as the introduction of Internet filtering software in China, show that governments can choke off civil society online; the technology was described as a cat-and-mouse game where both the side of information dissemination and the side of repression learn how to increase their power.

Time to tune in: we can’t tolerate static

The last theme I’ll highlight from the first day is the sense that we can’t stand still. Americans (and particularly young Americans) expect more and more that we can have a say, that we can move quickly and have choices, that we can contribute to decisions and their implementation. We’ve already seen how many businesses (not all, of course) that fail to keep pace with these expectations are shrinking. If governments don’t meet the expectations, people won’t be able to replace it the way they replace businesses, but there could be increased feelings of alienation and increased social dissatisfaction.

Miscellaneous insights from speakers and participants

ChallengePost (now Devpost) announced today a site that brings together people with needs and problem-solvers, using a challenge model similar to the Netflix prize or the TopCoder software firm. In publishing a challenge, someone can offer money or just recruit people to offer thanks. Respondents may be motivated to solve the challenge by intangible rewards as well as money. ChallengePost offers advice on how formulate a good challenge and judge it expertly, but the form of each challenge is the prerogative of those who post it.

The Digital Literacy Contest tries to develop a generation of problem-solvers who can analyze the streams of government data coming online. They will run contests in high schools and colleges that start with test problems and then move to questions to which they do not have the answers. When several students converge on the same solution, it is published for the public benefit.

Morley Winograd of NDN briefly analyzed Ron Paul’s failure in the presidential election despite his sophisticated use of social media. If I understood Winograd, the medium--which is well constituted for bringing groups together--contrasted too much with the message of individualistic libertarianism.

In a forum on participatory medicine, Esther Dyson said of the current health care debate, “We’re focusing too much on health care and not enough on health, just as one might complain that the government focuses too much on laws and not enough on getting people to do good things.” This was the start of a session that discussed ways patients and doctors could use information sharing to improve outcomes and lower costs.

New York Mayor Michael Bloomberg called in over Skype instead of coming to the conference. Over his call he announced an expansion of the famous 311 service and various initiatives to accept public complaints and provide public data online. I was glad Skype was available for the call, but I find it odd for the government to be using commercial services (Kundra moving staff to Google Docs, YouTube hosting White House videos, agencies going on Facebook, etc.). I can see why the government wants to use available social media for convenience, and it provides a familiar access method for constituents. But eventually governments should develop their own public-domain software, tailored to government needs and open to all.

Blair Levin, who is designing a national broadband plan at the FCC, started out buttering up the audience by making fun of incumbent telephone companies, then gave us a “homework assignment” of reviewing and making improvements to its presentation at the the July 2nd FCC meeting, material for a set of staff workshops in August, and plans to be make in the Fall to do research. A panel following Levin’s presentation—matching up a much-applauded representative from Free Press with representatives from the cable and telco industries—looked at the issue of speed. Is it fair to set a single target for speeds? Will the FCC define broadband to more closely match more advanced countries?

This work is licensed under a Creative Commons Attribution 4.0 International License.