privacy.txt

privacy.txt is last weekend’s project.  It’s a simple idea.  I’m currently putting together a parser in PHP.  I also want to look into browser plugins for it: a little plugin that sits in your browser, makes a request for privacy.txt on each page it visits, and alerts you if the site has a privacy.txt file.
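As a rough sketch of the plugin's check (in Python rather than real browser-extension code), the lookup could work something like this. It assumes privacy.txt lives at the root of the site, the way robots.txt does, which the project has not actually settled. The names `privacy_txt_url`, `site_has_privacy_txt`, and `fetch_status` are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def privacy_txt_url(page_url):
    """Build the privacy.txt URL for the site that serves page_url.

    Assumption: the file sits at the site root, like robots.txt.
    """
    parts = urlsplit(page_url)
    # Keep scheme and host, drop the page's path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/privacy.txt", "", ""))


def site_has_privacy_txt(page_url, fetch_status):
    """Return True if the site answers the privacy.txt request with HTTP 200.

    fetch_status is a hypothetical caller-supplied function that takes a
    URL and returns an HTTP status code, so the network layer stays out
    of this sketch.
    """
    return fetch_status(privacy_txt_url(page_url)) == 200
```

A real plugin would pass in whatever request function the browser provides and show its alert when the check returns True.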

I also put the project up on Drumbeat, Mozilla’s new project site, and posted it to Forrst to get feedback.  Anyways, I’m posting it here to get additional feedback from all my one reader.  Let me know what you think.

It might be your data, but it’s not your API

Normally the commentary is good over on Hacker News. However, a post I read today about Google changing its terms for API use caused a bunch of concerned posters to chime in with remarks about how it’s their data, and if they want to give that data to Facebook, Google shouldn’t stop them.

They make the assumption that Google is stopping them. This is simply not the case. You are free to share your Google data with your account on Facebook. You’re even allowed to share your Facebook data with any other social network. Neither Facebook nor Google prevents this (that I know of). However, you can’t do this with their API without their permission.

There is an expectation among hackers that data is only free if it can be easily accessed. However, the assumption here is that easy access should be provided by the company hosting the data via an API (Application Programming Interface, a set of rules programmers can rely on to perform actions and retrieve data). Essentially, the thought is that when Google changes its API terms to prevent Facebook from using Google’s API to extract data about a user (with that user’s permission, of course), it is locking that data to Google. This is simply not true.

The API is governed by Google, not the user. The user is free to do what he will with the data, but a business shouldn’t be required to provide tools for others to grab that data if it doesn’t want to.

Users would do well to remember this. This is why efforts encouraging businesses to be more formal about opening up are important (and why measures like the GPL exist). What Google is saying is that if you want to use our API to retrieve customer data, you need to allow customers to send their data back to us, if they want, using your API.

If you want to use the tools we created, you have to pony up and offer tools to do the same.

As a user of these services, it’s important you realize the differences and the reasons. It’s your data, and you can choose how to use it, but it’s the company’s API, and they have the same right to use it how they see fit.

What we can learn about privacy from Google’s Buzz fiasco

If you’re not a regular internet user, you might not be aware that Google’s Buzz has sparked a lot of criticism regarding its privacy policies.  Basically, the problems weren’t as bad as everyone made them seem, and privacy settings were in place to prevent everything that came up.  In many cases, the claims were simply not true, or the issues weren’t exactly as described.  In the case of one blogger, they were partly right and Google did fix a display bug, but even in that case, the private information wasn’t disclosed.

But this post isn’t about any of that.  Enough has been written about this, and frankly, everything I’ve read has amounted to nothing more than repeating what has already been said, and we haven’t moved toward learning anything from this incident.  That is what this post is about: learning.

Privacy: What is it good for?

I use my private Gmail account to email my boyfriend and my mother.

There’s a BIG drop-off between them and my other “most frequent” contacts.

You know who my third most frequent contact is?

My abusive ex-husband.

Which is why it’s SO EXCITING, Google, that you AUTOMATICALLY allowed all my most frequent contacts access to my Reader, including all the comments I’ve made on Reader items, usually shared with my boyfriend, who I had NO REASON to hide my current location or workplace from, and never did.

First, let’s make it clear that Google didn’t allow people access to her Reader.  This was merely a display error that showed certain people as having access when they really didn’t.  But this post is probably the greatest argument against “If you aren’t doing anything illegal, why do you need privacy?”  It’s a real life answer that has real life implications, and shows why it’s important to consider what we can learn from this.

Privacy is also important at a business level.  My actions, my privacy, the data I create, are mine, and I have a certain right to them.  Sure, I use Google, and I accept that they track my movement online.  The price I require Google to pay is a good service.  It knows me, so it can innovate and create new features in search and other areas that work from that history.  If Google ceases to pay that price, I can always stop providing my goods, that is, stop giving them more information to use.  There are numerous ways to do this, and it devalues Google a little if I remove information from their system.  After all, without that information, they can’t provide services to other people.  Indeed, that’s how search engines work.  They provide a service as a payment for using your information.  When their service isn’t valued, you remove your information from their system, devaluing their service.

But this shows why privacy is important even beyond being a right.  My actions, my content, the data I create, are mine.  An email I write or a blog post I create is mine, and I have the right in many ways to control how it is used.  Google doesn’t have to provide its search service for my content.  It doesn’t have to direct traffic to my post.  I would lose that payment, but as the content creator, that is in many ways my choice more than Google’s.

The simple fact is, for many people, providing Google with data allows Google to do a better job at providing their services, and we value that service at a certain level.  A level where we have no issue with sharing our data to a certain extent.

That’s the business side of privacy.  It’s our data, and we have a right to control it.

Who controls privacy

Computers suck at intent. Inferring privacy preferences for new software, based on prior actions in old software, is a recipe for failure, and a PR nightmare.  People assume computers are great at intent. We publish things to much wider contexts than we intend, and don’t notice or care until new products and features make incorrect inferences based on that. – On Privacy (or: what Buzz failed to learn from Newsfeed)

This is probably one of my favorite quotes that came out of this incident.  More precisely, “Computers suck at intent.”  The other side of the coin for me is this: “People assume computers are great at intent.”  A great example of the latter is the Facebook Login fiasco.  People make the assumption that the computer works the way they think it should work.  They think computers understand their intent.  This is obviously not true, to a point.  Google does go a long way toward trying to understand the intent of the user.  All that data they collect is directly related toward enhancing aspects of their business: search and advertising.  If they can better understand users and use that data to deliver relevant search results, they can also use that data to enhance the ads they show.  And the ads they show are in many ways geared toward helping us.  There is a fine balance between an ad that adds value and an ad that is just plain annoying.  However, a search engine is nothing more than a giant advertising engine.  The difference between search results and ads is merely that ads are paid for with money.  Search results are paid for in different ways, as described above.  It’s another case of give and take.  You give Google the ability to search your site, categorize it, and use it to better provide users with what they are looking for, and for that you are rewarded with more visitors from Google.  But we’ve already discussed this.

The Facebook Login fiasco demonstrates the value that Google provides, and makes a strong case that people really do think computers understand intent.  But Google Buzz also shows us that computers can’t understand intent.  The second sentence of that quote explains it succinctly: “Inferring privacy preferences for new software, based on prior actions in old software, is a recipe for failure.”  Now, I don’t think this directly relates to Buzz (I think Buzz was seen by Google more as a new feature of Gmail rather than a new product like Docs or Reader), but it still relates in some ways.  It’s also essential to understand this because it all relates to who controls privacy settings.

Now, it’s easy to say you should be the one to control your privacy settings.  But what Buzz taught us is that most people don’t.  They assume the computer is controlling it for them, and that the computer understands the intent of the user.  Computers don’t.  This is why Buzz relied on existing privacy settings.  Settings people were not aware of.

On a slight side note, I’ve been using Profiles for a while now, and I had made myself aware of the settings.  I made the assumption that most other people would utilize the tools if they weren’t happy with the defaults, and that anyone concerned at all about privacy would take the time to make themselves aware.  On that last part, I was wrong.

So, who controlled privacy? I could argue that the individual should have been more diligent about their privacy and looked into this sooner.  The fact is, a lot of people were very much unaware of the tools available and were scrambling at the last minute.  Those of us who took the time to learn the settings when the tools were provided were shocked that settings that had been in the same place for years were suddenly proclaimed to be hidden.  It’s as if the populace of self-proclaimed privacy-aware individuals suddenly realized that they had to make privacy changes, rushed to the website, and scanned quickly for the big flashing “Privacy Settings” link.  When they couldn’t find it within the first few seconds, they proclaimed loudly that Google was hiding these things from them.  Hiding settings that have been available for a long time.

But I digress.

Anyways, the ability to control all of this was in the users’ hands, and I could argue they elected to not use the tools provided.

But, I could also argue the opposite side of things.  Google clearly didn’t make important parts of this new feature clear enough.  Lots of people had concerns, and suddenly, without warning, a cry of alarm sounded.  Doing things without explicit permission is dangerous.  Sure, Google might be relying on privacy settings that the user agreed to by using the service, but most likely, these were the default settings that the user never really looked at.  Surely Google must realize that a default setting is really Google making the choice and not the user?  If someone never viewed their privacy settings, if someone never said “Yes, I want to use these privacy settings” explicitly, is this really a choice?

Yes, it’s a choice because the user accepted the default settings.

No, it’s not because the user never explicitly agreed to settings they weren’t explicitly aware of.

Google was right in using the default settings.  What other settings could it use? The users used the service, accepted the settings by default.  Should Google not use these settings? Should they make up new settings?

Google was wrong because the default settings were given additional meaning.  Not new meaning, but the average user couldn’t apply the new features, the new explanations, and keep in mind the new issues that might arise.

Let me explain that last part there.  Google Profiles could be made private.  Google Buzz followers would be displayed on your profile.  Contacts could be automatically followed in your Buzz.  In the context of the Buzz situation, is it easy to follow that your contacts might be displayed on your profile for the world to see?  No, not really.  Is anything inherently wrong? No, not really.  Google is merely displaying what Twitter and numerous other services already display.  They are following the status quo.  The implications aren’t immediately apparent.

At the same time, you have all these informants and whatnot who emailed journalists from addresses that could be traced back to themselves.  The journalists apparently stored these people as contacts in such a way that made them traceable.  These people who wished to remain anonymous were no longer in control of their privacy.  They allowed their privacy to leave their control.  This isn’t even a Google thing.  It’s a control thing.  The more private information you give away, or the more often you give it, the less chance it will remain private.  And the less control you have over it.  This is important, because the problem here has to do with privacy policies we set for ourselves.  The informant allows the email they use to be traced back to them.  The journalist allows information to be traced back to the informant by storing it.  The informant allows his private information into the hands of someone whose privacy concerns are shared, but not enforced.

Concerning ourselves with our privacy is a right, but like all rights, we have to exercise this right.

What to do with all of this

So far I’ve typed a lot, but I don’t think I’ve said much.  Here is where I hope I will make up for the length of this post.

Privacy settings should be explicitly defined and accepted, even if just accepted as defaults.

The problem with this approach is the “Next, Next, Next” syndrome.  People click right past it.  Privacy policies are already displayed, but rarely read.  Privacy settings displayed will just be seen as another hoop to get through.  If privacy settings change, logging into the service will be a nightmare.  After all, I don’t want to concern myself with saying “Yes, I Accept” to every new change.  It also doesn’t solve the problem of finding the privacy settings once they are set.  After all, using the service might make you realize you want to set the privacy settings differently.  But do we really want an obtuse UI that asks our permission to do every little thing?

After all, most people are willing to accept some lack of privacy control for ease of use.

On signing up for a new service, or agreeing to a new feature, it would be nice to see these policies and allow me to review them, but I also feel it’s somewhat counter-intuitive to ask the same questions over and over again.  But at the outset, the initial launch of a new piece of software, a review of policies and how they apply to the new feature should be made easy to get to.  Don’t clutter the UI, but allow me to review the policies that are needed.

Design an interface that is removed from the website

This is the big idea here.  We already do something like this for SSL certificates.  Browsers display this information to the user in a way that they hope is easy to use.  Phishing sites and other dangerous sites are also monitored, and users are protected from those sites by browser makers.  It’s a real problem.  Privacy is also a major concern, but it is handled on a site-by-site basis.  There is no standard, but this shouldn’t be that hard to correct.  After all, RSS feeds are already supported by browsers.  Why not take this a step further and support Privacy URLs?  This would be the location the site proclaims as their “Privacy Center.”

I propose two types of links:

<link rel="privacy_policy" href="http://www.example.com/Privacy/Policy">
<link rel="privacy_settings" href="http://www.example.com/Privacy/Settings">

Using these like you would RSS feeds, you allow the browser makers to create buttons, widgets, or whatever to give the user browser-based control to go immediately to the privacy policy and settings for a website. Just like an SSL certificate, it gives users a way to avoid dealing with a site’s design. It also gives the software developer the knowledge that even if the user misses the information on the site, they’ve given the user a default way to access privacy settings.
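To give a feel for how little a browser or plugin would need to do, here is a minimal Python sketch that pulls the two proposed link types out of a page, much like feed autodiscovery. The rel values come from the proposal above; the names `PrivacyLinkParser` and `find_privacy_links` are just made up for this example, and a real browser would hook into its own HTML parser instead.

```python
from html.parser import HTMLParser

# The two rel values proposed in this post.
PRIVACY_RELS = {"privacy_policy", "privacy_settings"}


class PrivacyLinkParser(HTMLParser):
    """Collect href values from <link> tags that use the proposed rel values."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        href = attributes.get("href")
        if rel in PRIVACY_RELS and href:
            # Keep the first URL seen for each rel value.
            self.links.setdefault(rel, href)


def find_privacy_links(html):
    """Return a dict mapping rel value to URL for the privacy links in html."""
    parser = PrivacyLinkParser()
    parser.feed(html)
    return parser.links
```

If the returned dict is not empty, the browser could light up a privacy button that goes straight to those URLs.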

Essentially, this is a common, default way to access something that is rather important. It can go further: a site could signal a change to its privacy policy or settings simply by changing the URL it uses in the header. The browser would notice that the URL has changed and could then alert users. This relieves developers of having to come up with numerous ways to make sure their settings are noticed. They should still allow access via the website by normal means, but this provides them with an additional tool. In a way, it would also make users more aware of privacy settings they might not have otherwise taken the time to explore.
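The change notification could be as simple as comparing the privacy URLs a site advertises now with the ones the browser remembered from the last visit. The sketch below assumes the browser keeps a small mapping from rel value to URL per site. The function name `detect_privacy_changes` and the choice not to alert on a first visit are my own assumptions, not part of any standard.

```python
def detect_privacy_changes(previous, current):
    """Return the rel values whose URL differs from the last visit.

    previous and current map a rel value (for example "privacy_policy")
    to the URL the site advertised. A rel value seen for the first time
    is not reported, on the assumption that there is nothing to compare.
    """
    changed = []
    for rel, url in current.items():
        old_url = previous.get(rel)
        if old_url is not None and old_url != url:
            changed.append(rel)
    return sorted(changed)


# Example: the site moved its policy to a new URL since the last visit.
last_visit = {
    "privacy_policy": "http://www.example.com/Privacy/Policy",
    "privacy_settings": "http://www.example.com/Privacy/Settings",
}
this_visit = {
    "privacy_policy": "http://www.example.com/Privacy/Policy-v2",
    "privacy_settings": "http://www.example.com/Privacy/Settings",
}
changed = detect_privacy_changes(last_visit, this_visit)
```

Here `changed` would hold only the policy entry, and the browser could alert the user about that one item.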

Going this route, I believe we solve a number of the problems.

First, we have a common way of notifying users, alongside other traditional ways. A big problem with Buzz was not finding the settings. The location, while not necessarily hidden, wasn’t obvious. A simple browser button or UI feature would allow us to worry less about finding this information, and deal instead with acting on the information.

Secondly, having these settings in place allows easy review of them at any time. We know that we are at the right place, the place that contains all the privacy settings, or the privacy setting center of the site. This is the right location. We aren’t guessing if we are in the correct place. This was another issue with the Buzz rollout.

Finally, this allows companies to update privacy policies and privacy settings in a much more efficient and simple manner. The user can be alerted to updated and new settings, and has an easy place to review them, within an appropriate context.

Combined with the first step, this would essentially make privacy as important as other features people find important in browsers. Browsers are, after all, the vehicle by which we surf the internet, and this idea, I think, is sort of like your seat belt, something you put in place before an accident happens.