Data Relating to People

July 23rd, 2008

In my last couple of posts I’ve described why I believe Mozilla must pay attention to data in order to help individual people deal with data about them.

There’s a lot of data about people being created. I’ve listed below some of the basic kinds of this data that I think we need to be able to distinguish in order to speak meaningfully about the effects. I’m calling all of these categories “Associated Data” for the reasons described at the end of the post.

Is there a type of data about people that’s of interest or concern to you? If so, take a look and see if it fits into one of the sections below.

  1. “Personal and potentially personal data.” These terms are already in reasonably wide usage to mean specific information that identifies an individual, such as name, address, email address, credit card number, government-issued identification number, etc. In some cases the term is also used to include other information that can be combined to create personal information, such as an IP (Internet Protocol) address.
  2. “Intentional Content.” Data intentionally created by people to be seen by people. When we post to social networking pages, blogs, photo sites and product review sites, create wishlists, send gifts or leave other online markers, we intentionally create content about ourselves or associated with us. Sometimes this information is in big chunks, like a blog post or photostream; other times the information is in small bits like recommendations, “pokes,” etc. Sometimes we want this data to be public and sometimes we may not.
  3. “Harvested Data.” Information gathered or created about an individual through the logging, tracking, aggregating and correlating of his or her online activities. It’s possible today to record many of the actions someone takes online (the “clickstream”) and then to harvest patterns and other useful facts from that data. For example, an e-commerce website you visit regularly will know a great deal about your shopping patterns: what kinds of items and what price ranges you look at, how many times you look before you buy, the average purchase amount, the average time between purchases, etc. The site will also know which ads you respond to and which you ignore.
  4. “Relationship Data.” Our relationships with other people, such as our “friends” or followers at various sites. This can be either Intentional Content or Harvested Data. I call this out specifically because a relationship always involves at least two people. And so the treatment of this information (is it public or private, how is it used) always affects at least two people. I’m not yet positive this is a useful topic, but (obviously) I think it likely enough to include it here.

“Associated Data.” It will be helpful to have a term that describes all these types of data. In a vacuum, “personal” would seem the best term because this is all information that somehow identifies, is related to or is associated with a specific person. But I think “personal” is already understood to mean item 1. So I’m using the term “associated data” to mean all of the types of data listed above.

Are there other broad categories of information about people that would help us think clearly? Are there different categories altogether that would be more helpful? And are there examples of this kind of data you’d like to make sure we think about? If so, note them in the comments or somewhere we can find them.
