In past posts I’ve said that I believe there is a need to make basic, aggregate, anonymized information about Internet usage more widely available. If everything that is known about the basic usage of the Internet is closed and proprietary then the Internet as an open platform will suffer. Here I’ll try to describe the kinds of data I’m talking about. For now I’ll call it “usage data” though that’s just a term of convenience.
There is a set of usage data that we’re quite accustomed to seeing in aggregated, anonymized form. Unconsciously I think many of us have come to realize that without public availability of this data we cannot understand even the basics of how the Internet is working.
One familiar example is the amount of bandwidth a site serves. Bandwidth data is critical to planning capacity and making sure the website doesn’t “go down” when spikes of traffic occur. Bandwidth usage is also tracked quite carefully by the ISPs (Internet service providers) for their planning and billing purposes. As an example, here’s a blog post showing bandwidth usage when we brought our facility in Amsterdam online. In addition to the data, there is also a series of posts about what was involved in making this happen, which we hope helps others who want to do similar things.
Another familiar example is the amount of “traffic” to a website in a day or a month. This is one important method of determining how popular a website is. Changes in these numbers can reflect trends and changing behavior. A specific page view might be associated with a particular person, and thus be sensitive personal data. But the total number of page views is not related to a specific person. It tells us overall how popular a site is.
A third familiar example is download numbers, which can be very informative in specific settings. For example, we had a real-time download counter during the Firefox 3 Download Day event. We were able to provide automated counts of total downloads and of the current number of downloads per minute, each broken out by language, during this event. Here’s some basic analysis of download locales, showing how global a project Mozilla is. And here’s a post showing the effect on download rates caused by a popular talk-show host. This information can be useful without any personal or individual data being disclosed.
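To make the idea of aggregate download counts a little more concrete, here is a minimal sketch in Python. It is not how the Download Day counter was actually built; the event records, field layout, and the `aggregate` function are all made up for illustration. The point is only that the published numbers (totals per language and downloads per minute) can be computed without carrying any per-person field into the output.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw download events: (timestamp, locale, client address).
# The values are invented sample data. The client address stands in for any
# per-person field; it is never copied into the aggregates below.
events = [
    ("2008-06-17T18:00:05", "en-US", "192.0.2.1"),
    ("2008-06-17T18:00:40", "de", "192.0.2.2"),
    ("2008-06-17T18:01:10", "en-US", "192.0.2.3"),
    ("2008-06-17T18:01:30", "ja", "192.0.2.4"),
    ("2008-06-17T18:01:55", "en-US", "192.0.2.1"),
]


def aggregate(events):
    """Return (downloads per locale, downloads per (minute, locale))."""
    per_locale = Counter()
    per_minute = Counter()
    for timestamp, locale, _client in events:
        # Truncate the timestamp to the minute so only coarse buckets remain.
        minute = datetime.fromisoformat(timestamp).strftime("%Y-%m-%d %H:%M")
        per_locale[locale] += 1
        per_minute[(minute, locale)] += 1
    return per_locale, per_minute


per_locale, per_minute = aggregate(events)
for locale, count in sorted(per_locale.items()):
    print(locale, count)
```

Only the two counters would ever be published; the raw event list, including the client address, stays behind.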
These examples are clearly very general. I use them precisely for this reason — to demonstrate that we already understand the usefulness of this type of data and that it can be presented in an aggregate, anonymous form. There are other forms of aggregate, anonymous data that can be equally useful in understanding how the Internet is being used and ultimately, understanding what the Internet really is. I’ll describe some of those in a subsequent post; this one is long enough for now.
The types of data I’ve described above are carefully tracked, analyzed and used in planning and decision-making across the industry. This data is often not publicly available. We’d like to see more of this sort of information publicly available. We hope to start publishing more of this type of information about Mozilla. To do this, we need to be confident that people understand that this does not mean publishing personal or individual data, and that it does not mean Mozilla is changing.
This is part of our effort to make the Internet accessible. At the same time, Mozilla will continue to be at the forefront in protecting individuals’ security and privacy.