So Ian Dees sent me an Email asking if he could look at some of my raw log data and do some analysis of it. I'm a little concerned about releasing that data since there are privacy issues, which Ian completely understood. So we decided a reasonable approach to balance those privacy concerns while still providing him useful data would be for me to release a random sampling of visitors and XXX out the last octet of the IP address.
So I went through the Apache log data for December 24th, 2006 and pulled all the IPs hitting the Christmas lights webcam. I then wrote a short Perl script that generated a random sub-sample of 10,000 records, sorted by timestamp and showing only the first 3 octets of each IP address. Note that the same IP addresses show up more than once - this is because people reload the page (not actually necessary, since iframes/AJAX are used), many users behind a proxy share one IP, etc. All times are MST (GMT-7) on the NTP'ed web server. Traffic was coming from all over the place with noticeable spikes from Slashdot (front page at 20:33:38) and DIGG (front page at 23:10:36) - yes, you can tell from the raw log data! ;-) Note that since it was Christmas Eve, traffic from these sites was probably lower than normal ... plus DIGG is a single link whereas the Slashdot article also had several links in it to other sites.
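For anyone who wants to do something similar with their own logs, here is a rough sketch of the sampling and masking steps described above. It is written in Python rather than Perl and is not the actual script I used; the function names `mask_record` and `sample_log` are made up for illustration. It assumes each line starts in the usual Apache common/combined log layout (IP, two fields, then a bracketed timestamp) and that the log is already in chronological order, so keeping the original line order is the same as sorting by timestamp.

```python
import random
import re

# Leading part of an Apache common/combined log line (assumed layout):
#   IP ident user [timestamp] ...
# Group 1 captures the first three octets, group 2 the timestamp text.
LOG_PREFIX = re.compile(
    r'^(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3} \S+ \S+ \[([^\]]+)\]'
)


def mask_record(line):
    """Return 'timestamp a.b.c.XXX' for a log line, or None if it does not parse."""
    match = LOG_PREFIX.match(line)
    if match is None:
        return None
    first_three_octets, timestamp = match.groups()
    return f"{timestamp} {first_three_octets}.XXX"


def sample_log(lines, sample_size, seed=None):
    """Pick a random sub-sample of masked records, kept in original log order."""
    rng = random.Random(seed)
    # Drop lines that do not look like log entries.
    records = [r for r in map(mask_record, lines) if r is not None]
    # Never ask for more records than exist.
    k = min(sample_size, len(records))
    # Sample positions without replacement, then sort the positions so the
    # output stays in log (chronological) order.
    chosen = sorted(rng.sample(range(len(records)), k))
    return [records[i] for i in chosen]
```

Reading the log with `open(path)` and calling `sample_log(log_file, 10000)` would give the same kind of list as the data file. Sampling positions rather than lines means repeat hits from the same IP can still show up, just as they do in the real sample.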
You can access/download the 10,000-record random sample data file here.
I would ask that if you download this file and do anything "interesting" with it, you please let me know. And if it is really useful and you want to express your appreciation, please consider donating to the University of Maryland Center for Celiac Research.