Web Statistics Shortcoming

Author: Jeff Anderson

I recently read about django-webalizer: http://github.com/arneb/django-webalizer/tree/master

I decided to set it up. I'd never really cared to analyze my log files before. I set up webalizer, and it was great. The django-webalizer app provides a good interface for my statistics. I'll definitely include it for any freelance work I might do in the future.

When I ran webalizer on my site for the first time, I realized there were a couple of broken links, and I noticed some things I could do to improve my site. I don't use a favicon, so I had several hundred 404s in my log. I also analyzed logs for a public-facing site at work, and fixed quite a few things.

This was good, but all those errors from the beginning of the month have since been taken care of. webalizer doesn't have an option to generate a report covering only the last 24 hours. A report like that is useful because it shows what is happening with my websites right now.

I decided to look into other open source log analyzers. I haven't looked super hard, but it seems like awstats and webalizer are the two most active log analyzer projects right now. awstats didn't seem to have a way to set a custom time frame for report generation either.

I'd like to write a patch, but I'm not sure it's something I want to spend time on. I do have a workaround: I can grep my logs for today's date and pipe the matching lines to webalizer or awstats.
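Since I'd rather write Python than shell, the grep step could look something like this. It's only a minimal sketch: it assumes Apache-style log lines with a timestamp such as [24/Dec/2008:10:15:32 -0700], and the name lines_for_date is made up for illustration.

```python
import datetime

# English month abbreviations as they appear in Apache-style timestamps.
# Hard-coded so the match does not depend on the system locale.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def lines_for_date(lines, day):
    """Yield only the log lines whose timestamp falls on `day`.

    Assumption: each line carries a field like [24/Dec/2008:10:15:32 -0700].
    """
    # Build the prefix that every timestamp for this day starts with.
    stamp = "[%02d/%s/%04d:" % (day.day, MONTHS[day.month - 1], day.year)
    for line in lines:
        if stamp in line:
            yield line
```

Calling lines_for_date(sys.stdin, datetime.date.today()) and writing the result to standard output would make it a drop-in replacement for the grep in that pipeline.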

What should I do? Should I take the time to patch webalizer? There are plenty of webalizer forks out there already. Would creating another one just complicate things for the community? Should I patch awstats? I'm not really up to speed with my Perl anymore, and I'm not that great with C. Another option is to write my own analyzer, though it's entirely possible that one written in Python already exists.

I could also write a post-process tool for webalizer that would run the log files through grep, re-run webalizer, and add the link to the report to the statically generated index.
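A rough sketch of that post-process idea, using a real 24-hour window instead of a match on today's date. The assumptions here are mine: Apache-style timestamps, a webalizer setup that reads the log from standard input when no log file is named on the command line or in its configuration, and -o to pick the output directory. The function names are hypothetical, and adding the link to the index is left out.

```python
import datetime
import subprocess


def parse_timestamp(line):
    """Return the timezone-aware timestamp of one log line, or None.

    Assumption: the line has a field like [24/Dec/2008:10:15:32 -0700].
    Note that %b depends on the locale; this expects English month names.
    """
    start = line.find("[")
    end = line.find("]", start)
    if start == -1 or end == -1:
        return None
    try:
        return datetime.datetime.strptime(
            line[start + 1:end], "%d/%b/%Y:%H:%M:%S %z")
    except ValueError:
        return None


def recent_lines(lines, now, window=datetime.timedelta(hours=24)):
    """Keep the lines stamped within `window` before `now`.

    `now` must be timezone-aware so it can be compared to the log stamps.
    """
    cutoff = now - window
    kept = []
    for line in lines:
        stamp = parse_timestamp(line)
        if stamp is not None and cutoff <= stamp <= now:
            kept.append(line)
    return kept


def run_webalizer(lines, output_dir):
    """Pipe the filtered lines to webalizer on standard input.

    Assumption: webalizer falls back to stdin when no log file is
    configured, and -o names the directory the report is written to.
    """
    subprocess.run(["webalizer", "-o", output_dir],
                   input="".join(lines), text=True, check=True)
```

Passing recent_lines(open(log_path), datetime.datetime.now(datetime.timezone.utc)) to run_webalizer with a separate output directory would keep the 24-hour report from overwriting the regular monthly one.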

I'm in favor of the KISS principle. In this case, I think it'd be easiest to write a very small script that can run webalizer with only the portion of the log files I'm interested in. I also like doing things right. I think that the proper solution to the problem is to have a tool that natively handles it. I guess it'll be added to my long list of things I'd like to do someday.

Posted: Dec 24, 2008 | Tags: Django Open Source Python
