KDnuggets : Web Mining Course : Assignment 2

Web Mining Course: Assignment 2 - Global log analysis

For the "toy" log file d100.log (or another log file), compute:
  1. the total number of hits
  2. the number of different IP addresses
  3. the methods used (GET, HEAD, etc.) and the number of hits for each method
  4. the number of different files requested
  5. the number of HTML pages requested
    (count files ending in .html, .htm, or /, which indicates a directory)
What interesting observations can you make?
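
One possible way to compute the five values is sketched below in Python. It is only a starting point and rests on assumptions the assignment does not state: that each line of the log is in the Common Log Format (host, two ident fields, a bracketed timestamp, then the quoted request), that query strings after "?" should be ignored when comparing files, and that item 5 means distinct HTML pages rather than hits on HTML pages. The names LOG_LINE and summarize are made up for this illustration.

```python
import re
from collections import Counter

# Assumed Common Log Format, for example:
# 1.2.3.4 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) ([^"\s]+)[^"]*"')


def summarize(lines):
    """Return the five assignment counts for an iterable of log lines."""
    hits = 0
    malformed = 0          # non-blank lines that do not match LOG_LINE
    ips = set()
    methods = Counter()
    files = set()

    for line in lines:
        if not line.strip():
            continue       # blank lines are not hits
        hits += 1
        match = LOG_LINE.match(line)
        if match is None:
            malformed += 1
            continue
        ip, method, url = match.groups()
        path = url.split("?", 1)[0]   # assumption: drop any query string
        ips.add(ip)
        methods[method] += 1
        files.add(path)

    # Item 5: paths ending in .html, .htm, or / (directories).
    html_pages = {p for p in files if p.lower().endswith((".html", ".htm", "/"))}

    return {
        "hits": hits,
        "malformed": malformed,
        "ips": len(ips),
        "methods": dict(methods),
        "files": len(files),
        "html_pages": len(html_pages),
    }
```

To run it on the toy log, pass an open file, for instance summarize(open("d100.log", errors="replace")). Reading line by line keeps memory use tied to the number of distinct IP addresses and files rather than to the size of the log, so the same function should also cope with the larger log once it has been unzipped.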

For extra credit, compute the same values for the large log file kdlog.zip.

