FutureQuest, Inc.
Knowledgebase: Stats/Logs
Interpreting Your Raw Log Files
Posted on 24 October 2003 06:09 PM
This quick tutorial will guide you through the process of downloading and interpreting your access log files.

NOTE: FutureQuest stores only the current and previous month's raw log files. It is the site owner's responsibility to download a copy of these logs before they are purged from the servers.

Step 1: Downloading the file

To download your log file, FTP into your account and navigate to the logs_web directory (found at /big/dom/xyourdomain/logs_web). Within this directory, you will see a number of files named in the format "access.YearMonthDay.gz", e.g. "access.20001211.gz". Once you have determined which log file you want, you must download it to your computer in BINARY mode.
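If you would rather script the download than use an FTP client, the step above can be sketched in Python with the standard ftplib module. This is only a minimal sketch: the function names and the host, user, password, and domain_dir parameters are hypothetical and must be replaced with your own account details. Only the directory layout and the file naming come from this article.

```python
from datetime import date
from ftplib import FTP


def log_filename(day: date) -> str:
    """Build the archive name for one day, e.g. access.20001211.gz."""
    return f"access.{day:%Y%m%d}.gz"


def download_log(host: str, user: str, password: str,
                 domain_dir: str, day: date) -> str:
    """Fetch one day's log in binary mode and return the local file name.

    host, user, password and domain_dir are placeholders (assumptions);
    domain_dir stands for the "xyourdomain" part of the path.
    """
    name = log_filename(day)
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(f"/big/dom/{domain_dir}/logs_web")
        with open(name, "wb") as out:
            # retrbinary transfers in binary mode, which keeps the
            # compressed file intact.
            ftp.retrbinary(f"RETR {name}", out.write)
    return name
```

A text-mode (ASCII) transfer may alter bytes in the compressed file, which is why the binary retrieval call is used here.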

Step 2: Opening the file

Your next step is to uncompress the file with your favorite unzip utility (it must be able to handle .gz files), and open it in a text editor. Because log files can become rather large, you may need to open it in an editor other than Notepad (WordPad should serve the purpose).
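For very large logs, it can be easier to read the compressed file directly instead of unzipping it and opening it in an editor. The following is a minimal sketch using Python's standard gzip module; the function name read_log_lines is hypothetical, and the UTF-8 decoding with replacement of undecodable bytes is an assumption made here for robustness.

```python
import gzip


def read_log_lines(path: str):
    """Yield the lines of a .gz log one at a time, without unpacking it to disk."""
    # "rt" opens the archive as text; undecodable bytes are replaced
    # rather than raising an error.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")
```

Because the function yields one line at a time, the whole file never has to fit in memory.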

Step 3: Reading the file

While interpreting the file, it will help to know a few basics. Each line of your log file represents a "hit" to your site, so most pages will generate numerous hits/entries (one for the page itself and one for each image or other file it loads).

Below is a snippet from an actual log file, followed by a summary of what each part means:

- - [09/Dec/2000:06:04:53 -0500] "GET /directory/index.htm HTTP/1.1" 200 17645 "http://www.yahoo.com" "Mozilla/4.0 (compatible; MSIE 4.01; Windows NT)"

or in broken out form:

2) -
3) - or USERNAME
4) [09/Dec/2000:06:04:53 -0500]
5) "GET /directory/index.htm HTTP/1.1"
6) 200
7) 17645 or -
8) "http://www.yahoo.com"
9) "Mozilla/4.0 (compatible; MSIE 4.01; Windows NT)"
  • The first part of the line represents the IP address of the visitor/computer that accessed your page.

  • The second part of the line denotes the remote logname, which is derived from identd services. This will always be a dash '-' character, because the performance penalty of this feature is not worthwhile.

  • The third part of the line is a dash '-' character in our example, but if the page accessed was password protected, it would show the username the visitor entered (shown as USERNAME in the broken-out form above).

  • The fourth part, in brackets, "[09/Dec/2000:06:04:53 -0500]" represents the time the page was accessed by the visitor. The trailing "-0500" is the time zone offset from GMT.

  • The fifth part represents the request line sent by the visitor's browser, consisting of three elements (request type, file path, and the protocol version).

    For the request type you will normally see either GET, HEAD, or POST.

    For the file path, you will always see an entry starting with a "/", which represents the directory containing your file(s). If the path ends with a "/", the server looked for and returned the index file in the directory that was requested.

    For the protocol version you will see either HTTP/1.0 or HTTP/1.1. For example, a request line of "GET / HTTP/1.1" means, "get the index file in the folder containing your file(s), using the HTTP/1.1 protocol". In the example above, "GET /directory/index.htm HTTP/1.1" requests the file index.htm in the folder /directory/.

  • The sixth part of each line (in our example, "200") represents the web server's return code. A list of valid return codes with their meanings is shown by clicking here.

  • The seventh part of each line (in our example, "17645") is the number of bytes sent back to the visitor (not including the size of the return header). This field can also be a dash '-' character when the server return code is 304, meaning that the object has not changed and the browser is pulling it from its local cache. Therefore, no file transfer takes place except for the actual HTTP headers.

  • The eighth part of each line is the "referrer" (the URL of the web page containing the link the visitor clicked on to reach this page). In our example this was "http://www.yahoo.com". If the user typed in the URL or used a bookmark to load the page, this field will be empty and look like "-".

  • The ninth and last part of each line is the "user agent", i.e. the browser and operating system of the visitor. In our example this looked like "Mozilla/4.0 (compatible; MSIE 4.01; Windows NT)".
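The nine parts described above can also be pulled apart programmatically. The following is a minimal sketch in Python, not an official parser: the pattern and the parse_line function are illustrative names, the address 192.0.2.1 in the sample is a placeholder chosen for this sketch, and quoted fields that themselves contain a double quote are not handled.

```python
import re
from datetime import datetime

# One named group per part of the log line, in the order described above.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<logname>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line: str):
    """Return a dict of the nine parts, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # %b expects English month abbreviations (the default "C" locale).
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A dash means no body was sent (for example on a 304 response).
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    # Split the request line into request type, file path and protocol.
    parts = entry["request"].split()
    if len(parts) == 3:
        entry["method"], entry["path"], entry["protocol"] = parts
    return entry


sample = ('192.0.2.1 - - [09/Dec/2000:06:04:53 -0500] '
          '"GET /directory/index.htm HTTP/1.1" 200 17645 '
          '"http://www.yahoo.com" '
          '"Mozilla/4.0 (compatible; MSIE 4.01; Windows NT)"')
entry = parse_line(sample)
print(entry["path"], entry["status"], entry["size"])
```

Returning None for lines that do not match lets a caller skip damaged entries instead of stopping partway through a large file.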

Now that you are more familiar with reading your log files, you can find out who is visiting your pages, what pages they are viewing, what time they are viewing them, etc. Most of the above information can also be obtained from the graphical statistics included with your FutureQuest account at http://www.yourdomain.com/stats/ , though the most recent activity will only appear there after your stats have been updated (once per night).
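As a rough illustration of answering such questions directly from a raw log, the sketch below counts hits per page and per visitor address. It is only an example under stated assumptions: the names HIT and summarize are hypothetical, the pattern captures just the address and the requested path, and the sample lines use placeholder addresses rather than real log data.

```python
import re
from collections import Counter

# Captures only the visitor address and the requested file path.
HIT = re.compile(r'(\S+) \S+ \S+ \[[^\]]+\] "\S+ (\S+)[^"]*" \d{3} ')


def summarize(lines):
    """Count hits per requested path and per visitor address."""
    pages, visitors = Counter(), Counter()
    for line in lines:
        match = HIT.match(line)
        if match is None:
            continue  # skip lines that do not look like log entries
        address, path = match.groups()
        pages[path] += 1
        visitors[address] += 1
    return pages, visitors


# Placeholder entries for illustration only.
sample_lines = [
    '192.0.2.1 - - [09/Dec/2000:06:04:53 -0500] "GET /directory/index.htm HTTP/1.1" 200 17645 "-" "Mozilla/4.0"',
    '192.0.2.1 - - [09/Dec/2000:06:04:54 -0500] "GET /images/logo.gif HTTP/1.1" 200 2048 "-" "Mozilla/4.0"',
    '198.51.100.7 - - [09/Dec/2000:07:15:10 -0500] "GET /directory/index.htm HTTP/1.1" 200 17645 "-" "Mozilla/4.0"',
]
pages, visitors = summarize(sample_lines)
for path, hits in pages.most_common(5):
    print(hits, path)
```

The same function can be fed the lines of a downloaded log file instead of the sample list.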