SEO with log file analysis
Every request made to your server is recorded in your website’s log file, and evaluating this information may reveal details on how search engines crawl your site and its web pages.
In this post, you’ll discover how to perform a log file analysis and what it could be used for in SEO.
What Is Log File Analysis?
Log file analysis is a part of technical SEO that shows you how Googlebot (and other web crawlers and users) interacts with your website. Analysing a log file can provide valuable insights to inform your SEO strategy or solve problems with the crawling and indexing of your web pages.
What Is a Log File and What Data Does It Hold?
The log file for your website is saved on your server and records information about the requests that were made.
Every time a user or bot accesses a page on your site, an entry is made in your log file for each resource loaded. The log reveals how visitors, search engines, and other crawlers interact with your website.
A log file contains information such as:
- The URL of the requested page or resource.
- The HTTP status code of the response.
- The server’s IP address.
- The timestamp of the hit (date and time).
- The user agent making the request (e.g., Googlebot).
- The request method (GET or POST).
The client IP, the time it took to download the resource, and the referrer may also be included.
There’s no debating that looking through a log file for the first time might be puzzling. However, if you know what log file analysis is and how to do it, you’ll be able to acquire some extremely useful information.
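For example, a single entry in the widely used Apache/Nginx “combined” log format can be pulled apart with a short script. This is a sketch: the regex and the sample line assume the combined format, and your server’s format may differ.

```python
import re

# Regex for the Apache/Nginx "combined" log format (a common default;
# adjust it if your server uses a custom format).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

# A made-up example line in the combined format.
line = ('66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] '
        '"GET /products/widget HTTP/1.1" 200 5124 '
        '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
        '+http://www.google.com/bot.html)"')

match = LOG_PATTERN.match(line)
entry = match.groupdict()
print(entry["url"], entry["status"], entry["user_agent"])
```

Once each line is a dictionary like this, grouping by status code, URL, or user agent becomes straightforward.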
What Is the Purpose of Log File Analysis in SEO?
As an SEO, you may gain various insights from your site’s log file, and here are some of the most important things to be aware of:
- How often Googlebot crawls your site and its most significant pages (and whether they’re crawled at all), and which pages aren’t crawled frequently.
- Which of your pages and folders are crawled most frequently.
- Whether your crawl budget is being spent on unimportant pages.
- Which parameterised URLs are being crawled unnecessarily.
- Whether your site has been switched to mobile-first indexing.
- The status code returned for each of your site’s pages, and any problem areas.
- Whether a page is too large or takes too long to load.
- Static resources that are crawled excessively.
- Frequently crawled redirect chains.
- Sudden increases or decreases in crawler activity.
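As a small illustration of that last point, you can bucket bot hits by day to spot sudden increases or drops in crawler activity. The entries below are hypothetical parsed log lines; in practice you’d feed in your full log.

```python
from collections import Counter
from datetime import datetime

# Hypothetical parsed log entries; timestamps use the common
# Apache/Nginx format, which your server may or may not use.
entries = [
    {"timestamp": "10/Oct/2023:13:55:36 +0000", "user_agent": "Googlebot/2.1"},
    {"timestamp": "10/Oct/2023:14:02:11 +0000", "user_agent": "Googlebot/2.1"},
    {"timestamp": "11/Oct/2023:09:15:00 +0000", "user_agent": "Googlebot/2.1"},
]

# Count Googlebot hits per calendar day to reveal spikes or drops.
hits_per_day = Counter(
    datetime.strptime(e["timestamp"], "%d/%b/%Y:%H:%M:%S %z").date()
    for e in entries
    if "Googlebot" in e["user_agent"]
)
for day, hits in sorted(hits_per_day.items()):
    print(day, hits)
```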
Where To Get Your Log File
You must first obtain a copy of your site’s log file before you can analyse it.
You’ll need access to your web server to download a copy of the log files. If you don’t have this level of access, ask your web developer or IT staff to grant it or to provide a copy of the log file.
You can access the log file using the file manager in your server control panel, the command line, or an FTP client.
We’ll assume you’re connecting to your server via FTP because that’s the most popular method.
After you’ve connected to the server, you’ll need to find the server log file’s location. For common server setups, the default locations are:
- Apache: /var/log/apache2/access.log (or /var/log/httpd/access_log on some distributions)
- Nginx: /var/log/nginx/access.log
- IIS: %SystemDrive%\inetpub\logs\LogFiles
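Log files are often rotated and compressed, so a helper that reads both plain and gzipped files is handy. A sketch (the glob pattern is an assumption; adjust it to your server’s log location and rotation scheme):

```python
import glob
import gzip

def read_log_lines(pattern="access.log*"):
    """Yield lines from plain and gzip-rotated log files.

    The default glob pattern is an assumption; point it at
    your server's actual log directory and naming scheme.
    """
    for path in sorted(glob.glob(pattern)):
        # Rotated logs are commonly compressed as .gz.
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt", errors="replace") as f:
            yield from f
```

This gives you a single stream of lines across all rotations, which you can then parse and aggregate.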
However, it’s vital to keep in mind that accessing your site’s log file isn’t always straightforward, and common issues include:
- Log files that have been disabled by a server administrator and are no longer accessible.
- Clients or other internal teams who refuse to provide log files or access to them.
- Huge file sizes.
- Log files that only store recent data (retaining a set number of days or hits).
- Issues caused by CDNs.
- Custom log formats.
However, each of these problems has a solution that can typically be implemented with the help of a developer or server administrator.
How To Do a Log File Analysis
Now that we’ve seen some of the benefits of log file analysis, let’s look at how to perform it. You’ll need two things:
- Your website’s server log file.
- Access to a log file analyser tool.
While you can simply rename a .log file to .csv and then read and analyse it in Excel or Google Sheets, using a specialist application makes the process easier and faster. This means you can spend more time acting on any issues you discover rather than manually evaluating the data.
However, if you want to perform a manual analysis, you’ll need to be comfortable with the advanced features of those programs, including pivot table creation.
In addition, here are some ideas for what to look for in your log file data and how to use it in your analysis.
Crawl Budget and Status Codes
Log files can help you figure out how your crawl budget is distributed across your site. When you group the pages crawled by their status codes, it becomes clear how much resource is being allocated to important 200 status code pages versus being wasted on broken or redirected pages.
You can pivot the log file data to check how many requests are made to each status code. Pivot tables can be built in Excel, but if you have a lot of data to look at, you might want to use Python instead. Pivot tables are a great way to see aggregated data across several categories, and I find them especially useful for studying large log files.
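Here’s a dependency-free sketch of that grouping, the equivalent of a one-column pivot table, using only the standard library; the entries are hypothetical parsed log lines.

```python
from collections import Counter

# Hypothetical parsed log entries: one dict per request.
entries = [
    {"url": "/", "status": 200},
    {"url": "/old-page", "status": 404},
    {"url": "/", "status": 200},
    {"url": "/redirect-me", "status": 301},
    {"url": "/old-page", "status": 404},
]

# Requests per status code: how much crawl activity lands on
# healthy 200 pages versus broken or redirected ones.
requests_by_status = Counter(e["status"] for e in entries)
for status, count in sorted(requests_by_status.items()):
    print(status, count)
```

With a real log you’d feed in thousands of entries, but the grouping step stays the same; for richer pivots (status by folder, by user agent), a library like pandas is a common choice.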
You can also see how search engine bots browse indexable pages versus non-indexable pages on your site. Comparing log file data with a crawl of your website will help you figure out if any pages are wasting the crawl budget because they aren’t required to be included in a search engine’s index.
Most vs. Least Crawled Pages
Data from log files can also help you determine which pages are crawled the most by search engine crawlers.
This lets you confirm that your most important pages are found and crawled, that new pages are discovered quickly, and that frequently updated pages are crawled often enough.
Similarly, you’ll be able to see if any pages aren’t being crawled at all, or aren’t being visited as often as you’d like by search engine crawlers.
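One way to surface both ends of the spectrum is to compare URL hit counts from the log file against the URLs you expect to be crawled (from a site crawl or sitemap). A sketch with made-up data:

```python
from collections import Counter

# Hypothetical inputs: URLs Googlebot requested (from the log file)
# and the URLs you expect to be crawled (from a site crawl or sitemap).
crawled_hits = Counter(["/", "/", "/products/widget", "/blog/post-1"])
expected_urls = {"/", "/products/widget", "/products/gadget", "/blog/post-1"}

# Most-crawled pages, and expected pages with no crawl activity at all.
most_crawled = crawled_hits.most_common(3)
never_crawled = expected_urls - set(crawled_hits)
print("Most crawled:", most_crawled)
print("Never crawled:", sorted(never_crawled))
```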
Crawl Depth and Internal Linking
You can also observe how deep into your site’s architecture search engine bots are crawling by combining log file data with insights from a crawl of your website.
If you have important product pages at levels four and five, but your log files show that Googlebot doesn’t crawl these levels very often, you may want to consider making improvements to improve their visibility.
Internal linking is one way to do this, and internal links are another essential data point you can examine from your combined log file and crawl insights.
In general, the more internal links a page has, the more discoverable it is. You can understand both the structure and discoverability of pages by integrating log file data with internal link statistics from a site crawl.
You can also compare bot hits against internal link counts to see if there’s a correlation between the two.
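A quick way to check for such a correlation is Pearson’s r over per-page figures joined from your crawl data and log file. The numbers below are invented for illustration:

```python
from statistics import mean

# Hypothetical per-page data joined from a site crawl (internal link
# counts) and the log file (Googlebot hits); the figures are invented.
pages = {
    "/":                {"internal_links": 120, "bot_hits": 300},
    "/products/widget": {"internal_links": 45,  "bot_hits": 90},
    "/blog/post-1":     {"internal_links": 12,  "bot_hits": 20},
    "/old-page":        {"internal_links": 2,   "bot_hits": 3},
}

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

links = [p["internal_links"] for p in pages.values()]
hits = [p["bot_hits"] for p in pages.values()]
r = pearson(links, hits)
print(f"correlation between internal links and bot hits: {r:.2f}")
```

A value near 1 suggests better-linked pages get crawled more; correlation isn’t causation, but a weak value for your site would be worth investigating.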
Desktop vs. Mobile
As previously stated, log file data also reveals the user agent used to access the page, allowing you to determine whether the page was accessed by a mobile or desktop bot.
As a result, you’ll be able to see how many pages of your site are crawled by the mobile versus the desktop bot, and how this has changed over time.
You may also discover that a given section of your site is crawled predominantly by a desktop user agent, in which case you should investigate why Google prefers this over mobile-first crawling.
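A simple heuristic for splitting hits by crawler type is to inspect the user-agent string: Googlebot’s smartphone crawler includes “Android” and “Mobile” in its UA, while the desktop crawler doesn’t. Note that user agents can be spoofed, so verifying bots via reverse DNS is advisable; this check is a sketch, not an official API.

```python
# Classify Googlebot requests as mobile or desktop by user-agent string.
# This string check is a heuristic; for reliable results, verify the
# requesting IP really belongs to Googlebot (e.g. via reverse DNS).
def classify_googlebot(user_agent: str) -> str:
    if "Googlebot" not in user_agent:
        return "other"
    return "mobile" if "Mobile" in user_agent else "desktop"

# Sample UA strings modelled on Googlebot's published formats.
mobile_ua = ("Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
             "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 "
             "Mobile Safari/537.36 (compatible; Googlebot/2.1; "
             "+http://www.google.com/bot.html)")
desktop_ua = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
              "compatible; Googlebot/2.1; +http://www.google.com/bot.html) "
              "Chrome/119.0.0.0 Safari/537.36")

print(classify_googlebot(mobile_ua))
print(classify_googlebot(desktop_ua))
```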
Using Log File Analysis to Make Improvements
After you’ve done some log file analysis and found some useful information, you may need to make some modifications to your website.
For example, if you notice that Google is crawling a lot of your site’s broken or redirected pages, this could indicate that these pages are overly accessible to search engine crawlers.
As a result, you’ll want to check for any internal links to these broken pages, and update any internal links that point through redirects.
You might be reviewing log file data to figure out how recent changes have affected crawling or to gather information ahead of changes you or another team are planning.
If you wish to change the architecture of a website, you’ll want to make sure that Google can still find and crawl the most significant pages on your site.
Regular log file analysis may assist SEO professionals in better understanding how their website is crawled by search engines like Google, as well as reveal key insights that can aid in making data-driven decisions.