JS for SEO
The technology at Google that handles the rendering process is known as the Web Rendering Service (WRS).
Let’s say we begin the process at a URL.

1. Crawler
The crawler sends GET requests to the server. The server responds with the file’s headers and contents, which then get saved.
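As a rough sketch of what that step looks like, the snippet below fetches a URL and keeps both the headers and the body. It uses Node 18+’s built-in fetch and Googlebot’s documented smartphone user-agent string; it’s an illustration of the pattern, not Google’s actual fetcher.

```ts
// Illustrative crawler-style GET request (not Google's actual fetcher).
// Requires Node 18+ for the built-in fetch API.
async function fetchPage(url: string) {
  const response = await fetch(url, {
    headers: {
      // Googlebot's documented smartphone user agent; "W.X.Y.Z" is the
      // placeholder Google uses for the current Chrome version.
      "User-Agent":
        "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) " +
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile " +
        "Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    },
  });

  // Both the headers and the contents get saved for the processing stage.
  return {
    url,
    status: response.status,
    headers: Object.fromEntries(response.headers.entries()),
    body: await response.text(),
  };
}
```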
Because Google has largely moved to mobile-first indexing, these requests most likely come from a mobile user agent. You can see how Google is crawling your site with the URL Inspection Tool in Search Console: run it for a URL and check the Coverage information for “Crawled as,” which tells you whether you’re still on desktop indexing or have moved to mobile-first indexing.
The requests come predominantly from Mountain View, California, USA, but Google also crawls locale-adaptive pages from IP addresses in other countries. It does this because some websites block visitors from a given country or IP address, or treat them differently, which could otherwise prevent Googlebot from seeing their content.
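If you do need to treat bots differently from regular visitors, the safer approach is to verify Googlebot rather than filter by country or IP range. Google documents a double DNS lookup for this: reverse-resolve the requesting IP, check that the hostname belongs to googlebot.com or google.com, then forward-resolve that hostname and confirm it points back at the same IP. A minimal Node sketch (IPv4 only, for brevity):

```ts
import { promises as dns } from "node:dns";

// Verify a Googlebot visit using Google's documented double DNS lookup:
// reverse-resolve the IP, check the hostname, then forward-resolve it
// and confirm it maps back to the same IP.
async function isVerifiedGooglebot(ip: string): Promise<boolean> {
  try {
    const [hostname] = await dns.reverse(ip);
    if (!/\.(googlebot|google)\.com$/.test(hostname)) return false;
    const addresses = await dns.resolve4(hostname); // IPv4 only for brevity
    return addresses.includes(ip);
  } catch {
    return false; // lookup failed; treat the visitor as unverified
  }
}
```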
2. Processing

Resources and Links
Google does not navigate from one page to the next the way a human would. Part of Processing is checking the page for links to other pages and for the files needed to build the page. These URLs are extracted and added to Google’s crawl queue, which it uses to prioritise and schedule crawling.
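A toy version of that extraction step might look like the following. A regex stands in for the full HTML parsing a real crawler would do, and the resolved URLs would then be handed off to the crawl queue.

```ts
// Illustrative sketch of the link-extraction step (not Google's implementation).
// A real system would use a full HTML parser; a regex keeps the example short.
function extractLinks(html: string, baseUrl: string): string[] {
  const hrefPattern = /<a\b[^>]*\bhref\s*=\s*["']([^"']+)["']/gi;
  const links = new Set<string>();
  for (const match of html.matchAll(hrefPattern)) {
    try {
      // Resolve relative hrefs against the page URL, as a crawler must.
      links.add(new URL(match[1], baseUrl).href);
    } catch {
      // Skip hrefs that don't form a valid URL.
    }
  }
  return [...links]; // these would be pushed onto the crawl queue
}
```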
Before the downloaded HTML is sent to rendering, it can be checked for duplicate content and de-duplicated. With app shell models, the HTML response may contain very little unique content or code; in fact, the same code may appear on every page of the site, and even across multiple websites. This can result in pages being treated as duplicates and not being sent to rendering right away. Worse, the wrong page, or even the wrong website, may show up in search results. This usually resolves itself over time, but it can be a pain, especially for newer websites.
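Google hasn’t published how this de-duplication works, but a content fingerprint over the raw HTML illustrates why app shells are at risk: before rendering, every page can look nearly byte-for-byte identical. A purely hypothetical sketch:

```ts
import { createHash } from "node:crypto";

// Hypothetical sketch of raw-HTML duplicate detection. Google's actual
// de-duplication is not public; a content hash simply shows why app
// shells are risky: every page returns almost identical pre-rendered HTML.
const seen = new Map<string, string>(); // fingerprint -> first URL seen

function looksLikeDuplicate(url: string, rawHtml: string): boolean {
  const fingerprint = createHash("sha256").update(rawHtml).digest("hex");
  if (seen.has(fingerprint)) return true; // identical before rendering
  seen.set(fingerprint, url);
  return false;
}
```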
Most Restrictive Directives

Google will choose the most restrictive statements between the HTML and the rendered version of a page. For example, if JavaScript changes a robots meta tag so that it conflicts with the one in the HTML, Google obeys whichever directive is more restrictive, and a noindex in the initial HTML means the page won’t be rendered at all.
3. Render queue

Pages wait in the render queue until Google has resources available to render them; from there, they move on to the renderer.

4. Renderer
The renderer is a headless Chrome browser that is now “evergreen,” meaning it runs the most recent version of Chrome and supports the latest features. Until recently, Google was rendering with Chrome 41, so many modern features were not supported.
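In Puppeteer, the Node library for driving headless Chrome, a minimal version of “render the page and look at the result” might look like this. It’s a sketch of the general technique, not the WRS itself.

```ts
import puppeteer from "puppeteer";

// Render a page in headless Chrome and return the post-JavaScript HTML.
// A sketch of the general technique; the WRS is far more involved.
async function renderPage(url: string): Promise<string> {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // page.content() returns the serialized DOM after scripts have run.
    return await page.content();
  } finally {
    await browser.close();
  }
}
```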
Cached Resources

Google relies heavily on caching. Pages are cached, files are cached, API requests are cached; basically, everything is cached before being sent to the renderer. Rather than downloading each resource for every page load, Google leverages cached resources to speed things up.

This can lead to situations where older file versions are used in the rendering process, and the indexed version of a page contains portions of outdated files. When you make major changes, you can use file versioning or content fingerprinting to generate new file names, so that Google has to download the updated resource for rendering.
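Most bundlers support this out of the box. For example, webpack’s [contenthash] substitution, shown in the stripped-down config below, renames a bundle whenever its contents change, so a stale cached copy can never be served under the new name.

```ts
// webpack.config.ts — a minimal sketch of content fingerprinting.
// "[contenthash]" makes the output filename change whenever the file's
// contents change, forcing a fresh download of the new version.
import path from "node:path";
import type { Configuration } from "webpack";

const config: Configuration = {
  entry: "./src/index.ts",
  output: {
    filename: "[name].[contenthash].js", // e.g. main.3f7a9c1b2d4e5f6a.js
    path: path.resolve(__dirname, "dist"),
  },
};

export default config;
```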
No Fixed Timeout
A prevalent JavaScript SEO myth holds that the renderer waits only five seconds for your page to load. While making your site faster is always a good idea, this myth doesn’t square with the way Google caches files, described above: the renderer is effectively loading a page with everything already cached. The myth stems from testing tools such as the URL Inspection Tool, where resources are fetched in real time and a reasonable limit has to be set.
The renderer does not have a fixed timeout. It likely works something like the open-source Rendertron project: wait for a signal such as networkidle0, meaning there is no more network activity, and cap the wait with a time limit in case something gets stuck or someone tries to mine Bitcoin on their pages.
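That waiting strategy is easy to reproduce in Puppeteer, which exposes networkidle0 directly; the timeout option acts as the hard cap for pages that never go quiet. This mirrors the Rendertron-style pattern described above, not Google’s actual settings, and the 30-second cap is an arbitrary choice.

```ts
import puppeteer from "puppeteer";

// Wait for the network to go quiet, but cap the wait so a stuck page
// (or a crypto miner) can't hold the renderer forever. The 30s cap is
// an arbitrary choice for this sketch, not a number Google has published.
async function renderWhenIdle(url: string): Promise<string> {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, {
      waitUntil: "networkidle0", // no network connections for 500 ms
      timeout: 30_000, // hard cap if the page never goes idle
    });
    return await page.content();
  } finally {
    await browser.close();
  }
}
```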
5. Crawl queue
Google has a page dedicated to crawl budget, but you should know that each site has its own crawl budget and that every request has to be prioritised. Google must also weigh crawling your site against crawling every other website on the internet. Newer sites, and sites with a lot of dynamic pages, will likely be crawled more slowly. Some pages will be updated less often than others, and some resources may be requested less frequently.
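As a purely hypothetical illustration of prioritised scheduling under a budget (Google’s real signals and scheduler aren’t public), a crawl queue boils down to something like this:

```ts
// Hypothetical sketch of crawl-queue prioritisation. Google's real scheduler
// and ranking signals are not public; this only illustrates the idea that
// every request competes for a limited per-site crawl budget.
interface QueuedUrl {
  url: string;
  priority: number; // e.g. derived from popularity, change frequency, depth
}

class CrawlQueue {
  private items: QueuedUrl[] = [];

  enqueue(item: QueuedUrl): void {
    this.items.push(item);
    // Keep the highest-priority URLs at the front. Fine for a small sketch;
    // a real queue would use a heap instead of re-sorting on every insert.
    this.items.sort((a, b) => b.priority - a.priority);
  }

  /** Take up to `budget` URLs to crawl in this scheduling round. */
  next(budget: number): QueuedUrl[] {
    return this.items.splice(0, budget);
  }
}
```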