“The severity of the Google cache problem (‘File not found on the server’) can be very high, and finding its actual cause is crucial. As a webmaster or an SEO professional, you cannot ignore this.”

If you look up the Google cached version of your website in a browser, it should show information like the following:

Image ”A”

cached version of webpage

Image "A" above shows a cached version of the webpage along with the date and time of the snapshot; you can also view the text-only version and the source code of the page.

The text version shows exactly how bots read the content.

The visibility of the cached version of a webpage indicates that Google has retained your webpage and its components on its servers. If you see an error instead, it means the cached copy has been removed from Google's servers. This is a serious issue: if it happened because of a technical problem or a penalty, you are likely to lose your keyword rankings on the SERP. If the issue is on Google's end, the cache will usually be back shortly; such cases are rare, and there will be little or no impact on rankings.

People looking for a solution to Google cache issues often search with queries like the ones below, so I am listing some of the most searched queries for your reference:

  • google 404 error that’s all we know
  • that’s an error. the requested URL was not found on this server. 000webhost
  • google drive the requested URL was not found on this server.
  • that’s all we know google
  • how to fix the requested URL was not found on this server
  • the requested URL proxy was not found on this server
  • access error 404 not found
  • that’s an error. google chrome

I am publishing an error image for your reference. To find the cached version, add "cache:" in front of your domain name.

E.g., my domain is https://kandra.pro, so I would search for cache:https://kandra.pro, and the results will be as shown in the image above. If there is an issue with the cached copy on Google's servers, you will see the following instead:

Image “B”

webcache error

This error implies that Google does not have a copy of the page on its servers, which effectively means your webpage has been removed from Google's index. To check whether a webpage is indexed by Google, run the following query on Google search:

site:domainname.com, or site:domainname.com/URL to check the index status of a specific URL

E.g., site:https://kandra.pro/problems-solved/finding-drop-in-organic-search-traffic-case-study-of-client/ returns results like the screenshot below (image "C"), which tells us the page is in Google's index. If you want to know when the page was last crawled and which dated copy Google holds, look up its cached version as shown in images "D" and "B".

Image “C”

Index status check

Image "C" above shows the case where Google has saved a copy of your webpage on its servers and is therefore showing it in search results. If, instead, you see an error for your page as shown earlier (image "B"), you should assume that Google has already removed your webpage from its index, or is about to, so addressing this problem is very important.

Image “D”

Cached version of webpage

Once your webpage is removed from Google's index, it no longer appears for any of the queries it used to rank for, and there will be a significant drop in organic traffic.

Whenever you notice a recent date on the cached copy (cache:websitename.com or cache:URL), it implies the crawl frequency of the webpage is good and you are reaping all the benefits of frequent crawling.
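If you want to spot-check the cache status of a few URLs without typing cache: queries one by one, a small script can request Google's cache viewer directly. This is only a rough sketch: it assumes the webcache.googleusercontent.com endpoint and a simple status-code check, and Google may rate-limit or block automated requests, so treat the output as a hint and confirm manually in the browser.

```python
import requests

def check_google_cache(url: str) -> None:
    """Request Google's cache viewer for a URL and report whether a copy seems to exist.

    Assumes the webcache.googleusercontent.com endpoint; Google may block or
    rate-limit automated requests, so this is a hint, not proof.
    """
    cache_viewer = "https://webcache.googleusercontent.com/search"
    resp = requests.get(
        cache_viewer,
        params={"q": f"cache:{url}"},
        headers={"User-Agent": "Mozilla/5.0"},  # browser-like UA; bare scripts are often blocked
        timeout=10,
    )
    if resp.status_code == 200:
        print(f"Cached copy appears to exist for {url}")
    elif resp.status_code == 404:
        print(f"No cached copy found for {url} (the 'not found on this server' case)")
    else:
        print(f"Unexpected status {resp.status_code}; Google may be blocking the request")

if __name__ == "__main__":
    check_google_cache("https://kandra.pro/")
```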

When will Google remove a cached copy from its servers and de-index the page from search results?

There are multiple cases in which Google decides to remove your webpage from its servers; they are as follows:

Thin webpages/No value to the user:

  • Web pages that offer no value to the user are the most prone to this kind of problem.
  • Web pages with very thin content.
  • Web pages that are not optimized and for which no searches are happening on the internet.

If a webpage's content and quality are poor, search engines stop crawling it; since such pages are not crawled on a regular basis, they get pushed to the secondary index and are then removed from the index completely.

Hit by Penguin/Panda algorithm:

If a website is hit by one of these algorithm updates, search engine bots stop crawling its pages, and without any fresh crawl they eventually remove the cached copies from their servers.

Poor internal links:

If any of your files (URLs) is an orphan on the server (not well connected internally), there is a very high chance it will drop out of the index, and Google will also remove it from its cache if the file is unavailable for crawling.

Severe Server Issues:

If the servers hosting the website are not reliable, search engines face difficulty crawling the website or its webpages and, in turn, start removing those pages from the index because they have not seen them for a long time.

Poor shared hosting is one of the common causes of such server issues.

Issues in the configuration files on the server:

Development teams make a lot of changes based on requirements from marketing, sales, product managers, and other teams. The requirement can be anything from a content update to IP-detection-based content serving to website performance work, and when the development team is working on such sensitive areas it has to make sure everything goes through without mistakes.

Here is one example of how it can go wrong; this case occurred during the implementation of a load balancer configuration on the server.

The load balancer's internal IP is dynamic and changes frequently, and this IP (or range of IPs) must be configured in every application on the system. If updating this IP is missed in even one place while making changes on the server, it can turn into a serious issue, and bots will have difficulty rendering the content of your webpage.
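As a rough illustration of the kind of check that can catch this, the sketch below compares a configured trusted-proxy allowlist against the load balancer's currently resolved IPs. The hostname, IP ranges, and function name are hypothetical placeholders; in a real setup you would pull the ranges from your actual application or server configuration.

```python
import ipaddress
import socket

# Hypothetical example values: the trusted-proxy ranges configured in the
# application. In a real setup these would come from your app/server config.
CONFIGURED_TRUSTED_RANGES = [
    ipaddress.ip_network("10.0.1.0/24"),
    ipaddress.ip_network("10.0.2.0/24"),
]

# Hypothetical internal hostname of the load balancer.
LOAD_BALANCER_HOST = "internal-lb.example.local"

def lb_ips_covered_by_config() -> bool:
    """Resolve the load balancer's current IPs and verify each one falls inside a
    configured trusted range; a miss is exactly the kind of gap that can make bot
    requests fail or render incorrectly."""
    infos = socket.getaddrinfo(LOAD_BALANCER_HOST, None)
    current_ips = {ipaddress.ip_address(info[4][0]) for info in infos}
    ok = True
    for ip in current_ips:
        if not any(ip in net for net in CONFIGURED_TRUSTED_RANGES):
            print(f"WARNING: load balancer IP {ip} is not in the configured trusted ranges")
            ok = False
    return ok

if __name__ == "__main__":
    print("Config OK" if lb_ips_covered_by_config() else "Config needs updating")
```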

To identify whether this has caused a problem, you can always cross-check in your webmaster tools: whenever you attempt a "Fetch and Render" in Google Webmaster Tools, you will see a "Temporarily unavailable" error if there is an issue with the configuration or the config file.

With issues in the configuration file, Google bots may not be able to complete "Fetch & Render", as seen in GWT:

Fetch as Google errors

After fixing the issues with the config file, GWT now allows Google bots to Fetch and Render, as shown in the image below:

Fetch As Google On webmaster Tools
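Alongside Fetch and Render, you can run a quick check from your own side by requesting the page with a Googlebot-style user agent and confirming the server returns the expected content. The sketch below is only an approximation: it does not render JavaScript and does not come from Google's IPs, so a pass here does not guarantee Googlebot sees the same thing, but a failure is a strong signal that something in the configuration is wrong. The URL and expected text are placeholders.

```python
import requests

def check_fetch_as_bot(url: str, must_contain: str) -> None:
    """Request a page with a Googlebot-style User-Agent and confirm the server
    returns HTTP 200 and the expected content. Approximates, but does not
    replace, Fetch and Render."""
    headers = {
        "User-Agent": (
            "Mozilla/5.0 (compatible; Googlebot/2.1; "
            "+http://www.google.com/bot.html)"
        )
    }
    resp = requests.get(url, headers=headers, timeout=15)
    print(f"{url} -> HTTP {resp.status_code}")
    if resp.status_code != 200:
        print("Server is not returning the page to bot-like requests; check the config.")
    elif must_contain not in resp.text:
        print("Page returned, but the expected content is missing; rendering may be broken.")
    else:
        print("Page and expected content returned correctly.")

if __name__ == "__main__":
    # Hypothetical values for illustration.
    check_fetch_as_bot("https://kandra.pro/", "Kandra")
```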

New websites or Inactive Websites:

If Google cannot access your webpage for a long time after its previous crawl, it starts moving the page to its secondary index and then finally removes it from the index completely.

This is a common problem with very new websites, which usually do not have much content or activity and might not have any citations (backlinks/external links) pointing to the website or its pages.

Search engine bots will not consider crawling your web pages unless there is a content update, an invitation through "Fetch as Google" in webmaster tools, or backlinks; at the very least, there should be good brand searches, or pages optimized for keywords with decent search volume, so that search engines consider crawling and indexing them.

Since this happens to new or very inactive websites, the loss incurred is negligible or nonexistent.

Highly competitive product/service and a sudden drop in crawl frequency:

If your product, service, or content is in very high demand, with proportionately high competition, it is crucial to provide updated information to search engine bots on a regular basis. If you fail to do so for any of the reasons mentioned above, Google will quickly remove those pages from its index, and there will be a huge drop in organic traffic coming to them from search engines.

Significant Drop in External Links or brand searches on search engines:

With a large number of backlinks, search engines find the website/webpage often and therefore crawl and index it frequently; any sudden disturbance in the backlink profile may cause this problem of "page not found, 404. That's an error. That's all we know."

Too much traffic to the server or unwanted bot traffic:

Too much spam traffic from spam bots or other unwanted bots keeps the server busy and may prevent the actual search engine bots from crawling the web pages; that is when they start removing the pages from their index.
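One way to see whether unwanted bots are eating up your server is to count requests per user agent in the access log. The sketch below assumes a standard combined log format and a typical nginx log path; both are assumptions you would adjust for your own server.

```python
import re
from collections import Counter

# Hypothetical path; adjust to your server's access log location.
LOG_PATH = "/var/log/nginx/access.log"

# Rough pattern for the common/combined log format: the User-Agent is the
# last quoted field on each line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')

def top_user_agents(path: str, limit: int = 10) -> None:
    """Count requests per User-Agent to spot unwanted bots hogging the server."""
    counts = Counter()
    with open(path, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = UA_PATTERN.search(line)
            if match:
                counts[match.group(1)] += 1
    for agent, count in counts.most_common(limit):
        print(f"{count:8d}  {agent}")

if __name__ == "__main__":
    top_user_agents(LOG_PATH)
```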

Conclusion:

Recently I came across this "Google cache issue" for one of my clients, where all the revenue-driving pages of the website had been removed from the search results/index. We found it was due to a configuration issue in the config file, introduced while implementing a load balancer on our servers to handle a large amount of incoming traffic.

De-indexing can happen for any of the reasons mentioned above. Catching it in time is very important; missing it can have a very big impact on organic search traffic, and recovery can sometimes take a long time.

-

Manjunath Chowdary

Digital Marketing Expert, consultant, Mentor and
Director of KandraDigital Marketing
Solutions Pvt Ltd.

-Kandra Digital

An agency built with the core purpose of delivering quality digital marketing in an era where digital marketing services are treated as just business rather than as value for the business, its owners, and their resources and time.

Get to us