Getting your website ranked in Google is usually a huge priority for site owners and SEOs. The goal is to create great content, then have it indexed, ranked, and seen by the public.
What happens if you have content that you don’t want Google to cache or content that has cached and you need to have it removed?
While we focus on creating great content such as blogs and FAQ pages, sometimes site owners publish content that is incorrect, duplicated, or just plain bad. Bad can mean offensive, or it can mean 60 pages of demo content that all read “Lorem Ipsum”. Unless you are a digital marketing company that focuses on how great your use of lorem ipsum is, that’s bad.
Sure, you didn’t mean to. But as you check your analytics and rankings, you notice something is off, so you dig deeper. You see a demo install of WooCommerce, dummy products, a whole dummy website coming up in Google, indexed and cached and counting as your site’s contribution to the search engine.
Luckily, there are ways these issues can be fixed.
Cache Me Ousside, How Do Dat?
So let’s start with how we find these issues in the first place. We’re assuming you have a WordPress site here, so first you can look under POSTS and PAGES and see how many pages you have published. This is a good place to start, but we are going to dig further.
A good next step is to go to Google and type in: site:yoursite.com (replace yoursite.com with your actual site address). This will show you how many pages you have indexed in Google and the pages themselves. This is the long way around, but it can be helpful as a starting place.
Did you build 15 good pages, have about 30 blogs, but now you’re showing 500 pages indexed? This can point to a bigger set of problems. It might be pages being indexed that you don’t want, like a blog category page, or something internal that causes duplicated content.
Or it can be the problem that brought us here: bad pages.
Look at the results and check the page titles and descriptions showing. Do you see something unrelated, like a page on Veterinary services when you have a site for Plumbing? Seeing our old friend “lorem ipsum”?
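If you have more than a handful of results to eyeball, a few lines of Python can do the first pass for you. This is only a rough sketch: it assumes you have already copied the URL, title, and description of each result into a list by hand or from a crawler export, and the placeholder phrases in it are our own guesses, not an official list.

```python
import re

# Hypothetical placeholder phrases; extend with whatever your theme or demo content ships with.
PLACEHOLDER_PATTERNS = [r"lorem ipsum", r"dummy product", r"sample page", r"hello world"]


def find_placeholder_pages(pages):
    """Return URLs whose title or description contains a placeholder phrase.

    `pages` is a list of (url, title, description) tuples collected by hand
    from the site: results or exported from a crawler.
    """
    combined = re.compile("|".join(PLACEHOLDER_PATTERNS), re.IGNORECASE)
    flagged = []
    for url, title, description in pages:
        # Search title and description together so either one can trigger a flag.
        if combined.search(f"{title} {description}"):
            flagged.append(url)
    return flagged


# Illustrative data only; yoursite.com stands in for your real domain.
sample = [
    ("https://yoursite.com/plumbing-services", "Plumbing Services", "Licensed plumbers in PA."),
    ("https://yoursite.com/demo-page-3", "Demo Page", "Lorem ipsum dolor sit amet."),
]
print(find_placeholder_pages(sample))  # ['https://yoursite.com/demo-page-3']
```

It will not catch an off-topic page like the Veterinary example, so treat it as a helper for the obvious junk and still read through the list yourself.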
There are tools that can search the same thing and make it much easier to analyze pages and data. One great tool we use at Infinity Digital is Screaming Frog SEO Spider Tool.
Screaming Frog, besides having the best name of all the different tools we use, allows you to crawl a website’s URLs and fetch the onsite elements that are key to analyzing onsite SEO. It allows you to scan up to 500 pages for FREE, and there is a low-cost option I recommend for no other reason than supporting the team that developed this tool, even if you have fewer than 500 pages. Otherwise, you can use it free, but you can’t save your data or take advantage of all the other features.
After you scan your site, you’ll be presented with a dashboard of cells outlining page address (URL), status code (good for any 404 or 301 issues), page title, meta description and more, even the H1 and H2 headings on the page.
Analyze the information here and see if pages come up that you didn’t know were part of your site. While it won’t grab everything, again, it helps and sometimes it takes multiple tools or techniques to find out all the issues you may have. You can also look at Google Search Console and see if you have any additional sitemaps that have been indexed or a spike in indexed pages. Remember, the more places you check and validate the data, the less you miss that could cause harm.
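One way to spot pages you didn’t know about is to compare a crawl export against the list of pages you meant to publish. The sketch below assumes you exported the crawl as CSV and that the URL column is named "Address"; check your own export and change the `url_column` argument if it differs. The function name and sample data are made up for illustration.

```python
import csv
import io


def unexpected_urls(crawl_csv_text, expected_urls, url_column="Address"):
    """Return crawled URLs that are not in the list of pages you meant to publish."""

    def normalize(url):
        # Treat trailing-slash variants as the same page.
        return url.strip().rstrip("/")

    expected = {normalize(u) for u in expected_urls}
    reader = csv.DictReader(io.StringIO(crawl_csv_text))
    extras = []
    for row in reader:
        url = row[url_column]
        if normalize(url) not in expected:
            extras.append(url)
    return extras


# Illustrative export; a real one has many more columns.
crawl_export = """Address,Status Code,Title 1
https://yoursite.com/,200,Home
https://yoursite.com/services/,200,Services
https://yoursite.com/shop/dummy-product/,200,Dummy Product
"""
expected = ["https://yoursite.com", "https://yoursite.com/services"]
print(unexpected_urls(crawl_export, expected))  # ['https://yoursite.com/shop/dummy-product/']
```

To run it on a real export, read the file into a string first and pass that in. Anything it prints is a candidate for your cleanup list, not an automatic verdict.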
Now we know, how does it go (away)?
Let’s now assume you have identified the pages on your site that you don’t want in Google. You have the spreadsheet you made to track the URLs and the progress of their removal. Oh, you didn’t do a spreadsheet to track… okay, do that now.
Good.
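If you would rather generate that tracking sheet than type it, here is a minimal sketch that builds it as CSV text you can save and open in any spreadsheet program. The column names are our own suggestion, not a standard, so rename them to suit how you work.

```python
import csv
import io

# Hypothetical columns for tracking each unwanted URL through the cleanup.
FIELDS = ["url", "problem", "planned_action", "removal_requested", "confirmed_gone"]


def build_tracking_sheet(rows):
    """Return CSV text with one line per unwanted URL; missing fields are left blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({field: row.get(field, "") for field in FIELDS})
    return buffer.getvalue()


# Illustrative rows only.
sheet = build_tracking_sheet([
    {"url": "https://yoursite.com/demo-page-3", "problem": "lorem ipsum", "planned_action": "410"},
    {"url": "https://yoursite.com/old-services", "problem": "duplicate", "planned_action": "301"},
])
print(sheet)
```

Write the returned text to a file such as a .csv of your choosing, then fill in the last two columns as you work through the steps that follow.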
Now that we know the pages, we need to make sure we do this right, or else it’s a waste of effort, or it can make things worse. Let’s start with the content itself. Was it junk content, like a leftover demo, a client’s build link that was accidentally indexed, or a blank page that sat there without content for the last six months? How relevant is the content? Can it be redirected to another page with a 301, or is it bad enough that you want to remove any notice it was ever there with a 410? Decide now and do those items first. Make sure that when the search engine crawls the URL, it gets a deliberate 301 or 410 rather than a generic 404 not found page. Tell them what happened the proper way and you’re one step closer to fixing your mess.
Now we’ve 301’d that “WEB-DESIGN-COPY-2-TEST” page you never finished to your proper “BEST-WEB-DESIGN-IN-PA”, and dealt with the WooCommerce page on your site for a book you never actually wrote and published, but had a great deal for at $6.99 + FREE SHIPPING. All the pages now are saying “we go somewhere else now” or “we’re gone, don’t look here anymore”, but there are still some things you can do to get out of the cache, so those mistakes stop counting against you and potential customers don’t see them!
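The 301-or-410 decision can be written down as a tiny rule, which also makes it easy to double-check that each URL really returns what you intended. This is a sketch under our own assumptions: the paths are made up, and the observed status codes are ones you would collect yourself, for example from the status code column of a crawl.

```python
from http import HTTPStatus


def intended_status(replacement_url):
    """301 when there is a relevant page to redirect to, otherwise 410."""
    return HTTPStatus.MOVED_PERMANENTLY if replacement_url else HTTPStatus.GONE


def audit(pages, observed):
    """List (url, wanted, got) for every URL not returning its intended status.

    pages:    dict of unwanted URL -> replacement URL, or None if there is none
    observed: dict of unwanted URL -> status code you actually saw
    """
    problems = []
    for url, replacement in pages.items():
        want = intended_status(replacement)
        got = observed.get(url)
        if got != want:
            problems.append((url, int(want), got))
    return problems


# Hypothetical paths and results for illustration.
pages = {
    "/web-design-copy-2-test": "/best-web-design-in-pa",
    "/shop/unwritten-book": None,
}
observed = {"/web-design-copy-2-test": 301, "/shop/unwritten-book": 404}
print(audit(pages, observed))  # [('/shop/unwritten-book', 410, 404)]
```

Here the second page is still returning a plain 404, so it goes back on the to-do list.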
Tool? Who are you calling a tool?
Making these mistakes can make you feel like a tool, but you can hammer things out (yes, bad pun!) and go a step further and let Google know you want the page removed – and there’s a tool for that.
It’s called the Remove URLs tool, and you’ll find it in Google Search Console.
These are the actions you can choose on the form:
- Temporarily hide page from search results and remove from cache: Hides the page from Google search results for about 90 days, and also clears the cached copy of the page and snippet. The page can reappear in search results after the blackout period. Google will recrawl the page during the blackout period and refresh the page cache and snippet, but will not show them until the blackout period expires.
- Remove page from cache only: Clears the cached page and snippet, but does not remove the page from search results. Google will refresh the page cache and snippet.
- Temporarily hide directory: Hides an entire directory from search results for about 90 days and also clears cached pages and snippets for all pages in the specified directory. The directory can reappear in search results after the blackout period. Google will recrawl the pages during the blackout period and refresh the page caches and snippets.
If content was deleted from a site but still shows up in Google search results, the page description or cache might be outdated. Use this tool to submit the URLs for removal. Once you submit a URL, Google takes it out of the cache and the index (in time, it does not happen right away), and you can still have it included again at a later time.
So, if it was a page that can be fixed but you don’t have time or need time, you can take it out of the index, fix it, then resubmit the page in better condition and optimized for better search placement in the future.
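Since the temporary options only last about 90 days, it helps to note a rough deadline next to each request so the fixed page is ready before it can reappear. A minimal sketch, treating the 90 days as an approximation rather than a guarantee:

```python
from datetime import date, timedelta

# "About 90 days" per the option descriptions above; an estimate, not an exact promise.
APPROX_BLACKOUT_DAYS = 90


def approximate_blackout_end(request_date):
    """Return the rough date after which a temporarily hidden page can reappear."""
    return request_date + timedelta(days=APPROX_BLACKOUT_DAYS)


print(approximate_blackout_end(date(2024, 1, 1)))  # 2024-03-31
```

Aim to have the page fixed well before that date instead of on it.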
There it is. You found the content in question, put together a game plan, removed or redirected the content on your site, then asked Google to remove it using the Remove URLs tool. You’re all done, right?
Yes, and no.
Where do you go from here? Keep checking the index to make sure your pages come down and no new pages creep in. Run a tight ship: keep development files off your main site, allow no demo content, no “tests” or “test pages”, and be aware of every post, page, and piece of content that goes up on your site. A web page is your identity, your storefront to your customers online. Imagine walking into a home remodeling store and seeing signs for produce and lunchmeat! Confusing, right?
Protect your site and use this information to make sure no problems exist. If they do, use these steps to fix them, and use them again to prevent any in the future. You may see dips in rankings and traffic as you fix the issues, but they should be temporary. And let’s be honest: it is better for that to happen while fixing the issues than to get a penalty or lose traffic because they sat there unaddressed for a long time.
Be patient, it gets better! You’ll be okay and you just learned something, so you’re better off now than when you started!