Google explains what “crawl budget” means for webmasters

Gary Illyes from Google has written a blog post titled "What Crawl Budget Means for Googlebot." In it, he explains what crawl budget is, how the crawl rate limit works, what crawl demand is and which factors affect a site's crawl budget.
First, Gary explained that most sites do not need to worry about crawl budget. For really large sites, though, it becomes something worth paying attention to. "Prioritizing what to crawl, when, and how much resource the server hosting the site can allocate to crawling is more important for bigger sites, or those that auto-generate pages based on URL parameters," Gary said.
Here is a short summary of what was published, but I recommend reading the full post.

Crawl rate limit is designed to keep Google from crawling your pages so much or so fast that it hurts your server.
Crawl demand is how much Google wants to crawl your pages. This is based on factors such as how popular the URLs are and how stale Google's copy of them has become.

Search Engine Land Source

Fetch and Horror: 3 examples of how fetch and render in GSC can reveal big SEO problems

In May 2014, some powerful functionality debuted in the “fetch as Google” feature in Google Search Console — the ability to fetch and render.
When you ask Google to fetch and render, its crawler will fetch all necessary resources so it can accurately render a page, including images, CSS and JavaScript. Google then provides a preview snapshot of what Googlebot sees versus what a typical user sees. That’s important to know, since sites could be inadvertently blocking resources, which could impact how much content gets rendered.
Adding fetch and render was a big deal, since it helps reveal issues with how content is being indexed. With this functionality, webmasters can make sure Googlebot is able to fetch all necessary resources for an accurate render. With many webmasters disallowing important directories and files via robots.txt, it’s possible that Googlebot could be seeing a limited view of the page — yet the webmaster wouldn’t even know without fetch and render.
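For instance, if a site disallows its asset directories, one possible fix is to explicitly allow the CSS and JavaScript files Googlebot needs for rendering. A minimal illustrative robots.txt pattern (the directory names are placeholders, not a recommendation):

    User-agent: *
    Disallow: /assets/
    # For Googlebot, the longest matching rule wins, so these more specific
    # Allow rules let rendering resources through despite the Disallow above.
    Allow: /assets/*.css$
    Allow: /assets/*.js$

After a change like this, running fetch and render again should show whether the preview now matches what users see.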
As former Googler

Search Engine Land Source

Fun with robots.txt

One of the most boring topics in technical SEO is robots.txt. Rarely is there an interesting problem needing to be solved in the file, and most errors come from not understanding the directives or from typos. The general purpose of a robots.txt file is simply to suggest to crawlers where they can and cannot go.
Basic parts of the robots.txt file

User-agent — specifies which robot.
Disallow — suggests that robots not crawl this area.
Allow — allows robots to crawl this area.
Crawl-delay — tells robots to wait a certain number of seconds before continuing the crawl.
Sitemap — specifies the sitemap location.
Noindex — an unofficial directive that tells Google to remove pages from the index; Google does not officially support it in robots.txt.
# — comments out a line so it will not be read.
* — matches any text.
$ — the URL must end here.
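Putting those directives together, a minimal illustrative robots.txt might look like this (the domain, paths and values are placeholders, not recommendations):

    # Rules for all robots
    User-agent: *
    Disallow: /admin/            # suggest robots stay out of /admin/
    Allow: /admin/public/        # but allow this subfolder
    Crawl-delay: 10              # wait 10 seconds between requests (Googlebot ignores this directive)
    Disallow: /*.pdf$            # block any URL ending in .pdf, using * and $

    Sitemap: https://example.com/sitemap.xml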

Other things you should know about robots.txt

Robots.txt must be in the site's root folder, i.e., example.com/robots.txt.
Each subdomain needs its own robots.txt; sub.example.com/robots.txt is not the same as example.com/robots.txt.
Crawlers can ignore robots.txt.
URLs and the robots.txt file are case-sensitive.
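If you want to sanity-check how your directives are likely to be interpreted, Python's built-in urllib.robotparser gives you a quick test harness. A minimal sketch (the domain and paths are placeholders, and Google's own parser handles wildcards and other edge cases that this module does not):

    from urllib.robotparser import RobotFileParser

    # Fetch and parse the live robots.txt file.
    rp = RobotFileParser("https://example.com/robots.txt")
    rp.read()

    # Ask whether a given user-agent may fetch a given URL.
    print(rp.can_fetch("Googlebot", "https://example.com/admin/page.html"))
    print(rp.can_fetch("*", "https://example.com/blog/post.html"))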

Search Engine Land Source

It’s scary how many ways SEO can go wrong

We’ve all had those moments of absolute terror where we just want to curl up into the fetal position, cry and pretend the problem doesn’t exist. Unfortunately, as SEOs, we can’t stay that way for long. Instead, we have to suck it up and quickly resolve whatever went terribly wrong.
There are moments you know you messed up, and there are times a problem can linger for far too long without your knowledge. Either way, the situation is scary — and you have to work hard and fast to fix whatever happened.
Things Google tells you not to do
There are many things Google warns against in its Webmaster Guidelines:

Automatically generated content
Participating in link schemes
Creating pages with little or no original content
Sneaky redirects
Hidden text or links
Doorway pages
Scraped content
Participating in affiliate programs without adding sufficient value
Loading pages with irrelevant keywords
Creating pages with malicious behavior, such as phishing or installing viruses, trojans or other badware
Abusing rich snippets markup
Sending automated queries to Google

Unfortunately, people can

Search Engine Land Source

Why crawl budget and URL scheduling might impact rankings in website migrations

Earlier this year, Google’s Gary Illyes stated that 30x redirects (301, 302, etc.) do not result in a loss or dilution of PageRank. As you can imagine, many SEOs have greeted this claim with skepticism.
In a recent Webmaster Central Office Hours Hangout, I asked Google’s John Mueller whether that skepticism might arise because SEOs who see a loss of visibility during a migration don’t realize that not all of the signals affecting rankings have passed to the new pages yet, and so they assume that PageRank was lost.
Mueller’s reply:
Yeah, I mean, any time you do a bigger change on your website — if you redirect a lot of URLs, if you go from one domain to another, if you change your site structure — then all of that does take time for things to settle down. So, we can follow that pretty quickly, we can definitely forward the signals there, but that doesn’t mean it will happen from one day to the next.
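While those signals settle, it is still worth confirming that the redirects themselves behave as intended. Here is a minimal sketch that uses Python's requests library to report the status code and hop chain for a handful of migrated URLs (the URLs are placeholders for your own old URLs):

    import requests

    old_urls = [
        "http://example.com/old-page",
        "http://example.com/products/old-category/",
    ]

    for url in old_urls:
        resp = requests.get(url, allow_redirects=True, timeout=10)
        # resp.history holds each hop in the redirect chain, in order.
        hops = " -> ".join(f"{r.status_code} {r.url}" for r in resp.history)
        print(f"{url}: {hops} -> {resp.status_code} {resp.url}")

A single 301 hop straight to the final URL is the cleanest outcome; long chains are worth shortening.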
During a migration, Googlebot needs to collect huge

Search Engine Land Source

Exploring a newly-granted Google patent around social signals

Disclaimer: When discussing patents, it’s important to remember that simply filing a patent does not mean a technology is in use or will ever be used. It is simply a strong indication that an idea is being considered and likely tested.
Every now and then, a patent comes across my radar that gets me excited, and one granted recently to Google fits that bill perfectly.
We’ve heard repeatedly from Google that social interactions are not a search ranking signal. In fact, you can read a tweet from Google’s Gary Illyes in response to the statement, “Some controversy over whether Google takes social into account for SEO….” His reply:

@RicardoBlanco take a look at this video. The short version is, no, we don’t @louisgray @PRNews @JohnMu
— Gary Illyes (@methode) June 7, 2016

So the answer is “No,” right?
Maybe, and here’s where it gets interesting. Understanding that the folks at Google tend to give answers that are technically correct but not always in the spirit of

Search Engine Land Source

Is your HTTPS setup causing SEO issues?

Google has been making the push for sites to move to HTTPS, and many folks have already started to include this in their SEO strategy. Recently at SMX Advanced, Gary Illyes from Google said that 34 percent of Google search results are HTTPS. That’s more than I personally expected, but it’s a good sign that more sites are becoming secure.
However, I’m noticing more and more sites that have migrated to HTTPS but have not done it correctly and may be losing out on the HTTPS ranking boost. Some have even created new problems on their sites by migrating incorrectly.
HTTPS post-migration issues
One of the most common issues I’ve noticed after a site migrates to HTTPS is that the HTTPS version is not set as the preferred one, leaving the HTTP version still floating around. Back in December 2015, Google said that in scenarios like this, it would index the HTTPS version by default.
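One common remedy is to 301-redirect every HTTP request to its HTTPS counterpart at the server level. A minimal sketch for an Apache .htaccess file (assuming Apache with mod_rewrite enabled; other servers have their own equivalents):

    RewriteEngine On
    RewriteCond %{HTTPS} off
    RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

Pairing the redirect with rel=canonical tags that point to the HTTPS URLs helps Google consolidate signals on the secure version.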
However, the

Search Engine Land Source

Help! I just launched a new website and my search rankings tanked!

Imagine this nightmare scenario: you’re on the verge of launching your newly redesigned website, and you’re already anticipating new leads and returning customers. You’ve spent countless hours working through every last detail before even considering unveiling your new creation to the world. The big day arrives, and you give the green light to launch.
Suddenly, you realize you forgot to plan for one crucial element: the SEO best practices that you had so carefully incorporated into your old website.
Unfortunately, this is not just a nightmare that can be forgotten once you’ve had your morning coffee, but something I’ve seen happen to countless small businesses over my 10 years as the owner of an SEO and online marketing agency.
Your redesigned website was meant to give your business a new lease on life, but instead, you’ve destroyed your organic search rankings and traffic overnight. When you change your site without thoroughly thinking through the SEO implications, you might do something harmful like throw away

Search Engine Land Source

The role of technical SEO is “makeup?” Really?

I was appalled by the main message of Clayburn Griffin’s column the other day, “The role technical SEO should play: It’s makeup,” and I made my feelings about the post pretty clear on Twitter.
Many others shared my sentiments. So when Search Engine Land’s Executive Features Editor, Pamela Parker, reached out to see if I wanted to write a reaction piece to express some of those views, I was happy to oblige.
The purpose of this post is to provide a counterpoint, and I will attempt to (re)position technical SEO as I see it. This necessarily means I will need to point out what I perceive to be some misconceptions in the original post…
Misconception #1: Technical SEO is superficial
The main issue I took with Griffin’s piece is that the analogy he used — “technical SEO as makeup” — is totally off the mark.
Technical SEO is a fundamental requirement for websites of any size to rank in organic search, so to describe it as superficial is simply inadequate. You can

Search Engine Land Source

How to quickly find and export all subdomains indexed by Google

An SEO audit is rarely limited to the www (or non-www) version of a website. When looking for potential duplicate content, it’s often important to know how many subdomains exist and, more importantly, how many of them are indexed by Google.
The good old search operators
An easy way to find indexed subdomains is to use search operators.

Start with “site:” and the root domain.

One by one, remove each subdomain (including “www”) from the results with the “-inurl:” operator.

When there are no more results for Google to return, your query will contain every indexed subdomain, each excluded with “-inurl:”.
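For a hypothetical example.com with a few known subdomains, the final query might end up looking like this (the subdomain names are placeholders):

    site:example.com -inurl:www -inurl:blog -inurl:shop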

However, this technique has its limits. The site you’re auditing is unlikely to have an enormous number of subdomains, but you may come across a site with several dozen. This can potentially cause the following issues:

The process can be long, especially if it needs to be done for several domains.
You might get Google “captchas” along the way.
The size of queries is limited (around 30 keywords).

Search Engine Land Source