Last month’s Google Updates

Google’s search quality updates for March have now been available for over a week. There have been updates, or tweaks, in areas such as anchor text, image search, navigational search and indexing of profile pages.

A few items stand out, and we will look at those more closely.

Anchor text tweaks

Two items on the list concern anchor text: tweaks to handling of anchor text, and better interpretation of and use of anchor text. The first one says that a specific classifier has been turned off, and the second mentions a new way of determining anchor text relevance.

Word for word from Google’s announcement:

Tweaks to handling of anchor text: This month we turned off a classifier related to anchor text (the visible text appearing in links). Our experimental data suggested that other methods of anchor processing had greater success, so turning off this component made our scoring cleaner and more robust.

Better interpretation of and use of anchor text: We’ve improved systems we use to interpret and use anchor text, and determine how relevant a given anchor might be for a given query and website.

Both explanations are very unclear, so we can only guess what exactly Google means by these updates.

Image Search Changes

You can spot a couple of items from the list that are related to image search, more specifically to the quality of the pages on which images appear: more relevant image search results and improvement to image search relevance. The first one talks about how lower quality pages with relevant images are rewarded; the second one talks about how images on better quality pages are rewarded.

More relevant image search results: This change tunes signals we use related to landing page quality for images. This makes it more likely that you’ll find highly relevant images, even if those images are on pages that are lower quality.

Improvement to image search relevance: We’ve updated signals to better promote reasonably sized images on high-quality landing pages.

Again, this sounds complicated, and it is quite unclear how these changes should actually be interpreted.

Other items

A few other items caught my attention as well. These are:

Better indexing of profile pages: This change improves the comprehensiveness of public profile pages in our index from more than two-hundred social sites.

Improvements to handling of symbols for indexing: We generally ignore punctuation symbols in queries. Based on analysis of our query stream, we’ve now started to index the following heavily used symbols: “%”, “$”, “\”, “.”, “@”, “#”, and “+”. We’ll continue to index more symbols as usage warrants.

Fewer undesired synonyms: When you search on Google, we often identify other search terms that might have the same meaning as what you entered in the box (synonyms) and surface results for those terms as well when it might be helpful. This month we tweaked a classifier to prevent unhelpful synonyms from being introduced as content in the results set.

Improvements to freshness: We launched an improvement to freshness late last year that was very helpful, but it cost significant machine resources. At the time we decided to roll out the change only for news-related traffic. This month we rolled it out for all queries.

As you can see, Google doesn’t provide us with very clear explanations of these updates. We are left to guess how to react to these changes and how to act on them.
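To make the symbol item a little more concrete, here is a minimal Python sketch of what keeping those symbols during tokenization could look like. This is only my illustration, not Google’s implementation: the function name `tokenize`, the `keep_symbols` switch and the regular expressions are all assumptions. Only the list of seven symbols comes from the announcement.

```python
import re

# The seven symbols Google's announcement says are now indexed.
INDEXED_SYMBOLS = "%$\\.@#+"

# Assumption: a token is a run of word characters, optionally mixed with
# the indexed symbols. Real search tokenizers are far more involved.
TOKEN_WITH_SYMBOLS = re.compile(r"[\w" + re.escape(INDEXED_SYMBOLS) + r"]+")
TOKEN_WORDS_ONLY = re.compile(r"\w+")


def tokenize(query, keep_symbols=True):
    """Split a query into lowercase tokens, keeping or dropping symbols."""
    pattern = TOKEN_WITH_SYMBOLS if keep_symbols else TOKEN_WORDS_ONLY
    return [token.lower() for token in pattern.findall(query)]


print(tokenize("C++ jobs", keep_symbols=False))  # symbols ignored
print(tokenize("C++ jobs"))                      # symbols kept
```

With symbols ignored, a query like “C++ jobs” collapses to the tokens “c” and “jobs”; with symbols kept, “c++” survives as its own token. One obvious limitation of this sketch is that a sentence-final period would stay glued to the last word.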

If you are interested in checking out the whole list, you can find it here.

Watch out for overly SEO’ed sites!

The head of Google’s web spam team, Matt Cutts, announced that Google is releasing an algorithm update specifically to target sites that are overdoing their SEO.

According to Matt Cutts, this is Google’s attempt to “level the playing field” between webmasters who build quality content and webmasters who are simply doing aggressive SEO.

Barry Schwartz quoted Matt Cutts in his article on Search Engine Roundtable:

“What about the people optimizing really hard and doing a lot of SEO. We don’t normally pre-announce changes but there is something we are working in the last few months and hope to release it in the next months or few weeks. We are trying to level the playing field a bit. All those people doing, for lack of a better word, over optimization or overly SEO – versus those making great content and great site. We are trying to make GoogleBot smarter, make our relevance better, and we are also looking for those who abuse it, like too many keywords on a page, or exchange way too many links or go well beyond what you normally expect. We have several engineers on my team working on this right now.”

There have been complaints of ranking changes this week, even though Google has denied any updates of any sort. Maybe it is just Google testing the upcoming update…

You can listen to the audio recording of the panel Matt Cutts was on at SXSW about a week ago, where the new update was discussed.

Do you have too many ads on your site?

Google has once again introduced a new algorithm change, the “page layout algorithm”. The change is aimed at penalizing sites that are too heavily loaded with ads.

Google posted the same information about the new change on its Inside Search blog and Google Webmaster Central blog:

“We’ve heard complaints from users that if they click on a result and it’s difficult to find the actual content, they aren’t happy with the experience. Rather than scrolling down the page past a slew of ads, users want to see content right away.

So sites that don’t have much content “above-the-fold” can be affected by this change. If you click on a website and the part of the website you see first either doesn’t have a lot of visible content above-the-fold or dedicates a large fraction of the site’s initial screen real estate to ads, that’s not a very good user experience.

Such sites may not rank as highly going forward.”

This change doesn’t affect sites that use pop-ups, pop-unders or overlay ads; it only applies to static ads in fixed positions on the pages themselves.

How do you know what is too much?

According to Matt Cutts, the head of Google’s web spam team, Google won’t be offering any tool that tells you whether you have too many ads. Instead, Google encourages people to use, for example, its Google Browser Size tool to understand how much of a page’s content, compared to ads, is visible to visitors at first glance under various screen resolutions.

Google’s blog post does note, though, that the change should only hit pages that have an abnormally large number of ads above the fold (compared to the web as a whole). And according to Cutts again, the change will impact less than 1% of Google’s searches globally.

So if you have little or no content showing above the fold at commonly used screen resolutions, I would advise you to fix your site, just in case. And do it fast: the change has already started to take effect.
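Since Google offers no tool for this, a rough self-check is easy to put together. The Python sketch below estimates what share of the first screen is covered by ads at a given resolution. Everything in it is my own assumption: the `Box` type, the function names and the example ad positions are made up for illustration, and Google has not published any threshold that would tell you which share is “too much”. You would have to measure the real ad positions on your own pages, for example with your browser’s developer tools.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Position and size of an ad block in pixels, measured from the page's top-left."""
    x: int
    y: int
    width: int
    height: int


def visible_area(box, viewport_width, viewport_height):
    """Area of the box that falls inside the first screen (no scrolling)."""
    visible_w = max(0, min(box.x + box.width, viewport_width) - max(box.x, 0))
    visible_h = max(0, min(box.y + box.height, viewport_height) - max(box.y, 0))
    return visible_w * visible_h


def ad_share_above_fold(ad_boxes, viewport_width, viewport_height):
    """Fraction of the first screen covered by ads, assuming the ads do not overlap."""
    ad_area = sum(visible_area(b, viewport_width, viewport_height) for b in ad_boxes)
    return ad_area / (viewport_width * viewport_height)


# Hypothetical layout: a banner across the top and a box in the right column.
ads = [Box(0, 0, 1024, 90), Box(724, 100, 300, 250)]

for width, height in [(1024, 768), (1024, 600)]:
    share = ad_share_above_fold(ads, width, height)
    print(f"{width}x{height}: {share:.0%} of the first screen is ads")
```

For this made-up layout the ads take up roughly a fifth of a 1024x768 screen and more than a quarter of a 1024x600 one, which shows why it pays to check several resolutions rather than just your own.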

“Fresh” Google Algorithm Change

Once again, Google is rolling out a new search algorithm change that will make the search results “fresher”. Fresher results are often also more relevant results.

This new change will not only help make the search results “fresher”, it will also affect about 35% of all searches. That is a larger impact than the Panda update had: Panda affected only 12% of the searches conducted.

The new change will impact searches related to

  • Recent events or hot topics. When you search for current events like “economic situation of Greece”, or for the latest news about “smartphone reviews”, you will see more good-quality pages that are only a few minutes old.
  • Regularly recurring events. Some events recur on a regular basis, like “presidential election” or “Eurovision song contest”. When you search with these phrases without specifying further details, you will see the most recent event in the top results. Even if you are searching for an event that recurs more frequently, you will see the latest information.
  • Frequent updates. Some information changes often without really being a hot topic. If you search, for example, for information about “google algorithm change”, you will get information about the newest change.

It’s not a new thing for Google to go after the freshest content. As early as 2007, “Query Deserves Freshness” was a ranking factor. Last year Google rolled out the Caffeine update, which made it possible for Google to gather content even faster, which in turn gives fresh content the potential to rank better.

So, what is new and different now then?

Freshness is being rewarded more. So much more that roughly every third search is impacted. That is huge. The old “freshness” algorithm had an impact on about 17.5% of search queries; now the impact is double that, 35%.
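The arithmetic behind these figures is simple enough to check in a few lines of Python. The variable names are mine; the percentages are the ones quoted in this article.

```python
# Shares of searches affected, as quoted in this article.
old_freshness = 0.175  # earlier "freshness" algorithm
new_freshness = 0.35   # the new freshness change
panda = 0.12           # Panda update

growth = new_freshness / old_freshness  # how many times larger the new impact is
one_in_n = 1 / new_freshness            # "one in every N searches"
vs_panda = new_freshness / panda        # reach compared with Panda

print(f"{growth:.1f}x the old freshness impact")
print(f"about 1 in every {one_in_n:.1f} searches")
print(f"{vs_panda:.1f}x the reach of Panda")
```

So “every third search” is a slight understatement: 35% is a little more than one search in three, and close to three times the reach of Panda.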

Is all fresh content of good quality? Is it enough to make a small change to a page and that will make it fresh?

There is a risk of decreasing relevancy, or of letting spammy and “light” content into the rankings. Most likely Google will use its other search ranking factors in combination with the “freshness” algorithm to help determine whether something is both fresh and good.

“Freshness is one component, but we also look at the content of the result, including topicality and quality.”

Google also says that one of the freshness factors is the time when they first crawled the page.

Finally, it is important to keep in mind that a 35% change in freshness doesn’t mean a 35% improvement. There is no commonly accepted way of rating the quality of search engine results in a numeric fashion, so there is no way to say whether something has improved by a particular percentage.


Leaked Google Documents – Any Surprises?

On 20 October 2011, Google’s leaked Search Quality Rating Guidelines started circulating on the web after being picked up on the Pot Pie Girl blog. It is a confidential 125-page document dated March 30, 2011, used to train the Search Quality Raters working for Google.

The last 5 pages of the document provide a summary; you can start by reading that and then dig into the details in areas that are of special relevance to you.

Most of the information in the document merely confirms common SEO knowledge, but it can still be worth reading to further understand how Google reasons about quality and spam. Here I will sum up some of the key takeaways from the document.

The most important aspect of search engine quality is how helpful the page is for the user’s intent. Intent is divided into 3 categories:

  • Action Intent – “do” queries where users want to accomplish a goal or engage in an activity
  • Information Intent – “know” queries where users want to know something
  • Navigation Intent – “go” queries where users want to navigate to a specific page

Google uses a rating scale divided into 6 categories:

  • Vital – A page that is the official homepage of a person, place, business etc. Social Networking sites for companies are NOT considered vital
  • Useful – A page that is very helpful for most users. Such pages usually have some or all of the following characteristics: highly satisfying, authoritative, entertaining, and/or recent
  • Relevant – A page that is helpful for many or some users, being on-topic
  • Slightly Relevant – A page that is not very helpful for most users, but is somewhat related to the query. Some or few users would find this page helpful
  • Off-Topic or Useless – A page that is helpful for very few or no users
  • Unratable – Pages that don’t load or that are in a foreign language

Google also assigns flags in 5 different categories:

  • Not Spam – if the page has not been designed using deceptive web design techniques
  • Maybe Spam – if the page is spammy but you can’t with confidence say that it has been designed using deceptive web design techniques
  • Spam – if the page has been designed using deceptive web design techniques. It will be considered spam if “pages only exist to make money and not to help users”, i.e. there needs to be an added value and information/help for the user
  • Porn – all pages with pornographic content. When a query could have both a porn and a non-porn interpretation, the non-porn interpretation is dominant
  • Malicious

Google takes a few different steps to discover sites considered spam:

  • Identifying hidden texts
  • Identifying keyword stuffing – a page will not be flagged as spam if keyword stuffing appears only in the meta tags
  • Identifying sneaky redirects
  • Identifying cloaking with JavaScript redirects and 100% frame
  • Identifying sites obviously created for advertising
  • Identifying sites that have been automatically mass produced
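The raters do these checks by eye, and the document does not describe any automated test. If you want a quick sanity check of your own pages for the keyword stuffing item, a crude keyword density count is a common SEO heuristic. The Python sketch below is my own illustration, not anything taken from the guidelines; the function name `keyword_density` and the simple word pattern are assumptions, and there is no official density that counts as “stuffing”.

```python
import re
from collections import Counter


def keyword_density(visible_text, top_n=3):
    """Return the top_n most frequent words with their share of all words.

    Only pass in the visible page text: according to the guidelines,
    stuffing that appears only in the meta tags is not flagged.
    """
    words = re.findall(r"[a-z0-9']+", visible_text.lower())
    if not words:
        return []
    counts = Counter(words)
    total = len(words)
    return [(word, count / total) for word, count in counts.most_common(top_n)]


sample = "Cheap shoes! Buy cheap shoes online. Cheap shoes, best cheap shoes."
for word, share in keyword_density(sample):
    print(f"{word}: {share:.0%}")
```

If one or two phrases make up a large share of everything written on the page, a human reader, and therefore a rater, will probably notice it too.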

In order to determine whether pages with ads are spam or not, Google looks at content that is considered helpful for users, such as:

  • Price Comparison functionality
  • Product Reviews
  • Recipes
  • Lyrics, quotes, proverbs, poems etc.
  • Contact Information
  • Coupon, discount and promotion codes
  • Ability to register/login
  • Ads clearly marked and not distracting

In order to determine whether a page is a “thin affiliate” or not, Google looks at:

  • Click buttons on the page – raters try clicking on “more information” or “make a purchase” to see if they are taken to a different domain
  • Properties of images on the page – right-clicking on an image and checking where the image originates
  • Original or duplicate content – content copied from other webpages counts as a thin affiliate factor
  • Domain registrants – if clicking on a button leads to another site, the “whois” record is checked to determine whether the registrant is the same (if it is, the page is not considered a thin affiliate)
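The first of these checks, whether a button takes the visitor to a different domain, is easy to run on your own pages. Below is a minimal Python sketch using the standard library’s `urllib.parse`. The function names `site_host` and `leaves_site` and the example URLs are my own, and the comparison is deliberately naive: it treats every subdomain as a separate site and does not look at the registrant at all, which the raters check through whois.

```python
from urllib.parse import urljoin, urlparse


def site_host(url):
    """Lowercased hostname of a URL, without a leading "www."."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def leaves_site(page_url, link_href):
    """True if following link_href from page_url ends up on another host."""
    target = urljoin(page_url, link_href)  # resolves relative links like "/buy"
    return site_host(target) != site_host(page_url)


page = "http://www.example.com/reviews/blender"
print(leaves_site(page, "/buy/blender"))                     # same site
print(leaves_site(page, "http://shop.example.net/item?id=1"))  # different domain
```

A page where every “more information” or “make a purchase” link leaves the site is not automatically a thin affiliate, but it is exactly the pattern the raters are told to look for.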

All checking is done in the Firefox browser, and every task is rated by a group of raters, each working independently. If the raters disagree by a wide margin, the task is reevaluated.
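The document does not say how wide a “wide margin” is. Purely as an illustration of the idea, the Python sketch below puts the rating scale on a numeric ladder and flags a task when the highest and lowest rating are too far apart. The numeric scores, the `max_spread` threshold and the function name `needs_reevaluation` are all my own assumptions.

```python
# Rating labels from the guidelines, ordered from worst to best.
# Assumption: each step on the scale counts as one point.
SCALE = ["Off-Topic or Useless", "Slightly Relevant", "Relevant", "Useful", "Vital"]
SCORE = {label: points for points, label in enumerate(SCALE)}


def needs_reevaluation(ratings, max_spread=2):
    """True if independent raters disagree by more than max_spread steps.

    "Unratable" and any unknown labels are left out of the comparison.
    """
    scores = [SCORE[r] for r in ratings if r in SCORE]
    if len(scores) < 2:
        return False
    return max(scores) - min(scores) > max_spread


print(needs_reevaluation(["Vital", "Useful", "Useful"]))
print(needs_reevaluation(["Vital", "Relevant", "Off-Topic or Useless"]))
```

In the first example the raters are within one step of each other, so nothing happens; in the second the ratings span the whole scale, so the task would go back for another look.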

Any surprises? No, but there are many good examples that will help you further understand how Google reasons in its pursuit of high-quality search results.