In 2006 both the Oxford English and Merriam-Webster dictionaries added the verb “google” to their pages. Though Google itself resists the public’s broad use of the term, lest its trademark become generic, it’s difficult to believe the company didn’t feel just a touch of pride. After all, it achieved this landmark less than a decade after Google.com was first registered as a domain.
According to Netcraft, there are about 189 million active websites on the internet today. Google estimates that this number breaks down into 60 trillion individual pages, a number that is constantly growing. Larry Page, co-founder and CEO of Google, describes the perfect search engine as one that “understands exactly what you mean and gives you exactly what you want” amid the internet’s many articles, images and videos. Google’s stated goal, therefore, “is to make it as easy as possible for you to find the information you need” in this great sea of data “and get the things you need to do done.”
Contrary to popular belief, Google does not search the naked web. Rather, Google searches its own index of the web. Drawing from its internal database is what allows the engine to pull such accurate and extensive answers to search queries.
In this section we will discuss what happens when a user queries Google and how Google delivers the results that it does.
1) A Request is Typed into the Query Box
Whether you’re searching “Panda cams” or “How do I fix a broken heart,” the process is the same. Over the years Google has become much more sophisticated in what queries it recognizes. Today, Google Instant will begin to auto-complete your query as soon as it gets enough letters, providing suggestions based on what you’ve searched before and the popularity of searches by other users across the web.
2) The Search Begins
This is where Google’s algorithm kicks in. We’ll discuss PageRank in more detail in the next section; for now we’ll say that the algorithm is a set of instructions for Google’s computers, informing them how to do their job – that job being to find what you’re looking for. The keywords in your query are picked out and used to identify relevant pages in Google’s vast index. Google’s 2013 update, Hummingbird, has improved this initial process to also take the query as a whole into account.
3) Combing the Clusters
It is believed that Google owns or leases about 200 data centers all over the planet. The software for Google’s domain-name servers runs on computers in these centers, each forming an incredible cluster of information consuming an equally incredible volume of power. Which of these clusters is combed in relation to your query is decided efficiently, based on the data center nearest to you as well as which cluster is the least busy at that moment. Google’s web server splits the components of your query across hundreds of machines in the center (potentially thousands) to allow them all to search simultaneously.
4) Matching the Ads
Every relevant entry of Google’s index is compiled while each component of your query is run through an advertisement database. These matches are fed to the web server when the Search Engine Results Page (SERP) is generated. However, if the ad results take longer to pull than the search results, they will not appear on the SERP. Though Google makes no money from searches that do not display ads, the speed of its results takes precedence over advertisements.
5) The SERP
In answer to your query, the web server compiles the data from its thousands of simultaneous operations into an organized results page.
This whole process takes less than one second.
How Does PageRank Work?
This is the meat and potatoes of Google search, and a question search engine optimizers have been trying to answer for years. While it is generally understood how Google’s patented PageRank algorithm generates an SERP, the specifics remain as secret as Coke’s magic formula or the Colonel’s Kentucky Fried recipe. To hear Google tell it, this secrecy is meant to prevent both white and black hats from gaming the system. But though the finer points of PageRank remain mysterious, there is enough information available to construct a reliable guide to what attracts the algorithm’s fancy.
As previously stated, Google does not pull its results from the internet proper but instead from its own vast index. It compiles this index by crawling the web with software programs called “crawlers” or, more whimsically, “spiders.” These programs build a map of the internet based on the web of links that exists between internet pages. The spiders follow links from page to page, multiplying as the links and pages multiply. Years ago, when this process became general knowledge, black hat SEOs and spammers would create giant link farms that looped a few websites back and forth to each other. Google’s algorithm is now much more sophisticated and will actively demote link farms and link wheels to the bottom of its SERPs, or ban them outright (we’ll get more into bad SEO practices and spam in the next section).
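The link-following crawl described above can be sketched as a simple breadth-first traversal of the link graph. This is a toy illustration only (the four-page TOY_WEB graph is invented); a real crawler fetches pages over HTTP and parses links out of the HTML:

```python
from collections import deque

# A toy in-memory "web": each page maps to the pages it links to.
TOY_WEB = {
    "a.com": ["b.com", "c.com"],
    "b.com": ["c.com"],
    "c.com": ["a.com", "d.com"],
    "d.com": [],
}

def crawl(seed):
    """Breadth-first traversal of the link graph from a seed page."""
    index = []            # the order in which pages are discovered
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        index.append(page)
        for link in TOY_WEB.get(page, []):
            if link not in seen:  # never re-crawl a page already queued
                seen.add(link)
                queue.append(link)
    return index

print(crawl("a.com"))  # → ['a.com', 'b.com', 'c.com', 'd.com']
```

The `seen` set is what defeats the old link-farm loops: a spider that remembers where it has been cannot be trapped bouncing between the same few pages forever.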
As links are the essential component to how Google compiles its index, they have a lot of bearing on how it pulls its search results. It should be noted however that Google’s constant updates have given more and less weight to many other factors, a source of frustration and oftentimes panic in the SEO world. But as we’ll learn shortly, the major tenets of SEO have remained virtually unchanged.
So how does Google decide what are the best pages to display in response to a user query? According to Matt Cutts, currently head of Google’s Webspam team, Google asks some questions of its own. These questions are based on the keywords in a given query, and include:
- Do these keywords appear in the title of a webpage?
- Do they appear in the URL?
- Are the keywords close to each other?
- Does a page contain known synonyms for the keywords?
- Most importantly, is the page containing the relevant terms from a quality website, or is it from a spammy, or untrusted, site?
Links to and from a page are counted, as well as a page’s popularity (how often it is visited and clicked on). Other factors influencing the SERP are the SafeSearch filter, a user’s preferences and the “freshness” of pulled content. Content is considered fresh if it is new and original to the site being searched.
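A toy scoring function can illustrate how signals like the questions above might combine into a single relevance number. Everything here is invented for illustration: the weights, the sample page and the `trusted` flag are assumptions, since Google’s real signals and their weights are not public:

```python
def relevance_score(query_terms, page):
    """Toy relevance score loosely modeled on the questions above.
    Weights are invented; Google's actual formula is not public."""
    title = page["title"].lower()
    url = page["url"].lower()
    body = page["body"].lower()
    score = 0.0
    for term in query_terms:
        term = term.lower()
        if term in title:
            score += 3.0   # keyword appears in the page title
        if term in url:
            score += 2.0   # keyword appears in the URL
        if term in body:
            score += 1.0   # keyword appears in the page text
    if page.get("trusted", False):
        score *= 2.0       # a quality, trusted site outranks a spammy one
    return score

page = {
    "title": "Panda Cam Live",
    "url": "zoo.example/panda-cam",
    "body": "Watch our giant panda on the live cam.",
    "trusted": True,
}
print(relevance_score(["panda", "cam"], page))  # → 24.0
```

Note how the trust multiplier dominates: a keyword-stuffed page on an untrusted site can still lose to a modest page on a trusted one.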
As you’ve probably guessed, keywords are the magic ingredient here. Indeed, trustworthy links and matching keywords are the two most important factors in building a quality site that Google trusts. But as should come as no surprise, spammers took this idea and ran with it. “Keyword stuffing,” the act of jamming as many keywords into a webpage as many times as possible, became a common black hat SEO tactic. Keywords hidden in a site’s pages (rendered the same color as the background, or tucked into the code) generated plenty of false positives in the algorithm’s early days. But as it did with link farms, PageRank has since evolved to distinguish and discard these spam sites.
Today links and keywords still matter, but how they are used is of just as much if not greater importance. For instance, a link from a high ranking web page (such as a trusted news source or business entity) counts for greater weight than a low ranking page. Creating a bunch of shell sites just for the purposes of multiplying one’s links will earn a site nothing more than a spam warning. In the same way, it is not just keywords that matter now. How keywords are used on a page, where they appear and how they are dispersed also contributes to a site’s ranking.
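The idea that a link from a high-ranking page carries more weight is the core of PageRank, and it can be sketched as a simple iterative computation. This is the textbook simplification (dangling pages and many refinements are ignored), not Google’s production algorithm, and the three-site graph is invented:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a link graph.

    `links` maps each page to the pages it links to. Each round, a
    page shares its current rank among its outlinks, so a link from a
    highly ranked page passes along more weight than one from an
    obscure page.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline (the "random surfer" jump)...
        new_rank = {p: (1 - damping) / n for p in pages}
        # ...plus a share of the rank of every page linking to it.
        for page, outlinks in links.items():
            for target in outlinks:
                new_rank[target] += damping * rank[page] / len(outlinks)
        rank = new_rank
    return rank

# A popular hub and an obscure shell site each link to Bob once;
# the hub's link ends up worth far more.
graph = {
    "news-site": ["bobs-bouquets"],
    "shell-site": ["bobs-bouquets"],
    "bobs-bouquets": ["news-site"],
}
ranks = pagerank(graph)
```

Running this, `shell-site` receives nothing but the baseline, which is exactly why manufacturing shell sites earns so little: the links they pass carry almost no rank to share.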
Understandably then, there is a right and a wrong way to use links and keywords. Not always so understandable is what else puts a site at the top of the SERP. There are a few things:
1) How closely a site’s content matches a long-tail keyphrase.
If a keyword is something small like “panda” or “Jiminy Cricket,” it’s easier to search. Long-tail phrases like “don’t take your guns to town” are often broken up into their component parts (“guns,” “town,” etc.). But since those are the lyrics to a Johnny Cash song, the whole phrase will generate results. Google Instant and the 2013 Hummingbird update are meant to improve Google’s ability to match long-tail keyphrases.
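The phrase-versus-component-terms behavior can be sketched as a simple fallback: prefer an exact phrase match, otherwise score the individual terms. Everything here, including the scoring scheme, is an invented toy rather than how Google actually handles long-tail queries:

```python
def match(query, page_text):
    """Toy long-tail matching: report an exact-phrase hit when the
    whole query appears, else the fraction of component terms found."""
    query, page_text = query.lower(), page_text.lower()
    if query in page_text:
        return ("phrase", 1.0)          # whole phrase found: strongest signal
    terms = query.split()
    hits = sum(term in page_text.split() for term in terms)
    return ("terms", hits / len(terms)) # fall back to component terms

lyrics_page = "don't take your guns to town son, leave your guns at home"
print(match("don't take your guns to town", lyrics_page))  # → ('phrase', 1.0)
print(match("guns in town", "the town banned guns"))       # partial term match
```

A page quoting the full Johnny Cash lyric wins the phrase match outright, while unrelated pages only pick up partial credit for scattered component terms.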
2) How fresh and how frequently updated a site’s content is.
This has been true since the beginning: Original, quality content is great for boosting your PageRank. The more targeted content that exists on your site, the more varied the keywords and content Google can search through. If you write for even a niche audience, frequently adding to this content will ensure frequent visitors and readers. The more visitors to your site, the higher Google will rank your site. The higher Google ranks your site, the more readers will be able to find your site. Be aware however that the emphasis here is on original material. Sites that scrape content from others or stockpile a bunch of low quality bushwa will be penalized as spam.
3) How popular you are.
No, things haven’t changed much since your high school prom. Simply put, the more people visit your site, the higher your rankings will rise. For a plucky little website just starting in the business this may seem unfair, but remember that Google’s job is to select the most trusted and relevant pages in answer to users’ queries. Established brands have much more recognition and, since Google’s “Vince” update, are actually favored by the search engine. However, even a small business can achieve top rankings in Google by following appropriate SEO tactics, homing in on keyphrases, adding quality content and being recognized for its service.
What about the Ads at the Top of the SERPs?
The links that appear at the top of a Google results page are paid advertisements, which Google highlights in yellow to distinguish them from its organic results. The benefit to these ads is that a business will appear at the top of an SERP whenever a user searches a keyword that business is paying for.
The negative aspect to these ads is two-fold: First, users tend to put more stock in links that appear naturally. GroupM UK and Nielsen conducted a study in June 2011 that encompassed 28 million UK users and 1.4 billion searches. The results were overwhelmingly in favor of organic results, with 94% clicking on organic rather than paid links. The second negative aspect to paid ads is their impermanence. Once the payments stop, the link disappears. Organic search engine optimization that is done professionally and appropriately lasts as long as the SEO is maintained.
Since the beginning of the internet there have been spammers intent on defrauding users and generally spoiling things for legitimate businesses. Because you may be a legitimate business attempting SEO but unaware of its best practices – or wondering why Google issued you a manual spam penalty – below is a list of spam types as defined by Google.
• Pure Spam – A term for sites that repeatedly or blatantly violate Google’s webmaster guidelines. This is the worst of the worst: auto-generated gobbledygook, content scraping, link farming, cloaking, malware, etc.
• Cloaking – This is when a site presents different content to a search spider than to a user. Google will interpret it as a safe site, but when the user clicks on the link they are sent to a spam wasteland.
• Hidden Text, Keyword Stuffing – See How Does PageRank Work?.
• User-Generated Spam – Or, “why you need to put filters on your comments.” Any part of your site that allows user input (forum pages, comments sections, guestbooks and user profiles) is vulnerable to spammers. The best way to combat this is to use spam blocking software in your content management system, and hire a forum moderator.
• Thin Content – Low quality pages with little value to site visitors and readers; auto-generated or copied content.
• Unnatural Links – Any kind of link manipulation: link wheels, link farms, link schemes, etc. Don’t do it. It’s not worth it.
• Hacked Site – You may be all sweetness and light. That hacker that shanghaied your site to display spam content and links? Not so much. Once you’ve cleaned your site and eliminated the hack, inform Google and it will review your site for legitimacy.
In late 2010, the New York Times ran an article on the now infamous website DecorMyEyes.com and its sinister webmaster-in-chief Vitaly Borker. Turning every tenet of advertising and customer relations on its head, Borker was able to parlay a history of shoddy service and personal threats to his customers into the number one spot on Google’s SERP. When users queried “eyeglasses,” “new glasses” and similar keywords, Borker’s site was usually the first result they came to. It was only after they encountered his downright disturbing business practices that customers investigated the website and discovered the pages of vitriol heaped on the man and his company.
Borker, however, was unrepentant. He was fully aware that he was the most hated man in eye care and he did everything he could to enhance that perception. The sheer volume of dissatisfied reviews, coupled with the many links back to his site, had fooled Google’s algorithm into rewarding Borker’s bad behavior. The crawlers for “eyeglasses” calculated all this attention to mean that DecorMyEyes was extremely popular and the site rocketed to the top of its SERP. When the New York Times brought this to light, Google was not amused.
Wishing to cut this poor SEO practice off at the head, Amit Singhal, a Google Fellow at the time and now the Senior Vice President of Search, wrote on his blog that “being bad to your customers is bad for business.” For a company whose motto remains, “Don’t be evil,” this flaw in their search equation was a very public embarrassment.
While the New York Times did an excellent bit of journalism in exposing this flaw, Google received a great deal of shaming that was without merit. What users often forget is that, despite its apparent omniscience, the Google algorithm is a tool just like any other. A tool can be used for its intended purpose or it can be abused, but it cannot break from the confines of its own parameters. Over the years PageRank has been refined and shaped, each time to deal with new flaws or concerns that have arisen. This has been Google’s defining characteristic, its ability to evolve. The flaw that allowed DecorMyEyes to rise to the top of SERPs has since been eliminated.
Below is a list of prominent Google updates and summaries of how they have revised the Google algorithm.
2003 – Boston, Cassandra, Dominic, Esmeralda, Fritz
This series of updates, named in the fashion of hurricanes, generally updated the algorithm to deliver improved, fresher search results. This was during a very brief period when Google intended to update its algorithm on a monthly basis.
Example: Bob, of Bob’s Beautiful Bouquets, has heard that the best way to reach the top of Google’s SERP is to have as many flower-related keywords as possible and as many backlinks to his site as he can fit on his pages. But Bob already has an example of each of the flowers he sells in his gallery. After taking a tip (and some candy) from a mysterious stranger, Bob hides flower-related keywords throughout his website and includes several links to the stranger’s own site, the stranger in turn creating multiple backlinks to Bob’s Beautiful Bouquets. For several months this goes great for Bob, until April 2003. The Cassandra update rips through his website, counting each hidden keyword and link as a major strike against him. Trying desperately to rebuild, Bob is then hit again in May. Dominic’s Google bots, “Freshbot” and “Deepcrawler,” scour the web for questionable backlinks. The mysterious stranger’s help is suddenly a poison, and all of those redundant links melt Bob’s search advantage to nothing.
2003 – Florida
This update was infamous for downranking a number of small businesses and is responsible for a good deal of lingering anxiety over Google updates ever since. Florida was actually a revision that targeted unethical SEO and backlinking tactics as well as spam sites. Though a punishment for the black hats, many naive SEOs were penalized, a good reminder to stick to the most professional and appropriate organic marketing strategies.
Example: Bob learned his lesson. He removed all the invisible links and keywords on his site and was looking forward to doing SEO the right way. Following the example of other businesses in his area, he made sure all of his keywords were front and center where everyone could see them. He filled his URLs and page titles with keywords and made sure every outbound and inbound link on his site had plenty of keywords and phrases. He also made sure to only link to similar businesses and sites as his own. Then November rolled around. With its massive ad database, Google was now able to study keywords and determine which were likely to be overused by spammers. Bob’s overoptimization of keywords in his pages and links raised the algorithm’s suspicions and once more he was plunged to the bottom of the SERP. Though his affiliate links were legitimate, the Florida update no longer assigned them as much weight. Google’s Hilltop algorithm, which assigns higher importance to documents written by experts and authorities, also took Bob’s Beautiful Bouquets to task, prioritizing other sites over him. Though Bob did see his site rise a bit after the initial downranking, he was nowhere near where he’d once been.
2005 – Bourbon
Bourbon followed the anti-spam cleanups of Austin and Allegra (which cracked down on deceptive page tactics like invisible text, meta-tag stuffing and suspicious links) and the technical improvements of Brandy (an index expansion, higher relevance given to anchor text and an improvement to ranking synonyms in keyword analysis). Matt Cutts revealed that the update made “3.5 changes in search quality,” but little more than that. After the dust settled, it seems that Bourbon targeted duplicate content and how non-canonical (non-www) URLs were treated, as well as content scrapers and link wheels.
Example: Bob took his website back to basics and pared down his keyword links and phrases to more reasonable levels (“Bob’s Cheap Flowers Geranium Flowers and Local Flower Shop” was now “Fresh Geraniums from Bob’s Bouquets”). However, Bob was just starting to implement content marketing, having heard that blogs and articles could bring new visitors to his site. And though Bob was not a writer, he was a great appreciator of the written word. Happily, he found blogs and articles that he liked online and copied parts of them or pasted them whole onto his site. Sometimes he linked back to the original article, but not always. He didn’t always have time. When Bourbon pored over his website, it choked on his duplicate content. Not only did it discover that several of his remaining affiliate links were part of a suspicious “link neighborhood,” it also deemed Bob a content scraper. Penalized for spamming, Bob was flung even further down the Google SERP.
2005-2006 – Big Daddy
An update with twofold purpose. On the spam side, Big Daddy penalized link manipulators, link buyers and link sellers. On the technical side, it upgraded how the algorithm crawled and indexed sites.
Example: By this point, the frazzled Bob was only linking to sites that were 100% legitimate and contained similar content for similar customers to his own. He saw little change to his own website after Big Daddy. If he’d been writing original blogs or articles for his customers, and hadn’t been downgraded so often, the new index may have treated him more favorably.
2008 – Buffy
Named for the teenage vampire slayer? Possibly. A groundbreaking Google update? Inconclusive. This update encompasses a number of small changes that were made to the algorithm.
Example: Bob, a much savvier SEO than five years ago, was terrified of Buffy. But he, as well as most of the internet, remains baffled as to her significance.
2009 – Vince
Depending on who you talk to, this update either favors recognized brands or simply adjusts the weight of “trusted” sites. The result seems to give bigger businesses an edge in the SERPs, but Matt Cutts deemed this a “minor” update overall.
Example: It had been a long road, but Bob was finally clawing his way back to the top of Google’s SERP for local flower shops. True, “Conglomo’s Mondo Flowers” was a publicly traded company with franchises throughout his state, but he’d always prided himself and his employees on giving customers a personalized experience they couldn’t get at the big box stores. Bob may have been a terrible online marketer but he was still a decent human being. Thus, he hoped the good will his customers bore him would translate into decent search rankings. After Vince, Bob was dismayed to see that without any additional SEO, Conglomo’s was now high above him on the SERP.
2010 – Caffeine
A massive jolt to Google’s entire indexing system, Caffeine would supposedly deliver 50% “fresher” results than its previous index. This update can be thought of as the first part of a two-part infrastructure shift. While 2010’s Caffeine accelerated Google’s ability to crawl pages, 2013’s Hummingbird improves its ability to sort through those individual pages.
Example: Bob’s dismay led to an almost comical desire for revenge. He returned once more to the mysterious stranger seeking SEO answers. The stranger sold him a loathsome creature known as a copywriter, one Harold A. Hack. Harry proceeded to write pages and pages of content for Bob, daily churning out keyword-strewn blogs on anything even remotely related to flowers and shops. When Caffeine’s revamped algorithm swept the web, Bob’s frequently updated website actually leapt several ranks higher. However, though Bob’s content was indeed “fresher,” his reign at the top of the SERP would be short-lived…
2011 – Panda/Farmer
As its name suggests, this update targeted content farms, penalizing low quality content sites and sites with duplicated content. (Duplicated content can actually be an issue for legitimate sites, and not just spammers. If significant copy is repeated on multiple pages of a website – think mission statements or company history – it can set off red flags that result in the site being demoted or worse). According to Google, this update affected up to 12% of overall search results.
Example: The mysterious stranger had done it again! Harry was a terrible writer, unable to work even the most mundane trending topics into his daily blogs, plagiarizing his peers and filling Bob’s website with nothing but brief, barely legible foofaraw. Though it included several of the keywords Bob was hoping to score for, the content on his site was not relevant to his business. The Panda and Farmer updates could see that Bob’s Beautiful Bouquets was not a trusted resource for its visitors, in fact little better than spam. The hundreds of pages Mr. Hack had added hung around Bob’s neck like a lead albatross and, once more, Bob’s website plunged into the depths of web obscurity.
2012 – Penguin
This update was termed an “over-optimization penalty.” Questionable SEO tactics were penalized and additional keyword stuffing and spam factors were adjusted.
Example: Because Bob’s copywriter had been a malnourished creature, sustained mainly on a diet of ramen noodles and gummy bears, Bob had entrusted Harry with various SEO duties across his website. While he was busily rewriting Harry’s blogs, Bob completely forgot to attend to his page titles and meta tags. Thankfully, after Florida and Bourbon, he was much more knowledgeable about what links and titles Google would consider problematic. Though the Penguin update did drop him in the page rankings, he was able to fix enough pages to mitigate the blow. The last two years had taught him much in terms of what web users wanted to see. He also noticed that his pageviews were much higher for articles that he’d rewritten about his personal experiences working in the flower shop.
2013 – Hummingbird
The most recent update, and considered the most substantial adjustment to the algorithm since Caffeine. Whereas previous updates were adjustments to the classic Google algorithm, Hummingbird is almost a complete overhaul. It is intended to improve Google’s conversational search abilities by better incorporating “sentiment analysis.” This means queries as a whole will be taken into account when generating search results.
Example: Bob guest blogged for several SEO websites on his experiences with Google updates. Though he still did not consider himself a literary fellow, he was able to relate the ups and downs of his web marketing in a personal way that readers responded to. His byline and personal profile directed new visitors to his website. In addition, Bob established himself on social media so that he could stay in contact with his customers and offer them great deals on flowers. Bob enlisted several guest bloggers from local nurseries that brought their own audiences to his website, in turn blogging for them about his business practices and flower preferences. This organic web of positive feedback gradually solidified into a robust page ranking for Bob’s Beautiful Bouquets, and Hummingbird rewarded him for it. He had become a trusted resource on the web. But Bob did not allow himself to become complacent. He avoided the mysterious stranger like the plague and enlisted the services of a professional SEO firm. It was true that Bob was a much savvier webmaster than he’d been ten years ago, but he also knew that Google would not stop updating just because he’d stopped spamming. There were other spammers out there and many refinements Google still intended for its algorithm. Bob’s SEO firm had one essential job: staying abreast of updates and ensuring that Bob’s Beautiful Bouquets followed the internet’s evolving best practices. The victory was tentative, but Bob savored it. Success had never smelled so sweet.