Home | Services | Contact | About | SEO-ology | Portfolio | Tools | Library | Links | SEM-PPC | FLASH-AD


SEO Library Archives

Search Marketing Related Sources and Materials

SEO/SEM Glossary - Search Marketing and Search Optimization terminology

SEO Guidelines - SEO Standards and Practices Information and Guidelines

Current search topics, articles and SEO discussion threads

Source: wikipedia.org/wiki/Search_engine_optimization#SEO_and_marketing

SEO to Go / Editor's Note / May 2007

This section is a work in progress and will be updated and expanded over time. The contributions, suggestions and opinions of viewers and users are appreciated and welcomed. Thanks for visiting.

  1. Brian Pinkerton. Finding What People Want: Experiences with the WebCrawler. The Second International WWW Conference Chicago, USA, October 17–20, 1994.
  2. WOULD YOU LIKE TO KNOW THE SECRET TO HAVING YOUR WEB SITE LISTED #1 ON ALL THE MAJOR SEARCH ENGINES?. Google Groups.
  3. The Truth About Internet Marketing. Usenet (1997-07-26).
  4. Cory Doctorow (26 August 2001). Metacrap: Putting the torch to seven straw-men of the meta-utopia. e-LearningGuru.
  5. Pringle, G., Allison, L., and Dowe, D. (April 1998). What is a tall poppy among web pages?. Proc. 7th Int. World Wide Web Conference.
  6. Brin, Sergey and Page, Larry (1998). The Anatomy of a Large-Scale Hypertextual Web Search Engine.
  7. Zoltan Gyongyi and Hector Garcia-Molina (2005). Link Spam Alliances. Proceedings of the 31st VLDB Conference, Trondheim, Norway. Retrieved on 2007-05-08.
  8. Danny Sullivan (Sep. 29, 2005). Rundown On Search Ranking Factors. Search Engine Watch.
  9. Christine Churchill (November 23, 2005). Understanding Search Engine Patents. Search Engine Watch.
  10. Laurie J. Flynn (November 11, 1996). Desperately Seeking Surfers. New York Times.
  11. AirWeb. Adversarial Information Retrieval on the Web, annual conference.
  12. David Kesmodel (September 9, 2005). 'Optimize' Rankings At Your Own Risk. Startup Journal (Wall Street Journal).
  13. Adam L. Penenberg (September 8, 2005). Legal Showdown in Search Fracas. Wired Magazine.
  14. Matt Cutts (2006-02-02). Confirming a penalty.
  15. Google Webmaster Tools.
  16. Yahoo! Site Explorer.
  17. What is a Sitemap file and why should I have one?. Google Webmaster Tools.
  18. Cho, J., Garcia-Molina, H. (1998). Efficient crawling through URL ordering. Proceedings of the seventh conference on World Wide Web, Brisbane, Australia.
  19. Search Submit. Yahoo!. Retrieved on 2007-05-09.
  20. Yahoo Responds To Site Match Questions. WebProNews (2004-03-11).
  21. Newspapers Amok! New York Times Spamming Google? LA Times Hijacking Cars.com?. Search Engine Land (May 8, 2007).
  22. Andrew Goodman. Search Engine Showdown: Black Hats vs. White Hats at SES. SearchEngineWatch.
  23. Jill Whalen. Black Hat/White Hat Search Engine Optimization. Search Engine Guide.
  24. Ask.com Editorial Guidelines.
  25. Google's Guidelines on SEOs.
  26. Google's Guidelines on Site Design.
  27. MSN Search Guidelines for successful indexing.
  28. Yahoo! Search Content Quality Guidelines.
  29. Andy Hagans. High Accessibility Is Effective Search Engine Optimization. A List Apart.
  30. Matt Cutts (February 4, 2006). Ramping up on international webspam.
  31. Matt Cutts (February 7, 2006). Recent reinclusions.
  32. Melissa Burdon (March 13, 2007). The Battle Between Search Engine Optimization and Conversion: Who Wins?. Grokdotcom.
  33. Andy Greenberg (April 30, 2007). Condemned To Google Hell. Forbes.
  34. Mike Masnick (May 1, 2007). Business Lesson Of The Day: Don't Rely On A Single Source For All Your Business. Techdirt.
  35. Stefanie Olsen (May 30, 2003). Judge dismisses suit against Google. CNET. http://news.com.com/2100-1032_3-1011740.html. Retrieved on 2007-05-10.
  36. KinderStart.com, LLC, et al v. Google, Inc., C 06-2057 RS (March 17, 2006).
  37. Order Granting Motion to Dismiss, KinderStart.com LLC v. Google, Inc. (March 16, 2007).
  38. Order Granting in Part and Denying in Part Motion for Sanctions, KinderStart.com LLC v. Google, Inc. (March 16, 2007).

Search Engine Optimization - A Brief History

Source: Wikipedia wikipedia.org/wiki/Search_engine_optimization - 2006

Search engine optimization (SEO) is a set of methods aimed at improving the ranking of a website in search engine listings, and could be considered a subset of search engine marketing. The term SEO (Search Engine Optimizers) also refers to an industry of consultants who carry out optimization projects on behalf of clients' sites. Some commentators, and even some SEOs, break down methods used by practitioners into categories such as "white hat SEO" (methods generally approved by search engines, such as building content and improving site quality), or "black hat SEO" (tricks such as cloaking and spamdexing). White hatters charge that black hat methods are an attempt to manipulate search rankings unfairly. Black hatters counter that all SEO is an attempt to manipulate rankings, and that the particular methods one uses to rank well are irrelevant.

Search engines display different kinds of listings in the search engine results pages (SERPs), including: pay per click advertisements, paid inclusion listings, and organic search results. SEO is primarily concerned with advancing the goals of a website by improving the number and position of its organic search results for a wide variety of relevant keywords. SEO strategies can increase both the number and quality of visitors, where quality means visitors who complete the action hoped for by the site owner (e.g. purchase, sign up, learn something). Search engine optimization is sometimes offered as a stand-alone service, or as a part of a larger marketing effort, and can often be very effective when incorporated into the initial development and design of a site.

For competitive, high-volume search terms, the cost of pay per click advertising can be substantial. Ranking well in the organic search results can provide the same targeted traffic at a potentially significant savings. Site owners may choose to optimize their sites for organic search, if the cost of optimization is less than the cost of advertising.

Not all sites have identical goals for search optimization. Some sites are seeking any and all traffic, and may be optimized to rank highly for common search phrases. A broad search optimization strategy can work for a site that has broad interest, such as a periodical, a directory, or site that displays advertising with a CPM revenue model. In contrast, many businesses try to optimize their sites for large numbers of highly specific keywords that indicate readiness to buy. Overly broad search optimization can hinder marketing strategy by generating a large volume of low-quality inquiries that cost money to handle, yet result in little business. Focusing on desirable traffic generates better quality sales leads, resulting in more sales. Search engine optimization can be very effective when used as part of a smart niche marketing strategy.

 

History

Early search engines

Webmasters and content providers began optimizing sites for search engines in the mid-1990s, as the first search engines were cataloging the early Web. Initially, all a webmaster needed to do was submit a site to the various engines, which would run spiders (programs that "crawl" the site) and store the collected data. By default, a search engine scanned an entire webpage for words related to the query, so a page with many different words matched more searches, and a webpage containing a dictionary-type listing would match almost all searches, limited only by unique names. Search engines then sorted the information by topic and served results based on the pages they had spidered. As the number of documents online kept growing, and more webmasters realized the value of organic search listings, some popular search engines began to sort their listings so they could display the most relevant pages first. This was the start of a friction between search engines and webmasters that continues to this day.

At first, search engines were guided by the webmasters themselves. Early versions of search algorithms relied on webmaster-provided information such as category and keyword meta tags, or index files in engines like ALIWEB. Meta tags provided a guide to each page's content. When some webmasters began to abuse meta tags, causing their pages to rank for irrelevant searches, search engines abandoned their consideration of meta tags and instead developed more complex ranking algorithms. These took into account a more diverse set of factors that gave weight to a limited number of words per page (countering dictionary-style pages), including:

  • Text within the title tag
  • Domain name
  • URL directories and file names
  • HTML tags: headings, bold and emphasized text
  • Term frequency, both in the document and globally, often misunderstood and mistakenly referred to as Keyword density
  • Keyword proximity
  • Keyword adjacency
  • Keyword sequence
  • Alt attributes for images
  • Text within NOFRAMES tags
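The distinction the list draws between term frequency "in the document and globally" can be illustrated with a small sketch. This is only one common textbook formulation (term frequency weighted by inverse document frequency), not the formula of any particular search engine; the function names and the three sample documents are made up for illustration.

```python
import math
from collections import Counter


def term_frequency(term, document_words):
    """Share of the words in one document that are `term`."""
    if not document_words:
        return 0.0
    return Counter(document_words)[term] / len(document_words)


def inverse_document_frequency(term, corpus):
    """Log-scaled rarity of `term` across all documents in the corpus."""
    containing = sum(1 for words in corpus if term in words)
    if containing == 0:
        return 0.0  # the term appears nowhere, so it carries no weight
    return math.log(len(corpus) / containing)


def tf_idf(term, document_words, corpus):
    """Weight a term by its local frequency and its global rarity."""
    return term_frequency(term, document_words) * inverse_document_frequency(term, corpus)


# Hypothetical three-page corpus, already split into words.
corpus = [
    "cheap flights cheap hotels cheap cars".split(),
    "flights to london".split(),
    "history of the london underground".split(),
]

print(term_frequency("cheap", corpus[0]))  # 0.5 (3 of 6 words)
print(tf_idf("cheap", corpus[0], corpus))
print(tf_idf("london", corpus[1], corpus))
```

Keyword density corresponds only to the first function. The global component is what lets a rare word count for more than one that appears on every page.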

Pringle et al. (1998) [1] also defined a number of attributes within the HTML source of a page which were often manipulated by web content providers attempting to rank well in search engines. But by relying so extensively on factors that were still within the webmasters' exclusive control, search engines continued to suffer from abuse and ranking manipulation. In order to provide better results to their users, search engines had to adapt to ensure their SERPs showed the most relevant search results, rather than useless pages stuffed with numerous keywords by unscrupulous webmasters using a bait-and-switch lure to display unrelated web pages. This led to the rise of a new kind of search engine.

Organic search engines

Google was started by two PhD students at Stanford University, Sergey Brin and Larry Page, and brought a new concept to evaluating web pages. This concept, called PageRank, has been important to the Google algorithm from the start [2]. PageRank relies heavily on incoming links and uses the logic that each link to a page is a vote for that page's value. The more incoming links a page has, the more "worthy" it is. The value of each incoming link itself varies directly with the PageRank of the page it comes from and inversely with the number of outgoing links on that page.
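The voting logic described above can be sketched as a simple iterative calculation. This is an illustrative simplification, not Google's production algorithm. The damping value of 0.85 follows the figure commonly quoted from Brin and Page's paper, and the iteration count, the handling of pages with no outgoing links, and the three-page graph are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank for a small link graph.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A link's value grows with the source's rank and
                # shrinks with the number of outgoing links on the source.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads
                # its rank evenly over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: "a" and "b" both link to "c"; "c" links back to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph, "c" ends up with the highest score because it receives two votes, "a" comes next because its single vote comes from the highly ranked "c", and "b" receives no votes at all.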

With help from PageRank, Google proved to be very good at serving relevant results. Google became the most popular and successful search engine. Because PageRank measured an off-site factor, Google felt it would be more difficult to manipulate than on-page factors.

However, webmasters had already developed link-manipulation tools and schemes to influence the Inktomi search engine. These methods proved to be equally applicable to Google's algorithm. Many sites focused on exchanging, buying, and selling links on a massive scale. PageRank's reliance on the link as a vote of confidence in a page's value was undermined as many webmasters sought to garner links purely to influence Google into sending them more traffic, irrespective of whether the link was useful to human site visitors.

Further complicating the situation, search engines by default still scanned an entire webpage for related search words, so a webpage containing a dictionary-type listing would still match almost all searches (except special names), and would do so at an even higher priority when boosted by its link rank. Dictionary pages and link schemes could severely skew search results.

It was time for Google, and other search engines, to look at a wider range of off-site factors. There were other reasons to develop more intelligent algorithms. The Internet was reaching a vast population of non-technical users who were often unable to use advanced querying techniques to reach the information they were seeking, and the sheer volume and complexity of the indexed data was vastly different from that of the early days. Search engines had to develop predictive, semantic, linguistic and heuristic algorithms. Around the same time as the work that led to Google, IBM had begun work on the Clever Project [3], and Jon Kleinberg was developing the HITS algorithm.

A proxy for the PageRank metric is still displayed in the Google Toolbar, but PageRank is only one of more than 100 factors that Google considers in ranking pages.

Today, most search engines keep their methods and ranking algorithms secret, both to compete on delivering the most valuable search results and to deter spam pages from clogging those results. A search engine may use hundreds of factors in ranking the listings on its SERPs; the factors themselves and the weight each carries may change continually. Algorithms can differ widely: a webpage that ranks #1 in a particular search engine could rank #200 in another search engine.

Much current SEO thinking on what works and what doesn't is speculation and informed guesswork. Some SEOs have carried out controlled experiments to gauge the effects of different approaches to search optimization.

The following factors are speculation on some of the considerations search engines may presently be using or which could be built into their algorithms. A number of these are taken from one of Google's patent applications [4], and may give some indication as to what is in the pipeline. Some are pure speculation. It's also good to keep in mind that Google has over 180 patents and patent applications assigned to them at the US Patent and Trademark Office (USPTO), and a number of those include possible insights into other factors, and other directions that the search engine may follow, some of which may not be consistent with this list.

  • Age of site
  • Length of time domain has been registered
  • Age of content
  • Frequency of content: regularity with which new content is added
  • Text size: number of words above 200-250 (not affecting Google in 2005)
  • Age of link and reputation of linking site
  • Standard on-site factors
  • Negative scoring for on-site factors (for example, a dampening for websites with extensive keyword meta-tags indicative of having been optimized, or "SEO-ed")
  • Uniqueness of content
  • Related terms used in content (the terms the search engine associates as being related to the main content of the page)
  • Google PageRank (used only in Google's algorithm)
  • External links, the anchor text in those external links and in the sites/pages containing those links
  • Citations and research sources (indicating the content is of research quality)
  • Stem-related terms in the search engine's database (finance/financing)
  • Incoming backlinks and anchor text of incoming backlinks
  • Negative scoring for some incoming backlinks (perhaps those coming from low value pages, reciprocated backlinks, etc.)
  • Rate of acquisition of backlinks: too many too fast could indicate "unnatural" link buying activity
  • Text surrounding outward links and incoming backlinks. A link following the words "Sponsored Links" could be ignored
  • Use of "rel=nofollow" to suggest that the search engine should ignore the link
  • Depth of document in site
  • Metrics collected from other sources, such as monitoring how frequently users hit the back button when SERPs send them to a particular page
  • Metrics collected from sources like the Google Toolbar, Google AdWords/Adsense programs, etc.
  • Metrics collected in data-sharing arrangements with third parties (like providers of statistical programs used to monitor site traffic)
  • Rate of removal of incoming links to the site
  • Use of sub-domains, use of keywords in sub-domains and volume of content on sub-domains… and negative scoring for such activity
  • Semantic connections of hosted documents
  • Rate of document addition or change
  • IP of hosting service and the number/quality of other sites hosted on that IP
  • Other affiliations of linking site with the linked site (do they share an IP? have a common postal address on the "contact us" page?)
  • Technical matters like use of 301 to redirect moved pages, showing a 404 server header rather than a 200 server header for pages that don't exist, proper use of robots.txt
  • Hosting uptime
  • Whether the site serves different content to different categories of users (cloaking)
  • Broken outgoing links not rectified promptly
  • Unsafe or illegal content
  • Quality of HTML coding, presence of coding errors
  • Actual click through rates observed by the search engines for listings displayed on their SERPs
  • Hand ranking by humans of the most frequently accessed SERPs
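One item in the list above, the use of "rel=nofollow", is concrete enough to sketch. The example below shows how a crawler might separate links it is asked to ignore from ordinary links, using Python's standard html.parser module. The class name, the sample markup and the example.com URL are hypothetical, and real search engines may treat such links differently.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect anchor links, keeping rel="nofollow" links separate."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel may hold several space-separated tokens, or be absent.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followed.append(href)


# Hypothetical page fragment.
sample = (
    '<p><a href="/about">About us</a> '
    '<a rel="nofollow" href="http://example.com/ad">Sponsored</a></p>'
)

collector = LinkCollector()
collector.feed(sample)
print(collector.followed)  # ['/about']
print(collector.nofollow)  # ['http://example.com/ad']
```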

The relationship between SEO and the Search Engines

The first mentions of Search Engine Optimization don't appear on Usenet until 1997, a few years after the launch of the first Internet search engines. The operators of search engines recognized quickly that some people from the webmaster community were making efforts to rank well in their search engines, and even manipulating the page rankings in search results. In some early search engines, such as Infoseek, ranking first was as easy as grabbing the source code of the top-ranked page, placing it on your website, and submitting a URL to instantly index and rank that page.

Due to the high value and targeting of search results, there is potential for an adversarial relationship between search engines and SEOs. In 2005, an annual conference named AirWeb was created to discuss bridging the gap and minimizing the sometimes damaging effects of aggressive web content providers.

Some more aggressive site owners and SEOs generate automated sites or employ techniques which eventually get domains banned from the search engines. Many search engine optimization companies, which sell services, employ long-term, low-risk strategies, and most SEO firms that do employ high-risk strategies do so on their own affiliate, lead-generation, or content sites, instead of risking client websites.

Some SEO companies employ aggressive techniques that get their client websites banned from the search results. The Wall Street Journal profiled a company which allegedly used high-risk techniques and failed to disclose those risks to its clients.[5] Wired reported that the same company sued a blogger for mentioning that it was banned.[6] Google's Matt Cutts later confirmed that Google did in fact ban Traffic Power and some of its clients.[7]

Google has enforced webpage restrictions for years, for example against hidden text (text whose foreground color is the same as the background color). In 2006, Google could automatically punish a website that violated these restrictions by removing it from search results the next day for 30-35 days (or longer), pending a reinclusion request. If the site was reinstated, Google could revert its index to old, expired or deleted webpages from a year earlier, delaying the re-indexing of the current website for a total of 2-4 months.

Yahoo and MSN Search do not automatically punish entire websites for small amounts of accidental hidden text. Not surprisingly, Google's market share of daily searches has fallen rapidly from 75% to 56% over the past few years, as other search engines find many valuable webpages that Google has banned and cannot display due to Google's severely limited index. In early 2006, MSN Search typically re-indexed small websites every 14 days, and Yahoo also re-indexed quickly, much faster than Google. However, all three (MSN, Yahoo and Google) could require more than a month to index a new page (new file name) on an old website.

Some search engines have also reached out to the SEO industry, and are frequent sponsors and guests at SEO conferences and seminars. In fact, with the advent of paid inclusion, some search engines now have a vested interest in the health of the optimization community. All of the main search engines (Google, Yahoo, MSN and Ask.com) provide information and guidelines to help with site optimization. Google has a Sitemaps program to help webmasters learn if Google is having any problems indexing their website, and it also provides data on Google traffic to the website. Yahoo! has Site Explorer, which provides a way to submit your URLs for free (like MSN and Google), determine how many pages are in the Yahoo index and drill down on inlinks to deep pages. Yahoo! has an Ambassador Program and Google has a program for qualifying Google Advertising Professionals.

Getting into search engines' listings

New sites do not need to be "submitted" to search engines to be listed. A simple link from an established site will get the search engines to visit the new site and begin to spider its contents. It can take a few days or even weeks from the acquisition of a link from such an established site for all the main search engine spiders to commence visiting and indexing the new site.

Once the search engine has found the new site, it will generally visit and start to index the pages on the site, as long as all the pages are linked to with anchor tag hyperlinks. Pages which are accessible only through Flash or JavaScript links may not be findable by the spiders.

Search engine crawlers may look at a number of different factors when crawling a site, and many pages from a site may not be indexed by the search engines until they gain more PageRank, links or traffic. Distance of pages from the root directory of a site may also be a factor in whether or not pages get crawled, as well as other importance metrics. Cho et al. (1998) [8] described some standards for those decisions as to which pages are visited and sent by a crawler to be included in a search engine's index.
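The idea of ordering a crawl by an importance metric can be sketched as follows. This is a minimal illustration, not the method described by Cho et al.: it assumes the metric is simply the number of links to a page discovered so far, breaks ties alphabetically, and runs over a hypothetical in-memory site map instead of fetching real pages.

```python
def crawl_order(link_graph, seed):
    """Return the order in which pages would be visited.

    `link_graph` maps each URL to the URLs it links to. At every step
    the crawler picks the known, unvisited URL with the most backlinks
    seen so far.
    """
    backlink_count = {seed: 0}
    frontier = {seed}
    done = set()
    order = []

    while frontier:
        # Most backlinks first; alphabetical order breaks ties.
        url = min(frontier, key=lambda u: (-backlink_count[u], u))
        frontier.remove(url)
        done.add(url)
        order.append(url)
        for target in link_graph.get(url, []):
            backlink_count[target] = backlink_count.get(target, 0) + 1
            if target not in done:
                frontier.add(target)
    return order


# Hypothetical site structure.
site = {
    "/": ["/a", "/b", "/c"],
    "/a": ["/c"],
    "/b": ["/c", "/d"],
    "/c": [],
    "/d": [],
}

print(crawl_order(site, "/"))  # ['/', '/a', '/c', '/b', '/d']
```

Here "/c" is visited before "/b" because, by the time the crawler chooses between them, it has already seen two links pointing to "/c".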

Webmasters can instruct spiders not to index certain files or directories through the standard robots.txt file in the root directory of the domain. Standard practice requires a search engine to check this file upon visiting the domain, though a search engine crawler will keep a cached copy of this file as it visits the pages of a site, and may not update that copy as quickly as a webmaster does. The web developer can use this feature to prevent pages such as shopping carts or other dynamic, user-specific content from appearing in search engine results, as well as to keep spiders out of endless loops and other spider traps. For those search engines that have their own paid submission (like Yahoo), it may save some time to pay a nominal fee for submission, though Yahoo's paid submission program does not guarantee inclusion in their search results.
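As a rough illustration of how a well-behaved crawler might honor these instructions, the sketch below uses Python's standard urllib.robotparser module. The robots.txt rules, the www.example.com address and the "ExampleBot" user agent are invented for the example, and the rules are parsed from a string instead of being downloaded.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps spiders out of the shopping cart
# and the internal search pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, user_agent="ExampleBot"):
    """Return True if the rules permit `user_agent` to fetch `url`."""
    return parser.can_fetch(user_agent, url)


print(allowed("http://www.example.com/products/widget"))  # True
print(allowed("http://www.example.com/cart/checkout"))    # False
```

A crawler that caches the parsed rules, as described above, would keep reusing the same parser object until it next re-reads the file.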

 



 

SEO Based Web Site Design and Search Engine Marketing Consultant

Internet Consulting, SEO Copywriting and Online Ad Campaign Management

SEO to Go - Los Angeles - New York

www.SEO-to-Go.com is a “white hat” standards and practices company