Sunday 13 May 2007

Google Searches Web's Dark Side

One in 10 web pages scrutinised by search giant Google contained malicious code that could infect a user's PC. Researchers from the firm surveyed billions of sites, subjecting 4.5 million pages to "in-depth analysis".

About 450,000 of those pages were capable of launching so-called "drive-by downloads", which install malicious code, such as spyware, without a user's knowledge.

A further 700,000 pages were thought to contain code that could compromise a user's computer, the team report.

To address the problem, the researchers say the company has "started an effort to identify all web pages on the internet that could be malicious".

Courtesy of the BBC website.

Sitemaps

One often-overlooked area of search engine optimisation is the sitemap.

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.
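As a rough illustration of the format described above, the following Python sketch builds a small Sitemap using only the standard library. The example.com addresses, the page values and the build_sitemap helper are made up for illustration; the element names (urlset, url, loc, lastmod, changefreq, priority) and the namespace are those of the Sitemap protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return a Sitemap XML string for a list of page dictionaries.

    Each dictionary needs a "loc" key; "lastmod", "changefreq" and
    "priority" are optional metadata and are written only when present.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages for an example site.
pages = [
    {
        "loc": "https://www.example.com/",
        "lastmod": "2007-05-13",
        "changefreq": "weekly",
        "priority": 1.0,
    },
    {"loc": "https://www.example.com/contact.html", "priority": 0.5},
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The resulting text would normally be saved as a file such as sitemap.xml on the site, with an XML declaration added at the top.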

Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site.
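To make the crawler's side of this concrete, here is a minimal Python sketch of how a program might read a Sitemap and use the metadata as hints. It is not how any particular search engine works: the sample XML, the read_sitemap helper and the choice to sort by priority are assumptions made for illustration. The fallback priority of 0.5 is the default stated in the Sitemap protocol.

```python
import xml.etree.ElementTree as ET

# Prefix mapping for the Sitemap protocol namespace.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical Sitemap for an example site.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://www.example.com/about.html</loc>
    <priority>0.3</priority>
  </url>
  <url>
    <loc>https://www.example.com/news.html</loc>
  </url>
</urlset>"""


def read_sitemap(xml_text):
    """Parse Sitemap XML and return its entries, highest priority first."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append(
            {
                "loc": url.findtext("sm:loc", namespaces=NS),
                # None when the Sitemap gives no change frequency.
                "changefreq": url.findtext("sm:changefreq", namespaces=NS),
                # The protocol's default priority is 0.5 when omitted.
                "priority": float(
                    url.findtext("sm:priority", default="0.5", namespaces=NS)
                ),
            }
        )
    return sorted(entries, key=lambda e: e["priority"], reverse=True)


entries = read_sitemap(SAMPLE)
for entry in entries:
    print(entry["priority"], entry["loc"], entry["changefreq"])
```

The priority values only rank pages relative to others on the same site, so a sketch like this can order one site's pages but says nothing about how that site compares with others.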

Web design work performed by Ion e-Business in Mission Beach never forgoes the use of sitemaps.