Windows 7: bingbot Series: Maximizing Crawl Efficiency

4 Weeks Ago   #1
Brink


Quote:
At the SMX Advanced conference in June, I announced that over the next 18 months my team will focus on improving our Bing crawler, bingbot. I asked the audience to share data to help us optimize our plans. First, I want to say "Thank you" to those of you who responded and provided us with great insights. Please keep them coming!

To keep you informed of the work we've done so far, we are starting this series of blog posts related to our crawler, bingbot. In this series we will share best practices, demonstrate improvements, and unveil new crawler abilities.

Before drilling into details about how our team is continuing to improve our crawler, let me explain why we need bingbot and how we measure bingbot's success.

First things first: What is the goal of bingbot?

Bingbot is Bing's crawler, sometimes also referred to as a "spider". Crawling is the process by which bingbot discovers new and updated documents or content to be added to Bing's searchable index. Its primary goal is to maintain a comprehensive index updated with fresh content.

Bingbot uses an algorithm to determine which sites to crawl, how often, and how many pages to fetch from each site. The goal is to minimize bingbot's crawl footprint on your web sites while ensuring that the freshest content is available. How do we do that? The algorithmic process selects URLs to crawl by prioritizing relevant known URLs that may not be indexed yet, as well as URLs that have already been indexed and that we are rechecking to ensure the content is still valid (for example, not a dead link) and has not changed. We also crawl content specifically to discover links to new URLs that have yet to be discovered. Sitemaps and RSS/Atom feeds are examples of URLs fetched primarily to discover new links.
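As a rough illustration of the selection step described above, here is a minimal Python sketch of a prioritized crawl queue with a per-site page budget. It is not Bing's algorithm: the three candidate kinds, their relative ordering in PRIORITY, the select_urls function, and the example URLs are all assumptions made up for this example. The post only says that these kinds of URLs are prioritized, not how.

```python
import heapq
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical priority classes (lower number = fetched sooner).
# The relative ordering is an assumption for illustration only.
PRIORITY = {
    "not_indexed": 0,  # known URL that is not in the index yet
    "discovery": 1,    # sitemap or RSS/Atom feed fetched to find new links
    "recheck": 2,      # indexed URL being rechecked for changes or dead links
}

def select_urls(candidates, per_site_budget):
    """Return URLs to fetch, best priority first, capped per host.

    candidates: iterable of (url, kind) pairs, where kind is a PRIORITY key.
    per_site_budget: maximum number of pages to fetch from any one host.
    """
    # The enumerate index breaks ties so equal priorities keep input order.
    heap = [(PRIORITY[kind], i, url) for i, (url, kind) in enumerate(candidates)]
    heapq.heapify(heap)

    fetched_per_host = Counter()
    selected = []
    while heap:
        _, _, url = heapq.heappop(heap)
        host = urlsplit(url).hostname
        if fetched_per_host[host] >= per_site_budget:
            continue  # this site's budget is used up; skip to limit footprint
        fetched_per_host[host] += 1
        selected.append(url)
    return selected

candidates = [
    ("https://example.com/old", "recheck"),
    ("https://example.com/sitemap.xml", "discovery"),
    ("https://example.com/new", "not_indexed"),
    ("https://example.org/a", "recheck"),
]
print(select_urls(candidates, per_site_budget=2))
```

With a budget of two pages per host, the recheck of https://example.com/old is skipped because the two higher-priority example.com URLs already used that site's budget.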

Measuring bingbot success: Maximizing crawl efficiency

Bingbot crawls billions of URLs every day. It's a hard task to do this at scale, globally, while satisfying all webmasters, web sites, and content management systems, while handling site downtimes, and while ensuring that we aren't crawling too frequently. We've heard concerns from webmasters that bingbot doesn't crawl frequently enough and that their content isn't fresh within the index; at the same time, we've heard that bingbot crawls too often, straining their websites' resources. It's an engineering problem that hasn't been fully solved yet.

Often, the issue is in managing how frequently bingbot needs to crawl a site to ensure new and updated content is included in the search index. Some webmasters request to have their sites crawled daily by bingbot to ensure that Bing has the freshest version of their site in the index, whereas the majority of webmasters would prefer that bingbot crawl their site only when new URLs have been added or content has been updated. The challenge we face is how to model the bingbot algorithms based on both what a webmaster wants for their specific site and the frequency with which content is added or updated, and how to do this at scale.

To measure how smart our crawler is, we measure bingbot crawl efficiency. Crawl efficiency is how often we discover new and fresh content per page crawled. Our crawl efficiency north star is to crawl a URL only when its content has been added (URL not crawled before) or updated (fresh on-page context or useful outbound links). The more we crawl duplicated, unchanged content, the lower our crawl efficiency metric is.
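To make the metric concrete, here is a minimal Python sketch that computes crawl efficiency as the share of fetched pages that were either new or changed since the previous crawl. The post does not say how Bing detects a change; comparing SHA-256 hashes of the page body is an assumption chosen for simplicity, and content_fingerprint and crawl_efficiency are hypothetical names.

```python
import hashlib

def content_fingerprint(body: str) -> str:
    """Hash the page body so two versions can be compared cheaply."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def crawl_efficiency(fetches, known_fingerprints):
    """Fraction of fetched pages that were new or changed.

    fetches: iterable of (url, body) pairs from one crawl pass.
    known_fingerprints: dict mapping url -> fingerprint from earlier crawls;
        it is updated in place so the next pass compares against this one.
    """
    total = 0
    useful = 0
    for url, body in fetches:
        total += 1
        fingerprint = content_fingerprint(body)
        # A URL never seen before has no stored fingerprint, so it counts as new.
        if known_fingerprints.get(url) != fingerprint:
            useful += 1
        known_fingerprints[url] = fingerprint
    return useful / total if total else 0.0

known = {
    "https://example.com/a": content_fingerprint("old text"),
    "https://example.com/b": content_fingerprint("same text"),
}
fetches = [
    ("https://example.com/a", "new text"),    # updated page
    ("https://example.com/b", "same text"),   # unchanged page, wasted fetch
    ("https://example.com/c", "first crawl"), # URL not crawled before
]
print(crawl_efficiency(fetches, known))  # two of the three fetches were useful
```

Running the same fetches a second time against the updated fingerprints would score 0.0, which is the wasted recrawling of unchanged content that the metric is meant to expose.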

Later this month, Cheng Lu, our engineering lead for the crawler team, will continue this series of blog posts by sharing examples of how crawl efficiency has improved over the last few months. I hope you are looking forward to learning more about how we improve crawl efficiency and, as always, we look forward to seeing your comments and feedback.

Thanks!

Fabrice Canel
Principal Program Manager, Webmaster Tools
Microsoft - Bing


Source: bingbot Series: Maximizing Crawl Efficiency | Webmaster Blog

