Software Development

Google Instant Browsing

What is Instant Browsing?

Google Chrome 12 comes with a some features, including support for Google’s Instant Searching and Instant Browsing via the address bar (or “omnibar” as they call it). This means that as you type your query or URL it will be sent off with almost every character press constantly updating the page underneath.

If you have this version of Chrome and don’t currently see what I’m talking about you can go into the Options and in the Basic tab look for the Search options. Make sure that “Enable Instant for faster searching and browsing” is turned on.

Now you will see what happens as you type URLs into the “omnibox” (the address bar). If you additionally run Fiddler you’ll see how many requests are being made in the background.

For example, if I start typing my blog URL, by the time I’ve finished typing my forename it has already concluded that I want to see my blog and I can see in fiddler it has already made the request to http://colinmackay.co.uk/ and my blog appears while I’m still typing.

What’s going on?

If I continue on, say I’m looking for something on SQL, I can see this progression in Fiddler of all the requests that get sent to my blog. (I’ve removed some of the other requests that are unimportant for this example)

As you can see, sometimes I can type quite quickly and it has to play catch up. Sometimes, I slow enough that the blog responds with a 301 (the server does its best to guess what you want, treating an invalid URL as a sort of search term and redirecting you to its best guess) or a 404 if it can’t resolve the URL.

Try this on the BBC News website – As you type URLs you get tons of 404 pages back as the intermediate (non-functioning urls) get responded to!

As you can see from the image of Fiddler above there are some requests missing, some were the browser pulling down CSS and images from my blog, others were request off to Google looking to augment the instant browsing feature.

These calls to augment the Instant Browsing feature are all being sent off to clients1.google.co.uk (I suspect that in each locale there will be a different set of URLs that are able to best match queries in that area). It consists of a GET request with the query in it. The query being what you have typed in the “omnibox”. For example: http://clients1.google.co.uk/complete/search?client=chrome&hl=en-GB&q=colinmackay.co.uk%2Fblog%2Ftasks

This results in some JSON being returned. If you are typing URLs it doesn’t appear to be that useful. The above returned the following to me:

["colinmackay.co.uk/blog/tasks",[],[],[],{"google:suggesttype":[]}]

It becomes much more interesting when a search term is used rather than a URL.

By typing simple “rupert” in the omnibox the result from the request is:

["rupert",["rupert murdoch","rupert grint","rupert everett"],["","",""],[],
{"google:suggesttype":["QUERY","QUERY","QUERY"]}]

Chrome then auto-suggests “rupert murdoch” as the primary completion with the drop down also suggesting “rupert grint” and “rupert everett”

Incidentally, you don’t get Instant Search or Instant Browsing while in Chrome’s Incognito Windows event if it is turned on.

Can anybody say DDoS?

While the expansion of Google’s Instant Search feature into Chrome is fantastic, my first thought when I saw Instant Browsing was that it could be used as a way to mount a DDoS (Distributed Denial of Service) attack on a website especially those that may be running on less robust hosting plans. It is okay for Google to inundate their own web properties from their browser but what about other site owners?

If a web site is not expecting the deluge of requests coming from Chrome browsers then it may be saturated dealing with requests that the user isn’t likely to be all that interested in anyway, especially if intermediate results are bringing back 404 responses (or worse 500 responses if the server breaks badly on bad URLs).

Google have thought of this and there is a way to tell Chrome to stop sending requests that you don’t want. If you read the Chrome FAQ for web developers, you’ll see there is a section opting out of Instant URL Loading. In short, you detect a request header that Chrome has inserted into the request and if you want to opt out return a HTTP 403 status code.  This will then have Chrome blacklist that website for the remainder of the user’s session. This means that if the user comes back another day there will still be that initial hit, giving the web site administrators a chance to opt back in.

An instant browsing request looks like this:

GET http://colinmackay.co.uk/blog/task HTTP/1.1
Host: colinmackay.co.uk
Connection: keep-alive
X-Purpose: : preview
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.112 Safari/534.30
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-GB,en-US;q=0.8,en;q=0.6
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3

The important part is the X-Purpose header. This is what tells the server that the browser is rendering the page as part of the Instant Browsing feature.

Note: The FAQ states that the header is “X-Purpose: preview” but fiddler shows an extra colon in there (see above). If you are attempting to detect the mode of the browser this may become important to the way you detect it.

One thought on “Google Instant Browsing

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s