Tuesday, October 10, 2006

More info on PageRank

Every few months we update the PageRank data that we show in the toolbar, and every few months I see a few repeated questions, so let me take a pass at some of them. Note: I wrote this kinda quickly. I think it’s pretty good, but if I spot something incorrect later, I’ll change it.

Philipp Lenssen asks: “Matt, I often wonder, how is the PageRank value stored internally, is it a floating-point number as many people suggest or is it just the integer value itself due to the heavy recursive PR computations?”

It’s more accurate to think of it as a floating-point number. Certainly our internal PageRank computations have many more degrees of resolution than the 0-10 values shown in the toolbar.

viggen says: “Do i need to know that? What does it tell me when i know it? Why would i care?

Meaning, what purpose has the Pagerank for the mom and pop site out there?”

viggen, I think that’s a perfectly healthy attitude. If you don’t care about PageRank and your site is doing well, that’s fine by me.

Andrew Hunter asks: “Will the data centers using the slightly older infrastructure be updated in due course, or will my PR be split by data center for the next couple of months?”

The latter. I think most data centers are running the newer infrastructure for things like info:, related:, link: and PageRank, and I believe every data center that has that newer infrastructure has the recent snapshot of PageRank now. I wouldn’t be surprised if it took at least 1-2 months for the other data center IPs to get the newer infrastructure in some way. (Yes, this is smaller, different infrastructure than the stuff that made site: queries have more accurate results estimates.)

Lots of folks ask questions like: “Is this PageRank from day X or day Y? And it looks like backlinks are from day Z?”

Really, I wouldn’t worry about it–I’m not even sure myself. At some point we take our internal PageRanks, put them on a 0-10 scale, and export them so that they’re visible to Google Toolbar users. If you’re splitting hairs about the exact date that backlinks were taken from, you’re probably suffering from “B.O.” (backlink obsession) and should stop and go do something else for a bit until the backlink obsession passes. I highly recommend keyword analysis, looking at server logs to figure out new content to add, thinking of new hooks to make your site attract more word-of-mouth buzz, pondering how to improve conversion once visitors land on your site, etc.
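To make the "put them on a 0-10 scale" step concrete, here is a purely illustrative sketch. Nothing above says how the mapping actually works, so the logarithmic bucketing, the base of 8, and the name `to_toolbar_scale` are all assumptions made for the example. The only point it demonstrates is that many distinct floating-point scores collapse into the same coarse integer.

```python
import math


def to_toolbar_scale(score: float, max_score: float, base: float = 8.0) -> int:
    """Map a positive internal score onto an integer bucket from 0 to 10.

    Hypothetical mapping: each factor of `base` below `max_score`
    costs one toolbar point. This is an assumption for illustration,
    not a description of any real export process.
    """
    if score <= 0 or max_score <= 0:
        return 0
    # How many powers of `base` the score sits below the maximum.
    steps_below_max = math.floor(math.log(max_score / score, base))
    # Clamp into the displayed 0-10 range.
    return max(0, min(10, 10 - steps_below_max))


# Two different internal scores can land in the same displayed bucket.
print(to_toolbar_scale(0.10, 1.0), to_toolbar_scale(0.09, 1.0))
```

Under these assumptions, a page's internal score can move quite a bit without the displayed integer changing at all, which is one more reason not to read too much into small toolbar differences.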

I’ll do a follow-up. Supplemental Challenged said: “The fact that Google can only create a PR update that is a full quarter behind the times is awfully troubling.”

I believe that I’ve said before that PageRank is computed continuously; there are machines that take inputs to the PageRank algorithm at Google and compute the resulting PageRanks. So at any given time, a URL in Google’s system has up-to-date PageRank as a result of running the computation with the inputs to the algorithm. From time to time, that internal PageRank value is exported so that it’s visible to Google Toolbar users (see the question below for more details on the timing).
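For readers who want a feel for "take inputs and compute the resulting PageRanks", here is a minimal sketch of the textbook power-iteration form of PageRank on a toy link graph. It is not a description of Google's production system: the damping factor of 0.85, the fixed iteration count, the handling of pages with no outlinks, and the four-page graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank.

    `links` maps each page to the list of pages it links to.
    Every link target must also appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform guess
    for _ in range(iterations):
        # Every page gets a small baseline share each round.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outlinks spreads its rank over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: "c" is linked to by every other page.
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
for page, score in sorted(pagerank(toy_links).items()):
    print(page, round(score, 4))
```

Re-running the function whenever the link graph changes gives fresh floating-point scores each time, which is the sense in which an internal value can stay current even if a coarse snapshot of it is only exported occasionally.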

Matt Crouch asks: “Actually, I am just curious why you are bothering telling us about a new PR update…. is this the first time you ever did?”

Well asked, Matt Crouch; I’m not sure if I’ve given the official word on a PageRank export before. It’s not a big event here at Google. Frankly, I didn’t even know we’d done our 3-4 month-ish push of this data. When I saw people talking about it online, I went to check and see whether it was a real push or not. In the past few months, people have noticed when an engineer grabs an obscure data center and tinkers around with things like backlinks or info: queries (e.g. when “Update Pluto” got downgraded because it was just an engineer tinkering at one data center). So I figured I’d let people know that this was a real PageRank export and not just one person doing something.

New Jersey SEO asked: “Will this PR update affect SERPs? Are we going to have also a SERP data refresh / update?”

Great question. By the time you see newer PageRanks in the toolbar, those values have already been incorporated in how we score/rank our search results. So while you may be happy to see that the Google Toolbar shows a little more PageRank for a given page, it’s not as if that causes a change in search results at that point. So you won’t see any search engine result page (SERP) changes as a result of this PageRank export–those changes have been gradually baking in since the last PageRank export.