With the Google monopoly remedies ruling from the other day, we now have a lot more documents from the court that say more about Google’s search index, spam score, PageRank, page quality, Glue and more.
This is all in addition to all the DOJ documents we covered earlier and that big search leak, which Google did end up responding to. We also covered the Google FastSearch bit on grounding for Gemini yesterday and the user interactions and data bit today.
Most of these were spotted by Marie Haynes, but I dug maybe a bit deeper to pull out more references that I found.
I should note that just because these court documents contain these statements, it does not mean they are used in Google Search today, and these statements were also given by non-Googlers:
Google Search Index
What is stored in Google’s search index? Doc ID, URL map, time stamps, spam scores, etc:
Super interesting info here on what’s stored in Google’s search index.
– each document has a DocID
– there is a DocID to URL map
– each DocID has a set of signals, attributes or metadata, some derived from user data
These include:
– popularity as measured by user… pic.twitter.com/MlabMDu8r3
— Marie Haynes (@Marie_Haynes) September 3, 2025
Spam Score vs Page Quality
Google determines what to crawl based not just on spam score but also on quality and popularity signals:
Not getting crawled? It could be related to your spam score.
Quality and popularity signals help Google determine how frequently to crawl web pages. pic.twitter.com/Fn8wfGBVdk
— Marie Haynes (@Marie_Haynes) September 3, 2025
PageRank vs Webpage
PageRank is a key quality signal that is one component of the quality score but “most of Google’s quality signal is derived from the webpage itself.”
Now this is interesting!
PageRank is a key quality signal that is one component of the quality score.
However, it appears that “most of Google’s quality signal is derived from the webpage itself.” pic.twitter.com/3w6CBNIx8C
— Marie Haynes (@Marie_Haynes) September 3, 2025
Glue
Glue logs the query and user data to help with signals and ranking:
Glue is a query log that collects data about a query and the user’s interaction with the response.
The data includes:
– text of the query, language, user location and device type
– what appears on the SERP
– what the user clicked on or hovered over and how long they stayed on… pic.twitter.com/MnS1pTc4Vq
— Marie Haynes (@Marie_Haynes) September 3, 2025
RankEmbed BERT
Google has RankEmbed BERT, which is a deep learning ranking model that uses 70 days of search logs plus scores generated by human quality raters:
Oooh, next is RankEmbed, now called RankEmbed BERT.
It’s a deep learning ranking model that uses 70 days of search logs plus scores generated by human quality raters.
It has strong natural language understanding which allows it to more efficiently identify the best documents… pic.twitter.com/oxJKkCTRyr
— Marie Haynes (@Marie_Haynes) September 3, 2025
What else did you find in the court ruling PDF?
Forum discussion at X.