This research section contains articles that have been linked to historically. For our users’ convenience, research material is now posted on our blog rather than taking up a separate section. However, as a service to our customers and user base, we have retained some of the articles. The layout has been altered as part of the migration to new webserver technology, but we have tried to ensure the content has not been meaningfully altered.
Breaking news!

8 Oct 2009 – we’ve broken Google’s size of web milestone!

Majestic SEO was in development for a long time, and we have invested a lot of time and money into making sure that our publicly available index is the best in the world. The first prototype of our index was created around February 2007, and a lot of work then went into making sure we can scale to rival and even beat the top search engines in the world. The history of our indices is shown below – one picture tells the story better than a thousand words!

 

On the chart above you can see the size of our index (the blue line going up pretty steeply), measured in unique URLs, as it went through very intensive development over the last 18 months. We actually released the first version of the index publicly early in 2008, even though we had built fairly large-scale test indices in 2007. We did not release these smaller indices because we felt that they did not have sufficient depth and quality to be made public. It took another half a year after the first public release for us to start selling competitive information.

Since we are dealing with competitive intelligence, it would have been wrong to avoid comparing ourselves with the known competitor – Yahoo Site Explorer. They have been around for longer than us, so we kept an eye on how the size of our index compares to theirs. Finally, this month we have reached the point where we have a much bigger index than they have! Catching up took a long time, but we are finally here.

As you search for URLs in our index, you will find that we always include links to the Google and Yahoo Site Explorer (YSE) link: commands, so that you can quickly compare how many backlinks they report against our database. We do that because we are confident that, in the vast majority of cases for established websites, we have more data. Just for this article we ran a few searches and found this relevant blog post that announced the arrival of YSE. Searching for backlinks in our database here shows that we have 9 external backlinks to it from 7 domains – query Yahoo for the same data and, at the time of writing this article, they showed only 2 external backlinks! This is the approach we used to check how well we do for the list of top world domains, plus a couple of interest to us, shown in the table below:

 

    Checked homepage  External backlinks count
#   of domain         Majestic SEO   Yahoo Site Explorer  SEOmoz Linkscape
1   google.com        1,399,343,388  288,913,788 (21%)    87,046,699 (6%)
2   yahoo.com         188,798,830    60,165,558 (32%)     7,319,669 (4%)
3   blogspot.com      3,216,258      69,894 (2%)          64,285 (2%)
4   adobe.com         11,553,778     3,416,027 (30%)      946,488 (8%)
5   microsoft.com     20,439,892     4,724,728 (23%)      1,299,931 (6%)
6   wikipedia.org     67,862,860     5,454,297 (8%)       4,575,880 (7%)
7   w3.org            6,963,932      1,953,095 (28%)      711,314 (10%)
8   amazon.com        17,952,311     63,447,096 (353%)    896,255 (5%)
9   geocities.com     1,847,450      157,320 (9%)         49,796 (3%)
10  youtube.com       23,328,644     12,030,984 (52%)     2,477,407 (11%)
11  myspace.com       9,134,606      4,118,036 (45%)      1,757,238 (19%)
12  msn.com           26,124,250     7,680,464 (29%)      2,960,549 (11%)
13  wordpress.org     300,892,396    154,338,507 (51%)    36,464,098 (12%)
14  macromedia.com    3,370,861      1,258,302 (37%)      182,500 (5%)
15  aol.com           13,066,744     14,102,296 (108%)    1,004,143 (8%)
16  apple.com         10,637,612     2,777,000 (26%)      731,189 (7%)
17  bbc.co.uk         9,030,447      2,795,333 (31%)      678,642 (8%)
18  sourceforge.net   7,648,850      2,894,128 (38%)      827,101 (11%)
19  tripod.com        109,322        38,269 (35%)         9,262 (8%)
20  cnn.com           20,102,927     10,722,839 (53%)     1,783,825 (9%)
21  seomoz.org        263,297        119,208 (45%)        37,588 (14%)
22  seobook.com       465,112        160,352 (34%)        79,915 (17%)

 

In this table we treat the Majestic SEO external backlink count as 100% and express each competitor’s count as a percentage of our figure. If this figure is less than 100%, they have fewer backlinks for the same URL; in this case we show it in red. There are just two cases where we have fewer backlinks than Yahoo – aol.com (marginally fewer) and amazon.com (a suspected anomaly: wrong reporting at YSE, which previously reported a much lower figure). The really interesting part is that we have more backlinks to our competitors’ home sites than they do themselves! The chart below shows this comparison much more clearly:
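For readers who want to check the percentages in the table themselves, here is a minimal sketch of the arithmetic in Python. The function name percent_of_majestic and the SAMPLE_ROWS dictionary are made up for this illustration and are not part of any Majestic SEO tool; rounding to the nearest whole percent is an assumption that happens to match the rows shown.

```python
def percent_of_majestic(competitor_count: int, majestic_count: int) -> int:
    """Return a competitor's backlink count as a whole-number percentage of ours.

    Rounding to the nearest whole percent is an assumption for this sketch.
    """
    return round(100 * competitor_count / majestic_count)


# (Majestic SEO, Yahoo Site Explorer, SEOmoz Linkscape) counts copied from the table.
SAMPLE_ROWS = {
    "google.com": (1_399_343_388, 288_913_788, 87_046_699),
    "amazon.com": (17_952_311, 63_447_096, 896_255),
    "aol.com": (13_066_744, 14_102_296, 1_004_143),
}

for domain, (majestic, yahoo, linkscape) in SAMPLE_ROWS.items():
    yahoo_pct = percent_of_majestic(yahoo, majestic)
    linkscape_pct = percent_of_majestic(linkscape, majestic)
    # Below 100% means the competitor reports fewer backlinks than we do.
    note = "fewer than us" if yahoo_pct < 100 else "at least as many as us"
    print(f"{domain}: YSE {yahoo_pct}% ({note}), Linkscape {linkscape_pct}%")
```

The three sample rows cover the typical case (google.com) and the two exceptions where Yahoo reports more backlinks than we do (amazon.com and aol.com).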

 

We think this chart clearly demonstrates that our index is well ahead of our competitors’. Does size matter? This is not a rhetorical question, and we do have a definite answer – yes, it does! If you look at our anchor index quality assessment article, you will see that we have been running quality checks on our index every time we made a new one. The matching ratio grew with our index size, clearly showing that small databases are not sufficiently close to the big databases used by the search engines. This means that data from much smaller databases is likely to show you only a partial picture, far from the one seen by the major search engines.

We now believe we have the biggest publicly available database of backlinks and anchor text. Is this the end of the road for us? Not at all! Google recently made a post about the size of the web saying that they have found 1 trillion (1,000,000,000,000) unique URLs. We think that they are likely to be telling the truth in this case, unlike the backlink counts they show on their website! Let’s now go back to the first chart comparing index sizes, this time including Google’s secret web-graph index of 1 trillion unique URLs that they use so effectively to rank sites:

 

Suddenly the picture is very different: the addition of Google’s secret index size changes the scale big time! Now that is the real competitor we have out there! Our goal is to understand the web as well as the best search engine does, and we are not resting on our laurels – we are actively working towards catching up with Google. Next year we expect to substantially close the gap between our index and Google’s internal web-graph database, which they keep secret for a good reason – understanding the effects of backlinks and anchor text on ranking is the key to great rankings.

Conclusion

We hope that this article helped you learn more about our work and showed that, even though we are ahead of our natural competition, we still have a big job ahead of us to reach our goal of being able to see the web the way the best search engine sees it. We expect big index growth in 2009 and we are well positioned to actually reach our goal. You can help us and yourself by becoming our customer – we offer a great range of domains with excellent competitive link data!

This article will be updated when we make our next big index update. We expect a very good increase in index size, so be sure to check back soon!

Please note that Majestic SEO is not affiliated with Google, Yahoo, SEOmoz or any other company mentioned in the text unless specifically stated. All trademarks belong to their rightful owners.
