Hopefully by now you have heard of Remove’em’s latest innovation, Penguin Analysis. Many folks in the SEO world have been asking which factors in our machine-learned algorithm appear to have the greatest impact on Penguin vulnerability, and how we collect those factors for our Penguin Vulnerability Score. While most of the raw data is available over at Open Penguin Data, we definitely have a few other tricks up our sleeves.
However, today I want to talk specifically about some of the factors that we were able to build using data from MajesticSEO’s site explorer and, in particular, their amazing API. We grabbed millions of data points from MajesticSEO and crunched them into little 1s and 0s representing the triggers behind the Penguin algorithm. So, let’s get to it. Which factors from MajesticSEO’s data appear to have the biggest impact on the Penguin Vulnerability Score…
Domain Trust Flow
This one was shocking, to say the least. Both MajesticSEO and Moz give us excellent metrics related to trust, but the single metric without any derivation that had the greatest impact among the trust metrics was MajesticSEO’s Domain Trust Flow. (To be fair, a derived flag for MozTrust being less than MozRank was slightly more predictive, but it is not a single metric in itself.)
MajesticSEO provides a great tutorial on what exactly their Trust Flow and Citation Flow metrics are. For our machine learning project, we needed to compact this value into 1s and 0s rather than a sliding scale. Consequently, we built a metric called “Domain Trust Flow 1+ Standard Deviations Below the Mean”. Simply put, if a URL had a Domain Trust Flow in roughly the bottom third of all the links we analyzed, it would receive a 1. Otherwise, the URL would receive a 0. Using this methodology, we could highlight domains that performed particularly poorly on trust metrics.
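To make the 1s and 0s idea concrete, here is a minimal sketch of how a flag like this could be computed. It is an illustration only, not our production code: the function name `low_trust_flags` and the sample values are made up, and it assumes you already have a Domain Trust Flow number for each link. Note that the share of links a one-standard-deviation cutoff flags depends on how your values are distributed.

```python
from statistics import mean, pstdev


def low_trust_flags(trust_flows):
    """Return 1 for each value at least one standard deviation
    below the mean of the whole set, and 0 otherwise."""
    if not trust_flows:
        return []
    mu = mean(trust_flows)
    sigma = pstdev(trust_flows)  # population standard deviation
    if sigma == 0:
        # Every value is identical, so nothing sits below the pack.
        return [0] * len(trust_flows)
    cutoff = mu - sigma
    return [1 if tf <= cutoff else 0 for tf in trust_flows]


# Hypothetical Domain Trust Flow values, one per analyzed link.
sample = [5, 12, 18, 25, 30, 33, 41, 47, 52, 60]
flags = low_trust_flags(sample)
print(flags)
```

With these sample values the mean is 32.3 and the cutoff lands a little above 15, so only the two weakest domains (5 and 12) receive a 1.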
Now, it is very important to point out that we are not saying that these metrics cause a Penguin penalty. Rather, they simply help us predict which sites have characteristics similar to those regularly impacted by Penguin. Whatever factors improve one’s Domain Trust Flow are the ones you need to target, not the metric itself. Google isn’t consuming MajesticSEO’s API to determine which pages to penalize. However, if you have a low Domain Trust Flow, you had better be on the lookout.
No Government Links to Domain
MajesticSEO’s API provides quick and ready access to a number of link metrics, one of which is links from .gov domains. The lack of .gov links was more than twice as predictive of a Penguin penalty as the lack of .edu links. This seems to be a pretty fair assessment, as getting .gov links is significantly harder than getting .edu links. Of course, it is important to recognize that having .gov links is not necessarily an inoculation against Penguin, but having content and a site that earns .gov links likely is. So don’t just go out trying to spam .gov sites for links. Create a site that deserves them, and work on outreach that helps reel them in.
The No Government Links metric does not appear to be influential at the URL level. It is far too sparse a data point (i.e., too few URLs have any .gov links) to be an effective categorization metric.
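Here is a rough sketch of both ideas: the domain-level flag, and a quick sparsity check that shows why the same flag is weak at the URL level. The names `no_gov_links` and `positive_rate`, the example domains, and the counts are all hypothetical. They are not field names from MajesticSEO’s API, so map them to whatever your own export contains.

```python
def no_gov_links(gov_link_count):
    """Return 1 when there are no .gov links at all, else 0."""
    return 1 if gov_link_count == 0 else 0


def positive_rate(counts):
    """Fraction of rows that have at least one .gov link."""
    if not counts:
        return 0.0
    return sum(1 for c in counts if c > 0) / len(counts)


# Hypothetical .gov link counts per domain.
domain_gov_counts = {
    "example-a.com": 0,
    "example-b.org": 3,
    "example-c.net": 0,
    "example-d.com": 1,
}
# Hypothetical .gov link counts per individual URL.
url_gov_counts = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

domain_flags = {d: no_gov_links(c) for d, c in domain_gov_counts.items()}
print(domain_flags)
print(positive_rate(list(domain_gov_counts.values())))  # domain level
print(positive_rate(url_gov_counts))                    # URL level
```

When almost every row gets the same value, as with the URL-level counts here, the flag carries very little information for a classifier to work with.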
If you are paying attention, you should have noticed that the No Government Links metric likely influences the Domain Trust Flow metric discussed above. Getting trustworthy links is definitely starting to add up to avoiding the Penguin algorithm!
Time and time again we return to optimized anchor text as a clear signal of over-optimization. Optimized anchor text is still the fastest way to rank and the fastest way to get penalized. Simply having a single link with the anchor text set to the exact keyword for which you are trying to rank is one of the strongest influencers.
However, what we find to be most influential is the mix of anchor text metrics across the board: phrase match anchor text pointing to the domain or page combined with exact match anchor text. The higher these metrics run, for example when your most common anchor is your keyword, the greater your risk of being caught in a Penguin update.
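As a sketch of what these anchor text metrics might look like in practice, the snippet below takes a list of anchor texts and a target keyword and produces a few simple features. The function name `anchor_text_features`, the feature names, and the sample anchors are assumptions made for illustration; our actual model uses its own definitions. Here "phrase match" is taken to mean the keyword appears as whole words inside a longer anchor.

```python
from collections import Counter


def anchor_text_features(anchors, keyword):
    """Compute simple exact match and phrase match anchor features."""
    kw = " ".join(keyword.lower().split())
    normalized = [" ".join(a.lower().split()) for a in anchors]
    total = len(normalized)
    if total == 0 or not kw:
        return {
            "has_exact_match": 0,
            "exact_match_share": 0.0,
            "phrase_match_share": 0.0,
            "top_anchor_is_keyword": 0,
        }
    exact = sum(1 for a in normalized if a == kw)
    # Pad with spaces so "cat" does not match inside "category".
    phrase = sum(
        1 for a in normalized if a != kw and f" {kw} " in f" {a} "
    )
    top_anchor = Counter(normalized).most_common(1)[0][0]
    return {
        "has_exact_match": int(exact > 0),
        "exact_match_share": exact / total,
        "phrase_match_share": phrase / total,
        "top_anchor_is_keyword": int(top_anchor == kw),
    }


# Hypothetical anchor texts pulled from a backlink export.
anchors = [
    "blue widgets",
    "Blue Widgets",
    "buy blue widgets online",
    "Example Co",
    "click here",
    "blue widgets",
]
features = anchor_text_features(anchors, "blue widgets")
print(features)
```

In this made-up profile half of the anchors are the exact keyword and the keyword is also the most common anchor, which is the kind of mix that pushes risk up.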
While there are tons of other factors that go into the Penguin Vulnerability Score, MajesticSEO is definitely one of the larger data sources upon which we relied for building our risk assessment model. If you haven’t had a chance yet, sign up and get your Penguin Vulnerability Score, which is only $0.99! If your score turns out risky, you can use the tool to find which of these factors appear to be impacting your score. After that, head on over to MajesticSEO to dig deeper into your link profile to find the problem areas.