Facebook hoaxes: Five of the worst

As we reported earlier in the week, a not-so-new Facebook hoax was doing the rounds, claiming that the site was intending to charge users.

Whilst many users quickly seem to have wised up to this one and the copy-and-paste message seems to have all but disappeared, it’s far from the only hoax that keeps popping up.

To many, messages such as this are just a minor irritation. However, they seem to be indicative of a bigger problem, as many otherwise intelligent people who just aren’t tech savvy fall for them hook, line and sinker.

Really, these messages are just another form of spam and many have been around in one form or another for many years.

Accordingly, here are just a select few of the most recent Facebook spams and scams.

#1: “Please do me a favour and move your mouse over my name here, wait for the box to load and then move your mouse over the ‘Subscribe’ link. Then uncheck the ‘Comments and Likes’. I would really rather that my comments on friends and families posts not be made public, thank You! Then re-post this if you don’t want your every single move posted on the right side in the ‘Ticker Box’ for everyone to see!”

The facts: Following these instructions actually stops you from seeing what your friends have posted. It has nothing to do with an individual’s privacy settings whatsoever, and the claim is, in fact, completely untrue.

#2: The sad and tragic story of 7-year-old Amy Bruce, a child who is dying from a brain tumour. For everyone who copies this, the post promises that the Make-A-Wish Foundation will donate $7.

The facts: Poor Amy has been 7 years old and dying since 1999, when this story originated as a chain email which users were urged to forward.

#3: The ‘Here you Have It’ virus warning – this warns users of Hotmail, Yahoo, AOL and other web-based email services that there is a virus by a hacker called ‘life owner’. It says that if a user opens an email then they will lose everything on their computer.

The facts: This has been in circulation since around 2002, and whilst there is actually a virus called ‘Here you Have It’, it doesn’t circulate via email and certainly doesn’t have the ability to destroy everything on your computer. Similar emails, which claim a virus will destroy the boot sector of your hard drive, have also been around for quite some time and are equally false.

#4: The claim that hackers are posting pornographic films on people’s profiles which they can’t see. This says that a post will appear that is only visible to other people, but not the account holder themselves. It claims that comments are also generated on the posts and warns users not to click on it as it’s a virus.

The facts: “Nonsense,” says Graham Cluley of Sophos.com. “We have not seen any evidence that hackers are able to post content to a compromised Facebook wall that the owner of the account cannot see.”

#5: The heart-rending tale of 52 thoroughbred horses which are to be sent to slaughter following the death of their owner, as his cruel son wants nothing to do with them. This claims that some of the mares are in foal and they can be obtained for free by dialling a number.

The facts: At the beginning of the year this post had a grain of truth, as 52 horses were abandoned in the US, although there is no evidence to suggest they were to be slaughtered. Since then the message has been manipulated and has evolved like a game of Chinese whispers; the current version is untrue and no longer bears any relation to the original story.

Of course, there are many, many more of these that propagate across Facebook and email. The worry is that those who are gullible enough to fall for them will also put themselves at real risk by clicking on video links that promise to show them something ‘awesome’, such as the survey hijack that claims to show a spider living under a man’s skin.

All in all, it is to be hoped that internet users will eventually recognise such things for what they are; then we might begin to get somewhere in the fight against malware.

Mozilla advises users to disable Java

Mozilla has warned that a weakness involving the Java plugin can be exploited to carry out information-stealing attacks on TLS-protected communications.

Whilst the attack is not specific to Firefox and doesn’t target the browser itself, Mozilla warns that some plugins may be vulnerable.

TLS encrypts communication information as it is sent over the internet, including instant messages and email.

Mozilla claims that in this instance, the information could be intercepted by an attacker, allowing them to steal information.

This could also include cookie data, which could then be used by the attacker to impersonate the victim of such a theft.

The issue was discovered by Juliano Rizzo and Thai Duong, who recently presented an in-depth paper on the problem.

This is not the first time that a problem with TLS has been discovered, but it was previously thought to have been overcome in newer versions. The attack is also so complex that it requires considerable expertise and resources to carry out.

The details of the attack “require the ability to completely control the content of connections originating in the browser”, and it is for this reason that Firefox is considered to be invulnerable as it doesn’t allow this.

However, the weakness in the Java plugin does and so Mozilla recommends that all users disable the add-on as a precaution whilst they investigate further.

They are considering disabling the Java plugin completely so that attacks will not be possible at all.

TLS’s predecessor SSL is also thought to be affected by the problem, which has been dubbed BEAST, short for Browser Exploit Against SSL/TLS.

The problem is that once an attack is carried out, user information can be intercepted even in HTTPS (secure) sessions.

Rizzo and Duong carried out a test attack in order to prove their point. They used the Java platform for this and quickly managed to gain information on a PayPal account.

There was some concern that Chrome could be affected, but Google doesn’t ship Java with its browser. It was also thought that the Flash plugin could potentially be affected.

However, Google has already addressed the issue and the latest version of Chrome has a fix to ensure an attack cannot happen.

Microsoft takes down Kelihos botnet

Microsoft has announced another success in its drive to take down botnets.

The company used “legal and technical measures” in “Operation b67” as it was codenamed (hmm, snappy moniker – ed), to take down the Kelihos botnet.

Kelihos is not as big as the Rustock botnet, but MS says that its takedown “represents a significant advance” in their fight.

This is because it’s the first time that MS has “named a defendant in one of its civil cases involving a botnet”.

This, they say, sends a “strong message” to botnet creators and controllers and should they attempt to rebuild the botnet then further action will always be taken.

The civil case alleges that Dominique Piatti and John Does owned a domain which they used to register subdomains in order to operate Kelihos. Whilst MS say that some were used for legitimate reasons, many were being used “for questionable purposes with links to a variety of disreputable online activities.”

This includes one subdomain which hosted MacDefender, rogue security software which infects Apple’s Mac OS X.

However, the main purpose of many of their subdomains was to control the botnet, which was used for a variety of purposes including spam, stealing information, stock scams and “websites promoting the sexual exploitation of children.”

MS obtained a restraining order on September 22nd which allowed them to cut the connections between the botnet and the zombie computers it controlled.

They then served Piatti, who lives in the Czech Republic, with notice of the suit and are now attempting to locate the other John Does in order to serve them too.

MS says that actually naming a defendant is a “big step forward” as it helps them to protect customers and the MS platform. It also goes some way to making domain providers aware that they should know more about their customers and their activities.

They also hope that this will raise the cost of cybercrime to the criminal, making it harder for them to start up and operate, therefore reducing the problem.

MS also point out that more regulation is needed in the industry to ensure that domain owners can be held accountable if subdomains are being used for illegal purposes.

Kelihos is thought to have infected around 41,000 computers across the globe, even though it is considered to be a relatively small botnet.

MS says that it will work with ISPs and Community Emergency Response Teams (CERTs) to clean up computers which are infected with botnet malware.

They have already added the Win32/Kelihos family to the latest release of the Malicious Software Removal Tool.

Title Tags – Is 70 Characters the Best Practice? – Whiteboard Friday

Posted by Aaron Wheeler

It’s often pretty difficult to make a short title for a webpage that offers a lot of varied or super-specific information. At SEOmoz, we say that the best practice for title tag length is to keep titles under 70 characters. That’s pretty pithy considering that the title also includes your site or brand name, spaces, and other nondescript characters. So, does it matter if you go over 70 characters? How important is it to strictly adhere to this best practice? Cyrus Shepard does SEO for us here at SEOmoz, and he’ll answer that very question in this week’s Whiteboard Friday. Think title tags could or should be longer? Shorter? Let us know in the comments below!


Video Transcription

Howdy SEOmoz! Welcome to another edition of Whiteboard Friday. My name is Cyrus. I do SEO here at SEOmoz. Today we’re talking about title tag length. How long is your title tag?

Bad title tag joke. For years, we’ve been telling people that the length of your title tag should be 70 characters or less, that this is best practice. But what does this really mean? Is it absolutely true? What happens if your title tags are longer than 70 characters? For example, the title of today’s post is 77 characters. Not the headline you see on the page, but the actual HTML title tag; if you look at the source code, you’ll find that the title tag of today’s Whiteboard Friday is 77 characters. We’re actually over the 70-character title tag limit. Is that bad? Are we going to go to SEO hell for that? What does that mean?

Well, recently people have been doing some experiments to see just how many characters Google will index within a title tag. For years, we thought it was 70. It’s fluctuated. But recent experiments have shown that Google will index 150 characters or more; one person even showed that it will index over 1,000 characters, and I will link to these experiments in the post. But does this mean that you should use all of those characters to your advantage? Can you use them to your advantage? Well, I got really curious about this. So I decided to perform some experiments here on the SEOmoz blog with super long title tags. We’re talking extreme title tags, 200 characters long, 250 characters long, just blew them out of the water to see what would happen.


On the first experiment, I took 10 posts that did not get a lot of traffic, but got pretty consistent traffic from week to week. I kept the old title tags and just extended them with relevant keywords, up to about 250 characters long. The results blew me away. In that first experiment, my traffic, over about a 12-week period, rose 136%. I’ll try to include a screenshot of the Google Analytics in the comments below. It exploded. I got really excited. So I tried a second experiment. (Correction: the experiment took place over a 6-week period, not 12 as I stated in the video.)


The second experiment I tried with existing successful pages, pages that were already getting a fairly high volume of traffic, that were getting a consistent level of traffic every week. On that experiment, over about the same 12-week period, traffic rose 8%. Cool, but overall site traffic rose 9%. So it was actually 1% below the site average.

For a third experiment, I tried again on a completely different site, a personal site. I changed a few pages, title tags. Traffic actually went down over a 12-week period 11%. On that site overall site traffic went down 15%.

So, in one of these experiments the long title tags seemed to work really well. In the other two, it just seemed to be a wash. Why did it work in the first case but not the others? I am going to get to that in a minute.

Title Tags less than 70 Characters

Now, what are the arguments for short title tags? The best practices that you always hear about, keep it less than 70 characters. There are reasons why this is best practices and why we recommend it time and time again.

The first reason is that Google will only display the first 70 characters, in general, in their SERPs. After that, they’re truncated. Users aren’t going to see them. So, if you are writing title tags longer than 70 characters, you’re basically writing it for the search engines, and time and time again we’ve found that if you’re doing something specifically for search engines and not for users, there is probably not a lot of search engine value in it. There might be some, but probably not much.
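Since anything past the display limit is cut off, it is worth auditing title lengths programmatically. Here is a minimal sketch using only the Python standard library; the example title, the 70-character limit and the “...” truncation are assumptions for illustration, not Google’s exact display behaviour:

```python
from html.parser import HTMLParser

LIMIT = 70  # assumed SERP display limit

class TitleParser(HTMLParser):
    """Pull the contents of the first <title> tag out of an HTML document."""
    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""
    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
    def handle_data(self, data):
        if self._in_title:
            self.title += data

def audit_title(html):
    """Return the title, its length, and an assumed truncated SERP preview."""
    p = TitleParser()
    p.feed(html)
    title = p.title.strip()
    preview = title if len(title) <= LIMIT else title[:LIMIT - 3].rstrip() + "..."
    return title, len(title), preview

# Hypothetical page source for this post (exact title assumed):
html = ("<html><head><title>Title Tags - Is 70 Characters the Best Practice?"
        " - Whiteboard Friday - SEOmoz</title></head></html>")
title, length, preview = audit_title(html)
print(length, preview)
```

Run over a crawl of your own pages, this flags every title that will be truncated before a user ever sees the end of it.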

The second reason is our Correlated Ranking Factors study, a survey that we perform every couple of years. Our highest on-page correlation value for keyword-specific usage was whether the keyword is found in the first word of the title tag: a 0.09 positive correlation. It is not a huge correlation, but it was our largest on-page keyword factor. Year after year, when we perform these correlation studies, we see a direct correlation between the position of the keyword in the title tag and how important it is in the query. So, the closer the keyword is to the beginning of the title tag, the more likely it is to be important in the query. You’re going to see this time and time again; it’s very consistent. Hundreds of webmasters know this from personal experience. You want your keywords at the beginning of the title tag to rank for those keywords. If you push them out to 220 characters, those keywords aren’t going to count for very much.

Title Tag Best Practices

Now the third reason is kind of new in today’s world, and that is the rise of social media. Twitter limits tweets to 140 characters. So, if you have a 220-character title tag and you’re trying to share it on Twitter through automatic tweets, or on Facebook, it looks spammy, it’s not shareable, and people don’t want to share it. Shorter, snappy title tags work really well.

For all these reasons, and because most of the time we’ve found that longer title tags don’t help you, we say that less than 70 characters is best practice. Now, people get confused by what we mean when we say best practice. Does it mean an absolute rule? No. It just means it works most of the time. It’s going to be your best bet. All other things being equal, it’s what you want to implement, what you want to teach people to do, and generally how you want to practice.

So, what happened here? Why did this experiment rise 136%? Well, if you remember, these were low-volume pages, pages that weren’t getting a lot of traffic anyway. The reason the traffic rose, we suspect, is that those title tags were poorly optimized in the first place. They didn’t match the content. When we added a few keywords to the end, Google interpreted that as: hey, these match the content a little better, and that’s why it rose. It was a fluke. If we had written the title tags better in the first place, we could have seen this traffic all along.

So, with this in mind, I have some suggestions for your future title tag use, and best practice is going to continue to be less than 70 characters.

Best Practices are Guidelines, Not Rules

The first rule is: always experiment. Like I said, if we had tried something else, if we had written different title tags in the first place, it could have helped us. What did it cost us to change those title tags? Zero. If your pages aren’t performing well, you can always try something different, and you should. I still see large eCommerce sites all the time that put their brand name in the first 20 characters of the title tag on thousands of pages where they shouldn’t necessarily do that. SEOmoz did that for a number of years, up until a few months ago. So, always experiment, not too much, but always try different things to see what title tags are going to work best for you.

Second: write for users. Here at SEOmoz our title tag is the same as the title of our blog post, because we think it is important to meet users’ expectations. When they see a title tag in the SERP and click through to your page, you want them to feel like they’ve arrived where they thought they were going to arrive. So, it doesn’t always have to match the title of your post, but it should be something similar, something to make them comfortable, something that speaks to the users.

Third, remember to keep your important keywords first. Putting your important keywords way out at the end of a long title isn’t going to help you much, unless your titles are so poorly optimized in the first place that you really should rewrite them anyway. So put your important keywords, which don’t always have to be in the very first position, as close to the front as you can.

Lastly, what happens if your title tag is over 70 characters, such as the title tag of today’s Whiteboard Friday post at 77? Don’t sweat it. In our Pro Web App, if you go over 70 characters we issue a warning. It is not an error; it’s a warning. We just want you to know that if your title tag is over that limit, it might not be the best-written title tag and you might want to have a look at it. But here at SEOmoz we have thousands of title tags that go over the 70-character limit, and for the most part we’re going to be fine. Best practice means that it’s best most of the time, but you can go outside of best practice if it’s warranted.

Remember, experiment, try different things out, find out what works best for you.

That’s it for today. Appreciate your comments below. Thanks everybody.

Video transcription by Speechpad.com


Amazon Silk browser raises privacy concerns

As we reported earlier, the much anticipated tablet device from Amazon has been unveiled and is soon to arrive, at least in the US anyway.

However, whilst many are fascinated by a potential rival to the market-dominating iPad, others have already raised concerns about the way the Kindle Fire’s browser handles data.

The Fire has its own browser which is called Silk. Amazon says that it is designed to “overcome the limitations of typical mobile browsers” and it does so by performing some functions in the cloud.

Amazon Silk uses a split architecture, which means that whilst all of the “browser subsystems are present on [the] Kindle Fire” when you load a page, Silk can dynamically execute some subsystems remotely in the cloud. By spreading the workload between cloud computing and the device itself, browsing can be sped up considerably.

The problem with this is that it has the potential to create a large amount of user data on the cloud system, which can be accessed and collected by Amazon.

As ZDNet points out, this means that “if Amazon wanted to, it could possess the complete online life of all of its Fire users.”

Considering the ongoing concerns surrounding privacy, it is interesting to note in the Silk privacy policy that Amazon says little about how it will deal with the data it collects.

Whilst a Fire user is surfing, they don’t connect directly to a web page; instead, their traffic is passed through the Amazon cloud in order to deliver that fast browsing experience.

This means that Amazon not only intercepts your direct connection with a site, but also acts as the ‘middle man’ when it comes to secure connections.

Further to this, their privacy policy states that, in order to troubleshoot technical issues with Silk, they will collect information such as URLs and IP addresses from websites whilst a user is connected.

However, have no fear as they also point out that they “generally do not keep this information for longer than 30 days.”


Whilst Amazon more than likely has no intention of monitoring its users’ every move, the danger for the company is that it appears it can do so, and it hasn’t really addressed this clearly in its privacy policy.

The good news is that Kindle Fire users can choose to forgo the speed benefits of the cloud, as Silk also has an off-cloud option that enables users to connect to websites directly.

This option would seem like an eminently sensible choice if you have concerns about your privacy and how your data is collected and stored.

CCP initiates new drive to get people online

The Communications Consumer Panel (CCP) is to begin a new project aimed at getting, and keeping, people online.

They are especially interested in ensuring that enough is being done to provide facilities and options for vulnerable groups, such as the elderly and those on a low income.

The panel has provided a new framework for digital participation based on existing research, which concentrates on what consumers say they need in order to get online.

It is increasingly important for all members of society to have access to online services, as more essential services move online or are delivered offline in a way that costs the customer more or leads to a lower quality of service.

The CCP will continue to work with the government to ensure that those who are “digitally excluded” will eventually have better access to information and will feel more secure about their ability to cope online.

Whilst the industry has become more regulated and competitive, there remain some problem areas where consumers don’t get as much choice as they would like.

In a recent survey carried out by CCP, it was found that consumers rarely switch broadband and telecoms providers, as compared to the utilities market.

This is due to a number of factors, such as terms and conditions imposing penalties when a contract is ended early, as well as the sheer hassle that is involved with switching.

Anyone who has ever switched will be aware of the wait that is imposed, as the broadband line is ceased by one company before it is reactivated by another.

It seems that the industry would benefit from more regulation, as complaints range from broadband speeds to usage and higher-than-expected bills. And this doesn’t do much to attract those groups who currently aren’t online.

The recent guidelines which were published by CAP addressed the issues around misleading advertising claims, but the CCP says that these do not go far enough and are “extremely unsatisfactory.”

“Consumers are still unable to make an informed choice of which ISP gives them the best internet speeds if only 10% of a provider’s customers get the maximum advertised speed,” the panel noted.

The new guidelines state that providers are only allowed to advertise a given speed if it is enjoyed by at least 10% of their customer base, which does seem a small proportion on which to base an ad campaign.

BT and TalkTalk have recently been warned to cease making adverts that claim high speeds are widely available to customers when, in fact, they are not.

This means that not only are consumers often misled, but it muddies the waters when it comes to comparing the best deals on offer.

Facebook cookie issue now fixed

On Monday, we reported that Facebook continued to track users’ movements online after logout, according to blogger Nik Cubrilovic.

The issue seems to have caused quite a stir at Facebook, and they now appear to have ‘fixed’ the problem, Nik writes in an update to the blog.

He says that “over the course of the last 48 hours” he has been in “constant contact with Facebook on working out solutions.”

This has led to the social media giant making changes, which they have explained to Nik in detail.

Whilst Nik goes into a fair amount of technical detail in the blog, the upshot of his work with Facebook is that the logout issue has now been dealt with and users are no longer tracked.

Facebook themselves say that there was “a bug” which meant that user IDs were not being destroyed, as they should be, on logout.

However, this leaves other cookies still present, which are designed for various things such as helping the site “identify suspicious login activity like failed login attempts and keep users safe.”

These are also used to flag behaviour such as a spammer creating multiple accounts, or repeated failed login attempts.

Another cookie that remains is intended to protect users who access the site via a public computer, which was one of Nik’s main concerns in the first blog. This will ensure the “keep me logged in” option will be unchecked if it’s found that multiple users are using the same PC.

“These cookies, by the very purpose they serve, uniquely identify the browser being used – even after logout. As a user, you have to take Facebook at their word that the purpose of these cookies is only for what is being described,” Facebook told Nik.

It seems that the other cookies that remain “are not very interesting”, being used mainly for things like browser language and timestamps.

Nik advises users to continue to clear cookies after a session even though the problem has been fixed to his satisfaction.

He noted: “I believe Facebook when they describe what these cookies are used for, but that is not a reason to be complacent on privacy issues and to take initiative in remaining safe.”

Using Social Media to Get Ahead of Search Demand

Posted by iPullRank

Before I even start saying anything about keyword research I want to take my hat off to Richard Baxter because the tools and methodologies he shared at MozCon make me feel silly for even thinking about bringing something to the Keyword Research table. Now with that said, I have a few ideas about using data sources outside of those that the Search Engines provide to get a sense of what needs people are looking to fulfill right now. Consider this the first in a series.
Correlation Between Social Media & Search Volume

The biggest problem with the search engine-provided keyword research tools is the lag time in the data. The web is inherently a real-time channel, and in order to capitalize upon that you need to leverage any advantage you can to get ahead of the search demand. Although Google Trends will give you data when there are huge breakouts on keywords around current events, there is a three-day delay with Google Insights, and AdWords only gives you monthly numbers!

However, there is often a very strong correlation between the number of people talking about a given subject or keyword in social media and the amount of search volume for that topic. Compare the trend of tweets containing the keyword “Michael Jackson” with its search volume for the last 90 days.

[Image: Trendistic graph – “Michael Jackson” tweets]

[Image: Google Insights graph – “Michael Jackson” search volume]

The graphs are pretty close to identical, with a huge spike on August 29th, which is Michael Jackson’s (and my) birthday. The problem is that, given the limitations of tools like Google Trends and Google Insights, you may not be able to find this out until September 1st for many keywords, and beyond that you may not be able to find complementary long-tail terms with search volume.
The insight here is that the subjects people are tweeting about are ultimately keywords that people are searching for. An added benefit of using social listening for keyword research is that you also get a good sense of the searcher’s intent, so you can better fulfill their needs.
Due to this correlation, social listening allows you to uncover which topics and keywords will have search demand, and which topics are going to see a spike in search demand, in real time.
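The claimed relationship is easy to quantify once you have the two series side by side. A minimal sketch of the Pearson correlation, using invented daily counts purely for illustration (real figures would come from Trendistic and Google Insights exports):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical daily counts around a spike (e.g. an anniversary):
tweets = [120, 130, 125, 900, 870, 300, 150]            # mentions per day
searches = [1000, 1100, 1050, 7800, 7500, 2600, 1300]   # query volume index

r = pearson(tweets, searches)
print(round(r, 3))
```

An r close to 1 across many keywords is what makes social mentions a usable leading indicator for search demand.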
Before we get to the methodology for doing this, I have to explain one basic concept: N-grams. An N-gram is a subsequence of length N; in the case of search queries, N is the number of words in the query. For example (I’m so terrible with gradients):
 Michael King SearchLove NYC 5-gram
is a 5-gram. The majority of search queries fall between 2- and 5-grams; anything beyond a 5-gram is most likely a long-tail keyword without enough search volume to warrant content creation.
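Extracting every n-gram from a query is only a few lines of code. A quick sketch (the sample query is invented for illustration):

```python
def ngrams(text, n):
    """Return all contiguous n-word subsequences of a query string."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

query = "michael jackson trial news today"
print(ngrams(query, 2))
# -> ['michael jackson', 'jackson trial', 'trial news', 'news today']
```

A 5-word query yields exactly one 5-gram (itself), four 2-grams, and everything in between.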

If this is still unclear, check out the Google Books Ngram Viewer; it’s a pretty cool way to get a good idea of what N-grams are. You should also check out John Doherty’s Google Analytics Advanced Segments post, where he talks about how to segment N-grams using RegEx.
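For reference, the kind of RegEx used to segment queries by word count can be sketched as follows. The exact patterns Doherty recommends may differ, so treat this as an assumed illustrative form rather than his actual filters:

```python
import re

def word_count_pattern(n):
    """Regex matching queries of exactly n words, usable as an
    advanced-segment filter (assumed form; adapt to your own profile)."""
    return r"^\s*[^\s]+(\s+[^\s]+){%d}\s*$" % (n - 1)

three_gram = re.compile(word_count_pattern(3))
print(bool(three_gram.match("michael jackson trial")))   # True
print(bool(three_gram.match("michael jackson")))         # False
```

Dropping the generated pattern into a Google Analytics advanced segment lets you bucket incoming keywords by n-gram length.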

Real-Time Keyword Research Methodology

Now that we’ve got the small vocabulary update out of the way let’s talk about how you can do keyword research in real-time. The following methodology was developed by my friend Ron Sansone with some small revisions from me in order to port it into code.

1.  Pull all the tweets containing your keyword from Twitter Search within the last hour. This part is pretty straightforward; you want to pull down the most recent portion of the conversation in order to extract patterns. Use Topsy for this. If you’re not using Topsy, the last 200 tweets pulled via Twitter is also a good-sized data set.

2.  Identify the top 10 most repeated N-grams, ignoring stop words. Here you identify the keywords with the highest (ugh) density; in other words, the keywords that are tweeted the most are the ones you consider for optimization. Be sure to keep this between 2- and 5-grams; beyond that you are most likely not dealing with enough search volume to make your efforts worthwhile. Also be sure to exclude stop words so you don’t end up with N-grams like “jackson the” or “has Michael.” Here’s a list of English stop words, and Textalyser has an adequate tool for breaking a block of text into N-grams.

3.  Check to see if there is already search volume in the AdWords Keyword Tool or Google Insights. This process is not just about identifying breakout keywords that aren’t yet showing in Google Insights; it’s also about identifying keywords with existing search volume that are about to get a boost. You’ll therefore want to check the search engine tools to see if any search volume exists, in order to prioritize opportunities.

4.  Pull the Klout scores of all the users tweeting them. Yeah, yeah, I know Klout is a completely arbitrary calculation, but you want to know that the people tweeting the keywords have some sort of influence. If you find that a given N-gram has been used many times by a bunch of spammy Twitter profiles, then that N-gram is absolutely not useful. Also, if you create content around a given term, you’ll know exactly who to send it to.
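Steps 1 and 2 above are the mechanical part, and can be sketched in a few lines. The stop-word list and sample tweets below are stand-ins; a real run would use the full list linked above and a Topsy/Twitter pull:

```python
from collections import Counter

# A tiny stop-word list for illustration; use a fuller list in practice.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "has", "is"}

def ngrams(words, n):
    """All contiguous n-word subsequences of a word list."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def top_ngrams(tweets, n_min=2, n_max=5, top=10):
    """Count every 2- to 5-gram across a batch of tweets, skipping any
    n-gram that contains a stop word, and return the top entries."""
    counts = Counter()
    for tweet in tweets:
        words = tweet.lower().split()  # note: punctuation not stripped here
        for n in range(n_min, n_max + 1):
            for gram in ngrams(words, n):
                if not any(w in STOP_WORDS for w in gram.split()):
                    counts[gram] += 1
    return counts.most_common(top)

# Hypothetical tweet batch:
tweets = [
    "michael jackson trial starts today",
    "watching the michael jackson trial",
    "michael jackson trial coverage is wild",
]
print(top_ngrams(tweets))
```

From here, steps 3 and 4 are just lookups: feed each surviving n-gram to the AdWords/Insights tools and the Klout API.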

Methodology Expanded

I expanded on Ron’s methodology by introducing another data source. If you were at SMX East, you might have heard me express the love that low-budget hustlers (such as myself) have for SocialMention. SocialMention allows you to pull data from 100+ social media properties and news sources. Unlike Topsy or Twitter, it has an easy CSV/Excel export, and it gives you the top 1-grams being used in posts related to the topic. Be sure to exclude images, audio and video from your search results, as they are not useful.
[Image: SocialMention results for “Michael Jackson”]
One quick note: the CSV export will only give you a list of URLs, sources, page titles and main ideas. You will still have to extract the data manually, or with some of the ImportXML magic that Tom Critchlow debuted earlier this year.
So What’s the Point?
So what does all of this get me? Well, today it got me "michael jackson trial," "jackson trial," "south park" and "heard today." So if I were looking to do some content around Michael Jackson, I’d find out what news came to light in court, illustrate the trial and the news in a blog post using South Park characters, and fire it off to all the influencers that tweeted about it. Need I say more? You can now easily figure out, in real time, what type of content would make viral link bait.
So this sounds like a lot of work to get the jump on a few keywords, doesn’t it?
Well, I can definitely relate, especially since I am a programmer and it’s quite painful for me to do any repetitive task. Seriously, am I really going to sit in Excel and remove stop words? No, I’m not, and neither should you. Whenever a methodology like this pops up, the first thing I think about is how to automate it. Ladies and gentlemen, I’d like to introduce you to the legendary GoFish real-time keyword research tool.
GoFish Screenshot
I built this from Ron’s methodology, and it uses the Topsy, Repustate and SEMRush APIs. When I get some extra time I will include the SocialMention API, and hopefully Google will cut the lights back on for my AdWords API access as well.
I seriously doubt it will handle the load that comes with being on the front page of SEOmoz, as it is only built on 10 proxies and each of these APIs has substantial rate limitations (Topsy – 33k/day, Repustate – 50k/month, SEMRush – I’m still not sure), but here it is nonetheless. If anyone wants to donate some AWS instances or a bigger proxy network to me, I’ll gladly make this weapons grade. Shout out to John Murch for letting me borrow some of his secret stash of proxies, and shout out to Pete Sena at Digital Surgeons for making an all-purpose GUI for my tools.
Anyway, all you have to do is put in your keyword, press the button, wait a bit, and voilà – you get output that looks like this:
GoFish Screenshot 2
The output is the top 10 N-grams, the combined Klout score of all the users that tweeted each N-gram vs. the highest combined Klout score possible, all of the users in the data set that tweeted them, and the search volume if available.
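That achieved-vs-possible Klout comparison can be sketched as follows; the user names and scores here are invented for illustration:

```python
def klout_ratio(ngram_users, all_user_scores):
    """Compare the combined Klout score of users who actually tweeted an
    N-gram against the maximum possible (every user in the data set)."""
    achieved = sum(all_user_scores[u] for u in ngram_users)
    possible = sum(all_user_scores.values())
    return achieved, possible, achieved / possible

# Made-up scores for three users in the data set.
all_user_scores = {"alice": 62, "bob": 48, "carol": 25}

# Suppose only alice and bob tweeted the N-gram "jackson trial".
achieved, possible, ratio = klout_ratio({"alice", "bob"}, all_user_scores)
print(achieved, possible, round(ratio, 2))  # → 110 135 0.81
```

A high ratio suggests the term is being pushed by influential accounts rather than spammy profiles.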
So that’s GoFish. Think of it as a work in progress but let me know what features will help you get more out of it.
Until Next Time…
That’s all I’ve got for this week, folks. I’ll be back soon with another real-time keyword research tactic and tool. If you haven’t checked out my keyword-level demographics post yet, please do! In the meantime, look for me in the chatroom for Richard Baxter’s Actionable Keyword Research for Motivated Marketers webinar.


EC launches investigation into Euro broadband speeds

The European Commission is launching an investigation into broadband speeds across Europe.

It aims to give an independent analysis of data collected in order to give a proper idea of which countries have the fastest, and indeed slowest, connections across Europe.

With the UK seeing continued delays to the 4G roll out it is likely we will end up in the latter category at this rate.

Information will apparently be open to ISPs, regulators and all consumers once it is collated in partnership with measurement firm SamKnows.

The project will involve enlisting the help of 10,000 volunteers who will be sent a small device to plug into their home internet.

SamKnows has already conducted a similar project with Ofcom, which found that broadband performance was less than half of that advertised. Hopefully the EC project will put an end to such scallywag behaviour amongst operators.

Or at least it will give members of the public an idea of which providers are closer to achieving their advertised connection speeds, and which are way off with bogged down networks.

Alex Salter, chief executive of SamKnows, said the test would be on a much bigger scale than the UK project. According to a BBC report, he said: “We are working towards a standard for measuring internet performance and a public dataset that everyone can access.”

“This is vital for everything from major government investment initiatives through to helping consumers choose which broadband provider and package.”

When the coalition government came to power, it announced that come 2015 the UK would be top broadband dog in Europe. We aren’t holding our breath.

Accidental Noindexation Recovery Strategy and Results

Posted by chadburgess

"I know before the cards are even turned over…" – Mike McDermott, Rounders

When Mike McD was called by Teddy KGB in a huge No-Limit Hold’em poker pot, he didn’t have to see his opponent’s hand to know that KGB had two aces, the only hand in the deck that could beat his nines full of aces (if you have seen Rounders, feel free to skip over the video below; if not, you probably should get on that). This was the same feeling I had when we got "SERP a DERPd" via accidental noindexation of 9,000 of our most important pages…



In this post I’ll cover:

  1. What happens when pages are accidentally noindexed
  2. Tactics for getting pages into the Google index quickly
  3. How noindex impacts SERP rankings

(note that I am focusing on Google in this post) 


I am an in-house SEO and customer acquisition marketer at SeatGeek.com, a NYC tech startup. Our site is a ticket search engine for sports, concert and theater tickets (i.e. "a Kayak for event tickets").

On Monday 8/1, I was searching Google for ‘mets tickets’ and saw that SeatGeek had slipped from page 1. Worse, we weren’t even on page 2. I tried a few more queries that I knew we should be on page 1 for, and still nothing. My heart was racing. Had we been Panda’d? It didn’t make sense, but I was panicked. Then it hit me. I opened up our New York Mets page, but, just like Mike McD, I knew before I even clicked View Source… content="noindex" on all of our product pages.

No Index

I have only been doing SEO for ~2 years, so I had never directly experienced an accidental noindex situation. So even as I read reports of noindex not having a lasting impact on rankings, and knew this wasn’t as bad as an accidental canonicalization problem, I couldn’t help but envision the worst-case scenario: 9,000 of our most important conversion-driving pages would be out of the index for weeks and would not have the same rank when they got back in.
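In hindsight, a small monitoring script can catch this class of mistake early. Here’s a minimal sketch (Python standard library only) that flags a robots noindex meta tag in a page’s HTML; in practice you’d fetch your key URLs on a schedule and alert on any hit:

```python
from html.parser import HTMLParser

class NoindexDetector(HTMLParser):
    """Flags <meta name="robots" content="...noindex..."> tags."""
    def __init__(self):
        super().__init__()
        self.noindexed = False

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            d = dict(attrs)
            if (d.get("name", "").lower() == "robots"
                    and "noindex" in (d.get("content") or "").lower()):
                self.noindexed = True

def is_noindexed(html):
    parser = NoindexDetector()
    parser.feed(html)
    return parser.noindexed

page = '<html><head><meta name="robots" content="noindex, follow"></head><body></body></html>'
print(is_noindexed(page))  # → True
```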

What happens when pages are accidentally noindexed

Impact of Accidental Noindexation

This is a chart of incoming organic traffic to one of our key pages right when the noindex hit.

Obviously organic traffic ceased to exist. Interestingly though, Google Analytics still reported some traffic to these pages.

This might be the one instance where a lower crawl frequency can be beneficial (assuming bandwidth isn’t an issue). The pages that got noindexed are recrawled every 4-6 days, which would have given us a buffer if we had caught this sooner. Unfortunately, Google waited until Saturday to crawl these pages, and we didn’t catch the problem until Monday.

Reindexation Plan and Tactics:

The first course of action was to remove the noindex tags, which one of our pop star engineers did within five minutes. This was right around the time I sent out my first plan-of-action email, which I have included below in case you ever have to write the same one:

So I was doing a daily scan of SERP positions and started noticing team/band pages had dropped. At first I thought we got Panda’d, but it looks like the noindex tags that are supposed to be applied to search pages and filtered navigation recently got pushed into production. Because those pages only get reindexed every 3-6 days, there was some delay in the traffic impact, which you can see if you filter by team/band pages.
We are currently:

  • Noindex already removed in production
  • Writing blog posts that link to all major sports teams to get these reindexed (more difficult for bands)
  • Launching social media campaigns to support this cause
  • Forcing update on .xml sitemap (hopefully to help with concerts issue)
  • Investigating additional techniques
  • Going to look into the current traffic impact / which pages got impacted the most (hopefully some deeper artist type pages never got recrawled before the fix)

Here’s to hoping this is true (from a WebmasterWorld thread, http://www.webmasterworld.com/webmaster/3601620.htm): "My experience is that ‘noindex’ is quite harmless when it comes to ranking. As soon as you change it to ‘index’, the pages should pop up at nearly the same positions in the SERPs as where they were." I will keep you all posted. -Chad

Even if rankings would come back, we wanted this to happen as quickly as possible. I had a plan, and fortunately some great interns to help me out. So this is what we did (excuse any repetition from the email)…

Submit to Google Index via Webmaster Tools

All of the above was completed within one hour of us discovering the issue, except for the guest posts and contest, which were done over the next 1-2 days. And then we waited…
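For the sitemap resubmit, Google has also accepted a simple sitemap "ping" GET request in addition to the Webmaster Tools UI. Here is a sketch that builds the ping URL (the sitemap filename is hypothetical):

```python
from urllib.parse import urlencode

def sitemap_ping_url(sitemap_url):
    """Build Google's sitemap ping URL; issuing a GET to it asks Google to
    refetch the sitemap."""
    return "https://www.google.com/ping?" + urlencode({"sitemap": sitemap_url})

# Hypothetical sitemap location for illustration.
url = sitemap_ping_url("https://seatgeek.com/sitemap-products.xml")
print(url)
# → https://www.google.com/ping?sitemap=https%3A%2F%2Fseatgeek.com%2Fsitemap-products.xml
```

You would then fire the GET with any HTTP client (or just curl the URL).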

Reindexation Metrics:

It took 1-2 days for our most important pages to get back into the index, which we were really happy with. Some of our deeper / less important pages took up to 5 days to come back, or longer in some cases. Fortunately, we had followed advice from other Mozzers and introduced multiple XML sitemaps earlier in the year; with all our product pages in one XML sitemap, we were able to easily track indexation of these pages via Google Webmaster Tools. Indexation and traffic were on their way back up by the next day, but as you can tell from the graph below, traffic didn’t return to previous levels until about 2-3 days after the noindex tag was removed.
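Splitting pages into separate sitemap buckets like this can be scripted. A minimal sketch using Python’s standard library; the bucket names and URLs are illustrative:

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Serialize a list of URLs into one sitemap bucket."""
    root = ET.Element("urlset", xmlns=NS)
    for u in urls:
        loc = ET.SubElement(ET.SubElement(root, "url"), "loc")
        loc.text = u
    return ET.tostring(root, encoding="unicode")

# Hypothetical buckets: one sitemap per page type, so indexation of each
# type can be tracked independently in Webmaster Tools.
buckets = {
    "sitemap-teams.xml": ["https://seatgeek.com/new-york-mets-tickets"],
    "sitemap-performers.xml": ["https://seatgeek.com/taylor-swift-tickets"],
}
for name, urls in buckets.items():
    sitemap_xml = build_sitemap(urls)
    print(name, len(urls), "urls")
```

Each bucket then gets its own entry in a sitemap index file and its own indexation count in Webmaster Tools.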

Noindexed Page Traffic Before and After


Rankings Impact of Noindexation:

Ranking After Accidental Noindexation

Now let’s look at how this impacted our SERP rankings. The example above was a truly interesting case: our Mets page returned to the index the night of the fix, and I emailed my bosses to check it out as a good example of a recovering page, but by the time we got into work the next morning it had left the index again and I looked like a clown shoe. Fortunately, the page came back (again…) into the index the next day and was back up to its previous ranking by the end of the week. This is an example of a trend I noticed: many pages would come back into the index first, and then return to ranking for their target terms a day or so later.

The example below is one where we returned to the index but without the same rank as we had before. There isn’t really a way to tell if this was impacted at all by the noindex situation; I suspect it was just a random Google dance related to the more frequent shakeups I have seen in event "tickets" related queries. Overall, our page 1 SERP positions have completely returned to prior levels.

Giants Ranking after noindex


  • If you accidentally noindex pages on your site, they will of course stop getting traffic from organic search, but the timing depends on the crawl rate of the pages (in our case it took ~5 days for them to drop out of the index, and 2-3 days after the fix for traffic to return to normal levels)
  • If you have a blog that gets crawled quickly, use that as a tool to help drive spiders back to the pages that were noindexed with strategic internal linking (of course wait until you have removed the noindex tag)
  • Take advantage of friends & family to help with social shares and pump this up with a social giveaway
  • Use Google Webmaster Tools: 1) XML sitemap resubmit, 2) manual ‘Submit to Index’, 3) sitemap indexation tracking
  • Set up multiple XML sitemaps in logical buckets to facilitate the indexation tracking mentioned above
  • Although your rankings might see short-term "dancing", an accidental noindex will not have a negative impact on them
  • Lastly, don’t be too worried, just follow some of the tactics above and you should be back in the index with the same rankings (have your boss email me if they are giving you crap – chad@seatgeek.com)

Ok, so that was probably too much information for just an accidental noindex situation, but when it happened to me it was scary and there wasn’t solid documentation on what to expect, so I wanted to produce this for the next person in my situation. Thanks for reading. Connect with me on Twitter if you are so inclined.
