Live Tutorial: Cleaning up and auditing 404s (broken links) on our website

While I’m only aware of one way to skin a cat, there are a number of ways to find and clean up 404 errors on your website. We aren’t huge fans of SEO tools for the most part, but there is one that we highly recommend for finding broken links on a website: Xenu’s Link Sleuth.

Xenu is one of those tools that has been around forever and that everyone uses. PC Magazine voted it the “fastest link checking software,” which should say a lot.

One very common misconception is that 404s will hurt your site or earn you some sort of penalty. Google debunked this myth back in 2011 in an official blog post, and it was reiterated this year by Google’s new unofficial spokesperson and Webmaster Trends Analyst, Gary Illyes.

But don’t get too comfortable – just because Google won’t penalize you doesn’t mean it is good for business. After all, most people hate landing on a broken link or 404. A lot of times this will actually cause visitors to “bounce” both figuratively and literally from your website.

We’ve been blogging since 2010 on this website and over the years have linked out to hundreds if not thousands of different websites and blogs. As months turn into years, many of those websites disappear or change their address by:

While one or two broken links aren’t really a big deal, on a large website such as this one, which is approaching 1,000 indexed pages, they can really start to add up. The process of finding and fixing broken links is simple, but the devil is in the details. In this tutorial, I’ll show you how I was able to fix hundreds of broken links on our website in just a few hours.

What You’ll Need

In order to carry out this process you’ll need:

If you feel confident in your abilities, proceed with caution.

Beginning the Broken Link Audit

Skill Level: Intermediate. We don’t consider this a beginner task because many things can go wrong. You need some knowledge of servers and websites, and strong overall attention to detail. With that said, proceed with caution.

Launch Xenu the way you would any Windows app. Enter your URL in the top input and hit “OK.” You can ignore the other options for now and go back and experiment with them later if you want.

[Screenshot: Xenu]

Warning: keep in mind that Xenu can really tax your web host and your ISP. We used to joke around in our office and call it our own “DDoS attack tool” because it would actually take down some of our customers’ websites.

Anyway, you’ll see it start to run. Watch and learn. Take note of anything that turns red and familiarize yourself with the columns. If you’d like, you can use the waiting time to do a mini title tag and meta description audit, another feature of Xenu. You can sort any column by clicking on the column title. I sort most of my columns by either “name” or “status.”
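
If you would rather script that mini audit yourself, here is a minimal Python sketch using only the standard library. This is not how Xenu does it. The `HeadAudit` class and `audit_head` function are names I made up for illustration, and the sketch assumes you already have the page's HTML as a string.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collects the <title> text and the meta description from one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None  # stays None if the page has no meta description
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "description":
                self.description = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_head(html):
    """Return the title and meta description found in an HTML string."""
    parser = HeadAudit()
    parser.feed(html)
    return {"title": parser.title.strip(), "description": parser.description}
```

An empty title or a `None` description is the kind of thing I would flag for a closer look.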

[Screenshot: scan]

Once Xenu is done running, it’ll politely ask you for your FTP details. This will allow the tool to do a deep scan for orphan files.

[Screenshot: FTP]

As I said earlier, try to take note of anything interesting and look for patterns. In our case, we had a small outage around November/December of 2011 that caused us to lose some images. Sadly, this was reflected in the tool, which reported a number of “not founds” throughout our site. We’ll fix those later.

I was running this tool on a virtual machine without a lot of RAM. If it freezes up, do not abort unless it stays frozen for 10+ minutes. A lot of times the tool will hit a bottleneck and you’ll just have to wait it out.

[Screenshot: not responding]

Once finished, the tool will minimize and a report will open in your web browser. Use File > Save As to save this report for later.

[Screenshot: broken links report]

I like to save incremental versions of my reports just in case I ever want to go back.

[Screenshot: save for later]

OK, now down to the nitty-gritty. One by one, go to each URL on your website and seek out the broken link. In this case, the example shows a broken image.

[Screenshot: find 404s]

I double-check each URL just in case the tool misfired. In this case, the image was in fact “not found” and needed to be fixed.
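
I did my double-checking by hand in the browser, but the same check can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `recheck` is a made-up helper name, it sends a HEAD request (which some servers refuse), and because `urlopen` follows redirects, a link that 301s to a working page will come back as 200 rather than 301.

```python
import urllib.error
import urllib.request


def recheck(url, timeout=10, opener=urllib.request.urlopen):
    """Return the HTTP status code for url, or None if no response came back.

    `opener` is only a parameter so the function can be exercised offline
    with a stand-in; by default it is the real urllib.request.urlopen.
    """
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-recheck-sketch"}
    )
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError; the code is the answer.
        return err.code
    except urllib.error.URLError:
        # DNS failure, refused connection, timeout and similar.
        return None
```

A result of 404 confirms what the report said. A 200 suggests the tool may have caught the server on a bad moment, and `None` means the host could not be reached at all.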

[Screenshot: find 404s]

Thanks to Xenu, I was able to find the image located inside of an old blog post.

[Screenshot: find 404s]

The next step is to pop open the source code and fix the error. In this case, since it was a very unpopular post, I just deleted the image reference.

An alternative would be to re-create the image in Photoshop, or find a replacement on the web, and drop it back in via FTP at the 404’d path.
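
For the first option, deleting the image reference, I made the edit by hand. If you had many posts with the same dead image, something like the following Python sketch could do it. `drop_image` is a hypothetical helper, and it uses a simple regular expression that assumes a plain quoted `src` attribute, so treat it as a rough illustration and not a general HTML editor.

```python
import re


def drop_image(html, src):
    """Remove every <img> tag whose src attribute is exactly `src`."""
    pattern = re.compile(
        r"<img\b[^>]*\bsrc=[\"']" + re.escape(src) + r"[\"'][^>]*>",
        re.IGNORECASE,
    )
    return pattern.sub("", html)
```

Only tags with an exact `src` match are removed, so other images in the same post are left alone.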

[Screenshot: find 404s]

Sometimes the tool will report multiple 404s on CMSs like WordPress due to crazy URL parameters, when in actuality there is only one 404.
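
One way to spot these duplicates is to group the reported URLs by everything before the question mark. Here is a minimal Python sketch of that idea. `group_by_page` is a name I made up, and the approach assumes the parameters are noise; on a site that uses query strings for real pages (for example `?p=123` style permalinks) it would lump distinct pages together.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def group_by_page(urls):
    """Group URLs that differ only in their query string or fragment."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        # Rebuild the URL without the query string and fragment.
        page = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
        groups[page].append(url)
    return dict(groups)
```

If five reported 404s collapse into one group, there is most likely one thing to fix, not five.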

[Screenshot: find 404s, don’t worry about multiple]

In another case, Xenu reported a “not found” error because of a wacky URL parameter at Mashable.com. Even though the site redirected properly, I still fixed it, since that is “best practice.”

[Screenshot: it’s not always a 404, it could be slightly modified]

Other blogs change their entire permalink structure for their own reasons. Again, it is best practice to fix these. Google will love you.
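
If you have collected a list of old and new addresses, updating the links in a post can be sketched in Python like this. The `rewrite_links` function and the `redirects` dictionary are my own illustration, not part of Xenu or WordPress, and the sketch only looks at quoted `href` attributes.

```python
import re


def rewrite_links(html, redirects):
    """Replace href values that appear as keys in `redirects` with their new URL."""

    def swap(match):
        quote, url = match.group(1), match.group(2)
        # Only exact matches are swapped; every other link is left as it was.
        return "href=" + quote + redirects.get(url, url) + quote

    return re.sub(r"href=([\"'])(.*?)\1", swap, html)
```

Matching the whole attribute value avoids the classic find-and-replace accident where one URL happens to be the beginning of another.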

[Screenshot: fix any 301s]

Xenu will also find “not found” errors caused by 301 redirects. There have been a ton of these this year because so many websites are switching to SSL/HTTPS. If you only have a few of these, you can fix them manually. If you are feeling adventurous, you can fix them in bulk using a few different tools.
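
To decide between manual and bulk, it helps to know which redirects are nothing more than an http to https switch and which hosts account for most of them. The Python sketch below shows one way to work that out from a list of (old URL, new URL) pairs. Both function names are hypothetical, and how you collect the pairs is up to you.

```python
from collections import Counter
from urllib.parse import urlsplit


def is_https_upgrade(old, new):
    """True if `new` is the same address as `old` with only http changed to https."""
    a, b = urlsplit(old), urlsplit(new)
    return (
        a.scheme == "http"
        and b.scheme == "https"
        and a.netloc.lower() == b.netloc.lower()
        and (a.path or "/") == (b.path or "/")
        and a.query == b.query
    )


def upgrade_counts(pairs):
    """Count pure https upgrades per host, to show where a bulk fix pays off."""
    return Counter(
        urlsplit(old).netloc.lower() for old, new in pairs if is_https_upgrade(old, new)
    )
```

A host that shows up hundreds of times is a good candidate for a bulk fix. A host that shows up twice is quicker to fix by hand.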

[Screenshot: or just SSL]

Since we are an SEO blog and link out to Moz.com a lot, we found hundreds of links to Moz.com. Because they switched to SSL/HTTPS last year, this triggered a number of “not founds” on our report. We could edit each post manually, but since our content lives in a database, we can run a find and replace instead.

[Screenshot: bulk 301 fix]

Please, please, please be careful doing this and make sure you back up your database. The code below will only work with WordPress. Again, use at your own risk.

This short SQL query will replace all “http://moz.com” with “https://moz.com”.

Just an FYI: you’ll also have to fix any www vs. non-www URLs.
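
To keep track of the www and non-www variants, I find it easiest to write out every find and replace pair up front. Here is a small Python sketch that lists them. `replacement_pairs` is a made-up helper, and it assumes every variant of the domain should end up at the one canonical address you pass in, which you should confirm against how the site actually redirects.

```python
from urllib.parse import urlsplit


def replacement_pairs(canonical):
    """List (old, new) prefixes that all point at one canonical scheme and host."""
    parts = urlsplit(canonical)
    host = parts.netloc
    bare = host[4:] if host.startswith("www.") else host
    target = parts.scheme + "://" + host
    variants = [
        scheme + "://" + name
        for scheme in ("http", "https")
        for name in (bare, "www." + bare)
    ]
    # Skip the variant that is already the canonical address.
    return [(old, target) for old in variants if old != target]
```

For `https://moz.com` this gives three pairs, one for each of `http://moz.com`, `http://www.moz.com` and `https://www.moz.com`, and each pair would be its own find and replace.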

[Screenshot: bulk 301 fix in phpMyAdmin]

Here is the MySQL query I ran to find and replace http with https on our own server.

UPDATE wp_posts SET post_content = REPLACE(post_content, 'http://example.com', 'https://example.com');

This is by no means a difficult task, but it can be time-consuming. If you are a savvy user, you can use SQL queries to fix a lot of other URLs as well.
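
If you want to rehearse a find and replace before touching a live database, here is a minimal Python sketch. It uses an in-memory SQLite table as a stand-in for `wp_posts`, not MySQL, so it only demonstrates the idea of counting the affected rows first and then running the replace. `rehearse_replace` is a name I made up for illustration.

```python
import sqlite3


def rehearse_replace(posts, old, new):
    """Run the find and replace on a throwaway copy of some post content.

    Returns (number of posts containing `old`, list of post content afterwards).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_content TEXT)")
    conn.executemany(
        "INSERT INTO wp_posts (post_content) VALUES (?)", [(p,) for p in posts]
    )
    # Count first, so you know how many rows the UPDATE is about to change.
    (affected,) = conn.execute(
        "SELECT COUNT(*) FROM wp_posts WHERE instr(post_content, ?) > 0", (old,)
    ).fetchone()
    conn.execute(
        "UPDATE wp_posts SET post_content = REPLACE(post_content, ?, ?)", (old, new)
    )
    updated = [row[0] for row in conn.execute("SELECT post_content FROM wp_posts ORDER BY ID")]
    conn.close()
    return affected, updated
```

If the count is wildly higher or lower than the number of broken links in the report, that is a good reason to stop and look at the search string before running anything for real.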

Wrapping up

Once I am done with a 404 audit, I’ll generally go back and run it again. A lot of times I’ll have missed a few, but I also want to keep a record of what the site looks like when it is running well.
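
Comparing the two runs is easy to sketch in Python if you have the broken URLs from each run as plain lists. `compare_runs` is a hypothetical helper, and how you get the URLs out of the saved reports is left to you.

```python
def compare_runs(before, after):
    """Split broken URLs into fixed, still broken, and newly broken."""
    before, after = set(before), set(after)
    return {
        "fixed": sorted(before - after),
        "remaining": sorted(before & after),
        "new": sorted(after - before),
    }
```

The "remaining" list is the handful I missed, and anything under "new" is worth a look, since it broke between the two runs.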

In this case we were able to clean up a ton of broken links; all in all, we fixed over 1,000. I would say over 50% of these were http to https fixes, and a large portion were 301s.

Once a website starts to get large, a lot of issues start to pop up, and fixing broken links is just one of them.

I hope you enjoyed this live tutorial. If you have any questions, please feel free to reach out to us on our contact page or on the Twittah.

 

Patrick Coombe
Patrick Coombe is the founder and CEO of Elite Strategies LLC. Patrick takes a hands-on approach to managing Elite Strategies and loves to get involved with technical projects relating to clients’ inbound marketing needs.
  1 COMMENT
  • Written by: Precious

    Thanks dear. recently, i moved my forums http://jackobian.com from phpbb to xenforo forum software and i really have to deal with these bad links that popped up from external links i have already acquired.
