Sitebeam can’t see any or enough pages

Sitebeam can usually figure out which pages belong to a website automatically from a single web address. However, some sites are more complex and require more advanced settings.

If the website uses SSL

If the website uses SSL, i.e. its web addresses begin with “https://” instead of “http://”, try this:

  • Go to Site Settings.
  • Click Pages to test.
  • Change SSL version (at the bottom) to Force v1.
  • Save your changes and re-test the site.

Check the web address is correct

Visit the web address that you have entered into Sitebeam, and check that it doesn’t actually take you somewhere else.

e.g. visiting www.microsoft.co.uk takes you to www.microsoft.com/en/gb/

Sitebeam detects this automatically in most cases, but a few sneaky websites use tricks that it can’t detect automatically.
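If you would rather check this outside a browser, the sketch below follows ordinary HTTP redirects and compares where you end up with the address you entered. It is only an illustration using Python's standard library, not how Sitebeam itself works; the function names are made up for this example, and final_address needs network access.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen


def normalise(address):
    """Add a scheme if one is missing so the hostname can be parsed."""
    return address if "://" in address else "http://" + address


def final_address(address, timeout=10):
    """Follow ordinary HTTP redirects and return the address you end up at.

    Redirects performed by scripts in the page are not followed here.
    """
    with urlopen(normalise(address), timeout=timeout) as response:
        return response.geturl()


def lands_elsewhere(entered, final):
    """Return True if the hostname or path changed (the scheme is ignored)."""
    a = urlsplit(normalise(entered))
    b = urlsplit(normalise(final))
    return (a.hostname, a.path.rstrip("/")) != (b.hostname, b.path.rstrip("/"))
```

For the example above, lands_elsewhere("www.microsoft.co.uk", "www.microsoft.com/en/gb/") is True, so the second address is the better one to enter.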

Check that your web address is actually the top page

For example, you usually want to test an address like:

www.example.com

Not an address like:

www.example.com/pages/subpages/1

If you test an unnecessarily specific web address like this, Sitebeam may not be able to find all the pages in your site, because it’s limited to pages that appear inside the address you provided.
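To see why a very specific address limits what can be found, here is a minimal sketch of one way "inside the address you provided" could be interpreted: same hostname, and a path that starts with the path you entered. This is an assumption made for illustration; Sitebeam's exact rule may differ, and in_scope is a hypothetical name.

```python
from urllib.parse import urlsplit


def in_scope(start, page):
    """Return True if `page` sits inside the address given by `start`."""
    s = urlsplit(start if "://" in start else "http://" + start)
    p = urlsplit(page if "://" in page else "http://" + page)
    if s.hostname != p.hostname:
        return False
    base = s.path.rstrip("/")
    # Either the very same page, or something underneath it.
    return p.path == base or p.path.startswith(base + "/")
```

Under this rule, www.example.com/about is in scope when you start from www.example.com, but out of scope when you start from www.example.com/pages/subpages/1.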

Check the site is not entirely made from Flash

If the site is made entirely from Flash, it does not have webpages that can be read in a conventional way. Sitebeam will see the pages with Flash, but it won’t see the Flash itself.

Bear in mind many other devices – including virtually all mobile phones and tablets – are unable to read Flash, and that Flash is discouraged by search engines.

Use Test URL

  • Go to Account.
  • Click on Test URL.
  • Enter your homepage into the URL text field.
  • Click Download.

This provides all sorts of information, from HTTP status codes to a list of links found on the page and all of the HTML that was downloaded. It is the first thing we do when diagnosing a problem. You can read more about the Test URL feature on the “Seeing what Sitebeam sees” page.

If you see a 403 (Forbidden) error and a message saying you do not have permission to access the page, go to Site Settings > Pages to test and set the User Agent to Sitebeam. Set the maximum connections to 1. Run the report again.

In rare cases, you’ll see a 403 (Forbidden) error even though the results list plenty of links to your own site and a full page of HTML. In this specific case, go to Site Settings > Pages to test and check the “Ignore http errors and spider everything (not recommended)” checkbox. Then run the report again.
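The two 403 cases above can be told apart by looking at the links in the downloaded HTML. The sketch below shows the idea using Python's standard library. It is not the Test URL feature itself: the function names are hypothetical, and treating "any links at all" as the rare case is a simplification of "lots of your own site's links".

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def find_links(html):
    """Return the link targets found in a string of HTML."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def suggest_next_step(status, html):
    """Map a status code and page body to the advice in this article."""
    links = find_links(html)
    if status == 403 and links:
        # Forbidden, yet the page content is there (simplified check).
        return "ignore HTTP errors"
    if status == 403:
        return "change user agent and set max connections to 1"
    if not links:
        return "no links found"
    return "ok"
```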

Test from a specific geographic location

Some sites can only be seen from a specific physical location, and some sites appear differently when viewed from different locations. For example, Google behaves differently in the UK than in the US.

By default Sitebeam downloads pages from servers based in the United States, but you can change this if you believe it will help:

  • Go to Site Settings.
  • Click Pages to test.
  • Select a specific location from the Location drop-down menu.
  • Save your changes and re-test the site.

Check for different hostnames

A hostname is the part of the web address before the first slash (ignoring any “http://” or “https://”), for example:

www.silktide.com/company

The hostname is www.silktide.com. For:

nibbler.silktide.com

The hostname is nibbler.silktide.com
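If you have a long list of addresses and want to see which hostnames they use, the definition above can be sketched in a few lines of Python. This is just an illustration with the standard library; the function name hostname is our own.

```python
from urllib.parse import urlsplit


def hostname(address):
    """Return the hostname of a web address, with or without a scheme."""
    if "://" not in address:
        # urlsplit only recognises the hostname when a scheme is present.
        address = "http://" + address
    return urlsplit(address).hostname
```

Here hostname("www.silktide.com/company") gives www.silktide.com and hostname("https://nibbler.silktide.com") gives nibbler.silktide.com, so the two addresses have different hostnames.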

Sitebeam does not consider different hostnames like these to be from the same site (i.e. www.silktide.com and nibbler.silktide.com are separate websites). However, sometimes you might disagree. If you want to combine multiple hostnames and test them together, you need to tell Sitebeam to do this:

  • Click Site settings
  • Click Pages to test
  • Uncheck the Guess what webpages belong to this website automatically box
  • In the Include URLs box, enter any hostnames you wish to test, one on each line
  • Click Save changes when you’re done and re-test the website as normal

Try spidering with fewer connections

Some sites appear to fall over when spidered normally, and this can cause the spider to only see error pages with no links on them. Setting Max connections to 1 forces Sitebeam to download only one page at a time (instead of 5). Your report will take much longer to run, but this can fix some issues with reports.

  • Go to Site Settings.
  • Click on Pages to test.
  • Under the Advanced options, use the dropdown menu to change Max connections from Automatic to 1.
  • Click Save changes and run the report again.
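To picture what the Max connections setting controls, here is a rough sketch of a downloader with a limit on simultaneous requests. It is an illustration only, not Sitebeam's spider: download_all and CountingFetcher are hypothetical names, and the fetcher is a stand-in that downloads nothing.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def download_all(addresses, fetch, max_connections=5):
    """Call fetch(address) for each address, at most max_connections at once."""
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        # map() returns results in the same order as the input addresses.
        return list(pool.map(fetch, addresses))


class CountingFetcher:
    """A stand-in for a real download that records peak concurrency."""

    def __init__(self):
        self.active = 0
        self.peak = 0
        self._lock = threading.Lock()

    def __call__(self, address):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        # A real fetcher would download the page here.
        with self._lock:
            self.active -= 1
        return address
```

With max_connections=1 the fetcher never sees more than one request in flight, which is gentler on a fragile server at the cost of a slower run.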

The site requires JavaScript

Some badly behaved sites can only be accessed with JavaScript enabled. Usually these sites set a cookie in the browser using JavaScript, redirect the browser and then check for the cookie – if it isn’t set they display an error (or worse, crash).

The only way to spider these sites is to tell Sitebeam to set the cookie manually:

  • Visit the site and inspect what cookies are set for that site in your browser. The details for how to do this will vary depending on your browser and operating system.
  • In Sitebeam, click Site settings for the website you wish to test.
  • Click Manual steps.
  • Click New step.
  • Select Cookies and click OK.
  • In the cookies table, enter the name of any required cookies and their values.
  • Save your changes and retest the site as normal.

This is a more advanced technique so please contact us if you need help.
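For the curious, setting a cookie manually simply means sending it with each request, without running any JavaScript. The sketch below shows what that looks like with Python's standard library. The cookie names and values are invented for the example, and request_with_cookies is a hypothetical helper rather than anything in Sitebeam.

```python
from urllib.request import Request


def request_with_cookies(address, cookies):
    """Build a request that sends the given cookies in a Cookie header."""
    header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return Request(address, headers={"Cookie": header})


# Hypothetical cookie names; use the ones you saw in your browser.
request = request_with_cookies(
    "http://www.example.com/",
    {"js_enabled": "1", "session": "abc"},
)
```

Passing the request to urllib.request.urlopen would then download the page as if the cookie had already been set.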

Turn off cookies

Some web servers set cookies to stop spiders, so you may need to stop the spider from accepting and sending back cookies.

  • Go to Site Settings
  • Click on Normalizer
  • Click on Cookie analyzer
  • Uncheck the “Accept any cookies?” box
  • Click OK

Run the report as you would normally.

Change the User Agent

Some web servers don’t like spiders, and will prevent Sitebeam from downloading the pages on the site. Changing the User Agent often resolves problems with a site that returned a 403 error when you tried to add it to the system.

  • Go to Site Settings.
  • Click Pages to test.
  • Select Sitebeam 5.x.x from the User Agent drop-down menu.
  • Save your changes and re-test the site.
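A User Agent is just a header sent with every request, which is why a server can treat different spiders differently. As a rough illustration, this is how a request with a chosen User Agent could be built in Python. The value shown is a placeholder, not the string Sitebeam really sends, and request_as is a hypothetical helper.

```python
from urllib.request import Request

# Placeholder value for illustration; not Sitebeam's real User-Agent string.
EXAMPLE_USER_AGENT = "ExampleSpider/1.0"


def request_as(address, user_agent=EXAMPLE_USER_AGENT):
    """Build a request that identifies itself with the given User-Agent."""
    return Request(address, headers={"User-Agent": user_agent})
```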

If all else fails

Use the Contact link in the footer of Sitebeam to get in touch. Be sure to specify which website you were testing that failed (the easiest way to do this is to supply the web address you were testing).
