Seeing what Sitebeam sees

Sometimes Sitebeam will look at a website and see something different from what you see in your browser. This can cause confusion, and test results that don’t match what you would expect.

Diagnosing a problem usually requires some understanding of web technologies like HTML and HTTP headers. If you are unsure about this, we recommend you contact us for support.

Taking a look at a single webpage

This simple check can identify most common problems:

  1. Click the Account tab.
  2. Click Test URL.
  3. Enter a web address in the URL box.
  4. Click OK.

Sitebeam will download that URL and show you a series of diagnostics about exactly what it sees, including the HTTP headers, cookies, files, timings and the internal checks that Sitebeam carries out. In particular, Sitebeam will reveal how it classifies the URL – for example, is it an HTML page, or a redirection?
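
If you would like to reproduce a similar check outside Sitebeam, the Python sketch below fetches a URL without following redirects and classifies the response in roughly the way described above. It is an illustration only: the names classify and inspect, and the classification rules themselves, are assumptions made for this example and are not Sitebeam's actual logic.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so they can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # returning None makes urllib raise HTTPError instead


def classify(status, headers):
    """Rough classification of a response (assumed rules, not Sitebeam's)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if 300 <= status < 400 and "location" in lowered:
        return "redirection"
    if status >= 400:
        return "error"
    content_type = lowered.get("content-type", "").split(";")[0].strip().lower()
    if content_type in ("text/html", "application/xhtml+xml"):
        return "html page"
    return "other"


def inspect(url, timeout=10):
    """Return (status, headers, classification) for a single URL."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            status = response.status
            # dict() keeps only one value per repeated header such as Set-Cookie
            headers = dict(response.headers.items())
    except urllib.error.HTTPError as err:
        # Unfollowed redirects and 4xx/5xx responses arrive here,
        # still carrying their status code and headers.
        status = err.code
        headers = dict(err.headers.items())
    return status, headers, classify(status, headers)
```

Calling inspect with a web address of your own returns the status code, the response headers and the classification, which you can compare with what Test URL reports.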

Sometimes a single URL viewed like this will appear to work exactly as it does in a browser. In that case, the next step is to see exactly what Sitebeam saw when it spidered your website.

Inspecting the spider log in detail

You can see in great detail exactly what Sitebeam saw when it spidered a website. View a report, and in the very bottom left of the report click the xx pages tested link.

On this screen you can see each page encountered by Sitebeam as it attempted to spider the site, in the order they were encountered. By clicking the Advanced options link at the top, you can filter this list to include or exclude pages by many attributes, such as whether they were rejected, or their size.

Crucially, clicking on any page listed in this area will display the page exactly as Sitebeam saw it. This means a request will be made exactly like the one Sitebeam made at the time it tested the site – not just the URL, but the cookies and HTTP headers too. For some complex diagnostics, the cookies and headers can significantly affect what Sitebeam saw.
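
To see why cookies and headers matter, it can help to rebuild such a request by hand. The sketch below assembles a request from a URL, a set of headers and a set of cookies using Python's standard library. The function name build_replay_request is hypothetical, and this is not how Sitebeam itself replays requests; it only shows what "the same request" consists of.

```python
import urllib.request


def build_replay_request(url, headers, cookies):
    """Build a request carrying the given headers and cookies.

    headers and cookies are plain dicts, for example values copied
    by hand from a diagnostics screen.
    """
    request = urllib.request.Request(url, headers=dict(headers))
    if cookies:
        # Cookies travel in a single Cookie header as name=value pairs
        cookie_header = "; ".join(
            f"{name}={value}" for name, value in cookies.items()
        )
        request.add_header("Cookie", cookie_header)
    return request


def replay(request, timeout=10):
    """Send the request and return the status code and raw body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read()
```

Sending the same URL once with and once without the cookies, then comparing the two bodies, is a quick way to tell whether a session or preference cookie is changing what the site returns.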

Common problems

These are known reasons why Sitebeam may see something different from what you see in your browser:

  • The website contains random elements. Some webpages change every time they are viewed. There is no way around this other than to modify the website.
  • The website behaves differently when accessed from different physical locations. For example, some websites detect the location of the visitor and take them to different pages or show them different content automatically based on this. Our servers are generally based in the US and the UK, but this may change over time.
  • The website behaves differently when accessed by different browsers, operating systems or default languages. Some websites modify their content based on the user agent of the browser requesting the pages, for example to provide mobile content. Some attempt to detect the user’s system language based on their HTTP headers. Any of these settings can have an effect.
  • The website may be using anti-bot security. Some websites attempt to detect the behaviour of bots (such as Sitebeam) and prevent them from accessing their webpages. Generally these rely on detecting intentionally obscure behaviour that is hard to emulate, or on limiting the rate at which webpages can be downloaded. You can force Sitebeam to spider more slowly, or with a different user agent, if you wish. Go to Site settings > Pages to test and review the advanced options at the bottom. However, some anti-bot technology cannot be circumvented in this way.
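
One way to check whether a site varies its content by user agent or language is to request the same URL under several sets of headers and compare the results. The Python sketch below does this, pausing between requests so as not to trip rate limits. The function names, the one-second default delay and the example user agent strings are all assumptions made for illustration; none of them come from Sitebeam.

```python
import hashlib
import time
import urllib.request


def fetch_body(url, headers, timeout=10):
    """Download a URL with the given request headers and return the raw body."""
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def compare_profiles(url, profiles, fetch=fetch_body, delay=1.0):
    """Fetch one URL under several header profiles.

    profiles maps a profile name to a dict of request headers.
    Returns a dict mapping each profile name to a SHA-256 digest of the body.
    """
    fingerprints = {}
    for index, (name, headers) in enumerate(profiles.items()):
        if index:
            time.sleep(delay)  # pause between requests to stay under rate limits
        fingerprints[name] = hashlib.sha256(fetch(url, headers)).hexdigest()
    return fingerprints


def varies(fingerprints):
    """True if at least two profiles received different content."""
    return len(set(fingerprints.values())) > 1


# Example profiles (made-up user agent strings)
EXAMPLE_PROFILES = {
    "desktop-en": {"User-Agent": "ExampleDesktop/1.0", "Accept-Language": "en-GB"},
    "mobile-fr": {"User-Agent": "ExampleMobile/1.0", "Accept-Language": "fr-FR"},
}
```

Bear in mind that a page with random elements will also produce different fingerprints. Fetching the same profile twice first shows whether the page is stable enough for this comparison to mean anything.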

Contact our support team if you have a question.