Why some websites cannot be tested

Not every website can be tested by Sitebeam, or by any other automated tool. This page explains why.

Flash-only websites

Sitebeam can only understand websites written in HTML, which is the language the overwhelming majority of websites use. Some websites are built from an alternative technology called “Flash”.

Flash websites cannot be interacted with like HTML websites: they do not have separate pages or web addresses, and the content within them is either unreadable or stored in a radically different way from HTML. As a result, Sitebeam can’t see the contents of Flash pages, or interact with them to explore them.

Flash is a dying technology, not supported by mobile or tablet devices, and the creator of Flash (Adobe) has abandoned attempts to bring Flash to such devices. The use of Flash is therefore widely discouraged.

AJAX websites

AJAX is a technology used to make parts of a page load ‘on demand’, without the user having to visit another webpage. For example, Facebook uses AJAX to update the number of notifications you have in the corner of the screen, without you having to leave the current page.

Crawling AJAX sites is technically difficult and in some cases impossible. Sitebeam can crawl AJAX websites that follow Google’s AJAX Crawling specification, but websites that do not may be impossible for Sitebeam to test.
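
As an illustration of what that specification asks of a crawler, the sketch below converts a "hash-bang" (#!) web address into the _escaped_fragment_ address that the crawler requests from the server instead. The function name is hypothetical and the escaping is deliberately more conservative than the specification strictly requires; this is a sketch of the scheme, not Sitebeam's own code. Note that Google has since deprecated this scheme.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def escaped_fragment_url(url):
    """Map a '#!' URL to its '_escaped_fragment_' form, or return None.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    # Only fragments that start with '!' opt in to the crawling scheme.
    if not parts.fragment.startswith("!"):
        return None

    # Percent-encode characters such as '%', '#', '&', '+' and spaces.
    state = quote(parts.fragment[1:], safe="=/:;,!?@$*'()")
    param = "_escaped_fragment_=" + state

    # Append to any existing query string rather than replacing it.
    query = parts.query + "&" + param if parts.query else param
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    print(escaped_fragment_url("http://example.com/page#!key=value"))
```

A website whose addresses use a plain # fragment (or no fragment at all) gives the crawler nothing to convert, which is one reason such sites may be impossible to test.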

Not accessible over public Internet

Sitebeam can only access pages that are visible over the public Internet; more specifically, pages that can be seen by our clusters of servers hosted by Amazon Web Services.

If a website is only available on your private network, or is otherwise restricted so that you can’t access it over the public Internet, Sitebeam cannot see it. We cannot resolve this ourselves; the issue lies with whoever or whatever has implemented that restriction.
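
One common reason a website is invisible from outside is that it is served from a private network address. The sketch below shows a rough way to check this using Python's standard ipaddress module. The function name is ours, for illustration only, and a public address is no guarantee of access: a firewall, VPN or IP allow-list can still block outside visitors.

```python
import ipaddress


def is_publicly_routable(address):
    """Return True if the IP address is in globally routable space.

    Hypothetical helper: private ranges (such as 10.x.x.x or
    192.168.x.x) and loopback addresses return False.
    """
    return ipaddress.ip_address(address).is_global


if __name__ == "__main__":
    for candidate in ["10.0.0.5", "192.168.1.20", "127.0.0.1", "8.8.8.8"]:
        print(candidate, is_publicly_routable(candidate))
```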

Anti-bot/spam technology

Some websites use technology to try to prevent ‘bots’ – automated computer programs – from accessing their pages. Most commonly, this is used to stop bots from doing undesirable things whilst pretending to be your users, such as:

  • Filling in forms with spam
  • Putting too much load on your servers
  • Scanning your website for email addresses to spam

To some of these technologies, Sitebeam may appear to be a bot – because it is. These technologies may therefore deny access to the website, believing Sitebeam to be a cause of potential harm.

Sitebeam already uses a wide range of countermeasures to prevent this from happening, but ultimately it is a bot, and there are ways of detecting this. The most reliable is that Sitebeam systematically checks every page in a website at a relatively high speed. It’s not possible to conceal that and still test the website.

The only reliable solution is to disable the anti-bot technology. Alternatively, you can slow Sitebeam down to a fraction of its normal speed (go to Settings > Pages to test, and set the “Delay between requests” to 1 second or more), which in some cases will avoid the problem.
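
To illustrate what a delay between requests does, here is a minimal sketch of a crawl loop that pauses before every request after the first. The fetch function and all names here are hypothetical; this shows the general idea, not how Sitebeam is implemented.

```python
import time


def crawl_with_delay(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Fetch each URL in turn, pausing between consecutive requests.

    `fetch` is any callable that takes a URL and returns its content.
    `sleep` is injectable so the pause can be replaced when testing.
    """
    results = {}
    for index, url in enumerate(urls):
        # No pause is needed before the very first request.
        if index > 0:
            sleep(delay_seconds)
        results[url] = fetch(url)
    return results
```

With a delay of 1 second, a 500-page website takes at least 499 seconds of waiting on top of the time spent downloading, which is the trade-off for looking less like a high-speed bot.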

Entry forms

Some websites require a form to be filled in before they can be viewed (such as confirming your date of birth, or agreeing to terms & conditions). Depending on the technologies used for these forms, they may prevent access by Sitebeam.

See Client-side scripting.

Client-side scripting

Real web browsers can run programming languages, such as JavaScript, which Sitebeam cannot. These languages are generally used for optional enhancements to a page, such as animation, but in some cases they are used either to create the webpage or to permit access to it. Where this is the case, Sitebeam cannot see the resulting page or website.
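
The effect can be sketched with a simple link collector that, like any tool that reads HTML without running scripts, only sees what is written in the page source. The class, function and sample page below are hypothetical and purely illustrative.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags present in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def static_links(html):
    """Return the links visible without executing any script."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Sample page: one link is in the HTML, one is created by a script.
PAGE = """
<html><body>
<a href="/contact">Contact</a>
<div id="menu"></div>
<script>
  var link = document.createElement("a");
  link.href = "/products";
  link.textContent = "Products";
  document.getElementById("menu").appendChild(link);
</script>
</body></html>
"""

if __name__ == "__main__":
    print(static_links(PAGE))
```

A browser would show both links, but the collector finds only /contact, because the /products link does not exist until the script runs.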

User security

Some websites are secured behind a login screen. Sitebeam cannot normally access the content behind these.

Although Sitebeam does have some features designed to help it enter a login area, some login facilities use client-side scripting to secure themselves further. Others use extremely complex anti-bot measures, CAPTCHAs, or other technologies designed specifically to prevent a computer from logging in like a real person. These technologies make it impossible for Sitebeam to log in to the website.
