How Can Selenium WebDriver Be Used to Detect Broken Links?
Selenium is a widely used open-source suite of tools for testing websites and other web applications. The suite has several components, such as Selenium WebDriver, Selenium IDE, and Selenium RC. Selenium IDE is a record-and-playback tool, so it suits users who do not come from a computer science background, because it requires no programming knowledge. Selenium WebDriver, on the other hand, requires programming knowledge. WebDriver scripts can be written in languages such as Java or Python. This article discusses how to detect broken links with Selenium WebDriver.
The following topics will be discussed here:
- What are Broken Links?
- Why Check for Broken Links in Selenium?
- Common Reasons for Broken Links
- How to identify broken links in Selenium WebDriver?
- Finding Broken Links in Selenium
Let’s start discussing each of these topics in detail.
What are Broken Links?
Broken links are links that do not lead to a working web page. Finding them manually is difficult. While a website is being developed, its links are checked by programmers or testers, but a link can still break after the site goes live. Sometimes the address of the target web page changes, which leaves the old link pointing nowhere. Sometimes the target web page is removed altogether. In both cases, the result is a broken link.
Broken links fall into several categories, which are usually identified by the HTTP status code that the server returns, such as 404 Not Found or 400 Bad Request. Codes in the 4xx range indicate client errors, and codes in the 5xx range indicate server errors. Broken links can also appear on a website because of developer mistakes. Broken links harm a website's authority, so developers need to take extra care when linking two or more web pages.
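The categories above can be told apart by the numeric range of the status code. The following is a minimal sketch of that idea in plain Java. The class name LinkStatusCategory and the method categorize() are hypothetical names chosen for this illustration and are not part of Selenium.

```java
// Hypothetical helper, not part of Selenium: groups HTTP status codes
// into the broad categories of broken links.
class LinkStatusCategory {

    // Returns a short label for the range that the status code falls into.
    static String categorize(int statusCode) {
        if (statusCode >= 500) {
            return "server error";   // for example 500 Internal Server Error
        }
        if (statusCode >= 400) {
            return "client error";   // for example 400 Bad Request, 404 Not Found
        }
        return "not broken";         // 1xx, 2xx and 3xx responses
    }

    public static void main(String[] args) {
        int[] samples = {200, 301, 400, 404, 500};
        for (int code : samples) {
            System.out.println(code + " -> " + categorize(code));
        }
    }
}
```

Both "client error" and "server error" count as broken links here; the split only helps when deciding whether the link itself or the target server needs attention.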
Why Check for Broken Links in Selenium?
- Free to Download: Selenium is open-source software for testing. This is one reason it is used around the world. Finding broken links and handling other testing tasks are both straightforward with Selenium.
- Multi-Browser WebDrivers: Selenium provides WebDrivers for all major browsers, such as Safari and Google Chrome. Users can therefore test a web page on any operating system and in multiple browsers, which makes Selenium a sensible choice for checking broken links.
- Multiple Programming Languages: Selenium scripts can be written in several programming languages. Users can choose Java or Python, and other languages such as Ruby are also supported. Developers can therefore write scripts in the language they know best.
- High Speed: Selenium is known for its performance. Scripts drive the browser directly through the WebDriver, so a test that collects the links on a page typically runs within seconds. This makes Selenium practical for finding broken links.
- IDE Integration: Selenium integrates well with IDEs such as Eclipse. Writing scripts in an IDE is more convenient and reduces typing mistakes and other script errors. For all these reasons, Selenium is widely used to find broken links on a web page.
Common Reasons for Broken Links
- Typing Mistake: This is a mistake that programmers commonly make. To connect two web pages, the link has to be pasted into the code. While pasting, a character is sometimes dropped by accident. Losing even a single character invalidates the link and produces a broken link.
- Deleted Webpage: Sometimes the target web page is deleted. When a website is modified, one or more web pages may need to be removed, which breaks any links that point to them. The fix is to replace the old link with the new target link.
- Modified Link: Sometimes the server or domain changes. For example, an organization may move its site to another hosting provider because of pricing. The existing links then stop working and become broken links.
- Renaming the Webpage: Sometimes the target web page itself is modified. Renaming a page, or otherwise breaking its connection with the pages that link to it, creates broken links.
- Error in Code: Building a website often requires long pieces of code, and developers can make mistakes in them. Such mistakes can have serious effects. Faulty code that connects one web page to another can break the link, and this happens often. Sometimes programmers also paste the same link in more than one place by mistake, which causes problems as well.
- Different File Format: Sometimes a link lets users download a specific file. Such files are updated from time to time, and the file format may change during an update. The link in the code then no longer matches the file, so the download link stops working and becomes a broken link.
- A Faulty Link: Sometimes the party that supplies the link accidentally supplies a broken one. Neither the developer nor the company is at fault, but using an invalid link has the same effect. This kind of mistake is rare, but it also causes broken links.
How to identify broken links in Selenium WebDriver?
Now, we will implement the source code to find broken links with Selenium WebDriver. Working through the code helps in understanding the topic and in identifying the links on a web page. The steps below explain the source code.
Step 1: First, provide the location of the WebDriver executable (the .exe file on Windows) by using the System.setProperty() method. The property name, such as webdriver.chrome.driver, identifies the type of WebDriver. In the example below, ChromeDriver is used.
Step 2: Next, create a WebDriver object. Because we are using ChromeDriver, the object is an instance of ChromeDriver.
Step 3: Provide the URL of the web page to be tested. Here, the Google home page is used. Open the URL in Chrome by using the get() method.
Step 4: Collect all the links on the Google home page in a List of WebElement objects. The 'a' tag name is used to find the links on the web page.
Step 5: Print the size of the list to show the total number of links available on the web page.
Step 6: In a for loop, extract the URL from each stored element. Not every anchor element carries a usable URL, so each one has to be examined individually. Use the getAttribute() method to read the href value, and validate the URL.
Step 7: Next, open an HTTP connection to the URL. An HTTP request is sent to each URL, and the response code is returned to the program.
Step 8: Check the response code. If it is greater than or equal to 400, the link is broken. Otherwise, the link is valid. Print the status of each link on the web page.
Step 9: Finally, print a message to mark the end of the program, which confirms that it finished successfully. Then close the Chrome window by using the quit() method.
Finding Broken Links in Selenium
The implementation can be divided into several parts. Here, the main structure of the program is split into sections to make the approach easier to understand.
1. Import Packages: First, import the necessary packages into the program. An IDE such as Eclipse makes this easier, because it identifies the required packages and can add the imports automatically. Here, we need the packages for WebDriver, the specific driver (ChromeDriver), WebElement, and the exceptions that may be thrown. All of these must be imported before the program can run.
2. Collect all links on the web page: Take one web page as an example and collect all the links present on it. Links on a web page use the 'a' tag, so search for elements with the tag name 'a'. Use the findElements() method, which returns all matching elements as a List of WebElement objects, and store the result in a list.
List<WebElement> links = driver.findElements(By.tagName("a"));
3. Identify and Validate URLs: Next, validate every link stored in the List by running a loop over it. The list holds WebElement objects rather than plain URLs, so use the getAttribute() method with the 'href' attribute to read the URL of each element. Then create a URL object from that string. This object is needed in the next step.
WebElement element = links.get(i);
String url = element.getAttribute("href");
URL link = new URL(url);
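The validation part of this step is not spelled out in the snippet, so here is one possible sketch of it in plain Java, with no Selenium dependency. It assumes that only absolute http and https URLs are worth checking and that empty values, malformed values, and schemes such as mailto: or javascript: should be skipped. The class name HrefFilter and the method isCheckable() are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: decides whether an href value can be checked
// with an HTTP request. Not part of Selenium.
class HrefFilter {

    static boolean isCheckable(String href) {
        if (href == null || href.trim().isEmpty()) {
            return false;                       // anchor without a usable href
        }
        try {
            URI uri = new URI(href.trim());
            String scheme = uri.getScheme();
            // Assumption of this sketch: only absolute http/https URLs are checked.
            return scheme != null
                    && (scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https"))
                    && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false;                       // malformed URL
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "https://www.google.com/about", "mailto:someone@example.com", "#top", ""
        };
        for (String href : samples) {
            System.out.println("\"" + href + "\" checkable: " + isCheckable(href));
        }
    }
}
```

Inside the loop, a call such as isCheckable(url) before creating the URL object would let the program skip such values instead of failing with an exception.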
4. Send HTTP request: Now, create a connection. Call openConnection() on the URL object and cast the result to HttpURLConnection. Use the connect() method to open the connection and send the request to the URL. The server then sends a status code back to the program. This is the response code of the link.
HttpURLConnection httpConn = (HttpURLConnection) link.openConnection();
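The line above only opens the connection object. A fuller sketch of the request step might look like the following, assuming a HEAD request is acceptable to the target server and using arbitrary five-second timeouts. The class name StatusFetcher and the method fetchStatus() are hypothetical names for this illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper: sends one HTTP request and returns the response code.
class StatusFetcher {

    static int fetchStatus(String url) throws IOException {
        // URI.create(url).toURL() builds the same URL object as new URL(url)
        // and avoids the constructor that newer JDKs mark as deprecated.
        HttpURLConnection httpConn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        httpConn.setRequestMethod("HEAD");   // assumption: headers are enough, no body needed
        httpConn.setConnectTimeout(5000);    // arbitrary timeout values for this sketch
        httpConn.setReadTimeout(5000);
        try {
            httpConn.connect();
            return httpConn.getResponseCode();
        } finally {
            httpConn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Requires network access; the URL is only an example.
        System.out.println(fetchStatus("https://www.google.com/"));
    }
}
```

Some servers do not answer HEAD requests correctly, so switching the request method to "GET" is a reasonable fallback if the results look wrong.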
5. Validate Links: Now, use an if-else statement. If the response code is greater than or equal to 400, the link is broken. Otherwise, the link is not broken, which shows that the request did not fail with a client or server error.
if (code >= 400)
    System.out.println("Broken Link: " + url);
else
    System.out.println("Valid Link: " + url);
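The check on the response code only covers links that return a response. A link whose host cannot be reached produces an exception instead of a code. The sketch below shows one way to report both cases, assuming such failures should also count as broken links. The class LinkVerdict and the interface StatusSource are hypothetical; StatusSource stands in for the HTTP request step so that the example runs without network access.

```java
import java.io.IOException;

// Hypothetical helper: turns a response code, or a failed request, into a verdict line.
class LinkVerdict {

    // Stand-in for the HTTP request step; any code that returns a status fits here.
    interface StatusSource {
        int statusOf(String url) throws IOException;
    }

    static String verdict(String url, StatusSource source) {
        try {
            int code = source.statusOf(url);
            if (code >= 400) {
                return "Broken Link: " + url + " (HTTP " + code + ")";
            }
            return "Valid Link: " + url;
        } catch (IOException e) {
            // No response at all, for example an unknown host or a timeout.
            return "Broken Link: " + url + " (no response: " + e.getMessage() + ")";
        }
    }

    public static void main(String[] args) {
        // The lambdas simulate responses; the URLs are placeholders.
        System.out.println(verdict("https://example.com/ok", u -> 200));
        System.out.println(verdict("https://example.com/gone", u -> 404));
        System.out.println(verdict("https://example.invalid/",
                u -> { throw new IOException("host not found"); }));
    }
}
```

Including the status code in the message makes it easier to see why a link was reported as broken.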
Below is the implementation to detect broken links on the webpage: