How to strip out HTML tags from a string using JavaScript?
JavaScript offers several ways to strip HTML tags from a string. Two common ones are the replace() method with a regular expression and the .textContent or .innerText property of the HTML DOM. HTML tags come in two types: opening tags and closing tags.
- Opening tag: It starts with '<', followed by a tag name, and ends with '>'. <html>, <br>, and <title> are examples of HTML opening tags.
- Closing tag: It starts with '</', followed by a tag name, and ends with '>'. </html> and </title> are examples of HTML closing tags.
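As a rough illustration of the two tag shapes described above, the sketch below lists the tags found in a string. The listTags helper and its pattern are illustrative assumptions rather than part of the original examples, and the pattern only covers simple tags.

```javascript
// Hypothetical helper: returns every opening or closing tag found in the input.
function listTags(str) {
    // "<", an optional "/", a tag name starting with a letter,
    // then anything up to the next ">".
    const tagPattern = /<\/?[a-z][^>]*>/gi;
    return str.match(tagPattern) || [];
}

console.log(listTags('<title>Hello</title><br>'));
// [ '<title>', '</title>', '<br>' ]
```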
The examples below illustrate the regular-expression approach and the DOM approach:
Example 1: The characters '<', '</', and '>' mark a word as an HTML tag in a string. The following example shows how to strip HTML tags using the replace() method and a regular expression that matches HTML tags in the input string. A regular expression is a quick way to find and remove tags in simple markup, but it does not parse HTML, so input such as an attribute value containing '>' can be handled incorrectly.
Program: In JavaScript, the following code strips a string of HTML tags.
javascript
function removeTags(str) {
    // Nothing to strip from null, undefined, or an empty string.
    if (str == null || str === '') {
        return '';
    }
    str = str.toString();

    // Regular expression to identify HTML tags in
    // the input string. Each identified HTML tag
    // is replaced with an empty string.
    return str.replace(/<[^>]+>/g, '');
}

console.log(removeTags('<html>Welcome to GeeksforGeeks.</html>'));
Output:
Welcome to GeeksforGeeks.
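The pattern-based approach treats everything between a '<' and the next '>' as a tag, so it is worth checking how it behaves on less tidy input. The sketch below repeats the same pattern under the assumed name stripTags and runs it on a few hypothetical inputs.

```javascript
// Same idea as removeTags above, repeated here so the snippet runs on its own.
function stripTags(str) {
    return String(str).replace(/<[^>]+>/g, '');
}

// Nested tags are removed one by one.
console.log(stripTags('<div><p>Hello <b>world</b></p></div>'));
// Hello world

// Character entities are left untouched; the pattern only removes tags.
console.log(stripTags('<p>Tom &amp; Jerry</p>'));
// Tom &amp; Jerry

// A ">" inside an attribute value ends the match too early,
// so part of the tag ( 0">link, with a leading space) is left behind.
console.log(stripTags('<a title="1 > 0">link</a>'));
```

If the input may contain entities or '>' inside attributes, a DOM-based approach is usually the safer choice.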
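Example 2: The .textContent property returns the text content of the specified node and all its descendants. The .innerText property is similar, but it returns only the text as rendered, so it takes CSS styling into account and skips hidden elements. Assigning the string to the innerHTML of a temporary element lets the browser parse the markup, after which the text can be read back without the tags. This approach needs a DOM, so it runs in a browser.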
Program: In JavaScript, the following code strips a string of the HTML tags.
javascript
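// String containing HTML tags around the text
var html = "<p>A Computer Science "
    + "Portal for Geeks</p>";

// Let the browser parse the markup inside a temporary element
var div = document.createElement("div");
div.innerHTML = html;

// Read back the text without the tags
var text = div.textContent || div.innerText || "";
console.log(text);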
Output:
A Computer Science Portal for Geeks
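Assigning untrusted markup to innerHTML creates elements in the current document, which can trigger side effects such as image loads or inline event handlers. One possible alternative, sketched below, is to parse the string with DOMParser, which builds a separate document. The htmlToText name and the non-browser fallback are assumptions made for this illustration.

```javascript
// Hypothetical helper: extracts text with DOMParser when it is available.
function htmlToText(html) {
    if (typeof DOMParser === 'undefined') {
        // Assumption: outside a browser (e.g. plain Node.js) there is no DOM,
        // so fall back to the regular-expression approach from Example 1.
        return String(html).replace(/<[^>]+>/g, '');
    }
    // parseFromString builds a separate document from the markup.
    const doc = new DOMParser().parseFromString(String(html), 'text/html');
    return doc.body.textContent || '';
}

console.log(htmlToText('<p>A Computer Science Portal for Geeks</p>'));
// A Computer Science Portal for Geeks
```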