Validating an Email Address with Regular Expressions

Marc0

These few lines of JavaScript go a long way toward validating email addresses.

Code:
window.onload = initForms;

function initForms() {
     for (var i=0; i< document.forms.length; i++) {
         document.forms[i].onsubmit = function() {return validForm();}
     }
}

function validForm() {
     var allGood = true;
     var allTags = document.getElementsByTagName ("*");

     for (var i=0; i<allTags.length; i++) {
        if (!validTag(allTags[i])) {
           allGood = false;
        }
     }
     return allGood;

     function validTag(thisTag) {
        var outClass = "";
        var allClasses = thisTag.className.split (" ");

        for (var j=0; j<allClasses.length; j++) {
           outClass += validBasedOnClass(allClasses[j]) + " ";
        }

        thisTag.className = outClass;

        if (outClass.indexOf("invalid") > -1) {
           invalidLabel(thisTag.parentNode);
           thisTag.focus();
           if (thisTag.nodeName == "INPUT") {
              thisTag.select();
           }
           return false;
        }
        return true;

           function validBasedOnClass(thisClass) {
              var classBack = "";

              switch(thisClass) {
                 case "":
                 case "invalid":
                    break;
                 case "email":
                    if (allGood && !validEmail (thisTag.value)) classBack = "invalid ";
                    // no break here: fall through to default so the "email" class itself is kept
                 default:
                    classBack += thisClass;
              }
              return classBack;
           }

           function validEmail(email) {
              var re = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;

              return re.test(email);
           }

           function invalidLabel(parentTag) {
              if (parentTag.nodeName == "LABEL") {
                 parentTag.className += " invalid";
              }
           }
     }
}
The HTML for the email validation example.

Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
     <title>Email Validation</title>
     <link rel="stylesheet" href="script01.css" />
     <script language="Javascript" type="text/javascript" src="script01.js">
     </script>
</head>
<body>
     <h2 align="center">Email Validation</h2>
     <form action="someAction.cgi">
         <p><label>Email Address:
         <input class="email" type="text" size="50" /></label></p>
         <p><input type="reset" />&nbsp;<input type="submit" value="Submit" /></p>
     </form>
</body>
</html>
To validate an email address using regular expressions:

Code:
var re = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;
Yow! What on earth is this? Don't panic; it's just a regular expression in the validEmail() function. Let's break it apart and take it piece by piece. Like any line of JavaScript, a regular expression is read from left to right.

First, re is just a variable. We've given it the name re so that when we use it later, we'll remember that it's a regular expression. The line sets the value of re to the regular expression on the right side of the equals sign.

A regular expression always begins and ends with a slash, / (of course, there is still a semicolon here, to denote the end of the JavaScript line, but the semicolon is not part of the regular expression). Everything in between the slashes is part of the regular expression.

The caret ^ means that we're going to use this expression to examine a string starting at the string's beginning. If the caret were left off, the email address might show as valid even though there was a bunch of garbage at the beginning of the string.
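To see what the caret buys us, here is a small sketch. The variable names and the shortened patterns are made up for illustration (they are not part of the script above); the only difference between the two patterns is the leading ^.

```javascript
// Illustrative, cut-down patterns (not the full expression from the script).
var withCaret = /^\w+@\w+\.\w{2,3}$/;
var withoutCaret = /\w+@\w+\.\w{2,3}$/;

var junkFirst = "!!! dori@example.com";

var caretResult = withCaret.test(junkFirst);       // false: "!" is not a \w character
var noCaretResult = withoutCaret.test(junkFirst);  // true: the match simply starts at "dori"

console.log(caretResult, noCaretResult);
```

Without the caret, the leading garbage is silently skipped, which is exactly what we don't want in a validator.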

The expression \w means any one character, "a" through "z", "A" through "Z", "0" through "9", or underscore. An email address must start with one of these characters.

The plus sign + means one or more of whatever the previous item was that we're checking on. In this case, an email address must start with one or more of any combination of the characters "a" through "z", "A" through "Z", "0" through "9", or underscore.
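A quick way to get a feel for \w+ is to try it on its own. In this sketch the pattern is wrapped in ^ and $ (an assumption made here so the whole string has to match), and the variable names are just for illustration.

```javascript
// \w+ on its own, anchored at both ends so the entire string is checked.
var oneOrMoreWord = /^\w+$/;

var wordResults = [
  oneOrMoreWord.test("dori_99"),  // true: letters, digits, and underscore only
  oneOrMoreWord.test(""),         // false: + requires at least one character
  oneOrMoreWord.test("do ri"),    // false: a space is not matched by \w
  oneOrMoreWord.test("dori!")     // false: neither is an exclamation mark
];

console.log(wordResults);
```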

The opening parenthesis ( signifies a group. It means that we're going to want to refer to everything inside the parentheses in some way later, so we put them into a group now.

The brackets [] are used to show that we can have any one of the characters inside. In this example, the characters \.- are inside the brackets. We want to allow the user to enter either a period or a dash, but the period has a special meaning to regular expressions, so we need to preface it with a backslash \ to show that we really want to refer to the period itself, not its special meaning. Using a backslash before a special character is called escaping that character. Because of the brackets, the entered string can have either a period or a dash here, but not both. Note that the dash here doesn't stand for anything special, just itself: inside brackets a dash only denotes a range when it sits between two characters (as in a-z), and this one comes last.

The question mark ? means that we can have zero or one of the previous item. So along with it being okay to have either a period or a dash in the first part of the email address (the part before the @), it's also okay to have neither.
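The brackets and the question mark work as a pair, so it may help to try them together. This is a sketch only: the pattern below is a fragment of the full expression with ^ and $ added, and the names are hypothetical.

```javascript
// Some word characters, an optional single period or dash, then more word characters.
var optionalSeparator = /^\w+[\.-]?\w+$/;

var sepResults = {
  dot: optionalSeparator.test("dori.smith"),    // true: one period
  dash: optionalSeparator.test("dori-smith"),   // true: one dash
  none: optionalSeparator.test("dorismith"),    // true: the separator is optional
  both: optionalSeparator.test("dori.-smith")   // false: only one separator fits here
};

console.log(sepResults);
```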

Following the ?, we once again have \w+, which says that the period or dash must be followed by some other characters.

The closing parenthesis ) says that this is the end of the group. That's followed by an asterisk *, which means that we can have zero or more of the previous item (in this case, whatever was inside the parentheses). So while "dori" is a valid email prefix, so is "testing-testing-1-2-3".
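Putting the pieces so far together gives us the whole part before the @. The sketch below tests just that part on its own; the variable names are illustrative, and the ^ and $ anchors are added here so that nothing else may surround the prefix.

```javascript
// The part of the expression that comes before the @, anchored at both ends.
var localPart = /^\w+([\.-]?\w+)*$/;

var prefixResults = {
  plain: localPart.test("dori"),                     // true: the group may repeat zero times
  dashes: localPart.test("testing-testing-1-2-3"),   // true: the group repeats four times
  doubleDot: localPart.test("dori..smith"),          // false: each separator needs \w+ after it
  trailingDash: localPart.test("dori-")              // false: nothing follows the dash
};

console.log(prefixResults);
```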

The @ character doesn't stand for anything besides itself; it sits between the user name (the part before the @) and the domain name.

The \w+ once again says that a domain name must start with one or more of any character "a" through "z", "A" through "Z", "0" through "9", or underscore. That's again followed by ([\.-]?\w+)* which says that periods and dashes are allowed within the suffix of an email address.

We then have another group within a set of parentheses: \.\w{2,3} which says that we're expecting to find a period followed by characters. In this case, the numbers inside the braces mean either 2 or 3 of the previous item (in this case the \w, meaning a letter, number, or underscore). Following the right parenthesis around this group is a +, which again means that the previous item (the group, in this case) must exist one or more times. This will match ".com" or ".edu", for instance, as well as "ox.ac.uk".
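One consequence of {2,3} is worth trying out: the last piece of the domain can only be two or three characters long, so longer endings are turned away. A small sketch using the full expression (the sample addresses and variable names are made up for illustration):

```javascript
var fullPattern = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;

var domainResults = {
  com: fullPattern.test("dori@example.com"),    // true: ".com" is three characters
  uk: fullPattern.test("dori@ox.ac.uk"),        // true: ends in ".uk"
  info: fullPattern.test("dori@example.info"),  // false: "info" is four characters
  tooShort: fullPattern.test("dori@example.c")  // false: "c" is only one character
};

console.log(domainResults);
```

If addresses with longer endings need to pass, the upper limit in the braces is the part to loosen.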

And finally, the regular expression ends with a dollar sign $, which signifies that the matched string must end here. This keeps the script from validating an email address that starts off properly but contains garbage characters at the end. The slash closes the regular expression. The semicolon ends the JavaScript statement, as usual.
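The dollar sign does for the end of the string what the caret does for the beginning. In this sketch the second pattern is the same expression with the $ dropped, purely to show the difference; the variable names are hypothetical.

```javascript
var withDollar = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;
var withoutDollar = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+/;

var junkLast = "dori@example.com!!!";

var dollarResult = withDollar.test(junkLast);       // false: "!!!" sits where the string must end
var noDollarResult = withoutDollar.test(junkLast);  // true: the match stops after ".com"

console.log(dollarResult, noDollarResult);
```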

Code:
return re.test(email);
This single line takes the regular expression defined in the previous step and uses the test() method to check the validity of email.
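Because test() hands back plain true or false, the function can be tried outside the form script as well. Here is a minimal standalone sketch; validEmail() is copied from the script, while describeEmail() is a hypothetical helper added only to show the boolean being used.

```javascript
function validEmail(email) {
  var re = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;
  return re.test(email);   // true if the whole string fits the pattern, false otherwise
}

// Hypothetical helper, not part of the original script.
function describeEmail(email) {
  return validEmail(email) ? "looks valid" : "looks invalid";
}

console.log(describeEmail("dori@example.com"));  // looks valid
console.log(describeEmail("dori@example"));      // looks invalid: no ".xx" ending after the domain
```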
 