Using Regular Expressions in C#.NET


In this tutorial, we are going to learn about C# regular expression programming and its practical implementation. Almost all programming languages support regular expressions and ship a regular expression engine that parses an expression, evaluates it against the input, and returns the result. In .NET, this engine is part of the framework class library, so it works exactly the same way in every .NET language. If you use a .NET language other than C# (J#, VB, etc.), you can follow the same approach by changing only the syntax to that language.

Don’t know anything about regular expressions yet? Then it is best to get a basic idea of them before you start using them in C#. Go ahead and learn some regular expression basics first, including what the symbols stand for.

RegularExpressions Namespace in .NET Framework:

First, we need to import the System.Text.RegularExpressions namespace so that the regular expression classes are available. The namespace contains several classes: Group, Match and Capture with their corresponding collection classes, the ‘MatchEvaluator’ delegate, Regex (the main class to use), the ‘RegexOptions’ enum, and the RegexCompilationInfo class, which is used when compiling regular expressions into a stand-alone assembly.

.NET Regular Expression Classes

Apart from the Regex, Match and RegexOptions types, the other classes are for slightly more advanced usage, so I am not going to cover them in this article. The Match class stores a single match result, and MatchCollection stores a list of match results. RegexOptions is an enum with various options that can be passed as a parameter when calling a match function, such as IgnoreCase (matches both lower-case and upper-case letters) and RightToLeft (searches from right to left instead of the default left to right).
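
To make the two options mentioned above more concrete, here is a minimal sketch. The class name RegexOptionsDemo, its helper methods and the sample strings are made up for illustration; only Regex, Match and RegexOptions come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names here are illustrative only.
public static class RegexOptionsDemo
{
    // Returns true when 'pattern' matches 'input' regardless of letter case.
    public static bool MatchesIgnoringCase(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase);
    }

    // Returns the first run of digits found when scanning from the end of the input.
    public static string LastNumber(string input)
    {
        Match match = Regex.Match(input, @"\d+", RegexOptions.RightToLeft);
        return match.Success ? match.Value : string.Empty;
    }

    public static void Main()
    {
        // With IgnoreCase, "hello" matches "HELLO".
        Console.WriteLine(MatchesIgnoringCase("HELLO world", "hello"));

        // Without any option, the same pattern does not match.
        Console.WriteLine(Regex.IsMatch("HELLO world", "hello"));

        // With RightToLeft, the search starts from the end of the string.
        Console.WriteLine(LastNumber("a1 b22 c333"));
    }
}
```

Since RegexOptions is a flags enum, several options can be combined with the | operator, for example RegexOptions.IgnoreCase | RegexOptions.RightToLeft.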

Using C# Regular Expression ‘Regex’ class:

As already mentioned, Regex is the main class that performs the actual match operation. You can use it either by instantiating an object or by calling its static methods. If the same regular expression is going to be needed several times in a single execution, it is better to instantiate one object and check it against the different values. Here is sample code for such usage:

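            // Requires: using System; using System.Text.RegularExpressions;
            string[] email_lists = { "email1@domain1.com", "email2@domain1.com", "email1@domain2.com" };

            // ^ and $ anchor the pattern so that the whole string must look like an email address.
            Regex regx = new Regex(@"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$", RegexOptions.IgnoreCase);

            foreach (string email in email_lists)
            {
                // Reuse the same Regex instance for every address.
                Match match = regx.Match(email);
                Console.WriteLine(match.Success);
            }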

The above example checks whether each email address in the list is in a correct email format and prints True or False depending on the match result. Note that it matches every string separately and returns a ‘Match’ instance each time, whose ‘Success’ property indicates whether the match succeeded or failed.
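
Besides ‘Success’, a Match instance also carries the matched text and its position. The following sketch shows this, and also uses IsMatch for the case where only a yes/no answer is needed. The class EmailCheckDemo and its methods are hypothetical names for this illustration, and the pattern is the article's email pattern with an assumed leading ^ so that the whole string has to match.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names here are illustrative only.
public static class EmailCheckDemo
{
    // One shared Regex instance, reused for every check.
    private static readonly Regex EmailRegex = new Regex(
        @"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$",
        RegexOptions.IgnoreCase);

    // IsMatch is enough when only a true/false answer is needed.
    public static bool IsValidEmail(string candidate)
    {
        return EmailRegex.IsMatch(candidate);
    }

    // Match gives access to the matched text and where it was found.
    public static string Describe(string candidate)
    {
        Match match = EmailRegex.Match(candidate);
        if (!match.Success)
        {
            return "no match";
        }
        return string.Format("'{0}' at index {1}, length {2}",
            match.Value, match.Index, match.Length);
    }

    public static void Main()
    {
        Console.WriteLine(IsValidEmail("email1@domain1.com"));
        Console.WriteLine(IsValidEmail("email1domain1.com"));
        Console.WriteLine(Describe("email1@domain1.com"));
    }
}
```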

Alternative Static Match function:

If we use the static ‘Match’ method of the Regex class, we have to pass at least two parameters, the input string and the pattern, plus an optional RegexOptions value. The following code sample illustrates this usage:

            // The first address has no '@', so it is expected to fail the check.
            string[] email_lists = { "email1domain1.com", "email2@domain1.com", "email1@domain2.com" };
            string pattern = @"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$";
            foreach (string email in email_lists)
            {
                // Static call: the pattern and options are passed on every call.
                Match match = Regex.Match(email, pattern, RegexOptions.IgnoreCase);
                Console.WriteLine(match.Success);
            }

The above code snippet works like the previous one; only the way of calling has changed. For the same input, the result is the same.

Retrieve all matched results from a Text:

Now that we know how to match a specific value, let’s move one step forward and retrieve a list of data items that match a specific regular expression. The following code snippet retrieves all data matching the simple time format hh:mm:ss.

 // Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
 // Two digits, a colon, two digits, a colon, two digits (hh:mm:ss).
 Regex dataPointRegex = new Regex(@"\d{2}:\d{2}:\d{2}");
 String longText = "03:08:12asdasd06:47:12asdasd03:08:12asdasd06:47:42asdasdasdasd03:08:12ghjklfghdgodjg06:48:12asdada03:08:12asdasdasd06:48:42asdasdasdasdas";
 List<String> segments = new List<String>();

 // Matches returns every non-overlapping match in the text.
 MatchCollection matches = dataPointRegex.Matches(longText);

 foreach (Match match in matches)
 {
      // Every item in a MatchCollection is a successful match.
      // match.Value is the full matched text, the same as match.Groups[0].Value.
      segments.Add(match.Value);
 }

Note that this example uses the ‘Matches’ method instead of ‘Match’ to catch multiple matches. Each Match object also has its own associated ‘Groups’ collection; its first entry is the whole match, and any capturing groups in the pattern follow it.
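
The ‘Groups’ collection becomes more useful once the pattern contains capturing groups. As a rough sketch, the following code names the three parts of an hh:mm:ss value and reads them back by name. The class TimeGroupsDemo, the group names hh, mm and ss, and the stricter two-digit pattern are assumptions made for this illustration, not part of the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names here are illustrative only.
public static class TimeGroupsDemo
{
    // (?<name>...) defines a named capturing group.
    private static readonly Regex TimeRegex =
        new Regex(@"(?<hh>\d{2}):(?<mm>\d{2}):(?<ss>\d{2})");

    // Collects every hh:mm:ss value found in the text as a TimeSpan.
    public static List<TimeSpan> ExtractTimes(string text)
    {
        var times = new List<TimeSpan>();
        foreach (Match match in TimeRegex.Matches(text))
        {
            // Each named group holds one part of the matched time.
            int hours = int.Parse(match.Groups["hh"].Value);
            int minutes = int.Parse(match.Groups["mm"].Value);
            int seconds = int.Parse(match.Groups["ss"].Value);
            times.Add(new TimeSpan(hours, minutes, seconds));
        }
        return times;
    }

    public static void Main()
    {
        foreach (TimeSpan time in ExtractTimes("03:08:12asdasd06:47:12asdasd"))
        {
            Console.WriteLine(time);
        }
    }
}
```

Reading groups by name keeps the code readable when the pattern grows; the same values are also available by position, where index 0 is the whole match.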

References:

Microsoft has very rich online MSDN documentation that serves as a full reference for the .NET Framework. To learn more about regular expressions in C#, you can go through the .NET Regular Expressions documentation on MSDN Online, where you will also find some handy examples. Also, let me know if you are having trouble with any specific portion that is not covered here. Happy coding :)
