ToString Enhancements with Regular Expressions

This entry will be a walkthrough on creating a ToString extension method for easily extracting a substring matching a given regular expression (RegEx). General regular expression knowledge is assumed as well as basic String manipulation in C#.

Implementation & Use

Diving right in, here is the implementation:

public static string ToString(this string @this, string regexPattern)
{
  return (Regex.IsMatch(@this, regexPattern) ?
  Regex.Match(@this, regexPattern).Value : null);
}

This extension method is for System.String objects. It can be used like this:

string phoneRegex =
     @"(\([2-9]\d{2}\)|[2-9]\d{2})[- .]?\d{3}[- .]?\d{4}";
string phoneRaw = "ea212-122-2333klai";
string phone = phoneRaw.ToString(phoneRegex);

Above is an example of the extension method in use. First, the regex pattern is stored. This pattern matches particular phone number formats: an area code beginning with 2 through 9 (optionally wrapped in parentheses), then three digits, then four digits, with an optional dash, space, or dot between the groups. The pattern is not anchored with ^ and $, because an anchored pattern would have to match the entire input and would reject raw data that carries extra characters around the number. The raw data for the phone number is then stored. The raw data cannot be used in its entirety to make a valid phone call; therefore, it must be scrubbed. This occurs with the call to phoneRaw.ToString(phoneRegex). This call returns the string “212-122-2333”. Using the extension method, the phone number went from a raw, unusable form to a valid format with a single method call.

Implementation Details

The ToString extension method created above uses the Regex class. This class provides various methods for retrieving information about a regular expression and its results, such as groups, group numbers, and pattern matches. The ToString extension method uses two of these: IsMatch and Match.

Regex.IsMatch

This is a static method returning a Boolean value indicating whether a match is found in the specified string using the specified regular expression. In the extension method created above, this is used to decide whether to proceed with returning the value of Regex.Match, or simply return null.

Regex.Match

This is a static method returning a Match object containing information about the first occurrence of the specified regular expression in the specified string. The extension method created above uses this to return the Value of the match. This value contains the string representation of the matched substring found in the input string.
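Because Regex.Match always returns a Match object, and that object exposes a Success property, the two calls described above can be folded into one so the pattern is evaluated only once. The following is a minimal sketch of that idea, not the original implementation; the class name RegexStringExtensions and the method name FirstMatch are hypothetical names chosen for this illustration.

```csharp
using System.Text.RegularExpressions;

public static class RegexStringExtensions
{
  // Hypothetical helper (name is an assumption, not from the original entry).
  // Returns the first substring matching regexPattern, or null when nothing matches.
  public static string FirstMatch(this string @this, string regexPattern)
  {
    // Regex.Match never returns null; Success reports whether a match was found.
    Match match = Regex.Match(@this, regexPattern);
    return match.Success ? match.Value : null;
  }
}
```

With this variant, "abc123def".FirstMatch(@"\d+") yields "123", while "abc".FirstMatch(@"\d+") yields null.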

Originally Perceived Enhancements

There are a few improvements that can be made to the original extension method. One, in particular, is the use of the static method Matches. This returns all of the successful matches of a regular expression in a given string. The user can then determine which match to use. Below is an example:

public static string ToString(
  this string @this,
  string regexPattern,
  int matchIndex)
{
  var matches = Regex.Matches(@this, regexPattern);
  return
     (matches.Count > matchIndex ? matches[matchIndex].Value : null);
}

Above is another version of the ToString extension method that accepts a RegEx pattern and a match index. The method retrieves all of the matches and, if valid, returns the desired match’s value. This way, the user is able to do this:

string phoneRegex = @"\d{3}";
string phoneRaw = "(011) 122-2333";
string areaCode = phoneRaw.ToString(phoneRegex, 0);

Above shows the enhanced version of the ToString extension method. It is used to obtain the area code “011” of the phone number by retrieving the first occurrence of 3 consecutive digits.
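To make the behavior of Regex.Matches more concrete, here is a small sketch that collects every match value into a list. It only illustrates how the returned collection can be enumerated; the class name RegexMatchesSketch and the method name AllMatches are hypothetical and are not part of the extension methods above.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexMatchesSketch
{
  // Hypothetical helper: returns the value of every non-overlapping match,
  // in the order the matches occur in the input.
  public static List<string> AllMatches(string input, string regexPattern)
  {
    var values = new List<string>();
    foreach (Match match in Regex.Matches(input, regexPattern))
    {
      values.Add(match.Value);
    }
    return values;
  }
}
```

Called as RegexMatchesSketch.AllMatches("(011) 122-2333", @"\d+"), this produces the three digit groups "011", "122", and "2333", which correspond to match indexes 0, 1, and 2.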

Conclusion

The two versions of the ToString extension method created above make it possible, with just a few lines of code, to extract substrings and validate input using regular expressions.

A Binary Look at System.IConvertible Derivations

This entry will focus on extending System.IConvertible objects to retrieve their binary form as a String. Basic knowledge of programming in C# and creating extension methods is assumed.

Converting from one data type to another (Casting) is quite commonplace in computer programming. In C# this can be done in various ways:

  1. Implicit Casting
  2. Explicit Casting
  3. Using System.Convert

The list above will be described in the following three sections.

Implicit Casting

This is one of the most basic forms. When one variable of one data type is assigned to another with a different yet compatible data type, they can be implicitly cast. Here is an example:

long lngValue = 9223372036854775806L;
int intValue = 2147483646;
lngValue = intValue;

Above is an example of implicit casting. A long (Int64) variable occupies 64 bits (or 8 bytes, or 16 nibbles). An int (Int32) variable occupies 32 bits (or 4 bytes, or 8 nibbles). Since Int32 memory space can easily fit within Int64 memory space (as depicted below), an implicit conversion can be applied from Int32 to Int64. An explicit cast must be performed when converting in the opposite direction (Int64 to Int32), where truncation can occur.

[Figure: a visual comparison between Int32 and Int64 memory space (Int32 to Int64 cast)]

Explicit Casting

This is another basic form of casting. Here is an example:

long lngValue = 2147483646L;
int intValue = 0;
intValue = (int)lngValue;

Above is an example of explicit casting. Since a 64-bit value does not always fit within a 32-bit memory space, we must use an explicit cast and accept that truncation (and therefore loss of data) may occur. Truncation is depicted below:

[Figure: an Int64 to Int32 cast]
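The value used in the example above happens to fit in an Int32, so nothing is lost there. As a rough sketch of what truncation looks like when the value does not fit, the code below contrasts an unchecked cast, which keeps only the low 32 bits, with a checked cast, which throws instead. The class and method names are hypothetical and exist only for this illustration.

```csharp
using System;

public static class TruncationSketch
{
  // Keeps only the low 32 bits of the 64-bit value (silent truncation).
  public static int Truncate(long value)
  {
    return unchecked((int)value);
  }

  // Throws an OverflowException when the value does not fit in an Int32.
  public static int ConvertOrThrow(long value)
  {
    return checked((int)value);
  }
}
```

For instance, 4294967298L is 2^32 + 2, so TruncationSketch.Truncate(4294967298L) returns 2, whereas TruncationSketch.ConvertOrThrow(4294967298L) throws an OverflowException.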

Using System.Convert

System.Convert contains many methods meant to convert from one base data type to another. There are methods such as ToInt64, ToByte, ToDateTime, etc. Here is an example:

int intValue = 2147483646;
long value = Convert.ToInt64(intValue);

Above is an example using the ToInt64 method of System.Convert. The method accepts the Int32 value we initialized and converts it to Int64. This particular method simply wraps an explicit cast: (long)value. Others are dependent on an object’s implementation of the System.IConvertible interface as shown in the implementation of the Convert.ToInt64(object) method:

public static long ToInt64(object value)
{
  if (value != null)
  {
    return ((IConvertible) value).ToInt64(null);
  }
  return 0L;
}

Above is the implementation of Convert.ToInt64(object). The ToInt64 method of value is called after casting it to System.IConvertible, which shows its dependence on System.IConvertible objects.

The Extension

Now that we have gone over the basics of casting and the dependency on System.IConvertible objects, the extension method can be created. Below is the implementation:

using System;
using System.Globalization;

public static class IConvertibleToBinaryString
{
  public static string ToBinaryString(this IConvertible @this)
  {
    // Convert to long first so the base can be specified in Convert.ToString.
    long value = Convert.ToInt64(@this, CultureInfo.InvariantCulture);
    string converted = Convert.ToString(value, 2);
    return converted;
  }
}

Above is the implementation of the extension method to convert objects deriving from System.IConvertible to their binary form as a String. First, the object is converted to long. This must be done to be able to specify the base for the Convert.ToString method that follows. The culture information provided is optional. Convert.ToString accepts a value of type long to convert and an integer specifying the base. This is important. Base 2 is binary; therefore, calling Convert.ToString(value, 2) indicates a conversion to the binary form of the long value. There are other bases that can be used as well to create methods such as:

  1. ToHexString (use base 16)
  2. ToOctalString (use base 8)
  3. ToDecimalString (use base 10)
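As a sketch of how two of the methods in the list above might look, the code below follows the same pattern as ToBinaryString and changes only the base. The class name IConvertibleBaseStrings is a hypothetical name for this illustration. Note that Convert.ToString(long, int) accepts only the bases 2, 8, 10, and 16.

```csharp
using System;
using System.Globalization;

public static class IConvertibleBaseStrings
{
  // Hexadecimal form (base 16); Convert.ToString emits lowercase digits.
  public static string ToHexString(this IConvertible @this)
  {
    long value = Convert.ToInt64(@this, CultureInfo.InvariantCulture);
    return Convert.ToString(value, 16);
  }

  // Octal form (base 8).
  public static string ToOctalString(this IConvertible @this)
  {
    long value = Convert.ToInt64(@this, CultureInfo.InvariantCulture);
    return Convert.ToString(value, 8);
  }
}
```

With these in place, the int value 255 converts to "ff" through ToHexString and the int value 8 converts to "10" through ToOctalString.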

Proving Binary Validity

How does one know that the method actually works? Unit-testing is how! Below is a basic method testing four uses of the extension method previously created. This uses MSTest:

[TestMethod]
public void ToBinaryFromLong()
{
  long lv = 10L;
  Assert.AreEqual("1010", lv.ToBinaryString());

  short sv = 2;
  Assert.AreEqual("10", sv.ToBinaryString());

  Boolean bln = true;
  Assert.AreEqual("1", bln.ToBinaryString());

  bln = false;
  Assert.AreEqual("0", bln.ToBinaryString());
}

The test above should pass when run. This means the binary form is retrieved as expected.

Conclusion

This entry provided a look at a combination of System.Convert, System.IConvertible, and extension methods. Please note: This entry demonstrated extending System.IConvertible objects. It is not meant to suggest that a String of binary digits has many common uses. This is an enjoyable entry meant to prove it can be done.

Run a System.Action Instantiation Asynchronously

Here is the second article in the Extension Me series. The Extension Me series focuses on various uses of extension methods in .NET and primarily uses C# as the language of choice. This article will briefly focus on the topic of asynchronous programming by extending System.Action delegates. Using the extension method, any code wrapped in a System.Action delegate can be executed asynchronously. The reader is assumed to have experience with the basics of programming in C# and to know what extension methods are. Please be advised: the implications of asynchronous programming are not to be ignored, for severe consequences can occur.

Not only is asynchronous programming fun and beneficial (if used correctly), it is becoming an essential skill as Windows 8 is introduced. Although the implementation of the following asynchronous pattern is incompatible with Metro-style WinRT apps, the theory of asynchronous programming will surely be applicable. More information on Asynchronous Programming can be found at Visual Studio Asynchronous Programming where How-To videos, whitepapers, samples, and walkthroughs are available.

Below is the implementation of the extension method:

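using System;
using System.Threading;

public static class ActionExtensions
{
  public static void Async(this Action @this)
  {
    // Action.Invoke matches the ThreadStart signature, so it can start a thread directly.
    var thread = new Thread(@this.Invoke);
    thread.Start();
  }
}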

The code snippet above shows how extension methods can introduce the beauty of System.Action and System.Threading.Thread objects working together. The Action and the Thread are very close. An essential function of System.Action is wrapping a block of executable code while an essential function of System.Threading.Thread is wrapping a block of execution. By embracing these two abilities, code can be executed asynchronously just by wrapping it in an Action delegate. Here is an example:

public void ReturnsQuick()
{
  new Action(() =>
  {
    //Long Running Code here runs on a separate thread…
  }).Async();
}

When the ReturnsQuick method is called, it creates a new instance of an Action passing a lambda containing “long-running code”. Using the extension method created previously, the code is executed on a separate thread. The code is now asynchronous and, as the name suggests, the ReturnsQuick method returns immediately after calling Async().
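One consequence of returning immediately is that the caller has no handle on the work it started. As a possible variation, sketched here under the assumption that the caller sometimes needs to wait for completion, the extension method can return the Thread it creates. The names ActionThreadExtensions and StartThread are hypothetical and are not part of the original implementation.

```csharp
using System;
using System.Threading;

public static class ActionThreadExtensions
{
  // Hypothetical variant of Async: starts the action on a new thread
  // and hands the Thread back so the caller can Join it later.
  public static Thread StartThread(this Action @this)
  {
    var thread = new Thread(@this.Invoke);
    thread.Start();
    return thread;
  }
}
```

A caller can then write var worker = new Action(() => { /* long-running code */ }).StartThread(); and later call worker.Join() to block until the work has finished.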

As mentioned above when introducing the topic of this article, there are plenty of caveats that must be addressed before adopting this approach (especially application-wide). Shared resources are among the most important things to consider. If two threads try to access a single resource at the same time, one of them has to lose! Please take a look at the available PDF, “Threading in C#” by Joseph Albahari, for a more detailed explanation of proper asynchronous approaches.
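To illustrate the shared resource concern, here is a minimal sketch in which several threads increment one counter. It uses the C# lock statement so that only one thread touches the counter at a time; without the lock, increments from different threads can overwrite one another and the final total can come up short. All names in the sketch are hypothetical, and it is only one of several ways to guard shared state.

```csharp
using System;
using System.Threading;

public static class SharedCounterSketch
{
  // Object used solely as the lock target guarding _counter.
  private static readonly object Gate = new object();
  private static int _counter;

  // Starts threadCount threads that each increment the shared counter
  // incrementsPerThread times, waits for all of them, and returns the total.
  public static int Run(int threadCount, int incrementsPerThread)
  {
    _counter = 0;
    var threads = new Thread[threadCount];
    for (int i = 0; i < threadCount; i++)
    {
      threads[i] = new Thread(() =>
      {
        for (int j = 0; j < incrementsPerThread; j++)
        {
          // Only one thread at a time may execute this block.
          lock (Gate)
          {
            _counter++;
          }
        }
      });
      threads[i].Start();
    }
    foreach (Thread thread in threads)
    {
      thread.Join();
    }
    return _counter;
  }
}
```

Because every increment happens inside the lock, SharedCounterSketch.Run(4, 10000) returns 40000.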

Serialization, Encryption, and Extension Methods Working In Harmony

Recently I wrote a blog entry called “Extension Me Serialize: With Encryption!” for Magenic focusing on encrypting serialized data. The URL is http://magenic.com/Blog/ExtensionMeSerializeWithEncryption.aspx and it has been referenced on Channel 9: http://channel9.msdn.com/Shows/This+Week+On+Channel+9/TWC9-Windows-8-C-Amp-NuGet-Mouse-Mischief-and-more.

Embracing the power of extension methods, I offered the ability to easily serialize objects while also following secure-by-default and defense-in-depth approaches. Also, as noted in the post, depending solely on encryption is not a responsible form of security for your IT system. The protection and maintenance of your encryption keys must be the first step. Encryption is meaningless unless the encryption keys are properly managed. If an attacker can easily access the keys used for data encryption and can apply them to decrypt the data, the data is plaintext to them. I implore you, protect your keys!