Thursday, September 8, 2011

Regex vs. .NET’s TrimEnd

I recently had a scenario where I had to compare a large list of strings against a keyword, or in some instances a series of keywords, to find matches. This seemed fairly simple at first, but after implementing the code and testing the results I noticed that certain words were being ignored. For example, a keyword search for ‘card’ on the sentence “This card goes after that card.” would only return the first ‘card’ in the sentence. The reason was that in the list I was dealing with, the second ‘card’ was actually ‘card.’ with a trailing period. I had no control over how the list was built, so some string manipulation was needed to scrub out the noise characters, similar to how SQL Server's full-text search discards noise words (now referred to as stop words). It was decided that I would only scrub the noise characters at the end of each line and ignore them if they were mid-sentence.
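To make the problem concrete, here is a minimal sketch of the mismatch. The class and method names are made up for illustration, and splitting on spaces is only an assumption about how the comparison worked; the original matching code is not shown in this post.

```csharp
using System;
using System.Linq;

public static class KeywordMatchSketch
{
    // Hypothetical noise characters; the real list depends on the data.
    private static readonly char[] NoiseChars = { ',', '.', ' ', ':', ';', '!', '-', '?' };

    // Counts whole-word occurrences of keyword in line.
    // When scrub is true, trailing noise characters are trimmed from the line first.
    public static int CountMatches(string line, string keyword, bool scrub)
    {
        string text = scrub ? line.TrimEnd(NoiseChars) : line;

        return text.Split(' ')
                   .Count(word => string.Equals(word, keyword, StringComparison.OrdinalIgnoreCase));
    }

    public static void Main()
    {
        const string line = "This card goes after that card.";

        Console.WriteLine(CountMatches(line, "card", false)); // prints 1: "card." does not equal "card"
        Console.WriteLine(CountMatches(line, "card", true));  // prints 2: the trailing period is gone
    }
}
```

Only the end of the line is scrubbed here, which matches the decision above to leave mid-sentence punctuation alone.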

There are two obvious choices here: an old-school Regex pattern, or the TrimEnd method on .NET’s string class.

I was always under the impression that regular expressions would outperform .NET’s string operations, so this was an easy opportunity to put that theory to the test.

First I wrote the regular expression to clean off any trailing punctuation:

private string GetRegexValue(string sampleText)
{
    // Requires System.Diagnostics (Stopwatch) and System.Text.RegularExpressions (Regex).
    Stopwatch s = Stopwatch.StartNew();
    // Match a punctuation character at the end of the text or just before a \r\n line break.
    string pattern = @"(\p{P})(?=\Z|\r\n)";
    string result = string.Empty;

    // Repeat the replacement 100,000 times so the elapsed time is measurable.
    for (int i = 0; i < 100000; i++)
    {
        result = Regex.Replace(sampleText, pattern, "");
    }

    lblRegexTime.Text = string.Format("{0} ms", s.ElapsedMilliseconds);

    return result;
}



Next I wrote a similar function using string.TrimEnd:



private string GetTrimValue(string sampleText)
{
    // Requires System.Diagnostics (Stopwatch).
    Stopwatch s = Stopwatch.StartNew();
    string result = string.Empty;
    // Noise characters to strip from the end of the text.
    char[] charsToTrim = { ',', '.', ' ', ':', ';', '!', '-', '?' };

    // Repeat the trim 100,000 times so the elapsed time is measurable.
    for (int i = 0; i < 100000; i++)
    {
        result = sampleText.TrimEnd(charsToTrim);
    }

    lblTrimTime.Text = string.Format("{0} ms", s.ElapsedMilliseconds);

    return result;
}



To get a good idea of how the performance matched up, I ran both through a loop 100,000 times. The results were surprising.


image


After the initial run, both TrimEnd and the Regex are even faster.


image


So what does this mean? Are string operations always faster than regular expressions? No. But it's always a good idea to test before making any assumptions as to which is faster.
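One caveat when reading the timings: the two approaches above are not strictly equivalent. The sketch below reuses the same pattern and the same trim characters to show where the outputs differ. The class name, helper names, and sample strings are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrimBehaviorSketch
{
    // Same pattern and trim characters as the two functions in this post.
    private const string Pattern = @"(\p{P})(?=\Z|\r\n)";
    private static readonly char[] CharsToTrim = { ',', '.', ' ', ':', ';', '!', '-', '?' };

    public static string WithRegex(string text)
    {
        return Regex.Replace(text, Pattern, "");
    }

    public static string WithTrimEnd(string text)
    {
        return text.TrimEnd(CharsToTrim);
    }

    public static void Main()
    {
        // Hypothetical samples: one trailing mark, two trailing marks, and two lines.
        string[] samples = { "that card.", "that card?!", "first card.\r\nsecond card." };

        foreach (string sample in samples)
        {
            Console.WriteLine("Regex:   [{0}]", WithRegex(sample));
            Console.WriteLine("TrimEnd: [{0}]", WithTrimEnd(sample));
        }
    }
}
```

For “that card.” both return “that card”. For “that card?!” the Regex removes only the final ‘!’ because the pattern matches a single punctuation character, while TrimEnd removes both marks. For the two-line sample the Regex strips the period at the end of each line, while TrimEnd only touches the very end of the string.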

Monday, April 18, 2011

PDF support in SharePoint 2010

SharePoint 2010 does not offer much support for PDFs out of the box, but you can add it in a few short steps.

First there is the missing icon issue. If you upload a PDF to a document library you just get this:

 image

To add support for the PDF icon, simply download it from here and place it in the Images directory on each SharePoint server in your environment (C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\TEMPLATE\IMAGES).

Once you have the image in place, update the DOCICON.xml file (in C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\TEMPLATE\XML) with the file type and the image to use: <Mapping Key="pdf" Value="pdficon_small.gif" />

image

After the next IISReset you should have icons for your PDF files.

The second issue with PDFs out of the box is searching for them. The Search service application does not have PDF in its list of file types to crawl, so you will need to add it. To accomplish this, open Central Administration and click on “Manage Service Applications”. Now click “File Types” on the left menu pane.

image

Click “New File Type” and add PDF to the list.

image

Run a full crawl and you should now get PDFs in your search results.

To enable the PDF iFilter see my previous post here.

Thursday, March 3, 2011

Silverlight SharePoint Web Part using the HTML Bridge

I recently ran across an issue with the HTML Bridge and conflicting behavior between IE and Firefox.

HTML Bridge info: http://msdn.microsoft.com/en-us/library/cc645076(v=VS.95).aspx

I had the following code in Silverlight to set a JavaScript variable to a Silverlight web part instance on a SharePoint page. By creating this variable I was able to call into the Silverlight control, and pass it parameters, from the JavaScript running on the page.

private static void InitializeReceiverJsVariable()
{
    string pluginId = HtmlPage.Plugin.Id;

    if (string.IsNullOrEmpty(pluginId))
        throw new Exception("No Silverlight Plugin found!  Ensure the Silverlight plugin has an ID!");

    string jsCode = string.Format("var {0} = document.getElementById('{1}').content.{0};", JS_VARIABLE_NAME, pluginId);

    HtmlPage.Window.Eval(jsCode);
}

Note: any Silverlight methods that you want to call from JavaScript will need to have the [ScriptableMember] attribute on them. Alternatively, you could create a class to house all your scriptable methods and add the [ScriptableType] attribute to the whole class.
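As a rough illustration of that note, the sketch below shows one way a scriptable class might look. Everything named here is hypothetical: the DocumentReceiver class, its ReceiveDocument method, the ScriptBridge helper, and the "slWebPart" key are stand-ins, since the actual web part code is not shown in this post. It only compiles inside a Silverlight project, because the attributes and HtmlPage live in System.Windows.Browser.

```csharp
using System.Windows.Browser;

// Hypothetical receiver class. [ScriptableType] makes its public members
// callable from script, so no per-method [ScriptableMember] is needed.
[ScriptableType]
public class DocumentReceiver
{
    private string lastDocumentUrl;

    // Hypothetical method that JavaScript on the page would call.
    public void ReceiveDocument(string documentUrl)
    {
        lastDocumentUrl = documentUrl;
    }
}

public static class ScriptBridge
{
    // Assumed key; the post's JS_VARIABLE_NAME constant plays this role.
    public const string JsVariableName = "slWebPart";

    // Registers the receiver so it is reachable from the page as
    // document.getElementById(pluginId).content.slWebPart
    public static void Register()
    {
        HtmlPage.RegisterScriptableObject(JsVariableName, new DocumentReceiver());
    }
}
```

With a registration like this in place, the jsCode string in InitializeReceiverJsVariable has something to point at: the content.{0} lookup resolves to the object registered under that key.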

After compiling and deploying the web part to SharePoint, I tested the JavaScript. In Internet Explorer the Silverlight web part and the document library could communicate; in Firefox they could not. After several tests and some trial and error, the problem turned out to be declaring the JavaScript variable from Silverlight. The solution was to declare the JavaScript variable on the page and only set its value from Silverlight, instead of creating the variable there too.

JavaScript addition:

var slWebPart = null;

Silverlight change:

private static void InitializeReceiverJsVariable()
{
    string pluginId = HtmlPage.Plugin.Id;

    if (string.IsNullOrEmpty(pluginId))
        throw new Exception("No Silverlight Plugin found!  Ensure the Silverlight plugin has an ID!");

    string jsCode = string.Format("{0} = document.getElementById('{1}').content.{0};", JS_VARIABLE_NAME, pluginId);

    HtmlPage.Window.Eval(jsCode);
}

After the change to the web part it worked in IE, Firefox, and Chrome. I still have not figured out the exact issue or why you can’t declare the variable using the HTML Bridge, but the workaround solved the problem. If you have come across this or know the technical reason, comments are always welcome.

Friday, February 11, 2011

Rectifying a missing namespace for MSBuild

Today I was working on adding some additional projects to a build and was getting the error:

Error CS0234: The type or namespace name 'Linq' does not exist in the namespace 'System' (are you missing an assembly reference?)

Wait, what? This builds fine in Visual Studio…

System.Linq is part of System.Core. I did not have this directly referenced in Visual Studio, but it built nonetheless.

image

I started to look at other projects in the build that used Linq but did not have any build issues and realized that they had System.Core referenced directly. So naturally I tried to go ahead and add System.Core to the project that was failing. Then I got this:

image

System.Core is automatically added by the build system. Well, apparently not, so I had to add it manually. I opened the project file for my C# project in Notepad and added the following to get around this error and get System.Core referenced.

image

After adding System.Core I was able to re-run the build and everything worked fine. As my co-worker Ralph would say, “it has been rectified”. Hence the title of this blog post.

The cause of this issue is a project that was originally created as a .NET 4.0 project and then changed to a .NET 3.5 project. When Visual Studio makes that alteration to the project file, it does not add the System.Core reference to the project.

Friday, January 21, 2011

Find in files with preview pane

A little over a year ago we moved from Visual SourceSafe to Team Foundation Server. Overall I was very happy with TFS and all the new features. There was one feature that I greatly missed from VSS, and that was the ability to search within all projects for some specific text. Occasionally I would come across something that I had done before and wanted to reference it before writing it again. This can easily be accomplished in Windows using search. To enable searching on your project workspace you must first add it to the list of places that Windows Search indexes.

Open Control Panel and click on Indexing Options.

image

Click on Modify and select the folders that you want indexed.

image

You will also need to click on Advanced and select the “Index Properties and File Contents” option.

image

*Changing this option will reset the index and you will lose the ability to search items until they are re-indexed.

Now when performing a search all my project files are included.

image

Windows 7 does not have the ability to show a preview of C# files by default.

image

Adding this functionality is easy using the Managed Preview Handler Framework. Stephen Toub has a great walkthrough on this here. There is a CodePlex project that wraps his work, along with a few others, into an easy installer here. After either registering your own handlers or installing the CodePlex project, you can use the Windows preview pane to see the files from your search results.

image