Easy code: Parse HTML String to get InnerText

Today I had to fetch comments stored in a database and publish them in a Web Forms application.

These comments were formatted with HTML tags (not well formed), so I needed to parse the data and keep only the inner text.

I wrote this piece of code, which is very simple but useful too.

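// Requires "using System.Text.RegularExpressions;" at the top of the file.
public static string GetInnerHtmltext(string data)
{
  // Decode HTML entities first (for example, &lt; becomes <).
  string decode = System.Web.HttpUtility.HtmlDecode(data);
  // Lazily match everything between < and >, including across line breaks.
  Regex objRegExp = new Regex("<(.|\n)+?>");
  string replace = objRegExp.Replace(decode, "");
  return replace.Trim("\t\r\n ".ToCharArray());
}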
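
Here is a minimal sketch of how the method could be tried out in a small console program. The class name HtmlTextDemo and the sample strings are made up for illustration, and I swapped in System.Net.WebUtility.HtmlDecode (instead of System.Web.HttpUtility.HtmlDecode) so the sketch runs without a reference to System.Web.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical demo class; the name is not part of the original post.
public static class HtmlTextDemo
{
    public static string GetInnerHtmltext(string data)
    {
        // Assumption: WebUtility.HtmlDecode stands in for HttpUtility.HtmlDecode.
        string decode = WebUtility.HtmlDecode(data);
        // Remove every <...> run, including tags that span line breaks.
        string replace = Regex.Replace(decode, "<(.|\n)+?>", "");
        // Trim tabs, line breaks and spaces from both ends.
        return replace.Trim("\t\r\n ".ToCharArray());
    }

    public static void Main()
    {
        // Prints: Hello world
        Console.WriteLine(GetInnerHtmltext("<p>Hello <b>world</b></p>\r\n"));
        // Prints: Nice post & thanks
        Console.WriteLine(GetInnerHtmltext("&lt;i&gt;Nice&lt;/i&gt; post &amp; thanks"));
    }
}
```

Because the entities are decoded before the regular expression runs, tags that were stored in encoded form (like the second sample) are stripped as well.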

Have fun!

Comments (4)

  1. Shilpa says:

    nice code but how to parse html images ?

    my mail address b is shilpakmlthn@yahoo.co.in

  2. Jaspreet says:

    Thanks dude! The code rocks!!!

  3. Oldarney says:

    DUDE. This code sounds awesome… Any tips on how to get this code to parse my stuff… I have a 7 megabyte file full of HTML, I only want the visible text.

    I am a total newb when it comes to .net. on the other hand I have some experience with C++ and alot with web languages.

  4. Jose says:

    need to more info about forms and  blogs