A typical web page contains a lot of whitespace. We put it there in the form of tabs and line breaks to make the document more maintainable and easier to read. But it has a price: it adds to the overall weight of the document in kilobytes, and it takes longer for the browser to render a page with a lot of whitespace. This is especially true for IE6, which renders whitespace a lot slower than its counterparts.

The problem is that we don't want to write HTML without whitespace; it would be impossible to maintain. What we are looking for is an automatic way to filter the whitespace at runtime. In ASP.NET this is easy: you can override the page's Render method and do the filtering there. We also want to be able to turn the filtering on and off easily, because during development we often want to look at the rendered HTML and make sense of it.

Here is an example that does just that. It overrides the Render method of the page and is turned on or off in the web.config. Implement this on the aspx page, or even better on the master page if you use ASP.NET 2.0:
//Add this to the top of the page
using System;
using System.Configuration;
using System.Text.RegularExpressions;
using System.Web.UI;

//Overrides the Render method
protected override void Render(HtmlTextWriter writer) {
    // Render the page into an in-memory buffer first
    using (HtmlTextWriter htmlwriter = new HtmlTextWriter(new System.IO.StringWriter())) {
        base.Render(htmlwriter);
        string html = htmlwriter.InnerWriter.ToString();

        // Filter only when RemoveWhitespace is "true" in web.config; a missing key means off
        if (string.Equals(ConfigurationManager.AppSettings.Get("RemoveWhitespace"), "true", StringComparison.OrdinalIgnoreCase)) {
            html = Regex.Replace(html, @"(?<=[^])\t{2,}|(?<=[>])\s{2,}(?=[<])|(?<=[>])\s{2,11}(?=[<])|(?=[\n])\s{2,}", string.Empty);
            html = Regex.Replace(html, @"[ \f\r\t\v]?([\n\xFE\xFF/{}[\];,<>*%&|^!~?:=])[\f\r\t\v]?", "$1");
            html = html.Replace(";\n", ";");
        }

        // Send the (possibly filtered) markup to the real writer
        writer.Write(html.Trim());
    }
}
Then add this to the appSettings section of the web.config:
<add key="RemoveWhitespace" value="true"/>
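The check in the Render override treats the setting as a case-insensitive "true" and treats a missing key as off. A minimal sketch of that comparison on its own, where `WhitespaceSwitch` and `IsEnabled` are hypothetical names used only for illustration and the string argument stands in for the value read from appSettings:

```csharp
using System;

public static class WhitespaceSwitch
{
    // Hypothetical helper: takes the raw appSettings value,
    // which is null when the key is missing from web.config.
    public static bool IsEnabled(string setting)
    {
        // The static string.Equals is null-safe, so a missing key yields false.
        return string.Equals(setting, "true", StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(IsEnabled("true"));  // True
        Console.WriteLine(IsEnabled("TRUE"));  // True
        Console.WriteLine(IsEnabled("false")); // False
        Console.WriteLine(IsEnabled(null));    // False
    }
}
```

With this behavior, removing the key from the web.config during development is enough to get the unfiltered HTML back.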
It's as easy as that, and the overhead is reasonable. I use this on most of the websites I build and have never noticed any negative impact.

Update, May 4th 2006

Some readers have had problems with this method, complaining that it screws up the HTML and embedded JavaScript. I've modified the method to address these issues and created a safe whitespace removal method.

OK, this is not new. I've written about this a few times in the past. The thing is that removing whitespace is a very tricky discipline that differs from site to site. At least that was what I thought until very recently.

For some unexplained reason I started working on a simple little method that removes whitespace in a way that works on all websites without breaking any HTML. Maybe not so unexplained, since I've written about it so many times that it would seem I have a secret obsession.

Obsession or not, here is the code I ended up with after a few hours of hacking. Just copy the code onto your base page or master page and watch the magic.
private static readonly Regex REGEX_BETWEEN_TAGS = new Regex(@">\s+<", RegexOptions.Compiled);
private static readonly Regex REGEX_LINE_BREAKS = new Regex(@"\n\s+", RegexOptions.Compiled);

/// <summary>
/// Initializes the <see cref="T:System.Web.UI.HtmlTextWriter"></see> object and calls on the child
/// controls of the <see cref="T:System.Web.UI.Page"></see> to render.
/// </summary>
/// <param name="writer">The <see cref="T:System.Web.UI.HtmlTextWriter"></see> that receives the page content.</param>
protected override void Render(HtmlTextWriter writer) {
    using (HtmlTextWriter htmlwriter = new HtmlTextWriter(new System.IO.StringWriter())) {
        base.Render(htmlwriter);
        string html = htmlwriter.InnerWriter.ToString();
        html = REGEX_BETWEEN_TAGS.Replace(html, "> <");
        html = REGEX_LINE_BREAKS.Replace(html, string.Empty);
        writer.Write(html.Trim());
    }
}
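To see what the two expressions do outside of a page, here is a small console sketch that applies them to a sample string. The class `WhitespaceDemo`, the helper `RemoveWhitespace`, and the sample markup are hypothetical and only for illustration. The patterns and replacement steps are the same as in the Render override above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceDemo
{
    // Same patterns as in the Render override above.
    private static readonly Regex REGEX_BETWEEN_TAGS = new Regex(@">\s+<", RegexOptions.Compiled);
    private static readonly Regex REGEX_LINE_BREAKS = new Regex(@"\n\s+", RegexOptions.Compiled);

    // Hypothetical helper: the two replacements plus the final Trim, without the page plumbing.
    public static string RemoveWhitespace(string html)
    {
        // Collapse any run of whitespace between two tags into a single space.
        html = REGEX_BETWEEN_TAGS.Replace(html, "> <");
        // Remove each remaining line break together with the whitespace that follows it.
        html = REGEX_LINE_BREAKS.Replace(html, string.Empty);
        return html.Trim();
    }

    public static void Main()
    {
        string input = "<div>\n    <p>Hello</p>\n    <p>World</p>\n</div>\n";
        Console.WriteLine(RemoveWhitespace(input));
        // prints: <div> <p>Hello</p> <p>World</p> </div>
    }
}
```

One thing worth checking on your own markup: the second expression removes a line break together with the indentation after it, so text wrapped across indented lines, such as "Hello\n    world", comes out as "Helloworld".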
Remember that whitespace removal speeds up rendering, especially in IE, and reduces the overall weight of your page.

