HTML Encoding UTF-8 Characters

Written by William Roush on November 19, 2018 at 1:53 am

I ran into an annoying problem where I needed to HTML-encode UTF-8 characters, such as the em-dash (—).

HttpUtility.HtmlEncode and WebUtility.HtmlEncode will only encode characters with code values up to 255 (so basically extended ASCII), which leaves characters like the em-dash untouched.
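A quick way to see the limit is to run a few sample strings through WebUtility.HtmlEncode. This is only a minimal sketch: the class name HtmlEncodeLimitDemo and the method EncodeSamples are made up for illustration, and only WebUtility.HtmlEncode itself comes from .NET.

```csharp
using System;
using System.Net;

// Hypothetical demo class, not part of .NET.
public static class HtmlEncodeLimitDemo
{
    // Returns the encoded form of three sample strings.
    public static string[] EncodeSamples()
    {
        return new[]
        {
            WebUtility.HtmlEncode("<b>"),    // markup characters are escaped: &lt;b&gt;
            WebUtility.HtmlEncode("\u00E9"), // e-acute (233) is below 256: &#233;
            WebUtility.HtmlEncode("\u2014"), // em-dash (8212) is above 255: returned unchanged
        };
    }
}
```

Printing the result with Console.WriteLine(string.Join(" | ", HtmlEncodeLimitDemo.EncodeSamples())) shows the em-dash coming back as the raw character.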


AntiXssEncoder.HtmlEncode(input, true)

is what you want. The only annoyance is that, called this way, it only supports named entities with a fallback to decimal notation; you cannot force decimal notation.
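If decimal notation is needed for every non-ASCII character, one possible workaround is a small hand-rolled pass over the output of WebUtility.HtmlEncode. The sketch below is an assumption on my part rather than anything built into .NET: DecimalHtmlEncoder and its Encode method are hypothetical names, and it has not been hardened the way AntiXssEncoder has, so treat it as a starting point only.

```csharp
using System;
using System.Globalization;
using System.Net;
using System.Text;

// Hypothetical helper, not part of .NET.
public static class DecimalHtmlEncoder
{
    // Escapes markup characters with WebUtility.HtmlEncode, then rewrites every
    // remaining non-ASCII code point as a decimal character reference (&#NNNN;).
    public static string Encode(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return input;
        }

        string basic = WebUtility.HtmlEncode(input);
        var sb = new StringBuilder(basic.Length);

        for (int i = 0; i < basic.Length; i++)
        {
            char c = basic[i];
            int codePoint;

            if (c < 128)
            {
                // Plain ASCII, including entities WebUtility already produced.
                sb.Append(c);
                continue;
            }

            if (char.IsHighSurrogate(c) && i + 1 < basic.Length && char.IsLowSurrogate(basic[i + 1]))
            {
                // Combine a valid surrogate pair into a single code point.
                codePoint = char.ConvertToUtf32(c, basic[i + 1]);
                i++;
            }
            else if (char.IsSurrogate(c))
            {
                // Lone surrogate: substitute U+FFFD (replacement character).
                codePoint = 0xFFFD;
            }
            else
            {
                codePoint = c;
            }

            sb.Append("&#")
              .Append(codePoint.ToString(CultureInfo.InvariantCulture))
              .Append(';');
        }

        return sb.ToString();
    }
}
```

With this, DecimalHtmlEncoder.Encode("a \u2014 b") yields "a &#8212; b".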

 

See: https://docs.microsoft.com/en-us/dotnet/api/system.web.security.antixss.antixssencoder.htmlencode?view=netframework-4.7


