HTML Encoding UTF-8 Characters

Written by William Roush on November 19, 2018 at 1:53 am

I ran into an annoying problem where I needed to HTML-encode non-ASCII Unicode characters, such as the em dash (—).

HttpUtility.HtmlEncode and WebUtility.HtmlEncode only encode characters with code values up to 255 (so basically extended ASCII, i.e. Latin-1). Anything above that, like the em dash (U+2014), is left unencoded.
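A quick way to see that limit is to run a Latin-1 character and an em dash through WebUtility.HtmlEncode. This is only a small sketch; the class name HtmlEncodeDemo and the sample strings are made up for illustration.

```csharp
using System;
using System.Net;

public static class HtmlEncodeDemo
{
    public static void Main()
    {
        // U+00E9 (é) sits inside the 160-255 range, so it becomes a numeric entity.
        string latin1 = WebUtility.HtmlEncode("caf\u00E9");

        // U+2014 (em dash) is above 255, so it is passed through untouched.
        string emDash = WebUtility.HtmlEncode("a \u2014 b");

        Console.WriteLine(latin1 == "caf&#233;");    // True
        Console.WriteLine(emDash == "a \u2014 b");   // True: nothing was encoded
    }
}
```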


AntiXssEncoder.HtmlEncode(input, true)

is what you want. The only annoyance is that it only supports named entities with a fallback to decimal notation; you cannot force decimal notation.
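If decimal notation everywhere is a hard requirement, one possible workaround is a small post-processing pass over the output of WebUtility.HtmlEncode. The sketch below is an assumption on my part and not something the framework provides: DecimalHtmlEncoder and EncodeNonAsciiAsDecimal are hypothetical names, and it has not been reviewed as an XSS defense the way AntiXssEncoder has.

```csharp
using System;
using System.Globalization;
using System.Net;
using System.Text;

// Hypothetical helper (not part of .NET): forces decimal numeric
// character references for every non-ASCII character.
public static class DecimalHtmlEncoder
{
    public static string EncodeNonAsciiAsDecimal(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return input;
        }

        // Let the framework handle <, >, &, quotes and the 160-255 range first.
        string encoded = WebUtility.HtmlEncode(input);

        var sb = new StringBuilder(encoded.Length);
        for (int i = 0; i < encoded.Length; i++)
        {
            char c = encoded[i];
            if (c < 128)
            {
                // Plain ASCII, including entities emitted by the first pass.
                sb.Append(c);
                continue;
            }

            int codePoint = c;
            // Combine a surrogate pair into a single code point.
            if (char.IsHighSurrogate(c) && i + 1 < encoded.Length
                && char.IsLowSurrogate(encoded[i + 1]))
            {
                codePoint = char.ConvertToUtf32(c, encoded[i + 1]);
                i++;
            }

            sb.Append("&#")
              .Append(codePoint.ToString(CultureInfo.InvariantCulture))
              .Append(';');
        }

        return sb.ToString();
    }
}
```

With this helper, DecimalHtmlEncoder.EncodeNonAsciiAsDecimal("a \u2014 b") yields "a &#8212; b", since U+2014 is 8212 in decimal.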

See: https://docs.microsoft.com/en-us/dotnet/api/system.web.security.antixss.antixssencoder.htmlencode?view=netframework-4.7