PHP htmlentities() + UTF-8 considered harmful

$foo

Result: ヨコノトÂÈÄfun

Hex: e383a8e382b3e3838ee38388c382c388c38466756e

htmlspecialchars($foo)

Result: ヨコノトÂÈÄfun

Hex: e383a8e382b3e3838ee38388c382c388c38466756e

htmlspecialchars($foo, ENT_QUOTES, 'UTF-8')

Result: ヨコノトÂÈÄfun

Hex: e383a8e382b3e3838ee38388c382c388c38466756e

htmlentities($foo)

Result: �������fun

Hex: 266174696c64653b8326756d6c3b266174696c64653b8226737570333b266174696c64653b838e266174696c64653b8388264174696c64653b82264174696c64653b88264174696c64653b8466756e

htmlentities($foo, ENT_QUOTES, 'UTF-8')

Result: ヨコノトÂÈÄfun

Hex: e383a8e382b3e3838ee383882641636972633b264567726176653b2641756d6c3b66756e

brynEscape($foo)

Result: ヨコノトÂÈÄfun

Hex: e383a8e382b3e3838ee38388c382c388c38466756e
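
The comparison above can be reproduced with a short script. This is a minimal sketch, not the code that produced the listing. It rebuilds $foo from the hex dump, and it passes 'ISO-8859-1' explicitly on the assumption that this mimics the bare htmlentities($foo) call, since PHP 5.4 changed the default charset from ISO-8859-1 to UTF-8. escapeUtf8() is a hypothetical stand-in; the source of brynEscape() is not shown here.

```php
<?php
// Hypothetical helper: escapes only the HTML-special characters and
// leaves multibyte UTF-8 sequences alone. This is NOT brynEscape().
function escapeUtf8($text)
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Rebuild $foo from the hex dump above so the result does not depend on
// the encoding of this source file. hex2bin() needs PHP 5.4 or later.
$foo = hex2bin('e383a8e382b3e3838ee38388c382c388c38466756e');

$results = array(
    'raw' => $foo,
    'htmlspecialchars, UTF-8' => htmlspecialchars($foo, ENT_QUOTES, 'UTF-8'),
    // Assumption: an explicit ISO-8859-1 charset stands in for the old default.
    'htmlentities, ISO-8859-1' => htmlentities($foo, ENT_QUOTES, 'ISO-8859-1'),
    'htmlentities, UTF-8' => htmlentities($foo, ENT_QUOTES, 'UTF-8'),
    'escapeUtf8 (hypothetical)' => escapeUtf8($foo),
);

// Print each result as hex so it can be compared with the dumps above.
foreach ($results as $label => $bytes) {
    echo $label, "\n  Hex: ", bin2hex($bytes), "\n";
}
```

Comparing hex rather than rendered text keeps the browser or terminal encoding out of the picture. With the ISO-8859-1 charset each byte of a multibyte sequence is treated as its own character, which is where the stray &atilde; entities come from. With 'UTF-8' the katakana pass through, but the accented Latin letters are still rewritten as named entities.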


See http://www.phpwact.org/php/i18n/utf-8 for more information.
