
idn_to_ascii

(PHP 5 >= 5.3.0, PECL intl >= 1.0.2, PECL idn >= 0.1)

idn_to_ascii - Convert domain name to IDNA ASCII form.

Description

Procedural style

string idn_to_ascii ( string $domain [, int $options ] )

This function converts a Unicode domain name to IDNA ASCII-compatible format.

Parameters

domain

The domain to convert. In PHP 5 it must be UTF-8 encoded.

options

Conversion options: a combination of IDNA_* constants.
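As a rough illustration of combining IDNA_* constants, the sketch below joins two flags with bitwise OR. It assumes the intl extension is loaded and that the file is saved as UTF-8; the choice of IDNA_ALLOW_UNASSIGNED and IDNA_USE_STD3_RULES is only an example, not a recommendation.

```php
<?php
// Assumption: the intl extension is loaded and this file is saved as UTF-8.
$domain = 'täst.de';

// No extra options.
$default = idn_to_ascii($domain, IDNA_DEFAULT);

// Flags are combined with bitwise OR. IDNA_USE_STD3_RULES asks for the
// stricter STD3 host name character rules.
$strict = idn_to_ascii($domain, IDNA_ALLOW_UNASSIGNED | IDNA_USE_STD3_RULES);

var_dump($default, $strict);
```

For a domain such as this one, which contains only letters, both calls should produce the same ASCII form.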

Return Values

The domain name encoded in ASCII-compatible form, or FALSE on failure.
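Because the function can return FALSE, callers may want to check the result with a strict comparison. The following is a minimal sketch; to_ascii_or_null() is a hypothetical helper, not part of the extension, and it assumes the intl extension is loaded and the file is saved as UTF-8.

```php
<?php
// Hypothetical helper: returns the ASCII form, or null when conversion fails.
function to_ascii_or_null($domain)
{
    $ascii = idn_to_ascii($domain);

    // Strict comparison keeps FALSE apart from falsy strings such as '0'.
    return $ascii === false ? null : $ascii;
}

var_dump(to_ascii_or_null('täst.de'));

// A label longer than 63 characters is not a valid DNS label,
// so this conversion is expected to fail.
var_dump(to_ascii_or_null(str_repeat('a', 64) . '.de'));
```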

Examples

Example #1 idn_to_ascii() example

<?php

echo idn_to_ascii('täst.de'); 

?>

The above example will output:

xn--tst-qla.de

See Also

idn_to_utf8() - Convert domain name from IDNA ASCII to Unicode


User Contributed Notes 1 note

edible dot email at gmail dot com
4 years ago
The notes on this function are not very clear and a little misleading.

Firstly, on PHP 5.3 or earlier, you will need to use one of several scripts or classes available on the internet, which may or may not require installing the intl and idn PECL extensions, and you will need PHP 4.0 or later to be able to install both.

Secondly, if you have PHP 5.4 or later, you will not require the PECL extensions.

Third, use of utf8_encode() is not necessary.  In fact, it will potentially prevent idn_to_ascii() from working at all.
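One way to avoid encoding a string that is already UTF-8 is to check it first and convert only when needed. This is a sketch under assumptions: ensure_utf8() is a hypothetical helper, the mbstring extension is assumed to be available, and ISO-8859-1 is assumed as the source encoding for input that is not valid UTF-8.

```php
<?php
// Hypothetical helper. Assumes mbstring is available and that input which
// is not valid UTF-8 is ISO-8859-1 (adjust the fallback for your data).
function ensure_utf8($domain, $fallbackEncoding = 'ISO-8859-1')
{
    if (mb_check_encoding($domain, 'UTF-8')) {
        // Already UTF-8: converting again would double-encode it.
        return $domain;
    }

    return mb_convert_encoding($domain, 'UTF-8', $fallbackEncoding);
}

$latin1 = "t\xE4st.de";     // 'täst.de' as ISO-8859-1 bytes
$utf8   = "t\xC3\xA4st.de"; // 'täst.de' as UTF-8 bytes

var_dump(ensure_utf8($latin1) === $utf8); // converted
var_dump(ensure_utf8($utf8) === $utf8);   // left untouched
```

The result of ensure_utf8() can then be passed to idn_to_ascii() without a separate utf8_encode() call.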

On my setup it was necessary to change the charset in the script meta tags to UTF-8:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

...and to change default_charset in the php.ini file (for example /usr/local/lib/php.ini; it can be located with whereis php.ini or find / -name php.ini):

default_charset = "UTF-8"

The above changes mean that idn_to_ascii() can now be used with that syntax (no need for utf8_encode()).  Previously, the function worked to convert some IDNs, but failed to convert Japanese and Cyrillic IDNs.  Further, no additional locales were enabled or added, and Apache's charset file was left unmodified.

It is also important to remember only to apply the function where required, eg:

idn_to_ascii('cåsino.com') // is wrong

...whereas...

idn_to_ascii('cåsino') // is right

Also be aware of text editors that don't support UTF-8 encoding: the $domain = 'cåsino' value will end up as $domain = '??????' and the function will fail.

I have found that Notepad++ handles UTF-8 easily and reliably for this function when the encoding option is set to UTF-8, not UTF-8 without BOM.
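For anyone who wants to apply the function only to the labels that need it, as suggested above, a per-label conversion might look like the sketch below. labels_to_ascii() is a hypothetical helper, not part of the extension; it assumes the intl extension is loaded and that the input is UTF-8.

```php
<?php
// Hypothetical helper: converts only the labels that contain non-ASCII bytes
// and leaves plain ASCII labels (such as 'com') untouched.
function labels_to_ascii($domain)
{
    $labels = explode('.', $domain);

    foreach ($labels as $i => $label) {
        // Without the /u modifier the pattern matches raw bytes,
        // so 0x80-0xFF detects any non-ASCII byte.
        if (preg_match('/[\x80-\xFF]/', $label)) {
            $ascii = idn_to_ascii($label);
            if ($ascii === false) {
                return false; // propagate the failure to the caller
            }
            $labels[$i] = $ascii;
        }
    }

    return implode('.', $labels);
}

echo labels_to_ascii('cåsino.com');
```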