
recode_string

(PHP 4, PHP 5, PHP 7 < 7.4.0)

recode_string - Recode a string according to a recode request

Description

recode_string(string $request, string $string): string|false

Recodes the string string according to the recode request request.

Parameters

request

The desired recode request type.

string

The string to be recoded.

Return Values

Returns the recoded string, or false if the recode request could not be performed.
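
Because an empty string is also falsy in PHP, a failed request is most safely detected with a strict comparison against false. The following is a minimal sketch, not part of the manual: the helper name recode_or_null() is hypothetical, and the function_exists() guard is there because recode_string() is only defined when the recode extension is loaded.

```php
<?php
// Hypothetical helper (not part of PHP): returns the recoded string,
// or null when the extension is unavailable or the request fails.
function recode_or_null($request, $string)
{
    // recode_string() exists only when the recode extension is loaded.
    if (!function_exists('recode_string')) {
        return null;
    }

    $result = recode_string($request, $string);

    // Strict comparison: an empty string is a valid result, false is not.
    return $result === false ? null : $result;
}

$converted = recode_or_null('lat1..utf-8', "caf\xE9");
if ($converted === null) {
    echo "Conversion unavailable or failed\n";
} else {
    echo $converted, "\n";
}
?>
```

Returning null rather than false is only a design choice for this sketch; it keeps "no result" distinct from any string value.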

Example #1 Basic recode_string() example

<?php
echo recode_string("us..flat", "The following character has a diacritical mark: á");
?>

Notes

A simple recode request may look like "lat1..iso646-de".
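
To illustrate the shape of such a request, here is a small sketch that splits a simple two-charset request at the ".." separator. The helper name split_simple_request() is hypothetical, and the sketch deliberately handles only the plain "BEFORE..AFTER" form; anything else is rejected rather than interpreted.

```php
<?php
// Hypothetical helper: split a simple "BEFORE..AFTER" request into its two
// charset names. Any other shape yields null in this sketch.
function split_simple_request($request)
{
    $parts = explode('..', $request);
    if (count($parts) !== 2 || $parts[0] === '' || $parts[1] === '') {
        return null;
    }
    return array('before' => $parts[0], 'after' => $parts[1]);
}

print_r(split_simple_request('lat1..iso646-de'));
?>
```

For the request shown in the note, the helper returns 'lat1' as the source charset and 'iso646-de' as the target.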

See Also

  • The GNU Recode documentation of your installation, for detailed instructions about recode requests.
  • mb_convert_encoding() - Convert a string from one character encoding to another
  • UConverter::transcode() - Convert a string from one character encoding to another
  • iconv() - Convert a string from one character encoding to another
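
As a rough illustration of two of the alternatives listed above, the sketch below converts the single ISO-8859-1 byte for "á" to UTF-8. It assumes the iconv extension is enabled and guards the mbstring call, since mbstring is optional. Note that the two functions take their arguments in a different order.

```php
<?php
// "á" encoded as a single ISO-8859-1 byte.
$latin1 = "\xE1";

// iconv() takes (from, to, string).
$viaIconv = iconv('ISO-8859-1', 'UTF-8', $latin1);

// mb_convert_encoding() takes (string, to, from); the order is reversed.
// mbstring is an optional extension, so guard the call.
$viaMb = function_exists('mb_convert_encoding')
    ? mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1')
    : null;

// Print the resulting bytes in hexadecimal.
echo bin2hex($viaIconv), "\n"; // c3a1
?>
```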


User Contributed Notes 4 notes

jazfresh at spam-javelin.hotmail.com
20 years ago
I came across a bug (and workaround) when using recode_string. When converting from utf-8 to iso-2022-jp, it would always return an empty string (although it would work fine for conversions from html to utf8). Converting with recode on the command line worked fine, which was odd. I noticed that if I specified "-v" on the command line, recode stated that it was using libiconv to do the conversion.

Using "iconv" instead of recode got the right results.
i.e.

Works:
$str = recode_string("html..utf-8", "&#26085;&#26412;&#35486;"); // Unicode for "Japanese"

Doesn't work:
$str = recode_string("utf-8..iso-2022-jp", $mystring);

Works:
$str = iconv("utf-8", "iso-2022-jp", $mystring);

Don't ask me why. Hope this saves someone some frustrating hours debugging.
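
The workaround described in this note can be sketched as a wrapper that tries recode_string() first and falls back to iconv() when recode is missing or returns false or an empty string. The helper name convert_with_fallback() is hypothetical, and the sketch assumes both libraries accept the same charset names, which is not guaranteed in general.

```php
<?php
// Hypothetical helper: try recode_string() first, then fall back to iconv()
// when recode is unavailable or returns false or an empty string.
function convert_with_fallback($from, $to, $string)
{
    if (function_exists('recode_string')) {
        $result = recode_string($from . '..' . $to, $string);
        if ($result !== false && $result !== '') {
            return $result;
        }
    }
    // iconv() returns false on failure.
    return iconv($from, $to, $string);
}

$mystring = "日本語"; // assumes this source file is saved as UTF-8
$jis = convert_with_fallback('utf-8', 'iso-2022-jp', $mystring);
var_dump($jis !== false);
?>
```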
bisqwit at iki dot fi
18 years ago
Here's how to convert romaji to katakana/hiragana with PHP (transliterating Japanese text).
The function Romaji2Kana($s) returns an array with the keys 'hira' and 'kata', which contain the hiragana and katakana versions of the given string, respectively, in UTF-8 encoding.

<?php
// eucjp: 2421; unicode: 3041
define('HIRATABLE', 'a A i I u U e E o O KAGAKIGIKUGUKEGEKOGOSAZASIZISUZUSEZESOZO'.
'TADATIDItuTUDUTEDETODONANINUNENOHABAPAHIBIPIHUBUPUHEBEPEHOBOPO'.
'MAMIMUMEMOyaYAyuYUyoYORARIRUREROwaWAWIWEWOn ');
// eucjp: 2521; unicode: 30A1
define('KATATABLE', 'a A i I u U e E o O KAGAKIGIKUGUKEGEKOGOSAZASIZISUZUSEZESOZO'.
'TADATIDItuTUDUTEDETODONANINUNENOHABAPAHIBIPIHUBUPUHEBEPEHOBOPO'.
'MAMIMUMEMOyaYAyuYUyoYORARIRUREROwaWAWIWEWOn VUkake');

function HiraTrans($s)
{
    #print "trans('$s')\n";
    $pos = strpos(HIRATABLE, $s);
    if ($pos === false) return 0xA1BC; // ^
    return 0xA4A1 + $pos/2;
}

function KataTrans($s)
{
    $pos = strpos(KATATABLE, $s);
    if ($pos === false) return 0xA1BC; // ^
    return 0xA5A1 + $pos/2;
}

function Romaji2Kana($s)
{
    $s = strtoupper(str_replace(
        Array(
            'shi', 'sh', 'fu', 'chi', 'ch', 'tsu', 'dz', 'l', '-',
            'â', 'î', 'û', 'ê', 'ô', 'ā', 'ī', 'ū', 'ē', 'ō'),
        Array(
            'si', 'sy', 'hu', 'ti', 'ty', 'tu', 'j', 'r', '^',
            'a^', 'i^', 'u^', 'e^', 'o^', 'a^', 'i^', 'u^', 'e^', 'o^'),
        $s));
    // FO -> FUxo
    $s = preg_replace('@F([AIOE])@e', '"HU".strtolower("\1")', $s);
    // VO -> VUxo
    $s = preg_replace('@V([AIUEO])@e', '"VU".strtolower("\1")', $s);
    // KYA -> KIya
    $s = preg_replace('@([KSTNHMRGZBPD])Y([AUO])@e', '"\1Iy".strtolower("\2")', $s);
    // XTU -> tu (make them actually small)
    $s = preg_replace('@X(TU|Y[AUO]|[AIUEO]|KA|KE)@e', 'strtolower("\1")', $s);
    // KKO -> tuKO
    $s = preg_replace('@([KSTHMRYWGZBPDV]{2,})@e',
        'str_pad("",2*strlen("\1")-2,"tu").substr("\1",0,1)', $s);
    // N -> n (but not NO -> nO)
    // At this point, N' will work correctly
    $s = preg_replace('@N(?![AIUEO])@', 'n', $s);
    // Unrecognized characters off
    $s = eregi_replace('[^^VAIUEOKSTNHMYRWGZBPD]', '', $s);

    $pat = '@([AIUEOnaiueo^]|..)@e';
    $rec = 'EUCJP..UTF8';

    return Array(
        'hira' => recode_string($rec,preg_replace($pat, 'pack("n", HiraTrans("\1"))', $s)),
        'kata' => recode_string($rec,preg_replace($pat, 'pack("n", KataTrans("\1"))', $s)));
}

print_r( Romaji2Kana('konnichiha') );
?>

Note: Due to technical limitations in the manual pages, there are two errors in this code:
- Some characters in the first str_replace may appear wrong on some php.net mirrors. It is supposed to contain aiueo with circumflex and aiueo with macron.
- The strings in the defines should be single constant strings, not concatenation expressions. (They were split because of a line length limitation.)

-Joel Yliluoma
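
The code in this note relies on the /e pattern modifier (deprecated in PHP 5.5.0, removed in PHP 7.0.0) and on eregi_replace() (deprecated in PHP 5.3.0, removed in PHP 7.0.0). As a hedged sketch of how two of its steps might be rewritten, the example below uses preg_replace_callback() for the "KYA" rule and a case-insensitive preg_replace() for the filtering step. The sample input is made up for illustration, and the remaining rules would need the same treatment.

```php
<?php
$s = 'KYOUTO!'; // illustrative input, already upper-cased

// KYA -> KIya, written with a callback instead of the /e modifier.
$s = preg_replace_callback(
    '@([KSTNHMRGZBPD])Y([AUO])@',
    function ($m) {
        return $m[1] . 'Iy' . strtolower($m[2]);
    },
    $s
);

// Case-insensitive PCRE replacement standing in for the eregi_replace() call.
$s = preg_replace('/[^^VAIUEOKSTNHMYRWGZBPD]/i', '', $s);

echo $s, "\n"; // KIyoUTO
?>
```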
msimonc at yahoo dot com
15 years ago
Seems to require that librecode be installed.
Try iconv() instead.
mori at homoeopathy dot co dot jp
9 years ago
The Romaji2Kana function works pretty well, with a few exceptions: "JA" and "DZA" were not converted the way Japanese speakers would expect. The following diff corrects that behavior.

public function convert($s, $mode='hiragana')
{
$s = strtoupper(str_replace(
- Array('shi', 'sh', 'fu', 'chi', 'ch', 'tsu', 'dz', 'l', '-',
+ Array('shi', 'sh', 'fu', 'chi', 'ch', 'tsu', 'dzi', 'ji', 'j', 'l', '-',
'â', 'î', 'û', 'ê', 'ô', 'ā', 'ī', 'ū', 'ē', 'ō'),
- Array('si', 'sy', 'hu', 'ti', 'ty', 'tu', 'j', 'r', '^',
+ Array('si', 'sy', 'hu', 'ti', 'ty', 'tu', 'zi', 'zi', 'dz','r', '^',
'a^', 'i^', 'u^', 'e^', 'o^', 'a^', 'i^', 'u^', 'e^', 'o^'),
$s));
+ // DZA -> ZIya
+ $s = preg_replace('/DZ([AUO])/e', '"ZI".strtolower("y\1")', $s);
+ // DZE -> ZIe
+ $s = preg_replace('/DZ([E])/e', '"ZI".strtolower("\1")', $s);
// FO -> FUxo
$s = preg_replace('@F([AIOE])@e', '"HU".strtolower("\1")', $s);
// VO -> VUxo