
The Spoofchecker class

(PHP 5 >= 5.4.0, PHP 7, PECL intl >= 2.0.0)

Introduction

This class is provided because Unicode contains a large number of characters and covers the world's many writing systems. Used incorrectly, these characters can expose programs or systems to security attacks that exploit visually similar characters.

The methods provided check whether an individual string is likely to be an attempt to mislead the reader (spoof detection), as in "pаypаl" spelled with a Cyrillic 'а'.
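As a rough illustration of the spoof detection described above, the sketch below builds the look-alike string from the example and passes it to isSuspicious() and areConfusable(). It assumes the intl extension is loaded and PHP 7 or later for the "\u{...}" escape. The variable names are illustrative, and results can vary between ICU versions, so no output is shown.

```php
<?php
// Assumes the intl extension is loaded; "\u{...}" escapes need PHP 7+.
$latin = 'paypal';
// U+0430 CYRILLIC SMALL LETTER A replaces both Latin 'a' characters.
$mixed = "p\u{0430}yp\u{0430}l";

$checker = new Spoofchecker();

// The two strings look alike on screen but are different byte sequences.
var_dump($latin === $mixed);

// Ask whether the mixed-script string looks like a spoofing attempt.
var_dump($checker->isSuspicious($mixed));

// Ask whether the two strings could be mistaken for each other.
var_dump($checker->areConfusable($latin, $mixed));
```

Comparing user input against a known-good value with areConfusable() is one possible use; isSuspicious() needs no reference string.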

Class synopsis

Spoofchecker {
/* Constants */
const integer SINGLE_SCRIPT_CONFUSABLE = 1 ;
const integer MIXED_SCRIPT_CONFUSABLE = 2 ;
const integer WHOLE_SCRIPT_CONFUSABLE = 4 ;
const integer ANY_CASE = 8 ;
const integer SINGLE_SCRIPT = 16 ;
const integer INVISIBLE = 32 ;
const integer CHAR_LIMIT = 64 ;
/* Methods */
public bool areConfusable ( string $s1 , string $s2 [, string &$error ] )
public __construct ( void )
public bool isSuspicious ( string $text [, string &$error ] )
public void setAllowedLocales ( string $locale_list )
public void setChecks ( long $checks )
}

Predefined Constants

Spoofchecker::SINGLE_SCRIPT_CONFUSABLE

Spoofchecker::MIXED_SCRIPT_CONFUSABLE

Spoofchecker::WHOLE_SCRIPT_CONFUSABLE

Spoofchecker::ANY_CASE

Spoofchecker::SINGLE_SCRIPT

Spoofchecker::INVISIBLE

Spoofchecker::CHAR_LIMIT
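Because the constant values in the synopsis are distinct powers of two, they can be combined with bitwise OR and passed to setChecks(). The sketch below is only an illustration: the choice of checks and the sample string are assumptions, and the intl extension and PHP 7 or later are assumed.

```php
<?php
// Assumes the intl extension is loaded; "\u{...}" escapes need PHP 7+.

// Each constant is a distinct bit, so several checks combine with bitwise OR.
$checks = Spoofchecker::INVISIBLE | Spoofchecker::CHAR_LIMIT; // 32 | 64

$checker = new Spoofchecker();
$checker->setChecks($checks);

// Illustrative input: 'a' followed by the same combining acute accent twice.
$doubledMark = "a\u{0301}\u{0301}";
$result = $checker->isSuspicious($doubledMark);

printf("mask: %d\n", $checks);
var_dump($result);
```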

Table of Contents


User Contributed Notes 2 notes

Anonymous
1 year ago
From http://icu-project.org/apiref/icu4j/com/ibm/icu/text/SpoofChecker.html :
SINGLE_SCRIPT_CONFUSABLE: indicates that the two strings are visually confusable and that they are from the same script
MIXED_SCRIPT_CONFUSABLE: indicates that the two strings are visually confusable and that they are NOT from the same script
WHOLE_SCRIPT_CONFUSABLE: indicates that the two strings are visually confusable and that they are NOT from the same script BUT both of them are single-script strings
ANY_CASE: Deprecated.
SINGLE_SCRIPT: Deprecated.
INVISIBLE: Check an identifier for the presence of invisible characters, such as zero-width spaces, or character sequences that are likely not to display, such as multiple occurrences of the same non-spacing mark.
CHAR_LIMIT: Check that an identifier contains only characters from a specified set of acceptable characters.

Explanation of whole script, mixed script and single script confusables in UTS 39 section 4 : http://unicode.org/reports/tr39/#Confusable_Detection

Details from Java SpoofChecker class at http://icu-project.org/apiref/icu4j/com/ibm/icu/text/SpoofChecker.html
Anonymous
1 month ago
Spoofchecker yields false positives by default when the Whole-Script Confusables (WSC) and Mixed-Script Confusables (MSC) checks are used.
They have been deprecated since ICU 58:
http://bugs.icu-project.org/trac/ticket/12549#comment:10

Workarounds: upgrade ICU to 58+, or avoid the MSC and WSC checks by using Spoofchecker's setChecks() method.
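A minimal sketch of the second workaround, assuming the intl extension is loaded. Which checks to keep is a hypothetical choice made here for illustration; the point is only that MIXED_SCRIPT_CONFUSABLE and WHOLE_SCRIPT_CONFUSABLE are left out of the mask.

```php
<?php
// Assumes the intl extension is loaded.
$checker = new Spoofchecker();

// Hypothetical selection: enable some checks while leaving out
// MIXED_SCRIPT_CONFUSABLE and WHOLE_SCRIPT_CONFUSABLE.
$checker->setChecks(
    Spoofchecker::SINGLE_SCRIPT_CONFUSABLE | Spoofchecker::INVISIBLE
);

// Plain ASCII input used purely as a sample.
$flagged = $checker->isSuspicious('google');
var_dump($flagged);
```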