Question

If I use preg_replace('/[^a-zA-Z0-9\s-_]/', '', $val) in a multilingual application, will it handle things like accented characters or Russian characters? If not, how can I filter user input so that only the characters above are allowed, but in a locale-aware way?

Thanks!

codecowboy.

Answer 1

The only useful information I could find comes from this page of the manual, which states:

A "word" character is any letter or digit or the underscore character, that is, any character which can be part of a Perl "word". The definition of letters and digits is controlled by PCRE's character tables, and may vary if locale-specific matching is taking place. For example, in the "fr" (French) locale, some character codes greater than 128 are used for accented letters, and these are matched by \w.

Still, I wouldn't bet on it working the way you want...

But, to be sure:

maybe using unicode matching would be better
You'll probably have to try it to be sure...

About Unicode, the manual says this:

Matching characters by Unicode property is not fast, because PCRE has to search a structure that contains data for over fifteen thousand characters. That is why the traditional escape sequences such as \d and \w do not use Unicode properties in PCRE.

So Unicode matching might be the safer solution, even if it is slower... I should add that I'm curious about this myself ^^
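One way to "try it to be sure" is a quick experiment like the sketch below. It assumes the source file and the input are UTF-8 encoded and that the default "C" locale is in effect; the sample string and variable names are only illustrative. The hyphen is escaped in both classes so that it is unambiguously a literal character.

```php
<?php
// Sample input: an accented Latin letter, Cyrillic text, and punctuation to strip.
$val = 'Café Привет!';

// ASCII-only class: every byte of a non-ASCII character is removed.
$ascii = preg_replace('/[^a-zA-Z0-9\s\-_]/', '', $val);

// Unicode properties plus the u modifier: letters and digits of any script survive.
$unicode = preg_replace('/[^\p{L}\p{N}\s\-_]/u', '', $val);

var_dump($ascii);   // string "Caf " (the accented and Cyrillic letters are gone)
var_dump($unicode); // string "Café Привет"
```

The difference comes from the character class and the u modifier, not from the locale: the first pattern only ever lists ASCII ranges.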

Answer 2

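No, it will only match the ASCII characters A-Z. To match any letter or number in any language, you need to use the unicode properties of the regex engine, together with the u modifier so that the pattern and the subject are treated as UTF-8:

preg_replace('/[^\p{L}\p{N}]/u', '', $string);

Note that this class also strips whitespace, hyphens and underscores; add \s, \- and _ to it if you want to keep them, as in your original pattern.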
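As a rough sketch of how the Unicode-property approach could be wrapped for filtering user input, the helper below keeps letters and numbers of any script plus the whitespace, hyphen and underscore from the question. The function name filter_unicode_text is hypothetical, the input is assumed to be UTF-8, and the u modifier is used so that PHP treats pattern and subject as UTF-8. Returning an empty string for invalid UTF-8 is just one possible policy.

```php
<?php
/**
 * Hypothetical helper: strips everything except letters, numbers,
 * whitespace, underscore and hyphen. Assumes UTF-8 input.
 */
function filter_unicode_text(string $input): string
{
    // \p{L} = any letter, \p{N} = any number, \s = whitespace,
    // followed by a literal underscore and an escaped literal hyphen.
    $result = preg_replace('/[^\p{L}\p{N}\s_\-]/u', '', $input);

    // With the u modifier, preg_replace() returns null when the subject
    // is not valid UTF-8; fall back to an empty string in that case.
    return $result ?? '';
}

echo filter_unicode_text('naïve_user-42 <Москва>'), "\n"; // naïve_user-42 Москва
```

Rejecting the input outright instead of returning an empty string may be preferable when invalid UTF-8 should be reported to the user.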

How is PHP's preg_replace locale aware?
