Question

I have a list of words in which some are composed words, for example:

Palanca
Platon
PlatonPalanca

I need to remove "Platon" and "Palanca" and leave only "PlatonPalanca". I used array_unique to remove duplicates, but those composed words are tricky...

Should I sort the list by word length and compare one by one? Is a regular expression the answer?

Update: the list of words is much bigger and mixed, not only related words.

Update 2: I can safely implode the array into a string.

Update 3: I'm trying to avoid doing this as if it were a bubble sort. There must be a more efficient way of doing it.

Well, I think that a bubble-sort-like approach is the only possible one :-( I don't like it, but it's what I have... Any better approach?

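// Despite its name, this comparator sorts shortest first, so each word
// only needs to be checked against the longer words that follow it.
function sortByLengthDesc($a,$b){
    return strlen($a)-strlen($b);
}

usort($words,'sortByLengthDesc');
$count = count($words);
$delete = array();
for($i=0;$i<$count;$i++) {
    for($j=$i+1;$j<$count;$j++) {
        // strpos() avoids building the substring that strstr() returns
        if(strpos($words[$j], $words[$i]) !== false){
            $delete[]=$i;
            break; // one containing word is enough
        }
    }
}
foreach($delete as $i) {
    unset($words[$i]);
}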

Update 5: Sorry all. I'm a moron. Jonathan Swift made me realize I was asking the wrong question. Given x words which START the same, I need to remove the shortest ones.

"hot, dog, stand, hotdogstand" should become "dog, stand, hotdogstand"
"car, pet, carpet" should become "pet, carpet"
"Palanca, Platon, PlatonPalanca" should become "Palanca, PlatonPalanca"
"Platonother, other" should be untouched; they both start differently

Answer 1

I think you need to define the problem a little more so that we can give you a solid answer. Here are some pathological lists. Which items should get removed?

hot, dog, hotdogstand
hot, dog, stand, hotdogstand
hot, dogs, stand, hotdogstand

SOME CODE

This code should be more efficient than the one you have:

$words = array('hatstand','hat','stand','hot','dog','cat','hotdogstand','catbasket');

$count = count($words);

for ($i=0; $i<$count; $i++) {
	if (isset($words[$i])) {
		$len_i = strlen($words[$i]);
		for ($j=$i+1; $j<$count; $j++) {
			if (isset($words[$j])) {
				$len_j = strlen($words[$j]);

				if ($len_i<=$len_j) {
					if (substr($words[$j],0,$len_i)==$words[$i]) {
						unset($words[$i]);
						break; // $words[$i] is gone, so stop comparing against it
					}
				} else {
					if (substr($words[$i],0,$len_j)==$words[$j]) {
						unset($words[$j]);
					}
				}
			}
		}
	}
}

foreach ($words as $word) {
	echo "$word<br>";
}

You could optimize this by storing the word lengths in an array before the loops.
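A rough sketch of that optimization, assuming exact duplicates can be dropped up front and that comparison is case-sensitive. The function name removePrefixWords is made up for illustration; the lengths are computed once and the removals are collected instead of unsetting during the loops:

```php
<?php
// Hypothetical helper: removes every word that is a strict prefix of another word.
function removePrefixWords(array $words): array
{
    $words = array_values(array_unique($words)); // drop exact duplicates, reindex
    $lengths = array_map('strlen', $words);      // computed once, before the loops
    $count = count($words);
    $remove = array();

    for ($i = 0; $i < $count; $i++) {
        for ($j = 0; $j < $count; $j++) {
            // only a longer word can start with $words[$i]
            if ($lengths[$i] < $lengths[$j]
                && strncmp($words[$j], $words[$i], $lengths[$i]) === 0) {
                $remove[$i] = true;
                break;
            }
        }
    }

    // keep everything whose index was not marked for removal
    return array_values(array_diff_key($words, $remove));
}

$words = array('hatstand','hat','stand','hot','dog','cat','hotdogstand','catbasket');
print_r(removePrefixWords($words));
```

With the sample list this leaves hatstand, stand, dog, hotdogstand and catbasket. It is still quadratic in the number of words; it only saves the repeated strlen() calls.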

Answer 2

For each word, check whether any other word in the array starts with it or ends with it. If so, that word should be removed (unset()).
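A minimal sketch of that rule, assuming case-sensitive comparison and that exact duplicates count as one word. The name removeWordsAtEdges is hypothetical. Note that this follows the original wording of the question, not update 5, since it also removes words found at the end of a longer word:

```php
<?php
// Hypothetical helper: drops a word when a longer word starts or ends with it.
function removeWordsAtEdges(array $words): array
{
    $words = array_values(array_unique($words));
    $result = array();

    foreach ($words as $i => $word) {
        $len = strlen($word);
        $keep = true;
        foreach ($words as $j => $other) {
            if ($i === $j || strlen($other) <= $len) {
                continue; // skip the word itself and anything that is not longer
            }
            $startsWith = strncmp($other, $word, $len) === 0;
            $endsWith = substr($other, -$len) === $word;
            if ($startsWith || $endsWith) {
                $keep = false;
                break;
            }
        }
        if ($keep) {
            $result[] = $word;
        }
    }

    return $result;
}

print_r(removeWordsAtEdges(array('Palanca', 'Platon', 'PlatonPalanca')));
```

For the list in the question only PlatonPalanca survives.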

Answer 3

You could put the words into an array, sort that array alphabetically, and then loop through it, checking whether the next words start with the word at the current index and are therefore composed words. If they do, you can remove the word at the current index and the latter parts of the next words...

Something like this:

$array = array('palanca', 'plato', 'platopalanca');
// ok, the example array is already sorted alphabetically, but anyway...
sort($array);

// another array for words to be removed
$removearray = array();

// loop through the array, the last index won't have to be checked
for ($i = 0; $i < count($array) - 1; $i++) {

  $current = $array[$i];

  // use another loop in case there is more than one combined word
  // if the words are case sensitive, use strpos() instead to compare
  while ($i + 1 < count($array) && stripos($array[$i + 1], $current) === 0) {
    $next = $array[$i + 1];
    // the next word starts with the current one, so remove current
    $removearray[] = $current;
    // get the other word to remove
    $removearray[] = substr($next, strlen($current));
    $i++;
  }

}

// now just get rid of the words to be removed,
// for example by taking the difference of the two arrays
$result = array_values(array_diff($array, $removearray));
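For the narrower rule in update 5 of the question (drop a word only when another word starts with it), sorting means each word only has to be checked against its immediate successor, because in byte-wise order all words sharing a prefix sit directly after that prefix. A minimal sketch under those assumptions, with case-sensitive comparison; removeShorterPrefixes is a hypothetical name, not part of the code above:

```php
<?php
// Hypothetical helper: after sorting, a word is a prefix of some other word
// exactly when it is a prefix of the word that directly follows it.
function removeShorterPrefixes(array $words): array
{
    $words = array_values(array_unique($words));
    sort($words, SORT_STRING); // byte-wise order, no numeric comparison
    $count = count($words);
    $result = array();

    for ($i = 0; $i < $count; $i++) {
        $isPrefixOfNext = $i + 1 < $count
            && strncmp($words[$i + 1], $words[$i], strlen($words[$i])) === 0;
        if (!$isPrefixOfNext) {
            $result[] = $words[$i];
        }
    }

    return $result; // note: returned in sorted order, not input order
}

print_r(removeShorterPrefixes(array('hot', 'dog', 'stand', 'hotdogstand')));
```

This costs one sort plus a single pass, and on the four examples from update 5 it keeps the words the question asks for, although in sorted order.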

Answer 4

A regex could work. You can define within the regex where the string starts and ends.

^ defines the start, $ defines the end

so something like

foreach($array as $value)
{
    //$term is the value that you want to remove;
    //preg_quote() keeps any regex metacharacters in it literal
    if(preg_match('/^' . preg_quote($term, '/') . '$/', $value))
    {
        //Here you can be confident that $term is $value, and then either remove it from
        //$array, or you can add all not-matched values to a new result array
    }
}

should avoid your problem.

But if you are only checking that two values are equal, == will work just as well as preg_match (and will probably be faster).
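To illustrate that point, here is a small sketch comparing the two checks on the list from the question. The variable names $byRegex and $byEquality are made up for the example, and it uses the stricter === operator and the D modifier as assumptions of mine, so that $ matches only at the very end of the string:

```php
<?php
$term = 'Platon';
$values = array('Palanca', 'Platon', 'PlatonPalanca');

$byRegex = array();
$byEquality = array();

foreach ($values as $value) {
    // anchored pattern: matches only when $value is exactly $term
    if (preg_match('/^' . preg_quote($term, '/') . '$/D', $value) === 1) {
        $byRegex[] = $value;
    }
    // plain comparison: same outcome without compiling a pattern
    if ($value === $term) {
        $byEquality[] = $value;
    }
}

var_dump($byRegex === $byEquality);
```

Both arrays end up holding only "Platon", which also shows why a fully anchored pattern cannot detect composed words by itself.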

If the lists of $terms and $values are huge, this won't turn out to be the most efficient of strategies, but it is a simple solution.

If performance is an issue, sorting the lists (note the many sort functions provided) and then iterating down the lists side by side might be more useful. I'm going to actually test that idea before I post the code here.

Remove composed words