@2803media
Created April 3, 2014 15:59
Remove duplicates from a text: split it on '.' first, then on ',', and rebuild the text without the duplicates.
// Text to de-duplicate (fill in your own input).
$content = "";
// Turn any run of <br> tags into a single space.
$content = preg_replace('/(?:<br\s*\/?>\s*)+/i', ' ', $content);
// Decode the few HTML entities this text is expected to contain.
$content = str_replace(
    array('&nbsp;', '&#39;', '&agrave;', '&eacute;'),
    array(' ', '\'', 'à', 'é'),
    $content
);
// Collapse all whitespace, including line breaks, into single spaces.
$content = trim(preg_replace('/\s+/', ' ', $content));
// First pass: split into sentences on '.' and drop repeated sentences.
$sentences = array_unique(array_map('trim', explode('.', $content)));
// Second pass: split each sentence on ',' and collect the fragments.
$fragments = array();
foreach ($sentences as $sentence) {
    foreach (explode(',', $sentence) as $fragment) {
        $fragment = trim($fragment);
        // Skip empty fragments so the output has no stray '. '.
        if ($fragment !== '') {
            $fragments[] = $fragment;
        }
    }
}
// Drop repeated fragments, keeping the first occurrence of each.
$fragments = array_values(array_unique($fragments));
// Rebuild the text; every fragment is followed by '. '.
foreach ($fragments as $fragment) {
    echo $fragment . '. ';
}