Last active
September 12, 2024 16:18
Removing accents and special characters in Java: StringUtils.java and StringUtilsTest.java
package br.com.triadworks.rponte.util;

import java.text.Normalizer;

public class StringUtils {

    /**
     * Removes all accents from the string, replacing accented characters
     * with their plain, unaccented equivalents.
     */
    public static String unaccent(String src) {
        // NFD splits each accented letter into a base letter plus combining marks;
        // the regex then drops every character that is not ASCII.
        return Normalizer
                .normalize(src, Normalizer.Form.NFD)
                .replaceAll("[^\\p{ASCII}]", "");
    }
}
package br.com.triadworks.rponte.util;

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class StringUtilsTest {

    private static final String accents = "È,É,Ê,Ë,Û,Ù,Ï,Î,À,Â,Ô,è,é,ê,ë,û,ù,ï,î,à,â,ô,Ç,ç,Ã,ã,Õ,õ";
    private static final String expected = "E,E,E,E,U,U,I,I,A,A,O,e,e,e,e,u,u,i,i,a,a,o,C,c,A,a,O,o";

    private static final String accents2 = "çÇáéíóúýÁÉÍÓÚÝàèìòùÀÈÌÒÙãõñäëïöüÿÄËÏÖÜÃÕÑâêîôûÂÊÎÔÛ";
    private static final String expected2 = "cCaeiouyAEIOUYaeiouAEIOUaonaeiouyAEIOUAONaeiouAEIOU";

    private static final String accents3 = "Gisele Bündchen da Conceição e Silva foi batizada assim em homenagem à sua conterrânea de Horizontina, RS.";
    private static final String expected3 = "Gisele Bundchen da Conceicao e Silva foi batizada assim em homenagem a sua conterranea de Horizontina, RS.";

    private static final String accents4 = "/Users/rponte/arquivos-portalfcm/Eletron/Atualização_Diária-1.23.40.exe";
    private static final String expected4 = "/Users/rponte/arquivos-portalfcm/Eletron/Atualizacao_Diaria-1.23.40.exe";

    @Test
    public void replacingAllAccents() {
        assertEquals(expected, StringUtils.unaccent(accents));
        assertEquals(expected2, StringUtils.unaccent(accents2));
        assertEquals(expected3, StringUtils.unaccent(accents3));
        assertEquals(expected4, StringUtils.unaccent(accents4));
    }
}
Working properly.
Congrats!
Perfect!
Works beautifully!
Very good. Thanks.
It works, thank you!
Very good. Thank you.
This helped me a lot! Much appreciated.
Locally this works perfectly for me, but when I deploy to WebSphere 8.5 it insists on converting Ç to A.
E.g.: CONSOLAÇÂO becomes CONSOLAAO
Could it be WebSphere's encoding?
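One possible cause, offered as an assumption rather than a diagnosis of the WebSphere setup: if the text (or the source file) is written as UTF-8 but decoded as ISO-8859-1 somewhere along the way, Ç arrives as two different characters, and unaccent then yields A instead of C. The sketch below simulates that mismatch; the class name EncodingMismatchDemo and the helper misdecode are hypothetical, and the literal is written as a Unicode escape so the example does not depend on the source file's encoding.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

public class EncodingMismatchDemo {

    // Same logic as StringUtils.unaccent in the gist.
    public static String unaccent(String src) {
        return Normalizer
                .normalize(src, Normalizer.Form.NFD)
                .replaceAll("[^\\p{ASCII}]", "");
    }

    // Hypothetical helper: encodes the text as UTF-8, then decodes the
    // bytes as ISO-8859-1, which is what a charset mismatch would do.
    public static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String cedilla = "\u00C7"; // Ç

        // Correctly decoded input: Ç decomposes to C + combining cedilla.
        System.out.println(unaccent(cedilla)); // C

        // Mis-decoded input: bytes C3 87 become U+00C3 (Ã) and U+0087,
        // so the base letter that survives is A.
        System.out.println(unaccent(misdecode(cedilla))); // A
    }
}
```

If this is what is happening, the fix belongs in the encoding configuration (compiler source encoding, request or file charset) rather than in unaccent itself.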
Unfortunately, this removes "ß" from the string
+1
Alexandre Aquiles's explanation of how the code above works: https://twitter.com/alex_aquiles/status/1494397659431542784?s=21
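Independently of the linked thread, a small sketch can make the mechanism visible: NFD normalization splits a precomposed letter such as ç into a base letter followed by a combining mark, and only the mark falls outside ASCII. The class NfdDemo and its helpers are hypothetical names used for illustration.

```java
import java.text.Normalizer;

public class NfdDemo {

    // Canonical decomposition, the first step of unaccent in the gist.
    public static String decompose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    // Renders each code point as U+XXXX so combining marks become visible.
    public static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        String precomposed = "\u00E7"; // ç as a single code point

        System.out.println(codePoints(precomposed));            // U+00E7
        System.out.println(codePoints(decompose(precomposed))); // U+0063 U+0327
    }
}
```

U+0063 is the plain letter c and U+0327 is the combining cedilla; the regex in unaccent removes the latter because it is not ASCII.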
Đ character is not working
Thank you very much.
Đ not working
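The reports about ß and Đ follow from how unaccent is written: neither letter has a canonical decomposition, so NFD leaves it untouched and the non-ASCII filter then deletes it. A possible variant, sketched here under assumptions and not part of the original gist, strips only combining marks and maps such letters by hand. The class name, the method name, and the choice of replacements (ß to ss, Đ to D, đ to d) are all hypothetical; note also that, unlike the original, this version keeps any other non-ASCII characters in the output.

```java
import java.text.Normalizer;

public class UnaccentSketch {

    public static String unaccentKeepingLetters(String src) {
        // Split accented letters into base letter + combining marks.
        String decomposed = Normalizer.normalize(src, Normalizer.Form.NFD);

        // Remove only the combining marks (Unicode category M),
        // instead of every non-ASCII character.
        String stripped = decomposed.replaceAll("\\p{M}+", "");

        // Assumed, hand-picked replacements for letters that NFD
        // does not decompose; extend or change as needed.
        return stripped
                .replace("\u00DF", "ss") // ß
                .replace("\u0110", "D")  // Đ
                .replace("\u0111", "d"); // đ
    }

    public static void main(String[] args) {
        System.out.println(unaccentKeepingLetters("Stra\u00DFe"));            // Strasse
        System.out.println(unaccentKeepingLetters("\u0110\u00E0"));           // Da
        System.out.println(unaccentKeepingLetters("Concei\u00E7\u00E3o"));    // Conceicao
    }
}
```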