How to remove accents in strings [1]
attempt 1 (remove accents) -- with problems:
Normalizer.normalize(str, Form.NFD).replaceAll("\\p{InCombiningDiacriticalMarks}+", "")
ç gives c
' gives '
ê gives e
Ö gives O
ü gives u.
´ gives ´
but Ø gives Ø
Accents are removed! But non-ASCII characters that do not decompose into a base letter plus a combining mark, such as Ø and ´, are left unchanged.
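To see why Ø survives, it can help to look at what NFD actually produces. The sketch below is only an illustration: the class name NfdInspector and the method nfdCodePoints are made up for this example. It prints the code points of the decomposed string, so you can check whether a combining mark (the U+0300..U+036F block) shows up for the regex to strip.

```java
import java.text.Normalizer;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the original snippet.
final class NfdInspector {

    // Returns the code points of the NFD form, e.g. "U+004F U+0308".
    static String nfdCodePoints(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                .codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // \u00D6 is Ö: splits into O plus a combining diaeresis.
        System.out.println(nfdCodePoints("\u00D6")); // U+004F U+0308
        // \u00D8 is Ø: has no canonical decomposition, stays as is.
        System.out.println(nfdCodePoints("\u00D8")); // U+00D8
        // \u00B4 is the spacing acute accent: also left untouched by NFD.
        System.out.println(nfdCodePoints("\u00B4")); // U+00B4
    }
}
```

Ö turns into two code points, so removing the combining mark leaves a plain O. Ø and ´ come back as a single code point, so there is nothing for the regex to remove.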
attempt 2 (remove non-ASCII chars) -- successful!
Normalizer.normalize(str, Form.NFD).replaceAll("[^\\p{ASCII}]", "")
ç gives c
' gives '
ê gives e
Ö gives O
ü gives u.
´ gives nothing
Ø gives nothing
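For reference, here is a minimal, self-contained sketch that wraps both attempts so they can be run side by side. The class name AccentStripper and the method names stripAccents and toAscii are assumptions chosen for this example; only the two Normalizer expressions come from the snippets above. The sample string uses Unicode escapes so the result does not depend on the source file encoding.

```java
import java.text.Normalizer;
import java.text.Normalizer.Form;

// Hypothetical wrapper class; names are illustrative only.
final class AccentStripper {

    // Attempt 1: decompose, then drop combining diacritical marks.
    static String stripAccents(String s) {
        return Normalizer.normalize(s, Form.NFD)
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    // Attempt 2: decompose, then drop every character outside ASCII.
    static String toAscii(String s) {
        return Normalizer.normalize(s, Form.NFD)
                .replaceAll("[^\\p{ASCII}]", "");
    }

    public static void main(String[] args) {
        // The characters ç ê Ö ü Ø, written as escapes.
        String sample = "\u00E7\u00EA\u00D6\u00FC\u00D8";
        System.out.println(stripAccents(sample)); // Ø is kept
        System.out.println(toAscii(sample));      // Ø is dropped
    }
}
```

Note that attempt 2 deletes characters such as Ø instead of mapping them to a lookalike like O, so two different inputs can end up as the same output. If that matters (e.g. for filenames), those characters would need an explicit replacement before the regex runs.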
[1] This may be useful, e.g., if you need to generate filenames for a non-Unicode filesystem.