Community_Admin
Community Team Member
Perhaps a bit tangential to text mining, but has anyone found an efficient way to do string similarity analysis in Redshift? Ideally something like levenshtein() or levenshtein_less_equal() from PostgreSQL's fuzzystrmatch module, which return the edit distance between two strings.
 
There must be a better way than 
where left(frequently_misspelled_text_field, [number]) || '%' ilike left(good_text_field,[number]) || '%'
which is inefficient and not robust. 
 
This came up when I was trending patient data by state and city and had to deal with human-entered messes such as 'calafornia', 'Yexas' (which wouldn't be caught unless I also did the above using right()), 'NewYork', 'Massachusets', etc.
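
For reference, here is a minimal sketch of the edit-distance calculation that a function like levenshtein() performs, written in plain Python and applied to the misspellings above. The names `levenshtein`, `closest_match`, and `STATES` are made up for illustration. Redshift has offered scalar Python UDFs, so logic like this could in principle be wrapped in one, but that is an assumption to check against the current Redshift documentation, not a tested solution.

```python
def levenshtein(a, b):
    """Return the Levenshtein (edit) distance between strings a and b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def closest_match(value, candidates):
    """Return the candidate with the smallest case-insensitive edit distance."""
    cleaned = value.strip().lower()
    return min(candidates, key=lambda c: levenshtein(cleaned, c.lower()))


# Hypothetical reference list; a real one would hold every valid state name.
STATES = ["California", "Texas", "New York", "Massachusetts"]

for raw in ["calafornia", "Yexas", "NewYork", "Massachusets"]:
    print(raw, "->", closest_match(raw, STATES))
```

Each of the four misspellings is one edit away from its correct spelling, so a distance threshold of 1 or 2 would catch them, including 'Yexas', which the left()-prefix comparison misses.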
Version history
Last update:
‎10-26-2021 03:45 AM