/*
 * Paragraph / String / Sentence Comparison: Matching Algorithms
 *
 * A variety of simple algorithms for measuring the difference between two
 * strings, ranging from the number of similar letters shared by the two
 * strings through to matching on whole words and sub-words only.
 * Each function returns an integer count or, in some cases, an optional
 * percentage, which is easier to compare when working with strings of
 * varying length.
 */

function similar_text(first, second, percent) { // eslint-disable-line camelcase
  //  discuss at: https://locutus.io/php/similar_text/
  // original by: Rafał Kukawski (https://blog.kukawski.pl)
  // bugfixed by: Chris McMacken
  // bugfixed by: Jarkko Rantavuori original by findings in stackoverflow (https://stackoverflow.com/questions/14136349/how-does-similar-text-work)
  // improved by: Markus Padourek (taken from https://www.kevinhq.com/2012/06/php-similartext-function-in-javascript_16.html)
  //   example 1: similar_text('Hello World!', 'Hello locutus!')
  //   returns 1: 8
  //   example 2: similar_text('Hello World!', null)
  //   returns 2: 0

  if (first === null ||
    second === null ||
    typeof first === 'undefined' ||
    typeof second === 'undefined') {
    return 0;
  }

  first += '';
  second += '';

  let pos1 = 0;
  let pos2 = 0;
  let max = 0;
  const firstLength = first.length;
  const secondLength = second.length;
  let p;
  let q;
  let l;
  let sum;

  // Find the longest common substring and remember where it starts in each string.
  for (p = 0; p < firstLength; p++) {
    for (q = 0; q < secondLength; q++) {
      for (l = 0; (p + l < firstLength) && (q + l < secondLength) && (first.charAt(p + l) === second.charAt(q + l)); l++) { // eslint-disable-line max-len
        // @todo: ^-- break up this crazy for loop and put the logic in its body
      }
      if (l > max) {
        max = l;
        pos1 = p;
        pos2 = q;
      }
    }
  }

  sum = max;

  if (sum) {
    // Recurse on the text to the left of the common substring...
    if (pos1 && pos2) {
      sum += similar_text(first.substr(0, pos1), second.substr(0, pos2));
    }

    // ...and on the text to the right of it.
    if ((pos1 + max < firstLength) && (pos2 + max < secondLength)) {
      sum += similar_text(
        first.substr(pos1 + max, firstLength - pos1 - max),
        second.substr(pos2 + max, secondLength - pos2 - max)
      );
    }
  }

  if (!percent) {
    return sum;
  }

  // Percentage form: 100 means the two strings are identical.
  return (sum * 200) / (firstLength + secondLength);
} // end similar_text

// Sample run. Each sample section is block-scoped so s0..s5 can be reused in one file.
{
  let s0 = 'once upon a time';
  let s1 = 'once upon a timee';
  let s2 = 'once era';
  let s3 = 'an era long ago';
  let s4 = 'in a time long forgotten';
  let s5 = 'the weather was cold that year';

  console.log(similar_text(s0, s0, true));
  console.log(similar_text(s0, s1, true));
  console.log(similar_text(s0, s2, true));
  console.log(similar_text(s0, s3, true));
  console.log(similar_text(s0, s4, true));
  console.log(similar_text(s0, s5, true));
}

/*
 * As the name says: Levenshtein.
 * Calculates the Levenshtein distance between two strings.
 */
function levenshteinDistance(str1, str2) {
  // track[j][i] holds the distance between the first i characters of str1
  // and the first j characters of str2.
  const track = Array(str2.length + 1).fill(null).map(() =>
    Array(str1.length + 1).fill(null));
  for (let i = 0; i <= str1.length; i += 1) {
    track[0][i] = i;
  }
  for (let j = 0; j <= str2.length; j += 1) {
    track[j][0] = j;
  }
  for (let j = 1; j <= str2.length; j += 1) {
    for (let i = 1; i <= str1.length; i += 1) {
      const indicator = str1[i - 1] === str2[j - 1] ? 0 : 1;
      track[j][i] = Math.min(
        track[j][i - 1] + 1, // deletion
        track[j - 1][i] + 1, // insertion
        track[j - 1][i - 1] + indicator // substitution
      );
    }
  }
  return track[str2.length][str1.length];
} // end levenshteinDistance

{
  let s0 = 'once upon a time';
  let s1 = 'once upon a timee';
  let s2 = 'once era';
  let s3 = 'an era long ago';
  let s4 = 'in a time long forgotten';
  let s5 = 'the weather was cold that year';

  console.log(levenshteinDistance(s0, s0));
  console.log(levenshteinDistance(s0, s1));
  console.log(levenshteinDistance(s0, s2));
  console.log(levenshteinDistance(s0, s3));
  console.log(levenshteinDistance(s0, s4));
  console.log(levenshteinDistance(s0, s5));
}

/*
 * Counts the occurrences of matching words and sub-words between two strings.
 * Returns the number of matches or, if the percentage flag is set, the count
 * divided by the total number of words in both strings: 0 means no match and
 * 1 means every word matched exactly once (100% the same). Every hit is
 * counted, so the ratio can exceed 1 when words repeat or one word contains
 * another.
 */
function matchingWords(s1, s2, percentage) {
  // Split into words on spaces and line breaks.
  let words1 = s1.split(/[ \n\r]/);
  let words2 = s2.split(/[ \n\r]/);
  // Compare case-insensitively.
  words1 = words1.map(name => name.toLowerCase());
  words2 = words2.map(name => name.toLowerCase());
  // Drop empty entries left behind by consecutive separators.
  words1 = words1.filter(function (el) { return el.trim().length > 0; });
  words2 = words2.filter(function (el) { return el.trim().length > 0; });
  console.log(words1);
  console.log(words2);

  let count = 0;
  // Words of s1 that appear inside words of s2.
  for (let k1 = 0; k1 < words1.length; k1++) {
    for (let k2 = 0; k2 < words2.length; k2++) {
      if (words2[k2].includes(words1[k1])) {
        count++;
      }
    }
  }
  // Words of s2 that appear inside words of s1.
  for (let k2 = 0; k2 < words2.length; k2++) {
    for (let k1 = 0; k1 < words1.length; k1++) {
      if (words1[k1].includes(words2[k2])) {
        count++;
      }
    }
  }
  if (percentage) {
    return count / (words1.length + words2.length);
  }
  return count;
} // end matchingWords

{
  let s0 = 'once upon a time';
  let s1 = 'once upon A timee';
  let s2 = 'once era';
  let s3 = 'an era long ago';
  let s4 = 'in a time long forgotten';
  let s5 = 'the weather was cold that year';

  console.log(matchingWords(s0, s0, true));
  console.log(matchingWords(s0, s1, true));
  console.log(matchingWords(s0, s2, true));
  console.log(matchingWords(s0, s3, true));
  console.log(matchingWords(s0, s4, true));
  console.log(matchingWords(s0, s5, true));
}

/*
 * Takes the previous example further, and also gives the option to identify
 * the culprit matching words in the output.
 */
function matchingWordsPlus(s1, s2, opt = { perc: true, matchwords: [] }) {
  let words1 = s1.split(/[ \n\r]/);
  let words2 = s2.split(/[ \n\r]/);
  words1 = words1.map(name => name.toLowerCase());
  words2 = words2.map(name => name.toLowerCase());
  words1 = words1.filter(function (el) { return el.trim().length > 0; });
  words2 = words2.filter(function (el) { return el.trim().length > 0; });
  console.log(words1);
  console.log(words2);

  let matchList = [];
  let count = 0;
  for (let k1 = 0; k1 < words1.length; k1++) {
    for (let k2 = 0; k2 < words2.length; k2++) {
      if (words2[k2].includes(words1[k1])) {
        count++;
        matchList.push(words1[k1]);
      }
    }
  }
  for (let k2 = 0; k2 < words2.length; k2++) {
    for (let k1 = 0; k1 < words1.length; k1++) {
      if (words1[k1].includes(words2[k2])) {
        count++;
        matchList.push(words2[k2]);
      }
    }
  }
  // Remove duplicates from the list of matched words.
  matchList = [...new Set(matchList)];
  console.log('repeat words and sub words', matchList);
  // Hand the matched words back to the caller through the options object.
  if (opt.matchwords !== undefined) {
    opt.matchwords = matchList;
  }
  if (opt.perc) {
    return count / (words1.length + words2.length);
  }
  return count;
} // end matchingWordsPlus

{
  let s0 = 'once upon a time';
  let s1 = 'once upon A timee';
  let s2 = 'once era';
  let s3 = 'an era long ago';
  let s4 = 'in a time long forgotten';
  let s5 = 'the weather was cold that year';
  let opts = { perc: true, matchwords: [] };

  console.log(matchingWordsPlus(s0, s0, opts));
  console.log(opts.matchwords);
  console.log(matchingWordsPlus(s0, s1));
  console.log(matchingWordsPlus(s0, s2));
  console.log(matchingWordsPlus(s0, s3));
  console.log(matchingWordsPlus(s0, s4));
  console.log(matchingWordsPlus(s0, s5));
}
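
The Levenshtein distance is a raw edit count, so it grows with string length, while similar_text and matchingWords both offer a percentage form. The sketch below shows one common way to turn the distance into a 0 to 1 similarity by dividing by the length of the longer string. Both names, `editDistance` and `levenshteinSimilarity`, are hypothetical and not part of the original code. The block carries its own compact two-row distance helper so that it runs on its own, and treating two empty strings as identical is an assumption.

```javascript
// Hypothetical helper: Levenshtein distance using two rows instead of a full matrix.
function editDistance(a, b) {
  // prev[j] is the distance between the empty prefix of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i += 1) {
    const curr = [i]; // distance from the first i chars of a to the empty string
    for (let j = 1; j <= b.length; j += 1) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost // substitution
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical wrapper: 1 means identical, 0 means every position differs.
function levenshteinSimilarity(a, b) {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) {
    return 1; // assumption: two empty strings count as identical
  }
  return 1 - editDistance(a, b) / longest;
}

console.log(levenshteinSimilarity('once upon a time', 'once upon a time')); // 1
console.log(levenshteinSimilarity('once upon a time', 'once upon a timee')); // 1 - 1/17, about 0.94
```

Because the distance can never exceed the length of the longer string, the result always stays between 0 and 1, which makes scores comparable across pairs of different lengths.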