Calculating similarity between two records with Ruby and Elasticsearch
I recently had to find similar entries in a dataset in order to detect potential duplicate records, such as "John Doe 123456789" and "John Foe 123123123". After considering a couple of options, I decided to go with Elasticsearch, as it was already integrated into the project I was working on. The Ruby client for Elasticsearch provides a useful method on search results, records.each_with_hit, that I could abuse for this situation: file = File....
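To illustrate the idea of pairing each record with its search hit, here is a minimal sketch rather than the code used in the project. It assumes that each hit exposes its relevance score as `_score`, which is how elasticsearch-model presents results. The `Person` and `Hit` structs, the `potential_duplicates` helper, the `min_score` threshold, and the sample scores are all hypothetical stand-ins, so the example runs without an Elasticsearch cluster.

```ruby
# Hypothetical stand-ins for a database record and an Elasticsearch hit.
# In a real app these would come from records.each_with_hit.
Person = Struct.new(:id, :name, :number)
Hit    = Struct.new(:_score)

# Collects the records whose relevance score reaches min_score.
# `pairs` is anything that yields [record, hit], the same shape that
# records.each_with_hit passes to its block.
def potential_duplicates(pairs, min_score:)
  pairs.each_with_object([]) do |(record, hit), found|
    found << { record: record, score: hit._score } if hit._score >= min_score
  end
end

# Made-up sample data; the scores are illustrative, not real search output.
sample_pairs = [
  [Person.new(1, "John Doe", "123456789"), Hit.new(7.2)],
  [Person.new(2, "John Foe", "123123123"), Hit.new(4.1)],
  [Person.new(3, "Jane Roe", "987654321"), Hit.new(0.6)]
]

# Print every candidate whose score is at or above the threshold.
potential_duplicates(sample_pairs, min_score: 4.0).each do |match|
  record = match[:record]
  puts "#{record.name} #{record.number} (score: #{match[:score]})"
end
```

Against a live index, the same filtering could presumably be done inside the block given to `response.records.each_with_hit { |record, hit| ... }`. A sensible `min_score` depends on the query and the data, so it would need tuning against known duplicates.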