# Algorithms – Dynamic Programming – Edit distance (Levenshtein)

I suggest to check out the following post first: http://badearobert.ro/2019/05/03/algorithms-dynamic-programming-longest-common-subsequence-lcs/

Edit distance refer to minimum number of moves to make two strings equal.

The Levenshtein distance is a string metric for measuring difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (i.e. insertions, deletions or substitutions) required to change one word into the other.

Levenshtein distance may also be referred to as edit distance. It is closely related to pairwise string alignments.

Levenshtein distance are applicable in string matching, spell checking, etc.

Examples of edit distance:

**k**itten →**s**itten (substitution of “s” for “k”)- sitt
**t**in → sittin (deletion of “t“) - sittin → sittin
**g**(insertion of “g” at the end)

Let’s see the implementation of the algorithm.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 |
#include <iostream> #include <iomanip> #include <algorithm> #include <string> #include <vector> uint32_t levenshtein(const std::string& X, const std::string& Y) { if (X.size() == 0) return Y.size(); if (Y.size() == 0) return X.size(); std::vector<std::vector<int>> data(X.size() + 1, std::vector<int>(Y.size() + 1)); for (uint32_t i = 0; i <= X.size(); ++i) data[i][0] = i; for (uint32_t j = 0; j <= Y.size(); ++j) data[0][j] = j; for (uint32_t i = 1; i <= X.size(); ++i) { for (uint32_t j = 1; j <= Y.size(); ++j) { if (X[i - 1] == Y[j - 1]) data[i][j] = data[i - 1][j - 1]; else data[i][j] = 1 + std::min({ data[i - 1][j], data[i][j - 1], data[i - 1][j - 1] }); } } return data[X.size()][Y.size()]; } int main() { std::cout << "Usecase" << std::setw(10) << "Result" << std::endl; std::cout << "{abc, abc}" << std::setw(10) << levenshtein("abc", "abc") << std::endl; std::cout << "{abc, ABd}" << std::setw(10) << levenshtein("abc", "ABd") << std::endl; std::cout << "{aieou, axixexoxu}" << std::setw(10) << levenshtein("aieou", "axixexoxu") << std::endl; std::cout << "{AGGTAB, GXTXAYB}" << std::setw(10) << levenshtein("AGGTAB", "GXTXAYB") << std::endl; std::cin.get(); return 0; } |

And this is how the values are calculated, based on two strings: