Algorithm of the day -

The Needleman-Wunsch Algorithm

Summary

The Needleman-Wunsch algorithm is a dynamic programming algorithm used for global alignment of two sequences, finding the optimal alignment by maximizing the number of matches and minimizing the number of gaps

Use Case

The Needleman-Wunsch algorithm can be used in bioinformatics to align two DNA or protein sequences, helping to identify similarities and differences between them

Steps

  1. Initialize a 2D matrix to store the scores of the alignments
  2. Fill in the matrix by comparing the characters of the two sequences and calculating the scores based on matches, mismatches, and gaps
  3. Trace back the matrix to find the optimal alignment

Complexity

The time complexity of the Needleman-Wunsch algorithm is O(n*m) and the space complexity is O(n*m), where n and m are the lengths of the two sequences

Code Example

def needleman_wunsch(seq1, seq2, match=1, mismatch=-1, gap=-1):
    m, n = len(seq1), len(seq2)
    score_matrix = [[0 for _ in range(n+1)] for _ in range(m+1)]
    for i in range(m+1):
        score_matrix[i][0] = gap * i
    for j in range(n+1):
        score_matrix[0][j] = gap * j
    for i in range(1, m+1):
        for j in range(1, n+1):
            match_score = score_matrix[i-1][j-1] + (match if seq1[i-1] == seq2[j-1] else mismatch)
            delete_score = score_matrix[i-1][j] + gap
            insert_score = score_matrix[i][j-1] + gap
            score_matrix[i][j] = max(match_score, delete_score, insert_score)
    alignment = [''] * (m + n)
    i, j = m, n
    while i > 0 and j > 0:
        score = score_matrix[i][j]
        score_diagonal = score_matrix[i-1][j-1]
        score_up = score_matrix[i-1][j]
        score_left = score_matrix[i][j-1]
        if score == score_diagonal + (match if seq1[i-1] == seq2[j-1] else mismatch):
            alignment[i+j-1] = seq1[i-1]
            alignment[i+j] = seq2[j-1]
            i -= 1
            j -= 1
        elif score == score_up + gap:
            alignment[i+j-1] = seq1[i-1]
            alignment[i+j] = '-'
            i -= 1
        else:
            alignment[i+j-1] = '-'
            alignment[i+j] = seq2[j-1]
            j -= 1
    while i > 0:
        alignment[i] = seq1[i-1]
        i -= 1
    while j > 0:
        alignment[j] = seq2[j-1]
        j -= 1
    return ''.join(alignment[m:]), ''.join(alignment[:m])