In the simple example in the previous section, the penalty for a gap is equal at all locations in the alignment. However, it often makes sense to penalise gaps differently at the ends, or at different positions within each sequence. For example, if a protein domain is being aligned to a longer sequence that is known to contain the domain, the penalties at the ends of the domain should be reduced to allow the domain to slide over the longer sequence. If the secondary structure of one protein in a pair to be aligned is known, then increasing the gap-penalty within core secondary structure elements will reduce the likelihood of placing a gap within a secondary structure element [Barton & Sternberg, 1987a; Lesk et al., 1986].
Both changes require simple modifications to the algorithm. End gaps
are adjusted by changing the gap-penalty constants for the 0th and
last row and column of the H matrix. Position-specific gaps are set
by having a vector of penalties P of length m rather than a single
constant penalty.
This modifies the calculation of H_{i,j} so that the penalty for aligning residue A_i with a gap is taken from the vector P rather than from the single constant.
In this formulation, the gap-vector P refers to sequence A. Thus, the weight for aligning any residue in A with a gap will depend on where the residue is in A. In contrast, aligning a residue in B with a gap is penalised equally irrespective of position.
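As a sketch of the recurrence just described, the calculation might be written as follows. The notation is an assumption and is not confirmed by the text here: w_{A_i,B_j} stands for the score of pairing residues A_i and B_j, g for the constant penalty applied when a residue of B is aligned with a gap, and penalties are taken to be subtracted from the score.

```latex
H_{i,j} = \max
\begin{cases}
  H_{i-1,j-1} + w_{A_i,B_j} & \text{($A_i$ aligned with $B_j$)} \\
  H_{i-1,j} - P_i           & \text{($A_i$ aligned with a gap)} \\
  H_{i,j-1} - g             & \text{($B_j$ aligned with a gap)}
\end{cases}
```

Only the second term differs from a constant-penalty scheme: it uses the entry of P for position i in A, whereas the third term keeps the same penalty at every position in B.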
There are many ways of modifying position-specific gap-penalties. For
example, P can be applied to gaps in both sequences, but dependent
only on the position in A, so eliminating the fixed constant penalty.
Alternatively, a second gap-penalty vector can be introduced for B.
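
The two modifications discussed in this section can be illustrated with a minimal Python sketch. It is not the implementation used by any particular program. The function name align_score, the parameter names gap_a, gap_b and end_scale, the default +1/-1 match/mismatch scoring, and the use of a single scaling factor for end gaps are all assumptions made for illustration. The sketch computes only the best global alignment score, with a position-specific penalty vector for sequence A and a constant penalty for sequence B.

```python
def align_score(a, b, gap_a, gap_b, end_scale=1.0, score=None):
    """Best global alignment score with position-specific gap penalties.

    gap_a[i]  : penalty for aligning residue a[i] with a gap (the vector P).
    gap_b     : constant penalty for aligning any residue of b with a gap.
    end_scale : factor applied to penalties in the 0th and last row and
                column of H (assumed scheme; 0.0 makes end gaps free).
    """
    if score is None:
        # Assumed toy scoring: +1 for identity, -1 otherwise.
        score = lambda x, y: 1.0 if x == y else -1.0
    m, n = len(a), len(b)
    if len(gap_a) != m:
        raise ValueError("gap_a must have one penalty per residue of a")

    H = [[0.0] * (n + 1) for _ in range(m + 1)]

    # 0th column: residues of a aligned with a leading gap.
    for i in range(1, m + 1):
        H[i][0] = H[i - 1][0] - end_scale * gap_a[i - 1]
    # 0th row: residues of b aligned with a leading gap.
    for j in range(1, n + 1):
        H[0][j] = H[0][j - 1] - end_scale * gap_b

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # a[i-1] against a gap: a move down column j (an end gap if j == n).
            pen_a = gap_a[i - 1] * (end_scale if j == n else 1.0)
            # b[j-1] against a gap: a move along row i (an end gap if i == m).
            pen_b = gap_b * (end_scale if i == m else 1.0)
            H[i][j] = max(
                H[i - 1][j - 1] + score(a[i - 1], b[j - 1]),
                H[i - 1][j] - pen_a,
                H[i][j - 1] - pen_b,
            )
    return H[m][n]


# A short "domain" sliding over a longer sequence.
print(align_score("CDE", "ABCDEFG", [2, 2, 2], 2, end_scale=0.0))  # 3.0
print(align_score("CDE", "ABCDEFG", [2, 2, 2], 2))                 # -5.0
```

With free end gaps the three matching residues score 3.0 regardless of where the short sequence sits; with full end penalties the four unmatched residues of the longer sequence each cost 2, giving -5.0. The variants described above would change only the two lines that compute pen_a and pen_b, for example by reading a second vector indexed by j for gaps opposite residues of B.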