Shortest common supersequence problem
In computer science, the shortest common supersequence problem is a problem closely related to the longest common subsequence problem. Given two sequences X = <x1,...,xm> and Y = <y1,...,yn>, a sequence U = <u1,...,uk> is a common supersequence of X and Y if U is a supersequence of both X and Y, that is, if both X and Y can be obtained from U by deleting zero or more symbols without reordering the rest.
A shortest common supersequence (scs) is a common supersequence of minimal length. In the shortest common supersequence problem, the two sequences X and Y are given and the task is to find a shortest possible common supersequence of these sequences. In general, an scs is not unique.
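The definitions above can be checked mechanically. The following is a minimal sketch in Python; the function names `is_subsequence` and `is_common_supersequence` are illustrative choices, not part of any standard library.

```python
def is_subsequence(s, u):
    """Return True if s can be obtained from u by deleting zero or more symbols."""
    it = iter(u)
    # Each `in` test consumes the iterator up to the first match,
    # so the symbols of s are matched in u strictly from left to right.
    return all(symbol in it for symbol in s)


def is_common_supersequence(u, *sequences):
    """Return True if every given sequence is a subsequence of u."""
    return all(is_subsequence(s, u) for s in sequences)
```

Note that this only tests whether a candidate U is a common supersequence; it says nothing about whether U is shortest.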
For two input sequences, an scs can be formed easily from a longest common subsequence (lcs). If Z is an lcs of X and Y, then inserting the symbols of X and Y that do not belong to Z into Z, while preserving their original order, yields an scs U.
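One way to carry out this construction is with the standard dynamic-programming table for the lcs, followed by a backward walk that emits matched symbols once and unmatched symbols from either input. The sketch below assumes the inputs are Python strings; the names `lcs_table` and `shortest_common_supersequence` are illustrative.

```python
def lcs_table(x, y):
    """Build the table where table[i][j] is the lcs length of x[:i] and y[:j]."""
    m, n = len(x), len(y)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table


def shortest_common_supersequence(x, y):
    """Return one scs of the strings x and y (it need not be the only one)."""
    table = lcs_table(x, y)
    i, j = len(x), len(y)
    out = []  # collected in reverse order
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            # A symbol of the lcs: emit it once for both inputs.
            out.append(x[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            # Non-lcs symbol taken from x.
            out.append(x[i - 1])
            i -= 1
        else:
            # Non-lcs symbol taken from y.
            out.append(y[j - 1])
            j -= 1
    # At most one of the two prefixes is still non-empty here.
    out.extend(reversed(x[:i]))
    out.extend(reversed(y[:j]))
    return "".join(reversed(out))
```

The table takes O(mn) time and space for inputs of lengths m and n. Different tie-breaking in the backward walk can produce a different scs of the same length.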
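For two input sequences X of length m and Y of length n, the length of an scs equals m + n minus the length of an lcs, because every symbol of X and Y appears in the scs and each symbol of the lcs is counted once rather than twice. However, for three or more input sequences this relation does not hold. Note also that the lcs and the scs problems are not dual problems.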
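For two sequences of lengths m and n, the length of an scs plus the length of an lcs equals m + n. A small sketch can confirm this numerically by computing the two lengths with independent dynamic programs; the function names `lcs_length` and `scs_length` are illustrative.

```python
def lcs_length(x, y):
    """Length of a longest common subsequence of x and y."""
    m, n = len(x), len(y)
    t = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                t[i][j] = t[i - 1][j - 1] + 1
            else:
                t[i][j] = max(t[i - 1][j], t[i][j - 1])
    return t[m][n]


def scs_length(x, y):
    """Length of a shortest common supersequence of x and y, computed directly."""
    m, n = len(x), len(y)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # only x[:i] has to be covered
    for j in range(n + 1):
        d[0][j] = j  # only y[:j] has to be covered
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                d[i][j] = d[i - 1][j - 1] + 1
            else:
                d[i][j] = 1 + min(d[i - 1][j], d[i][j - 1])
    return d[m][n]


# Check the two-sequence relation on a few sample pairs.
for x, y in [("abcbdab", "bdcaba"), ("abc", "xyz"), ("aaaa", "aa"), ("", "ab")]:
    assert scs_length(x, y) + lcs_length(x, y) == len(x) + len(y)
```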
The more general problem of finding a shortest string S that is a superstring of a set of strings S1, S2, ..., Sl is NP-complete.[1] Good approximations can be found for the average case but not for the worst case.[2][3]
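Because an exact solution for many inputs is intractable in general, a practical fallback is a heuristic. The sketch below is a simple illustrative heuristic, not an algorithm taken from the cited papers: it folds an exact two-sequence merge over the inputs in the given order. The result is always a common supersequence of all inputs, but it is not guaranteed to be a shortest one and it depends on the input order. The names `merge_two` and `fold_supersequence` are hypothetical.

```python
from functools import reduce


def merge_two(x, y):
    """Return one shortest common supersequence of the strings x and y."""
    m, n = len(x), len(y)
    # cost[i][j] = length of an scs of the suffixes x[i:] and y[j:]
    cost = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m, -1, -1):
        for j in range(n, -1, -1):
            if i == m:
                cost[i][j] = n - j
            elif j == n:
                cost[i][j] = m - i
            elif x[i] == y[j]:
                cost[i][j] = 1 + cost[i + 1][j + 1]
            else:
                cost[i][j] = 1 + min(cost[i + 1][j], cost[i][j + 1])
    out, i, j = [], 0, 0
    while i < m and j < n:
        if x[i] == y[j]:
            out.append(x[i])  # shared symbol, emitted once
            i += 1
            j += 1
        elif cost[i + 1][j] <= cost[i][j + 1]:
            out.append(x[i])
            i += 1
        else:
            out.append(y[j])
            j += 1
    # Append whatever remains of the input that was not exhausted.
    return "".join(out) + x[i:] + y[j:]


def fold_supersequence(sequences):
    """Heuristic: merge the inputs pairwise, left to right (not optimal in general)."""
    return reduce(merge_two, sequences, "")
```

For example, `fold_supersequence(["abc", "bca", "cab"])` returns a string that contains each of the three inputs as a subsequence; trying several input orders and keeping the shortest result is one cheap way to improve on a single pass.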
References
- [1] Kari-Jouko Räihä, Esko Ukkonen (1981). "The shortest common supersequence problem over binary alphabet is NP-complete". Theoretical Computer Science 16 (2): 187–198. doi:10.1016/0304-3975(81)90075-x.
- [2] Tao Jiang and Ming Li (1994). "On the Approximation of Shortest Common Supersequences and Longest Common Subsequences". SIAM Journal on Computing 24 (5): 1122–1139. doi:10.1137/s009753979223842x.
- [3] Marek Karpinski and Richard Schmied (2013). "On Improved Inapproximability Results for the Shortest Superstring and Related Problems". Proceedings of 19th CATS CRPIT 141: 27–36.
- Garey, Michael R.; Johnson, David S. (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman. p. 228 A4.2: SR8. ISBN 0-7167-1045-5. Zbl 0411.68039.
- Szpankowski, Wojciech (2001). Average case analysis of algorithms on sequences. Wiley-Interscience Series in Discrete Mathematics and Optimization. With a foreword by Philippe Flajolet. Chichester: Wiley. ISBN 0-471-24063-X. Zbl 0968.68205.