This post outlines several ways to filter duplicate e-commerce content and describes how indexing helps determine the similarity between documents.