If you run a web page that you want many people to visit, or you are curious about how keyword searches produce the results they do, it helps to know a little more about latent semantic indexing and how it works.
Latent semantic indexing (LSI) is a technique that projects queries and documents into a space with latent semantic dimensions. In this latent semantic space, a query and a document can be similar even if they share no terms, as long as their terms are semantically similar.
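As a rough illustration of this idea, the following Python sketch builds a tiny term-document matrix and projects the documents and a query into a two-dimensional latent space using a truncated singular value decomposition (SVD), the standard tool behind LSI. The vocabulary, the three documents, the helper name cosine, and the choice of k = 2 are all made up for this example; it is a toy under those assumptions, not a production retrieval system.

```python
import numpy as np

# Hypothetical toy vocabulary and documents (not from any real corpus).
terms = ["car", "automobile", "engine", "flower", "petal"]
# Rows are terms, columns are documents:
#   d0 = "car engine", d1 = "automobile engine", d2 = "flower petal"
A = np.array([
    [1, 0, 0],  # car
    [0, 1, 0],  # automobile
    [1, 1, 0],  # engine
    [0, 0, 1],  # flower
    [0, 0, 1],  # petal
], dtype=float)


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    return float(u @ v / (nu * nv)) if nu > 0 and nv > 0 else 0.0


# Query containing only the term "car".
q = np.array([1, 0, 0, 0, 0], dtype=float)

# Thin SVD; singular values come back sorted in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2               # number of latent dimensions kept (assumed)
U_k = U[:, :k]

# Project the documents and the query with the same map U_k^T.
docs_k = U_k.T @ A  # shape (k, number of documents)
q_k = U_k.T @ q     # shape (k,)

n_docs = A.shape[1]
raw_sims = [cosine(q, A[:, j]) for j in range(n_docs)]
lsi_sims = [cosine(q_k, docs_k[:, j]) for j in range(n_docs)]

print("term overlap:", np.round(raw_sims, 3))
print("latent space:", np.round(lsi_sims, 3))
```

In the raw term space the query and d1 share no terms, so their cosine similarity is zero. In the latent space they end up close together, because "car" and "automobile" both co-occur with "engine", while the unrelated d2 stays far away.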
LSI is a similarity metric that serves as an alternative to word overlap measures. Because the latent semantic space has fewer dimensions than the original term space, LSI is also a method for dimensionality reduction.
There are many possible mappings from a high-dimensional space to a low-dimensional one. LSI chooses the mapping that is optimal in the sense that it minimizes the distance between the original term-document matrix and its low-dimensional approximation.
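The sense in which the mapping is optimal can be checked numerically. The sketch below assumes the usual formulation of LSI via a truncated SVD and measures distance with the Frobenius norm; the random count matrix, the seed, and k = 2 are placeholders chosen for illustration. It compares the rank-k SVD approximation with another rank-k mapping, a projection onto random orthonormal directions, and the SVD error should never be the larger of the two.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder term-document counts: 8 terms, 6 documents.
A = rng.integers(0, 4, size=(8, 6)).astype(float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2  # target dimensionality (assumed)

# Rank-k approximation built from the k largest singular values.
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
svd_error = np.linalg.norm(A - A_k)  # Frobenius norm by default

# The error equals the root of the sum of squared discarded singular values.
discarded = np.sqrt(np.sum(s[k:] ** 2))

# A competing rank-k mapping: project onto k random orthonormal directions.
Q, _ = np.linalg.qr(rng.standard_normal((A.shape[0], k)))
A_rand = Q @ (Q.T @ A)
rand_error = np.linalg.norm(A - A_rand)

print(f"SVD error:    {svd_error:.4f} (discarded mass {discarded:.4f})")
print(f"random error: {rand_error:.4f}")
```

Any other matrix of rank at most k could be substituted for the random projection here; the truncated SVD is the choice that keeps the approximation error smallest.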
Choosing the number of dimensions is a problem in its own right. Reducing the dimensions can remove much of the noise, but keeping too few may lose important information.
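One way to get a feel for this trade-off is to look at how much of the matrix is retained as dimensions are added. The sketch below uses a generic heuristic, the cumulative share of squared singular values, on a made-up term-document matrix; the 0.8 threshold and the variable names are assumptions for illustration, and this measure is not a substitute for evaluating retrieval quality directly.

```python
import numpy as np

# Hypothetical toy term-document matrix (rows are terms, columns are documents).
A = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
], dtype=float)

# Singular values only, sorted in descending order.
s = np.linalg.svd(A, compute_uv=False)

# Share of the total squared singular-value mass kept by the first k dimensions.
energy = np.cumsum(s ** 2) / np.sum(s ** 2)
for k, frac in enumerate(energy, start=1):
    print(f"k={k}: {frac:.3f} retained")

# Smallest k whose retained share reaches the (assumed) threshold.
threshold = 0.8
k_min = int(np.searchsorted(energy, threshold)) + 1
print("smallest k meeting the threshold:", k_min)
```

A curve like this only shows how much of the matrix survives the reduction. Whether the discarded part was noise or useful information still has to be judged on the retrieval task itself.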
LSI performance improves considerably after ten to twenty dimensions and peaks at seventy to one hundred dimensions....