Latent semantic indexing (LSI), also known as latent semantic analysis, is a natural language processing technique in vector semantics that analyzes the relationships between a set of documents and the terms they contain. It also produces a set of concepts related to the documents and their terms.
The concept space produced by latent semantic indexing can be used to compare documents with one another. This is useful for data clustering and document classification.
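Comparing documents in the concept space can be sketched with a truncated singular value decomposition, which is the usual way LSI is computed. The example below is only an illustration under stated assumptions: the term-document matrix, the term list, and the choice of k = 2 concepts are hypothetical and do not come from the text.

```python
import numpy as np

# Hypothetical toy term-document matrix: rows are terms, columns are documents.
# Rows, in order: car, automobile, engine, flower, petal.
A = np.array([
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 0],  # petal
], dtype=float)


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Truncated SVD: keep only the k largest singular values (k = 2 is an assumption).
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# One row of k concept coordinates per document.
doc_concepts = (np.diag(s[:k]) @ Vt[:k, :]).T

raw_sim = cosine(A[:, 0], A[:, 1])                      # documents 0 and 1, raw term space
concept_sim = cosine(doc_concepts[0], doc_concepts[1])  # same pair, concept space
cross_sim = cosine(doc_concepts[0], doc_concepts[2])    # car document vs. flower document

print(raw_sim, concept_sim, cross_sim)
```

On this toy matrix, documents 0 and 1 share only the term "engine", so their raw cosine similarity is 0.5, while in the two-dimensional concept space they become nearly identical and stay unrelated to the flower documents. A clustering or classification step would then operate on the rows of `doc_concepts` instead of the raw term counts.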
LSI can also be used to find similar documents across languages, which is called cross-language retrieval, and to find relations between terms, such as synonymy and polysemy.
Given a query of terms, LSI can translate the query into the concept space and find matching documents there. This is commonly known as information retrieval.
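The query step can be sketched as follows: the query is written as a term vector, projected onto the retained concept directions, and compared with each document by cosine similarity. This is a minimal sketch, not a definitive implementation. The matrix, the term list, the helper `to_concept_space`, and k = 2 are all hypothetical names and values chosen for illustration.

```python
import numpy as np

# Hypothetical toy term-document matrix (rows: terms, columns: documents).
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 0],  # petal
], dtype=float)

# Keep the k strongest concept directions (k = 2 is an assumption).
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k = U[:, :k]


def to_concept_space(term_vector):
    """Project a term-count vector onto the k retained concept directions."""
    return U_k.T @ term_vector


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# A query containing only the term "car".
query = np.array([1.0 if t == "car" else 0.0 for t in terms])
q_k = to_concept_space(query)

# Concept-space similarity of the query to every document.
scores = [cosine(q_k, to_concept_space(A[:, j])) for j in range(A.shape[1])]

# For contrast: how often the literal query term occurs in each document.
exact_overlap = A.T @ query

print(scores)
print(exact_overlap)
```

Document 1 contains "automobile" and "engine" but never "car", so its exact overlap with the query is zero. In the concept space it still scores close to 1, because "car" and "automobile" co-occur with "engine" and fall on the same concept direction.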
Synonymy and polysemy are fundamental problems in natural language processing. Synonymy is the case where different words describe the same idea.
As a result, a search engine query may fail to retrieve a relevant document simply because the document does not contain the words appearing in the query. So even if words have the same meanings, the search query may not turn up...
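The failure described here is easy to reproduce with literal word matching. The snippet below is a small illustration only. The function `keyword_match` and the two sample documents are hypothetical, and real search engines are more elaborate than this.

```python
def keyword_match(query, document):
    """Return True if every query word appears literally in the document."""
    doc_words = set(document.lower().split())
    return all(word in doc_words for word in query.lower().split())


# Two hypothetical documents that describe the same thing with different words.
docs = [
    "the car has a new engine",
    "the automobile has a new engine",
]

hits = [keyword_match("car", d) for d in docs]
print(hits)  # prints [True, False]
```

The second document is just as relevant to the query "car" as the first, but it is missed because it uses the synonym "automobile".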