Retrieving the top 5 TFIDF-ranked terms from the top-ranked document for each search query, to build an improved, expanded search query.
- Pre-processing the given text collection and building an inverted index for it.
- Parsing the already created Ranked Information Retrieval results (results.ranked.txt) for each of the given queries (queries.lab3.txt), which are based on the TFIDF formula.
- Taking the highest ranked document for each query and calculating the TFIDF for each term in that pre-processed document.
- Retrieving the top 5 terms by TFIDF score, which are treated here as the most relevant terms for expanding the original query.
- Parsing and pre-processing the given queries and writing everything to the results file Qm.1.5.txt.
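
The term-scoring and expansion steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the log-weighted TFIDF variant (1 + log10 tf) * log10(N / df), and the function names (`tfidf_scores`, `top_expansion_terms`, `expand_query`), the `doc_freq` dictionary, and the choice to skip expansion terms already present in the query are all hypothetical. Parsing results.ranked.txt and queries.lab3.txt is left out because their layout is not described here.

```python
import math
from collections import Counter


def tfidf_scores(doc_tokens, doc_freq, num_docs):
    """Score each distinct term of one pre-processed document.

    Assumed weighting: (1 + log10(tf)) * log10(N / df).
    doc_freq maps a term to the number of documents containing it,
    as it could be read off an inverted index.
    """
    term_freq = Counter(doc_tokens)
    scores = {}
    for term, tf in term_freq.items():
        df = doc_freq.get(term, 0)
        if df == 0:
            continue  # term missing from the index, so it cannot be weighted
        scores[term] = (1 + math.log10(tf)) * math.log10(num_docs / df)
    return scores


def top_expansion_terms(doc_tokens, doc_freq, num_docs, k=5):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    scores = tfidf_scores(doc_tokens, doc_freq, num_docs)
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def expand_query(query_tokens, expansion_terms):
    """Append expansion terms to the query, skipping ones it already has
    (an assumption; the original may simply append all of them)."""
    seen = set(query_tokens)
    return query_tokens + [t for t in expansion_terms if t not in seen]
```

For example, calling `top_expansion_terms(tokens_of_top_document, doc_freq, num_docs)` for each query and passing the result to `expand_query` would give the expanded query terms to write to the results file.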
https://www.inf.ed.ac.uk/teaching/courses/tts/labs/lab5.html