The homepage of this test collection is http://ece.ut.ac.ir/dbrg/Hamshahri/

The Hamshahri corpus contains news documents from the Hamshahri online newspaper (http://www.hamshahrionline.ir/). The documents span 1996 to 2002 and cover 82 categories, such as politics, literature, art, and economics.

The corpus has the following statistics:

  Size: 345 MB (564 MB with tags)
  Number of documents: 166,774
  Number of unique terms: 417,339
  Average document length (words): 380
  Average document length (size): 1.8 KB (note: not unicode characters)

Documents range from short news items under 1 KB to long articles of up to 140 KB.

Two sets of queries and relevance judgements are provided.

The first set contains 65 topics, developed by 17 different people. The queries have an average length of 2.84 terms. The corresponding judgements were created via a pooling process over 7 different retrieval engines using the LM1, LM2, LM3, LM4, VS2, VS4, and VS5 models. Seventeen users (IT students) judged the top 100 documents as either relevant or not relevant, according to TREC guidelines.

The second set contains 58 topics, for which only the top 20 retrieved documents were judged.

More information regarding this corpus is available in this paper: http://ece.ut.ac.ir/dbrg/Hamshahri/Papers/Hamshahri_Description.pdf
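
The pooling process mentioned above can be illustrated with a minimal sketch. It assumes the common TREC-style reading of pooling: for each topic, the top-ranked documents from every retrieval engine are merged into one set, and only that set is shown to the judges. Whether the depth of 100 was applied per engine or to the merged pool is not stated here, so the per-engine cut-off below is an assumption. The function name build_pool, the run labels, and the document IDs are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def build_pool(runs: Dict[str, List[str]], depth: int = 100) -> Set[str]:
    """Return the union of the top-`depth` document IDs across all runs.

    `runs` maps an engine label to its ranked list of document IDs
    for a single topic (best match first). Assumption: the cut-off is
    applied to each engine's ranking separately before merging.
    """
    pool: Set[str] = set()
    for ranked_docs in runs.values():
        # Keep only the highest-ranked documents from this engine.
        pool.update(ranked_docs[:depth])
    return pool


# Hypothetical ranked lists for one topic from two engines.
runs = {
    "LM1": ["d3", "d1", "d7"],
    "VS2": ["d1", "d9", "d3"],
}

# With depth=2, "d7" falls below the cut-off and is left unjudged.
pool = build_pool(runs, depth=2)
print(sorted(pool))
```

Documents outside the pool are never judged, and in TREC-style evaluation they are usually treated as not relevant. This is why the judging depth (100 for the first query set, 20 for the second) affects how complete the relevance judgements are.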