Table 2 Techniques adopted in existing and proposed algorithms

From: Correlated concept based dynamic document clustering algorithms for newsgroups and scientific literature

| Algorithm | Document representation | Similarity measure | Data set |
| --- | --- | --- | --- |
| Existing algorithms | | | |
| SHC, Gad and Kamel (2010) | Term weight (word/phrase relationship) | Semantic similarity | Reuters-21578 and 20-Newsgroups |
| ESHC-IntraCVS, Gavin and Yue (2009) | Term frequency | Cosine similarity | UW-CAN dataset, 314 web pages from University of Waterloo |
| CBA, Shehata (2010); Shehata et al. (2010) | Verb argument structure | Concept similarity measure | ACM abstract articles, Reuters, Brown corpus, Usenet newsgroups |
| ICA, Liu et al. (2008) | Term occurrence | Jaccard coefficient | 20NewsGroup corpus |
| Proposed algorithms | | | |
| TMARDC | Term frequency | MARDL, sentence similarity | ACM abstract articles, 20Newsgroup |
| CCMARC | Correlated terms | Semantic similarity | ACM abstract articles, 20Newsgroup |
| CCFICA | Correlated terms | Semantic similarity | ACM abstract articles, 20Newsgroup |
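
Two of the similarity measures named in the table, cosine similarity over term frequencies and the Jaccard coefficient over term occurrence, have standard textbook definitions. The sketch below illustrates only those definitions and is not the implementation used by any of the cited algorithms. The function names, the whitespace tokenization, and the sample documents are assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between the raw term-frequency vectors of two token lists."""
    tf_a, tf_b = Counter(doc_a), Counter(doc_b)
    # Only terms present in both documents contribute to the dot product.
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty documents
    return dot / (norm_a * norm_b)


def jaccard_coefficient(doc_a, doc_b):
    """Jaccard coefficient between the sets of terms occurring in two token lists."""
    set_a, set_b = set(doc_a), set(doc_b)
    union = set_a | set_b
    if not union:
        return 0.0  # convention chosen here for two empty documents
    return len(set_a & set_b) / len(union)


# Hypothetical sample documents, tokenized by whitespace (an assumption).
doc1 = "document clustering groups similar documents".split()
doc2 = "clustering similar documents by concept".split()

print(cosine_similarity(doc1, doc2))
print(jaccard_coefficient(doc1, doc2))
```

Cosine similarity takes term counts into account, while the Jaccard coefficient only considers whether a term occurs at all, which matches the "Term frequency" and "Term occurrence" representations paired with them in the table.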