Algorithm name with author(s) | Technical abbreviation | Representation | Similarity measure | Data set used |
---|---|---|---|---|
Threshold Resilient Online Algorithm Chou and Chen (2008) | IPLSI (Incremental Probabilistic Latent Semantic Indexing) | Latent Semantic Variables | Cosine function, with a latent variable introduced between documents and terms | NIST TDT Corpora |
Efficient Phrase-Based Indexing Hammouda and Kamel (2004) | DIG (Document Index Graph) for Web Clustering | Document Index Graph (Phrase-Based Representation) | Phrase-Based Similarity measure | USENET News Groups |
Component-Based Clustering Algorithms Boris et al. (2012) | IR (Initial Representative), MD (Measure Distance), UR (Update Representatives), EC (Evaluate Clusters), SC (Stop Criterion) | Object-Based Software Representation | CITY, CORREL, COSINE, EUCLID | 10 UCI Datasets |
Temporal Queries and Version Management Zaniolo and Wang (2008) | XML Techniques | V-Document (XML Document) | ---- | W3C, World Fact Book |
Density-Based Methods for Hierarchical Clustering Chehreghani and Abolhassani (2008) | 3 Phases: Insertion Phase, Extraction Phase, Combination Phase | M-Tree Structure | Relative distance between objects | DMOZ, NEWS, REUTERS |
XML Schema Matching Algorithm Alsayed et al. (2009) | NPS (Number Prüfer Sequences), LPS (Label Prüfer Sequences) | Prüfer Sequences, Schema Trees | The distance between two nodes in the schema tree | XCBL, OAGIS |
Novel Web User Clustering Method Ling et al. (2009) | 3-Phase COWES Algorithm | A Web Session Subtree | DoC (Degree of Change), FoC (Frequency of Change) and SoC (Significance of Change) | Internet Traffic Archive |
Multi-label Document Clustering Algorithm Chen et al. (2010) | FMDC (Fuzzy-Based Multi-label Document Clustering): Fuzzy Association Rule + Existing Ontology | Terms and Hypernyms Representation of documents | Membership Functions and Document Term Matrix | Classic, Re0, R8, and WebKB |
Incremental Construction of Multilingual Topic Maps Ellouze et al. (2012) | CITOM (Construction Incremental Topic Map) | Topic Map Model Representation | Topic Map Pruning Process | Multilingual corpora |
Feature Extraction Algorithm Yan et al. (2011) | TOFA (Trace-Oriented Feature Analysis) | Bag of Words Model (BOW) | Latent Semantic Indexing (LSI) | 20NG, RCV1, ODP |
Correlation Similarity Measure Space Zhang et al. (2011) | CPI (Correlation Preserving Index) | Terms and related terms | Correlation similarity | 20NG |
Contextual Document Cluster Rooney et al. (2006) | CDC (Contextual Document Cluster) | Term Document Representation | Adjacent Document Similarity | RCV1 |
Framework of Wikipedia-Based Clustering Hu et al. (2009) | Exact-match and Relatedness-match | Concept feature vector and Category feature vector | Complete Linkage as cluster distance measure | 20-newsgroup, TDT2, LA Times |
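
Several rows above name generic vector measures (cosine, city-block, correlation, Euclidean). As a minimal sketch of what these compute, the following applies each to two small term-frequency vectors. The function names and the example vectors `doc1` and `doc2` are illustrative assumptions and are not taken from any of the surveyed papers, whose exact formulations may differ.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length term vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line (L2) distance between two term vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def city_block_distance(a, b):
    """City-block (L1, Manhattan) distance between two term vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def correlation_similarity(a, b):
    """Pearson correlation between two term vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # convention chosen here for constant vectors
    return cov / math.sqrt(var_a * var_b)

# Hypothetical term-frequency vectors over a shared four-term vocabulary.
doc1 = [2, 1, 0, 1]
doc2 = [1, 1, 1, 0]

print("cosine:     ", cosine_similarity(doc1, doc2))
print("euclidean:  ", euclidean_distance(doc1, doc2))
print("city-block: ", city_block_distance(doc1, doc2))
print("correlation:", correlation_similarity(doc1, doc2))
```

Cosine and correlation are similarities (larger means more alike), while Euclidean and city-block are distances (smaller means more alike), so a clustering algorithm has to treat the two groups with opposite orientation.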