SCUT_LLSF - implements the Scut thresholding technique from [2]
for the Linear Least Squares Fit classifier [3]
THRESHOLD=SCUT_LLSF(A, Q, CLUSTERS, K, LABELS_TR, LABELS_TE,
MINF1, L, METHOD, STEPS, SVD_METHOD, CLSI_METHOD) returns
the vector of thresholds for the Linear Least Squares Fit
classifier for the collection [A Q]. A and Q define the
training and test parts of the validation set with labels
LABELS_TR and LABELS_TE respectively. CLUSTERS is a
structure defining the classes, while MINF1 defines the
minimum F1 value and STEPS defines the number of steps
used during thresholding.
METHOD is the method used for the approximation of the
rank-l truncated SVD, with possible values:
- 'clsi': Clustered Latent Semantic Indexing [4].
- 'cm': Centroids Method [1].
- 'svd': Singular Value Decomosition.
SVD_METHOD defines the method used for the computation of
the SVD, while CLSI_METHOD defines the method used for the
determination of the number of factors from each class used
in Clustered Latent Semantic Indexing in case METHOD equals
'clsi'.
[THRESHOLD, F, THRESHOLDS]=SCUT_LLSF(A, Q, CLUSTERS, K,
LABELS_TR, LABELS_TE, MINF1, L, METHOD, STEPS, SVD_METHOD,
CLSI_METHOD) returns also the best F1 value as well as the
matrix of thresholds for each step (row i corresponds to
step i).
REFERENCES:
[1] H. Park, M. Jeon, and J. Rosen. Lower Dimensional
Representation of Text Data Based on Centroids and Least
Squares. BIT Numerical Mathematics, 43(2):427–448, 2003.
[2] Y. Yang. A Study of Thresholding Strategies for Text
Categorization. In Proc. 24th ACM SIGIR, pages 137–145,
New York, NY, USA, 2001. ACM Press.
[3] Y. Yang and C. Chute. A Linear Least Squares Fit
Mapping Method for Information Retrieval from Natural
Language Texts. In Proc. 14th Conference on Computational
Linguistics, pages 447–453, Morristown, NJ, USA, 1992.
[4] D. Zeimpekis and E. Gallopoulos, "Non-Linear Dimensional
Reduction via Class Representatives for Text Classification".
In Proc. 2006 IEEE International Conference on Data Mining
(ICDM'06), Hong Kong, Dec. 2006.
Copyright 2011 Dimitrios Zeimpekis, Eugenia Maria Kontopoulou,
Efstratios Gallopoulos