Text Classification
Text classification is used here as a downstream task for the extrinsic evaluation of word embeddings.
Some concepts are drawn from the Fudan University NLP Group (复旦大学NLP组).
Statistics-Based Method
One statistics-based approach to text classification is described as follows [Li Y. 2015]:
We use Tencent news titles as our text classification dataset. A total of 8,826 titles of four categories (society, entertainment, healthcare, and military) are extracted. The lengths of titles range from 10 to 20 words. We train ℓ2-regularized logistic regression classifiers using the LIBLINEAR package (Fan et al, 2008) with the learned embeddings.
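The setup quoted above can be sketched in a few lines. This is only an illustrative sketch under several assumptions: the vocabulary, the four-dimensional stand-in embeddings, and the toy titles are hypothetical, and averaging word embeddings to obtain a title vector is an assumed feature construction that the quoted passage does not specify. LIBLINEAR itself is not called here; plain batch gradient descent on a one-vs-rest ℓ2-regularized logistic loss stands in for its solver.

```python
import math

CATEGORIES = ["society", "entertainment", "healthcare", "military"]

# Hypothetical toy vocabulary; the real experiment uses 8,826 Tencent news
# titles and embeddings learned by the model under evaluation.
VOCAB = {
    "society": ["city", "school", "traffic"],
    "entertainment": ["film", "singer", "concert"],
    "healthcare": ["hospital", "vaccine", "doctor"],
    "military": ["army", "navy", "drill"],
}

# Stand-in embeddings: one dominant dimension per category plus a small offset.
EMBEDDINGS = {}
for k, category in enumerate(CATEGORIES):
    for j, word in enumerate(VOCAB[category]):
        vector = [0.05 * j] * len(CATEGORIES)
        vector[k] = 1.0
        EMBEDDINGS[word] = vector


def title_features(tokens):
    """Average the embeddings of the known tokens in a title (assumed scheme)."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    dim = len(CATEGORIES)
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_binary(X, y, l2=0.01, lr=0.5, epochs=300):
    """Fit one L2-regularized logistic regression by batch gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [l2 * wi for wi in w], 0.0  # gradient of the L2 penalty
        for x, target in zip(X, y):
            error = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - target
            for i, xi in enumerate(x):
                grad_w[i] += error * xi / len(X)
            grad_b += error / len(X)
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def train_one_vs_rest(X, labels):
    """Train one binary classifier per category."""
    return {
        c: train_binary(X, [1.0 if label == c else 0.0 for label in labels])
        for c in CATEGORIES
    }


def predict(models, x):
    """Return the category whose classifier gives the highest score."""
    def score(c):
        w, b = models[c]
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(CATEGORIES, key=score)


# Build a small training set: four toy titles per category.
titles, labels = [], []
for category, (w0, w1, w2) in VOCAB.items():
    for tokens in ([w0, w1], [w1, w2], [w0, w2], [w0, w1, w2]):
        titles.append(tokens)
        labels.append(category)

X = [title_features(t) for t in titles]
models = train_one_vs_rest(X, labels)
predictions = [predict(models, x) for x in X]
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print(f"training accuracy: {accuracy:.2f}")
print(predict(models, title_features(["vaccine", "unknownword"])))
```

In an extrinsic evaluation, the classifier and the data stay fixed while the embedding table is swapped, so differences in held-out accuracy can be attributed to the embeddings. To use LIBLINEAR itself from Python, scikit-learn's `LogisticRegression(penalty="l2", solver="liblinear")` is one option, since that solver is backed by the LIBLINEAR library.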
Bibliography
Fudan University NLP Group (复旦大学NLP组). NLP-Beginner. https://github.com/FudanNLP/nlp-beginner
[Li Y. 2015] Li Y, Li W, Sun F, et al. Component-Enhanced Chinese Character Embeddings[J]. Empirical Methods in Natural Language Processing, 2015: 829-834.
Reposted from //www.cnblogs.com/fengyubo/p/11118431.html