For questions about text classification, the task of assigning predefined categories (or classes) to free-text documents.
Questions tagged [text-classification]
53 questions
8
votes
1 answer
Why are documents kept separated when training a text classifier?
Most of the literature treats text classification as the classification of documents. When using bag-of-words and Bayesian classification, authors usually use the TF-IDF statistic, where TF normalizes the word count with the number of words per…
freesoul
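A minimal sketch of the per-document TF-IDF weighting this question refers to, assuming whitespace tokenization and the plain `log(N / df)` form of IDF (libraries often use smoothed variants). The function name `tf_idf` and the sample documents are invented for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    TF is the term count divided by the document length; IDF is
    log(N / df), where df is the number of documents containing the term.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: count each term once per document.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Invented toy corpus; each string is kept as a separate document.
docs = ["free entry to win", "ok see you soon", "win a free prize"]
weights = tf_idf(docs)
```

Both factors depend on document boundaries: TF divides by the length of one document, and IDF counts how many documents contain a term. Concatenating all training texts into a single document would make the IDF of every term zero under this definition.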
4
votes
2 answers
Does summing up word vectors destroy their meaning?
For example, I have a paragraph that I want to classify in a binary manner. But because the inputs have to have a fixed length, I need to ensure that every paragraph is represented by a uniform quantity.
One thing I've done is taken every word in…
Arnav Das
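One common way to get a fixed-length input from a variable-length paragraph is to average its word vectors rather than sum them. The sketch below assumes a plain dictionary of toy 3-dimensional embeddings; the helper `average_vectors` and all vector values are hypothetical, not taken from the question.

```python
def average_vectors(tokens, embeddings, dim):
    """Mean of the embeddings of known tokens; zero vector if none are known."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

# Toy 3-dimensional embeddings, invented for illustration only.
embeddings = {
    "good": [1.0, 0.0, 2.0],
    "bad": [-1.0, 0.0, 2.0],
    "movie": [0.0, 3.0, 1.0],
}
short = average_vectors("good movie".split(), embeddings, 3)
long = average_vectors("good good movie movie".split(), embeddings, 3)
```

Averaging keeps the magnitude independent of paragraph length, which a raw sum does not. Either way, word order is lost: any two paragraphs containing the same words in a different order map to the same vector.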
4
votes
2 answers
Is there any classifier that works best in general for NLP-based projects?
I've written a program to analyse a given piece of text from a website and make conclusive classifications as to its validity. The code basically vectorizes the description (taken from the HTML of a given webpage in real time) and takes in a few…
Arnav Das
3
votes
1 answer
Using a pre-trained model to generate labels for data on which to then train a model
I'm trying to set up a pipeline for my ML models to automatically re-train themselves whenever concept drift occurs to recalibrate to the new output distributions. However, I can't get ground-truth from my data without manual labeling, and I want an…
Sanger Steel
3
votes
4 answers
How do the most frequently occurring words affect model efficiency?
Assume that I have a DataFrame with a text column.
Problem: Classification / Prediction
sms_text
0 Go until jurong point, crazy.. Available only ...
1 Ok lar... Joking wif u oni...
2 Free entry in 2 a wkly comp to win FA Cup fina...
3 …
Pluviophile
3
votes
1 answer
When is it time to switch to deep neural networks from simple networks in text classification problems?
I did an out-of-domain detection task (as a binary classification problem) and tried LR, Naive Bayes, and BERT, but the deep neural network didn't perform better than LR and NB. For LR I just used BOW, and it beat the 12-layer BERT.
In a…
Lerner Zhang
3
votes
1 answer
How can a system recognize if two strings have the same or similar meaning?
How can a system recognize if two strings have the same or similar meaning?
For example, consider the following two strings
Wikipedia provides good information.
Wikipedia is a good source of information.
What methods are available to do this?
John Hank
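As a simple baseline for the two example strings, the sketch below computes the cosine similarity of their word-count vectors. This is an assumption about one possible starting point, not a method named in the question; it measures lexical overlap only, so paraphrases that share few words would need embedding-based methods instead. The function name is illustrative.

```python
import math
import re
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between the word-count vectors of two strings."""
    va = Counter(re.findall(r"\w+", a.lower()))
    vb = Counter(re.findall(r"\w+", b.lower()))
    # Only words present in both strings contribute to the dot product.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0

s1 = "Wikipedia provides good information."
s2 = "Wikipedia is a good source of information."
score = cosine_similarity(s1, s2)
```

The two strings share three of their words ("wikipedia", "good", "information"), so the score lies strictly between 0 and 1.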
3
votes
2 answers
How to use LSTM to generate a paragraph
An LSTM model can be trained to generate text sequences by feeding the first word. After feeding the first word, the model will generate a sequence of words (a sentence). Feed the first word to get the second word, feed the first word + the second…
Dan D
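The feed-back loop described in this excerpt can be sketched as follows. The trained LSTM is replaced here by a hypothetical stand-in, `toy_next_word`, which looks only at the last word; the `generate` helper, the `<eos>` token, and the lookup table are all assumptions made for illustration.

```python
def generate(next_word, first_word, max_words=20, end_token="<eos>"):
    """Feed the growing sequence back into the model until it emits end_token."""
    words = [first_word]
    while len(words) < max_words:
        word = next_word(words)  # the model sees everything generated so far
        if word == end_token:
            break
        words.append(word)
    return words

# Stand-in for a trained LSTM: a lookup on the last word only.
toy_table = {"the": "cat", "cat": "sat", "sat": "down", "down": "<eos>"}

def toy_next_word(words):
    return toy_table.get(words[-1], "<eos>")

sentence = generate(toy_next_word, "the")
```

With a real model, `next_word` would run the network on the sequence so far and pick or sample a word from its output distribution. Producing a paragraph rather than a sentence then comes down to the stopping condition, for example continuing past sentence-ending tokens until a length limit is reached.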
2
votes
1 answer
Export trained model offline to be used by an application
I'm trying to create a text-based game using AI. I was playing around with text classification AutoML from Vertex AI just to learn AI, and then pick the best solution for my use case. Is it possible that I can train my data online (using any cloud…
Danilo Ruziska
2
votes
1 answer
How to go about classifying 1000 classes?
I am trying to find a research paper with theory (and preferably an implementation) about classifying 1000 (or more) classes. I have heard of an implementation in which clustering is done first, followed by classification with something like…
Naveen Reddy Marthala
2
votes
0 answers
NLP Bible verse division problem: What's the best model/method?
I'm working on a project compiling various versions of the Bible into a dataset. For the most part versions separate verses discretely. In some versions, however, verses are combined. Instead of verse 16, the marker will say 16-18. I wonder if,…
rwreed
2
votes
1 answer
How do RNNs for sentiment classification deal with different sentence lengths?
I have been doing a course which teaches you about Deep Neural Networks. During one of the exercises I was made to build an RNN for sentiment classification, which I did, but I did not understand how an RNN is able to deal with sentences of different…
jr123456jr987654321
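An RNN applies the same cell at every time step, so a single sentence of any length can be processed as is. Batching several sentences together is where lengths must agree, and a common approach is to pad to the longest sequence and carry a mask. The sketch below assumes token-id lists and a padding id of 0; `pad_batch` is a hypothetical helper, not part of any framework.

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad token-id sequences to the longest length and return a mask."""
    max_len = max(len(s) for s in sequences)
    padded = [s + [pad_id] * (max_len - len(s)) for s in sequences]
    # 1 marks a real token, 0 marks padding that the model should ignore.
    mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in sequences]
    return padded, mask

# Invented token ids for three sentences of different lengths.
batch = [[5, 2, 9], [7, 1], [4]]
padded, mask = pad_batch(batch)
```

The mask lets the model take the hidden state at each sentence's last real token (or skip padded steps) instead of the state after the padding.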
2
votes
2 answers
Is it possible that every class has a higher recall than precision for multi-class classification?
I am a student who recently started learning machine learning, and one thing keeps confusing me. I tried multiple sources and failed to find a related answer.
As the following table shows (this is from some paper):
Is it possible that every class has a higher…
Cheleeger Ken
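For a class with at least one true positive, recall exceeds precision exactly when that class has more false positives than false negatives. The sketch below checks this on an invented confusion matrix, assuming single-label classification where all metrics are computed on the same test set; the table from the paper is not available here, so the numbers are purely illustrative.

```python
def per_class_metrics(confusion):
    """confusion[i][j] = number of samples with true class i predicted as class j."""
    n = len(confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # true k, predicted otherwise
        fp = sum(confusion[i][k] for i in range(n)) - tp   # predicted k, true otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        metrics.append({"fp": fp, "fn": fn, "precision": precision, "recall": recall})
    return metrics

# Invented 3-class confusion matrix, for illustration only.
confusion = [
    [8, 1, 1],
    [3, 6, 1],
    [2, 0, 8],
]
metrics = per_class_metrics(confusion)
total_fp = sum(m["fp"] for m in metrics)
total_fn = sum(m["fn"] for m in metrics)
```

Under these assumptions every misclassified sample is a false negative for its true class and a false positive for its predicted class, so the two totals are always equal. That rules out every class having strictly more false positives than false negatives at once. A table showing higher recall than precision for all classes would suggest a different setup, such as multi-label evaluation or metrics taken from different data.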
2
votes
0 answers
Are Bayesian neural networks suited for text (or document) classification?
I've tried to do my research on Bayesian neural networks online, but I find most of them are used for image classification. This is probably due to the nature of Bayesian neural networks, which may be significantly slower than traditional artificial…
Nicole
2
votes
0 answers
Language Learning feedback with AI
Is there a program under development that uses AI technology, like Siri, to "hold hands" so to speak with a language learner and coach them on accent, colloquial expressions, or to let them guide the language learning process using an archive of…
Tristan Beckwith