We typically rely on algorithms such as Decision Trees, SVM, Naive Bayes, and Logistic Regression for simple classification tasks. But how about using simple rules instead? For example, can't we look for specific words to decide whether a text carries a positive or a negative sentiment? Words like pathetic, worst, and poor point to negative sentiment, whereas great, fabulous, and superb point to positive sentiment.
Let us see how we can use the Orange Data Mining tool to achieve this objective. The following is the Orange Data Mining workflow.
You can download the workflow from GitHub: dineshasanka/Orange-Data-Mining---Text-Analyitics (github.com)
1. Import Documents was used to load a film review dataset.
2. Preprocess Text was used to convert the text to lowercase and remove URLs.
3. Statistics is the key component in this workflow. This is where you define the keywords to look for.
5. At this point, the dataset looks like this.
6. Two Feature Constructor components were added. If you are comfortable with Python, you can use a Python Script component instead.
7. Based on the positive and negative keywords, we can add a new column, predicted, as follows.
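If you prefer to go the Python route, the same idea can be sketched in a few lines of plain Python. This is only a minimal illustration of the steps above, not the exact logic inside the Orange workflow: the function names, the URL pattern, and the tie-breaking rule (returning "neutral" when the counts are equal) are all assumptions made for this example. The keyword lists are the ones mentioned at the start of this post.

```python
import re

# Keyword lists taken from the examples in this post; extend as needed.
NEGATIVE_WORDS = {"pathetic", "worst", "poor"}
POSITIVE_WORDS = {"great", "fabulous", "superb"}

# Assumed URL pattern for this sketch; Orange's own URL filter may differ.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")


def preprocess(text):
    """Lowercase the text and strip URLs, similar to the Preprocess Text step."""
    return URL_PATTERN.sub(" ", text.lower())


def count_keywords(text, keywords):
    """Count how many word tokens in the text belong to the keyword set."""
    tokens = re.findall(r"[a-z']+", text)
    return sum(1 for token in tokens if token in keywords)


def predict_sentiment(review):
    """Hypothetical rule: pick the label whose keyword count is larger."""
    cleaned = preprocess(review)
    positive = count_keywords(cleaned, POSITIVE_WORDS)
    negative = count_keywords(cleaned, NEGATIVE_WORDS)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    # Assumption: ties and reviews with no keywords are labelled neutral.
    return "neutral"


if __name__ == "__main__":
    reviews = [
        "A GREAT film, simply superb!",
        "Pathetic plot and the worst acting.",
        "An average movie.",
    ]
    for review in reviews:
        print(review, "->", predict_sentiment(review))
```

The two keyword counts play the role of the two constructed features, and the returned label corresponds to the predicted column. A rule this simple will miss negation ("not great") and sarcasm, so treat it as a baseline rather than a replacement for a trained classifier.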