This project compares several popular techniques for text classification and studies the advantages and limitations of each. The techniques include natural language processing techniques (such as bag-of-words and n-grams), machine learning techniques (such as k-means clustering), and deep learning techniques (such as CNNs). We also try to understand why alternatives to each technique are needed. Ultimately, we aim to classify newspaper articles effectively. Such models could be integrated into an online news portal, reducing the human effort of manually categorizing every article into classes.
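As a rough illustration of the bag-of-words and n-gram representations mentioned above, the sketch below turns a piece of text into unigram and bigram counts using only the Python standard library. It is not taken from this project's code: the function names (`tokenize`, `ngrams`, `bag_of_ngrams`), the simple regex tokenizer, and the sample headline are all assumptions made for the example.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return every contiguous run of n tokens, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_ngrams(text, max_n=2):
    """Count all n-grams of length 1..max_n (hypothetical helper, not project code)."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


# Made-up headline, used only to show the shape of the features.
features = bag_of_ngrams("Stocks rally as markets rally")
print(features["rally"])          # unigram count
print(features["markets rally"])  # bigram count
```

Counts like these can serve as the input features for a classifier. Word order is lost in the unigram counts and only partly recovered by the bigrams, which is one of the limitations that motivates looking at other techniques.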
G Srisha Anagh
Sai Charan Teja Tanguturu
Anmol Singh Sethi
Manavdeep Singh
Nairit Banerjee
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change. Please make sure to update tests as appropriate.