FinSight: NLP Financial Analysis App

FinSight is a web application that performs Natural Language Processing (NLP) on financial text. It uses a custom fine-tuned spaCy model for Named Entity Recognition (NER) and the VADER library for sentiment analysis. The project demonstrates a full NLP workflow, from data annotation and model training to building a hybrid model (trained + rule-based) and serving it via a Flask API.

Key Features

Custom Named Entity Recognition: Identifies custom entities like STOCK tickers and FIN_EVENT (e.g., "dot-com crash") in addition to standard entities like PERSON and ORG.
Sentiment Analysis: Labels the input text as Positive, Negative, or Neutral using the VADER model.
Interactive UI: A clean, dark-themed web interface for pasting text and viewing results.
Color-Coded Visualization: Displays the analyzed text with entities highlighted in color.
Dynamic Legend: Automatically generates a legend explaining the entity labels found in the text.
Hybrid NLP Pipeline: The final model loads a custom-trained spaCy pipeline and then adds a rule-based EntityRuler on top to create a robust, hybrid system.
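
The hybrid pipeline described in the last item could be assembled roughly as follows. This is a minimal sketch, assuming spaCy v3 and the trained_model_final directory produced by train.py. The example patterns, the build_hybrid_pipeline name, and the choice to place the ruler after the ner component with overwrite_ents enabled are illustrative assumptions, not the project's confirmed configuration.

```python
# Hypothetical rule-based patterns; the project's real patterns may differ.
RULER_PATTERNS = [
    {"label": "STOCK", "pattern": "AAPL"},
    {"label": "STOCK", "pattern": "TSLA"},
    {"label": "FIN_EVENT", "pattern": "dot-com crash"},
]


def build_hybrid_pipeline(model_dir="trained_model_final", patterns=RULER_PATTERNS):
    """Load the trained pipeline and layer an EntityRuler on top of it."""
    # Imported lazily so the pattern data can be inspected without spaCy installed.
    import spacy

    nlp = spacy.load(model_dir)
    # Assumption: the ruler runs after the statistical NER component and may
    # overwrite overlapping predictions, so the rules take precedence.
    ruler = nlp.add_pipe("entity_ruler", after="ner", config={"overwrite_ents": True})
    ruler.add_patterns(patterns)
    return nlp
```

With the model trained, calling build_hybrid_pipeline() and then iterating over doc.ents of a processed text would yield both the learned entities and the rule-based matches.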

How It Works

This diagram shows the flow of data from the user's browser to the Flask backend and back.

```mermaid
graph TD
    subgraph "Browser (Client-Side)"
        A["User pastes raw text & clicks Analyze"] --> B{JavaScript};
        B -- "1. POST Request w/ Text" --> C["/analyze API Endpoint"];
        F["4. JSON Response"] --> G{JavaScript};
        G --> H["Renders HTML, Sentiment & Legend in UI"];
    end

    subgraph "Flask Server (Backend)"
        C -- "Raw Text" --> D["2. Preprocess Text"];
        D -- "Cleaned Text" --> E("3a. spaCy Pipeline");
        D -- "Cleaned Text" --> I("3b. VADER Sentiment");
        E --> J["Generate displacy HTML & Legend Data"];
        I & J --> F;
    end
```
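
The server-side half of this flow might look something like the sketch below. It is not the project's app.py: the JSON field names (text, html, sentiment, legend), the whitespace-only cleaning step, the legend descriptions, and the create_app helper are all assumptions made for illustration. Only the /analyze route, the displacy rendering, and the overall sequence come from the diagram.

```python
import re

# Hypothetical legend descriptions; spacy.explain() covers the built-in labels.
LABEL_DESCRIPTIONS = {
    "STOCK": "Stock ticker",
    "FIN_EVENT": "Financial event",
    "PERSON": "Person",
    "ORG": "Organization",
}


def preprocess(text):
    """Assumed cleaning step: collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def build_legend(labels):
    """Return one legend entry per distinct label, in order of first appearance."""
    distinct = dict.fromkeys(labels)  # keeps order, drops duplicates
    return [
        {"label": label, "description": LABEL_DESCRIPTIONS.get(label, label)}
        for label in distinct
    ]


def create_app(nlp, sentiment_fn):
    """Build a Flask app around a loaded spaCy pipeline and a sentiment function."""
    # Imported lazily so the helpers above work without Flask or spaCy installed.
    from flask import Flask, jsonify, request
    from spacy import displacy

    app = Flask(__name__)

    @app.route("/analyze", methods=["POST"])
    def analyze():
        payload = request.get_json(silent=True) or {}
        text = preprocess(payload.get("text", ""))
        if not text:
            return jsonify({"error": "No text provided"}), 400
        doc = nlp(text)
        return jsonify({
            "html": displacy.render(doc, style="ent", page=False),
            "sentiment": sentiment_fn(text),
            "legend": build_legend(ent.label_ for ent in doc.ents),
        })

    return app
```

Passing the pipeline and the sentiment function into create_app keeps the route free of global state, which makes it easier to swap the sentiment model later.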

Technologies Used

Backend: Python, Flask, spaCy, VADER Sentiment
Frontend: HTML, CSS, JavaScript (with Fetch API)
NLP Concepts: Fine-tuning, Data Annotation, EntityRuler, Dependency Parsing, Catastrophic Forgetting, Overgeneralization.

Setup and Usage

To run this project locally, follow these steps:

Clone the repository:

```
git clone <your-repo-url>
cd finsight-nlp-app
```

Create a virtual environment:

```
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
```

Install dependencies:

```
pip install -r requirements.txt
python -m spacy download en_core_web_md
```

Train the model: The core of this project is the custom-trained model. Run the training script to generate the trained_model_final directory:
```
python train.py
```
Run the Flask application:
```
python app.py
```
Open your browser and navigate to http://127.0.0.1:5000.
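
Once the server is running, the /analyze endpoint can also be exercised without the browser UI. The following is a small sketch using only the Python standard library; the "text" key in the request body is an assumption about what the endpoint expects, and the helper names are hypothetical.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:5000/analyze"


def build_analyze_request(text, url=API_URL):
    """Create a JSON POST request for the /analyze endpoint (not yet sent)."""
    # Assumption: the endpoint reads the input from a "text" field.
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(text):
    """Send the request to the running server and decode the JSON response."""
    with urllib.request.urlopen(build_analyze_request(text)) as response:
        return json.load(response)
```

Calling analyze("Apple shares fell after the earnings call.") with the Flask app running should return the parsed JSON response as a dictionary.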

Project Status & Future Improvements

This project serves as a strong proof-of-concept and a demonstration of a full NLP workflow. The entity recognition model is custom-trained and performs well on the specific financial texts it was trained on.

However, the sentiment analysis component currently uses VADER, a lexicon- and rule-based model that scores individual words with only limited handling of context. As we discovered, it is not powerful enough to capture the deeper context of financial news and can misclassify cautionary articles as "Positive".
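
To make the limitation concrete, here is a sketch of how a VADER compound score is commonly turned into a label. It assumes the vaderSentiment package and the conventional thresholds of +0.05 and -0.05 suggested in VADER's documentation; the project's own thresholds and function names may differ. Because the compound score is a normalized sum of word-level valences, a cautionary article that contains many positively scored words can still land above the positive threshold.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"


def vader_label(text):
    """Score text with VADER and return its label."""
    # Imported lazily so label_from_compound works without the package installed.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    scores = SentimentIntensityAnalyzer().polarity_scores(text)
    return label_from_compound(scores["compound"])
```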

FinSight v2: The Path Forward

The clear next step for this project is to replace VADER with a context-aware Transformer model such as FinBERT. FinBERT is a BERT model further pre-trained on financial text, so it should provide more nuanced and accurate sentiment analysis and better capture the tone of complex financial news.

The current application is a good foundation for this upgrade, as it already handles the text cleaning and sentence extraction that would be required before feeding data to FinBERT.
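
One possible shape for that upgrade is sketched below, assuming the Hugging Face transformers library and the publicly available ProsusAI/finbert checkpoint. Neither is confirmed by this project, and combining per-sentence labels by majority vote is an illustrative choice rather than a recommendation.

```python
from collections import Counter

# Assumption: one publicly available FinBERT checkpoint; others exist.
FINBERT_MODEL = "ProsusAI/finbert"


def overall_label(sentence_labels):
    """Combine per-sentence labels by majority vote; empty input is Neutral."""
    if not sentence_labels:
        return "Neutral"
    label, _count = Counter(sentence_labels).most_common(1)[0]
    return label.capitalize()


def score_sentences(sentences):
    """Classify each sentence with FinBERT and return the predicted labels."""
    # Imported lazily: transformers is a heavy, optional dependency.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=FINBERT_MODEL)
    return [result["label"] for result in classifier(list(sentences))]
```

The existing sentence extraction step would supply the input to score_sentences, and overall_label(score_sentences(sentences)) would take the place of the VADER label in the JSON response.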

LordAizen1/finsight-nlp-app
