FastText Probability Values Greater Than 1 in DJL vs Python Implementation #3811

@chxbca

Description

I'm seeing an inconsistency between FastText model predictions in the Python implementation and in the DJL (Deep Java Library) FastText engine. Both return the same labels, but the probability values differ significantly:

  • Labels: Identical between Python and DJL implementations
  • Probability Values: DJL returns values > 1, while Python returns normalized values between 0 and 1

This appears to be a bug in the probability calculation or normalization in the DJL FastText engine.

Expected Behavior

Probability values should lie between 0 and 1, consistent with the Python FastText implementation. When all labels are requested, the probabilities for a given prediction should sum to 1.
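To make the expectation concrete, here is a minimal sketch of the check I have in mind. The function name `check_probabilities` and the tolerances are my own choices, not part of FastText or DJL. The sum check is optional because it only makes sense when every label is requested and the model was trained with a softmax-style loss; that is an assumption about the model, so it is off by default.

```python
import math


def check_probabilities(probs, expect_full_distribution=False, tol=1e-4, sum_tol=1e-3):
    """Return a list of human-readable problems found in `probs`.

    `tol` absorbs small floating-point overshoot (an assumed value, tune as needed).
    `expect_full_distribution` should only be True when `probs` covers every label
    of a softmax-trained model; otherwise the sum is not expected to be 1.
    """
    problems = []
    for i, p in enumerate(probs):
        if math.isnan(p) or p < -tol or p > 1 + tol:
            problems.append(f"index {i}: {p} is outside [0, 1]")
    if expect_full_distribution:
        total = sum(probs)
        if abs(total - 1.0) > sum_tol:
            problems.append(f"sum is {total}, expected 1")
    return problems
```

An empty list means the values look like valid probabilities under these assumptions; any entry points at the offending index or at the sum.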

Error Message

No error is thrown. However, the probability values returned by the DJL FastText engine are greater than 1, which is not valid for a probability.

How to Reproduce?

Steps to reproduce

  1. Load the same FastText model in both Python and DJL FastText engine
  2. Run classification on the same input text
  3. Compare the probability values of the predictions
  4. Observe that DJL returns probability values > 1 while Python returns values between 0 and 1
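The Python side of the steps above can be sketched roughly as follows. `fasttext.load_model` and `model.predict(text, k=...)` are the calls from the `fasttext` Python package; the helper names `predict_python` and `compare_predictions` are hypothetical, and the DJL labels and probabilities are assumed to be copied over by hand from the Java run.

```python
def predict_python(model_path, text, k=5):
    """Run the Python FastText model and return (labels, probabilities)."""
    import fasttext  # imported lazily so the comparison helper works without it

    model = fasttext.load_model(model_path)
    labels, probs = model.predict(text, k=k)
    return list(labels), [float(p) for p in probs]


def compare_predictions(py_labels, py_probs, djl_labels, djl_probs):
    """Compare the two result sets position by position."""
    diffs = [abs(a - b) for a, b in zip(py_probs, djl_probs)]
    return {
        "labels_match": list(py_labels) == list(djl_labels),
        "per_label_abs_diff": dict(zip(py_labels, diffs)),
        "max_abs_diff": max(diffs, default=0.0),
    }
```

A large `max_abs_diff` together with `labels_match` being true corresponds to the behavior described in this issue: same ranking, different scores.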

What have you tried to solve it?

  1. Verified that the same model file is being used in both implementations
  2. Confirmed that input text preprocessing is identical between Python and Java implementations
  3. Checked that both implementations use the same FastText model (.bin) format
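For anyone who wants to repeat the first check, one way to confirm that both sides load a byte-identical model file is to compare checksums. This is only a sketch; `sha256_of` is a hypothetical helper name, and the file is read in chunks because the model is large.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the digest of the file used by the Python process equals the digest of the file used by the Java process, the model file itself can be ruled out as the source of the difference.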

Environment Info

----------- System Properties -----------
java.specification.version: 17
java.vm.vendor: Homebrew
os.name: Mac OS X
os.version: 26.1
java.version: 17.0.17
java.vendor: Homebrew
java.vm.version: 17.0.17+0
os.arch: aarch64

--------- Environment Variables ---------
JAVA_HOME: /opt/homebrew/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home

-------------- Directories --------------
DJL cache directory: /Users/chxbca/.djl.ai

----------------- Engines ---------------
DJL version: 0.35.0-SNAPSHOT
Default Engine: PyTorch:2.7.1
PyTorch Library: /.djl.ai/pytorch/2.7.1-cpu-osx-aarch64

--------------- Hardware --------------
Available processors (cores): 8
Byte Order: LITTLE_ENDIAN

--------------- Project Info --------------
FastText Engine version: 0.35.0
Spring Boot version: 2.7.18
Model file size: 865MB
Java version used at runtime: 17.0.17

Labels: bug (Something isn't working)