ODFVs raise a PerformanceWarning for very large sets of features #2293

@roy651

Expected Behavior

Feature views should support hundreds of features without warnings, regardless of whether an on-demand feature view (ODFV) is part of the configuration.

Current Behavior

When defining feature views with a large total number of features (roughly 100 in our case; the number of views does not seem to matter), pandas raises a PerformanceWarning about DataFrame fragmentation. See output:

/Users/user/anaconda3/envs/feast_wo/lib/python3.10/site-packages/feast/on_demand_feature_view.py:216: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df[f"{feature.name}"] = pd.Series(dtype=dtype)

Steps to reproduce

Create a few FeatureViews with a total of roughly 100 Features (I believe the number of views is not the contributing factor, only the total feature count) and add a simple transformation in an on_demand_feature_view definition. Run `feast apply` and observe the warning.
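
The pandas side of this can probably be reproduced without Feast. The sketch below is not Feast code: it only imitates the column-by-column assignment shown on the last line of the warning output, using made-up feature names and a hypothetical helper, `count_fragmentation_warnings`. Whether the warning actually fires depends on the pandas version and on the number of columns, so no particular count is assumed.

```python
import warnings

import pandas as pd


def count_fragmentation_warnings(n_columns):
    """Add n_columns empty columns one at a time and count PerformanceWarnings.

    Hypothetical helper that mimics `df[name] = pd.Series(dtype=dtype)`;
    the feature names are invented for illustration.
    """
    df = pd.DataFrame()
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, not just the first
        for i in range(n_columns):
            df[f"feature_{i}"] = pd.Series(dtype="float64")
    n_warnings = sum(
        issubclass(w.category, pd.errors.PerformanceWarning) for w in caught
    )
    return df, n_warnings


frame, n_warnings = count_fragmentation_warnings(150)
print(f"{frame.shape[1]} columns, {n_warnings} PerformanceWarning(s)")
```

With only a handful of columns the same helper should report zero warnings, which matches the observation that the total feature count is what matters.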

Specifications

  • Version: 0.17.0
  • Platform: macOS 12.1
  • Subsystem: Python 3.10, pandas 1.3.5

Possible Solution

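As noted in the warning message above:
"Consider joining all columns at once using pd.concat(axis=1) instead."
In other words, build all the feature columns first and join them in a single call instead of assigning them to the DataFrame one at a time.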
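
A minimal sketch of what that could look like, assuming the goal is an empty, typed DataFrame with one column per feature. The `feature_dtypes` mapping and the `build_empty_frame` helper are hypothetical stand-ins, not Feast's actual schema objects or API; only `pd.Series`, `pd.concat`, and `pd.errors.PerformanceWarning` are real pandas names.

```python
import warnings

import pandas as pd

# Hypothetical schema: feature name -> pandas dtype (invented for illustration).
feature_dtypes = {f"feature_{i}": "float64" for i in range(150)}
feature_dtypes["entity_id"] = "int64"


def build_empty_frame(dtypes):
    """Create every empty column first, then join them with one pd.concat call."""
    columns = {name: pd.Series(dtype=dtype) for name, dtype in dtypes.items()}
    # With a dict input, the keys become the column names.
    return pd.concat(columns, axis=1)


with warnings.catch_warnings():
    # Turn any PerformanceWarning into an error so a regression would be visible.
    warnings.simplefilter("error", pd.errors.PerformanceWarning)
    empty_frame = build_empty_frame(feature_dtypes)

print(empty_frame.shape)
```

Because the columns are joined in a single call, nothing is inserted into an existing frame, so the fragmentation check should not be triggered no matter how many features are defined.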
