Long term prediction
Pony Biam! edited this page May 5, 2020
·
10 revisions
A simple long term predictive LightGBM model can be found in this notebook.
Simple feature engineering was performed, guided by the exploratory data analysis (EDA) of meter readings:
- Healthcare and Utility usages show the highest meter reading values
- Steam meter shows the highest meter reading values
- Monthly behaviour (meter-reading median):
  - Utility usage peaks in April-March
  - Chilledwater meter shows higher values in the warm season
  - Steam meter shows lower values in April-October
- Hourly behaviour (meter-reading median):
  - Higher values from 6 h to 19 h
  - Utility usage shows the opposite tendency
  - Steam meter peaks from 5 h to 8 h
- Weekday behaviour: readings drop during weekends
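The median profiles summarised above can be reproduced with a grouped aggregation. The snippet below is a minimal sketch on toy data, not the notebook's code: the column names (`timestamp`, `meter`, `meter_reading`) and the helper `median_profile` are assumptions made for illustration.

```python
import pandas as pd

# Toy stand-in for the meter readings; column names are assumptions.
readings = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2017-01-02 03:00", "2017-01-02 12:00", "2017-01-07 12:00",
        "2017-07-03 12:00", "2017-07-03 13:00", "2017-07-08 03:00",
    ]),
    "meter": ["steam", "steam", "steam",
              "chilledwater", "chilledwater", "chilledwater"],
    "meter_reading": [10.0, 30.0, 20.0, 50.0, 70.0, 5.0],
})


def median_profile(df, by):
    """Median meter reading per meter type and timestamp attribute.

    `by` is a pandas datetime attribute name: "month", "hour" or "dayofweek".
    """
    values = getattr(df["timestamp"].dt, by)
    return (
        df.assign(**{by: values})
        .groupby(["meter", by])["meter_reading"]
        .median()
    )


monthly = median_profile(readings, "month")
hourly = median_profile(readings, "hour")
weekday = median_profile(readings, "dayofweek")  # Monday=0 ... Sunday=6
```

Grouping by primary space usage instead of meter type would follow the same pattern once the building metadata is joined in.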
The features that were selected, transformed and created are listed below.
The following features were selected from each data set:
- Building metadata
  - Building ID*
  - Site ID*
  - Primary space usage
  - Building size (sqft)
- Weather data
  - Timestamp*
  - Site ID*
  - Air temperature
- Meter reading data
  - Timestamp*
  - Building ID*
  - Meter
  - Meter reading (target)
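One way to combine the three data sets is sketched below, assuming the asterisked fields act as join keys (building ID between readings and metadata, site ID plus timestamp between the result and the weather data). The column names, the toy values and the helper `build_dataset` are hypothetical, and the page does not state how the notebook performs the join.

```python
import pandas as pd

# Toy frames with assumed column names; the values are made up.
buildings = pd.DataFrame({
    "building_id": ["b1", "b2"],
    "site_id": ["s1", "s2"],
    "primary_space_usage": ["Healthcare", "Office"],
    "sqft": [120000.0, 45000.0],
})
weather = pd.DataFrame({
    "timestamp": pd.to_datetime(["2017-01-01 00:00", "2017-01-01 00:00"]),
    "site_id": ["s1", "s2"],
    "air_temperature": [3.5, 18.0],
})
readings = pd.DataFrame({
    "timestamp": pd.to_datetime(["2017-01-01 00:00"] * 3),
    "building_id": ["b1", "b1", "b2"],
    "meter": ["electricity", "steam", "electricity"],
    "meter_reading": [250.0, 900.0, 40.0],
})


def build_dataset(readings, buildings, weather):
    """Attach building metadata, then site weather, to every meter reading."""
    # Left joins keep every reading even if metadata or weather is missing.
    df = readings.merge(buildings, on="building_id", how="left")
    df = df.merge(weather, on=["site_id", "timestamp"], how="left")
    return df


dataset = build_dataset(readings, buildings, weather)
```

Left joins are used here so that no meter reading is dropped. Whether rows with missing weather should be kept or discarded is a separate modelling decision.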
The following features were transformed:
- Primary space usage categories (16) were reduced to healthcare, utility and other
- Meter categories (8) were preserved
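The reduction of the usage categories to three levels could look like the sketch below. The helper `reduce_usage`, the lower-casing of labels and the example category names are assumptions, not taken from the notebook.

```python
import pandas as pd

# Levels kept as-is; every other category is collapsed into "other".
KEPT_USAGES = {"healthcare", "utility"}


def reduce_usage(usage: pd.Series) -> pd.Series:
    """Collapse primary space usage labels to healthcare, utility or other."""
    lowered = usage.str.lower()
    # Keep values that are in KEPT_USAGES, replace the rest with "other".
    return lowered.where(lowered.isin(KEPT_USAGES), "other")


usage = pd.Series(["Healthcare", "Office", "Utility", "Education"])
reduced = reduce_usage(usage)
```

Keeping only the two usages that stood out in the EDA gives a low-cardinality categorical feature, while the meter type retains all of its levels.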
The following features were created:
- month
- day of the week
- hour of the day
The resulting data set contains the following features:
- Timestamp*
- Site ID
- Building ID
- Month
- Hour
- Day of the week
- Usage (3 levels: healthcare, utility, other)
- Building size (sqft)
- Air temperature
- Meter (8 levels)
- Meter reading / target
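The month, day-of-the-week and hour features can be derived directly from the timestamp. The following is a minimal sketch: the column names and the helper `add_time_features` are assumptions chosen for illustration.

```python
import pandas as pd


def add_time_features(df, timestamp_col="timestamp"):
    """Return a copy of df with month, day_of_week and hour columns added."""
    ts = pd.to_datetime(df[timestamp_col])
    return df.assign(
        month=ts.dt.month,            # 1-12
        day_of_week=ts.dt.dayofweek,  # Monday=0 ... Sunday=6
        hour=ts.dt.hour,              # 0-23
    )


sample = pd.DataFrame(
    {"timestamp": ["2017-04-15 07:00:00", "2017-12-31 23:00:00"]}
)
features = add_time_features(sample)
```

These three columns let a tree-based model pick up the seasonal, weekly and daily patterns noted in the EDA without using the raw timestamp as an input.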
The parameters for this model were not systematically tuned; they were adjusted by hand to perform better than the defaults.
- "objective": "regression"
- "metric": "rmse"
- "random_state": 55
- "learning_rate": 0.01 (default 0.1)
- "max_bin": 761 (default 255)
- "num_leaves": 2197 (default 31)
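The parameters listed above can be passed to LightGBM as a plain dictionary. The sketch below shows one possible way to do so with `lightgbm.train`. The `train` helper, its argument names and the `num_boost_round` value are assumptions, since the page does not state how many boosting rounds were used or how the data was split.

```python
# Parameters as listed on this page.
PARAMS = {
    "objective": "regression",
    "metric": "rmse",
    "random_state": 55,
    "learning_rate": 0.01,  # default 0.1
    "max_bin": 761,         # default 255
    "num_leaves": 2197,     # default 31
}


def train(X_train, y_train, X_valid, y_valid, num_boost_round=1000):
    """Fit a LightGBM regressor with PARAMS.

    num_boost_round is a placeholder value, not taken from the source.
    """
    # Imported here so PARAMS can be inspected without LightGBM installed.
    import lightgbm as lgb

    train_set = lgb.Dataset(X_train, label=y_train)
    # The validation set reuses the bin boundaries of the training set.
    valid_set = lgb.Dataset(X_valid, label=y_valid, reference=train_set)
    return lgb.train(
        PARAMS,
        train_set,
        num_boost_round=num_boost_round,
        valid_sets=[valid_set],
    )
```

A learning rate ten times lower than the default generally needs more boosting rounds to converge, so the number of rounds is worth checking against the validation RMSE.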