Long term prediction

Pony Biam! edited this page May 5, 2020 · 10 revisions

A simple long term predictive LightGBM model can be found in this notebook.

Features

A simple round of feature engineering was performed, guided by the exploratory data analysis (EDA). The EDA of the meter readings showed:

  • Healthcare and Utility usages show the highest meter reading values
  • Steam meter shows the highest meter reading values
  • Monthly behaviour (meter-reading median):
    • Utility usage peaks in April-March
    • Chilledwater meter shows higher values in warm season
    • Steam meter shows lower values in April-October
  • Hourly behaviour (meter-reading median):
    • Higher values from 6 h to 19 h
    • Utility usage shows the opposite tendency
    • Steam meter peaks from 5 h to 8 h
  • Weekday behaviour: readings drop during weekends
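The hourly medians summarised above could be computed along the following lines. This is only a minimal sketch on toy data: the column names (`timestamp`, `meter`, `meter_reading`) and the values are assumptions for illustration, not taken from the notebook.

```python
import pandas as pd

# Toy long-format readings; column names and values are illustrative assumptions.
readings = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2017-01-02 03:00", "2017-01-03 03:00",
        "2017-01-02 12:00", "2017-01-03 12:00",
    ]),
    "meter": ["electricity"] * 4,
    "meter_reading": [10.0, 14.0, 40.0, 50.0],
})

# Median meter reading per meter type and hour of the day.
hour = readings["timestamp"].dt.hour.rename("hour")
hourly_median = readings.groupby(["meter", hour])["meter_reading"].median()
print(hourly_median)
```

Replacing `.dt.hour` with `.dt.month` or `.dt.dayofweek` would give the monthly and weekday views in the same way.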

The following sections list the features that were selected, transformed and created.

Selection

The following features were selected from each data set:

  • Building metadata
    • Building ID*
    • Site ID*
    • Primary space usage
    • Building size (sqft)
  • Weather data
    • Timestamp*
    • Site ID*
    • Air temperature
  • Meter reading data
    • Timestamp*
    • Building ID*
    • Meter
    • Meter reading (target)
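Assuming the starred columns act as join keys (the page does not say so explicitly), the three data sets could be combined roughly as follows. The column names (`building_id`, `site_id`, `primaryspaceusage`, `sqft`, `airTemperature`, `meter_reading`) and the toy values are assumptions for illustration.

```python
import pandas as pd

# Toy stand-ins for the three data sets; names and values are assumptions.
metadata = pd.DataFrame({
    "building_id": ["b1", "b2"],
    "site_id": ["s1", "s2"],
    "primaryspaceusage": ["Healthcare", "Office"],
    "sqft": [50000.0, 12000.0],
})
weather = pd.DataFrame({
    "timestamp": pd.to_datetime(["2017-01-01 00:00", "2017-01-01 00:00"]),
    "site_id": ["s1", "s2"],
    "airTemperature": [3.5, 18.0],
})
readings = pd.DataFrame({
    "timestamp": pd.to_datetime(["2017-01-01 00:00"] * 3),
    "building_id": ["b1", "b1", "b2"],
    "meter": ["electricity", "steam", "electricity"],
    "meter_reading": [120.0, 900.0, 15.0],
})

# Attach building attributes by building, then weather by site and time.
data = (
    readings
    .merge(metadata, on="building_id", how="left")
    .merge(weather, on=["site_id", "timestamp"], how="left")
)
print(data)
```

Left joins keep every meter reading even when metadata or weather is missing for it; whether the notebook does the same is not stated here.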

Transformation

The following features were transformed:

  • primaryspaceusage categories (16) were reduced to healthcare, utility and other
  • meter categories (8) were preserved
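One way to collapse the usage categories to three levels is sketched below. The example labels and the lower-casing step are assumptions; the actual category spellings in the data may differ.

```python
import pandas as pd

# Example usage labels; the real data has 16 categories (spellings assumed).
usage = pd.Series(
    ["Healthcare", "Utility", "Office", "Education"],
    name="primaryspaceusage",
)

kept = {"healthcare", "utility"}
lowered = usage.str.lower()

# Keep the two high-consumption usages, map everything else to "other".
usage_reduced = lowered.where(lowered.isin(kept), "other").astype("category")
print(usage_reduced.tolist())
```

Storing the result as a pandas `category` lets LightGBM treat it as a categorical feature when a DataFrame is passed in.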

Creation

The following features were created:

  • month
  • day of the week
  • hour of the day
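These calendar features can be derived from the timestamp with the pandas `.dt` accessor. A minimal sketch, assuming a datetime column named `timestamp` (the name and the sample dates are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2017-03-04 06:00", "2017-10-09 19:00"]),
})

df["month"] = df["timestamp"].dt.month          # 1 to 12
df["dayofweek"] = df["timestamp"].dt.dayofweek  # Monday=0 ... Sunday=6
df["hour"] = df["timestamp"].dt.hour            # 0 to 23
print(df)
```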

Final features

  • Timestamp*
  • Site ID
  • Building ID
  • Month
  • Hour
  • Day of the week
  • Usage (3 levels: healthcare, utility, other)
  • Building size (sqft)
  • Air temperature
  • Meter (8 levels)
  • Meter reading / target

Parameters

The parameters for this model were not systematically tuned; a few were adjusted manually so that the model performs better than with the default values.

  • "objective": "regression"
  • "metric": "rmse"
  • "random_state": 55
  • "learning_rate": 0.01 (default 0.1)
  • "max_bin": 761 (default 255)
  • "num_leaves": 2197 (default 31)
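As a sketch of how these values might be passed to LightGBM's native training API: the `params` dictionary mirrors the list above, while the `train_model` helper, the number of boosting rounds and the early-stopping setting are hypothetical and not taken from the notebook.

```python
# Parameter values as listed on this page.
params = {
    "objective": "regression",
    "metric": "rmse",
    "random_state": 55,
    "learning_rate": 0.01,  # default 0.1
    "max_bin": 761,         # default 255
    "num_leaves": 2197,     # default 31
}


def train_model(X_train, y_train, X_valid, y_valid, num_boost_round=1000):
    """Hypothetical helper: fit a LightGBM regressor with the params above.

    num_boost_round and the early-stopping patience are assumptions.
    """
    # Imported lazily so the params can be inspected without LightGBM installed.
    import lightgbm as lgb

    train_set = lgb.Dataset(X_train, label=y_train)
    valid_set = lgb.Dataset(X_valid, label=y_valid, reference=train_set)
    return lgb.train(
        params,
        train_set,
        num_boost_round=num_boost_round,
        valid_sets=[valid_set],
        callbacks=[lgb.early_stopping(stopping_rounds=50)],
    )
```

A low learning rate such as 0.01 generally needs many boosting rounds, which is why a validation set with early stopping is a common companion; a large `num_leaves` increases model capacity and with it the risk of overfitting.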

Results
