Commit dbdd9a4

add example for categorical feature usage (#76)
1 parent 58e52f9 commit dbdd9a4

2 files changed: +55 -1 lines changed
README.md

Lines changed: 37 additions & 1 deletion
````diff
@@ -164,7 +164,43 @@ train.close();
 test.close();
 ```
 
-## Custom objectives
+### Categorical features
+
+LightGBM supports defining features as categorical. To make this work with LightGBM4j, you need to do the following:
+
+* Set their names with `setFeatureNames` so you can reference them later in options
+* Mark them as `categorical_feature` in booster options.
+
+Given the dataset file in the LibSVM format, where categories are index-encoded:
+
+```
+1 0:7 1:2 2:3 3:20 4:15 5:38 6:29 7:201
+0 0:5 1:15 2:2 3:1859 4:1 5:156 6:164 7:2475
+0 0:2 1:12 2:6 3:648 4:13 5:29 6:38 7:201
+1 0:10 1:26 2:5 3:1235 4:14 5:82 6:205 7:931
+0 0:6 1:18 2:1 3:737 4:12 5:224 6:162 7:2176
+0 0:4 1:12 3:1845 4:18 5:83 6:49 7:1491
+0 0:3 2:3 3:1652 4:20 5:2 6:180 7:332
+0 0:3 1:21 2:3 3:2010 4:16 5:216 6:69 7:911
+0 0:3 1:3 3:1555 4:1 5:84 6:81 7:1192
+0 0:8 1:2 2:6 3:1008 4:16 5:216 6:228 7:130
+```
+
+You can load and use them in the following way:
+
+```java
+LGBMDataset ds = LGBMDataset.createFromFile("./src/test/resources/categorical.data", "", null);
+ds.setFeatureNames(new String[]{"f0", "f1", "f2", "f3", "f4", "f5", "f6", "f7"});
+String params = "objective=binary label=name:Classification categorical_feature=f0,f1,f2,f3,f4,f5,f6,f7";
+LGBMBooster booster = LGBMBooster.create(ds, params);
+for (int i=0; i<10; i++) {
+    booster.updateOneIter();
+    double[] eval1 = booster.getEval(0);
+    System.out.println("train " + eval1[0]);
+}
+```
+
+### Custom objectives
 
 LightGBM4j supports using custom objective functions, but it doesn't provide any high-level wrappers as python API does.
 
````

Lines changed: 18 additions & 0 deletions

```diff
@@ -0,0 +1,18 @@
+package io.github.metarank.lightgbm4j;
+
+import org.junit.jupiter.api.Test;
+
+public class CategoricalIntegrationTest {
+    @Test
+    void testCategorical() throws LGBMException {
+        LGBMDataset ds = LGBMDataset.createFromFile("./src/test/resources/categorical.data", "", null);
+        ds.setFeatureNames(new String[]{"f0", "f1", "f2", "f3", "f4", "f5", "f6", "f7"});
+        String params = "objective=binary label=name:Classification categorical_feature=f0,f1,f2,f3,f4,f5,f6,f7";
+        LGBMBooster booster = LGBMBooster.create(ds, params);
+        for (int i=0; i<10; i++) {
+            booster.updateOneIter();
+            double[] eval1 = booster.getEval(0);
+            System.out.println("train " + eval1[0]);
+        }
+    }
+}
```
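
The README text added in this commit assumes the categories in the LibSVM file are already index-encoded, but it does not show how to get there from raw values. A minimal sketch of one way to do that in plain Java follows. `CategoryIndexer` and `toLibSvmLine` are hypothetical names used for illustration only; they are not part of LightGBM4j or of this commit. The sketch assigns each distinct string value in a column the next free non-negative integer, in order of first appearance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of LightGBM4j: maps raw category values
// of one column to non-negative integer indices in order of first appearance.
class CategoryIndexer {
    private final Map<String, Integer> indices = new HashMap<>();

    // Returns the existing index for a value, or assigns the next free one.
    int indexOf(String value) {
        Integer existing = indices.get(value);
        if (existing != null) {
            return existing;
        }
        int next = indices.size();
        indices.put(value, next);
        return next;
    }

    // Formats one row as "<label> 0:<idx> 1:<idx> ...", using one indexer per column.
    static String toLibSvmLine(int label, String[] row, CategoryIndexer[] perColumn) {
        StringBuilder sb = new StringBuilder().append(label);
        for (int col = 0; col < row.length; col++) {
            sb.append(' ').append(col).append(':').append(perColumn[col].indexOf(row[col]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two categorical columns, e.g. colour and size (made-up example values).
        CategoryIndexer[] indexers = { new CategoryIndexer(), new CategoryIndexer() };
        System.out.println(toLibSvmLine(1, new String[]{"red", "small"}, indexers));
        System.out.println(toLibSvmLine(0, new String[]{"blue", "small"}, indexers));
        System.out.println(toLibSvmLine(0, new String[]{"red", "large"}, indexers));
    }
}
```

The column indices are zero-based to match the sample data file in the README hunk. The same indexers would need to be reused at prediction time so that a given value always maps to the same integer.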
