Jupyter notebook

Shared Google doc here
- For notebook functionality:
  - Jupyter notebook (`pip install jupyter`)
  - IPython-SQL (`pip install ipython-sql`)
- lxml
Bin selection will eventually be done automatically (in some more or less sophisticated manner). In the meantime, to use Indicator operators that use bins, such as `LengthBin(NodeSet, bin_divs)`, you can follow a simple, rough procedure:
First, generate the features table, making sure to include full-path features for the lengths of interest. For example, for sequence and dependency tree path lengths, you would need to include:

    Indicator(Between(Mention(0), Mention(1)), 'word')
    Indicator(SeqBetween(), 'word')

(These are currently implemented as `get_relation_binning_features`.) Then, you can use code such as:
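
    import numpy as np
    import matplotlib.pyplot as plt

    # Run in a notebook with ipython-sql loaded; binding the result to res_seq is assumed.
    res_seq = %sql SELECT * FROM genepheno_features WHERE feature LIKE '%SEQ%'
    seq_lens = [len(rs.feature.split('_')) for rs in res_seq]
    n, bins, patches = plt.hist(seq_lens, 50, density=True, facecolor='green', alpha=0.75)  # density=True replaces the older normed=1
    print([np.percentile(seq_lens, p) for p in [25, 50, 75]])

See treedlib.ipynb for an example implementation.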
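
As a rough sketch of what might come next, the quartiles of the observed lengths can be turned into candidate bin divisions, and each length assigned to a bin. The helpers `quartile_divs` and `bin_index` are hypothetical and not part of treedlib, and how `LengthBin` actually treats a length that falls exactly on a division is an assumption here. The sketch uses only the standard library (Python 3.8+); `statistics.quantiles` with `method="inclusive"` interpolates linearly between data points, as `np.percentile` does by default.

```python
from bisect import bisect_right
from statistics import quantiles


def quartile_divs(lengths):
    """Candidate bin divisions: the 25th/50th/75th percentiles, rounded and de-duplicated."""
    cuts = quantiles(lengths, n=4, method="inclusive")
    return sorted({int(round(c)) for c in cuts})


def bin_index(length, bin_divs):
    """Bin number for a length; a length equal to a division goes to the
    higher bin (an assumed convention, not taken from treedlib)."""
    return bisect_right(bin_divs, length)


# Made-up path lengths, standing in for seq_lens computed from the features table
seq_lens = [2, 3, 3, 4, 5, 6, 8, 9, 12]
bin_divs = quartile_divs(seq_lens)
bins = [bin_index(n, bin_divs) for n in seq_lens]
print(bin_divs)
print(bins)
```

The resulting list could then be passed as `bin_divs` to an operator such as `LengthBin(NodeSet, bin_divs)`, subject to the boundary convention noted above.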