Neuraxle refactor #32

Open
wants to merge 32 commits into
base: master
f355cee
Merge pull request #1 from guillaume-chevalier/master
guillaume-chevalier Nov 1, 2019
fc5b7a9
Integrate Neuraxle In LSTM Human Activity Recognition Wip
alexbrillant Nov 1, 2019
c354c5c
Add Fitted Pipeline Saving Call
alexbrillant Nov 2, 2019
4688416
Add Api Serving Demonstration Code
alexbrillant Nov 3, 2019
db28a90
Extract Human Activity Recognition Pipeline Inside a Seprate File
alexbrillant Nov 3, 2019
5c80cb0
Add requirements.txt, and Fix TransformExpectedOutputWrapper, And LST…
alexbrillant Nov 3, 2019
654c433
Refactor Graph Forward Wip
alexbrillant Nov 3, 2019
6c115ef
Add Variable Scope, And Placeholder Names
alexbrillant Nov 3, 2019
44ccad8
Fix Tensorflow Graph Setup Initialization
alexbrillant Nov 3, 2019
4ba8e5a
Setup tensorflow model wrapper once
alexbrillant Nov 3, 2019
53e54de
Use step.sess in tensorflowv1stepsaver
alexbrillant Nov 3, 2019
85d8ced
Use Default Graph In Tf Session, And Use Default Graph In Tensorflow …
alexbrillant Nov 4, 2019
d5e4020
Add Demonstration Notebook For Pipeline Saving, And Loading For Api S…
alexbrillant Nov 4, 2019
84b360b
Add Neuraxle Commit Version To Requirements.txt
alexbrillant Nov 4, 2019
5c6ec1d
Update requirements to avoid numpy version failure, and add example A…
guillaume-chevalier Nov 5, 2019
348f28a
Edit Call Api notebook for gucci
alexbrillant Nov 5, 2019
e288133
Update Demonstration Notebooks
alexbrillant Nov 5, 2019
7b34230
Fix Notebook Demonstration
alexbrillant Nov 5, 2019
b0a0654
Added things to make it work.
guillaume-chevalier Nov 5, 2019
27565e3
Merge branch 'neuraxle-refactor' of github.com:Neuraxio/LSTM-Human-Ac…
guillaume-chevalier Nov 5, 2019
e200823
remove print bug
guillaume-chevalier Nov 5, 2019
16a8856
Improved examples
guillaume-chevalier Nov 5, 2019
67144b9
update README temporarily for clarity of the demo of the prototype.
guillaume-chevalier Nov 5, 2019
2fceeb6
Clean code a bit
guillaume-chevalier Nov 5, 2019
1c74c61
Fix neuraxle.steps.numpy OneHotEncoder imports
alexbrillant Nov 20, 2019
e4d2c99
Rerun notebooks with fixed neuraxle imports
alexbrillant Nov 20, 2019
6856be6
Add tensorflow-gpu in requirements.txt
alexbrillant Nov 20, 2019
d6aa641
Clean Example Using Neuraxle-Tensorflow
alexbrillant Jan 7, 2020
cfdd545
Wip Use Deep Learning Pipeline
alexbrillant Jan 11, 2020
09e8e09
Add Accuracy Metric Plotting With Deep Learning Pipeline
alexbrillant Jan 11, 2020
866619b
Extract plotting function to a file
alexbrillant Jan 11, 2020
cb08c95
Wip update notebook
alexbrillant Jan 12, 2020
1 change: 1 addition & 0 deletions .gitignore
@@ -9,3 +9,4 @@ steps/one_hot_encoder.py
steps/transform_expected_output_only_wrapper.py
venv/**
cache/**
neuraxle_tensorflow/**
24 changes: 24 additions & 0 deletions data_reading.py
@@ -1,3 +1,5 @@
import os

import numpy as np

INPUT_SIGNAL_TYPES = [
@@ -68,3 +70,25 @@ def load_y(y_path):

    # Subtract 1 from each output class for friendly 0-based indexing
    return y_ - 1


def load_data():
    # Load "X" (the neural network's training and testing inputs)

    X_train = load_X(X_train_signals_paths)
    # X_test = load_X(X_test_signals_paths)

    # Load "y" (the neural network's training and testing outputs)

    y_train_path = os.path.join(DATASET_PATH, TRAIN, TRAIN_FILE_NAME)
    # y_test_path = os.path.join(DATASET_PATH, TEST, TEST_FILE_NAME)

    y_train = load_y(y_train_path)
    # y_test = load_y(y_test_path)

    print("Some useful info to get an insight on dataset's shape and normalisation:")
    print("(data_inputs shape, expected_outputs shape, every data input mean, every data input standard deviation)")
    print(X_train.shape, y_train.shape, np.mean(X_train), np.std(X_train))
    print("The dataset is therefore properly normalised, as expected, but not yet one-hot encoded.")

    return X_train, y_train
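
A minimal sketch of the check that load_data prints, run on synthetic data because the dataset files are not assumed to be available here. The helper `describe_dataset` and the random arrays are hypothetical stand-ins; the (n, 1) label shape is an assumption about what load_y returns, and the 128 x 9 series shape is taken from the constants in train_and_save.py.

```python
import numpy as np


def describe_dataset(X, y):
    """Return (X shape, y shape, mean of X, std of X), the values load_data prints."""
    return X.shape, y.shape, float(np.mean(X)), float(np.std(X))


# Hypothetical stand-in for the real signals: 10 series, 128 timesteps, 9 signals.
rng = np.random.default_rng(0)
X_fake = rng.normal(size=(10, 128, 9))

# Labels on disk are 1-based; subtracting 1 gives 0-based class indices.
y_raw = np.array([[1], [6], [3], [2], [5], [4], [1], [2], [3], [6]])
y_fake = y_raw - 1

x_shape, y_shape, mean, std = describe_dataset(X_fake, y_fake)
print(x_shape, y_shape, mean, std)
```

A mean near 0 and a standard deviation near 1 are what the "properly normalised" message relies on; the labels are still integer class indices at this point, not one-hot vectors.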
2 changes: 1 addition & 1 deletion requirements.txt
@@ -1,4 +1,4 @@
 tensorflow==1.15
 tensorflow-gpu==1.15
 conv==0.2
-git+git://github.com/alexbrillant/Neuraxle@one-hot-encoder-step#egg=Neuraxle
+-e git://github.com/alexbrillant/Neuraxle.git@a270fe2b2f73c9350d76fcf4b6f058b764a8c8f7#egg=neuraxle
26 changes: 26 additions & 0 deletions steps/forma_data.py
@@ -0,0 +1,26 @@
import numpy as np
from neuraxle.base import BaseStep, NonFittableMixin
from neuraxle.steps.output_handlers import InputAndOutputTransformerMixin


class FormatData(NonFittableMixin, InputAndOutputTransformerMixin, BaseStep):

Could this class be replaced by the following?

Pipeline([
    ToNumpy(),
    OutputTransformerWrapper(ToNumpy())
])

    def __init__(self, n_classes):
        NonFittableMixin.__init__(self)
        InputAndOutputTransformerMixin.__init__(self)
        BaseStep.__init__(self)
        self.n_classes = n_classes

    def transform(self, data_inputs):
        data_inputs, expected_outputs = data_inputs

        if not isinstance(data_inputs, np.ndarray):
            data_inputs = np.array(data_inputs)

        if expected_outputs is not None:
            if not isinstance(expected_outputs, np.ndarray):
                expected_outputs = np.array(expected_outputs)

            if expected_outputs.shape != (len(data_inputs), self.n_classes):
                expected_outputs = np.reshape(expected_outputs, (len(data_inputs), self.n_classes))

This if statement should not be needed. Use an OutputTransformerWrapper(OneHotEncoder()) instead.
If you also apply the previous comment, you should end up deleting this FormatData class, since its work is already done by other existing classes. We should not need any reshape here whatsoever if the data is fed correctly and the OneHotEncoder works properly.
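
To illustrate why the reshape should be unnecessary, here is a minimal NumPy sketch of what a one-hot encoding step produces. This is not Neuraxle's OneHotEncoder implementation: `one_hot` is a hypothetical helper, and the (n, 1) label shape is an assumption about what load_y returns.

```python
import numpy as np


def one_hot(labels, n_classes):
    """One-hot encode 0-based integer labels into an array of shape (n, n_classes)."""
    flat = np.asarray(labels).reshape(-1)  # accepts labels of shape (n,) or (n, 1)
    return np.eye(n_classes, dtype=np.float32)[flat]


labels = np.array([[0], [5], [2]])  # assumed (n, 1) shape, 0-based classes
encoded = one_hot(labels, n_classes=6)
print(encoded.shape)  # (3, 6)
```

If the encoder already returns (n, n_classes), the shape test in FormatData.transform is never true and the class reduces to the two array conversions.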


        return data_inputs, expected_outputs
171 changes: 171 additions & 0 deletions train_and_save.py
@@ -0,0 +1,171 @@
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from neuraxle.api import DeepLearningPipeline
from neuraxle.base import ExecutionContext, DEFAULT_CACHE_FOLDER
from neuraxle.hyperparams.space import HyperparameterSamples
from neuraxle.pipeline import Pipeline
from neuraxle.steps.numpy import OneHotEncoder
from neuraxle.steps.output_handlers import OutputTransformerWrapper
from sklearn.metrics import accuracy_score

from data_reading import load_data
from neuraxle_tensorflow.tensorflow_v1 import TensorflowV1ModelStep
from steps.forma_data import FormatData


def create_graph(step: TensorflowV1ModelStep):
    # Returns a TensorFlow LSTM (RNN) artificial neural network built from the given parameters.
    # Two LSTM cells are stacked, which adds depth to the neural network.
    # Note: some code of this notebook is inspired by a slightly different
    # RNN architecture used on another dataset; some of the credit goes to
    # "aymericdamien" under the MIT license.
    # (NOTE: This step could be greatly optimised by shaping the dataset once.)
    # input shape: (batch_size, n_steps, n_input)

    # Graph input/output
    data_inputs = tf.placeholder(tf.float32, [None, step.hyperparams['n_steps'], step.hyperparams['n_inputs']],
                                 name='data_inputs')
    expected_outputs = tf.placeholder(tf.float32, [None, step.hyperparams['n_classes']], name='expected_outputs')

    # Graph weights
    weights = {
        'hidden': tf.Variable(
            tf.random_normal([step.hyperparams['n_inputs'], step.hyperparams['n_hidden']])
        ),  # Hidden layer weights
        'out': tf.Variable(
            tf.random_normal([step.hyperparams['n_hidden'], step.hyperparams['n_classes']], mean=1.0)
        )
    }

    biases = {
        'hidden': tf.Variable(
            tf.random_normal([step.hyperparams['n_hidden']])
        ),
        'out': tf.Variable(
            tf.random_normal([step.hyperparams['n_classes']])
        )
    }

    data_inputs = tf.transpose(
        data_inputs,
        [1, 0, 2])  # permute n_steps and batch_size

    # Reshape to prepare input to hidden activation
    data_inputs = tf.reshape(data_inputs, [-1, step.hyperparams['n_inputs']])
    # new shape: (n_steps*batch_size, n_input)

    # ReLU activation, thanks to Yu Zhao for adding this improvement here:
    _X = tf.nn.relu(
        tf.matmul(data_inputs, weights['hidden']) + biases['hidden']
    )

    # Split data because rnn cell needs a list of inputs for the RNN inner loop
    _X = tf.split(_X, step.hyperparams['n_steps'], 0)
    # new shape: n_steps * (batch_size, n_hidden)

    # Define two stacked LSTM cells (two recurrent layers deep) with tensorflow
    lstm_cell_1 = tf.contrib.rnn.BasicLSTMCell(step.hyperparams['n_hidden'], forget_bias=1.0, state_is_tuple=True)
    lstm_cell_2 = tf.contrib.rnn.BasicLSTMCell(step.hyperparams['n_hidden'], forget_bias=1.0, state_is_tuple=True)
    lstm_cells = tf.contrib.rnn.MultiRNNCell([lstm_cell_1, lstm_cell_2], state_is_tuple=True)

    # Get LSTM cell output
    outputs, states = tf.contrib.rnn.static_rnn(lstm_cells, _X, dtype=tf.float32)

    # Get last time step's output feature for a "many-to-one" style classifier,
    # as in the image describing RNNs at the top of this page
    lstm_last_output = outputs[-1]

    # Linear activation
    return tf.matmul(lstm_last_output, weights['out']) + biases['out']


def create_optimizer(step: TensorflowV1ModelStep):
    return tf.train.AdamOptimizer(learning_rate=step.hyperparams['learning_rate'])


def create_loss(step: TensorflowV1ModelStep):
    # Loss, optimizer and evaluation
    # L2 loss prevents this overkill neural network from overfitting the data
    l2 = step.hyperparams['lambda_loss_amount'] * sum(tf.nn.l2_loss(tf_var) for tf_var in tf.trainable_variables())

    # Softmax loss
    return tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(
            labels=step['expected_outputs'],
            logits=step['output']
        )
    ) + l2


def accuracy_score_classification(data_inputs, expected_outputs):
    accuracy = np.mean(np.argmax(data_inputs, axis=1) == np.argmax(expected_outputs, axis=1))
    return accuracy


class HumanActivityRecognitionPipeline(DeepLearningPipeline):
    N_HIDDEN = 32
    N_STEPS = 128
    N_INPUTS = 9
    LAMBDA_LOSS_AMOUNT = 0.0015
    LEARNING_RATE = 0.0025
    N_CLASSES = 6
    BATCH_SIZE = 1500
    EPOCHS = 14

    def __init__(self):
        super().__init__(
            Pipeline([
                OutputTransformerWrapper(OneHotEncoder(nb_columns=self.N_CLASSES, name='one_hot_encoded_label')),
                FormatData(n_classes=self.N_CLASSES),
                TensorflowV1ModelStep(
                    create_graph=create_graph,
                    create_loss=create_loss,
                    create_optimizer=create_optimizer
                ).set_hyperparams(
                    HyperparameterSamples({
                        'n_steps': self.N_STEPS,  # 128 timesteps per series
                        'n_inputs': self.N_INPUTS,  # 9 input parameters per timestep
                        'n_hidden': self.N_HIDDEN,  # Hidden layer num of features
                        'n_classes': self.N_CLASSES,  # Total classes (should go up, or should go down)
                        'learning_rate': self.LEARNING_RATE,
                        'lambda_loss_amount': self.LAMBDA_LOSS_AMOUNT,
                        'batch_size': self.BATCH_SIZE
                    })

Here, let's only consider n_hidden, learning_rate, and lambda_loss_amount as hyperparameters per se. (The others aren't planned to be changed during meta-optimization, for instance.)

We could leave them there for now; however, I would have seen them as something else. This perhaps relates to this issue: Neuraxio/Neuraxle#91

We could also add an n_stacked hyperparam to control how many LSTMs we stack on top of each other (optional feature, not really needed for now).
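
One possible way to picture that separation, as a plain-Python sketch: keep the three tunable values apart from the fixed configuration before handing anything to the step. The names `TUNABLE_KEYS` and `split_hyperparams` are hypothetical and not part of Neuraxle; the values are copied from the class constants in this diff.

```python
# Keys treated as real hyperparameters, per the review comment above.
TUNABLE_KEYS = {'n_hidden', 'learning_rate', 'lambda_loss_amount'}


def split_hyperparams(params):
    """Split a flat dict into (tunable hyperparameters, fixed configuration)."""
    tunable = {k: v for k, v in params.items() if k in TUNABLE_KEYS}
    fixed = {k: v for k, v in params.items() if k not in TUNABLE_KEYS}
    return tunable, fixed


all_params = {
    'n_steps': 128,
    'n_inputs': 9,
    'n_hidden': 32,
    'n_classes': 6,
    'learning_rate': 0.0025,
    'lambda_loss_amount': 0.0015,
    'batch_size': 1500,
}

tunable, fixed = split_hyperparams(all_params)
print(sorted(tunable))
print(sorted(fixed))
```

Only the tunable dict would then be exposed to a hyperparameter search, while the fixed values stay tied to the dataset shape and the training setup.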

                )
            ]),
            validation_size=0.15,
            batch_size=self.BATCH_SIZE,
            batch_metrics={'accuracy': accuracy_score},
            shuffle_in_each_epoch_at_train=True,
            n_epochs=self.EPOCHS,
            epochs_metrics={'accuracy': accuracy_score},
            scoring_function=accuracy_score
        )


def main():
    pipeline = HumanActivityRecognitionPipeline()

    data_inputs, expected_outputs = load_data()
    pipeline, outputs = pipeline.fit_transform(data_inputs, expected_outputs)

    accuracies = pipeline.get_epoch_metric_train('accuracy')
    plt.plot(range(len(accuracies)), accuracies)
    plt.xlabel('epochs')
    plt.ylabel('accuracy')
    plt.title('Training accuracy')

    accuracies = pipeline.get_epoch_metric_validation('accuracy')
    plt.plot(range(len(accuracies)), accuracies)
    plt.xlabel('epochs')
    plt.ylabel('accuracy')
    plt.title('Validation accuracy')
    plt.show()

    pipeline.save(ExecutionContext(DEFAULT_CACHE_FOLDER))
    pipeline.teardown()


if __name__ == '__main__':
    main()
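
The transpose, reshape and split steps in create_graph can be followed with a NumPy sketch that mirrors only the shapes and the ReLU hidden layer (no TensorFlow, no LSTM cells). The batch size of 4 and the random weights are arbitrary choices for illustration; n_steps, n_inputs and n_hidden are taken from the class constants.

```python
import numpy as np

batch_size, n_steps, n_inputs, n_hidden = 4, 128, 9, 32
rng = np.random.default_rng(0)

inputs = rng.normal(size=(batch_size, n_steps, n_inputs))
W = rng.normal(size=(n_inputs, n_hidden))  # stand-in for weights['hidden']
b = rng.normal(size=(n_hidden,))           # stand-in for biases['hidden']

# Permute n_steps and batch_size: (n_steps, batch_size, n_inputs)
x = np.transpose(inputs, [1, 0, 2])

# Flatten so a single matmul applies the hidden layer to every timestep:
# (n_steps * batch_size, n_inputs)
x = np.reshape(x, [-1, n_inputs])

# ReLU hidden layer: (n_steps * batch_size, n_hidden)
hidden = np.maximum(x @ W + b, 0.0)

# Split into a list of n_steps arrays, one per timestep,
# each of shape (batch_size, n_hidden)
steps = np.split(hidden, n_steps, axis=0)
print(len(steps), steps[0].shape)  # 128 (4, 32)
```

Because the transpose comes before the reshape, each element of `steps` holds one timestep for the whole batch, which is the list layout a static RNN loop consumes.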