miprometheus.models¶
Model¶
-
class
miprometheus.models.
Model
(params, problem_default_values_={})[source]¶ Class representing base class for all Models.
Inherits from
torch.nn.Module
as all subclasses will represent a trainable model. Hence, all subclasses should override the
forward
function. Implements features & attributes used by all subclasses.
-
__init__
(params, problem_default_values_={})[source]¶ Initializes a Model object.
Parameters: - params (
miprometheus.utils.ParamInterface
) – Parameters read from configuration file. - problem_default_values (dict) – dict of parameters values coming from the problem class. One example of such parameter value is the size of the vocabulary set in a translation problem.
This constructor:
stores a pointer to
params
:
>>> self.params = params
sets a default problem name:
>>> self.name = 'Model'
initializes the logger.
>>> self.logger = logging.Logger(self.name)
tries to parse the values coming from
problem_default_values_
:
>>> try:
>>>     for key in problem_default_values_.keys():
>>>         self.params.add_custom_params({key: problem_default_values_[key]})
>>> except BaseException:
>>>     self.logger.info('No parameter value was parsed from problem_default_values_')
initializes the data definitions:
Note
This dict contains information about the expected inputs and produced outputs of the current model class.
This object will be used during handshaking between the model and the problem class to ensure that the model can accept the batches produced by the problem and that the problem can accept the predictions of the model to compute the loss and accuracy.
This dict should be defined using self.params.
This dict should at least contain the targets field:
>>> self.data_definitions = {'inputs': {'size': [-1, -1], 'type': [torch.Tensor]},
>>>                          'targets': {'size': [-1, 1], 'type': [torch.Tensor]}
>>>                          }
sets the access to
AppState
: for dtype, visualization flag, etc.
>>> self.app_state = AppState()
initializes the best model loss (used to select which model to save) to
np.inf
:
>>> self.best_loss = np.inf
-
handshake_definitions
(problem_data_definitions_)[source]¶ Proceeds to the handshake between what the Problem class provides (through a
DataDict
) and what the model expects as inputs.
Note
Handshaking is defined here as making sure that the
Model
and the Problem
agree on the data that they exchange. More specifically, the Model
has a definition of the input data that it expects (through its self.data_definitions
attribute). The Problem
has the same object describing what it generates.
This function performs the handshake by:
Verifying that all keys existing in
Model.data_definitions
also exist in Problem.data_definitions
. If a key is missing, an exception is thrown.
This function does not verify the key
targets
, as this is done by problems.problem.Problem.handshake_definitions
.
If all keys are present, then this function checks that, for each (
Model.data_definitions
) key, the shape and type of the corresponding value match what is indicated for the corresponding key in Problem.data_definitions
. If not, an exception is thrown.
If both steps above pass, then the Model accepts what the Problem generates and can proceed to the forward pass.
To properly define the
data_definitions
dicts, here are some examples:
>>> data_definitions = {'img': {'size': [-1, 320, 480, 3], 'type': [np.ndarray]},
>>>                     'question': {'size': [-1, -1], 'type': [torch.Tensor]},
>>>                     'question_length': {'size': [-1], 'type': [list, int]},
>>>                     # ...
>>>                     }
Please indicate both the size and the type as
lists
:
- Indicate all dimensions in the correct order for each key size field. If a dimension is unimportant or unknown (e.g. the batch size or variable-length sequences), then please indicate
-1
at the correct location.
- If an object is a composition of several Python objects (
list
, dict
, …), then please include all object types, matching the dimensions order: e.g. [list, dict]
.
Parameters: problem_data_definitions (dict) – Contains the definition of a sample generated by the Problem
class.
Returns: True if the Model
accepts what the Problem
generates, otherwise throws an exception.
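The two checks described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: the function name check_handshake and the choice of KeyError / ValueError are hypothetical, type names are written as strings to keep the example dependency-free, and -1 is compared literally rather than treated as a wildcard.

```python
def check_handshake(model_defs, problem_defs):
    """Hypothetical sketch of the handshake between a Model and a Problem.

    Both arguments are dicts shaped like ``data_definitions``:
    {key: {'size': [...], 'type': [...]}}.
    """
    for key, expected in model_defs.items():
        # The 'targets' key is left to the Problem side of the handshake.
        if key == 'targets':
            continue
        # Step 1: every key the model expects must be produced by the problem.
        if key not in problem_defs:
            raise KeyError("The problem does not provide the key '{}'".format(key))
        provided = problem_defs[key]
        # Step 2: sizes and types must match (-1 is compared literally here).
        if expected['size'] != provided['size']:
            raise ValueError("Size mismatch for key '{}'".format(key))
        if expected['type'] != provided['type']:
            raise ValueError("Type mismatch for key '{}'".format(key))
    return True


model_defs = {
    'question': {'size': [-1, -1], 'type': ['torch.Tensor']},
    'targets': {'size': [-1, 1], 'type': ['torch.Tensor']},
}
problem_defs = {
    'question': {'size': [-1, -1], 'type': ['torch.Tensor']},
    'question_length': {'size': [-1], 'type': ['list', 'int']},
    'targets': {'size': [-1], 'type': ['torch.Tensor']},
}
accepted = check_handshake(model_defs, problem_defs)
```

Extra keys on the problem side (such as question_length here) are ignored, since only the keys the model declares are checked.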
-
add_statistics
(stat_col)[source]¶ Adds statistics to
StatisticsCollector
.
Note
Empty - To be redefined in inheriting classes.
Parameters: stat_col – StatisticsCollector
.
-
collect_statistics
(stat_col, data_dict, logits)[source]¶ Base statistics collection.
Note
Empty - To be redefined in inheriting classes. The user has to ensure that the corresponding entry in the
StatisticsCollector
has been created with self.add_statistics()
beforehand.
Parameters: - stat_col –
StatisticsCollector
. - data_dict (DataDict) –
DataDict
containing inputs and targets. - logits – Predictions being output of the model.
-
add_aggregators
(stat_agg)[source]¶ Adds statistical aggregators to miprometheus.utils.StatisticsAggregator.
Note
Empty - To be redefined in inheriting classes.
Parameters: stat_agg – miprometheus.utils.StatisticsAggregator.
-
aggregate_statistics
(stat_col, stat_agg)[source]¶ Aggregates the statistics collected by miprometheus.utils.StatisticsCollector and adds the results to miprometheus.utils.StatisticsAggregator.
Note
Empty - To be redefined in inheriting classes. The user has to ensure that the corresponding entry in the
StatisticsAggregator
has been created with self.add_aggregators()
beforehand. Given that the StatisticsAggregator
uses the statistics collected by the StatisticsCollector
, the user should also ensure that these statistics are correctly collected (i.e. use of self.add_statistics
and self.collect_statistics
).
Parameters: - stat_col – miprometheus.utils.StatisticsCollector
- stat_agg – miprometheus.utils.StatisticsAggregator
-
plot
(data_dict, predictions, sample=0)[source]¶ Plots inputs, targets and predictions, along with model-dependent variables.
Note
Abstract - to be defined in derived classes.
Parameters: - data_dict (
DataDict
) – DataDict
containing input and target batches. - predictions (
torch.tensor
) – Prediction. - sample (int) – Number of sample in batch (default: 0)
-
save
(model_dir, training_status, training_stats, validation_stats)[source]¶ Generic method saving the model parameters to file. It can be overloaded if one needs more control.
Parameters: - model_dir (str) – Directory where the model will be saved.
- training_status (str) – String representing the current status of training.
- training_stats (
miprometheus.utils.StatisticsCollector
or miprometheus.utils.StatisticsAggregator
) – Training statistics that will be saved to checkpoint along with the model. - validation_stats (
miprometheus.utils.StatisticsCollector
or miprometheus.utils.StatisticsAggregator
) – Validation statistics that will be saved to checkpoint along with the model.
Returns: True if this is currently the best model (until the current episode, considering the loss).
-
load
(checkpoint_file)[source]¶ Loads a model from the specified checkpoint file.
Parameters: checkpoint_file – File containing dictionary with model state and statistics.
-
summarize
()[source]¶ Summarizes the model by showing the trainable/non-trainable parameters and weights per layer (
nn.Module
).
Uses
recursive_summarize
to iterate through the nested structure of the model (e.g. for RNNs).
Returns: Summary as a str.
-
SequentialModel¶
-
class
miprometheus.models.
SequentialModel
(params, problem_default_values_={})[source]¶ Class representing base class for all Sequential Models.
Inherits from models.model.Model as most features are the same.
Should be derived by all sequential models.
-
__init__
(params, problem_default_values_={})[source]¶ Mostly calls the base
models.model.Model
constructor.
Specifies a better structure for
self.data_definitions
.
Parameters: - params – Parameters read from configuration
.yaml
file. - problem_default_values (dict) – dict of parameters values coming from the problem class. One example of such parameter value is the size of the vocabulary set in a translation problem.
-
plot
(data_dict, predictions, sample=0)[source]¶ Creates a default interactive visualization, with a slider for moving back and forth along the time axis (iteration over the sequence elements in a given episode). The default visualization contains the input, output and target sequences.
For a more model- or problem-dependent visualization, please override this method in the derived model class.
Parameters: - data_dict –
DataDict containing
- input sequences: [BATCH_SIZE x SEQUENCE_LENGTH x INPUT_SIZE],
- target sequences: [BATCH_SIZE x SEQUENCE_LENGTH x OUTPUT_SIZE]
- predictions (torch.tensor) – Predicted sequences [BATCH_SIZE x SEQUENCE_LENGTH x OUTPUT_SIZE]
- sample (int) – Number of sample in batch (default: 0)
-
ModelFactory¶
-
class
miprometheus.models.
ModelFactory
[source]¶ ModelFactory: Class instantiating the specified model class using the passed params.
-
static
build
(params, problem_default_values_={})[source]¶ Static method returning a particular model, depending on the name provided in the list of parameters.
Parameters: params ( utils.param_interface.ParamInterface
) – Parameters used to instantiate the model class.
Note
params should contain the exact (case-sensitive) class name of the model to instantiate.
Parameters: problem_default_values (dict) – Default (hardcoded) values coming from a Problem class. Can be used to pass values such as a number of classes, an embedding dimension etc. Returns: Instance of a given model.
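As an illustration of the case-sensitive lookup described in the note, here is a minimal sketch of a name-based factory. The registry MODEL_REGISTRY, the stand-in class ToyModel and the 'name' key read from params are all hypothetical; the actual ModelFactory may locate classes differently.

```python
class ToyModel:
    """Stand-in for a model class; not part of the library."""

    def __init__(self, params, problem_default_values_=None):
        self.params = params
        self.defaults = problem_default_values_ or {}


# Hypothetical registry mapping exact class names to classes.
MODEL_REGISTRY = {'ToyModel': ToyModel}


def build(params, problem_default_values_=None):
    """Return an instance of the class named by params['name'] (case-sensitive)."""
    name = params['name']  # assumed key holding the class name
    if name not in MODEL_REGISTRY:
        raise KeyError("Unknown model class name: '{}'".format(name))
    return MODEL_REGISTRY[name](params, problem_default_values_)


model = build({'name': 'ToyModel'}, {'num_classes': 10})
```

In this sketch, asking for 'toymodel' raises a KeyError, which mirrors the requirement that the name match the class name exactly.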
Visual Question Answering baselines¶
CNN + LSTM¶
-
class
miprometheus.models.vqa_baselines.cnn_lstm.
CNN_LSTM
(params, problem_default_values_={})[source]¶ Implementation of a simple VQA baseline, globally following these steps:
- Image Encoding, using a CNN model,
- Question Encoding (if specified) using a LSTM,
- Concatenates the two feature vectors and passes them through an MLP to produce the predictions.
Warning
The CNN model used in this implementation is the one from the Relational Network model (implementation in models.relational_net.conv_input_model.py), constituted of 4 convolutional layers (with batch normalization).
Although the cited paper above mentions GoogLeNet & VGG as other CNN models, they are not supported for now. Support for
torchvision
models is planned for a future release.
This implementation has only been tested on
SortOfCLEVR
for now.
-
__init__
(params, problem_default_values_={})[source]¶ Constructor of the
CNN_LSTM
model.
Parses the parameters and instantiates the LSTM & CNN models, along with the MLP classifier.
Parameters: - params (utils.ParamInterface) – dict of parameters (read from configuration
.yaml
file). - problem_default_values (dict) – default values coming from the
Problem
class.
-
forward
(data_dict)[source]¶ Runs the
CNN_LSTM
model.
Parameters: data_dict (utils.DataDict) – DataDict({‘images’, ‘questions’, …}) where:
- images: [batch_size, num_channels, height, width],
- questions: [batch_size, size_question_encoding]
Returns: Predictions: [batch_size, output_classes]
-
plot
(data_dict, predictions, sample=0)[source]¶ Displays the image, the predicted & ground truth answers.
Parameters: - data_dict (utils.DataDict) –
DataDict({‘images’, ‘questions’, ‘targets’}) where:
- images: [batch_size, num_channels, height, width]
- questions: [batch_size, size_question_encoding]
- targets: [batch_size]
- predictions (torch.tensor) – Prediction.
- sample (int) – Index of sample in batch (DEFAULT: 0).
Stacked Attention Networks¶
-
class
miprometheus.models.vqa_baselines.stacked_attention_networks.
StackedAttentionNetwork
(params, problem_default_values_)[source]¶ Implementation of a Stacked Attention Networks (SAN).
The three major components of SAN are:
- the image model (CNN model, possibly pretrained),
- the question model (LSTM based),
- the stacked attention model.
Warning
This implementation has only been tested on
SortOfCLEVR
so far.
-
__init__
(params, problem_default_values_)[source]¶ Constructor of the
StackedAttentionNetwork
model.
- Parses the parameters,
- Instantiates the CNN model: A simple, 4-layers one, or a pretrained one,
- Instantiates an LSTM for the questions encoding,
- Instantiates a 3-layers MLP as classifier.
Parameters: - params (utils.ParamInterface) – dict of parameters (read from configuration
.yaml
file). - problem_default_values (dict) – default values coming from the
Problem
class.
-
forward
(data_dict)[source]¶ Runs the
StackedAttentionNetwork
model.
Parameters: data_dict (utils.DataDict) – DataDict({‘images’, ‘questions’, …}) where:
- images: [batch_size, num_channels, height, width],
- questions: [batch_size, size_question_encoding]
Returns: Predictions: [batch_size, output_classes]
-
plot
(data_dict, predictions, sample=0)[source]¶ Displays the image, the predicted & ground truth answers.
Parameters: - data_dict (utils.DataDict) –
DataDict({‘images’, ‘questions’, ‘targets’}) where:
- images: [batch_size, num_channels, height, width],
- questions: [batch_size, size_question_encoding]
- targets: [batch_size]
- predictions (torch.tensor) – Prediction.
- sample (int) – Index of sample in batch (DEFAULT: 0).
-
class
miprometheus.models.vqa_baselines.stacked_attention_networks.
StackedAttentionLayer
(question_image_encoding_size, key_query_size, num_att_layers=2)[source]¶ Stacks several layers of
Attention
to enable multi-step reasoning.
-
__init__
(question_image_encoding_size, key_query_size, num_att_layers=2)[source]¶ Constructor of the
StackedAttentionLayer
class.
Parameters:
-
forward
(encoded_image, encoded_question)[source]¶ Apply stacked attention.
Parameters: - encoded_image (torch.tensor) – output of the image encoding (CNN + FC layer), should be of shape [batch_size, width * height, num_channels_encoded_image]
- encoded_question (torch.tensor) – Last hidden layer of the LSTM, of shape [batch_size, question_encoding_size]
Returns: u: attention [batch_size, num_channels_encoded_image]
-
-
class
miprometheus.models.vqa_baselines.stacked_attention_networks.
AttentionLayer
(question_image_encoding_size, key_query_size=512)[source]¶ Implements one layer of the Stacked Attention mechanism.
Reference: Section 3.3 of the paper cited above.
-
__init__
(question_image_encoding_size, key_query_size=512)[source]¶ Constructor of the
AttentionLayer
class.
Parameters:
-
forward
(encoded_image, encoded_question)[source]¶ Applies one layer of stacked attention over the image & question.
Parameters: - encoded_image (torch.tensor) – output of the image encoding (CNN + FC layer), should be of shape [batch_size, width * height, num_channels_encoded_image]
- encoded_question (torch.tensor) – Last hidden layer of the LSTM, of shape [batch_size, question_encoding_size]
Returns: - “Refined query vector” (weighted sum of the image vectors, combined with the question vector), of shape [batch_size, num_channels_encoded_image]
- Attention weights, todo: shape?
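The computation of one attention hop can be sketched for a single sample as follows. This is a simplified illustration, not the layer's implementation: the function name attention_hop is hypothetical, the learned projections of the real layer are replaced by identity maps, and the region score is simply the sum over channels of tanh(region + question).

```python
import math


def attention_hop(image_regions, question):
    """One simplified attention hop for a single sample.

    image_regions: list of region vectors (each a list of floats).
    question: query vector with the same length as a region vector.
    """
    # Score each region: sum over channels of tanh(region + question).
    scores = [sum(math.tanh(r + q) for r, q in zip(region, question))
              for region in image_regions]
    # Softmax over regions (shifted by the max for numerical stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the region vectors.
    dim = len(question)
    attended = [sum(w * region[d] for w, region in zip(weights, image_regions))
                for d in range(dim)]
    # Refined query: attended image vector combined with the question vector.
    refined = [a + q for a, q in zip(attended, question)]
    return refined, weights


regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.5, -0.5]
refined_query, attention_weights = attention_hop(regions, query)
```

In this sketch the attention weights hold one scalar per image region, and stacking several hops would amount to feeding refined_query back in as the next query.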
-
-
class
miprometheus.models.vqa_baselines.stacked_attention_networks.
PretrainedImageEncoding
(cnn_model='resnet18', num_layers=2)[source]¶ Wrapper class over a
torchvision.models
model to produce feature maps for the SAN model.
-
__init__
(cnn_model='resnet18', num_layers=2)[source]¶ Constructor of the
PretrainedImageEncoding
class.
Parameters: cnn_model (str) – select which pretrained model to load.
Warning
This class has only been tested with the
resnet18
model.
Parameters: num_layers (int) – Number of layers to select from the cnn_model
.
-
-
class
miprometheus.models.vqa_baselines.stacked_attention_networks.
MultiHopsStackedAttentionNetwork
(params, problem_default_values_)[source]¶ Implementation of a Stacked Attention Networks (SAN), with several attention hops over the question words.
The implementation details are very similar to those of the StackedAttentionNetwork, with the difference that it uses an LSTMCell instead of an LSTM.
Warning
This implementation has only been tested on
ShapeColorQuery
so far.
-
__init__
(params, problem_default_values_)[source]¶ Constructor of the
MultiHopsStackedAttentionNetwork
model.
- Parses the parameters,
- Instantiates the CNN model: A simple, 4-layers one, or a pretrained one,
- Instantiates an LSTMCell for the questions encoding,
- Instantiates a 3-layers MLP as classifier.
Parameters: - params (utils.ParamInterface) – dict of parameters (read from configuration
.yaml
file). - problem_default_values (dict) – default values coming from the
Problem
class.
Initialize the hidden and cell states of the LSTM to 0.
Parameters: batch_size (int) – Size of the batch. Returns: hx, cx: hidden and cell states initialized to 0.
-
forward
(data_dict)[source]¶ Runs the
MultiHopsStackedAttentionNetwork
model.
Parameters: data_dict (utils.DataDict) – DataDict({‘images’, ‘questions’, …}) where:
- images: [batch_size, num_channels, height, width],
- questions: [batch_size, size_question_encoding]
Returns: Predictions: [batch_size, output_classes]
-
plot
(data_dict, predictions, sample=0)[source]¶ Displays the image, the predicted & ground truth answers.
Parameters: - data_dict (utils.DataDict) –
DataDict({‘images’, ‘questions’, ‘targets’}) where:
- images: [batch_size, num_channels, height, width],
- questions: [batch_size, size_question_encoding]
- targets: [batch_size]
- predictions (torch.tensor) – Prediction.
- sample (int) – Index of sample in batch (DEFAULT: 0).
-
MAC¶
-
class
miprometheus.models.mac.
ControlUnit
(dim, max_step)[source]¶ Implementation of the
ControlUnit
of the MAC network.
-
forward
(step, contextual_words, question_encoding, ctrl_state)[source]¶ Forward pass of the
ControlUnit
.
Parameters: - step (int) – index of the current MAC cell.
- contextual_words (torch.tensor) – tensor of shape [batch_size x maxQuestionLength x dim] containing the words encodings (‘representation of each word in the context of the question’).
- question_encoding (torch.tensor) – question representation, of shape [batch_size x 2*dim].
- ctrl_state (torch.tensor) – previous control state, of shape [batch_size x dim]
Returns: new control state, [batch_size x dim]
-
-
class
miprometheus.models.mac.
ImageProcessing
(dim)[source]¶ Image encoding using a 2-layers CNN assuming the images have been already preprocessed by ResNet101.
-
__init__
(dim)[source]¶ Constructor for the 2-layers CNN.
Parameters: dim (int) – global ‘d’ hidden dimension
-
forward
(feature_maps)[source]¶ Apply the constructed CNN model on the feature maps (coming from ResNet101).
Parameters: feature_maps (torch.tensor) – [batch_size x nb_kernels x feat_H x feat_W] coming from ResNet101. Should have [nb_kernels x feat_H x feat_W] = [1024 x 14 x 14].
Returns: feature maps, shape [batch_size, dim, new_height, new_width]
-
-
class
miprometheus.models.mac.
InputUnit
(dim, embedded_dim)[source]¶ Implementation of the
InputUnit
of the MAC network.
-
forward
(questions, questions_len, feature_maps)[source]¶ Forward pass of the
InputUnit
.
Parameters: - questions (torch.tensor) – tensor of the questions words, shape [batch_size x maxQuestionLength x embedded_dim].
- questions_len (list) – Unpadded questions length.
- feature_maps (torch.tensor) – [batch_size x nb_kernels x feat_H x feat_W] coming from ResNet101.
Returns: - question encodings: [batch_size x 2*dim] (torch.tensor),
- word encodings: [batch_size x maxQuestionLength x dim] (torch.tensor),
- images_encodings: [batch_size x nb_kernels x (H*W)] (torch.tensor).
-
-
class
miprometheus.models.mac.
MACUnit
(dim, max_step=12, self_attention=False, memory_gate=False, dropout=0.15)[source]¶ Implementation of the
MACUnit
(iteration over the MAC cell) of the MAC network.
-
__init__
(dim, max_step=12, self_attention=False, memory_gate=False, dropout=0.15)[source]¶ Constructor for the
MACUnit
, which represents the recurrence over the MACCell.
Parameters: - dim (int) – global ‘d’ hidden dimension.
- max_step (int) – maximal number of MAC cells. Default: 12
- self_attention (bool) – whether or not to use self-attention in the
WriteUnit
. Default: False
. - memory_gate (bool) – whether or not to use memory gating in the
WriteUnit
. Default: False
. - dropout (float) – dropout probability for the variational dropout mask. Default: 0.15
-
get_dropout_mask
(x, dropout)[source]¶ Create a dropout mask to be applied on x.
Parameters: - x (torch.tensor) – tensor of arbitrary shape to apply the mask on.
- dropout (float) – dropout rate.
Returns: mask.
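A dropout mask of this kind can be sketched in plain Python as below. The function name make_dropout_mask, the nested-list representation and the rescaling by 1 / (1 - dropout) are assumptions made for illustration; the method itself works on tensors and may differ in detail. Variational dropout is commonly understood as sampling one mask and reusing it at every recurrent step, which is why the mask is built separately from its application.

```python
import random


def make_dropout_mask(shape, dropout, rng=random):
    """Return a nested-list mask for a 2-D shape (rows, cols).

    Each entry is 0 with probability `dropout`, otherwise 1 / (1 - dropout),
    so that the expected value of a masked activation is unchanged.
    """
    if not 0.0 <= dropout < 1.0:
        raise ValueError("dropout must be in [0, 1)")
    keep = 1.0 - dropout
    rows, cols = shape
    return [[(1.0 / keep) if rng.random() < keep else 0.0 for _ in range(cols)]
            for _ in range(rows)]


# Seeded generator so the sketch is reproducible.
mask = make_dropout_mask((4, 8), dropout=0.15, rng=random.Random(0))

# Apply the same mask to a batch of activations (element-wise product).
activations = [[1.0] * 8 for _ in range(4)]
masked = [[a * m for a, m in zip(a_row, m_row)]
          for a_row, m_row in zip(activations, mask)]
```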
-
forward
(context, question, knowledge, kb_proj)[source]¶ Forward pass of the
MACUnit
, which represents the recurrence over the MACCell.
Parameters: - context (torch.tensor) – contextual words, shape [batch_size x maxQuestionLength x dim]
- question (torch.tensor) – questions encodings, shape [batch_size x 2*dim]
- knowledge (torch.tensor) – knowledge_base (feature maps extracted by a CNN), shape [batch_size x nb_kernels x (feat_H * feat_W)].
Returns: list of the memory states.
-
-
class
miprometheus.models.mac.
MACNetwork
(params, problem_default_values_={})[source]¶ Implementation of the entire
MAC
network.
-
__init__
(params, problem_default_values_={})[source]¶ Constructor for the
MAC
network.
Parameters: - params (utils.ParamInterface) – dict of parameters (read from configuration
.yaml
file). - problem_default_values (dict) – default values coming from the
Problem
class.
-
forward
(data_dict, dropout=0.15)[source]¶ Forward pass of the
MAC
network. Calls first the InputUnit
, then the recurrent MAC cells and finally the OutputUnit
.
Parameters: - data_dict (utils.DataDict) – input data batch.
- dropout (float) – dropout rate.
Returns: Predictions of the model.
-
static
generate_figure_layout
()[source]¶ Generate a figure layout for the attention visualization (done in
MACNetwork.plot()
)
Returns: figure layout.
-
plot
(data_dict, logits, sample=0)[source]¶ Visualize the attention weights (
ControlUnit
& ReadUnit
) on the question & feature maps. Dynamic visualization throughout the reasoning steps is possible.
Parameters: - data_dict (utils.DataDict) – DataDict({‘images’, ‘questions’, ‘questions_length’, ‘questions_string’, ‘questions_type’, ‘targets’, ‘targets_string’, ‘index’, ‘imgfiles’, ‘prediction_string’})
- logits (torch.tensor) – Prediction of the model.
- sample (int) – Index of sample in batch (Default: 0)
Returns: True when the user closes the window, False if we do not need to visualize.
-
-
class
miprometheus.models.mac.
OutputUnit
(dim, nb_classes)[source]¶ Implementation of the
OutputUnit
of the MAC network.
-
forward
(mem_state, question_encodings)[source]¶ Forward pass of the
OutputUnit
.
Parameters: - mem_state (torch.tensor) – final memory state, shape [batch_size x dim]
- question_encodings (torch.tensor) – questions encodings, shape [batch_size x (2*dim)]
Returns: probability distribution over the classes, [batch_size x nb_classes]
-
-
class
miprometheus.models.mac.
ReadUnit
(dim)[source]¶ Implementation of the
ReadUnit
of the MAC network.
-
__init__
(dim)[source]¶ Constructor for the
ReadUnit
.
Parameters: dim (int) – global ‘d’ hidden dimension
-
forward
(memory_states, knowledge_base, ctrl_states, kb_proj)[source]¶ Forward pass of the
ReadUnit
. Assumes 1 scalar attention weight per knowledge base element.
Parameters: - memory_states (torch.tensor) – list of all previous memory states, each of shape [batch_size x mem_dim]
- knowledge_base (torch.tensor) – image representation (output of CNN), shape [batch_size x nb_kernels x (feat_H * feat_W)]
- ctrl_states (list) – All previous control states, each of shape [batch_size x ctrl_dim].
Returns: current read vector, shape [batch_size x read_dim]
-
-
miprometheus.models.mac.
linear
(input_dim, output_dim, bias=True)[source]¶ Defines a Linear layer. Specifies Xavier as the initialization type of the weights, to respect the original implementation: https://github.com/stanfordnlp/mac-network/blob/master/ops.py#L20
Parameters: - input_dim (int) – input dimension
- output_dim (int) – output dimension
- bias (bool) – If set to True, the layer will learn an additive bias initially set to true (as original implementation https://github.com/stanfordnlp/mac-network/blob/master/ops.py#L40)
Returns: Initialized Linear layer
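To make the initialization concrete, here is a dependency-free sketch of a Xavier-initialized linear layer. It assumes the uniform variant of Xavier initialization (weights drawn from [-b, b] with b = sqrt(6 / (input_dim + output_dim))) and a zero initial bias; both choices, as well as the helper names xavier_uniform_linear and apply_linear, are assumptions of this example rather than a description of the function above.

```python
import math
import random


def xavier_uniform_linear(input_dim, output_dim, bias=True, rng=random):
    """Return (weights, bias) for a linear layer with Xavier-uniform weights.

    weights is an output_dim x input_dim nested list; bias is a list of
    zeros (an assumption of this sketch) or None when bias=False.
    """
    bound = math.sqrt(6.0 / (input_dim + output_dim))
    weights = [[rng.uniform(-bound, bound) for _ in range(input_dim)]
               for _ in range(output_dim)]
    return weights, ([0.0] * output_dim if bias else None)


def apply_linear(weights, bias, x):
    """Compute y = W x + b for a single input vector x."""
    y = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    if bias is not None:
        y = [yi + bi for yi, bi in zip(y, bias)]
    return y


W, b = xavier_uniform_linear(4, 3, rng=random.Random(0))
y = apply_linear(W, b, [1.0, 2.0, 3.0, 4.0])
```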
-
class
miprometheus.models.mac.
WriteUnit
(dim, self_attention=False, memory_gate=False)[source]¶ Implementation of the
WriteUnit
of the MAC network.
-
__init__
(dim, self_attention=False, memory_gate=False)[source]¶ Constructor for the
WriteUnit
.
Parameters:
-
Simplified MAC¶
-
class
miprometheus.models.s_mac.
ControlUnit
(dim, max_step)[source]¶ Implementation of the
ControlUnit
for the S-MAC
model.
Note
This implementation is part of a simplified version of the MAC network, where modifications regarding the different units have been done to reduce the number of linear layers (and thus number of parameters).
This is part of a submission to the ViGIL workshop for NIPS 2018. Feel free to use this model and refer to it with the following BibTex:
@article{marois2018transfer, title={On transfer learning using a MAC model variant}, author={Marois, Vincent and Jayram, TS and Albouy, Vincent and Kornuta, Tomasz and Bouhadjar, Younes and Ozcan, Ahmet S}, journal={arXiv preprint arXiv:1811.06529}, year={2018} }
-
__init__
(dim, max_step)[source]¶ Constructor for the
ControlUnit
.
Parameters:
-
forward
(step, contextual_words, question_encoding, ctrl_state)[source]¶ Forward pass of the
ControlUnit
for the S-MAC
network.
Parameters: - step (int) – index of the current MAC cell.
- contextual_words (
torch.Tensor
) – tensor of shape [batch_size x maxQuestionLength x dim] containing the words encodings (“representation of each word in the context of the question”). - question_encoding (
torch.Tensor
) – question representation, of shape [batch_size x 2*dim]. - ctrl_state (
torch.Tensor
) – previous control state, of shape [batch_size x dim]
Returns: new control state, [batch_size x dim] (
torch.Tensor
)
-
-
class
miprometheus.models.s_mac.
MACUnit
(dim, max_step=12, dropout=0.15)[source]¶ Implementation of the
MACUnit
(iteration over the MAC cell) of the S-MAC
network.
Note
This implementation is part of a simplified version of the MAC network, where modifications regarding the different units have been done to reduce the number of linear layers (and thus number of parameters).
The implementation being simplified, we are not using the optional self-attention & memory-gating in the
WriteUnit
.
This is part of a submission to the ViGIL workshop for NIPS 2018. Feel free to use this model and refer to it with the following BibTex:
@article{marois2018transfer, title={On transfer learning using a MAC model variant}, author={Marois, Vincent and Jayram, TS and Albouy, Vincent and Kornuta, Tomasz and Bouhadjar, Younes and Ozcan, Ahmet S}, journal={arXiv preprint arXiv:1811.06529}, year={2018} }
-
__init__
(dim, max_step=12, dropout=0.15)[source]¶ Constructor for the
MACUnit
, which represents the recurrence over the MACCell for the S-MAC
network.
Parameters:
-
static
get_dropout_mask
(x, dropout)[source]¶ Create a dropout mask to be applied on x.
Parameters: - x (
torch.Tensor
) – tensor of arbitrary shape to apply the mask on. - dropout (float) – dropout rate.
Returns: mask (
torch.Tensor
)
-
forward
(context, question, kb_proj)[source]¶ Forward pass of the
MACUnit
, which represents the recurrence over the MACCell for the S-MAC
network.
Parameters: - context (
torch.Tensor
) – contextual words, shape [batch_size x maxQuestionLength x dim] - question (
torch.Tensor
) – questions encodings, shape [batch_size x 2*dim] - kb_proj (
torch.Tensor
) – Linear projection of the knowledge_base (feature maps extracted by a CNN), shape [batch_size x dim x (feat_H * feat_W)].
Returns: Last memory state (
torch.Tensor
)
-
-
class
miprometheus.models.s_mac.
sMacNetwork
(params, problem_default_values_={})[source]¶ Implementation of the entire
S-MAC
model.
Note
This implementation is a simplified version of the MAC network, where modifications regarding the different units have been done to reduce the number of linear layers (and thus number of parameters).
This is part of a submission to the ViGIL workshop for NIPS 2018. Feel free to use this model and refer to it with the following BibTex:
@article{marois2018transfer, title={On transfer learning using a MAC model variant}, author={Marois, Vincent and Jayram, TS and Albouy, Vincent and Kornuta, Tomasz and Bouhadjar, Younes and Ozcan, Ahmet S}, journal={arXiv preprint arXiv:1811.06529}, year={2018} }
-
__init__
(params, problem_default_values_={})[source]¶ Constructor for the
S-MAC
network.Parameters: - params (
miprometheus.utils.ParamInterface
) – dict of parameters (read from configuration .yaml
file). - problem_default_values (dict) – default values coming from the
Problem
class.
-
forward
(data_dict, dropout=0.15)[source]¶ Forward pass of the
S-MAC
network.
Calls first the
InputUnit
, then the recurrent S-MAC cells and finally the OutputUnit
.
Parameters: - data_dict (
miprometheus.utils.DataDict
) – input data batch. - dropout (float) – dropout rate.
Returns: Predictions of the model.
-
static
generate_figure_layout
()[source]¶ Generate a figure layout for the attention visualization (done in
sMacNetwork.plot()
)
Returns: matplotlib.figure.Figure
layout.
-
plot
(data_dict, logits, sample=0)[source]¶ Visualize the attention weights (
ControlUnit
& ReadUnit
) on the question & feature maps.
Dynamic visualization throughout the reasoning steps is possible.
Parameters: - data_dict (
miprometheus.utils.DataDict
) – DataDict({‘questions_string’, ‘questions_type’, ‘targets_string’,’imgfiles’, ‘prediction_string’, ‘clevr_dir’,**
}) - logits (
torch.Tensor
) – Prediction of the model. - sample (int) – Index of sample in batch (Default: 0)
Returns: True when the user closes the window, False if we do not need to visualize.
-
-
class
miprometheus.models.s_mac.
ReadUnit
(dim)[source]¶ Implementation of the
ReadUnit
for the S-MAC
model.
Note
This implementation is part of a simplified version of the MAC network, where modifications regarding the different units have been done to reduce the number of linear layers (and thus number of parameters).
This is part of a submission to the ViGIL workshop for NIPS 2018. Feel free to use this model and refer to it with the following BibTex:
@article{marois2018transfer, title={On transfer learning using a MAC model variant}, author={Marois, Vincent and Jayram, TS and Albouy, Vincent and Kornuta, Tomasz and Bouhadjar, Younes and Ozcan, Ahmet S}, journal={arXiv preprint arXiv:1811.06529}, year={2018} }
-
__init__
(dim)[source]¶ Constructor for the
ReadUnit
of the S-MAC
model.
Parameters: dim (int) – global ‘d’ hidden dimension.
-
forward
(memory_state, ctrl_state, kb_proj)[source]¶ Forward pass of the
ReadUnit
. Assumes 1 scalar attention weight per knowledge base element.
Parameters: - memory_state (
torch.Tensor
) – Memory state, shape [batch_size x mem_dim]. - ctrl_state (
torch.Tensor
) – Control state, shape [batch_size x ctrl_dim]. - kb_proj (
torch.Tensor
) – Linear projection of the image representation (output of CNN), shape [batch_size x dim x (feat_H * feat_W)].
Returns: current read vector, shape [batch_size x read_dim] (
torch.Tensor
)
-
-
class
miprometheus.models.s_mac.
WriteUnit
(dim)[source]¶ Implementation of the
WriteUnit
for the S-MAC
model.
Note
This implementation is part of a simplified version of the MAC network, where modifications regarding the different units have been done to reduce the number of linear layers (and thus number of parameters).
This is part of a submission to the ViGIL workshop for NIPS 2018. Feel free to use this model and refer to it with the following BibTex:
@article{marois2018transfer, title={On transfer learning using a MAC model variant}, author={Marois, Vincent and Jayram, TS and Albouy, Vincent and Kornuta, Tomasz and Bouhadjar, Younes and Ozcan, Ahmet S}, journal={arXiv preprint arXiv:1811.06529}, year={2018} }
-
__init__
(dim)[source]¶ Constructor for the
WriteUnit
of the S-MAC
model.
Parameters: dim (int) – global ‘d’ hidden dimension.
-
forward
(read_vector)[source]¶ Forward pass of the
WriteUnit
for the S-MAC
model.
Parameters: read_vector (torch.Tensor) – current read vector (output of the ReadUnit), shape [batch_size x dim].
Returns: current memory state, shape [batch_size x mem_dim] (torch.Tensor).
-
Relational Networks¶
-
class
miprometheus.models.relational_net.
ConvInputModel
[source]¶ Simple 4-layer CNN for image encoding in the
RelationalNetwork
model.
-
__init__
()[source]¶ Constructor.
Defines the 4 convolutional layers and batch normalization layers.
This implementation is inspired by the description in the section ‘Supplementary Material - CLEVR from pixels’ of the reference paper (https://arxiv.org/pdf/1706.01427.pdf).
-
-
class
miprometheus.models.relational_net.
PairwiseRelationNetwork
(input_size)[source]¶ Implementation of the g_theta MLP used in the Relational Network model.
As a reminder, the role of g_theta is to infer the ways in which 2 regions of the CNN feature maps are related, or whether they are related at all.
-
class
miprometheus.models.relational_net.
SumOfPairsAnalysisNetwork
(output_size)[source]¶ Implementation of the f_phi MLP used in the Relational Network model.
As a reminder, the role of f_phi is to produce the probability distribution over all possible answers.
-
class
miprometheus.models.relational_net.
RelationalNetwork
(params, problem_default_values_={})[source]¶ Implementation of the Relational Network (RN) model.
Questions are processed with an LSTM to produce a question embedding, and images are processed with a CNN to produce a set of objects for the RN. ‘Objects’ are constructed using feature-map vectors from the convolved image. The RN considers relations across all pairs of objects, conditioned on the question embedding, and integrates all these relations to answer the question.
Reference paper: https://arxiv.org/abs/1706.01427.
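The pairing and aggregation described above can be sketched in a few lines. This is a minimal NumPy illustration, not the module's actual code: `make_mlp` is a hypothetical single-layer stand-in for the g_theta and f_phi MLPs, and all dimensions are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(in_dim, out_dim, relu=True):
    """Hypothetical single-layer stand-in for the g_theta / f_phi MLPs."""
    weights = rng.standard_normal((in_dim, out_dim)) * 0.1
    def apply(x):
        y = x @ weights
        return np.maximum(y, 0.0) if relu else y
    return apply

def relational_forward(objects, question, g_theta, f_phi):
    """objects: [batch, n_objects, obj_dim]; question: [batch, q_dim]."""
    batch, n, obj_dim = objects.shape
    q_dim = question.shape[-1]
    # Build every ordered pair (o_i, o_j) and append the question embedding.
    o_i = np.broadcast_to(objects[:, :, None, :], (batch, n, n, obj_dim))
    o_j = np.broadcast_to(objects[:, None, :, :], (batch, n, n, obj_dim))
    q = np.broadcast_to(question[:, None, None, :], (batch, n, n, q_dim))
    pairs = np.concatenate([o_i, o_j, q], axis=-1)
    relations = g_theta(pairs)              # [batch, n, n, hidden]
    pooled = relations.sum(axis=(1, 2))     # integrate over all pairs
    return f_phi(pooled)                    # [batch, nb_classes]

objects = rng.standard_normal((2, 9, 5))    # e.g. a 3 x 3 feature map with 5 channels
question = rng.standard_normal((2, 4))
g_theta = make_mlp(5 + 5 + 4, 8)
f_phi = make_mlp(8, 3, relu=False)
logits = relational_forward(objects, question, g_theta, f_phi)
```

Because the relations are summed over all pairs, the result does not depend on the order in which the objects are listed.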
The CNN model used for the image encoding is located in
conv_input_model.py
. The MLPs (g_theta & f_phi) are in
functions.py
.
-
__init__
(params, problem_default_values_={})[source]¶ Constructor.
Instantiates the CNN model (4 layers) and the two multilayer perceptrons (g_theta & f_phi).
Parameters:
- params – dictionary of parameters (read from the .yaml configuration file).
- problem_default_values_ (dict) – default values coming from the Problem class.
-
build_coord_tensor
(batch_size, d)[source]¶ Create the tensor containing the spatial relative coordinate of each region (1 pixel) in the feature maps of the
ConvInputModel
. These relative spatial coordinates are used to ‘tag’ the regions.
Returns: tensor of shape [batch_size x d x d x 2]
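A minimal NumPy sketch of such a coordinate tensor follows. The scaling of the coordinates to [-1, 1] and the (x, y) ordering of the last axis are assumptions; the documentation above only fixes the output shape.

```python
import numpy as np

def build_coord_tensor(batch_size, d):
    """Return an array of shape [batch_size, d, d, 2] holding the (x, y)
    position of every cell of a d x d feature map."""
    # Assumption: coordinates are scaled to [-1, 1].
    axis = np.linspace(-1.0, 1.0, d)
    # With indexing="ij", ys varies along rows and xs along columns.
    ys, xs = np.meshgrid(axis, axis, indexing="ij")
    coords = np.stack([xs, ys], axis=-1)                    # [d, d, 2]
    return np.broadcast_to(coords, (batch_size, d, d, 2)).copy()

coords = build_coord_tensor(batch_size=4, d=3)
```

Each region's coordinate pair can then be concatenated to its feature vector before the pairs are formed.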
-
forward
(data_dict)[source]¶ Runs the
RelationalNetwork
model.
Parameters: data_dict (utils.DataDict) – DataDict({‘images’, ‘questions’, …}) containing:
- images [batch_size, num_channels, height, width],
- questions [batch_size, question_size]
Returns: Predictions of the model [batch_size, nb_classes]
-
Image Classification models¶
-
class
miprometheus.models.vision.
AlexnetWrapper
(params, problem_default_values_={})[source]¶ Wrapper class for the AlexNet model from torchvision.
-
__init__
(params, problem_default_values_={})[source]¶ Constructor for the AlexNet wrapper. Simply instantiates the AlexNet model from
torchvision.models.
Note
The model expects input images normalized as follows: mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224. The images have to be loaded into the range [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].
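The normalization described in this note can be sketched as follows. NumPy is used here to keep the example dependency-light, and the `uint8` input format is an assumption about how the images are loaded.

```python
import numpy as np

# Per-channel statistics quoted in the note above, shaped to broadcast over H and W.
MEAN = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
STD = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)

def normalize_batch(images_uint8):
    """images_uint8: [batch, 3, H, W] with integer values in 0..255 (assumed)."""
    scaled = images_uint8.astype(np.float64) / 255.0    # load into [0, 1]
    return (scaled - MEAN) / STD                        # per-channel normalization

batch = np.full((2, 3, 224, 224), 255, dtype=np.uint8)
normalized = normalize_batch(batch)
```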
Parameters:
- params – dictionary of parameters (read from the .yaml configuration file).
- problem_default_values_ (dict) – default values coming from the Problem class.
-
forward
(data_dict)[source]¶ Main forward pass of the Alexnet wrapper.
Parameters: data_dict – DataDict({‘images’,**}), where:
- images: [batch_size, num_channels, width, height],
Returns: Predictions [batch_size, num_classes]
-
plot
(data_dict, predictions, sample_number=0)[source]¶ Simple plot - shows the
Problem
’s images with the target & actual predicted class.
Parameters:
- data_dict (utils.DataDict) – DataDict({‘images’, ‘targets’, ‘targets_label’}).
- predictions (torch.tensor) – Predictions of the AlexnetWrapper.
- sample_number (int) – Index of the sample in the batch (DEFAULT: 0).
-
-
class
miprometheus.models.vision.
LeNet5
(params_, problem_default_values_)[source]¶ A classical LeNet-5 model for MNIST digits classification.
-
class
miprometheus.models.vision.
SimpleConvNet
(params, problem_default_values_={})[source]¶ A simple 2-layer CNN designed for the
MNIST
& CIFAR10
datasets. The parameters are not hardcoded, so users can adjust them for their application and see their impact on the model’s behavior.
-
__init__
(params, problem_default_values_={})[source]¶ Constructor of the
SimpleConvNet
. The overall structure of this CNN is as follows:
Conv1 -> MaxPool1 -> ReLU -> Conv2 -> MaxPool2 -> ReLU (-> flatten) -> Linear1 -> Linear2 -> Linear3
The parameters that the user can change are:
- For Conv1 & Conv2: number of output channels, kernel size, stride and padding.
- For MaxPool1 & MaxPool2: Kernel size
- For Linear3: The number of classes is read from
problem_default_values_
. The number of output nodes for Linear1 is set to 120, and Linear2 is fixed to 120 -> 84 for now. Linear3 is 84 -> nb_classes.
Note
We are using the default values of dilation, groups & bias for nn.Conv2d.
Similarly for the stride, padding, dilation, return_indices & ceil_mode of nn.MaxPool2d.
The size of the images (width, height, number of channels) is read from problem_default_values_. It is also possible that the images are padded (with 0s) by the Problem class. The padding values (e.g. [2,2,2,2]) should be indicated in problem_default_values_, so that we can adjust the width & height.
Note
The images will be upscaled to [224, 224] (the input size of AlexNet, which allows for comparison) if problem_default_values_['up_scaling'] is True.
Parameters:
- params (utils.ParamInterface) – dict of parameters (read from the .yaml configuration file).
- problem_default_values_ (dict) – default values coming from the Problem class.
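Since the convolution and pooling parameters are adjustable, the input size of Linear1 has to be derived from them. The sketch below shows the usual size arithmetic with default dilation; the function names and the LeNet-like example values are hypothetical and are not read from any actual configuration file.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer (dilation = 1)."""
    return (size + 2 * padding - kernel) // stride + 1

def linear1_in_features(width, height, convs, pool_kernels):
    """Number of features reaching Linear1 after the two Conv -> MaxPool stages."""
    channels = None
    for conv, pool in zip(convs, pool_kernels):
        width = out_size(width, conv["kernel"], conv["stride"], conv["padding"])
        height = out_size(height, conv["kernel"], conv["stride"], conv["padding"])
        # MaxPool with its default stride (equal to the kernel size) and no padding.
        width = out_size(width, pool, stride=pool)
        height = out_size(height, pool, stride=pool)
        channels = conv["out_channels"]
    return channels * width * height

# Hypothetical settings for the two convolutions.
convs = [
    {"out_channels": 6, "kernel": 5, "stride": 1, "padding": 0},
    {"out_channels": 16, "kernel": 5, "stride": 1, "padding": 0},
]
# A 28 x 28 MNIST image padded by [2, 2, 2, 2] becomes 32 x 32.
features = linear1_in_features(28 + 2 + 2, 28 + 2 + 2, convs, pool_kernels=[2, 2])
```

With these values the spatial size goes 32 -> 28 -> 14 -> 10 -> 5, so Linear1 would receive 16 * 5 * 5 features.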
-
forward
(data_dict)[source]¶ Forward pass of the
SimpleConvNet
model.
Parameters: data_dict – DataDict({‘images’,’targets’, ‘targets_label’}), where:
- images: [batch_size, num_channels, width, height],
- targets [batch_size]
Returns: Predictions [batch_size, num_classes]
-
plot
(data_dict, predictions, sample_number=0)[source]¶ Simple plot - shows the
Problem
’s images with the target & actual predicted class.
Parameters:
- data_dict (utils.DataDict) – DataDict({‘images’, ‘targets’, ‘targets_label’}).
- predictions (torch.tensor) – Predictions of the SimpleConvNet.
- sample_number (int) – Index of the sample in the batch (DEFAULT: 0).
-
Controllers for MANNs models¶
-
class
miprometheus.models.controllers.
ControllerFactory
[source]¶ Class returning concrete controller depending on the name provided in the list of parameters.
-
static
build
(params)[source]¶ Static method returning particular controller, depending on the name provided in the list of parameters.
Parameters: params ( utils.param_interface.ParamInterface
) – Parameters used to instantiate the controller.
Note
params should contain the exact (case-sensitive) class name of the controller to instantiate.
Returns: Instance of a given controller.
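A name-based factory of this kind can be sketched with a registry keyed by class name. The stub classes, the plain `dict` used for `params`, and the `"name"` key are all assumptions made for illustration; the real `ParamInterface` may expose the name differently.

```python
class FeedforwardController:
    """Stub standing in for a real controller class."""
    def __init__(self, params):
        self.params = params

class GRUController:
    """Stub standing in for a real controller class."""
    def __init__(self, params):
        self.params = params

# Registry keyed by the exact (case-sensitive) class name.
_CONTROLLERS = {cls.__name__: cls for cls in (FeedforwardController, GRUController)}

def build(params):
    """Return an instance of the controller named in params['name'] (assumed key)."""
    name = params.get("name")
    if name not in _CONTROLLERS:
        raise ValueError(f"Unknown controller name: {name!r}")
    return _CONTROLLERS[name](params)

controller = build({"name": "GRUController"})
```

Because the lookup is an exact string match, `"grucontroller"` would be rejected rather than silently mapped to `GRUController`.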
-
static
-
class
miprometheus.models.controllers.
FeedforwardController
(params)[source]¶ A wrapper class for a feedforward controller.
-
class
miprometheus.models.controllers.
FFGRUStateTuple
[source]¶ Tuple used by GRU cells for storing current/past state information.
-
class
miprometheus.models.controllers.
FFGRUController
(params)[source]¶ A wrapper class for a feedforward controller with a GRU cell.
-
init_state
(batch_size)[source]¶ Returns ‘zero’ (initial) state tuple.
Parameters: batch_size – Size of the batch in the given iteration/epoch.
Returns: Initial state tuple - object of GRUStateTuple class.
-
forward
(x, prev_state_tuple)[source]¶ Controller forward function.
Parameters: - x – a Tensor of input data of size [BATCH_SIZE x INPUT_SIZE] (generally the read data and input word concatenated)
- prev_state_tuple – Tuple of the previous hidden and cell state
Returns: outputs a Tensor of size [BATCH_SIZE x OUTPUT_SIZE] and a GRU state tuple.
-
-
class
miprometheus.models.controllers.
GRUStateTuple
[source]¶ Tuple used by GRU Cells for storing current/past state information.
-
class
miprometheus.models.controllers.
GRUController
(params)[source]¶ A wrapper class for a GRU cell-based controller.
-
init_state
(batch_size)[source]¶ Returns ‘zero’ (initial) state tuple.
Parameters: batch_size – Size of the batch in the given iteration/epoch.
Returns: Initial state tuple - object of GRUStateTuple class.
-
forward
(x, prev_state_tuple)[source]¶ Controller forward function.
Parameters: - x – a Tensor of input data of size [BATCH_SIZE x INPUT_SIZE] (generally the read data and input word concatenated)
- prev_state_tuple – Tuple of the previous hidden and cell state
Returns: outputs a Tensor of size [BATCH_SIZE x OUTPUT_SIZE] and a GRU state tuple.
-
-
class
miprometheus.models.controllers.
LSTMStateTuple
[source]¶ Tuple used by LSTM Cells for storing current/past state information.
-
class
miprometheus.models.controllers.
LSTMController
(params)[source]¶ A wrapper class for an LSTM-based controller.
-
init_state
(batch_size)[source]¶ Returns ‘zero’ (initial) state tuple.
Parameters: batch_size – Size of the batch in the given iteration/epoch.
Returns: Initial state tuple - object of LSTMStateTuple class.
-
forward
(x, prev_state_tuple)[source]¶ Controller forward function.
Parameters: - x – a Tensor of input data of size [BATCH_SIZE x INPUT_SIZE] (generally the read data and input word concatenated)
- prev_state_tuple – Tuple of the previous hidden and cell state
Returns: outputs a Tensor of size [BATCH_SIZE x OUTPUT_SIZE] and an LSTM state tuple.
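The `init_state` / `forward` contract shared by the controllers can be illustrated with a named tuple that is threaded through successive steps. In this sketch the field names of the tuple are assumptions, and `toy_forward` is a placeholder recurrence, not an LSTM.

```python
import collections
import numpy as np

# Field names are an assumption; the text above only says what the tuple stores.
LSTMStateTuple = collections.namedtuple("LSTMStateTuple", ("hidden_state", "cell_state"))

def init_state(batch_size, hidden_size):
    """Return the 'zero' (initial) state tuple for a batch."""
    zeros = np.zeros((batch_size, hidden_size))
    return LSTMStateTuple(zeros.copy(), zeros.copy())

def toy_forward(x, prev_state_tuple):
    """Placeholder recurrence (not an LSTM): returns (output, new state tuple)."""
    hidden = np.tanh(x.sum(axis=1, keepdims=True) + prev_state_tuple.hidden_state)
    cell = prev_state_tuple.cell_state + 1.0    # stands in for the cell update
    return hidden, LSTMStateTuple(hidden, cell)

sequence = np.ones((3, 4, 6))                   # [steps, batch_size, input_size]
state = init_state(batch_size=4, hidden_size=5)
for x in sequence:
    output, state = toy_forward(x, state)       # the state is passed to the next step
```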
-
-
class
miprometheus.models.controllers.
RNNStateTuple
[source]¶ Tuple used by RNN cells for storing current/past state information.
-
class
miprometheus.models.controllers.
RNNController
(params)[source]¶ A wrapper class for a feedforward controller?
TODO: Doc needs update!
-
init_state
(batch_size)[source]¶ Returns ‘zero’ (initial) state tuple.
Parameters: batch_size – Size of the batch in the given iteration/epoch.
Returns: Initial state tuple - object of RNNStateTuple class.
-
forward
(inputs, prev_hidden_state_tuple)[source]¶ Controller forward function.
Parameters: - inputs – a Tensor of input data of size [BATCH_SIZE x INPUT_SIZE] (generally the read data and input word concatenated)
- prev_hidden_state_tuple – Tuple of the previous hidden state
Returns: outputs a Tensor of size [BATCH_SIZE x OUTPUT_SIZE] and an RNN state tuple.
-
Memory-Augmented Neural Network (MANN) models¶
DWM¶
-
class
miprometheus.models.dwm.
Controller
(in_dim, output_units, state_units, read_size, update_size)[source]¶ Implementation of the DWM controller.
-
__init__
(in_dim, output_units, state_units, read_size, update_size)[source]¶ Constructor for the Controller.
Parameters: - in_dim – input size.
- output_units – output size.
- state_units – state size.
- read_size – size of the data read from memory
- update_size – total number of parameters for updating attention and memory
-
init_state
(batch_size)[source]¶ Returns ‘zero’ (initial) state tuple.
Parameters: batch_size – size of the batch in the given iteration/epoch.
Returns: Initial state tuple - object of LSTMStateTuple class.
-
forward
(input, tuple_state_prev, read_data)[source]¶ Forward pass of the DWM controller, calculates the output, the hidden state and the interface parameters.
Parameters: - input – current input (from time t) [batch_size, in_dim]
- tuple_state_prev – contains previous hidden state (from time t-1) [batch_size, state_units]
- read_data – read data from memory (from time t) [batch_size, read_size]
Returns:
- output: logits representing the prediction [batch_size, output_units]
- tuple_state: contains new_hidden_state
- update_data: interface parameters [batch_size, update_size]
-
-
class
miprometheus.models.dwm.
DWMCellStateTuple
[source]¶ Tuple used by DWM Cells for storing current/past state information:
controller state, interface state, memory state.
-
class
miprometheus.models.dwm.
DWMCell
(in_dim, output_units, state_units, num_heads, is_cam, num_shift, M)[source]¶ Applies the DWM cell to an element in the input sequence.
-
__init__
(in_dim, output_units, state_units, num_heads, is_cam, num_shift, M)[source]¶ Builds the DWM cell.
Parameters: - in_dim – input size.
- output_units – output size.
- state_units – state size.
- num_heads – number of heads.
- is_cam – whether the cell is content-addressable.
- num_shift – number of shifts of heads.
- M – Number of slots per address in the memory bank.
-
forward
(input, tuple_cell_state_prev)[source]¶ Forward pass of the DWMCell.
Parameters: - input – current input (from time t) [batch_size, inputs_size]
- tuple_cell_state_prev – contains (tuple_ctrl_state_prev, tuple_interface_prev, mem_prev), object of class DWMCellStateTuple
Returns:
- output: logits [batch_size, output_size]
- tuple_cell_state: contains (tuple_ctrl_state, tuple_interface, mem)
\[
\begin{aligned}
&\text{Step 1: read memory} \\
r_t &= M_t \otimes w_t \\
&\text{Step 2: controller} \\
h_t &= \sigma(W_h [x_t, h_{t-1}, r_{t-1}]) \\
y_t &= W_y [x_t, h_{t-1}, r_{t-1}] \\
P_t &= W_P [x_t, h_{t-1}, r_{t-1}] \\
&\text{Step 3: memory update} \\
M_t &= M_{t-1} \circ (E - w_t \otimes e_t) + w_t \otimes a_t \\
&\text{(to be completed ...)}
\end{aligned}
\]
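The read and memory-update steps of the equations above can be sketched with NumPy using the tensor shapes documented for the DWM interface and memory. How several heads are combined (multiplying the keep-masks, summing the additions) is an assumption, and the function names are illustrative only.

```python
import numpy as np

def read(wt, mem):
    """r_t = M_t (x) w_t.
    wt: [batch, heads, addresses]; mem: [batch, content, addresses].
    Returns the read data [batch, heads, content]."""
    return np.einsum("bha,bca->bhc", wt, mem)

def erase_and_add(mem, wt, erase, add):
    """M_t = M_{t-1} o (E - w_t (x) e_t) + w_t (x) a_t.
    erase, add: [batch, heads, content]."""
    # Outer products per head: [batch, heads, content, addresses].
    erase_term = np.einsum("bhc,bha->bhca", erase, wt)
    add_term = np.einsum("bhc,bha->bhca", add, wt)
    # Assumption: heads combine by multiplying keep-masks and summing additions.
    keep = np.prod(1.0 - erase_term, axis=1)
    return mem * keep + add_term.sum(axis=1)

mem = np.ones((1, 3, 4))                          # 3 content slots, 4 addresses
wt = np.array([[[0.0, 1.0, 0.0, 0.0]]])           # one head focused on address 1
erase = np.ones((1, 1, 3))                        # erase everything at that address
add = np.array([[[5.0, 6.0, 7.0]]])               # then write this vector
new_mem = erase_and_add(mem, wt, erase, add)
read_back = read(wt, new_mem)
```

With a one-hot weighting, only the addressed column is overwritten and reading with the same weighting returns the written vector.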
-
-
class
miprometheus.models.dwm.
DWM
(params, problem_default_values_={})[source]¶ Differentiable Working Memory (DWM) is a memory-augmented neural network that emulates human working memory.
The DWM shows the same functional characteristics as working memory, robustly learns psychology-inspired tasks, and converges faster than comparable state-of-the-art models.
-
__init__
(params, problem_default_values_={})[source]¶ Constructor. Initializes parameters on the basis of dictionary passed as argument.
Parameters: - params – Local view to the Parameter Registry ‘’model’’ section.
- problem_default_values – Dictionary containing key-values received from problem.
-
forward
(data_dict)[source]¶ Forward pass. Requires that data_dict contains at least “sequences”.
Parameters: data_dict – DataDict containing at least:
- “sequences”: a tensor of input data of size [BATCH_SIZE x LENGTH_SIZE x INPUT_SIZE]
Returns: output: logits which represent the prediction of DWM [batch, sequence_length, output_size]
Example:
>>> dwm = DWM(params)
>>> inputs = torch.randn(5, 3, 10)
>>> targets = torch.randn(5, 3, 20)
>>> data_tuple = (inputs, targets)
>>> output = dwm(data_tuple)
-
plot
(data_dict, predictions, sample_number=0)[source]¶ Interactive visualization, with a slider that allows moving back and forth along the time axis (iteration in a given episode).
Parameters: - data_dict – DataDict containing at least: - “sequences”: a tensor of input data of size [BATCH_SIZE x LENGTH_SIZE x INPUT_SIZE] - “targets”: a tensor of targets of size [BATCH_SIZE x SEQUENCE_LENGTH x OUTPUT_DATA_SIZE]
- predictions – Prediction sequence [BATCH_SIZE x SEQUENCE_LENGTH x OUTPUT_DATA_SIZE]
- sample_number – Index of the sample in the batch (DEFAULT: 0)
-
-
class
miprometheus.models.dwm.
InterfaceStateTuple
[source]¶ Tuple used by interface for storing current/past interface information:
head_weight and snapshot_weight.
-
class
miprometheus.models.dwm.
Interface
(num_heads, is_cam, num_shift, M)[source]¶ Implementation of the interface of the DWM.
-
__init__
(num_heads, is_cam, num_shift, M)[source]¶ Initialize Interface.
Parameters: - num_heads – number of heads
- is_cam (boolean) – whether the heads are allowed to use content addressing
- num_shift – number of shifts of heads.
- M – Number of slots per address in the memory bank.
-
init_state
(memory_addresses_size, batch_size)[source]¶ Returns ‘zero’ (initial) state of Interface tuple.
Parameters: - batch_size – Size of the batch in the given iteration/epoch.
- memory_addresses_size – size of the memory
Returns: Initial state tuple - object of InterfaceStateTuple class: (head_weight_init, snapshot_weight_init)
-
read_size
¶ Returns the size of the data read by all heads.
Returns: (num_heads*content_size)
-
update_size
¶ Returns the total number of parameters output by the controller.
Returns: (num_heads*parameters_per_head)
-
read
(wt, mem)[source]¶ Returns the data read from memory.
Parameters: - wt – head’s weights [batch_size, num_heads, memory_addresses_size]
- mem – the memory content [batch_size, memory_content_size, memory_addresses_size]
Returns: the read data [batch_size, num_heads, memory_content_size]
-
update
(update_data, tuple_interface_prev, mem)[source]¶ Erases from memory, writes to memory, updates the weights using various attention mechanisms.
Parameters: - update_data – the parameters from the controllers
- tuple_interface_prev – contains (head_weight, snapshot_weight)
- tuple_interface_prev.head_weight – head attention [batch_size, num_heads, memory_size]
- tuple_interface_prev.snapshot_weight – snapshot(bookmark) attention [batch_size, num_heads, memory_size]
- mem – the memory [batch_size, content_size, memory_size]
Returns:
- InterfaceStateTuple containing [head_weight, snapshot_weight]: the updated head and snapshot weights
- mem: the new memory content
-
-
class
miprometheus.models.dwm.
Memory
(mem_t)[source]¶ Implementation of the memory of the DWM.
-
__init__
(mem_t)[source]¶ Initializes the memory.
Parameters: mem_t – the memory at time t [batch_size, memory_content_size, memory_addresses_size]
-
attention_read
(wt)[source]¶ Returns the data read from memory.
Parameters: wt – head’s weights [batch_size, num_heads, memory_addresses_size] Returns: the read data [batch_size, num_heads, memory_content_size]
-
add_weighted
(add, wt)[source]¶ Writes data to memory.
Parameters: - wt – head’s weights [batch_size, num_heads, memory_addresses_size]
- add – the data to be added to memory [batch_size, num_heads, memory_content_size]
Returns: the updated memory [batch_size, memory_addresses_size, memory_content_size]
-
erase_weighted
(erase, wt)[source]¶ Erases elements from memory.
Parameters: - wt – head’s weights [batch_size, num_heads, memory_addresses_size]
- erase – data to be erased from memory [batch_size, num_heads, memory_content_size]
Returns: the updated memory [batch_size, memory_addresses_size, memory_content_size]
-
content_similarity
(k)[source]¶ Calculates the dot product for Content aware addressing.
Parameters: k – the keys emitted by the controller [batch_size, num_heads, memory_content_size] Returns: the dot product between the keys and query [batch_size, num_heads, memory_addresses_size]
-
size
¶ Returns the size of the memory.
Returns: Int size of the memory
-
content
¶ Returns the entire memory.
Returns: the memory []
-
-
miprometheus.models.dwm.
normalize
(x)[source]¶ Normalizes the input torch tensor along the last dimension using the max of the one-norm. The normalization is “fuzzy” to prevent divergences.
Parameters: x – input of shape [batch_size, A, A1 ..An]. If the input is the weight vector, x has shape (batch_size, num_heads, memory_size).
Returns: normalized x of shape [batch_size, A, A1 ..An]
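One way to read this description is division by the one-norm along the last dimension, with the denominator clamped from below so that all-zero rows do not produce a division by zero. The sketch below uses NumPy, and the clamp value `eps` is an assumption.

```python
import numpy as np

def normalize(x, eps=1e-12):
    """Divide x by its one-norm along the last dimension.
    The denominator is clamped to eps (assumed value) to avoid dividing by zero."""
    one_norm = np.sum(np.abs(x), axis=-1, keepdims=True)
    return x / np.maximum(one_norm, eps)

weights = np.array([[[2.0, 1.0, 1.0],
                     [0.0, 0.0, 0.0]]])     # [batch_size, num_heads, memory_size]
normalized = normalize(weights)
```

The first head's weights become a distribution summing to 1, while the all-zero head stays at zero instead of turning into NaN.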
-
miprometheus.models.dwm.
sim
(query, data, l2_normalize=False, aligned=True)[source]¶ Batch dot-product similarity computed using matrix multiplication. The hidden shapes must be broadcastable (NumPy style).
Parameters: - query – the input data to be compared [batch_size, h, p], where p = memory_size if aligned is True and p = content_size if aligned is False
- data – Input state [batch_size, content_size, memory_size]
- l2_normalize – boolean, determines whether to normalize the query and the data before the dot product
- aligned – boolean, determines whether to transpose data along the last two dimensions
Returns: out[…,i,j] = sum_k q[…,i,k] * data[…,j,k] for the default options
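The formula in the Returns line corresponds to multiplying the query by the transposed data. A NumPy sketch follows; the order in which the optional transposition and the L2 normalization are applied, and the clamp used in the normalization, are assumptions.

```python
import numpy as np

def sim(query, data, l2_normalize=False, aligned=True):
    """out[..., i, j] = sum_k query[..., i, k] * data[..., j, k]."""
    if not aligned:
        # Assumption: the transposition happens before the normalization.
        data = np.swapaxes(data, -1, -2)
    if l2_normalize:
        query = query / np.maximum(np.linalg.norm(query, axis=-1, keepdims=True), 1e-12)
        data = data / np.maximum(np.linalg.norm(data, axis=-1, keepdims=True), 1e-12)
    return query @ np.swapaxes(data, -1, -2)

rng = np.random.default_rng(0)
data = rng.standard_normal((2, 5, 7))         # [batch_size, content_size, memory_size]
query_aligned = rng.standard_normal((2, 3, 7))   # p = memory_size
query_keys = rng.standard_normal((2, 3, 5))      # p = content_size
out_aligned = sim(query_aligned, data)                  # [2, 3, 5]
out_content = sim(query_keys, data, aligned=False)      # [2, 3, 7]
```

With `aligned=False` the result has one score per memory address, which is the shape needed for content-based addressing.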
-
miprometheus.models.dwm.
outer_prod
(x, y)[source]¶ Batch outer product of two vectors (along the last two dimensions). The hidden shapes must be broadcastable (NumPy style).
Parameters: - x – (for the dwm model) input one [batch_size, num_heads, memory_content_size]
- y – (for the dwm model) Input two [batch_size, num_heads, memory_addresses_size]
Returns: Outer product [batch_size, num_heads, memory_content_size, memory_addresses_size]
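A batched outer product with these shapes can be written with broadcasting alone. This NumPy sketch only illustrates the shape contract and is not the module's torch implementation.

```python
import numpy as np

def outer_prod(x, y):
    """x: [..., n], y: [..., m] -> [..., n, m]; leading dimensions broadcast."""
    return x[..., :, None] * y[..., None, :]

x = np.arange(6.0).reshape(1, 2, 3)     # [batch_size, num_heads, memory_content_size]
y = np.arange(8.0).reshape(1, 2, 4)     # [batch_size, num_heads, memory_addresses_size]
result = outer_prod(x, y)               # [1, 2, 3, 4]
```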
DNC¶
-
class
miprometheus.models.dnc.
ControlParams
(output_size, read_size, params)[source]¶ -
__init__
(output_size, read_size, params)[source]¶ Initializes a Controller.
Parameters: - output_size – output size.
- read_size – size of the data read from memory
- params – dictionary of input parameters
-
init_state
(batch_size)[source]¶ Returns ‘zero’ (initial) state tuple.
Parameters: batch_size – Size of the batch in the given iteration/epoch.
Returns: Initial state tuple - object of LSTMStateTuple class.
-
forward
(inputs, prev_ctrl_state_tuple, read_data)[source]¶ Calculates the output, the hidden state and the controller parameters.
Parameters: - inputs – Current input (from time t) [BATCH_SIZE x INPUT_SIZE]
- read_data – data read from memory (from time t-1) [BATCH_SIZE x num_data_bits]
- prev_ctrl_state_tuple – Tuple of states of controller (from time t-1)
Returns: Tuple [output, hidden_state, update_data] (update_data contains all of the controller parameters)
-
-
class
miprometheus.models.dnc.
NTMCellStateTuple
[source]¶ Tuple used by NTM Cells for storing current/past state information.
-
class
miprometheus.models.dnc.
DNCCell
(output_size, params)[source]¶ Class representing a single cell of the DNC.
-
__init__
(output_size, params)[source]¶ Initializes a DNC cell.
Parameters: - output_size – output size.
- state_units – state size.
- num_heads – number of heads.
-
-
class
miprometheus.models.dnc.
DNC
(params, problem_default_values_={})[source]¶ Implementation of Differentiable Neural Computer (DNC)
Graves, Alex, et al. “Hybrid computing using a neural network with dynamic external memory.” Nature 538.7626 (2016): 471. doi:10.1038/nature20101
-
__init__
(params, problem_default_values_={})[source]¶ Constructor. Initializes parameters on the basis of dictionary passed as argument.
Parameters: - params – Local view to the Parameter Registry ‘’model’’ section.
- problem_default_values – Dictionary containing key-values received from problem.
-
forward
(data_dict)[source]¶ Forward pass. Requires that data_dict contains at least “sequences”.
Parameters: data_dict – DataDict containing at least:
- “sequences”: a tensor of input data of size [BATCH_SIZE x LENGTH_SIZE x INPUT_SIZE]
Returns: Predictions (logits) being a tensor of size [BATCH_SIZE x LENGTH_SIZE x OUTPUT_SIZE].
-
plot_memory_attention
(data_dict, predictions, sample_number=0)[source]¶ Plots memory and attention. TODO: fix.
Parameters: - data_dict – DataDict containing at least: - “sequences”: a tensor of input data of size [BATCH_SIZE x LENGTH_SIZE x INPUT_SIZE] - “targets”: a tensor of targets of size [BATCH_SIZE x SEQUENCE_LENGTH x OUTPUT_DATA_SIZE]
- predictions – Prediction sequence [BATCH_SIZE x SEQUENCE_LENGTH x OUTPUT_DATA_SIZE]
- sample_number – Index of the sample in the batch (DEFAULT: 0)
-
plot
(data_dict, predictions, sample_number=0)[source]¶ Interactive visualization, with a slider that allows moving back and forth along the time axis (iteration in a given episode).
Parameters: - data_dict – DataDict containing at least: - “sequences”: a tensor of input data of size [BATCH_SIZE x LENGTH_SIZE x INPUT_SIZE] - “targets”: a tensor of targets of size [BATCH_SIZE x SEQUENCE_LENGTH x OUTPUT_DATA_SIZE]
- predictions – Prediction sequence [BATCH_SIZE x SEQUENCE_LENGTH x OUTPUT_DATA_SIZE]
- sample_number – Index of the sample in the batch (DEFAULT: 0)
-
-
class
miprometheus.models.dnc.
InterfaceStateTuple
[source]¶ Tuple used by interface for storing current/past state information.
-
class
miprometheus.models.dnc.
Interface
(params)[source]¶ -
-
read_size
¶ Returns the size of the data read by all heads.
Returns: (num_heads*content_size)
-
read
(prev_interface_tuple, mem)[source]¶ Returns the data read from memory.
Parameters: - prev_interface_tuple – Tuple [previous read, previous write, prev usage, prev links]
- mem – the memory [batch_size, content_size, memory_size]
Returns: the read data [batch_size, content_size]
-
edit_memory
(interface_tuple, update_data, mem)[source]¶ Edits the external memory and then returns it.
Parameters: - update_data – the parameters from the controllers [dictionary]
- interface_tuple – Tuple [previous read, previous write, prev usage, prev links]
- mem – the memory [batch_size, content_size, memory_size]
Returns: edited memory [batch_size, content_size, memory_size]
-
init_state
(memory_address_size, batch_size)[source]¶ Returns ‘zero’ (initial) state tuple.
Parameters: - memory_address_size – The number of memory addresses
- batch_size – Size of the batch in the given iteration/epoch.
Returns: Initial state tuple - object of InterfaceStateTuple class.
-
update_weight
(prev_attention, memory, strength, gate, key, shift, sharp)[source]¶ Update the attention with NTM’s mix of content addressing and linear shifting.
Parameters: - prev_attention – tensor of shape [batch_size, num_writes, memory_size] giving the attention at the previous time step.
- memory – the memory of the previous step (class)
- strength – The strengthening parameter for the content addressing [batch, num_heads, 1]
- gate – The interpolation gate between the content addressing and the previous weight [batch, num_heads, 1]
- key – The comparison key for the content addressing [batch, num_heads, num_memory_bits]
- shift – The shift vector that defines the circular convolution of the outputs [batch, num_heads, num_shifts]
- sharp – sharpening parameter for the attention [batch, num_heads, 1]
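The four stages implied by these parameters (content addressing, gated interpolation, circular shift, sharpening) can be sketched as below. This is a NumPy illustration of NTM-style addressing under several assumptions: memory is passed as a raw array of shape [batch, content, addresses] rather than as a Memory object, content similarity is cosine similarity, and the shift distribution is centred on offset zero.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def circular_shift(weights, shift):
    """weights: [..., memory_size]; shift: [..., num_shifts], a distribution over
    offsets centred on zero (assumed), e.g. (-1, 0, +1) for num_shifts = 3."""
    num_shifts = shift.shape[-1]
    offsets = range(-(num_shifts // 2), num_shifts - num_shifts // 2)
    out = np.zeros_like(weights)
    for k, offset in enumerate(offsets):
        out += shift[..., k:k + 1] * np.roll(weights, offset, axis=-1)
    return out

def update_weight(prev_attention, memory, strength, gate, key, shift, sharp):
    # 1. Content addressing: cosine similarity scaled by strength, then softmax.
    mem_unit = memory / np.maximum(np.linalg.norm(memory, axis=1, keepdims=True), 1e-12)
    key_unit = key / np.maximum(np.linalg.norm(key, axis=-1, keepdims=True), 1e-12)
    similarity = np.einsum("bhc,bca->bha", key_unit, mem_unit)
    content = softmax(strength * similarity)
    # 2. Interpolate between content-based and previous attention.
    gated = gate * content + (1.0 - gate) * prev_attention
    # 3. Circular convolution with the shift distribution.
    shifted = circular_shift(gated, shift)
    # 4. Sharpen and renormalize.
    powered = shifted ** sharp
    return powered / powered.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(1)
memory = rng.standard_normal((1, 3, 4))               # 3 memory bits, 4 addresses
one = np.ones((1, 1, 1))
prev = np.array([[[1.0, 0.0, 0.0, 0.0]]])
key = memory[:, :, 2][:, None, :]                      # key equal to address 2
# gate = 0 ignores the content and only shifts the previous attention by +1.
moved = update_weight(prev, memory, one, 0 * one, key, np.array([[[0.0, 0.0, 1.0]]]), one)
```

Setting the gate to 1 with a large strength would instead focus the attention on the address whose content matches the key.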
-
update_write_weight
(usage, memory, allocation_gate, write_gate, key, strength)[source]¶ Update write attention with DNC’s combination of content addressing and usage based allocation.
Parameters: - usage – A tensor of shape [batch_size, memory_size] representing current memory usage.
- memory – the memory of the previous step (class)
- strength – The strengthening parameter for the content addressing [batch, num_writes, 1]
- key – The comparison key for the content addressing [batch, num_writes, num_memory_bits]
- allocation_gate – Interpolation between writing to unallocated memory and content-based lookup, for each write head [batch, num_writes, 1]
- write_gate – Overall gating of write amount for each write head. [batch, num_writes, 1]
-
update_read_weight
(link, memory, prev_read_weights, read_mode, key, strength)[source]¶ Update the read attention with the DNC’s combination of content addressing and temporal link propagation to go forwards or backwards in time.
Parameters: - link – A tensor of shape [batch_size, num_writes, memory_size, memory_size] representing the previous link graphs for each write head.
- memory – the memory of the previous step (class)
- prev_read_weights – tensor of shape [batch_size, num_reads, memory_size] containing the previous read weights w_{t-1}^r.
- read_mode – Mixing between “backwards” and “forwards” positions (for each write head) and content-based lookup, for each read head [batch, num_reads, 1+2*numwrites]
- strength – The strengthening parameter for the content addressing [batch, num_reads, 1]
- key – The comparison key for the content addressing [batch, num_reads, num_memory_bits]
-
update_read
(update_data, prev_interface_tuple, mem)[source]¶ Updates the read attention switching between the NTM and DNC mechanisms.
Parameters: - update_data – the parameters from the controllers [dictionary]
- prev_interface_tuple – Tuple [previous read, previous write, prev usage, prev links]
- mem – the memory of the previous step (class)
Returns: The new interface tuple with an updated usage and write attention
-
update_write
(update_data, prev_interface_tuple, mem)[source]¶ Updates the write attention switching between the NTM and DNC mechanisms.
Parameters: - update_data – the parameters from the controllers [dictionary]
- prev_interface_tuple – Tuple [previous read, previous write, prev usage, prev links]
- mem – the memory of the previous step (class)
Returns: The new interface tuple with an updated usage and write attention
-
update_and_edit
(update_data, prev_interface_tuple, prev_memory_BxMxA)[source]¶ Erases from memory, writes to memory, updates the weights using various attention mechanisms.
Parameters: - update_data – the parameters from the controllers [update_size]
- prev_interface_tuple – the read weight [BATCH_SIZE, MEMORY_SIZE]
- prev_memory_BxMxA – the memory of the previous step (class)
Returns: the new read vector, the update memory, the new interface tuple
-
-
class
miprometheus.models.dnc.
Memory
(mem_t)[source]¶ -
__init__
(mem_t)[source]¶ Initializes the memory.
Parameters: mem_t – the memory at time t, of shape (batch_size, memory_content_size, memory_addresses_size)
-
attention_read
(wt)[source]¶ Returns the data read from memory.
Parameters: wt – head’s weights, of shape (batch_size, num_heads, memory_addresses_size)
Returns: the read data, of shape (batch_size, num_heads, memory_content_size)
-
add_weighted
(add, wt)[source]¶ Writes data to memory.
Parameters: - wt – head’s weights, of shape (batch_size, num_heads, memory_addresses_size)
- add – the data to be added to memory, of shape (batch_size, num_heads, memory_content_size)
Returns: the updated memory, of shape (batch_size, memory_addresses_size, memory_content_size)
-
erase_weighted
(erase, wt)[source]¶ Erases elements from memory.
Parameters: - wt – head’s weights, of shape (batch_size, num_heads, memory_addresses_size)
- erase – data to be erased from memory, of shape (batch_size, num_heads, memory_content_size)
Returns: the updated memory, of shape (batch_size, memory_addresses_size, memory_content_size)
-
content_similarity
(k)[source]¶ Calculates the dot product for Content aware addressing.
Parameters: k – the keys emitted by the controller, of shape (batch_size, num_heads, memory_content_size)
Returns: the dot product between the keys and query, of shape (batch_size, num_heads, memory_addresses_size)
-
size
¶ Returns the size of the memory.
Returns: Int size of the memory
-
content
¶ Returns the entire memory.
Returns: the memory []
-
-
class
miprometheus.models.dnc.
MemoryUsage
(name='MemoryUsage')[source]¶ Memory usage that is increased by writing and decreased by reading.
The state of this module is a tensor with values in the range [0, 1] indicating the usage of each of the memory_size memory slots.
The usage is:
- Increased by writing, where usage is increased towards 1 at the write addresses.
- Decreased by reading, where usage is decreased after reading from a location when free_gate is close to 1.
The function write_allocation_weights can be invoked to get free locations to write to for a number of write heads.
-
__init__
(name='MemoryUsage')[source]¶ Creates a MemoryUsage module.
Parameters: name – Name of the module.
-
init_state
(memory_address_size, batch_size)[source]¶ Returns ‘zero’ (initial) state tuple.
Parameters: batch_size – Size of the batch in the given iteration/epoch. Returns: Initial state tuple - object of InterfaceStateTuple class.
-
calculate_usage
(write_weights, free_gate, read_weights, prev_usage)[source]¶ Calculates the new memory usage u_t.
Memory that was written to in the previous time step will have its usage increased; memory that was read from and the controller says can be “freed” will have its usage decreased.
Parameters: - write_weights – tensor of shape [batch_size, num_writes, memory_size] giving write weights at previous time step.
- free_gate – tensor of shape [batch_size, num_reads] which indicates which read heads read memory that can now be freed.
- read_weights – tensor of shape [batch_size, num_reads, memory_size] giving read weights at previous time step.
- prev_usage – tensor of shape [batch_size, memory_size] giving usage u_{t - 1} at the previous time step, with entries in range [0, 1].
Returns: tensor of shape [batch_size, memory_size] representing updated memory usage.
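The description above does not give the update rule itself. A minimal sketch, assuming the method follows the usage update of the original DNC formulation (write increase followed by a free-gate retention factor), for a single batch element and plain Python lists instead of tensors:

```python
def calculate_usage(write_weights, free_gate, read_weights, prev_usage):
    """Sketch for one batch element (assumed DNC-style update).

    write_weights: [num_writes][memory_size]
    free_gate:     [num_reads]
    read_weights:  [num_reads][memory_size]
    prev_usage:    [memory_size]
    """
    new_usage = []
    for slot, u_prev in enumerate(prev_usage):
        # Combined write intensity of all write heads: 1 - prod(1 - w).
        not_written = 1.0
        for w in write_weights:
            not_written *= 1.0 - w[slot]
        u = u_prev + (1.0 - u_prev) * (1.0 - not_written)
        # Retention: usage drops where a read head with an open free gate read.
        retention = 1.0
        for gate, r in zip(free_gate, read_weights):
            retention *= 1.0 - gate * r[slot]
        new_usage.append(u * retention)
    return new_usage


usage = calculate_usage(
    write_weights=[[1.0, 0.0]],
    free_gate=[1.0],
    read_weights=[[0.0, 1.0]],
    prev_usage=[0.0, 0.5],
)
print(usage)
```

In this example slot 0 was written to, so its usage rises to 1, while slot 1 was read with the free gate fully open, so its usage falls to 0.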
-
write_allocation_weights
(usage, write_gates, num_writes)[source]¶ Calculates freeness-based locations for writing to.
This finds unused memory by ranking the memory locations by usage, for each write head. (For more than one write head, we use a “simulated new usage” which takes into account the fact that the previous write head will increase the usage in that area of the memory.)
Parameters: - usage – A tensor of shape [batch_size, memory_size] representing current memory usage.
- write_gates – A tensor of shape [batch_size, num_writes] with values in the range [0, 1] indicating how much each write head does writing based on the address returned here (and hence how much usage increases).
- num_writes – The number of write heads to calculate write weights for.
Returns: tensor of shape [batch_size, num_writes, memory_size] containing the freeness-based write locations. Note that this isn’t scaled by write_gate; this scaling must be applied externally.
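As a rough illustration of the ranking and the “simulated new usage” described above, the sketch below works on one batch element with plain lists. The helper name allocation_weights and the exact form of the simulated update are assumptions based on the DNC formulation, not taken from the library source.

```python
def allocation_weights(usage):
    """Allocation for one write head: prefer the least-used slots."""
    order = sorted(range(len(usage)), key=lambda i: usage[i])  # ascending usage
    weights = [0.0] * len(usage)
    running = 1.0  # exclusive cumulative product of the sorted usage
    for slot in order:
        weights[slot] = (1.0 - usage[slot]) * running
        running *= usage[slot]
    return weights


def write_allocation_weights(usage, write_gates, num_writes):
    """Returns [num_writes][memory_size]; not scaled by the write gate."""
    usage = list(usage)
    all_weights = []
    for head in range(num_writes):
        alloc = allocation_weights(usage)
        all_weights.append(alloc)
        # Simulated new usage: pretend this head has already written.
        usage = [u + (1.0 - u) * write_gates[head] * a
                 for u, a in zip(usage, alloc)]
    return all_weights


print(write_allocation_weights([0.0, 0.0], [1.0, 1.0], 2))
```

With two empty slots and two fully open write heads, the first head takes slot 0 and the simulated usage steers the second head to slot 1.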
-
exclusive_cumprod_temp
(sorted_usage, dim=1)[source]¶ Applies the exclusive cumulative product (at the moment it assumes the shape of the input).
Parameters: sorted_usage – tensor of shape [batch_size, memory_size] indicating current memory usage sorted in ascending order. Returns: Tensor of shape [batch_size, memory_size] that is the exclusive product of the sorted usage, i.e. [1, u1, u1*u2, u1*u2*u3, …]
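The exclusive cumulative product can be illustrated for a single row with plain Python; the function name below is only illustrative and the real method operates on tensors.

```python
def exclusive_cumprod(values):
    """Return [1, v0, v0*v1, ...] with the same length as `values`."""
    result = []
    running = 1.0
    for v in values:
        result.append(running)  # product of all elements strictly before v
        running *= v
    return result


print(exclusive_cumprod([0.5, 0.5, 1.0]))  # [1.0, 0.5, 0.25]
```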
-
state_size
¶ Returns the shape of the state tensor.
-
class
miprometheus.models.dnc.
Param_Generator
(param_in_dim, word_size=20, num_reads=1, num_writes=1, shift_size=3)[source]¶ -
__init__
(param_in_dim, word_size=20, num_reads=1, num_writes=1, shift_size=3)[source]¶ Initialize all the parameters of the interface.
Parameters: - param_in_dim – input size. (typically the size of the hidden state)
- word_size – size of the word in memory
- num_reads – number of read heads
- num_writes – number of write heads
- shift_size – size of the shift vector (3 means it can go forward, backward and remain in place)
-
-
class
miprometheus.models.dnc.
TemporalLinkageState
[source]¶ Tuple used by interface for storing current/past state information.
-
class
miprometheus.models.dnc.
TemporalLinkage
(num_writes, name='temporal_linkage')[source]¶ Keeps track of write order for forward and backward addressing.
This is a pseudo-RNNCore module, whose state is a pair (link, precedence_weights), where link is a (collection of) graphs for (possibly multiple) write heads (represented by a tensor with values in the range [0, 1]), and precedence_weights records the “previous write locations” used to build the link graphs. The function directional_read_weights computes addresses following the forward and backward directions in the link graphs.
-
__init__
(num_writes, name='temporal_linkage')[source]¶ Construct a TemporalLinkage module.
Parameters: - memory_size – The number of memory slots.
- num_writes – The number of write heads.
- name – Name of the module.
-
init_state
(memory_address_size, batch_size)[source]¶ Returns ‘zero’ (initial) state tuple.
Parameters: batch_size – Size of the batch in the given iteration/epoch. Returns: Initial state tuple - object of InterfaceStateTuple class.
-
calc_temporal_links
(write_weights, prev_state)[source]¶ Calculate the updated linkage state given the write weights.
Parameters: - write_weights – A tensor of shape [batch_size, num_writes, memory_size] containing the memory addresses of the different write heads.
- prev_state – TemporalLinkageState tuple containing a tensor link of shape [batch_size, num_writes, memory_size, memory_size], and a tensor precedence_weights of shape [batch_size, num_writes, memory_size] containing the aggregated history of recent writes.
Returns: A TemporalLinkageState tuple next_state, which contains the updated link and precedence weights.
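A minimal sketch of such an update for one batch element and one write head, assuming the link and precedence recurrences of the original DNC formulation (the library may differ in details such as batching). The name update_links is hypothetical.

```python
def update_links(write_w, prev_link, prev_precedence):
    """One linkage step for a single write head.

    link[i][j] close to 1 means slot i was written right after slot j.
    """
    n = len(write_w)
    link = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # the diagonal is kept at zero
            link[i][j] = ((1.0 - write_w[i] - write_w[j]) * prev_link[i][j]
                          + write_w[i] * prev_precedence[j])
    write_sum = sum(write_w)
    precedence = [(1.0 - write_sum) * p + w
                  for p, w in zip(prev_precedence, write_w)]
    return link, precedence


zeros = [[0.0, 0.0], [0.0, 0.0]]
link, prec = update_links([1.0, 0.0], zeros, [0.0, 0.0])  # write to slot 0
link, prec = update_links([0.0, 1.0], link, prec)         # then to slot 1
print(link, prec)
```

After writing to slot 0 and then to slot 1, the only non-zero link is link[1][0], and the precedence weights point at slot 1.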
-
directional_read_weights
(link, prev_read_weights, forward)[source]¶ Calculates the forward or the backward read weights.
For each read head (at a given address), there are num_writes link graphs to follow. Thus this function computes a read address for each of the num_reads * num_writes pairs of read and write heads.
Parameters: - link – tensor of shape [batch_size, num_writes, memory_size, memory_size] representing the link graphs L_t.
- prev_read_weights – tensor of shape [batch_size, num_reads, memory_size] containing the previous read weights w_{t-1}^r.
- forward – Boolean indicating whether to follow the “future” direction in the link graph (True) or the “past” direction (False).
Returns: tensor of shape [batch_size, num_reads, num_writes, memory_size]
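For a single read head and a single link graph, following the graph amounts to a matrix-vector product with the link matrix (forward) or its transpose (backward). The sketch below assumes the convention that link[i][j] marks slot i as written after slot j; it uses plain lists rather than tensors.

```python
def directional_read_weights(link, prev_read_weights, forward):
    """Follow one link graph from the previous read weights."""
    n = len(prev_read_weights)
    if forward:
        # f[i] = sum_j link[i][j] * w[j]: move to what was written next.
        return [sum(link[i][j] * prev_read_weights[j] for j in range(n))
                for i in range(n)]
    # b[i] = sum_j link[j][i] * w[j]: move to what was written before.
    return [sum(link[j][i] * prev_read_weights[j] for j in range(n))
            for i in range(n)]


link = [[0.0, 0.0], [1.0, 0.0]]  # slot 1 was written after slot 0
fwd = directional_read_weights(link, [1.0, 0.0], forward=True)
bwd = directional_read_weights(link, [0.0, 1.0], forward=False)
print(fwd, bwd)
```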
-
-
miprometheus.models.dnc.
normalize
(x)[source]¶ Normalizes the input torch tensor along the last dimension using the max of the one norm. The normalization is “fuzzy” to prevent divergences.
Parameters: x – input of shape (batch_size, A, A1 ..An); if the input is the weight vector, x’s shape is (batch_size, num_heads, memory_size) Returns: normalized x of shape (batch_size, A, A1 ..An)
-
miprometheus.models.dnc.
sim
(query, data, l2_normalize=False, aligned=True)[source]¶ Batch dot-product similarity computed using matrix multiplication. The hidden shapes must be broadcastable (NumPy style).
Parameters: - query – the input data to be compared (batch_size, h, p), where p = memory_size if aligned is True and p = content_size if aligned is False
- data – Input state (batch_size, content_size, memory_size)
- l2_normalize – boolean, determines whether to normalize the query and the data before the dot product
- aligned – boolean, determines whether to transpose data along the last two dimensions
- aligned – boolean, determines whether to transpose data along the last two dimensions
Returns: out[…,i,j] = sum_k q[…,i,k] * data_gen[…,j,k] for the default options
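The index formula in the Returns entry can be checked on a small unbatched example. This is only a plain-Python illustration of that formula; batching, the l2_normalize option and the aligned option are left out, and the name dot_similarity is hypothetical.

```python
def dot_similarity(query, data):
    """out[i][j] = sum_k query[i][k] * data[j][k] (no batch dimension)."""
    return [[sum(q * d for q, d in zip(q_row, d_row)) for d_row in data]
            for q_row in query]


query = [[1, 0], [1, 1]]         # 2 query rows, inner size 2
data = [[1, 2], [3, 4], [5, 6]]  # 3 data rows, inner size 2
print(dot_similarity(query, data))  # [[1, 3, 5], [3, 7, 11]]
```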
-
miprometheus.models.dnc.
outer_prod
(x, y)[source]¶ Batch outer product of two vectors (along the last two dimensions). The hidden shapes must be broadcastable (NumPy style).
Parameters: - x – (the dwm model) input one (batch_size, num_heads, memory_content_size)
- y – (the dwm model) Input two (batch_size, num_heads, memory_addresses_size)
Returns: Outer product (batch_size, num_heads, memory_content_size, memory_addresses_size)
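For a single head and a single batch element, the result is the ordinary outer product of a content-sized vector and an address-sized vector. A small plain-Python illustration (the function name is illustrative only):

```python
def outer(x, y):
    """out[c][a] = x[c] * y[a] for one content vector and one address vector."""
    return [[xc * ya for ya in y] for xc in x]


print(outer([1, 2], [3, 4, 5]))  # [[3, 4, 5], [6, 8, 10]]
```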
NTM¶
-
class
miprometheus.models.ntm.
NTMCellStateTuple
[source]¶ Tuple used by NTM Cells for storing current/past state information.
-
class
miprometheus.models.ntm.
NTMCell
(params)[source]¶ Class representing a single NTM cell.
-
__init__
(params)[source]¶ Cell constructor. Cell creates controller and interface. It also initializes memory “block” that will be passed between states.
Parameters: params – Dictionary of parameters.
-
init_state
(init_memory_BxAxC)[source]¶ Returns ‘zero’ (initial) state. “Recursively” calls controller and interface initialization.
Parameters: init_memory_BxAxC – Initial memory. Returns: Initial state tuple - object of NTMCellStateTuple class.
-
forward
(inputs_BxI, prev_cell_state)[source]¶ Forward function of NTM cell.
Parameters: - inputs_BxI – a Tensor of input data of size [BATCH_SIZE x INPUT_SIZE]
- prev_cell_state – a NTMCellStateTuple tuple, containing previous state of the cell.
Returns: an output Tensor of size [BATCH_SIZE x OUTPUT_SIZE] and NTMCellStateTuple tuple containing current cell state.
-
-
class
miprometheus.models.ntm.
HeadStateTuple
[source]¶ Tuple used by interface for storing current/past state information.
-
class
miprometheus.models.ntm.
InterfaceStateTuple
[source]¶ Tuple used by interface for storing current/past state information.
-
class
miprometheus.models.ntm.
NTMInterface
(params)[source]¶ Class realizing interface between controller and memory.
-
init_state
(batch_size, num_memory_addresses)[source]¶ Returns ‘zero’ (initial) state tuple.
Parameters: - batch_size – Size of the batch in the given iteration/epoch.
- num_memory_addresses – Number of memory addresses.
Returns: Initial state tuple - object of InterfaceStateTuple class.
-
forward
(ctrl_hidden_state_BxH, prev_memory_BxAxC, prev_interface_state_tuple)[source]¶ Controller forward function.
Parameters: - ctrl_hidden_state_BxH – a Tensor with controller hidden state of size [BATCH_SIZE x HIDDEN_SIZE]
- prev_memory_BxAxC – Previous state of the memory [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
- prev_interface_state_tuple – Tuple containing previous read and write attention vectors.
Returns: List of read vectors [BATCH_SIZE x CONTENT_SIZE], updated memory and state tuple (object of LSTMStateTuple class).
-
calculate_param_locations
(param_sizes_dict, head_name)[source]¶ Calculates locations of parameters, that will subsequently be used during parameter splitting.
Parameters: - param_sizes_dict – Dictionary containing parameters along with their sizes (in bits/units).
- head_name – Name of head.
Returns: “Locations” of parameters.
-
update_attention
(query_vector_BxC, beta_Bx1, gate_Bx1, shift_BxS, gamma_Bx1, prev_memory_BxAxC, prev_attention_BxAx1)[source]¶ Updates the attention weights.
Parameters: - query_vector_BxC – Query used for similarity calculation in content-based addressing [BATCH_SIZE x CONTENT_BITS]
- beta_Bx1 – Strength parameter used in content-based addressing.
- gate_Bx1 –
- shift_BxS –
- gamma_Bx1 –
- prev_memory_BxAxC – tensor containing memory before update [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
- prev_attention_BxAx1 – previous attention vector [BATCH_SIZE x MEMORY_ADDRESSES x 1]
Returns: attention vector of size [BATCH_SIZE x ADDRESS_SIZE x 1]
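The gate, shift and gamma parameters are not described above. Assuming the method follows the addressing pipeline of the original NTM paper (content-based addressing, gated interpolation with the previous attention, circular shift, sharpening), the interpolation step could look like the sketch below. The helper name interpolate is hypothetical.

```python
def interpolate(gate, content_attention, prev_attention):
    """Blend content-based attention with the previous attention.

    gate = 1 keeps only the content-based attention,
    gate = 0 keeps only the previous attention.
    """
    return [gate * c + (1.0 - gate) * p
            for c, p in zip(content_attention, prev_attention)]


print(interpolate(0.5, [0.0, 1.0], [1.0, 0.0]))  # [0.5, 0.5]
```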
-
content_based_addressing
(query_vector_Bx1xC, beta_Bx1x1, prev_memory_BxAxC)[source]¶ Computes content-based addressing. Uses query vectors for calculation of similarity.
Parameters: - query_vector_Bx1xC – NTM “key” [BATCH_SIZE x 1 x CONTENT_BITS]
- beta_Bx1x1 – key strength [BATCH_SIZE x 1 x 1]
- prev_memory_BxAxC – tensor containing memory before update [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
Returns: attention of size [BATCH_SIZE x ADDRESS_SIZE x 1]
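A minimal sketch of content-based addressing for one batch element, assuming the NTM-paper recipe of cosine similarity scaled by the key strength and followed by a softmax over addresses. The similarity measure actually used by the library is not stated above, so treat that choice as an assumption.

```python
import math


def content_based_addressing(query, beta, memory):
    """query: [content], beta: float, memory: [addresses][content]."""
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / (norm + 1e-8)  # small epsilon avoids division by zero

    scores = [beta * cosine(query, row) for row in memory]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


memory = [[1.0, 0.0], [0.0, 1.0]]
sharp = content_based_addressing([1.0, 0.0], 10.0, memory)
flat = content_based_addressing([1.0, 0.0], 0.0, memory)
print(sharp, flat)
```

A large key strength concentrates the attention on the best-matching address, while a key strength of zero yields a uniform attention.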
-
location_based_addressing
(attention_BxAx1, shift_BxSx1, gamma_Bx1x1)[source]¶ Computes location-based addressing, i.e. shifts the head and sharpens.
Parameters: - attention_BxAx1 – Current attention [BATCH_SIZE x ADDRESS_SIZE x 1]
- shift_BxSx1 – soft shift mask (convolutional kernel) [BATCH_SIZE x SHIFT_SIZE x 1]
- gamma_Bx1x1 – sharpening factor [BATCH_SIZE x 1 x 1]
Returns: attention vector of size [BATCH_SIZE x ADDRESS_SIZE x 1]
-
circular_convolution
(attention_BxAx1, shift_BxSx1)[source]¶ Performs circular convolution, i.e. shifts the attention according to the given shift vector (convolution mask).
Parameters: - attention_BxAx1 – Current attention [BATCH_SIZE x ADDRESS_SIZE x 1]
- shift_BxSx1 – soft shift mask (convolutional kernel) [BATCH_SIZE x SHIFT_SIZE x 1]
Returns: attention vector of size [BATCH_SIZE x ADDRESS_SIZE x 1]
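The shift can be sketched for one batch element as follows. The kernel layout is an assumption: the kernel is taken to be centred, so for SHIFT_SIZE = 3 its entries stand for shifts of -1, 0 and +1; the library may order them differently.

```python
def circular_convolution(attention, shift):
    """Shift the attention circularly according to a soft shift kernel."""
    size = len(attention)
    half = len(shift) // 2
    shifted = []
    for i in range(size):
        total = 0.0
        for k, s in enumerate(shift):
            offset = k - half  # -1, 0, +1 for a kernel of size 3
            total += s * attention[(i - offset) % size]  # wraps around
        shifted.append(total)
    return shifted


attention = [1.0, 0.0, 0.0, 0.0]
print(circular_convolution(attention, [0.0, 0.0, 1.0]))  # one step forward
print(circular_convolution(attention, [1.0, 0.0, 0.0]))  # one step back, wraps
```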
-
sharpening
(attention_BxAx1, gamma_Bx1x1)[source]¶ Performs attention sharpening.
Parameters: - attention_BxAx1 – Current attention [BATCH_SIZE x ADDRESS_SIZE x 1]
- gamma_Bx1x1 – sharpening factor [BATCH_SIZE x 1 x 1]
Returns: attention vector of size [BATCH_SIZE x ADDRESS_SIZE x 1]
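Sharpening in the NTM sense raises every attention weight to the power gamma and renormalises. The sketch below assumes exactly that rule for one batch element; any epsilon or clamping used by the library is not shown.

```python
def sharpen(attention, gamma):
    """w_i ** gamma / sum_j(w_j ** gamma); gamma >= 1 makes the peak sharper."""
    powered = [w ** gamma for w in attention]
    total = sum(powered)
    return [p / total for p in powered]


sharpened = sharpen([0.75, 0.25], 2.0)
print(sharpened)
```

With gamma = 1 the attention is unchanged; larger values push the mass towards the largest weight.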
-
read_from_memory
(attention_BxAx1, memory_BxAxC)[source]¶ Returns 2D tensor of size [BATCH_SIZE x CONTENT_BITS] storing vector read from memory given the attention.
Parameters: - attention_BxAx1 – Current attention [BATCH_SIZE x ADDRESS_SIZE x 1]
- memory_BxAxC – tensor containing memory [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
Returns: vector read from the memory [BATCH_SIZE x CONTENT_BITS]
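Reading is an attention-weighted sum over the memory addresses. A plain-Python sketch for one batch element, given only as an illustration of that operation:

```python
def read_from_memory(attention, memory):
    """attention: [addresses], memory: [addresses][content] -> [content]."""
    content_size = len(memory[0])
    return [sum(w * row[c] for w, row in zip(attention, memory))
            for c in range(content_size)]


print(read_from_memory([0.5, 0.5], [[1.0, 2.0], [3.0, 4.0]]))  # [2.0, 3.0]
```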
-
update_memory
(write_attention_BxAx1, erase_vector_Bx1xC, add_vector_Bx1xC, prev_memory_BxAxC)[source]¶ Returns 3D tensor of size [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS] storing new content of the memory.
Parameters: - write_attention_BxAx1 – Current write attention [BATCH_SIZE x ADDRESS_SIZE x 1]
- erase_vector_Bx1xC – Erase vector [BATCH_SIZE x 1 x CONTENT_BITS]
- add_vector_Bx1xC – Add vector [BATCH_SIZE x 1 x CONTENT_BITS]
- prev_memory_BxAxC – tensor containing previous state of the memory [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
Returns: updated memory [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
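A sketch of the write for one batch element, assuming the erase-then-add rule of the original NTM paper; the library's exact arithmetic is not shown above.

```python
def update_memory(write_attention, erase_vector, add_vector, prev_memory):
    """M[a][c] = M_prev[a][c] * (1 - w[a] * e[c]) + w[a] * add[c]."""
    return [[m * (1.0 - w * e) + w * a
             for m, e, a in zip(row, erase_vector, add_vector)]
            for w, row in zip(write_attention, prev_memory)]


memory = update_memory(
    write_attention=[1.0, 0.0],
    erase_vector=[1.0, 0.0],
    add_vector=[0.0, 5.0],
    prev_memory=[[1.0, 1.0], [1.0, 1.0]],
)
print(memory)  # [[0.0, 6.0], [1.0, 1.0]]
```

Only the attended address changes: its first cell is erased and 5 is added to its second cell.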
-
-
class
miprometheus.models.ntm.
NTM
(params, problem_default_values_={})[source]¶ Class representing the Neural Turing Machine module.
-
__init__
(params, problem_default_values_={})[source]¶ Constructor. Initializes parameters on the basis of dictionary passed as argument.
Parameters: - params – Local view to the Parameter Registry ‘’model’’ section.
- problem_default_values – Dictionary containing key-values received from problem.
-
forward
(data_dict)[source]¶ Forward function requires that the data_dict will contain at least “sequences”
Parameters: data_dict – DataDict containing at least: - “sequences”: a tensor of input data of size [BATCH_SIZE x LENGTH_SIZE x INPUT_SIZE] Returns: Predictions (logits) being a tensor of size [BATCH_SIZE x LENGTH_SIZE x OUTPUT_SIZE].
-
generate_memory_attention_figure_layout
()[source]¶ Creates a figure template for showing basic NTM attributes (read & write attentions), memory and sequence (inputs, predictions and targets).
Returns: Matplotlib figure object.
-
plot_memory_attention_sequence
(data_dict, predictions, sample_number=0)[source]¶ Creates a list of figures used in interactive visualization, with a slider for moving back and forth along the time axis (iteration in a given episode). The visualization presents input, output and target sequences passed as input parameters. Additionally, it utilizes state tuples collected during the experiment for displaying the memory state, read and write attentions.
Parameters: - data_dict – DataDict containing at least: - “sequences”: a tensor of input data of size [BATCH_SIZE x LENGTH_SIZE x INPUT_SIZE] - “targets”: a tensor of targets of size [BATCH_SIZE x SEQUENCE_LENGTH x OUTPUT_DATA_SIZE]
- predictions – Prediction sequence [BATCH_SIZE x SEQUENCE_LENGTH x OUTPUT_DATA_SIZE]
- sample_number – Number of sample in batch (DEFAULT: 0)
-
generate_memory_all_model_params_figure_layout
()[source]¶ Creates a figure template for showing all NTM attributes (read & write attentions, gates, convolution masks), along with memory and sequence (inputs, predictions and targets).
Returns: Matplotlib figure object.
-
plot_memory_all_model_params_sequence
(data_dict, predictions, sample_number=0)[source]¶ Creates a list of figures used in interactive visualization, with a slider for moving back and forth along the time axis (iteration in a given episode). The visualization presents input, output and target sequences passed as input parameters. Additionally, it utilizes state tuples collected during the experiment for displaying the memory state, read and write attentions, and gating params.
Parameters: - data_dict – DataDict containing at least: - “sequences”: a tensor of input data of size [BATCH_SIZE x LENGTH_SIZE x INPUT_SIZE] - “targets”: a tensor of targets of size [BATCH_SIZE x SEQUENCE_LENGTH x OUTPUT_DATA_SIZE]
- predictions – Prediction sequence [BATCH_SIZE x SEQUENCE_LENGTH x OUTPUT_DATA_SIZE]
- sample_number – Number of sample in batch (DEFAULT: 0)
-
Encoder-Solver models¶
-
class
miprometheus.models.encoder_solver.
EncoderSolverLSTM
(params, problem_default_values_={})[source]¶ Class representing the Encoder-Solver architecture using LSTM cells as both encoder and solver modules.
-
__init__
(params, problem_default_values_={})[source]¶ Constructor. Initializes parameters on the basis of dictionary passed as argument.
Parameters: - params – Local view to the Parameter Registry ‘’model’’ section.
- problem_default_values – Dictionary containing key-values received from problem.
-
init_state
(batch_size)[source]¶ Returns ‘zero’ (initial) state.
Parameters: batch_size – Size of the batch in the given iteration/epoch. Returns: Initial state tuple (hidden, memory cell).
-
forward
(data_dict)[source]¶ Forward function requires that the data_dict will contain at least “sequences”
Parameters: data_dict – DataDict containing at least: - “sequences”: a tensor of input data of size [BATCH_SIZE x LENGTH_SIZE x INPUT_SIZE] Returns: Predictions (logits) being a tensor of size [BATCH_SIZE x LENGTH_SIZE x OUTPUT_SIZE].
-
-
class
miprometheus.models.encoder_solver.
EncoderSolverNTM
(params, problem_default_values_={})[source]¶ Class implementing the Encoder-Solver NTM model. The model has two NTM cells that are used in two distinct modes.
-
__init__
(params, problem_default_values_={})[source]¶ Constructor. Initializes parameters on the basis of dictionary passed as argument.
Parameters: - params – Local view to the Parameter Registry ‘’model’’ section.
- problem_default_values – Dictionary containing key-values received from problem.
-
forward
(data_dict)[source]¶ Forward function requires that the data_dict will contain at least “sequences”
Parameters: data_dict – DataDict containing at least: - “sequences”: a tensor of input data of size [BATCH_SIZE x LENGTH_SIZE x INPUT_SIZE] Returns: Predictions (logits) being a tensor of size [BATCH_SIZE x LENGTH_SIZE x OUTPUT_SIZE].
-
-
class
miprometheus.models.encoder_solver.
MAECellStateTuple
[source]¶ Tuple used by MAE Cells for storing current/past state information.
-
class
miprometheus.models.encoder_solver.
MAECell
(params)[source]¶ Class representing a single Memory-Augmented Encoder cell.
-
__init__
(params)[source]¶ Cell constructor. Cell creates controller and interface. It also initializes memory “block” that will be passed between states.
Parameters: params – Dictionary of parameters.
-
save
(model_dir, stat_obj, is_best_model, save_intermediate)[source]¶ Method saves the model and encoder to file.
Parameters: - model_dir – Directory where the model will be saved.
- stat_obj – Statistics object (collector or aggregator) that contain current loss and episode number (and other statistics).
- is_best_model – Flag indicating whether it is the best model or not.
- save_intermediate – Flag indicating whether intermediate models should be saved or not.
-
init_state
(init_memory_BxAxC)[source]¶ Initializes state of MAE cell. Recursive initialization: controller, interface.
Parameters: init_memory_BxAxC – Initial memory state [BATCH_SIZE x MEMORY_ADDRESSES x MEMORY_CONTENT]. Returns: Initial state tuple - object of NTMCellStateTuple class.
-
forward
(inputs_BxI, prev_cell_state)[source]¶ Forward function of MAE cell.
Parameters: - inputs_BxI – a Tensor of input data of size [BATCH_SIZE x INPUT_SIZE]
- prev_cell_state – a MAECellStateTuple tuple, containing previous state of the cell.
Returns: MAECellStateTuple tuple containing current cell state.
-
-
class
miprometheus.models.encoder_solver.
MAEInterfaceStateTuple
[source]¶ Tuple used by interface for storing current/past MAE interface state information.
-
class
miprometheus.models.encoder_solver.
MAEInterface
(params)[source]¶ Class realizing interface between controller and memory in Memory Augmented Encoder cell.
-
init_state
(batch_size, num_memory_addresses)[source]¶ Returns ‘zero’ (initial) state tuple.
Parameters: - batch_size – Size of the batch in the given iteration/epoch.
- num_memory_addresses – Number of memory addresses.
Returns: Initial state tuple - object of InterfaceStateTuple class.
-
forward
(ctrl_hidden_state_BxH, prev_memory_BxAxC, prev_interface_state_tuple)[source]¶ Controller forward function.
Parameters: - ctrl_hidden_state_BxH – a Tensor with controller hidden state of size [BATCH_SIZE x HIDDEN_SIZE]
- prev_memory_BxAxC – Previous state of the memory [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
- prev_interface_state_tuple – Tuple containing previous interface tuple.
Returns: updated memory and state tuple (object of MAEInterfaceStateTuple class).
-
calculate_param_locations
(param_sizes_dict, head_name)[source]¶ Calculates locations of parameters, that will subsequently be used during parameter splitting.
Parameters: - param_sizes_dict – Dictionary containing parameters along with their sizes (in bits/units).
- head_name – Name of head.
Returns: “Locations” of parameters.
-
update_attention
(shift_BxS, gamma_Bx1, prev_memory_BxAxC, prev_attention_BxAx1)[source]¶ Updates the attention weights.
Parameters: - shift_BxS – Convolution shift
- gamma_Bx1 – Sharpening factor
- prev_memory_BxAxC – tensor containing memory before update [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
- prev_attention_BxAx1 – previous attention vector [BATCH_SIZE x MEMORY_ADDRESSES x 1]
Returns: attention vector of size [BATCH_SIZE x ADDRESS_SIZE x 1]
-
location_based_addressing
(attention_BxAx1, shift_BxSx1, gamma_Bx1x1, prev_memory_BxAxC)[source]¶ Computes location-based addressing, i.e. shifts the head and sharpens.
Parameters: - attention_BxAx1 – Current attention [BATCH_SIZE x ADDRESS_SIZE x 1]
- shift_BxSx1 – soft shift mask (convolutional kernel) [BATCH_SIZE x SHIFT_SIZE x 1]
- gamma_Bx1x1 – sharpening factor [BATCH_SIZE x 1 x 1]
- prev_memory_BxAxC – tensor containing memory before update [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
Returns: attention vector of size [BATCH_SIZE x ADDRESS_SIZE x 1]
-
circular_convolution
(attention_BxAx1, shift_BxSx1, prev_memory_BxAxC)[source]¶ Performs circular convolution, i.e. shifts the attention according to the given shift vector (convolution mask).
Parameters: - attention_BxAx1 – Current attention [BATCH_SIZE x ADDRESS_SIZE x 1]
- shift_BxSx1 – soft shift mask (convolutional kernel) [BATCH_SIZE x SHIFT_SIZE x 1]
- prev_memory_BxAxC – tensor containing memory before update [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
Returns: attention vector of size [BATCH_SIZE x ADDRESS_SIZE x 1]
-
sharpening
(attention_BxAx1, gamma_Bx1x1)[source]¶ Performs attention sharpening.
Parameters: - attention_BxAx1 – Current attention [BATCH_SIZE x ADDRESS_SIZE x 1]
- gamma_Bx1x1 – sharpening factor [BATCH_SIZE x 1 x 1]
Returns: attention vector of size [BATCH_SIZE x ADDRESS_SIZE x 1]
-
update_memory
(write_attention_BxAx1, erase_vector_Bx1xC, add_vector_Bx1xC, prev_memory_BxAxC)[source]¶ Returns 3D tensor of size [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS] storing new content of the memory.
Parameters: - write_attention_BxAx1 – Current write attention [BATCH_SIZE x ADDRESS_SIZE x 1]
- erase_vector_Bx1xC – Erase vector [BATCH_SIZE x 1 x CONTENT_BITS]
- add_vector_Bx1xC – Add vector [BATCH_SIZE x 1 x CONTENT_BITS]
- prev_memory_BxAxC – tensor containing previous state of the memory [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
Returns: updated memory [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
-
-
class
miprometheus.models.encoder_solver.
MAES
(params, problem_default_values_={})[source]¶ Class implementing the Memory Augmented Encoder-Solver (MAES) model.
Warning
Class assumes that the whole batch has the same length, i.e. the batch of subsequences that is input to the encoder is of the same length (ends at the same item). The same goes for the subsequences that are input to the decoder.
-
__init__
(params, problem_default_values_={})[source]¶ Constructor. Initializes parameters on the basis of dictionary passed as argument.
Parameters: - params – Local view to the Parameter Registry ‘’model’’ section.
- problem_default_values – Dictionary containing key-values received from problem.
-
save
(model_dir, training_status, training_stats, validation_stats)[source]¶ Generic method saving the model parameters to file. It can be overloaded if one needs more control.
Parameters: - model_dir (str) – Directory where the model will be saved.
- training_status (str) – String representing the current status of training.
- training_stats (miprometheus.utils.StatisticsCollector or miprometheus.utils.StatisticsAggregator) – Training statistics that will be saved to checkpoint along with the model.
- validation_stats (miprometheus.utils.StatisticsCollector or miprometheus.utils.StatisticsAggregator) – Validation statistics that will be saved to checkpoint along with the model.
Returns: True if this is currently the best model (until the current episode, considering the loss).
-
forward
(data_dict)[source]¶ Forward function requires that the data_dict will contain at least “sequences”
Parameters: data_dict – DataDict containing at least: - “sequences”: a tensor of input data of size [BATCH_SIZE x LENGTH_SIZE x INPUT_SIZE] Returns: Predictions (logits) being a tensor of size [BATCH_SIZE x LENGTH_SIZE x OUTPUT_SIZE].
-
class
miprometheus.models.encoder_solver.
MASCellStateTuple
[source]¶ Tuple used by MAS Cells for storing current/past state information.
-
class
miprometheus.models.encoder_solver.
MASCell
(params)[source]¶ Class representing a single Memory-Augmented Decoder cell.
-
__init__
(params)[source]¶ Cell constructor. Cell creates controller and interface. Assumes that memory will be initialized by the encoder.
Parameters: params – Dictionary of parameters.
-
init_state
(final_enc_memory_BxAxC, final_enc_attention_BxAx1)[source]¶ Initializes the solver cell state depending on the last memory state. Recursive initialization: controller, interface.
Parameters: encoder_state – Last state of MAE cell. Returns: Initial state tuple - object of MASCellStateTuple class.
-
init_state_with_encoder_state
(final_enc_cell_state)[source]¶ Creates ‘zero’ (initial) state on the basis of the previous cell state. “Recursively” calls controller and interface initialization.
Parameters: final_enc_cell_state – Last state of MAE cell. Returns: Initial state tuple - object of MASCellStateTuple class.
-
forward
(inputs_BxI, prev_cell_state)[source]¶ Forward function of MAS cell.
Parameters: - inputs_BxI – a Tensor of input data of size [BATCH_SIZE x INPUT_SIZE]
- prev_cell_state – a MASCellStateTuple tuple, containing previous state of the cell.
Returns: an output Tensor of size [BATCH_SIZE x OUTPUT_SIZE] and MASCellStateTuple tuple containing current cell state.
-
-
class
miprometheus.models.encoder_solver.
MASInterfaceStateTuple
[source]¶ Tuple used by interface for storing current/past Memory Augmented Solver interface state information.
-
class
miprometheus.models.encoder_solver.
MASInterface
(params)[source]¶ Class realizing interface between MAS controller and memory.
-
init_state
(batch_size, num_memory_addresses, final_encoder_attention_BxAx1)[source]¶ Returns ‘zero’ (initial) state tuple.
Parameters: - batch_size – Size of the batch in the given iteration/epoch.
- num_memory_addresses – Number of memory addresses.
- final_encoder_attention_BxAx1 – final attention of the encoder [BATCH_SIZE x MEMORY_ADDRESSES x 1]
Returns: Initial state tuple - object of InterfaceStateTuple class.
-
forward
(ctrl_hidden_state_BxH, prev_memory_BxAxC, prev_interface_state_tuple)[source]¶ Controller forward function.
Parameters: - ctrl_hidden_state_BxH – a Tensor with controller hidden state of size [BATCH_SIZE x HIDDEN_SIZE]
- prev_memory_BxAxC – Previous state of the memory [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
- prev_interface_state_tuple – Tuple containing previous read and write attention vectors.
Returns: List of read vectors [BATCH_SIZE x CONTENT_SIZE], updated memory and state tuple (object of LSTMStateTuple class).
-
calculate_param_locations
(param_sizes_dict, head_name)[source]¶ Calculates locations of parameters, that will subsequently be used during parameter splitting.
Parameters: - param_sizes_dict – Dictionary containing parameters along with their sizes (in bits/units).
- head_name – Name of head.
Returns: “Locations” of parameters.
-
update_attention
(gate_Bx3, shift_BxS, gamma_Bx1, prev_memory_BxAxC, prev_attention_BxAx1)[source]¶ Updates the attention weights.
Parameters: - gate_Bx3 –
- shift_BxS –
- gamma_Bx1 –
- prev_memory_BxAxC – tensor containing memory before update [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
- prev_attention_BxAx1 – previous attention vector [BATCH_SIZE x MEMORY_ADDRESSES x 1]
Returns: attention vector of size [BATCH_SIZE x ADDRESS_SIZE x 1]
-
location_based_addressing
(attention_BxAx1, shift_BxSx1, gamma_Bx1x1, prev_memory_BxAxC)[source]¶ Computes location-based addressing, i.e. shifts the head and sharpens.
Parameters: - attention_BxAx1 – Current attention [BATCH_SIZE x ADDRESS_SIZE x 1]
- shift_BxSx1 – soft shift mask (convolutional kernel) [BATCH_SIZE x SHIFT_SIZE x 1]
- gamma_Bx1x1 – sharpening factor [BATCH_SIZE x 1 x 1]
- prev_memory_BxAxC – tensor containing memory before update [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
Returns: attention vector of size [BATCH_SIZE x ADDRESS_SIZE x 1]
-
circular_convolution
(attention_BxAx1, shift_BxSx1, prev_memory_BxAxC)[source]¶ Performs circular convolution, i.e. shifts the attention according to the given shift vector (convolution mask).
Parameters: - attention_BxAx1 – Current attention [BATCH_SIZE x ADDRESS_SIZE x 1]
- shift_BxSx1 – soft shift mask (convolutional kernel) [BATCH_SIZE x SHIFT_SIZE x 1]
- prev_memory_BxAxC – tensor containing memory before update [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
Returns: attention vector of size [BATCH_SIZE x ADDRESS_SIZE x 1]
-
sharpening
(attention_BxAx1, gamma_Bx1x1)[source]¶ Performs attention sharpening.
Parameters: - attention_BxAx1 – Current attention [BATCH_SIZE x ADDRESS_SIZE x 1]
- gamma_Bx1x1 – sharpening factor [BATCH_SIZE x 1 x 1]
Returns: attention vector of size [BATCH_SIZE x ADDRESS_SIZE x 1]
-
read_from_memory
(attention_BxAx1, memory_BxAxC)[source]¶ Returns 2D tensor of size [BATCH_SIZE x CONTENT_BITS] storing vector read from memory given the attention.
Parameters: - attention_BxAx1 – Current attention [BATCH_SIZE x ADDRESS_SIZE x 1]
- memory_BxAxC – tensor containing memory [BATCH_SIZE x MEMORY_ADDRESSES x CONTENT_BITS]
Returns: vector read from the memory [BATCH_SIZE x CONTENT_BITS]
-
Other Models¶
LSTM¶
-
class
miprometheus.models.lstm.
LSTM
(params, problem_default_values_={})[source]¶ Class implementing the Long Short-Term Memory model.
-
__init__
(params, problem_default_values_={})[source]¶ Constructor. Initializes parameters on the basis of dictionary passed as argument.
Parameters: - params – Local view to the Parameter Registry ‘’model’’ section.
- problem_default_values – Dictionary containing key-values received from problem.
-
forward
(data_dict)[source]¶ Forward function requires that the data_dict will contain at least “sequences”
Parameters: data_dict – DataDict containing at least: - “sequences”: a tensor of input data of size [BATCH_SIZE x LENGTH_SIZE x INPUT_SIZE] Returns: Predictions (logits) being a tensor of size [BATCH_SIZE x LENGTH_SIZE x OUTPUT_SIZE].
-
ThalNet¶
-
class
miprometheus.models.thalnet.
ThalNetCell
(input_size: int, output_size: int, context_input_size: int, center_size_per_module: int, num_modules: int)[source]¶ Implementation of the ThalNetCell, iterating over one sequence element at a time. It is constituted of several ThalNetModule.
-
__init__
(input_size: int, output_size: int, context_input_size: int, center_size_per_module: int, num_modules: int)[source]¶ Constructor of the ThalNetCell class.
Parameters:
-
-
class
miprometheus.models.thalnet.
ThalNetModel
(params, problem_default_values_={})[source]¶ ThalNet
is a deep learning model inspired by neocortical communication via the thalamus. This model consists of recurrent neural modules that send features through a routing center, endowing the modules with the flexibility to share features over multiple time steps. See the reference paper here: https://arxiv.org/pdf/1706.05744.pdf.
-
__init__
(params, problem_default_values_={})[source]¶ Constructor of the
ThalNetModel
. Instantiates the ThalNetCell.
Parameters: - params – dictionary of parameters (read from the
.yaml
configuration file).
- problem_default_values (dict) – default values coming from the
Problem
class.
-
forward
(data_dict)[source]¶ Forward run of the ThalNetModel model.
Parameters: data_dict (utils.DataDict) – DataDict({‘sequences’, …}) where ‘sequences’ is of shape [batch_size, sequence_length, input_size]
Returns: Predictions [batch_size, sequence_length, output_size]
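A forward pass over a cell that processes one sequence element at a time can be sketched as an explicit loop over the time dimension. The example below uses `torch.nn.GRUCell` purely as a stand-in for the recurrent cell; it does not reproduce the ThalNet routing center, and the helper name `unroll` is hypothetical.

```python
import torch
import torch.nn as nn


def unroll(cell, sequences, initial_state):
    """Hypothetical helper: apply a recurrent cell step by step over a sequence."""
    state = initial_state
    outputs = []
    # Iterate over the sequence_length dimension, one element at a time.
    for t in range(sequences.size(1)):
        state = cell(sequences[:, t, :], state)
        outputs.append(state)
    # Stack per-step outputs back into [batch_size, sequence_length, output_size].
    return torch.stack(outputs, dim=1)


cell = nn.GRUCell(input_size=8, hidden_size=4)  # stand-in, not the ThalNetCell
sequences = torch.randn(2, 5, 8)                # [batch_size, sequence_length, input_size]
predictions = unroll(cell, sequences, torch.zeros(2, 4))
```

The stacked result has one output per time step, matching the [batch_size, sequence_length, output_size] layout documented for the predictions.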
-
generate_figure_layout
()[source]¶ Generate a figure layout which will be used in
self.plot()
.
Returns: figure layout.
-
plot
(data_dict, logits, sample=0)[source]¶ Plots specific information on the model’s behavior.
Parameters: - data_dict (utils.DataDict) – DataDict({‘sequences’, …})
- logits (torch.tensor) – Predictions of the model
- sample (int) – Index of the sample to visualize. Defaults to 0.
Returns: True
if the user pressed stop, else False
.
-
-
class
miprometheus.models.thalnet.
ThalnetModule
(center_size, context_size, center_size_per_module, input_size, output_size)[source]¶ Implements a
ThalNet
module.
-
__init__
(center_size, context_size, center_size_per_module, input_size, output_size)[source]¶ Constructor of the
ThalnetModule
.
Parameters:
-
init_state
(batch_size)[source]¶ Initialize the state of a
ThalNet
module.
Parameters: batch_size (int) – batch size
Returns: center_state_per_module, tuple_controller_states
-
forward
(inputs, prev_center_state, prev_tuple_controller_state)[source]¶ Forward pass of a
ThalnetModule
.
Parameters: - inputs (torch.tensor) – input sequences.
- prev_center_state (torch.tensor) – previous center state
- prev_tuple_controller_state (tuple) – previous tuple controller state
Returns: output, center_feature_output, tuple_ctrl_state
-