Module jidenn.models.EFN
Module implementing the Energy Flow Network (EFN) model. See https://energyflow.network for the original implementation or the paper https://arxiv.org/abs/1810.05165 for more details.
The inputs are the jet constituents, i.e. the particles in the jet, forming an input of shape (batch_size, num_particles, num_features).
The angular features are mapped with the Phi mapping; afterwards the energy variables are multiplied by the mapped angular features
of each constituent separately and summed up. An F function is applied to the summed features, forming the output
$$ \mathrm{output} = F\left(\sum_i \pmb{E_i} \Phi{(\eta_i,\phi_i)}\right) $$
The F function can be any function applied to the summed features; the default is a fully-connected network. The mapping Phi can be a fully-connected network or a convolutional network; the default is a fully-connected network.
The energy features are not passed through the Phi mapping; they only weight the summed Phi outputs, which maintains the Infrared and Collinear safety of the model.
On the left is the PFN model, on the right the EFN model.
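The computation above can be sketched in plain NumPy (a stand-in for the TensorFlow implementation; `phi` and `F` below are hypothetical toy mappings with random weights, not the trained networks, and the energy is simplified to a scalar per constituent):

```python
import numpy as np

rng = np.random.default_rng(0)
num_particles, phi_dim, out_dim = 10, 16, 4

# Toy Phi mapping: angular features (eta, phi) -> latent space (hypothetical weights)
W_phi = rng.normal(size=(2, phi_dim))
def phi(angular):                                # (P, 2) -> (P, phi_dim)
    return np.maximum(angular @ W_phi, 0.0)      # ReLU-activated linear map

# Toy F function: latent summary -> output (hypothetical weights)
W_F = rng.normal(size=(phi_dim, out_dim))
def F(latent):                                   # (phi_dim,) -> (out_dim,)
    return latent @ W_F

energy = rng.uniform(size=num_particles)         # per-constituent energy E_i
angular = rng.normal(size=(num_particles, 2))    # (eta_i, phi_i)

# output = F( sum_i E_i * Phi(eta_i, phi_i) )
latent = np.sum(energy[:, None] * phi(angular), axis=0)
output = F(latent)
```

Because the energies enter only as linear weights on the Phi outputs, a constituent with zero energy contributes nothing to the sum, which is the infrared-safety property the module description refers to.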
Expand source code
r"""
Module implementing the Energy Flow Network (EFN) model.
See https://energyflow.network for original implementation or
the paper https://arxiv.org/abs/1810.05165 for more details.
The inputs are the jet constituents, i.e. the particles in the jet, forming an input of shape `(batch_size, num_particles, num_features)`.
The angular features are mapped with the Phi mapping; afterwards the energy variables are multiplied by the mapped angular features
of each constituent separately and summed up. An F function is applied to the summed features, forming the output
$$ \mathrm{output} = F\left(\sum_i \pmb{E_i} \Phi{(\eta_i,\phi_i)}\right) $$
The F function is any function, which is applied to the summed features. The default is a fully-connected network.
The mapping Phi can be a fully-connected network or a convolutional network. The default is a fully-connected network.
The energy features are not passed through the Phi mapping; they only weight the summed Phi outputs, which maintains the **Infrared and Collinear safety** of the model.
![EFN_PFN](images/pfn_efn.png)
On the left is the PFN model, on the right the EFN model.
"""
import tensorflow as tf
from typing import Optional, Callable, List, Literal, Tuple
class EinsumLayer(tf.keras.layers.Layer):
"""
This layer wraps the einsum operation, because a bare einsum call produces an error when loaded from a saved model with tf.keras.models.load_model.
For more information see https://github.com/keras-team/keras/issues/15783.
For more information about the einsum operation see https://www.tensorflow.org/api_docs/python/tf/einsum.
Example:
```python
x = EinsumLayer("bmhwf,bmoh->bmowf")((x1, x2))
```
Args:
equation (str): The equation to be used in the einsum operation.
"""
def __init__(self, equation: str):
super().__init__()
self.equation = equation
def call(self, inputs: Tuple[tf.Tensor, tf.Tensor]) -> tf.Tensor:
"""Call the layer.
Args:
inputs (Tuple[tf.Tensor, tf.Tensor]): The inputs to the layer with shapes described in the equation.
Returns:
tf.Tensor: The output of the layer.
"""
return tf.einsum(self.equation, *inputs)
def get_config(self):
return {"equation": self.equation}
class EFNModel(tf.keras.Model):
"""The Energy Flow Network model.
The input is expected to be a tensor of shape `(batch_size, num_particles, num_features=8)`,
where the last 2 features are angular and the first 6 features are energy-based.
The second dimension is ragged, as the number of particles in each jet is not the same.
See `jidenn.data.TrainInput.ConstituentVariables` for more details.
The model already contains the `tf.keras.layers.Input` layer, so it can be used as a standalone model.
Args:
input_shape (Tuple[None, int]): The shape of the input.
Phi_sizes (List[int]): The sizes of the Phi layers.
F_sizes (List[int]): The sizes of the F layers.
output_layer (tf.keras.layers.Layer): The output layer.
activation (Callable[[tf.Tensor], tf.Tensor]): The activation function for the Phi and F layers.
Phi_backbone (str, optional): The backbone of the Phi mapping. Options are "cnn" or "fc". Defaults to "fc".
This argument is not in the config option, as the CNN backbone is not used in the paper and
might violate the Infrared and Collinear safety of the model.
batch_norm (bool, optional): Whether to use batch normalization. Defaults to False.
This argument is not in the config option, as it is not used in the paper and
might violate the Infrared and Collinear safety of the model.
Phi_dropout (float, optional): The dropout rate for the Phi layers. Defaults to None.
F_dropout (float, optional): The dropout rate for the F layers. Defaults to None.
preprocess (tf.keras.layers.Layer, optional): The preprocessing layer. Defaults to None.
"""
def __init__(self,
input_shape: Tuple[None, int],
Phi_sizes: List[int],
F_sizes: List[int],
output_layer: tf.keras.layers.Layer,
activation: Callable[[tf.Tensor], tf.Tensor],
Phi_backbone: Literal["cnn", "fc"] = "fc",
batch_norm: bool = False,
Phi_dropout: Optional[float] = None,
F_dropout: Optional[float] = None,
preprocess: Optional[tf.keras.layers.Layer] = None):
self.Phi_sizes, self.F_sizes = Phi_sizes, F_sizes
self.Phi_dropout = Phi_dropout
self.F_dropout = F_dropout
self.activation = activation
input = tf.keras.layers.Input(shape=input_shape, ragged=True)
# Keep a reference to the raw Input layer: the model's `inputs` argument must be
# the Input layer itself, so preprocessing is applied to a separate tensor.
hidden = preprocess(input) if preprocess is not None else input
angular = hidden[:, :, 6:]
row_lengths = angular.row_lengths()
mask = tf.sequence_mask(row_lengths)
energy = hidden[:, :, :6].to_tensor()
angular = angular.to_tensor()
if batch_norm:
angular = tf.keras.layers.BatchNormalization()(angular)
energy = tf.keras.layers.BatchNormalization()(energy)
if Phi_backbone == "cnn":
angular = self.cnn_Phi(angular)
elif Phi_backbone == "fc":
angular = self.fc_Phi(angular)
else:
raise ValueError(f"backbone must be either 'cnn' or 'fc', not {Phi_backbone}")
angular = angular * tf.expand_dims(tf.cast(mask, tf.float32), -1)
hidden = EinsumLayer('BPC,BPD->BCD')((angular, energy))
hidden = tf.keras.layers.Flatten()(hidden)
hidden = self.fc_F(hidden)
output = output_layer(hidden)
super().__init__(inputs=input, outputs=output)
def cnn_Phi(self, inputs: tf.Tensor) -> tf.Tensor:
"""Convolutional Phi mapping.
Args:
inputs (tf.Tensor): The input tensor of shape `(batch_size, num_particles, num_features)`.
Returns:
tf.Tensor: The output tensor of shape `(batch_size, num_particles, Phi_sizes[-1])`.
"""
hidden = inputs
for size in self.Phi_sizes:
hidden = tf.keras.layers.Conv1D(size, 1)(hidden)
hidden = tf.keras.layers.BatchNormalization()(hidden)
hidden = tf.keras.layers.Activation(self.activation)(hidden)
if self.Phi_dropout is not None:
hidden = tf.keras.layers.Dropout(self.Phi_dropout)(hidden)
return hidden
def fc_Phi(self, inputs: tf.Tensor) -> tf.Tensor:
"""Fully connected Phi mapping.
Args:
inputs (tf.Tensor): The input tensor of shape `(batch_size, num_particles, num_features)`.
Returns:
tf.Tensor: The output tensor of shape `(batch_size, num_particles, Phi_sizes[-1])`.
"""
hidden = inputs
for size in self.Phi_sizes:
hidden = tf.keras.layers.Dense(size)(hidden)
hidden = tf.keras.layers.Activation(self.activation)(hidden)
if self.Phi_dropout is not None:
hidden = tf.keras.layers.Dropout(self.Phi_dropout)(hidden)
return hidden
def fc_F(self, inputs: tf.Tensor) -> tf.Tensor:
"""Fully connected F mapping.
Args:
inputs (tf.Tensor): The input tensor of shape `(batch_size, num_features)`.
Returns:
tf.Tensor: The output tensor of shape `(batch_size, F_sizes[-1])`.
"""
hidden = inputs
for size in self.F_sizes:
hidden = tf.keras.layers.Dense(size)(hidden)
hidden = tf.keras.layers.Activation(self.activation)(hidden)
if self.F_dropout is not None:
hidden = tf.keras.layers.Dropout(self.F_dropout)(hidden)
return hidden
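The masking step before the `'BPC,BPD->BCD'` contraction (zeroing the Phi outputs of padded constituents) can be illustrated with NumPy's `einsum`, which uses the same equation syntax as `tf.einsum`. The shapes below are arbitrary illustrations, not the model's actual dimensions:

```python
import numpy as np

rng = np.random.default_rng(1)
B, P, C, D = 2, 5, 8, 6           # batch, padded particles, Phi channels, energy features

phi_out = rng.normal(size=(B, P, C))   # Phi(angular), padded to P particles per jet
energy = rng.normal(size=(B, P, D))    # energy features, padded to P particles per jet

# Mask: first jet has 3 real constituents, second has 5 (like tf.sequence_mask)
lengths = np.array([3, 5])
mask = np.arange(P)[None, :] < lengths[:, None]        # (B, P) boolean

masked_phi = phi_out * mask[:, :, None]                # zero out padded rows
hidden = np.einsum('BPC,BPD->BCD', masked_phi, energy) # outer product summed over P
flat = hidden.reshape(B, -1)                           # Flatten -> (B, C*D)
```

Because the padded rows of `masked_phi` are zero, they drop out of the sum over `P`; the contraction over the real constituents alone gives the same result.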
Classes
class EFNModel (input_shape: Tuple[None, int], Phi_sizes: List[int], F_sizes: List[int], output_layer: tf.keras.layers.Layer, activation: Callable[[tf.Tensor], tf.Tensor], Phi_backbone: Literal['cnn', 'fc'] = 'fc', batch_norm: bool = False, Phi_dropout: Optional[float] = None, F_dropout: Optional[float] = None, preprocess: Optional[tf.keras.layers.Layer] = None)
-
The Energy Flow Network model.
The input is expected to be a tensor of shape `(batch_size, num_particles, num_features=8)`, where the last 2 features are angular and the first 6 features are energy-based. The second dimension is ragged, as the number of particles in each jet is not the same. See `jidenn.data.TrainInput.ConstituentVariables` for more details. The model already contains the `tf.keras.layers.Input` layer, so it can be used as a standalone model.
Args
input_shape : Tuple[None, int]
- The shape of the input.
Phi_sizes : List[int]
- The sizes of the Phi layers.
F_sizes : List[int]
- The sizes of the F layers.
output_layer : tf.keras.layers.Layer
- The output layer.
activation : Callable[[tf.Tensor], tf.Tensor]
- The activation function for the Phi and F layers.
Phi_backbone : str, optional
- The backbone of the Phi mapping. Options are "cnn" or "fc". Defaults to "fc". This argument is not in the config option, as the CNN backbone is not used in the paper and might violate the Infrared and Collinear safety of the model.
batch_norm : bool, optional
- Whether to use batch normalization. Defaults to False. This argument is not in the config option, as it is not used in the paper and might violate the Infrared and Collinear safety of the model.
Phi_dropout : float, optional
- The dropout rate for the Phi layers. Defaults to None.
F_dropout : float, optional
- The dropout rate for the F layers. Defaults to None.
preprocess : tf.keras.layers.Layer, optional
- The preprocessing layer. Defaults to None.
Ancestors
- keras.engine.training.Model
- keras.engine.base_layer.Layer
- tensorflow.python.module.module.Module
- tensorflow.python.trackable.autotrackable.AutoTrackable
- tensorflow.python.trackable.base.Trackable
- keras.utils.version_utils.LayerVersionSelector
- keras.utils.version_utils.ModelVersionSelector
Methods
def cnn_Phi(self, inputs: tf.Tensor) -> tf.Tensor
-
Convolutional Phi mapping.
Args
inputs : tf.Tensor
- The input tensor of shape (batch_size, num_particles, num_features).
Returns
tf.Tensor
- The output tensor of shape (batch_size, num_particles, Phi_sizes[-1]).
def fc_F(self, inputs: tf.Tensor) -> tf.Tensor
-
Fully connected F mapping.
Args
inputs : tf.Tensor
- The input tensor of shape (batch_size, num_features).
Returns
tf.Tensor
- The output tensor of shape (batch_size, F_sizes[-1]).
def fc_Phi(self, inputs: tf.Tensor) -> tf.Tensor
-
Fully connected Phi mapping.
Args
inputs : tf.Tensor
- The input tensor of shape (batch_size, num_particles, num_features).
Returns
tf.Tensor
- The output tensor of shape (batch_size, num_particles, Phi_sizes[-1]).
class EinsumLayer (equation: str)
-
This layer wraps the einsum operation, because a bare einsum call produces an error when loaded from a saved model with tf.keras.models.load_model. For more information see https://github.com/keras-team/keras/issues/15783. For more information about the einsum operation see https://www.tensorflow.org/api_docs/python/tf/einsum.
Example:
x = EinsumLayer("bmhwf,bmoh->bmowf")((x1, x2))
Args
equation : str
- The equation to be used in the einsum operation.
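As a sanity check on the example equation, the same contraction can be traced with NumPy (`np.einsum` shares `tf.einsum`'s equation syntax; the shapes below are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
b, m, h, w, f, o = 2, 3, 4, 5, 6, 7

x1 = rng.normal(size=(b, m, h, w, f))
x2 = rng.normal(size=(b, m, o, h))

# "bmhwf,bmoh->bmowf": sum over the shared h axis while keeping the
# batch axes b, m; the o axis pairs against the surviving w, f axes
out = np.einsum('bmhwf,bmoh->bmowf', x1, x2)
```

Each output element is `out[b,m,o,w,f] = sum_h x1[b,m,h,w,f] * x2[b,m,o,h]`, so the result has shape `(b, m, o, w, f)`.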
Ancestors
- keras.engine.base_layer.Layer
- tensorflow.python.module.module.Module
- tensorflow.python.trackable.autotrackable.AutoTrackable
- tensorflow.python.trackable.base.Trackable
- keras.utils.version_utils.LayerVersionSelector
Methods
def call(self, inputs: Tuple[tf.Tensor, tf.Tensor]) -> tf.Tensor
-
Call the layer.
Args
inputs : Tuple[tf.Tensor, tf.Tensor]
- The inputs to the layer with shapes described in the equation.
Returns
tf.Tensor
- The output of the layer.
def get_config(self)
-
Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).
Note that get_config() does not guarantee to return a fresh copy of the dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.
Returns
Python dictionary.