Module jidenn.config

This module to fully configure the data preparation, training and evaulation. The configuration is done using the hydra package https://hydra.cc/docs/intro/. The basic idea is that the user can specify a configuration file in yaml format such that it follows the structure of the JIDENNConfig dataclass. The hydra package reads the configuration file and creates a JIDENNConfig object that is passed to the main function with the following decorator:

@hydra.main(version_base="1.2", config_path="jidenn/config", config_name="config")
def main(args: config.JIDENNConfig) -> None:
    some_option = args.some_option

config_path is the path to the folder where the configuration file config.yaml (config_name option) is located. The configured variables are then accessible via the args object's attributes.

To match the configuration file to the JIDENNConfig class with the .yaml file the argument of the main() must be matched to the JIDENNConfig class. This is done by the following lines:

from hydra.core.config_store import ConfigStore
from jidenn.config import config

cs = ConfigStore.instance()
cs.store(name="args", node=config.JIDENNConfig)

Dataclasses can be nested, each nested dataclass represents a sub-configuration, i.e. an indentation in the yaml file. The following dataclasses

from dataclasses import dataclass

@dataclass
class SubConfig1:
    sub_config1_arg1: str
    sub_config1_arg2: int

@dataclass
class SubConfig2:
    sub_config2_arg1: str
    sub_config2_arg2: int

@dataclass
class GeneralConfig:
    sub_config1: SubConfig1
    sub_config2: SubConfig2

@dataclass
class JIDENNConfig:
    general: GeneralConfig
    ...

are represented in the yaml file as:

general:
  sub_config1:
    sub_config1_arg1: "test"
    sub_config1_arg2: 1
  sub_config2:
    sub_config2_arg1: "train"
    sub_config2_arg2: 2

Optionally the .yaml files can be split into multiple files. This is done by addign the defaults entry in the main .yaml file. The defaults entry is a list:

defaults:
    - data: data2
    - _self_

The second .yaml file must be located in the jidenn/config/data folder (generaly in the config_path/data, where config_path is the path specified in the @hydra.main decorator) and must be named data2.yaml. The _self_ entry tells hydra to overwrite the default, so if you change some entry in the main .yaml file, it will overwrite the defaults.

Warning

All dataclasses must have type annotation, so the user can see what options are available. The Literal type is used to specify concrete options for a string variable that are available.

Some configuration options are Optional. In python, missing arguments are assumed to be None. In the .yaml file, this can be achieved by omitting the argument or by setting it to null:

# Two ways to set an argument to None 
optional_arg: null
optional_arg: 

Hydra also provides a way to override the configuration file from the command line:

python3 train.py general.sub_config1.sub_config1_arg2=2

For more information on how to use hydra, see https://hydra.cc/docs/intro/

Expand source code
"""
This module to fully configure the data preparation, training and evaulation.
The configuration is done using the hydra package https://hydra.cc/docs/intro/.
The basic idea is that the user can specify a configuration file in yaml format
such that it follows the structure of the `jidenn.config.config.JIDENNConfig` dataclass. 
The hydra package reads the configuration file and creates a `jidenn.config.config.JIDENNConfig` 
object that is passed to the main function with the following decorator:

```python
@hydra.main(version_base="1.2", config_path="jidenn/config", config_name="config")
def main(args: config.JIDENNConfig) -> None:
    some_option = args.some_option
```
`config_path` is the path to the folder where the configuration file 
`config.yaml` (`config_name` option) is located. The configured variables
are then accessible via the `args` object's attributes.

To match the configuration file to the `jidenn.config.config.JIDENNConfig` class 
with the `.yaml` file the argument of the `main()` must be matched to the
`jidenn.config.config.JIDENNConfig` class. This is done by the following lines:
```python
from hydra.core.config_store import ConfigStore
from jidenn.config import config

cs = ConfigStore.instance()
cs.store(name="args", node=config.JIDENNConfig)
```

Dataclasses can be nested, each nested dataclass represents a sub-configuration,
i.e. an indentation in the yaml file. 
The following dataclasses
```python
from dataclasses import dataclass

@dataclass
class SubConfig1:
    sub_config1_arg1: str
    sub_config1_arg2: int

@dataclass
class SubConfig2:
    sub_config2_arg1: str
    sub_config2_arg2: int

@dataclass
class GeneralConfig:
    sub_config1: SubConfig1
    sub_config2: SubConfig2
    
@dataclass
class JIDENNConfig:
    general: GeneralConfig
    ...
```  
are represented in the yaml file as:
```yaml
general:
  sub_config1:
    sub_config1_arg1: "test"
    sub_config1_arg2: 1
  sub_config2:
    sub_config2_arg1: "train"
    sub_config2_arg2: 2
```

Optionally the `.yaml` files can be split into multiple files. This is done by
addign the `defaults` entry in the main `.yaml` file. The `defaults` entry is a list:
```yaml
defaults:
    - data: data2
    - _self_
```
The second `.yaml` file must be located in the `jidenn/config/data` folder (generaly in the `config_path/data`, 
where `config_path` is the path specified in the `@hydra.main` decorator)  and must
be named `data2.yaml`. The `_self_` entry tells hydra to overwrite the default,
so if you change some entry in the main `.yaml` file, it will overwrite the `defaults`.

.. warning:: 
 
    All dataclasses **must** have type annotation, so the user can see what options
    are available. The `Literal` type is used to specify concrete options for a string 
    variable that are available.



Some configuration options are `Optional`. In python, missing arguments are
assumed to be `None`. In the `.yaml` file, this can be achieved by omitting
the argument or by setting it to `null`:
```yaml
# Two ways to set an argument to None 
optional_arg: null
optional_arg: 
```

Hydra also provides a way to override the configuration file from the command line:
```bash
python3 train.py general.sub_config1.sub_config1_arg2=2
```
For more information on how to use hydra, see https://hydra.cc/docs/intro/
"""

Sub-modules

jidenn.config.config

General configurations for JIDENN. It includes training and data preparation configurations. The model configurations are defined separately in …

jidenn.config.eval_config

Configuration for evaluation of a model. The Data configuration is equivalent to the one used for training. Evaulation of a …

jidenn.config.model_config

Model configurations. Each model is a subclass of the Model dataclass. For more information about individual models, see the jidenn.models module.