# Config

## Launch a Scan

First, let's initialize our client:

```python
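# Import path assumed from the Picsellia SDK; adjust it to your installed version.
from picsellia.client import Client

api_token = ""      # your Picsellia API token
project_token = ""  # token of the project that will own the Scan
client = Client.Experiment(api_token=api_token, project_token=project_token)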
```

Here's an example of how to configure a Scan:

```python
config = {
    'script': 'script.py',
    'execution': {
        'type': 'agents'
    },
    'strategy': 'grid',
    'metric': {
        'name': 'Loss-total_loss',
        'goal': 'minimize'
    },
    'parameters': {
        'batch_size': {
            'values': [2, 4, 8],
        },
        'learning_rate': {
            'values': [1e-3, 1e-4, 1e-5]
        },
        'steps': {
            'value': 1000
        },
        'annotation_type': {
            'value': 'rectangle'
        }
    },
    'base_model': 'picsell/ssd-mobilenet-v2-640-fpnlite',
    'dataset': 'SampleDataset/first'
}
client.init_scan('test-scan-1', config)
```
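
The example above uses the `grid` strategy, which tries every combination of parameter values. As a rough illustration of what that means for this configuration, the sketch below expands the same `parameters` block locally. The `expand_grid` helper is hypothetical and is not part of the Picsellia SDK; it only assumes that a single `value` behaves like a one-element list.

```python
from itertools import product

parameters = {
    "batch_size": {"values": [2, 4, 8]},
    "learning_rate": {"values": [1e-3, 1e-4, 1e-5]},
    "steps": {"value": 1000},
    "annotation_type": {"value": "rectangle"},
}


def expand_grid(parameters):
    """List every combination of parameter values, one dict per run (hypothetical helper)."""
    names = list(parameters)
    # A single 'value' is treated as a one-element list of candidates.
    candidates = [
        spec["values"] if "values" in spec else [spec["value"]]
        for spec in parameters.values()
    ]
    return [dict(zip(names, combo)) for combo in product(*candidates)]


runs = expand_grid(parameters)
print(len(runs))  # 9 (3 batch sizes x 3 learning rates)
print(runs[0])
```

Counting the combinations this way before launching a Scan gives a quick estimate of how many runs a grid will produce.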

## Configuration

| Top-level key   | Description                                                                                                                                       |
| --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| execution       | How you want to run the Scan (manually, remotely, using agents)                                                                                   |
| image           | Name of the docker image executing your training script (optional)                                                                                |
| script          | Filename of the script you want to execute (optional)                                                                                             |
| requirements    | List of packages needed if you use our base Docker image (optional)                                                                               |
| strategy        | The search strategy for the Scan (required)                                                                                                       |
| max\_run        | Maximum number of runs for this Scan (optional, default = 100)                                                                                    |
| early\_stopping | The chosen early-stopping or pruning algorithm (optional)                                                                                         |
| metric          | The metric to optimize (required)                                                                                                                 |
| parameters      | The parameter space used for the search (required)                                                                                                |
| base\_model     | A model used to start each run (as in [experiment init](/picsellia/experiment-tracking/initialize-an-experiment.md#base-architecture)) (optional) |
| dataset         | A dataset used to start each run (as in [experiment init](/picsellia/experiment-tracking/initialize-an-experiment.md#dataset)) (optional)         |

### execution

Specify how you want to run the Scan.

| execution | Description |
| --------- | ----------- |
| manual    | We only define the grid of parameters. You then use our Python SDK to access each run with its set of parameters and execute it yourself, for example in a script or a Jupyter notebook. |
| remote    | We automatically launch remote runs for you on servers equipped with NVIDIA V100S. You must set `max_worker` to the maximum number of parallel runs. |
| agents    | You launch our agents on any machine, and runs are automatically dispatched across them (see [our CLI](/picsellia/hyperparameter-tuning/install-picsell-cli.md) for more information). |

{% hint style="warning" %}
You need to subscribe to a paid plan to launch remote Scans.
{% endhint %}

{% tabs %}
{% tab title="manual" %}

```python
'execution': {
    'type': 'manual'
}
```

{% endtab %}

{% tab title="agents" %}

```python
'execution': {
    'type': 'agents'
}
```

{% endtab %}

{% tab title="remote" %}

```python
'execution': {
    'type': 'remote',
    'max_worker': 4,
}
```

{% endtab %}
{% endtabs %}

### image

Our Scan engine is based on Docker images that we schedule and launch on distributed machines, which can be your own computer or a cloud server hosted by Picsellia.

If you do not specify an `image` parameter, we use our base image (called `custom-run:1.0`), which encapsulates the script you provided, installs the specified requirements, and then launches your script.

{% hint style="warning" %}
If you do not provide a `script` parameter, you must specify a custom image that runs your code, because our base image needs a script to launch.
{% endhint %}

To save time on package installation, or to make sure your script runs reliably, we encourage you to build your own custom image and push it to Docker Hub so that we can run it remotely, or make it available on every machine where you want to launch our agents.

To specify a custom image, give its name as shown below:

```python
'image': 'picsellpn/custom-run:1.0'
```

### script

If you want your training script to be launched automatically without copying it to every machine, specify the path to the file. It will be saved on Picsellia and used for each run.

{% hint style="warning" %}
Providing a script is mandatory if you do not want to define custom Docker images and use our base image instead.
{% endhint %}

```python
'script': 'my_training_script.py'
```
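
The rules stated so far (the keys marked as required in the top-level table, and the need for either a `script` or a custom `image`) can be checked locally before a Scan is created. The following is a minimal sketch under those assumptions; `check_scan_config` is a hypothetical helper and not part of the Picsellia SDK, and the platform may apply further checks of its own.

```python
REQUIRED_KEYS = ("strategy", "metric", "parameters")


def check_scan_config(config):
    """Return a list of problems found in a Scan config dict (empty if none)."""
    problems = [
        f"missing required key: {key!r}"
        for key in REQUIRED_KEYS
        if key not in config
    ]
    # The base image needs a script; without a script, a custom image is required.
    if "script" not in config and "image" not in config:
        problems.append("provide either 'script' or 'image'")
    return problems


config = {
    "script": "script.py",
    "execution": {"type": "agents"},
    "strategy": "grid",
    "parameters": {"batch_size": {"values": [2, 4, 8]}},
}
print(check_scan_config(config))  # ["missing required key: 'metric'"]
```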

### requirements

Specify this parameter if your script needs specific Python packages to be installed when using our base image.

For example, as our image only has the `picsellia` package installed, if you need tensorflow 2.3.1 to run your script, set the requirements as below:

```python
'requirements': {
    'package': 'tensorflow',
    'version': '2.3.1'
}
```

Alternatively, you can set `requirements` to the path of a `requirements.txt` file:

```python
'requirements': 'path/to/requirements.txt'
```

With the `requirements.txt` file looking like this:

```text
tensorflow==2.3.1
numpy==1.20.2
```

### strategy

Choose a search strategy from the following options:

| strategy | Description |
| -------- | ----------- |
| grid     | Grid search: tries every combination of parameter values. |
| optuna   | Optimized hyperparameter search using the [Optuna library](https://optuna.readthedocs.io/en/stable/tutorial/10_key_features/001_first.html). To use this strategy, you have to set up a distribution or specific values for every parameter. |

### max\_run

When you perform a hyperparameter search, you never really know how many runs will be needed to find the best combination. For example, if you choose the Optuna strategy for your Scan, the parameters of future runs are computed according to the results of past runs.

That is why you can set a `max_run` parameter: it guarantees that your Scan stops after a bounded number of runs instead of consuming resources indefinitely, and you can then create a new Scan with a reduced search space.

{% hint style="info" %}
If you do not specify this parameter, the default value is set to 100 runs.
{% endhint %}

```python
'max_run': 256
```
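
To show how `max_run` fits with an open-ended search, here is a sketch of a complete configuration for the `optuna` strategy. It only uses keys documented on this page; the specific values (run cap, bounds, metric name) are illustrative assumptions. The loop at the end is a local sanity check, not an SDK call.

```python
config = {
    "script": "script.py",
    "execution": {"type": "agents"},
    "strategy": "optuna",
    "max_run": 50,  # cap the Scan at 50 runs instead of the default 100
    "metric": {"name": "Loss-total_loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"distribution": "log_uniform", "min": 1e-5, "max": 1e-3},
        "batch_size": {"values": [2, 4, 8]},
        "steps": {"value": 1000},
    },
}

# The optuna strategy expects a distribution or explicit value(s) for every parameter.
missing = [
    name
    for name, spec in config["parameters"].items()
    if not spec.keys() & {"distribution", "value", "values"}
]
print(missing)  # [] when every parameter is fully specified
```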

### early\_stopping (coming soon)

Early stopping is an optional feature that can drastically speed up your hyperparameter search by deciding whether a run should be stopped early or given a chance to continue.

If some runs are not promising, they are automatically stopped and the agents get a new set of parameters to try, so you do not waste time on unnecessary experiments.

| method    | description                                                                                                                                                                                                      |
| --------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| hyperband | Implementation of the [Hyperband](https://www.jmlr.org/papers/volume18/16-558/16-558.pdf) Algorithm by [Optuna](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.pruners.HyperbandPruner.html) |

#### Parameters

| parameter         | description                                                                                                    |
| ----------------- | -------------------------------------------------------------------------------------------------------------- |
| min\_iter         | The minimum number of iterations (e.g. training epochs or steps) to wait before deciding whether to prune the run |
| max\_iter         | The maximum number of iterations to wait before you either prune the run or let it finish                         |
| reduction\_factor | At the completion point of each rung, about 1/reduction\_factor trials will be promoted.                       |

```python
'early_stopping': {
    'hyperband': {
        'min_iter': 100,
        'max_iter': 1000,
        'reduction_factor': 3,
    }
}
```
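
To build intuition for how these three parameters interact, the sketch below computes the checkpoints ("rungs") of a single successive-halving bracket, which is the building block Hyperband repeats with different budgets. It is a simplified illustration under that assumption, not the exact schedule used by Optuna or Picsellia; `rung_levels` and `promotions` are hypothetical helpers.

```python
def rung_levels(min_iter, max_iter, reduction_factor):
    """Iteration counts at which runs are compared, growing geometrically."""
    levels = []
    level = min_iter
    while level <= max_iter:
        levels.append(level)
        level *= reduction_factor
    return levels


def promotions(n_runs, levels, reduction_factor):
    """Pair each rung with the number of runs that reach it,
    keeping about 1/reduction_factor of the runs after every rung."""
    reached = []
    for level in levels:
        reached.append((level, n_runs))
        n_runs = max(1, n_runs // reduction_factor)
    return reached


levels = rung_levels(100, 1000, 3)
print(levels)                     # [100, 300, 900]
print(promotions(27, levels, 3))  # [(100, 27), (300, 9), (900, 3)]
```

With the values from the example above, a run is first evaluated after 100 iterations, and only about a third of the runs continue past each checkpoint.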

### metric

The name of the metric you want to optimize, and the way you want to optimize it.

{% tabs %}
{% tab title="Maximize" %}

```python
'metric': {
    'name': 'loss',
    'goal': 'maximize'
}
```

{% endtab %}

{% tab title="Minimize" %}

```python
'metric': {
    'name': 'loss',
    'goal': 'minimize'
}
```

{% endtab %}
{% endtabs %}

For the Scan to run properly, you must explicitly log the metric somewhere in the script you use. This means that you should have a line looking like this:

```python
experiment.log('loss', value, 'line')
```

The name of what you log must correspond to the metric name you set up during configuration.
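
One way to keep the logged name and the configured name in sync is to read the name from the same `metric` block in both places. The sketch below illustrates this with a stand-in object: `RecordingExperiment` is hypothetical and only mimics the `log(name, value, 'line')` call shown above so that the snippet runs without the SDK. The name of its third argument is also an assumption.

```python
class RecordingExperiment:
    """Hypothetical stand-in for the SDK experiment object; it only stores what is logged."""

    def __init__(self):
        self.logged = {}

    def log(self, name, value, chart_type):
        # Group logged values by metric name, in the order they arrive.
        self.logged.setdefault(name, []).append(value)


metric = {"name": "loss", "goal": "minimize"}  # same block as in the Scan config
experiment = RecordingExperiment()

for step in range(1, 4):
    loss = 1.0 / step  # placeholder for the real training loss
    # The logged name comes from the config, so the two cannot drift apart.
    experiment.log(metric["name"], loss, "line")

print(list(experiment.logged))  # ['loss']
```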

### parameters

Specify the hyperparameter space to explore. You can either set up a list of constant values for each parameter or just choose a distribution and the bounds (for `optuna`).

| Key (type)                      | Description                                                                          |
| ------------------------------- | ------------------------------------------------------------------------------------ |
| value (int, float, str)         | Single value for hyperparameter                                                      |
| values (list\[int, float, str]) | List of all values for hyperparameter                                                |
| distribution (str)              | Choose an available distribution from the list below (available for optuna strategy) |
| min (int, float)                | Minimum value for hyperparameter. It's the lower bound for the chosen distribution   |
| max (int, float)                | Maximum value for hyperparameter. It's the upper bound for the chosen distribution   |
| q (float)                       | Quantization step size for discrete hyperparameters                                  |
| step (int)                      | Step size between values (for `int_uniform` distribution)                            |

{% tabs %}
{% tab title="grid - value" %}

```python
'parameter_name': {
    'value': 0.0001
}
```

{% endtab %}

{% tab title="grid - values" %}

```python
'parameter_name': {
    'values': ['relu', 'elu', 'selu']
}
```

{% endtab %}

{% tab title="optuna - log\_uniform" %}

```python
'parameter_name': {
    'distribution': 'log_uniform',
    'min': 1e-5,
    'max': 1e-3
}
```

{% endtab %}
{% endtabs %}

### distributions

Here is the list of all the distributions you can use:

| Name              | Information                                                                                                                                          |
| ----------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
| constant          | Constant value for hyperparameter, equal to `value`.                                                                                                 |
| categorical       | Categorical distribution, hyperparameter value will be chosen from `values`                                                                          |
| uniform           | Continuous uniform distribution. You must set the bounds `min` and `max`.                                                                            |
| int\_uniform      | Discrete uniform distribution for integers. You must set the bounds `min` and `max`. You can also set `step` to a value higher than 1.              |
| discrete\_uniform | Discrete uniform distribution. You must set the bounds `min` and `max`, and the `q` parameter (step of discretization)                               |
| log\_uniform      | Continuous uniform distribution in the log domain. You must set up the bounds `min` and `max`.                                                       |

#### Examples

{% tabs %}
{% tab title="constant" %}

```python
'parameter_name': {
    'value': 0.3546
}
```

{% endtab %}

{% tab title="categorical" %}

```python
'parameter_name': {
    'values': ['relu', 'elu', 'selu']
}
```

{% endtab %}

{% tab title="uniform" %}

```python
'parameter_name': {
    'distribution': 'uniform',
    'min': 1e-5,
    'max': 1e-2,
}
```

{% endtab %}

{% tab title="int\_uniform" %}

```python
'parameter_name': {
    'distribution': 'int_uniform',
    'min': 2,
    'max': 16,
    'step': 2
}
```

{% endtab %}

{% tab title="discrete\_uniform" %}

```python
'parameter_name': {
    'distribution': 'discrete_uniform',
    'min': 0.1,
    'max': 1,
    'q': 0.1
}
```

{% endtab %}

{% tab title="log\_uniform" %}

```python
'parameter_name': {
    'distribution': 'log_uniform',
    'min': 1e-2,
    'max': 1e3,
}
```

{% endtab %}
{% endtabs %}
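
To make the meaning of each distribution concrete, the sketch below draws one value from a parameter specification using only Python's standard library. It is an illustration of the semantics described in the table, not the sampler that Optuna or Picsellia actually uses; the `sample` helper is hypothetical.

```python
import math
import random


def sample(spec, rng=random):
    """Draw one value from a parameter spec (illustrative, not Optuna's sampler)."""
    if "value" in spec:
        return spec["value"]  # constant
    if "values" in spec:
        return rng.choice(spec["values"])  # categorical
    dist, low, high = spec["distribution"], spec["min"], spec["max"]
    if dist == "uniform":
        return rng.uniform(low, high)
    if dist == "int_uniform":
        # Integers from min to max (inclusive), spaced by 'step'.
        return rng.randrange(low, high + 1, spec.get("step", 1))
    if dist == "discrete_uniform":
        # Values min, min + q, min + 2q, ... up to max.
        n_steps = round((high - low) / spec["q"])
        return low + spec["q"] * rng.randint(0, n_steps)
    if dist == "log_uniform":
        # Uniform in log space, so each order of magnitude is equally likely.
        return math.exp(rng.uniform(math.log(low), math.log(high)))
    raise ValueError(f"unknown distribution: {dist}")


rng = random.Random(0)  # seeded for repeatable draws
print(sample({"distribution": "int_uniform", "min": 2, "max": 16, "step": 2}, rng))
print(sample({"distribution": "log_uniform", "min": 1e-2, "max": 1e3}, rng))
```

The `log_uniform` case is the usual choice for learning rates, where the useful range spans several orders of magnitude.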

### base\_model

As when you [create an experiment](/picsellia/experiment-tracking/initialize-an-experiment.md), you can choose a model whose files, labelmap, and so on will be duplicated in each run's experiment.

To choose a model, specify the username of the author and the model name this way: `<username>/<model_name>`

```python
'base_model': 'picsell/faster-rcnn-resnet-640'
```

### dataset

As when you create an experiment, you can choose a dataset to attach to your run. To do this, **the dataset must already be attached to the project**. Then specify the chosen dataset this way: `<dataset_name>/<dataset_version>`

```python
'dataset': 'SampleDataset/first'
```


