Experiment
lit.sdk.model.experiment
This module provides functionality for managing and controlling experiments. An experiment represents a model's training session and includes methods for starting, stopping, resuming, and querying experiments.
Experiment
Represents a model's experiment, or training session.
is_running = self.meta.get('is_running', False)
instance-attribute
Whether the experiment is currently running.
meta = meta or {}
instance-attribute
Additional metadata.
runid = runid
instance-attribute
The index of the run.
team = team
instance-attribute
The name of a team.
__init__(team, runid, meta=None)
Initializes a new instance of Experiment.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| team | str | The team which owns the Experiment. | required |
| runid | int | The run id of the Experiment. | required |
| meta | dict \| None | Extra optional metadata for the Experiment. Defaults to None. | None |
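The attribute assignments listed for the class suggest how the constructor derives its state. The following is a minimal stand-in sketch, not the SDK's actual source: it mirrors only the four documented assignments, so the defaulting of `meta` and `is_running` can be seen in isolation. The team name and run ids are made up for illustration.

```python
class Experiment:
    """Stand-in mirroring the documented attributes; not the real lit.sdk class."""

    def __init__(self, team, runid, meta=None):
        self.team = team              # name of the owning team
        self.runid = runid            # index of the run
        self.meta = meta or {}        # None (or any falsy value) becomes {}
        # is_running is read from the metadata and defaults to False
        self.is_running = self.meta.get('is_running', False)


stopped = Experiment("demo-team", 3)
running = Experiment("demo-team", 4, meta={"is_running": True})
print(stopped.is_running, running.is_running)  # prints: False True
```

Because `is_running` is computed once from `meta` in this sketch, changing `meta` afterwards would not update it; whether the real class behaves the same way is not stated here.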
resume(device, smartfit=False, session_name=None)
Resumes a stopped experiment.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| device | str | The name of a device, e.g. gpu4. | required |
| session_name | str \| None | The name of the previously running experiment. Defaults to None. | None |
Raises:
| Type | Description |
|---|---|
| LITModelException | If an error occurred while resuming the experiment. |
Returns:
| Type | Description |
|---|---|
| str | The name of the resumed experiment session. |
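A caller will often want to resume only experiments that are not already running and to handle `LITModelException` without crashing. The sketch below shows one way to wrap `resume()` under those assumptions. `resume_if_stopped` and `FakeExperiment` are hypothetical names introduced here, and the exception class is a local stand-in because the import path of the real one is not given in this page.

```python
class LITModelException(Exception):
    """Local stand-in for the SDK exception of the same name."""


def resume_if_stopped(experiment, device, session_name=None):
    """Resume `experiment` unless it is already running.

    Returns the session name reported by resume(), or None when nothing
    was done or the resume attempt raised LITModelException.
    """
    if experiment.is_running:
        return None
    try:
        return experiment.resume(device, session_name=session_name)
    except LITModelException as err:
        print(f"resume failed for run {experiment.runid}: {err}")
        return None


class FakeExperiment:
    """Hypothetical test double exposing only what the helper needs."""

    def __init__(self, runid, is_running=False, fail=False):
        self.runid = runid
        self.is_running = is_running
        self._fail = fail

    def resume(self, device, smartfit=False, session_name=None):
        if self._fail:
            raise LITModelException("device unavailable")
        # The session-name format is invented for this demo.
        return session_name or f"run{self.runid}-{device}"


print(resume_if_stopped(FakeExperiment(1), "gpu4"))                    # prints run1-gpu4
print(resume_if_stopped(FakeExperiment(2, is_running=True), "gpu4"))   # prints None
```

With the real SDK, the `except` clause would need to reference the SDK's own `LITModelException` rather than the stand-in defined above.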
stop()
Stops the running experiment.
Raises:
| Type | Description |
|---|---|
| LITModelException | If an error occurred while stopping the experiment. |
Returns:
| Type | Description |
|---|---|
| int | The exit code from the killed process. |
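Since `stop()` returns an exit code, a caller stopping several experiments may want to keep those codes per run. The following sketch collects them into a dictionary keyed by `runid`. `stop_all` and `FakeExperiment` are hypothetical, and the value `0` returned by the fake is an assumption; the meaning of particular exit codes is not described here.

```python
class FakeExperiment:
    """Hypothetical test double; the real class talks to the platform."""

    def __init__(self, runid, is_running):
        self.runid = runid
        self.is_running = is_running

    def stop(self):
        self.is_running = False
        return 0  # assumed value: pretend the process exited cleanly


def stop_all(experiments):
    """Stop each running experiment; return {runid: exit code from stop()}."""
    exit_codes = {}
    for exp in experiments:
        if exp.is_running:
            exit_codes[exp.runid] = exp.stop()
    return exit_codes


runs = [FakeExperiment(0, True), FakeExperiment(1, False), FakeExperiment(2, True)]
codes = stop_all(runs)
print(codes)  # prints {0: 0, 2: 0}
```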
get_experiments(team)
Gets a list of all experiments, both running and stopped.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| team | str | The name of the team. |  required |
Raises:
| Type | Description |
|---|---|
| LITModelException | If an error occurred while getting experiments. |
Returns:
| Type | Description |
|---|---|
| list[Experiment] | A list of experiment objects. |
Examples:
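The original example is not available here. As an illustrative sketch only, the code below partitions a list of experiment objects by their `is_running` attribute, which is the kind of post-processing a returned list lends itself to. `split_by_state` is a hypothetical helper, and the objects are `SimpleNamespace` stand-ins carrying only the documented attributes so the block runs without the SDK.

```python
from types import SimpleNamespace


def split_by_state(experiments):
    """Partition experiments into (running, stopped) using is_running."""
    running = [exp for exp in experiments if exp.is_running]
    stopped = [exp for exp in experiments if not exp.is_running]
    return running, stopped


# Stand-in objects. With the SDK installed, the list would presumably come
# from get_experiments("demo-team") instead (team name is made up).
experiments = [
    SimpleNamespace(team="demo-team", runid=0, is_running=False),
    SimpleNamespace(team="demo-team", runid=1, is_running=True),
    SimpleNamespace(team="demo-team", runid=2, is_running=False),
]

running, stopped = split_by_state(experiments)
print([exp.runid for exp in running])   # prints [1]
print([exp.runid for exp in stopped])   # prints [0, 2]
```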
halt_experiments(team)
Immediately halts all running experiments for a given team.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| team | str | The name of the team. | required |
Raises:
| Type | Description |
|---|---|
| LITModelException | If an error occurred while halting all experiments. |
start_experiment(team, name, device, smartfit=False)
Starts a new experiment (training session) for the named canvas on the given device.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| team | str | The name of the team. | required |
| name | str | The name of the canvas. | required |
| device | str | The name of the device, e.g. gpu0 or cpu. | required |
Raises:
| Type | Description |
|---|---|
| LitModelError | If an error occurred while starting the experiment. |
Returns:
| Type | Description |
|---|---|
| str | The name of the training session. |
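Because starting can fail with `LitModelError`, one possible calling pattern is to try a list of devices in order and keep the first that succeeds. The sketch below illustrates this under stated assumptions: `start_on_first_available` and `fake_start` are hypothetical, the exception class is a local stand-in, and the canvas name, device availability, and session-name format are invented for the demo.

```python
class LitModelError(Exception):
    """Local stand-in for the exception named in the Raises table."""


def start_on_first_available(start, team, name, devices):
    """Try each device in order; return (device, session_name) on success.

    `start` is a callable with the documented signature
    start(team, name, device, smartfit=False). The last error is re-raised
    if every device fails.
    """
    last_error = None
    for device in devices:
        try:
            return device, start(team, name, device)
        except LitModelError as err:
            last_error = err
    if last_error is None:
        raise ValueError("devices must not be empty")
    raise last_error


def fake_start(team, name, device, smartfit=False):
    """Hypothetical stand-in: only 'cpu' is free."""
    if device != "cpu":
        raise LitModelError(f"{device} is busy")
    return f"{name}-{device}"


device, session = start_on_first_available(
    fake_start, "demo-team", "my-canvas", ["gpu0", "cpu"])
print(device, session)  # prints: cpu my-canvas-cpu
```

With the real SDK, `start_experiment` would be passed as `start`, and the `except` clause would need to name the SDK's own exception class.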