miprometheus.helpers

IndexSplitter

class miprometheus.helpers.IndexSplitter(name='IndexSplitter')

Defines the IndexSplitter class.

This class splits the list of indices indexing a dataset into 2 non-overlapping sub-lists of variable lengths. The 2 lists are then saved to files (named split_a.txt & split_b.txt).

This can be useful to split a training set into a training set & a validation set.

These files can later be used for training, validation, or testing with torch.utils.data.SubsetRandomSampler (instantiated through miprometheus.utils.SamplerFactory).

Note

General usage:

  • The user provides the output directory where the 2 files containing indices will be stored (--o).
  • The user provides the problem name (--p) OR the length of the dataset (--l).
  • The user provides the split (--s), which represents how many samples will be contained in the first split (value from 1 to l-2, which are border cases when one or the other split will contain a single index).

Additionally, the user can turn random_sampling on or off with --r (Default: True):

  • When random_sampling is on, both files will contain (exclusive) random lists of indices.
  • When off, both files will contain ranges, i.e. [0, s-1] and [s, l-1] respectively.
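The two sampling modes can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the function name split_indices and its seed parameter are hypothetical, and the validity check only requires that both splits be non-empty, which is an assumption of this sketch.

```python
import random


def split_indices(length, split, random_sampling=True, seed=None):
    """Split range(length) into two non-overlapping index lists.

    Hypothetical helper for illustration; not part of miprometheus.
    """
    # Assumption of this sketch: both splits must hold at least one index.
    if not 0 < split < length:
        raise ValueError("split must satisfy 0 < split < length")

    indices = list(range(length))
    if random_sampling:
        # Shuffle the indices in place; 'seed' only makes the result reproducible.
        random.Random(seed).shuffle(indices)

    # The first 'split' indices form the first list, the rest the second.
    return indices[:split], indices[split:]


# With random sampling off, the result is the ranges [0, s-1] and [s, l-1].
split_a, split_b = split_indices(10, 7, random_sampling=False)
print(split_a)  # [0, 1, 2, 3, 4, 5, 6]
print(split_b)  # [7, 8, 9]
```

Slicing a single (optionally shuffled) list guarantees that the two sub-lists never share an index and together cover the whole dataset.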

__init__(name='IndexSplitter')

Set parser arguments.

Note

As it does not really share any functionality with other basic workers, it does not call the base miprometheus.workers.Worker constructor.

Parameters: name (str) – Name of the worker (Default: “IndexSplitter”).

run()

Creates two files with splits.

  • Parses command line arguments.
  • Loads the problem class (if required).
  • Generates two lists (or ranges) of exclusive indices.
  • Writes those lists to two separate files.
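The steps above might look roughly like the following stand-alone sketch. It is not the actual run() method: the function is a hypothetical stand-in, the argparse destinations and types are assumptions, loading a problem class through --p is omitted (the length is taken from --l), --r is assumed to accept 1 or 0, and the one-index-per-line file layout is an assumption as well.

```python
import argparse
import os
import random
import tempfile


def run(argv):
    """Hypothetical stand-in for IndexSplitter.run(); illustrative only."""
    # Step 1: parse the command line arguments.
    parser = argparse.ArgumentParser()
    parser.add_argument("--o", dest="outdir", required=True)
    parser.add_argument("--l", dest="length", type=int, required=True)
    parser.add_argument("--s", dest="split", type=int, required=True)
    # Assumption: random sampling is toggled with 1 (on) or 0 (off).
    parser.add_argument("--r", dest="random_sampling", type=int, default=1)
    args = parser.parse_args(argv)

    # Step 3: generate two exclusive lists (or ranges) of indices.
    indices = list(range(args.length))
    if args.random_sampling:
        random.shuffle(indices)
    splits = {
        "split_a.txt": indices[:args.split],
        "split_b.txt": indices[args.split:],
    }

    # Step 4: write each list to its own file (assumed: one index per line).
    os.makedirs(args.outdir, exist_ok=True)
    paths = []
    for filename, chunk in splits.items():
        path = os.path.join(args.outdir, filename)
        with open(path, "w") as f:
            f.write("\n".join(str(i) for i in chunk))
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    path_a, path_b = run(["--o", tmp, "--l", "10", "--s", "7", "--r", "0"])
    with open(path_a) as f:
        print(f.read().split())  # ['0', '1', '2', '3', '4', '5', '6']
```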

ProblemInitializer

class miprometheus.helpers.ProblemInitializer(config=None, name=None, path=None)

__init__(config=None, name=None, path=None)

Initializes ProblemInitializer, which runs the __init__() of a provided Problem, downloading and/or generating its datasets as necessary and optionally overriding some parameters.

Parameters:
  • config (str) – Path to a config file to initialize from.
  • name (str) – Name of a problem to initialize using default parameters.
  • path (str) – Path in which to initialize the problem; overrides the default data_folder if provided.

static int_or_str(val)

Tries to return int(val); otherwise returns val unchanged.

Parameters: val – Value to evaluate.
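
A minimal sketch of this behaviour is shown below. Which exceptions the real method catches is not documented, so catching TypeError and ValueError here is an assumption.

```python
def int_or_str(val):
    """Return int(val) when the conversion succeeds, else val itself."""
    try:
        return int(val)
    except (TypeError, ValueError):
        # Assumption: any value int() rejects is passed through unchanged.
        return val


print(int_or_str("42"))     # 42
print(int_or_str("mnist"))  # mnist
```

A converter of this kind lets a single command line value carry either a number or a name.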

static str_to_bool(val)

Returns True if val.lower() is in ('yes', 'true', 't', 'y', '1').

Returns False if val.lower() is in ('no', 'false', 'f', 'n', '0').

Otherwise raises argparse.ArgumentTypeError.

Parameters: val – Value to evaluate.
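
The documented mapping can be sketched as follows. The error message and the --flag option are hypothetical and only illustrate how such a converter can be passed to argparse as a type.

```python
import argparse


def str_to_bool(val):
    """Map common yes/no spellings to a bool (sketch of the documented behaviour)."""
    lowered = val.lower()
    if lowered in ("yes", "true", "t", "y", "1"):
        return True
    if lowered in ("no", "false", "f", "n", "0"):
        return False
    # Hypothetical message text; only the exception type is documented.
    raise argparse.ArgumentTypeError("Boolean value expected, got %r" % val)


# Hypothetical flag name, used only to show the converter as an argparse type.
parser = argparse.ArgumentParser()
parser.add_argument("--flag", type=str_to_bool, default=True)
print(parser.parse_args(["--flag", "No"]).flag)  # False
```

Raising argparse.ArgumentTypeError lets argparse report an unrecognized value as a regular usage error instead of a traceback.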