learn2learn.algorithms - learn2learn (2024)

High-Level Interfaces

MAML (BaseLearner)

[Source]

Description

High-level implementation of Model-Agnostic Meta-Learning.

This class wraps an arbitrary nn.Module and augments it with clone() and adapt() methods.

For the first-order version of MAML (i.e. FOMAML), set the first_order flag to True upon initialization.

Arguments

  • model (Module) - Module to be wrapped.
  • lr (float) - Fast adaptation learning rate.
  • first_order (bool, optional, default=False) - Whether to use the first-order approximation of MAML. (FOMAML)
  • allow_unused (bool, optional, default=None) - Whether to allow differentiation of unused parameters. Defaults to allow_nograd.
  • allow_nograd (bool, optional, default=False) - Whether to allow adaptation with parameters that have requires_grad = False.

References

  1. Finn et al. 2017. "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks."

Example

linear = l2l.algorithms.MAML(nn.Linear(20, 10), lr=0.01)
clone = linear.clone()
error = loss(clone(X), y)
clone.adapt(error)
error = loss(clone(X), y)
error.backward()
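
The snippet above assumes that loss, X, and y already exist. A self-contained variant with toy data might look like the following sketch (the tensor shapes and the MSE loss are illustrative assumptions, not part of the API):

import torch
import torch.nn as nn
import learn2learn as l2l

loss = nn.MSELoss()
X = torch.randn(32, 20)   # toy support batch
y = torch.randn(32, 10)

linear = l2l.algorithms.MAML(nn.Linear(20, 10), lr=0.01)
clone = linear.clone()
clone.adapt(loss(clone(X), y))   # fast adaptation on the toy batch
meta_error = loss(clone(X), y)
meta_error.backward()            # gradients accumulate in linear's parameters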

adapt(self, loss, first_order=None, allow_unused=None, allow_nograd=None)

Description

Takes a gradient step on the loss and updates the cloned parameters in place.

Arguments

  • loss (Tensor) - Loss to minimize upon update.
  • first_order (bool, optional, default=None) - Whether to use first- or second-order updates. Defaults to self.first_order.
  • allow_unused (bool, optional, default=None) - Whether to allow differentiation of unused parameters. Defaults to self.allow_unused.
  • allow_nograd (bool, optional, default=None) - Whether to allow adaptation with parameters that have requires_grad = False. Defaults to self.allow_nograd.
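
As an illustration of the allow_nograd flag above, the sketch below adapts a model whose first layer has been frozen; the frozen parameters are simply skipped during the update. The loss_fn, X_support, and y_support names are placeholders, not part of the API.

import torch.nn as nn
import learn2learn as l2l

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
for p in model[0].parameters():
    p.requires_grad = False                 # freeze the first layer

maml = l2l.algorithms.MAML(model, lr=0.1, allow_nograd=True)
learner = maml.clone()
for step in range(3):                       # a few fast-adaptation steps
    error = loss_fn(learner(X_support), y_support)
    learner.adapt(error)                    # frozen parameters are left untouched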

clone(self, first_order=None, allow_unused=None, allow_nograd=None)

Description

Returns a MAML-wrapped copy of the module whose parameters and buffers are torch.cloned from the original module.

This implies that back-propagating losses on the cloned module will populate the buffers of the original module. For more information, refer to learn2learn.clone_module().

Arguments

  • first_order (bool, optional, default=None) - Whether the clone uses first- or second-order updates. Defaults to self.first_order.
  • allow_unused (bool, optional, default=None) - Whether to allow differentiation of unused parameters. Defaults to self.allow_unused.
  • allow_nograd (bool, optional, default=None) - Whether to allow adaptation with parameters that have requires_grad = False. Defaults to self.allow_nograd.
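
In practice this means that gradients of a loss computed with the clone flow back into the original module's parameters, so an outer optimizer can meta-update them. A minimal sketch, assuming placeholder support/query tensors and a loss_fn:

maml = l2l.algorithms.MAML(nn.Linear(20, 10), lr=0.01)
learner = maml.clone()                      # differentiable copy
learner.adapt(loss_fn(learner(X_support), y_support))
meta_error = loss_fn(learner(X_query), y_query)
meta_error.backward()                       # populates .grad on maml's own parameters
print(next(maml.parameters()).grad is not None)  # True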

MetaSGD (BaseLearner)

[Source]

Description

High-level implementation of Meta-SGD.

This class wraps an arbitrary nn.Module and augments it with clone() and adapt() methods. It behaves similarly to MAML, but in addition a set of per-parameter learning rates are learned for fast adaptation.

Arguments

  • model (Module) - Module to be wrapped.
  • lr (float) - Initialization value of the per-parameter fast adaptation learning rates.
  • first_order (bool, optional, default=False) - Whether to use the first-order version.
  • lrs (list of Parameters, optional, default=None) - If not None, overrides lr, and uses the list as learning rates for fast-adaptation.

References

  1. Li et al. 2017. “Meta-SGD: Learning to Learn Quickly for Few-Shot Learning.” arXiv.

Example

linear = l2l.algorithms.MetaSGD(nn.Linear(20, 10), lr=0.01)
clone = linear.clone()
error = loss(clone(X), y)
clone.adapt(error)
error = loss(clone(X), y)
error.backward()

adapt(self, loss, first_order=None)

Description

Akin to MAML.adapt() but for MetaSGD: it updates the model with the learnable per-parameter learning rates.

clone(self)

Description

Akin to MAML.clone() but for MetaSGD: it includes a set of learnable fast-adaptation learning rates.
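
Because the per-parameter learning rates are themselves parameters of the wrapper, they show up in parameters() and are meta-learned by the outer optimizer alongside the model weights. A short sketch, assuming placeholder data and a loss_fn:

metasgd = l2l.algorithms.MetaSGD(nn.Linear(20, 10), lr=0.1)
opt = torch.optim.Adam(metasgd.parameters(), lr=1e-3)  # includes the learnable lrs

opt.zero_grad()
learner = metasgd.clone()                   # the clone carries the learning rates too
learner.adapt(loss_fn(learner(X_support), y_support))
loss_fn(learner(X_query), y_query).backward()
opt.step()                                  # meta-updates weights and learning rates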

GBML (Module)

[Source]

Description

General wrapper for gradient-based meta-learning implementations.

A variety of algorithms can simply be implemented by changing the kind of transform used during fast adaptation. For example, if the transform is Scale we recover Meta-SGD [2] with adapt_transform=False, and Alpha MAML [4] with adapt_transform=True. If the transform is a Kronecker-factored module (e.g. a neural network, or linear), we recover KFO from [5].

Arguments

  • module (Module) - Module to be wrapped.
  • transform (Module) - Transform used to update the module.
  • lr (float) - Fast adaptation learning rate.
  • adapt_transform (bool, optional, default=False) - Whether to update the transform's parameters during fast-adaptation.
  • first_order (bool, optional, default=False) - Whether to use the first-order approximation.
  • allow_unused (bool, optional, default=None) - Whether to allow differentiation of unused parameters. Defaults to allow_nograd.
  • allow_nograd (bool, optional, default=False) - Whether to allow adaptation with parameters that have requires_grad = False.

References

  1. Finn et al. 2017. “Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks.”
  2. Li et al. 2017. “Meta-SGD: Learning to Learn Quickly for Few-Shot Learning.”
  3. Park & Oliva. 2019. “Meta-Curvature.”
  4. Behl et al. 2019. “Alpha MAML: Adaptive Model-Agnostic Meta-Learning.”
  5. Arnold et al. 2019. “When MAML Can Adapt Fast and How to Assist When It Cannot.”

Example

model = SmallCNN()
transform = l2l.optim.ModuleTransform(torch.nn.Linear)
gbml = l2l.algorithms.GBML(
    module=model,
    transform=transform,
    lr=0.01,
    adapt_transform=True,
)
gbml.to(device)
opt = torch.optim.SGD(gbml.parameters(), lr=0.001)

# Training with 1 adaptation step
for iteration in range(10):
    opt.zero_grad()
    task_model = gbml.clone()
    loss = compute_loss(task_model)
    task_model.adapt(loss)
    loss.backward()
    opt.step()
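
In the example above, SmallCNN, device, and compute_loss are placeholders. A hypothetical compute_loss for a classification task could look like the following (X_support and y_support are assumed to come from the current task):

import torch.nn.functional as F

def compute_loss(task_model):
    # Hypothetical helper: evaluate the (possibly adapted) model on the task's support data.
    predictions = task_model(X_support)
    return F.cross_entropy(predictions, y_support)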

adapt(self, loss, first_order=None, allow_nograd=None, allow_unused=None)

Description

Takes a gradient step on the loss and updates the cloned parameters in place.

The parameters of the transform are only adapted if self.adapt_transform is True.

Arguments

  • loss (Tensor) - Loss to minimize upon update.
  • first_order (bool, optional, default=None) - Whether to use first- or second-order updates. Defaults to self.first_order.
  • allow_unused (bool, optional, default=None) - Whether to allow differentiation of unused parameters. Defaults to self.allow_unused.
  • allow_nograd (bool, optional, default=None) - Whether to allow adaptation with parameters that have requires_grad = False. Defaults to self.allow_nograd.

clone(self, first_order=None, allow_unused=None, allow_nograd=None, adapt_transform=None)

Description

Similar to MAML.clone().

Arguments

  • first_order (bool, optional, default=None) - Whether the clone uses first- or second-order updates. Defaults to self.first_order.
  • allow_unused (bool, optional, default=None) - Whether to allow differentiation of unused parameters. Defaults to self.allow_unused.
  • allow_nograd (bool, optional, default=None) - Whether to allow adaptation with parameters that have requires_grad = False. Defaults to self.allow_nograd.

PyTorch Lightning

LightningMAML (LightningEpisodicModule)

[Source]

Description

A PyTorch Lightning module for MAML.

Arguments

  • model (Module) - A PyTorch nn.Module.
  • loss (Function, optional, default=CrossEntropyLoss) - Loss function used to compute the cost of predictions.
  • ways (int, optional, default=5) - Number of classes in a task.
  • shots (int, optional, default=1) - Number of samples for adaptation.
  • adaptation_steps (int, optional, default=1) - Number of steps for adapting to a new task.
  • lr (float, optional, default=0.001) - Learning rate of meta training.
  • adaptation_lr (float, optional, default=0.1) - Learning rate for fast adaptation.
  • scheduler_step (int, optional, default=20) - Decay interval for lr.
  • scheduler_decay (float, optional, default=1.0) - Decay rate for lr.

References

  1. Finn et al. 2017. "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks."

Example

tasksets = l2l.vision.benchmarks.get_tasksets('omniglot')
model = l2l.vision.models.OmniglotFC(28**2, args.ways)
maml = LightningMAML(model, adaptation_lr=0.1, **dict_args)
episodic_data = EpisodicBatcher(tasksets.train, tasksets.validation, tasksets.test)
trainer = pl.Trainer.from_argparse_args(args)
trainer.fit(maml, episodic_data)
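
The snippet relies on args and dict_args being defined elsewhere. A hypothetical setup (argument names and defaults are assumptions; add_argparse_args/from_argparse_args are the pre-2.0 PyTorch Lightning helpers) might be:

import argparse
import pytorch_lightning as pl

parser = argparse.ArgumentParser()
parser.add_argument('--ways', type=int, default=5)
parser = pl.Trainer.add_argparse_args(parser)   # adds Trainer flags such as max_epochs
args = parser.parse_args()
dict_args = vars(args)                          # forwarded to LightningMAML as keyword arguments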

LightningANIL (LightningEpisodicModule)

[Source]

Description

A PyTorch Lightning module for ANIL.

Arguments

  • features (Module) - An nn.Module that extracts features; it is not adapted during fast adaptation.
  • classifier (Module) - An nn.Module that maps the extracted features to class predictions.
  • loss (Function, optional, default=CrossEntropyLoss) - Loss function used to compute the cost of predictions.
  • ways (int, optional, default=5) - Number of classes in a task.
  • shots (int, optional, default=1) - Number of samples for adaptation.
  • adaptation_steps (int, optional, default=1) - Number of steps for adapting to a new task.
  • lr (float, optional, default=0.001) - Learning rate of meta training.
  • adaptation_lr (float, optional, default=0.1) - Learning rate for fast adaptation.
  • scheduler_step (int, optional, default=20) - Decay interval for lr.
  • scheduler_decay (float, optional, default=1.0) - Decay rate for lr.

References

  1. Raghu et al. 2020. "Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML"

Example

tasksets = l2l.vision.benchmarks.get_tasksets('omniglot')
model = l2l.vision.models.OmniglotFC(28**2, args.ways)
anil = LightningANIL(model.features, model.classifier, adaptation_lr=0.1, **dict_args)
episodic_data = EpisodicBatcher(tasksets.train, tasksets.validation, tasksets.test)
trainer = pl.Trainer.from_argparse_args(args)
trainer.fit(anil, episodic_data)
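
LightningANIL expects the feature extractor and the head as separate modules. A hypothetical model exposing that split (layer sizes are illustrative) could be:

import torch.nn as nn

class SplitModel(nn.Module):
    # Hypothetical model: `features` stays fixed during fast adaptation,
    # while `classifier` is the head that ANIL adapts.
    def __init__(self, ways):
        super().__init__()
        self.features = nn.Sequential(nn.Flatten(), nn.Linear(28 ** 2, 256), nn.ReLU())
        self.classifier = nn.Linear(256, ways)

    def forward(self, x):
        return self.classifier(self.features(x))

model = SplitModel(ways=5)
anil = LightningANIL(model.features, model.classifier, adaptation_lr=0.1)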

LightningPrototypicalNetworks (LightningEpisodicModule)

[Source]

Description

A PyTorch Lightning module for Prototypical Networks.

Arguments

  • features (Module) - Feature extractor used to embed input samples.
  • loss (Function, optional, default=CrossEntropyLoss) - Loss function used to compute the cost of predictions.
  • distance_metric (str, optional, default='euclidean') - Distance metric between samples. ['euclidean', 'cosine']
  • train_ways (int, optional, default=5) - Number of classes for train tasks.
  • train_shots (int, optional, default=1) - Number of support samples for train tasks.
  • train_queries (int, optional, default=1) - Number of query samples for train tasks.
  • test_ways (int, optional, default=5) - Number of classes for test tasks.
  • test_shots (int, optional, default=1) - Number of support samples for test tasks.
  • test_queries (int, optional, default=1) - Number of query samples for test tasks.
  • lr (float, optional, default=0.001) - Learning rate of meta training.
  • scheduler_step (int, optional, default=20) - Decay interval for lr.
  • scheduler_decay (float, optional, default=1.0) - Decay rate for lr.

References

  1. Snell et al. 2017. "Prototypical Networks for Few-shot Learning"

Example

tasksets = l2l.vision.benchmarks.get_tasksets('mini-imagenet')
features = Convnet()  # init model
protonet = LightningPrototypicalNetworks(features, **dict_args)
episodic_data = EpisodicBatcher(tasksets.train, tasksets.validation, tasksets.test)
trainer = pl.Trainer.from_argparse_args(args)
trainer.fit(protonet, episodic_data)
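
For reference, the classification rule behind prototypical networks with the Euclidean metric: class prototypes are the mean support embeddings, and each query is scored by its negative squared distance to every prototype. The sketch below illustrates the math and is not the module's internal code:

import torch

def prototype_logits(support_emb, support_labels, query_emb, ways):
    # Prototypes: mean embedding of each class's support samples.
    prototypes = torch.stack([
        support_emb[support_labels == c].mean(dim=0) for c in range(ways)
    ])
    # Logits: negative squared Euclidean distance from queries to prototypes.
    return -torch.cdist(query_emb, prototypes) ** 2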

LightningMetaOptNet (LightningPrototypicalNetworks)

[Source]

Description

A PyTorch Lightning module for MetaOptNet.

Arguments

  • features (Module) - Feature extractor used to embed input samples.
  • svm_C_reg (float, optional, default=0.1) - Regularization weight for SVM.
  • svm_max_iters (int, optional, default=15) - Maximum number of iterations for SVM convergence.
  • loss (Function, optional, default=CrossEntropyLoss) - Loss function used to compute the cost of predictions.
  • train_ways (int, optional, default=5) - Number of classes for train tasks.
  • train_shots (int, optional, default=1) - Number of support samples for train tasks.
  • train_queries (int, optional, default=1) - Number of query samples for train tasks.
  • test_ways (int, optional, default=5) - Number of classes for test tasks.
  • test_shots (int, optional, default=1) - Number of support samples for test tasks.
  • test_queries (int, optional, default=1) - Number of query samples for test tasks.
  • lr (float, optional, default=0.001) - Learning rate of meta training.
  • scheduler_step (int, optional, default=20) - Decay interval for lr.
  • scheduler_decay (float, optional, default=1.0) - Decay rate for lr.

References

  1. Lee et al. 2019. "Meta-Learning with Differentiable Convex Optimization"

Example

tasksets = l2l.vision.benchmarks.get_tasksets('mini-imagenet')
features = Convnet()  # init model
metaoptnet = LightningMetaOptNet(features, **dict_args)
episodic_data = EpisodicBatcher(tasksets.train, tasksets.validation, tasksets.test)
trainer = pl.Trainer.from_argparse_args(args)
trainer.fit(metaoptnet, episodic_data)