标签:loss self batch 介绍 Lightning Pytorch step model def
文章目录
LIGHTNINGMODULE
Minimal Example
一些基本方法
Training
Training loop
Validation loop
Test loop
Inference
Inference in research
Inference in production
LightningModule API(略)
LIGHTNINGMODULE
LightningModule将PyTorch代码整理成5个部分:
- Computations (init).
- Train loop (training_step)
- Validation loop (validation_step)
- Test loop (test_step)
- Optimizers (configure_optimizers)
Minimal Example
所需要的方法:
import pytorch_lightning as pl class LitModel(pl.LightningModule): def __init__(self): super().__init__() self.l1 = torch.nn.Linear(28 * 28, 10) def forward(self, x): return torch.relu(self.l1(x.view(x.size(0), -1))) def training_step(self, batch, batch_idx): x, y = batch y_hat = self(x) loss = F.cross_entropy(y_hat, y) return loss def configure_optimizers(self): return torch.optim.Adam(self.parameters(), lr=0.02)
使用下面的代码进行训练:
train_loader = DataLoader(MNIST(os.getcwd(), download=True, transform=transforms.ToTensor())) trainer = pl.Trainer() model = LitModel() trainer.fit(model, train_loader)
一些基本方法
Training
Training loop
使用training_step方法来增加training loop
class LitClassifier(pl.LightningModule): def __init__(self, model): super().__init__() self.model = model def training_step(self, batch, batch_idx): x, y = batch y_hat = self.model(x) loss = F.cross_entropy(y_hat, y) return loss
如果需要在epoch-level进行度量,并进行记录,可以使用*.log*方法
def training_step(self, batch, batch_idx): x, y = batch y_hat = self.model(x) loss = F.cross_entropy(y_hat, y) # logs metrics for each training_step, # and the average across the epoch, to the progress bar and logger self.log('train_loss', loss, on_step=True, on_epoch=True, prog_bar=True, logger=True) return loss
如果需要对每个training_step的输出做一些操作,可以通过改写training_epoch_end来实现
def training_step(self, batch, batch_idx): x, y = batch y_hat = self.model(x) loss = F.cross_entropy(y_hat, y) preds = ... return {'loss': loss, 'other_stuff': preds} def training_epoch_end(self, training_step_outputs): for pred in training_step_outputs: # do something
如果需要对每个batch分配到不同GPU上进行训练,可以采用training_step_end方法来实现
def training_step(self, batch, batch_idx): x, y = batch y_hat = self.model(x) loss = F.cross_entropy(y_hat, y) pred = ... return {'loss': loss, 'pred': pred} def training_step_end(self, batch_parts): gpu_0_prediction = batch_parts.pred[0]['pred'] gpu_1_prediction = batch_parts.pred[1]['pred'] # do something with both outputs return (batch_parts[0]['loss'] + batch_parts[1]['loss']) / 2 def training_epoch_end(self, training_step_outputs): for out in training_step_outputs: # do something with preds
Validation loop
增加一个validation loop,可以通过改写LightningModule中的validation_step来实现
class LitModel(pl.LightningModule): def validation_step(self, batch, batch_idx): x, y = batch y_hat = self.model(x) loss = F.cross_entropy(y_hat, y) self.log('val_loss', loss)
对validation进行epoch-level度量,可以通过改写validation_epoch_end实现
def validation_step(self, batch, batch_idx): x, y = batch y_hat = self.model(x) loss = F.cross_entropy(y_hat, y) pred = ... return pred def validation_epoch_end(self, validation_step_outputs): for pred in validation_step_outputs: # do something with a pred
如果需要validation进行数据并行计算(多GPU),可以通过validation_step_end方法实现
def validation_step(self, batch, batch_idx): x, y = batch y_hat = self.model(x) loss = F.cross_entropy(y_hat, y) pred = ... return {'loss': loss, 'pred': pred} def validation_step_end(self, batch_parts): gpu_0_prediction = batch_parts.pred[0]['pred'] gpu_1_prediction = batch_parts.pred[1]['pred'] # do something with both outputs return (batch_parts[0]['loss'] + batch_parts[1]['loss']) / 2 def validation_epoch_end(self, validation_step_outputs): for out in validation_step_outputs: # do something with preds
Test loop
增加一个test loop的过程和上面增加validation loop是相同的,唯一不同的是,只有在使用*.test()*的时候,test loop才会被调用
model = Model() trainer = Trainer() trainer.fit() # automatically loads the best weights for you trainer.test(model)
这里,有两种方式调用test():
# call after training trainer = Trainer() trainer.fit(model) # automatically auto-loads the best weights trainer.test(test_dataloaders=test_dataloader) # or call with pretrained model model = MyLightningModule.load_from_checkpoint(PATH) trainer = Trainer() trainer.test(model, test_dataloaders=test_dataloader)
Inference
对于研究,LightningModules像系统一样结构化
import pytorch_lightning as pl import torch from torch import nn class Autoencoder(pl.LightningModule): def __init__(self, latent_dim=2): super().__init__() self.encoder = nn.Sequential(nn.Linear(28 * 28, 256), nn.ReLU(), nn.Linear(256, latent_dim)) self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, 28 * 28)) def training_step(self, batch, batch_idx): x, _ = batch # encode x = x.view(x.size(0), -1) z = self.encoder(x) # decode recons = self.decoder(z) # reconstruction reconstruction_loss = nn.functional.mse_loss(recons, x) return reconstruction_loss def validation_step(self, batch, batch_idx): x, _ = batch x = x.view(x.size(0), -1) z = self.encoder(x) recons = self.decoder(z) reconstruction_loss = nn.functional.mse_loss(recons, x) self.log('val_reconstruction', reconstruction_loss) def configure_optimizers(self): return torch.optim.Adam(self.parameters(), lr=0.0002)
可以用如下方式训练
autoencoder = Autoencoder() trainer = pl.Trainer(gpus=1) trainer.fit(autoencoder, train_dataloader, val_dataloader)
lightning inference部分的方法:
- training_step
- validation_step
- test_step
- configure_optimizers
注意到在这个例子中,train loop和val loop完全相同,我们可以重复使用这部分代码
class Autoencoder(pl.LightningModule): def __init__(self, latent_dim=2): super().__init__() self.encoder = nn.Sequential(nn.Linear(28 * 28, 256), nn.ReLU(), nn.Linear(256, latent_dim)) self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, 28 * 28)) def training_step(self, batch, batch_idx): loss = self.shared_step(batch) return loss def validation_step(self, batch, batch_idx): loss = self.shared_step(batch) self.log('val_loss', loss) def shared_step(self, batch): x, _ = batch # encode x = x.view(x.size(0), -1) z = self.encoder(x) # decode recons = self.decoder(z) # loss return nn.functional.mse_loss(recons, x) def configure_optimizers(self): return torch.optim.Adam(self.parameters(), lr=0.0002)
注:我们创建了所有loop都可以使用的一个新方法shared_step,这个方法的名字可以任意取
Inference in research
如果需要进行系统推断,可以将forward方法加入到LightningModule中
class Autoencoder(pl.LightningModule): def forward(self, x): return self.decoder(x)
在复杂系统中增加forward的优势,使得可以进行包含inference procedure等
class Seq2Seq(pl.LightningModule): def forward(self, x): embeddings = self(x) hidden_states = self.encoder(embeddings) for h in hidden_states: # decode ... return decoded
Inference in production
在LightningModule中迭代不同的模型
import pytorch_lightning as pl from pytorch_lightning.metrics import functional as FM class ClassificationTask(pl.LightningModule): def __init__(self, model): super().__init__() self.model = model def training_step(self, batch, batch_idx): x, y = batch y_hat = self.model(x) loss = F.cross_entropy(y_hat, y) return loss def validation_step(self, batch, batch_idx): x, y = batch y_hat = self.model(x) loss = F.cross_entropy(y_hat, y) acc = FM.accuracy(y_hat, y) # loss is tensor. The Checkpoint Callback is monitoring 'checkpoint_on' metrics = {'val_acc': acc, 'val_loss': loss} self.log_dict(metrics) return metrics def test_step(self, batch, batch_idx): metrics = self.validation_step(batch, batch_idx) metrics = {'test_acc': metrics['val_acc'], 'test_loss': metrics['val_loss']} self.log_dict(metrics) def configure_optimizers(self): return torch.optim.Adam(self.model.parameters(), lr=0.02)
然后将任意适合该task的模型传进去
for model in [resnet50(), vgg16(), BidirectionalRNN()]: task = ClassificationTask(model) trainer = Trainer(gpus=2) trainer.fit(task, train_dataloader, val_dataloader)
tasks可以任意复杂,比如,可以实现GAN训练,self-supervised,甚至RL
class GANTask(pl.LightningModule): def __init__(self, generator, discriminator): super().__init__() self.generator = generator self.discriminator = discriminator ...
del)
trainer = Trainer(gpus=2) trainer.fit(task, train_dataloader, val_dataloader)
tasks可以任意复杂,比如,可以实现GAN训练,self-supervised,甚至RL ```python class GANTask(pl.LightningModule): def __init__(self, generator, discriminator): super().__init__() self.generator = generator self.discriminator = discriminator ...
标签:loss,self,batch,介绍,Lightning,Pytorch,step,model,def 来源: https://www.cnblogs.com/ltkekeli1229/p/15554493.html
本站声明: 1. iCode9 技术分享网(下文简称本站)提供的所有内容,仅供技术学习、探讨和分享; 2. 关于本站的所有留言、评论、转载及引用,纯属内容发起人的个人观点,与本站观点和立场无关; 3. 关于本站的所有言论和文字,纯属内容发起人的个人观点,与本站观点和立场无关; 4. 本站文章均是网友提供,不完全保证技术分享内容的完整性、准确性、时效性、风险性和版权归属;如您发现该文章侵犯了您的权益,可联系我们第一时间进行删除; 5. 本站为非盈利性的个人网站,所有内容不会用来进行牟利,也不会利用任何形式的广告来间接获益,纯粹是为了广大技术爱好者提供技术内容和技术思想的分享性交流网站。