When I use mixed precision with your code, the model outputs NaN embeddings during the evaluation step. #83

DevKiHyun · 2024-09-13T06:36:22Z

Hi,

Thanks for sharing your repository.

I found a strange issue in your code when I use mixed precision via the autocast() function.

I added simple mixed-precision code to your training loop, as shown below:

for num, (data, labels) in enumerate(loader, start = 1):
	self.zero_grad()
	labels            = torch.LongTensor(labels).cuda()
	# speaker_embedding = self.speaker_encoder.forward(data.cuda(), aug = True)
	# nloss, prec       = self.speaker_loss.forward(speaker_embedding, labels)			
	# nloss.backward()
	# self.optim.step()

	if self.mixedprec:
		with autocast():
			speaker_embedding = self.speaker_encoder.forward(data.cuda(), aug = True)
			nloss, prec       = self.speaker_loss.forward(speaker_embedding, labels)			
		self.scaler.scale(nloss).backward()
		self.scaler.step(self.optim)
		self.scaler.update()
	else:
		speaker_embedding = self.speaker_encoder.forward(data.cuda(), aug = True)
		nloss, prec       = self.speaker_loss.forward(speaker_embedding, labels)
		nloss.backward()
		self.optim.step()

I found that if I train ECAPA-TDNN with mixed precision, your ecapa_tdnn model outputs NaN embeddings, which in turn puts NaN values into the score variable.

As a result, the evaluation code cannot compute the EER and minDCF scores.
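
For context, here is a minimal sketch of how float16 arithmetic can turn ordinary values into NaN where float32 would not. It is only an assumption about the kind of failure involved, not a confirmed diagnosis of this repository. `to_fp16` is a hypothetical helper that uses the standard library only, and the sample values are made up for illustration.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x: float) -> float:
    """Round x to half precision using only the standard library.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses values outside the float16 range; an actual
        # float16 cast would saturate to +/-inf instead.
        return math.copysign(math.inf, x)


big = to_fp16(70000.0)  # above FP16_MAX, so it overflows to inf
tiny = to_fp16(1e-8)    # below the smallest float16 subnormal, so it becomes 0.0

print(big, tiny)
print(big - big)        # inf - inf is NaN
print(big * tiny)       # inf * 0.0 is NaN
```

If something similar happens inside the network, a check such as `torch.isfinite(speaker_embedding).all()` placed after the forward pass, both in training and in evaluation, should show whether the embeddings are already non-finite before scoring.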

Could I discuss this issue with you?

I would like to get some hints from you, as the author of this code.

Thanks
Best regards
