When I use mixed precision with your code, the model outputs NaN embeddings during the evaluation step. #83

Open
DevKiHyun opened this issue Sep 13, 2024 · 0 comments

Hi,

Thanks for sharing your repository.

I found a strange issue in your code when I use mixed precision with the autocast() function.

I added simple mixed-precision code to your training loop, as shown below:

for num, (data, labels) in enumerate(loader, start = 1):
	self.zero_grad()
	labels            = torch.LongTensor(labels).cuda()
	# speaker_embedding = self.speaker_encoder.forward(data.cuda(), aug = True)
	# nloss, prec       = self.speaker_loss.forward(speaker_embedding, labels)			
	# nloss.backward()
	# self.optim.step()

	if self.mixedprec:
		with autocast():
			speaker_embedding = self.speaker_encoder.forward(data.cuda(), aug = True)
			nloss, prec       = self.speaker_loss.forward(speaker_embedding, labels)			
		self.scaler.scale(nloss).backward()
		self.scaler.step(self.optim)
		self.scaler.update()
	else:
		speaker_embedding = self.speaker_encoder.forward(data.cuda(), aug = True)
		nloss, prec       = self.speaker_loss.forward(speaker_embedding, labels)
		nloss.backward()
		self.optim.step()

I found that when I train ECAPA-TDNN with mixed precision, your ecapa_tdnn produces NaN values in the embedding, which in turn produces NaN values in the score variable.
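For context, one common way NaN values arise under float16 is overflow: the largest finite float16 value is 65504, anything larger becomes inf, and operations such as inf - inf then yield NaN. The sketch below only illustrates that mechanism with the Python standard library. It is not a confirmed diagnosis of this repository, and the helper `to_fp16` is a hypothetical name introduced for the illustration.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite value representable in float16


def to_fp16(x: float) -> float:
    """Round a Python float to float16 precision (hypothetical helper).

    struct raises OverflowError for values too large for float16, so we
    map that case to +/-inf, which is what float16 arithmetic would give.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


activation = to_fp16(300.0)                 # fits comfortably in float16
squared = to_fp16(activation * activation)  # 90000 > 65504, so this is inf
difference = squared - squared              # inf - inf is NaN

print(squared, difference)
```

If something similar happens inside the model, a single overflowing intermediate value would be enough to turn a whole embedding into NaN.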

As a result, the evaluation code cannot calculate the EER and minDCF scores.
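A small check like the following could help narrow down which utterances produce non-finite embeddings before the scores are computed. It is a minimal sketch with toy data: the function `find_nonfinite`, the dictionary layout, and the utterance names are all assumptions for illustration and do not come from the repository.

```python
import math
from typing import Dict, List, Sequence


def find_nonfinite(embeddings: Dict[str, Sequence[float]]) -> List[str]:
    """Return the keys whose embedding contains a NaN or inf value."""
    return [
        key
        for key, vector in embeddings.items()
        if not all(math.isfinite(value) for value in vector)
    ]


# Toy embeddings (hypothetical, not produced by the model in this issue).
embeddings = {
    "utt_a": [0.1, -0.3, 0.7],
    "utt_b": [0.2, float("nan"), 0.5],
    "utt_c": [float("inf"), 0.0, 0.1],
}

bad = find_nonfinite(embeddings)
print(bad)
```

Knowing whether all embeddings or only some of them are affected would help distinguish a corrupted model from an overflow on particular inputs.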

Can I discuss this issue with you?

I would like to get some hints from you, as the author of this code.

Thanks
Best regards
