
Code not consistent with the paper formulation #29

Open

hengshiyu opened this issue Feb 28, 2019 · 2 comments

@hengshiyu

Hi,

I read the InfoGAN paper (https://arxiv.org/abs/1606.03657), and equation (6) gives the objective function for the discriminator (D), the generator (G), and the mutual information network (Q).

So, in the implementation, D should maximize the original GAN term, G should minimize its own GAN term minus the mutual information scaled by a tuning coefficient lambda, and Q should maximize the mutual information through its variational lower bound.
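
For reference, equation (6) of the paper defines the minimax game as

min_{G,Q} max_D V_InfoGAN(D, G, Q) = V(D, G) - lambda * L_I(G, Q)

where V(D, G) is the standard GAN value function and L_I(G, Q) is the variational lower bound on the mutual information.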

However, in this official InfoGAN code, I found the following in InfoGAN/infogan/algos/infogan_trainer.py:

discriminator_loss -= self.info_reg_coeff * disc_mi_est
generator_loss -= self.info_reg_coeff * disc_mi_est

So both the discriminator loss and the generator loss incorporate the mutual information lower bound. This is not consistent with the paper.

First, in the paper's formulation the discriminator loss does not contain the mutual information lower bound. Although the paper constructs the Q network from the last hidden layer of D, training the D network should not incorporate the mutual information lower bound.

Second, in equation (6), as in any other GAN, G and D should always have opposite signs for the same term. But in the code above the mutual information is subtracted from both the D loss and the G loss, so both networks minimize it with the same sign. This does not make sense, and it would mean that equation (6) in the paper is not correct and should be changed. A sketch of the split I would expect is below.
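
For comparison, here is a minimal sketch of the loss split as I read equation (6). The variable names (d_loss_gan, g_loss_gan, mi_lower_bound) are made up for illustration and are not the repo's variables; the numbers are just stand-in values:

# Hypothetical illustration of equation (6); not the repo's code.
info_reg_coeff = 1.0     # lambda in the paper
d_loss_gan = 0.7         # stand-in for the usual GAN discriminator loss
g_loss_gan = 1.3         # stand-in for the usual GAN generator loss
mi_lower_bound = 0.4     # stand-in for L_I(G, Q)

d_loss = d_loss_gan                                     # D: GAN term only
g_loss = g_loss_gan - info_reg_coeff * mi_lower_bound   # G: GAN term minus lambda * L_I
q_loss = -info_reg_coeff * mi_lower_bound               # Q: maximize L_I by minimizing its negative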

Could you help me with this?

@GargantuaOmni

In my opinion, since the MI lower bound estimator does not regularize the discriminator, it makes no difference whether you add the MI estimator to the discriminator loss. You could verify this by computing the gradient with tf.train.Optimizer.compute_gradients; it should return None. (Though I haven't verified it myself.)
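
Something like this untested toy sketch (TF1 graph API; the variables a and b are just stand-ins, not anything from the repo) is what I mean; compute_gradients returns a None gradient for any variable the loss does not depend on:

import tensorflow as tf  # TF1-style graph API assumed

a = tf.Variable(1.0, name="a")
b = tf.Variable(2.0, name="b")
loss = 3.0 * a  # the loss only touches `a`

opt = tf.train.GradientDescentOptimizer(0.1)
# The (gradient, variable) pair for `b` comes back with gradient None.
print(opt.compute_gradients(loss, var_list=[a, b]))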

@GargantuaOmni

Oh, I think I was wrong: the MI estimator shares variables with the discriminator, so this term is needed in the discriminator loss.
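
To illustrate, here is an untested toy sketch (not the repo's code; layer names and sizes are made up) where a Q head sits on top of a shared D trunk. Gradients of the MI-like term with respect to the shared trunk are real tensors, while gradients with respect to the D-only head are None, which is why the term matters in the discriminator loss once variables are shared:

import tensorflow as tf  # TF1-style graph API assumed

x = tf.placeholder(tf.float32, [None, 4])
shared = tf.layers.dense(x, 8, name="d_shared")      # shared D trunk (last hidden layer)
d_head = tf.layers.dense(shared, 1, name="d_head")   # discriminator-only output
q_head = tf.layers.dense(shared, 3, name="q_head")   # Q head built on the shared trunk
mi_term = tf.reduce_mean(q_head)                      # stand-in for the MI lower bound

shared_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="d_shared")
d_only_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="d_head")
print(tf.gradients(mi_term, shared_vars))  # real tensors: the MI term reaches the shared D weights
print(tf.gradients(mi_term, d_only_vars))  # [None, None]: the D-only head is untouched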
