Infoloss before or after sampling #10

Description

@simoons95

Hello, me again!

I read in your paper that the info loss should be based on the distribution of the subgraphs given the original graph and the parameters.
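
For concreteness, here is how I read that term, written as a KL divergence between the learned Bernoulli attention distribution and a Bernoulli(r) prior (just a sketch with placeholder names, so the rest of this issue is unambiguous; the exact form in your code may differ):

```python
import torch

def info_loss_fn(att_probs: torch.Tensor, r: float, eps: float = 1e-6) -> torch.Tensor:
    """KL( Bernoulli(p) || Bernoulli(r) ), averaged over all attention entries.

    r is the prior sparsity; eps is only a numerical stabilizer.
    """
    kl = att_probs * torch.log(att_probs / r + eps) \
        + (1.0 - att_probs) * torch.log((1.0 - att_probs) / (1.0 - r) + eps)
    return kl.mean()
```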

However, in your code, you 1) compute this distribution as logits, 2) sample with the Gumbel-Softmax trick, and 3) apply the info loss to the sampled subgraph. From my understanding, you should instead 1) compute the distribution as logits, 2) transform the logits into probabilities, using the same temperature as in the Gumbel-Softmax code, 3) apply the info loss to that distribution, and then 4) apply your Gumbel-Softmax trick to the logits for use in the other parts of the code.

Mathematically, I think what you do injects a lot of noise into the gradients back-propagated through the info loss, and I would expect the loss to be cleaner and more effective if you follow the order I propose. That is, apply the info loss to (att_log_logits / temp).sigmoid() (with temp set to 1 in your code) rather than to self.sampling(att_log_logits, epoch, training), roughly as in the sketch below.
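
Concretely, something like this is what I have in mind (only a sketch; gumbel_sigmoid_sample, info_loss_fn and the variable names are placeholders for whatever your repo actually calls them):

```python
import torch

def gumbel_sigmoid_sample(logits: torch.Tensor, temp: float, training: bool) -> torch.Tensor:
    """Placeholder for self.sampling(): binary Gumbel-Softmax / concrete relaxation."""
    if not training:
        return logits.sigmoid()
    u = torch.rand_like(logits).clamp(1e-10, 1.0 - 1e-10)
    logistic_noise = torch.log(u) - torch.log(1.0 - u)
    return ((logits + logistic_noise) / temp).sigmoid()

def attention_step(att_log_logits: torch.Tensor, info_loss_fn, temp: float = 1.0, training: bool = True):
    # Proposed order:
    att_probs = (att_log_logits / temp).sigmoid()   # logits -> probabilities, no sampling noise
    info_loss = info_loss_fn(att_probs)             # info loss on the distribution itself
    att_sampled = gumbel_sigmoid_sample(att_log_logits, temp, training)  # sample, used by the rest of the model
    return att_sampled, info_loss

# Current order, as I read the code: the info loss is applied to the sample instead, i.e.
#   att_sampled = gumbel_sigmoid_sample(att_log_logits, epoch, training)
#   info_loss = info_loss_fn(att_sampled)
```

With this order, the gradient of the info loss flows through a deterministic sigmoid rather than through the noisy sample, which is what I mean by "cleaner" above.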

What do you think? Have I missed something?
I would love to hear your opinion on the matter.

PS: Thanks again for your paper and for your responsiveness to my previous issues!
