Hi,
I am working with a graph that has different types of features. Some features are continuous variables (e.g. age: 20, 24, etc.) and some are binary (e.g. is_married: 1 = yes / 0 = no). I have three different questions:
- There is an issue when using nn.BatchNorm1d to standardize data from a graph that contains both binary and continuous variables, because it does not make sense to normalize a binary variable (at least I think so, since you lose the interpretability of the variable). Do you agree?
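  For context, the workaround I am considering is to apply nn.BatchNorm1d only to the continuous columns and leave the binary ones untouched. A minimal sketch with toy values (the column indices and the is_married column are just examples):

  ```python
  import torch
  import torch.nn as nn

  # Toy feature matrix: columns 0-1 continuous (e.g. age, income), column 2 binary (is_married)
  x = torch.tensor([[20.0, 55.0, 1.0],
                    [24.0, 70.0, 0.0],
                    [31.0, 48.0, 1.0],
                    [27.0, 62.0, 0.0]])

  cont_idx = [0, 1]  # continuous columns to standardize
  bin_idx = [2]      # binary columns kept as-is

  bn = nn.BatchNorm1d(len(cont_idx))
  out = x.clone()
  out[:, cont_idx] = bn(x[:, cont_idx])  # standardize only the continuous features
  ```

  This keeps the 0/1 columns interpretable while the continuous ones get zero mean and unit variance. Is this a reasonable approach?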
- In the GraphSAGE paper, the authors apply an L2 normalisation to the embeddings at the end of each layer. But the data are not on the same scale, so it does not seem to make sense to apply it when you have binary and continuous variables among your graph features. Again, do you agree?
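  To make sure we are talking about the same operation, here is a minimal sketch of the per-layer L2 normalisation I mean (toy values, not the actual GraphSAGE layer):

  ```python
  import torch
  import torch.nn.functional as F

  # Toy layer outputs; GraphSAGE rescales each row to unit L2 norm
  h = torch.tensor([[3.0, 4.0],
                    [0.0, 2.0]])
  h_norm = F.normalize(h, p=2, dim=1)  # each row divided by its L2 norm
  ```

  My concern is that this row-wise rescaling mixes the binary and continuous dimensions together when computing the norm.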
- In the case where you have binary and continuous variables, and among the continuous ones there is one with huge values compared to all the others, the model will tend to give a high weight to that variable. Is there a way to minimize this impact?
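  What I have tried so far is per-feature standardization, and a log transform for the heavy-tailed column; a minimal sketch with made-up values (column 1 stands in for the large-magnitude feature):

  ```python
  import torch

  # Toy features: column 1 is orders of magnitude larger than column 0
  x = torch.tensor([[0.5, 10000.0],
                    [0.8, 25000.0],
                    [0.3, 90000.0]])

  # Per-feature standardization (z-score) so no single column dominates
  mean = x.mean(dim=0, keepdim=True)
  std = x.std(dim=0, keepdim=True)
  x_std = (x - mean) / std

  # Alternative for heavy-tailed positive values: compress with log1p before scaling
  x_log = x.clone()
  x_log[:, 1] = torch.log1p(x_log[:, 1])
  ```

  Is there a better way to handle this, or is rescaling like this the standard practice?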
Thanks a lot.