Sentence Pair classification/regression


For tasks such as textual entailment or semantic textual similarity, where each pair of sentences comes with a label (e.g. entailment, contradiction, or neutral; or a similarity score), what would be some approaches using DGL?

One way might be to do it BERT-style: treat the sentence pair as a single instance/example with a [CLS] token at the beginning, apply graph convolution using something like SegTree-Transformer, and then pool to extract the [CLS] node vector for classification/regression.
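To make the BERT-style idea concrete, here is a minimal sketch of how a sentence pair could be packed into one node sequence before any graph is built. It is only an illustration under assumptions: the helper names `pack_pair` and `fully_connected_edges` are hypothetical, and the fully connected edge list is a plain stand-in, not the sparse SegTree-Transformer structure. The resulting source/destination lists are the kind of input one would hand to a DGL graph constructor.

```python
def pack_pair(tokens_a, tokens_b):
    """Pack two tokenized sentences into one BERT-style sequence.

    Returns the token list, a segment id per token (0 for the first
    sentence and its delimiters, 1 for the second), and the index of
    the [CLS] node whose vector would be read out for prediction.
    """
    tokens = ["[CLS]"] + list(tokens_a) + ["[SEP]"] + list(tokens_b) + ["[SEP]"]
    segments = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    cls_index = 0
    return tokens, segments, cls_index


def fully_connected_edges(num_nodes):
    """Edge lists (src, dst) connecting every node to every node,
    self-loops included. Assumption: a dense stand-in for whatever
    sparse attention pattern the real model uses."""
    src = [i for i in range(num_nodes) for _ in range(num_nodes)]
    dst = [j for _ in range(num_nodes) for j in range(num_nodes)]
    return src, dst


tokens, segments, cls_index = pack_pair(["a", "cat"], ["an", "animal"])
src, dst = fully_connected_edges(len(tokens))
```

After message passing over such a graph, the pooling step reduces to reading the feature of node `cls_index`.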

I would like to know what other approaches there might be for working with pairs of sentences, or more generally for sentence classification.

I’d really appreciate any thoughts!

Yes, we’ve tried textual entailment with DGL.
Here are my solutions:

  1. BERT-style: yes, this works fine.
  2. For a Siamese-like model, construct two graphs for the two sentences in the pair (actually, they are two connected components in one DGLGraph), take the representations of the two [CLS] tokens (or something equivalent), concatenate them as [x, y, x-y, x*y], and feed the result to an MLP.
  3. For an ESIM-like model, I must say this is more complicated.
    The inter-sentence attention part is tricky; one solution is to create a bipartite graph between word pairs and apply graph attention.
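As a rough illustration of options 2 and 3, the sketch below shows the [x, y, x-y, x*y] feature combination and one possible way to lay out bipartite edges between the words of the two sentences. Both helper names (`combine_pair`, `bipartite_edges`) are hypothetical, plain Python lists stand in for tensors, and the node numbering (second sentence offset by the length of the first) is an assumption made for the example.

```python
def combine_pair(x, y):
    """Build the Siamese matching feature [x, y, x - y, x * y].

    x and y are the two sentence representations (for example the two
    [CLS] node vectors); the result is 4x as long and would be fed
    to an MLP classifier or regressor.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same dimension")
    diff = [a - b for a, b in zip(x, y)]
    prod = [a * b for a, b in zip(x, y)]
    return list(x) + list(y) + diff + prod


def bipartite_edges(len_a, len_b):
    """Edge lists linking every word of sentence A to every word of
    sentence B in both directions. Nodes 0..len_a-1 belong to A and
    nodes len_a..len_a+len_b-1 belong to B (an assumed numbering)."""
    src, dst = [], []
    for i in range(len_a):
        for j in range(len_b):
            src += [i, len_a + j]
            dst += [len_a + j, i]
    return src, dst


features = combine_pair([1.0, 2.0], [3.0, 5.0])
src, dst = bipartite_edges(2, 1)
```

Graph attention over the bipartite edges would then let each word attend only to the words of the other sentence, which is the alignment step an ESIM-like model needs.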

Great! Thank you.
If you know of any implementations of BERT-style or Siamese-like models for sentence (pair) classification, please let me know where I can find them 🙂

Well, I have a BERT-style Seg-Tree Transformer implementation for SNLI. But the new paper is under review at EMNLP, and I don’t want to make it public during the review period.
Could you please leave me an email address so that I could show you how to do it?

That sounds amazing and exactly what I need 🙂
My email is
Please shoot me an email!