srtk.scorer package
srtk.scorer.encoder module
- class srtk.scorer.encoder.LitSentenceEncoder(*args: Any, **kwargs: Any)
Bases: LightningModule
A Lightning module that wraps a sentence encoder.
- static avg_pool(last_hidden_states, attention_mask=None)
Average pool the sentence embedding.
- Parameters:
last_hidden_states (torch.Tensor) – […, seq_len, embedding_dim]
attention_mask (torch.Tensor) – […, seq_len]
- Returns:
pooled_embedding […, embedding_dim]
- Return type:
torch.Tensor
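To make the shapes above concrete, here is a minimal sketch of masked average pooling. It is not the library's implementation: the function name `masked_avg_pool` is hypothetical, and the handling of a missing mask (a plain mean over `seq_len`) is an assumption.

```python
import torch

def masked_avg_pool(last_hidden_states, attention_mask=None):
    """Average over seq_len, skipping positions where attention_mask is 0.

    Hypothetical helper for illustration; not srtk's avg_pool itself.
    """
    if attention_mask is None:
        # Assumption: without a mask, every position counts equally.
        return last_hidden_states.mean(dim=-2)
    # [..., seq_len] -> [..., seq_len, 1] so it broadcasts over embedding_dim
    mask = attention_mask.unsqueeze(-1).to(last_hidden_states.dtype)
    summed = (last_hidden_states * mask).sum(dim=-2)
    # Guard against division by zero for fully masked rows
    counts = mask.sum(dim=-2).clamp(min=1.0)
    return summed / counts

avg_hidden = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])  # [1, 3, 2]
avg_mask = torch.tensor([[1, 1, 0]])                                   # [1, 3]
avg_pooled = masked_avg_pool(avg_hidden, avg_mask)                     # [1, 2]
```

The third token is padding, so only the first two rows contribute to the average.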
- batch_forward(batch)
The common forward function for both training and inference.
- static cls_pool(last_hidden_states, attention_mask=None)
CLS pool the sentence embedding. This is the pooling method adopted by RUC’s SR paper.
- Parameters:
last_hidden_states (torch.Tensor) – […, seq_len, embedding_dim]
attention_mask (torch.Tensor) – […, seq_len] Ignored; accepted only for compatibility with the other pooling methods.
- Returns:
pooled_embedding […, embedding_dim]
- Return type:
torch.Tensor
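CLS pooling, as described above, amounts to keeping the hidden state of the first token. The sketch below assumes the CLS token sits at position 0 of `seq_len`; the name `first_token_pool` is hypothetical.

```python
import torch

def first_token_pool(last_hidden_states, attention_mask=None):
    """Return the hidden state at sequence position 0.

    attention_mask is accepted but unused, mirroring the documented signature.
    Hypothetical helper for illustration; assumes CLS is the first token.
    """
    return last_hidden_states[..., 0, :]

cls_hidden = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4)  # [2, 3, 4]
cls_pooled = first_token_pool(cls_hidden)                            # [2, 4]
```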
- compute_embedding_similarity(query, target)
Compute the similarity between query and target(s) embeddings.
- Parameters:
query (torch.Tensor) – [batch_size, 1, embedding_dim]
target (torch.Tensor) – [batch_size, k, embedding_dim]
- Returns:
similarity [batch_size, k]
- Return type:
torch.Tensor
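The documentation does not say which similarity measure is used. As an illustration of the shape contract only, the sketch below assumes cosine similarity and relies on broadcasting the single query against the k targets; `embedding_similarity_sketch` is a hypothetical name.

```python
import torch
import torch.nn.functional as F

def embedding_similarity_sketch(query, target):
    """Cosine similarity between one query and k targets per batch item.

    query:  [batch_size, 1, embedding_dim]
    target: [batch_size, k, embedding_dim]
    returns [batch_size, k]
    Assumption: cosine similarity; the actual measure is not documented here.
    """
    return F.cosine_similarity(query, target, dim=-1)

sim_query = torch.tensor([[[1.0, 0.0, 0.0]]])        # [1, 1, 3]
sim_target = torch.tensor([[[1.0, 0.0, 0.0],
                            [0.0, 1.0, 0.0],
                            [-1.0, 0.0, 0.0]]])      # [1, 3, 3]
sims = embedding_similarity_sketch(sim_query, sim_target)  # [1, 3]
```

The three targets are parallel, orthogonal, and opposite to the query, so they span the full range of cosine similarity.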
- compute_loss(pooled_query_embedding, pooled_sample_embeddings)
Compute loss using the pooled query and sample embeddings. It supports cross_entropy and contrastive loss.
- Parameters:
pooled_query_embedding (torch.Tensor) – [batch_size, 1, embedding_dim]
pooled_sample_embeddings (torch.Tensor) – [batch_size, k, embedding_dim], where k = n_positive (1) + n_negatives
- Returns:
the loss
- Return type:
torch.Tensor
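One way the cross-entropy variant could work is to treat the k similarities as logits and the positive sample as the target class. The sketch below is written under several assumptions that the documentation does not confirm: the positive sample is at index 0, similarity is cosine, and a `temperature` parameter (hypothetical) scales the logits.

```python
import torch
import torch.nn.functional as F

def cross_entropy_ranking_loss(pooled_query, pooled_samples, temperature=1.0):
    """Cross-entropy over query-sample similarities.

    pooled_query:   [batch_size, 1, embedding_dim]
    pooled_samples: [batch_size, k, embedding_dim]
    Assumptions: positive sample at index 0, cosine similarity as logits.
    """
    logits = F.cosine_similarity(pooled_query, pooled_samples, dim=-1)  # [batch_size, k]
    labels = torch.zeros(logits.shape[0], dtype=torch.long)            # class 0 = positive
    return F.cross_entropy(logits / temperature, labels)

loss_query = torch.tensor([[[1.0, 0.0]]])
positive_first = torch.tensor([[[1.0, 0.0], [0.0, 1.0]]])  # positive matches the query
negative_first = torch.tensor([[[0.0, 1.0], [1.0, 0.0]]])  # index 0 is a mismatch
loss_good = cross_entropy_ranking_loss(loss_query, positive_first)
loss_bad = cross_entropy_ranking_loss(loss_query, negative_first)
```

The loss is lower when the sample at the assumed positive index is the one closest to the query.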
- compute_sentence_similarity(query, target, query_mask=None, target_mask=None)
Compute the similarity between the query and target sentence embeddings. The query and target embeddings are first pooled over seq_len; the similarity is then computed between the pooled query embedding and each pooled target embedding.
- Parameters:
query (torch.Tensor) – […, 1, seq_len, embedding_dim] query sentence embedding
target (torch.Tensor) – […, k, seq_len, embedding_dim] target sentence(s) embedding
- Returns:
similarity […, k]
- Return type:
torch.Tensor
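The two-step flow (pool, then compare) can be sketched as follows. This is an illustration of the tensor shapes only: it assumes unmasked mean pooling and cosine similarity, neither of which is confirmed by the documentation, and `sentence_similarity_sketch` is a hypothetical name.

```python
import torch
import torch.nn.functional as F

def sentence_similarity_sketch(query, target):
    """Pool token embeddings over seq_len, then compare query with each target.

    query:  [..., 1, seq_len, embedding_dim]
    target: [..., k, seq_len, embedding_dim]
    returns [..., k]
    Assumptions: mean pooling without masks, cosine similarity.
    """
    pooled_query = query.mean(dim=-2)    # [..., 1, embedding_dim]
    pooled_target = target.mean(dim=-2)  # [..., k, embedding_dim]
    return F.cosine_similarity(pooled_query, pooled_target, dim=-1)

torch.manual_seed(0)
query_tokens = torch.randn(2, 1, 5, 8)
target_tokens = torch.randn(2, 3, 5, 8)
target_tokens[:, 0] = query_tokens[:, 0]  # first candidate is identical to the query
sentence_sims = sentence_similarity_sketch(query_tokens, target_tokens)  # [2, 3]
```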
- configure_optimizers()
- forward(*args, **kwargs)
- pool_sentence_embedding(query, target, query_mask=None, target_mask=None)
Pool the query and target(s) sentence embeddings.
- Parameters:
query (torch.Tensor) – […, 1, seq_len, embedding_dim]
target (torch.Tensor) – […, k, seq_len, embedding_dim]
query_mask (torch.Tensor, optional) – […, 1, seq_len, embedding_dim]. Defaults to None.
target_mask (torch.Tensor, optional) – […, k, seq_len, embedding_dim]. Defaults to None.
- Returns:
pooled query and target embeddings
- Return type:
(torch.Tensor, torch.Tensor)
- save_huggingface_model(save_dir)
Save the model so that it can be reloaded with from_pretrained().
- training_step(batch, batch_idx)
- validation_step(batch, batch_idx)
srtk.scorer.scorer module
- class srtk.scorer.scorer.Scorer(pretrained_name_or_path, device=None)
Bases: object
Scorer for relation paths.
- batch_score(question, prev_relations, next_relations)
Score candidate next relations in a batch. Warning: the maximum input length of the scorer model is hardcoded to 512.
- Parameters:
question (str) – question
prev_relations (tuple[str]) – tuple of relation labels that have been traversed.
next_relations (tuple[str]) – tuple of candidate next relation labels that are pertinent to the question and the previous relations.
- Returns:
list of similarities between the query and each candidate
- Return type:
list[float]
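A typical use of batch_score is to rank the candidate relations for the next hop. The sketch below relies only on the documented signature. `rank_next_relations` and `StubScorer` are hypothetical: the stub stands in for a real Scorer so the example runs without loading a model, and its word-overlap score is a toy that has nothing to do with the real model's output.

```python
def rank_next_relations(scorer, question, prev_relations, candidates):
    """Return (relation, score) pairs sorted from most to least similar."""
    scores = scorer.batch_score(question, tuple(prev_relations), tuple(candidates))
    return sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)


class StubScorer:
    """Hypothetical stand-in exposing the documented batch_score signature."""

    def batch_score(self, question, prev_relations, next_relations):
        # Toy score: number of words shared by the question and the relation label.
        words = set(question.lower().split())
        scores = []
        for relation in next_relations:
            label_words = set(relation.replace(".", " ").replace("_", " ").split())
            scores.append(float(len(words & label_words)))
        return scores


# With the real class one would pass Scorer(pretrained_name_or_path) instead of the stub.
ranked = rank_next_relations(
    StubScorer(),
    "what is the place of birth of obama",
    (),
    ("people.person.spouse", "people.person.place_of_birth"),
)
```

Because the helper only calls batch_score, any object with that method can be swapped in, which keeps ranking logic testable without a trained model.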
- score(question, prev_relations, next_relation)
Score a relation path.
- Parameters:
question (str) – question
prev_relations (tuple[str]) – tuple of relation labels that have been traversed.
next_relation (str) – next relation label to be traversed
- Returns:
similarity between the query and the candidate
- Return type:
float