brainscore_language.model_helpers.modeling_suma
PyTorch SUMA model adapted from LLaMA.
Functions
|
|
|
This is the equivalent of torch.repeat_interleave(x, dim=1, repeats=n_rep). |
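
The equivalence stated above can be checked with a small sketch. The helper name `repeat_kv_sketch` and the assumed input layout `(batch, num_key_value_heads, seq_len, head_dim)` are illustrative assumptions and are not taken from this page. The expand-then-reshape approach shown is one common way to get this result, not necessarily what the module does.

```python
import torch


def repeat_kv_sketch(x: torch.Tensor, n_rep: int) -> torch.Tensor:
    """Hypothetical helper: repeat each head along dim=1 n_rep times.

    Assumes x has shape (batch, num_key_value_heads, seq_len, head_dim).
    """
    batch, n_kv_heads, seq_len, head_dim = x.shape
    if n_rep == 1:
        return x
    # Add a new axis after the head axis and broadcast it to n_rep copies.
    x = x[:, :, None, :, :].expand(batch, n_kv_heads, n_rep, seq_len, head_dim)
    # Merge the head axis and the copy axis, so copies of one head are adjacent.
    return x.reshape(batch, n_kv_heads * n_rep, seq_len, head_dim)


x = torch.arange(2 * 2 * 3 * 4, dtype=torch.float32).reshape(2, 2, 3, 4)
out = repeat_kv_sketch(x, n_rep=3)
expected = torch.repeat_interleave(x, repeats=3, dim=1)
```

Here `out` and `expected` hold the same values: every key/value head appears three times in a row along dimension 1.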
Classes
|
Base class for model outputs that may also contain past key/values (to speed up sequential decoding). |
|
Base, abstract class for all caches. |
|
A cache that grows dynamically as more tokens are generated. |
|
Multi-headed attention from the 'Attention Is All You Need' paper. |
|
The bare LLaMA Model outputting raw hidden-states without any specific head on top. |
|
|
|
This is the configuration class to store the configuration of a [LlamaModel]. |
|
|
|
|
|
The bare LLaMA Model outputting raw hidden-states without any specific head on top. |
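
To illustrate what "a cache that grows dynamically as more tokens are generated" means in the class list above, here is a minimal sketch. The class `GrowingKVCacheSketch`, its methods, and the tensor layout `(batch, heads, seq_len, head_dim)` are hypothetical and are not the API documented on this page. The sketch only shows the general idea of appending new key/value states along the sequence axis at each decoding step.

```python
import torch


class GrowingKVCacheSketch:
    """Hypothetical illustration only; not the cache class listed above."""

    def __init__(self):
        # One key tensor and one value tensor per layer, created lazily.
        self.keys = []
        self.values = []

    def update(self, key, value, layer_idx):
        """Append new key/value states for a layer and return the full history."""
        if layer_idx == len(self.keys):
            # First call for this layer: store the states as they are.
            self.keys.append(key)
            self.values.append(value)
        else:
            # Later calls: grow along the sequence axis (second-to-last dim).
            self.keys[layer_idx] = torch.cat([self.keys[layer_idx], key], dim=-2)
            self.values[layer_idx] = torch.cat([self.values[layer_idx], value], dim=-2)
        return self.keys[layer_idx], self.values[layer_idx]

    def seq_length(self, layer_idx=0):
        """Number of cached positions for a layer (0 if nothing is cached)."""
        if layer_idx >= len(self.keys):
            return 0
        return self.keys[layer_idx].shape[-2]


cache = GrowingKVCacheSketch()
# Prompt of 5 tokens, then a single newly generated token.
cache.update(torch.zeros(1, 2, 5, 4), torch.zeros(1, 2, 5, 4), layer_idx=0)
k, v = cache.update(torch.ones(1, 2, 1, 4), torch.ones(1, 2, 1, 4), layer_idx=0)
```

After the second call the cache holds six positions for layer 0, so attention at the next step can reuse the earlier keys and values instead of recomputing them.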