pub struct LlamaBatch { /* private fields */ }
A safe wrapper around llama_batch.
Implementations§
impl LlamaBatch
pub fn clear(&mut self)
Clear the batch. This does not free the memory associated with the batch, but it does reset the number of tokens to 0.
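The standard library's `Vec::clear` behaves in a comparable way: the length drops to zero while the allocation is kept. The sketch below uses a plain `Vec<i32>` as a stand-in for the batch's token storage; it is an analogy only and makes no claim about how `LlamaBatch` is implemented. The helper name `clear_keeps_capacity` is hypothetical.

```rust
/// Clears a buffer and reports (length, capacity) before and after.
/// `Vec<i32>` is only a stand-in for the batch's token storage.
fn clear_keeps_capacity(mut buf: Vec<i32>) -> ((usize, usize), (usize, usize)) {
    let before = (buf.len(), buf.capacity());
    buf.clear(); // length becomes 0, the allocation is kept
    let after = (buf.len(), buf.capacity());
    (before, after)
}
```

Because the memory is retained, a cleared batch can be refilled on the next decode step without reallocating.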
pub fn add(&mut self, LlamaToken: LlamaToken, pos: llama_pos, seq_ids: &[i32], logits: bool) -> Result<(), BatchAddError>
Add a token to the batch for the sequences in seq_ids at position pos. If logits is true, logits will be computed for the token and can be read after the next decode.
§Panics
- self.llama_batch.n_tokens does not fit into a usize
- seq_ids.len() does not fit into a llama_seq_id
§Errors
Returns an error if there is insufficient space in the buffer.
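To illustrate the capacity check and the per-token logits flag, here is a minimal sketch built on hypothetical stand-in types. `MockBatch`, `MockAddError`, and `fill_prompt` are invented for this example and are not part of the crate; tokens, positions, and sequence ids are modelled as plain `i32` values.

```rust
/// Hypothetical stand-in error type; not the crate's `BatchAddError`.
#[derive(Debug, PartialEq)]
enum MockAddError {
    /// The batch is full; carries the fixed capacity.
    InsufficientSpace(usize),
}

/// Hypothetical stand-in for a fixed-capacity batch; not the crate's type.
struct MockBatch {
    capacity: usize,
    // (token, position, sequence ids, logits requested)
    entries: Vec<(i32, i32, Vec<i32>, bool)>,
}

impl MockBatch {
    fn new(capacity: usize) -> Self {
        Self { capacity, entries: Vec::with_capacity(capacity) }
    }

    /// Appends one token, or fails if the batch is already full.
    fn add(&mut self, token: i32, pos: i32, seq_ids: &[i32], logits: bool) -> Result<(), MockAddError> {
        if self.entries.len() >= self.capacity {
            return Err(MockAddError::InsufficientSpace(self.capacity));
        }
        self.entries.push((token, pos, seq_ids.to_vec(), logits));
        Ok(())
    }
}

/// Adds a prompt to sequence 0, requesting logits only for the last token.
fn fill_prompt(batch: &mut MockBatch, prompt: &[i32]) -> Result<(), MockAddError> {
    let last = prompt.len().saturating_sub(1);
    for (i, &token) in prompt.iter().enumerate() {
        batch.add(token, i as i32, &[0], i == last)?;
    }
    Ok(())
}
```

Requesting logits only for the final prompt token is a common pattern when only the next-token distribution is needed; propagating the error with `?` lets the caller decide how to handle a full batch.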
pub fn add_sequence(&mut self, tokens: &[LlamaToken], seq_id: i32, logits_all: bool) -> Result<(), BatchAddError>
Add a sequence of tokens to the batch for the given sequence id. If logits_all is true, logits will be computed for every token and can be read after the next decode. Either way, the last token in the sequence will have its logits flag set to true.
§Errors
Returns an error if there is insufficient space in the buffer
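The logits rule described above can be sketched as a small pure function. `logits_flags` is a hypothetical helper written for this illustration, not a function provided by the crate; it only models which tokens end up with their logits flag set.

```rust
/// Returns the per-token logits flags for a sequence of `len` tokens:
/// every flag equals `logits_all`, except the last, which is always true.
fn logits_flags(len: usize, logits_all: bool) -> Vec<bool> {
    (0..len).map(|i| logits_all || i + 1 == len).collect()
}
```

With `logits_all` set to false, only the final token yields logits, which is usually enough for sampling the next token.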
§Panics
pub fn new(n_tokens: usize, n_seq_max: i32) -> Self
Create a new LlamaBatch that can contain up to n_tokens tokens.
§Arguments
- n_tokens: the maximum number of tokens that can be added to the batch
- n_seq_max: the maximum number of sequences that can be added to the batch (generally 1 unless you know what you are doing)
§Panics
Panics if n_tokens is greater than i32::MAX.
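A caller that takes the capacity from untrusted input may want to check this bound before calling `new`. The sketch below shows one way to do that with the standard library's `TryFrom`; `checked_capacity` is a hypothetical helper, not part of the crate.

```rust
use std::convert::TryFrom;

/// Hypothetical pre-check: returns the capacity as an `i32` if it fits,
/// or `None` when `n_tokens` is greater than `i32::MAX`.
fn checked_capacity(n_tokens: usize) -> Option<i32> {
    i32::try_from(n_tokens).ok()
}
```

If the helper returns `None`, the caller can report an error instead of triggering the panic.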