ONNX Runtime generate() C API
Note: this API is in preview and is subject to change.
Overview
This document describes the C API for ONNX Runtime GenAI.
Below are the main functions and types, with code snippets and descriptions for each.
Model API
OgaCreateModel
Creates a model from the given directory. The directory should contain a file called genai_config.json, which corresponds to the configuration specification.

    OgaModel* model = NULL;
    OgaResult* result = OgaCreateModel("path/to/model_dir", &model);

OgaDestroyModel
Destroys the given model.

    OgaDestroyModel(model);

OgaCreateModelWithRuntimeSettings
Creates a model with runtime settings.

    OgaRuntimeSettings* settings = NULL;
    OgaCreateRuntimeSettings(&settings);
    // ... configure settings ...
    OgaModel* model = NULL;
    OgaResult* result = OgaCreateModelWithRuntimeSettings("path/to/model_dir", settings, &model);

OgaCreateModelFromConfig
Creates a model from a config object.

    OgaConfig* config = NULL;
    OgaCreateConfig("path/to/model_dir", &config);
    OgaModel* model = NULL;
    OgaResult* result = OgaCreateModelFromConfig(config, &model);

OgaModelGetType
Gets the type of the model.

    const char* type = NULL;
    OgaModelGetType(model, &type);

OgaModelGetDeviceType
Gets the device type used by the model.

    const char* device_type = NULL;
    OgaModelGetDeviceType(model, &device_type);

Config API
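OgaConfigOverlay, described in this section, takes its overlay as a JSON string, so values computed at run time have to be formatted into that string first. The following is a minimal sketch using snprintf. The helper name build_search_overlay and the "search" / "max_length" keys are assumptions for illustration; check the keys against your own genai_config.json.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of the Oga API): writes
 * {"search": {"max_length": N}} into buf.
 * Returns 0 on success, -1 if buf is too small or formatting fails. */
static int build_search_overlay(char* buf, size_t cap, int max_length) {
  assert(buf != NULL);
  int n = snprintf(buf, cap, "{\"search\": {\"max_length\": %d}}", max_length);
  return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}

/* Usage with the call shown in this section (config is created elsewhere):
 *   char json[64];
 *   if (build_search_overlay(json, sizeof json, 256) == 0)
 *     OgaConfigOverlay(config, json);
 */
```

Checking the snprintf return value against the buffer size avoids passing a silently truncated, and therefore invalid, JSON string to the library.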
OgaCreateConfig
Creates a configuration object from a config path.

    OgaConfig* config = NULL;
    OgaResult* result = OgaCreateConfig("path/to/model_dir", &config);

OgaConfigClearProviders
Clears all providers from the configuration.

    OgaConfigClearProviders(config);

OgaConfigAppendProvider
Appends a provider to the configuration.

    OgaConfigAppendProvider(config, "CUDAExecutionProvider");

OgaConfigSetProviderOption
Sets a provider option in the configuration.

    OgaConfigSetProviderOption(config, "CUDAExecutionProvider", "device_id", "0");

OgaConfigOverlay
Overlays a JSON string onto the configuration.

    OgaConfigOverlay(config, "{\"option\": \"value\"}");

OgaDestroyConfig
Destroys the configuration object.

    OgaDestroyConfig(config);

Runtime Settings API
OgaCreateRuntimeSettings
Creates a runtime settings object.

    OgaRuntimeSettings* settings = NULL;
    OgaCreateRuntimeSettings(&settings);

OgaRuntimeSettingsSetHandle
Sets a named handle in the runtime settings.

    OgaRuntimeSettingsSetHandle(settings, "custom_handle", handle_ptr);

OgaDestroyRuntimeSettings
Destroys the runtime settings object.

    OgaDestroyRuntimeSettings(settings);

Tokenizer API
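OgaTokenizerStreamDecode, described in this section, returns a chunk that stays valid only until the next call or until the stream is destroyed, so text that must outlive the decode loop has to be copied. The following is a minimal sketch of a growable buffer for that purpose. TextBuf, text_buf_append, and text_buf_free are hypothetical names and are not part of the library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable, NUL-terminated text buffer (not part of the Oga API). */
typedef struct TextBuf {
  char* data;  /* owned storage; NULL until the first append */
  size_t len;  /* bytes used, excluding the terminator */
  size_t cap;  /* bytes allocated */
} TextBuf;

/* Copies chunk onto the end of buf.
 * Returns 0 on success, -1 on allocation failure (buf is left unchanged). */
static int text_buf_append(TextBuf* buf, const char* chunk) {
  assert(buf != NULL && chunk != NULL);
  size_t n = strlen(chunk);
  if (buf->len + n + 1 > buf->cap) {
    size_t new_cap = buf->cap ? buf->cap : 64;
    while (new_cap < buf->len + n + 1) new_cap *= 2;
    char* p = realloc(buf->data, new_cap);
    if (p == NULL) return -1;
    buf->data = p;
    buf->cap = new_cap;
  }
  memcpy(buf->data + buf->len, chunk, n + 1);  /* copies the terminator too */
  buf->len += n;
  return 0;
}

/* Releases the storage and resets buf to the empty state. */
static void text_buf_free(TextBuf* buf) {
  free(buf->data);
  buf->data = NULL;
  buf->len = buf->cap = 0;
}

/* Usage inside a decode loop (stream and token come from elsewhere):
 *   TextBuf out = {0};
 *   const char* chunk = NULL;
 *   OgaTokenizerStreamDecode(stream, token, &chunk);
 *   text_buf_append(&out, chunk);
 */
```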
OgaCreateTokenizer
Creates a tokenizer for the given model.

    OgaTokenizer* tokenizer = NULL;
    OgaResult* result = OgaCreateTokenizer(model, &tokenizer);

OgaDestroyTokenizer
Destroys the tokenizer.

    OgaDestroyTokenizer(tokenizer);

OgaTokenizerEncode
Encodes a single string and adds the encoded sequence of tokens to the OgaSequences.

    OgaSequences* sequences = NULL;
    OgaCreateSequences(&sequences);
    OgaTokenizerEncode(tokenizer, "Hello world", sequences);

OgaTokenizerEncodeBatch
Encodes a batch of strings.

    const char* texts[] = {"Hello", "World"};
    OgaTensor* tensor = NULL;
    OgaTokenizerEncodeBatch(tokenizer, texts, 2, &tensor);

OgaTokenizerToTokenId
Converts a string to its corresponding token ID.

    int32_t token_id = 0;
    OgaTokenizerToTokenId(tokenizer, "Hello", &token_id);

OgaTokenizerDecode
Decodes a sequence of tokens into a string.

    const char* out_string = NULL;
    OgaTokenizerDecode(tokenizer, tokens, token_count, &out_string);
    // Use out_string, then:
    OgaDestroyString(out_string);

OgaTokenizerApplyChatTemplate
Applies a chat template to messages and tools.

    const char* result = NULL;
    OgaTokenizerApplyChatTemplate(tokenizer, "template", "messages", "tools", true, &result);
    OgaDestroyString(result);

OgaTokenizerDecodeBatch
Decodes a batch of token sequences.

    OgaStringArray* out_strings = NULL;
    OgaTokenizerDecodeBatch(tokenizer, tensor, &out_strings);
    // Use out_strings, then:
    OgaDestroyStringArray(out_strings);

OgaCreateTokenizerStream
Creates a tokenizer stream for incremental decoding.

    OgaTokenizerStream* stream = NULL;
    OgaCreateTokenizerStream(tokenizer, &stream);

OgaDestroyTokenizerStream
Destroys the tokenizer stream.

    OgaDestroyTokenizerStream(stream);

OgaTokenizerStreamDecode
Decodes a single token in the stream.

    const char* chunk = NULL;
    OgaTokenizerStreamDecode(stream, token, &chunk);
    // chunk is valid until the next call or until the stream is destroyed

Sequences API
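OgaSequencesGetSequenceCount and OgaSequencesGetSequenceData, described in this section, are used as a pair: the first gives the length of the array that the second points to. The following is a small sketch of a debug helper that consumes such a pair. format_token_ids is a hypothetical name and is not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical debug helper: writes the token IDs as "[1, 2, 3]" into out.
 * Returns the number of characters written (excluding the terminator),
 * or -1 if out is too small or formatting fails. */
static int format_token_ids(const int32_t* ids, size_t count, char* out, size_t cap) {
  assert(out != NULL && (ids != NULL || count == 0));
  size_t used = 0;
  int n = snprintf(out, cap, "[");
  if (n < 0 || (size_t)n >= cap) return -1;
  used += (size_t)n;
  for (size_t i = 0; i < count; ++i) {
    /* Separator goes before every element except the first. */
    n = snprintf(out + used, cap - used, "%s%ld", i ? ", " : "", (long)ids[i]);
    if (n < 0 || (size_t)n >= cap - used) return -1;
    used += (size_t)n;
  }
  n = snprintf(out + used, cap - used, "]");
  if (n < 0 || (size_t)n >= cap - used) return -1;
  used += (size_t)n;
  return (int)used;
}

/* Usage with the calls shown in this section (sequences is filled elsewhere):
 *   size_t n = OgaSequencesGetSequenceCount(sequences, 0);
 *   const int32_t* ids = OgaSequencesGetSequenceData(sequences, 0);
 *   char line[256];
 *   if (format_token_ids(ids, n, line, sizeof line) >= 0) puts(line);
 */
```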
OgaCreateSequences
Creates an empty OgaSequences object.

    OgaSequences* sequences = NULL;
    OgaCreateSequences(&sequences);

OgaDestroySequences
Destroys the given OgaSequences.

    OgaDestroySequences(sequences);

OgaSequencesCount
Returns the number of sequences.

    size_t count = OgaSequencesCount(sequences);

OgaSequencesGetSequenceCount
Returns the number of tokens in the sequence at the given index.

    size_t token_count = OgaSequencesGetSequenceCount(sequences, 0);

OgaSequencesGetSequenceData
Returns a pointer to the token data for the sequence at the given index.

    const int32_t* data = OgaSequencesGetSequenceData(sequences, 0);

Generator Params API
OgaCreateGeneratorParams
Creates generator parameters for the given model.

    OgaGeneratorParams* params = NULL;
    OgaCreateGeneratorParams(model, &params);

OgaDestroyGeneratorParams
Destroys the given generator params.

    OgaDestroyGeneratorParams(params);

OgaGeneratorParamsSetSearchNumber
Sets a numeric search option.

    OgaGeneratorParamsSetSearchNumber(params, "max_length", 128);

OgaGeneratorParamsSetSearchBool
Sets a boolean search option.

    OgaGeneratorParamsSetSearchBool(params, "do_sample", true);

OgaGeneratorParamsTryGraphCaptureWithMaxBatchSize
Attempts to enable graph capture mode with a maximum batch size.

    OgaGeneratorParamsTryGraphCaptureWithMaxBatchSize(params, 8);

OgaGeneratorParamsSetInputIDs
Sets the input ids for the generator params.

    OgaGeneratorParamsSetInputIDs(params, input_ids, input_ids_count, sequence_length, batch_size);

OgaGeneratorParamsSetInputSequences
Sets the input id sequences for the generator params.

    OgaGeneratorParamsSetInputSequences(params, sequences);

OgaGeneratorParamsSetModelInput
Sets an additional model input.

    OgaGeneratorParamsSetModelInput(params, "input_name", tensor);

OgaGeneratorParamsSetInputs
Sets named tensors as inputs.

    OgaGeneratorParamsSetInputs(params, named_tensors);

OgaGeneratorParamsSetGuidance
Sets guidance data.

    OgaGeneratorParamsSetGuidance(params, "type", "data");

Generator API
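The generator functions in this section are typically combined into a loop: generate one token, read the newest token from the sequence, and stop once OgaGenerator_IsDone reports completion. The following is a minimal sketch of that loop. To keep it compilable without the library, it defines stand-in stubs unless ORT_GENAI_AVAILABLE (a hypothetical macro) is defined. The stubs are not the real implementation, and their signatures are inferred from the snippets in this document, so check them against ort_genai_c.h. The names generate_all, TokenCallback, and remember_last are likewise hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#ifdef ORT_GENAI_AVAILABLE  /* hypothetical macro: define it when building against the real library */
#include "ort_genai_c.h"
#else
/* Stand-in stubs so the sketch compiles on its own. NOT the real library. */
typedef struct OgaResult OgaResult;
typedef struct OgaGenerator {
  int32_t tokens[8];  /* fake sequence storage */
  size_t count;       /* tokens produced so far */
  size_t limit;       /* fake stopping point, at most 8 */
} OgaGenerator;
static bool OgaGenerator_IsDone(const OgaGenerator* g) { return g->count >= g->limit; }
static OgaResult* OgaGenerator_GenerateNextToken(OgaGenerator* g) {
  g->tokens[g->count] = (int32_t)(100 + g->count);
  g->count++;
  return NULL;  /* NULL is treated as success in this sketch */
}
static size_t OgaGenerator_GetSequenceCount(const OgaGenerator* g, size_t index) {
  (void)index;
  return g->count;
}
static const int32_t* OgaGenerator_GetSequenceData(const OgaGenerator* g, size_t index) {
  (void)index;
  return g->tokens;
}
static void OgaDestroyResult(OgaResult* r) { (void)r; }
#endif

/* Called once per newly generated token; user_data is passed through unchanged. */
typedef void (*TokenCallback)(int32_t token, void* user_data);

/* Example callback: remembers the most recent token. */
static void remember_last(int32_t token, void* user_data) {
  *(int32_t*)user_data = token;
}

/* Runs the loop for sequence 0. Returns the number of tokens generated,
 * or -1 if a generation step returns a non-NULL result. */
static long generate_all(OgaGenerator* generator, TokenCallback on_token, void* user_data) {
  long generated = 0;
  while (!OgaGenerator_IsDone(generator)) {
    OgaResult* r = OgaGenerator_GenerateNextToken(generator);
    if (r != NULL) {
      OgaDestroyResult(r);  /* real code would read the error message first */
      return -1;
    }
    size_t n = OgaGenerator_GetSequenceCount(generator, 0);
    const int32_t* seq = OgaGenerator_GetSequenceData(generator, 0);
    assert(n > 0);
    if (on_token != NULL) on_token(seq[n - 1], user_data);  /* newest token is last */
    ++generated;
  }
  return generated;
}
```

In a streaming application the callback would usually pass each token to OgaTokenizerStreamDecode and print the returned chunk.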
OgaCreateGenerator
Creates a generator from the given model and generator params.

    OgaGenerator* generator = NULL;
    OgaCreateGenerator(model, params, &generator);

OgaDestroyGenerator
Destroys the given generator.

    OgaDestroyGenerator(generator);

OgaGenerator_IsDone
Checks if generation is complete.

    bool done = OgaGenerator_IsDone(generator);

OgaGenerator_AppendTokenSequences
Appends token sequences to the generator.

    OgaGenerator_AppendTokenSequences(generator, sequences);

OgaGenerator_AppendTokens
Appends tokens to the generator.

    OgaGenerator_AppendTokens(generator, input_ids, input_ids_count);

OgaGenerator_IsSessionTerminated
Checks if the session is terminated.

    bool terminated = OgaGenerator_IsSessionTerminated(generator);

OgaGenerator_GenerateNextToken
Generates the next token.

    OgaGenerator_GenerateNextToken(generator);

OgaGenerator_RewindTo
Rewinds the sequence to a new length.

    OgaGenerator_RewindTo(generator, new_length);

OgaGenerator_SetRuntimeOption
Sets a runtime option.

    OgaGenerator_SetRuntimeOption(generator, "terminate_session", "1");

OgaGenerator_GetSequenceCount
Returns the number of tokens in the sequence at the given index.

    size_t count = OgaGenerator_GetSequenceCount(generator, 0);

OgaGenerator_GetSequenceData
Returns a pointer to the sequence data at the given index.

    const int32_t* data = OgaGenerator_GetSequenceData(generator, 0);

OgaGenerator_GetOutput
Gets a named output tensor.

    OgaTensor* tensor = NULL;
    OgaGenerator_GetOutput(generator, "output_name", &tensor);

OgaGenerator_GetLogits
Gets the logits tensor.

    OgaTensor* logits = NULL;
    OgaGenerator_GetLogits(generator, &logits);

OgaGenerator_SetLogits
Sets the logits tensor.

    OgaGenerator_SetLogits(generator, tensor);

OgaSetActiveAdapter
Sets the active adapter for the generator.

    OgaSetActiveAdapter(generator, adapters, "adapter_name");

Adapter API
OgaCreateAdapters
Creates the object that manages the adapters.

    OgaAdapters* adapters = NULL;
    OgaCreateAdapters(model, &adapters);

OgaLoadAdapter
Loads the model adapter from the given adapter file path and adapter name.

    OgaLoadAdapter(adapters, "adapter_file_path", "adapter_name");

OgaUnloadAdapter
Unloads the adapter with the given identifier.

    OgaUnloadAdapter(adapters, "adapter_name");

Tensor API
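OgaTensorGetShapeRank, OgaTensorGetShape, and OgaTensorGetData, described in this section, together yield a shape and a raw data pointer; the number of elements behind that pointer is the product of the dimensions. The following is a minimal sketch of that computation with checks for negative dimensions and overflow. tensor_element_count is a hypothetical helper, and it assumes the tensor data is stored contiguously.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the Oga API): product of shape[0..rank-1].
 * Returns -1 for a negative dimension or if the product overflows int64_t.
 * A rank of 0 (scalar) yields 1. */
static int64_t tensor_element_count(const int64_t* shape, size_t rank) {
  assert(shape != NULL || rank == 0);
  int64_t total = 1;
  for (size_t i = 0; i < rank; ++i) {
    if (shape[i] < 0) return -1;
    if (shape[i] != 0 && total > INT64_MAX / shape[i]) return -1;
    total *= shape[i];
  }
  return total;
}

/* Usage with the calls shown in this section (tensor comes from elsewhere):
 *   size_t rank = 0;
 *   OgaTensorGetShapeRank(tensor, &rank);
 *   int64_t shape[8];                      // assumes rank <= 8
 *   OgaTensorGetShape(tensor, shape, rank);
 *   int64_t n = tensor_element_count(shape, rank);
 */
```

Multiplying the element count by the size of the element type reported by OgaTensorGetType gives the byte length of the data buffer, under the same contiguity assumption.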
OgaCreateTensorFromBuffer
Creates a tensor from a buffer.

    OgaTensor* tensor = NULL;
    OgaCreateTensorFromBuffer(data, shape_dims, shape_dims_count, element_type, &tensor);

OgaTensorGetType
Returns the element type of the tensor.

    OgaElementType type;
    OgaTensorGetType(tensor, &type);

OgaTensorGetShapeRank
Returns the rank (number of dimensions) of the tensor.

    size_t rank;
    OgaTensorGetShapeRank(tensor, &rank);

OgaTensorGetShape
Returns the shape of the tensor.

    int64_t shape[rank];
    OgaTensorGetShape(tensor, shape, rank);

OgaTensorGetData
Returns a pointer to the tensor data.

    void* data = NULL;
    OgaTensorGetData(tensor, &data);

OgaDestroyTensor
Destroys the tensor.

    OgaDestroyTensor(tensor);

Images and Audios API
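OgaLoadImagesFromBuffers and OgaLoadAudiosFromBuffers, described in this section, take file contents that are already in memory. The following is a minimal sketch of reading a whole file into a heap buffer with standard C I/O. read_file is a hypothetical helper, the file name in the usage comment is a placeholder, and the exact parameter types expected by the loader functions should be checked against the header.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the Oga API): reads the whole file at path
 * into a malloc'd buffer. On success returns the buffer and stores its size in
 * *out_size; the caller frees it. Returns NULL on any I/O or allocation error. */
static void* read_file(const char* path, size_t* out_size) {
  assert(path != NULL && out_size != NULL);
  FILE* f = fopen(path, "rb");
  if (f == NULL) return NULL;
  void* data = NULL;
  long size = -1;
  if (fseek(f, 0, SEEK_END) == 0 && (size = ftell(f)) >= 0 && fseek(f, 0, SEEK_SET) == 0) {
    data = malloc(size > 0 ? (size_t)size : 1);  /* avoid malloc(0) */
    if (data != NULL && fread(data, 1, (size_t)size, f) != (size_t)size) {
      free(data);  /* short read: treat as failure */
      data = NULL;
    }
  }
  fclose(f);
  if (data != NULL) *out_size = (size_t)size;
  return data;
}

/* Usage with the call shown in this section:
 *   size_t size = 0;
 *   void* bytes = read_file("image.png", &size);   // placeholder file name
 *   const void* image_data[] = { bytes };
 *   size_t image_sizes[] = { size };
 *   OgaImages* images = NULL;
 *   OgaLoadImagesFromBuffers(image_data, image_sizes, 1, &images);
 */
```

This document does not say whether the library copies the buffers, so the safe assumption is to keep each buffer alive until the resulting images or audios object has been destroyed.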
OgaLoadImages
Loads images from file paths.

    OgaStringArray* image_paths = NULL;
    OgaCreateStringArrayFromStrings(paths, count, &image_paths);
    OgaImages* images = NULL;
    OgaLoadImages(image_paths, &images);
    OgaDestroyStringArray(image_paths);

OgaLoadImagesFromBuffers
Loads images from memory buffers.

    OgaImages* images = NULL;
    OgaLoadImagesFromBuffers(image_data, image_sizes, count, &images);

OgaDestroyImages
Destroys the images object.

    OgaDestroyImages(images);

OgaLoadAudios
Loads audios from file paths.

    OgaStringArray* audio_paths = NULL;
    OgaCreateStringArrayFromStrings(paths, count, &audio_paths);
    OgaAudios* audios = NULL;
    OgaLoadAudios(audio_paths, &audios);
    OgaDestroyStringArray(audio_paths);

OgaLoadAudiosFromBuffers
Loads audios from memory buffers.

    OgaAudios* audios = NULL;
    OgaLoadAudiosFromBuffers(audio_data, audio_sizes, count, &audios);

OgaDestroyAudios
Destroys the audios object.

    OgaDestroyAudios(audios);

Named Tensors API
OgaCreateNamedTensors
Creates a named tensors object.

    OgaNamedTensors* named_tensors = NULL;
    OgaCreateNamedTensors(&named_tensors);

OgaNamedTensorsGet
Gets a tensor by name.

    OgaTensor* tensor = NULL;
    OgaNamedTensorsGet(named_tensors, "input_name", &tensor);

OgaNamedTensorsSet
Sets a tensor by name.

    OgaNamedTensorsSet(named_tensors, "input_name", tensor);

OgaNamedTensorsDelete
Deletes a tensor by name.

    OgaNamedTensorsDelete(named_tensors, "input_name");

OgaNamedTensorsCount
Returns the number of named tensors.

    size_t count = 0;
    OgaNamedTensorsCount(named_tensors, &count);

OgaNamedTensorsGetNames
Gets the names of all tensors.

    OgaStringArray* names = NULL;
    OgaNamedTensorsGetNames(named_tensors, &names);
    OgaDestroyStringArray(names);

OgaDestroyNamedTensors
Destroys the named tensors object.

    OgaDestroyNamedTensors(named_tensors);

Utility Functions
OgaSetLogBool
Sets a boolean logging option.

    OgaSetLogBool("option_name", true);

OgaSetLogString
Sets a string logging option.

    OgaSetLogString("option_name", "value");

OgaSetCurrentGpuDeviceId
Sets the current GPU device ID.

    OgaSetCurrentGpuDeviceId(0);

OgaGetCurrentGpuDeviceId
Gets the current GPU device ID.

    int device_id = 0;
    OgaGetCurrentGpuDeviceId(&device_id);

OgaResultGetError
Gets the error message from an OgaResult.

    const char* error = OgaResultGetError(result);

OgaDestroyResult
Destroys an OgaResult.

    OgaDestroyResult(result);

OgaDestroyString
Destroys a string returned by the API.

    OgaDestroyString(str);

OgaDestroyBuffer
Destroys a buffer.

    OgaDestroyBuffer(buffer);

OgaBufferGetType
Gets the type of the buffer.

    OgaDataType type = OgaBufferGetType(buffer);

OgaBufferGetDimCount
Gets the number of dimensions of a buffer.

    size_t dim_count = OgaBufferGetDimCount(buffer);

OgaBufferGetDims
Gets the dimensions of a buffer.

    size_t dims[dim_count];
    OgaBufferGetDims(buffer, dims, dim_count);

OgaBufferGetData
Gets the data from a buffer.

    const void* data = OgaBufferGetData(buffer);
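Several snippets in this document capture an OgaResult* return value. A common pattern, sketched below, is to treat NULL as success and otherwise read the message with OgaResultGetError before releasing the result with OgaDestroyResult, both described above. The helper name oga_check and the ORT_GENAI_AVAILABLE macro are hypothetical. The stand-in stubs exist only so the sketch compiles without the library, and the NULL-on-success convention is an assumption to confirm against ort_genai_c.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#ifdef ORT_GENAI_AVAILABLE  /* hypothetical macro: define it when building against the real library */
#include "ort_genai_c.h"
#else
/* Stand-in stubs so the sketch compiles on its own. NOT the real library. */
typedef struct OgaResult {
  const char* message;
  int destroyed;  /* set by the stub so the behavior can be observed */
} OgaResult;
static const char* OgaResultGetError(const OgaResult* result) { return result->message; }
static void OgaDestroyResult(OgaResult* result) { result->destroyed = 1; }
#endif

/* Hypothetical helper: returns 0 if result is NULL (assumed to mean success).
 * Otherwise prints the error to stderr, destroys the result, and returns -1. */
static int oga_check(OgaResult* result, const char* what) {
  assert(what != NULL);
  if (result == NULL) return 0;
  fprintf(stderr, "%s failed: %s\n", what, OgaResultGetError(result));
  OgaDestroyResult(result);
  return -1;
}

/* Usage with a call from the Model API section:
 *   OgaModel* model = NULL;
 *   if (oga_check(OgaCreateModel("path/to/model_dir", &model), "OgaCreateModel") != 0)
 *     return 1;
 */
```

Reading the message before calling OgaDestroyResult matters because the string is presumably owned by the result object.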