Young-Flash/CLIP_onnx_demo

This demo (based on @greyovo's Jupyter notebook) shows the inference results for the same text & image input with different models: the original CLIP model, the exported ONNX encoders, and the INT8-quantized ONNX encoders. Test results on my local machine are as follows:

| model | result |
| --- | --- |
| CLIP | [[6.1091479e-02 9.3267566e-01 5.3717378e-03 8.6108845e-04]] |
| clip-image-encoder.onnx & clip-text-encoder.onnx | [[6.1091259e-02 9.3267584e-01 5.3716768e-03 8.6109847e-04]] |
| clip-image-encoder-quant-int8.onnx & clip-text-encoder-quant-int8.onnx | [[4.703762e-02 9.391219e-01 9.90335e-03 3.93698e-03]] |

The test input texts are ["a tiger", "a cat", "a dog", "a bear"], used together with a single test image (see the image in the repository).
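A comparison along these lines can be run with a script like the following. This is a minimal sketch, not the notebook itself: it assumes the exported ONNX encoders take the same preprocessed image tensor and token IDs as the original CLIP model and return raw (unnormalized) feature vectors, and it uses `test.jpg` as a placeholder path for the test image. The actual input names and dtypes should be checked against the exported graphs (e.g. via `session.get_inputs()`).

```python
# Sketch: compare original CLIP against the exported ONNX encoders
# on the same text & image input.
import clip
import numpy as np
import onnxruntime as ort
import torch
from PIL import Image

texts = ["a tiger", "a cat", "a dog", "a bear"]
image_path = "test.jpg"  # placeholder path to the test image

device = "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
tokens = clip.tokenize(texts).to(device)

# 1) Original CLIP: logits_per_image already includes the learned logit scale.
with torch.no_grad():
    logits_per_image, _ = model(image, tokens)
    probs_clip = logits_per_image.softmax(dim=-1).cpu().numpy()
print("CLIP:", probs_clip)

# 2) ONNX encoders: run image and text encoders separately, then redo the
#    cosine-similarity + softmax step that CLIP performs internally.
def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

img_sess = ort.InferenceSession("clip-image-encoder.onnx")
txt_sess = ort.InferenceSession("clip-text-encoder.onnx")

# Cast the token IDs (e.g. .astype(np.int64)) if the exported graph expects a
# different integer dtype than clip.tokenize produces.
img_feat = img_sess.run(None, {img_sess.get_inputs()[0].name: image.numpy()})[0]
txt_feat = txt_sess.run(None, {txt_sess.get_inputs()[0].name: tokens.numpy()})[0]

# Normalize features and apply CLIP's logit scale (~100 for the released
# models) before the softmax, mirroring the original model.
img_feat = img_feat / np.linalg.norm(img_feat, axis=-1, keepdims=True)
txt_feat = txt_feat / np.linalg.norm(txt_feat, axis=-1, keepdims=True)
probs_onnx = softmax(100.0 * img_feat @ txt_feat.T)
print("ONNX:", probs_onnx)
```

Swapping in the `*-quant-int8.onnx` files reproduces the quantized-model row of the table with the same similarity math.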
