add VRAM requirements for models to README
@@ -6,6 +6,12 @@ This repository contains Jupyter notebooks for comprehensive image annotation us
This project uses the Moondream series of vision-language models: compact models that pair a vision encoder with a transformer language model to provide detailed analysis of image content. Moondream models are efficient enough for edge deployment while maintaining high accuracy on image-understanding tasks.
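
As a rough illustration (not taken from the notebooks), the sketch below shows one common way to load Moondream 2 from the Hugging Face Hub and ask it about an image. The `vikhyatk/moondream2` checkpoint name, the `trust_remote_code` loading path, the `query` call, and the `example.jpg` file are assumptions based on the public model card and may differ from the revision this repository actually uses.

```python
# Hedged sketch: loading Moondream 2 via transformers' remote-code path.
# The checkpoint name and the query() method are assumptions from the public
# model card and may vary between model revisions.
from transformers import AutoModelForCausalLM
from PIL import Image

model = AutoModelForCausalLM.from_pretrained(
    "vikhyatk/moondream2",   # assumed checkpoint ID
    trust_remote_code=True,  # Moondream ships its own modeling code
    device_map="cuda",       # needs a GPU with enough VRAM (see below)
)

image = Image.open("example.jpg")  # hypothetical input image
result = model.query(image, "Describe this image in detail.")
print(result["answer"])
```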
### VRAM Requirements
- SAM 3 and Grounding Dino: 2 GB VRAM
- Moondream 2: 3.8 GB VRAM
- Moondream 3 (Quantized INT4): 6 GB VRAM
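
As a quick sanity check against the figures above, the following PyTorch snippet (an illustrative sketch, not part of the notebooks) reports the total and currently free VRAM on the first CUDA device:

```python
# Illustrative sketch: report total and free VRAM so it can be compared
# against the per-model requirements listed above.
import torch

if torch.cuda.is_available():
    free_bytes, total_bytes = torch.cuda.mem_get_info(0)  # bytes on device 0
    gib = 1024 ** 3
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"Total VRAM: {total_bytes / gib:.1f} GiB, free: {free_bytes / gib:.1f} GiB")
else:
    print("No CUDA device detected; these models will be very slow or fail on CPU.")
```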
## Notebooks
### 1. Image_Annotation_Testing_Satyam.ipynb
@@ -82,7 +88,8 @@ Each notebook serves a specific purpose in the image annotation pipeline:
## Notes
- Ensure you have sufficient GPU memory (at least 8 GB recommended) for running vision-language models
- VRAM requirements vary by model: SAM 3 and Grounding Dino (2 GB), Moondream 2 (3.8 GB), Moondream 3 (Quantized INT4) (6 GB)
- For optimal performance, ensure your GPU meets or exceeds the VRAM requirements for your selected model
- Models may require internet connectivity for the initial download from the Hugging Face Hub (see the pre-download sketch after this list)
- Results may vary depending on the complexity and quality of input images
- Preprocessing steps may be necessary for optimal model performance
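
If the runtime has intermittent connectivity, model weights can be fetched ahead of time with `huggingface_hub`. The sketch below is illustrative only; the repo ID shown (`vikhyatk/moondream2`) is an assumption, so substitute the checkpoints the notebooks actually load.

```python
# Illustrative sketch: pre-download model weights into the local Hugging Face
# cache so the notebooks can run later without re-fetching them.
from huggingface_hub import snapshot_download

# Assumed repo ID; replace with the checkpoints referenced in the notebooks.
local_path = snapshot_download(repo_id="vikhyatk/moondream2")
print(f"Model files cached at: {local_path}")
```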