Image Annotation Project
This repository contains Jupyter notebooks for image annotation with vision-language models, primarily Moondream. The project covers image understanding, object segmentation, and conversion of annotations to COCO format.
Notebooks
1. Image_Annotation_Testing_Satyam.ipynb
This notebook contains experiments that test image annotation with vision-language models and evaluate how well the models understand and annotate images.
2. Moondream_Segmentation_Satyam.ipynb
This notebook segments objects in images using the Moondream vision-language model and generates a boundary for each object in the scene.
3. Moondream3_to_COCO_Satyam.ipynb
This notebook handles the conversion of annotations to the COCO (Common Objects in Context) format. It takes segmented objects and converts them into a standardized JSON format suitable for training computer vision models.
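As a rough illustration of what the COCO conversion involves, here is a minimal sketch in plain Python. It assumes each segmented object is available as a polygon, that is, a list of (x, y) pixel coordinates; the notebook's actual data structures may differ. The function names `polygon_area`, `polygon_to_coco_annotation`, and `build_coco_dataset` are hypothetical and are not taken from the notebook.

```python
import json


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula; points is [(x, y), ...]."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_to_coco_annotation(points, annotation_id, image_id, category_id):
    """Build one COCO annotation dict from a polygon (hypothetical helper)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, y_min = min(xs), min(ys)
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        # COCO stores each polygon as a flat list [x1, y1, x2, y2, ...]
        "segmentation": [[float(c) for point in points for c in point]],
        # COCO bounding boxes are [x, y, width, height]
        "bbox": [float(x_min), float(y_min),
                 float(max(xs) - x_min), float(max(ys) - y_min)],
        "area": polygon_area(points),
        "iscrowd": 0,
    }


def build_coco_dataset(images, annotations, categories):
    """Assemble the three top-level lists used by COCO instance annotations."""
    return {"images": images, "annotations": annotations, "categories": categories}


if __name__ == "__main__":
    # Example values below are made up for illustration only.
    square = [(10, 10), (30, 10), (30, 20), (10, 20)]
    dataset = build_coco_dataset(
        images=[{"id": 1, "file_name": "example.jpg", "width": 64, "height": 48}],
        annotations=[polygon_to_coco_annotation(square, 1, 1, 1)],
        categories=[{"id": 1, "name": "object"}],
    )
    print(json.dumps(dataset, indent=2))
```

The resulting dictionary can be written to disk with `json.dump` and loaded by tools that read COCO instance annotations.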
Prerequisites
To run these notebooks, you'll need:
- Python 3.8+
- Jupyter Notebook or JupyterLab
- PyTorch
- Transformers
- Pillow
- NumPy
- OpenCV
- Moondream model dependencies
Setup
- Clone or download this repository
- Install required dependencies:
pip install torch torchvision
pip install transformers pillow numpy opencv-python
- Launch Jupyter:
jupyter notebook
- Open any of the notebooks and run the cells
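Before opening a notebook, it can help to confirm that the packages installed above are importable. The following is a small sketch using only the standard library; the mapping from pip package names to import names reflects the packages listed in this README, and the helper name `find_missing` is hypothetical.

```python
import importlib.util

# pip package name -> module name used in "import ..."
REQUIRED_MODULES = {
    "torch": "torch",
    "torchvision": "torchvision",
    "transformers": "transformers",
    "pillow": "PIL",
    "numpy": "numpy",
    "opencv-python": "cv2",
}


def find_missing(modules=None):
    """Return the pip package names whose modules cannot be found."""
    if modules is None:
        modules = REQUIRED_MODULES
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

Any package reported as missing can be installed with the pip commands from the Setup steps.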
Usage
Each notebook can be run independently depending on your specific needs:
- Use Image_Annotation_Testing_Satyam.ipynb to test and evaluate image annotation capabilities
- Use Moondream_Segmentation_Satyam.ipynb for object segmentation tasks
- Use Moondream3_to_COCO_Satyam.ipynb to convert annotations to COCO format
Dependencies
- Moondream - Vision-language model
- PyTorch - Deep learning framework
- OpenCV - Computer vision library
- COCO API - For annotation format handling
Notes
- Ensure you have sufficient GPU memory for running vision-language models
- Models may require internet connectivity for initial downloads
- Results may vary depending on the complexity of the images
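To check GPU availability and total memory before loading a model, a sketch along these lines may be useful. It assumes a CUDA-capable setup and uses PyTorch's `torch.cuda` utilities; the helper name `describe_gpu` is hypothetical, and how much memory a given model actually needs is not stated in this README.

```python
def describe_gpu():
    """Return a short description of the first CUDA device, if any."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"

    if not torch.cuda.is_available():
        return "No CUDA GPU detected"

    # Properties of device 0; total_memory is reported in bytes.
    props = torch.cuda.get_device_properties(0)
    total_gib = props.total_memory / 1024 ** 3
    return f"{props.name}: {total_gib:.1f} GiB total memory"


if __name__ == "__main__":
    print(describe_gpu())
```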
Author
Satyam - Image Annotation Project