2025-12-15 22:10:02 +05:30

# Image Annotation Project
This repository contains Jupyter notebooks for annotating images with vision-language models. The notebooks cover image understanding, object segmentation with Moondream, and conversion of the results to COCO format.
## Notebooks
### 1. Image_Annotation_Testing_Satyam.ipynb
This notebook tests image annotation with vision-language models. Its experiments evaluate how well the models understand and annotate images.
### 2. Moondream_Segmentation_Satyam.ipynb
This notebook segments objects in images using the Moondream vision-language model and generates a boundary for each object in the scene.
### 3. Moondream3_to_COCO_Satyam.ipynb
This notebook handles the conversion of annotations to the COCO (Common Objects in Context) format. It takes segmented objects and converts them into a standardized JSON format suitable for training computer vision models.
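
As a rough illustration of the target format, the sketch below builds a minimal COCO-style dictionary from a single polygon. It is not taken from the notebook: the helper name `polygon_to_coco_annotation`, the file name `example.jpg`, and the `object` category are hypothetical, and it assumes each segmented object is already available as a list of `(x, y)` points.

```python
import json


def polygon_to_coco_annotation(polygon, annotation_id, image_id, category_id):
    """Build one COCO annotation dict from a polygon given as [(x, y), ...]."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    x_min, y_min = min(xs), min(ys)
    # Shoelace formula: twice the signed area of the polygon in square pixels.
    n = len(polygon)
    twice_area = sum(
        polygon[i][0] * polygon[(i + 1) % n][1]
        - polygon[(i + 1) % n][0] * polygon[i][1]
        for i in range(n)
    )
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        # COCO stores each polygon as a flat [x1, y1, x2, y2, ...] list.
        "segmentation": [[float(v) for point in polygon for v in point]],
        "area": abs(twice_area) / 2.0,
        # COCO bounding boxes are [x, y, width, height], not corner pairs.
        "bbox": [
            float(x_min),
            float(y_min),
            float(max(xs) - x_min),
            float(max(ys) - y_min),
        ],
        "iscrowd": 0,
    }


# Hypothetical example: a 40x20 rectangle inside a 640x480 image.
rectangle = [(10, 20), (50, 20), (50, 40), (10, 40)]
coco = {
    "images": [{"id": 1, "file_name": "example.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "object"}],
    "annotations": [polygon_to_coco_annotation(rectangle, 1, 1, 1)],
}
coco_json = json.dumps(coco, indent=2)
```

The three top-level keys (`images`, `categories`, `annotations`) are what COCO-style loaders generally expect; the notebook's actual output may include more fields.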
## Prerequisites
To run these notebooks, you'll need:
- Python 3.8+
- Jupyter Notebook or JupyterLab
- PyTorch
- Transformers
- Pillow
- NumPy
- OpenCV
- Moondream model dependencies
## Setup
1. Clone or download this repository
2. Install required dependencies:
```bash
pip install torch torchvision
pip install transformers pillow numpy opencv-python
```
3. Launch Jupyter:
```bash
jupyter notebook
```
4. Open any of the notebooks and run the cells
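
Before opening a notebook, it can help to confirm that the packages installed above are importable. The following is a minimal sketch, not part of the repository; the `REQUIRED_MODULES` mapping and the `find_missing_packages` helper are hypothetical names, and the list covers only the packages from the install commands above.

```python
import importlib.util

# Import names differ from pip names for some packages
# (Pillow imports as PIL, opencv-python as cv2).
REQUIRED_MODULES = {
    "torch": "torch",
    "torchvision": "torchvision",
    "transformers": "transformers",
    "pillow": "PIL",
    "numpy": "numpy",
    "opencv-python": "cv2",
}


def find_missing_packages(required=REQUIRED_MODULES):
    """Return the pip names of packages whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


missing = find_missing_packages()
if missing:
    print("Missing packages: pip install " + " ".join(missing))
else:
    print("All listed packages are importable.")
```

The check only verifies that each module can be located; it does not import it or validate versions.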
## Usage
Each notebook runs independently; pick the one that matches your task:
1. Use `Image_Annotation_Testing_Satyam.ipynb` to test and evaluate image annotation capabilities
2. Use `Moondream_Segmentation_Satyam.ipynb` for object segmentation tasks
3. Use `Moondream3_to_COCO_Satyam.ipynb` to convert annotations to COCO format
## Dependencies
- [Moondream](https://github.com/vikhyat/moondream) - Vision-language model
- PyTorch - Deep learning framework
- OpenCV - Computer vision library
- COCO API - For annotation format handling
## Notes
- Ensure your GPU has enough memory to load the vision-language model
- Model weights may need to be downloaded on first use, which requires an internet connection
- Results vary with the complexity of the images
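
To see what hardware is available before loading a model, a short check along these lines may be useful. This is a sketch under assumptions rather than code from the notebooks: `describe_device` is a hypothetical helper, it inspects only the first CUDA device, and it reports total rather than free memory.

```python
try:
    import torch
except ImportError:  # lets the check run even before PyTorch is installed
    torch = None


def describe_device():
    """Return (device_name, total_memory_in_GiB or None)."""
    if torch is not None and torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        return "cuda", props.total_memory / 1024**3
    return "cpu", None


device, gpu_memory_gib = describe_device()
if device == "cuda":
    print(f"Using GPU with {gpu_memory_gib:.1f} GiB of memory")
else:
    print("No CUDA GPU detected; falling back to CPU (expect slow inference)")
```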
## Author
Satyam - Image Annotation Project