Data-Analytics/Day_001_Setup_and_Data_Ingestion.md
tejaswini 16afdbb00b Day 1 — Setup and Data Ingestion
2025-10-06 06:47:30 +00:00


# 📅 Day 1 — Setup and Data Ingestion
## 🎯 Goal
Establish the foundation for the project.
The primary objective is to create an **n8n workflow** that successfully fetches raw **sales** and **product review data** from their respective sources.
---
## 🧩 Tasks
### 1. Environment Setup
- Set up your **n8n instance**:
  - Options: n8n Cloud, Docker, or local installation.
  - Reference: [https://docs.n8n.io/hosting/](https://docs.n8n.io/hosting/)
- Create a **Python environment** (for example, a virtual environment created with `python -m venv`).
- Install the libraries needed for the later analysis steps:
```bash
pip install pandas nltk
```
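
To confirm the install worked, a small check like the one below can help. It is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of the project.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The two libraries installed in the step above.
    missing = missing_packages(["pandas", "nltk"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("pandas and nltk are installed.")
```

Run it with the same interpreter you used for `pip install`, so that the check reflects the environment the project will actually use.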
---
### 2. Create a New n8n Workflow
- Start with a **Manual Trigger** node (`Start`).
- This will later be replaced with an automated schedule on **Day 5**.
---
### 3. Fetch Sales Data
- Add an **HTTP Request** node or a database node (**Postgres** or **MySQL**), depending on where your sales data lives.
- For an API source, connect to your e-commerce platform's orders endpoint (e.g., Shopify → `/orders.json`).
- Set up credentials (API key, OAuth2, etc.).
- Run the node and verify that recent order data (e.g., `order_id`, `product_id`, `price`, `quantity`) is returned.
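
Order APIs often return nested JSON, with line items inside each order. The sketch below shows one way to flatten such a response into the fields listed above. The payload shape (`orders`, `id`, `line_items`) is an assumption loosely modeled on a Shopify-style response, and `flatten_orders` is a hypothetical helper; adjust the keys to whatever your source returns.

```python
def flatten_orders(payload):
    """Turn a nested orders payload into one flat row per line item."""
    rows = []
    for order in payload.get("orders", []):
        for item in order.get("line_items", []):
            rows.append({
                "order_id": order["id"],
                "product_id": item["product_id"],
                "price": float(item["price"]),    # prices may arrive as strings
                "quantity": int(item["quantity"]),
            })
    return rows


# Hypothetical sample payload; real field names depend on your API.
sample_payload = {
    "orders": [
        {"id": 1001, "line_items": [
            {"product_id": 501, "price": "19.99", "quantity": 2},
            {"product_id": 502, "price": "5.00", "quantity": 1},
        ]},
        {"id": 1002, "line_items": [
            {"product_id": 501, "price": "19.99", "quantity": 1},
        ]},
    ]
}

sales_rows = flatten_orders(sample_payload)
print(sales_rows)
```

One row per line item keeps `product_id` at the top level, which makes it easier to match sales against reviews later.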
---
### 4. Fetch Product Reviews
- Add another **HTTP Request** node.
- Configure it to pull recent **product reviews** (e.g., `review_text`, `rating`).
- Run this node on its own and confirm that it returns review data before connecting it to the rest of the workflow.
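
When checking the review output, it can be useful to separate usable records from incomplete ones. The following is a minimal sketch; the 1 to 5 rating range and the helper name `split_valid_reviews` are assumptions, not something the review source is known to guarantee.

```python
def split_valid_reviews(reviews, min_rating=1, max_rating=5):
    """Split reviews into (valid, rejected) based on text and rating checks."""
    valid, rejected = [], []
    for review in reviews:
        text = review.get("review_text")
        rating = review.get("rating")
        has_text = isinstance(text, str) and text.strip() != ""
        has_rating = (
            isinstance(rating, (int, float))
            and min_rating <= rating <= max_rating
        )
        if has_text and has_rating:
            valid.append(review)
        else:
            rejected.append(review)
    return valid, rejected


# Hypothetical sample records for illustration only.
sample_reviews = [
    {"product_id": 501, "review_text": "Works as described.", "rating": 5},
    {"product_id": 502, "review_text": "   ", "rating": 4},        # blank text
    {"product_id": 503, "review_text": "Too small.", "rating": 9},  # out of range
]

valid_reviews, rejected_reviews = split_valid_reviews(sample_reviews)
print(len(valid_reviews), "valid,", len(rejected_reviews), "rejected")
```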
---
### 5. Combine Data Streams
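- Add a **Merge** node.
- Connect the sales node and the reviews node to its two inputs.
- Set **Mode** → `Combine`, then choose how items are paired (for example, by a shared field such as `product_id`, if your review data includes one).
- The node then outputs a single stream containing both data sets, ready for the next phase.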
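
To make the idea of combining on a shared field concrete, here is a plain-Python illustration. It is not n8n's implementation; it is a simple inner join that keeps only sales with at least one matching review, and it assumes both data sets carry a `product_id` field. The function name `combine_by_field` and the sample records are hypothetical.

```python
from collections import defaultdict


def combine_by_field(sales, reviews, key="product_id"):
    """Pair each sale with every review that shares the same key value."""
    reviews_by_key = defaultdict(list)
    for review in reviews:
        reviews_by_key[review[key]].append(review)

    combined = []
    for sale in sales:
        for review in reviews_by_key.get(sale[key], []):
            # Merge the two records into one dict; review fields are added to the sale.
            combined.append({**sale, **review})
    return combined


# Hypothetical sample data for illustration only.
sales = [
    {"order_id": 1001, "product_id": 501, "price": 19.99, "quantity": 2},
    {"order_id": 1002, "product_id": 502, "price": 5.00, "quantity": 1},
]
reviews = [
    {"product_id": 501, "review_text": "Works as described.", "rating": 5},
    {"product_id": 503, "review_text": "Too small.", "rating": 2},
]

combined = combine_by_field(sales, reviews)
print(combined)
```

With this sample data, only the sale for product 501 has a matching review, so the result contains a single combined record. Whether unmatched items should be kept or dropped is a design choice to settle when configuring the Merge node.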
---
## ✅ Deliverable
A manually triggered **n8n workflow** that:
- Pulls raw data from **two sources** (sales + reviews).
- Combines them using a **Merge node**.
- Outputs unified JSON data ready for analysis.