Data-Analytics/Data_Analytics_Introduction.md
tejaswini cdb06982a6 Updated Data Analytics information
Added data source details to the file.
2025-10-06 05:59:47 +00:00

# Data Analytics
## What is Data Analytics?
Data Analytics is the process of examining raw data to uncover patterns, correlations, trends, and insights that can support better decision-making. It involves collecting, cleaning, processing, and interpreting data using statistical, programming, and visualization techniques.
## Why is Data Analytics Used?
- To make data-driven decisions.
- To identify patterns and predict future trends.
- To improve efficiency and reduce costs.
- To understand customer behavior and enhance experiences.
- To detect risks or fraud in business operations.
- To support strategic planning with evidence-based insights.
## Role and Responsibilities of a Data Analyst
- **Data Collection** - Gather data from multiple sources (databases, APIs, spreadsheets, etc.).
- **Data Cleaning & Preparation** - Handle missing values, remove duplicates, standardize formats.
- **Exploratory Data Analysis (EDA)** - Find patterns, trends, and relationships.
- **Data Visualization** - Present insights via dashboards, charts, and graphs.
- **Reporting & Communication** - Share findings with stakeholders in business-friendly language.
- **Statistical & Predictive Analysis** - Use models to forecast and simulate scenarios.
- **Collaboration** - Work with business teams, data engineers, and data scientists to improve systems.
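
The cleaning and exploration steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; in practice a library such as Pandas would usually do this work. The sample rows, the column names, the two accepted date formats, and the choice to fill missing sales with the mean are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Made-up rows standing in for a real export (all values are illustrative).
raw_rows = [
    {"order_id": "1001", "region": " East ", "order_date": "2024-01-05", "sales": "250.0"},
    {"order_id": "1002", "region": "west", "order_date": "05/01/2024", "sales": ""},
    {"order_id": "1002", "region": "west", "order_date": "05/01/2024", "sales": ""},
    {"order_id": "1003", "region": "EAST", "order_date": "2024-01-07", "sales": "100.0"},
]


def parse_date(text):
    """Standardize a date to ISO format; the two input formats are assumptions."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean(rows):
    seen = set()
    cleaned = []
    for row in rows:
        # Remove duplicates, keeping the first row for each order_id.
        if row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        cleaned.append({
            "order_id": row["order_id"],
            "region": row["region"].strip().title(),   # standardize text format
            "order_date": parse_date(row["order_date"]),
            "sales": float(row["sales"]) if row["sales"] else None,
        })
    # Handle missing values: fill with the mean of the known values (one simple strategy).
    known = [r["sales"] for r in cleaned if r["sales"] is not None]
    fill_value = mean(known) if known else 0.0
    for r in cleaned:
        if r["sales"] is None:
            r["sales"] = fill_value
    return cleaned


cleaned_rows = clean(raw_rows)

# A first exploratory summary: total sales per region.
sales_by_region = defaultdict(float)
for r in cleaned_rows:
    sales_by_region[r["region"]] += r["sales"]

for region, total in sorted(sales_by_region.items()):
    print(region, total)
```

Filling with the mean is only one option; dropping incomplete rows or using the median may suit other datasets better.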
## Tools Required for Data Analytics
Here's a list of commonly used tools, with official download links and what each is used for:
### 1. [Python](https://www.python.org/downloads/)
**Uses:** Widely used for data analysis, machine learning, and automation with powerful libraries like Pandas, NumPy, Matplotlib, and Scikit-learn.
### 2. [Excel (with Power Query & Power Pivot)](https://www.microsoft.com/en-us/microsoft-365/excel)
**Uses:** Essential for data manipulation, cleaning, and reporting. Power Query enables data extraction and transformation, while Power Pivot helps with data modeling and analysis.
### 3. [Tableau (Public Edition)](https://public.tableau.com/en-us/s/download)
**Uses:** Provides intuitive drag-and-drop dashboards for data visualization and storytelling, making insights easy to understand.
### 4. [Power BI (Desktop)](https://www.microsoft.com/en-us/download/details.aspx?id=58494)
**Uses:** Microsoft's business intelligence tool, well suited to building interactive dashboards. It integrates closely with Excel and common databases.
### 5. [MySQL (Community Server)](https://dev.mysql.com/downloads/mysql/)
**Uses:** A popular open-source relational database for storing, managing, and querying structured data efficiently.
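
As a rough illustration of querying structured data, the sketch below runs a typical aggregate query from Python. It uses the standard library's `sqlite3` module with an in-memory database as a stand-in, since MySQL needs a running server and a separate connector library. The `orders` table and its values are hypothetical. The `SELECT ... GROUP BY` statement itself is standard SQL and should look much the same on MySQL.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL; table and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, sales REAL)"
)
conn.executemany(
    "INSERT INTO orders (order_id, region, sales) VALUES (?, ?, ?)",
    [(1, "East", 250.0), (2, "West", 175.0), (3, "East", 100.0)],
)

# Aggregate orders and sales per region, highest total first.
query = """
    SELECT region, COUNT(*) AS order_count, SUM(sales) AS total_sales
    FROM orders
    GROUP BY region
    ORDER BY total_sales DESC
"""
region_totals = conn.execute(query).fetchall()
conn.close()

for region, order_count, total_sales in region_totals:
    print(f"{region}: {order_count} orders, {total_sales:.2f} total")
```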
## 📊 Sample Open Data Sources for Practice
### **A. Sales and Retail Data**
- **Dataset:** [Sample Superstore Dataset (Tableau)](https://community.tableau.com/s/sample-superstore-data)
- **File Type:** Excel (.xls)
- **Why Used:** Great for practicing **sales performance analysis**, **profit margins**, and **customer segmentation**.
---
### **B. Human Resources (HR) Data**
- **Dataset:** [HR Analytics Dataset (Kaggle)](https://www.kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset)
- **File Type:** CSV
- **Why Used:** Perfect for **employee attrition**, **demographics**, and **workforce insights** projects.
---
### **C. Financial / Banking Data**
- **Dataset:** [Bank Marketing Dataset (UCI Repository)](https://archive.ics.uci.edu/dataset/222/bank+marketing)
- **File Type:** CSV
- **Why Used:** Commonly used for **classification and predictive analytics**, for example predicting whether a client will subscribe to a term deposit.
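
Before building any classifier on a dataset like this, a common first step is to look at how the outcome varies across groups. The sketch below computes a "yes" rate per job category. It assumes the file is semicolon-delimited with a `job` column and a yes/no target column named `y`; check the downloaded file before relying on that layout. The rows here are made up so the example runs without a download.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the assumed layout: semicolon-delimited, target column "y".
sample = io.StringIO(
    "age;job;y\n"
    "30;admin.;no\n"
    "45;technician;yes\n"
    "38;admin.;yes\n"
    "52;technician;no\n"
    "29;admin.;no\n"
)

# Count total rows and positive outcomes for each job category.
counts = defaultdict(lambda: {"total": 0, "yes": 0})
for row in csv.DictReader(sample, delimiter=";"):
    group = counts[row["job"]]
    group["total"] += 1
    if row["y"] == "yes":
        group["yes"] += 1

# Share of "yes" outcomes per job: a descriptive baseline, not a predictive model.
rate_by_job = {job: c["yes"] / c["total"] for job, c in counts.items()}

for job, rate in sorted(rate_by_job.items()):
    print(f"{job}: {rate:.1%}")
```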
---
### **D. Web & Online Traffic Data**
- **Dataset:** [Google Merchandise Store Analytics (via BigQuery)](https://support.google.com/analytics/answer/7586738?hl=en)
- **File Type:** BigQuery Dataset
- **Why Used:** Ideal for **website traffic**, **user behavior**, and **e-commerce analytics**.
---
### **E. Company & Economic Data**
- **Dataset:** [World Bank Open Data](https://data.worldbank.org/)
- **File Type:** CSV / XLSX / JSON
- **Why Used:** For **economic indicators, GDP growth, education, and employment analytics**.
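
As one example of an indicator calculation, year-over-year growth is the change from the previous year divided by the previous year's value, expressed as a percentage. The sketch below applies that formula to a small series. The figures are hypothetical placeholders, not real World Bank data, and `yearly_growth` is a helper name invented for this illustration.

```python
# Hypothetical GDP values (in billions) by year; not real World Bank figures.
gdp_by_year = {2020: 1000.0, 2021: 1050.0, 2022: 1102.5}


def yearly_growth(series):
    """Return the percent change from the previous year for each year after the first."""
    years = sorted(series)
    return {
        year: (series[year] - series[prev]) / series[prev] * 100
        for prev, year in zip(years, years[1:])
    }


growth = yearly_growth(gdp_by_year)

for year, pct in growth.items():
    print(f"{year}: {pct:.1f}%")
```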
---
### **F. Miscellaneous Open Datasets**
- **Kaggle Open Datasets:** [https://www.kaggle.com/datasets](https://www.kaggle.com/datasets)
- **Data.gov (US Govt):** [https://www.data.gov/](https://www.data.gov/)
- **Google Dataset Search:** [https://datasetsearch.research.google.com/](https://datasetsearch.research.google.com/)