📊 Getting Started with Jupyter Notebook and Data Analysis

Step 1: Install Jupyter Notebook

First, ensure Python is installed. Download it from the official Python website or use Anaconda, which includes Python and Jupyter Notebook.

For a quick install via pip, run:

pip install notebook

Verify the installation:

jupyter notebook --version

To launch Jupyter, use:

jupyter notebook

💡 This starts a local Jupyter server and opens the interface in your default web browser.

Step 2: Create a New Notebook

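In the Jupyter interface, click New → Python 3. In recent versions the menu may instead read New → Notebook, after which you select the Python 3 kernel. Other languages appear only if their kernels are installed.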

Rename your notebook by clicking the title (e.g., Untitled) and entering a meaningful name like My First Analysis.

Step 3: Import Libraries

In the first cell of the notebook, import the necessary libraries:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

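# Apply seaborn's dark grid theme (set_theme is the current name for sns.set, seaborn 0.11+)
sns.set_theme(style="darkgrid")
# Render plots inside the notebook (already the default in most current setups)
%matplotlib inline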

Run the cell with Shift + Enter.
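
If the import cell fails with ModuleNotFoundError, one of the libraries is probably missing from the environment your notebook kernel runs in. The following is a minimal sketch of a check you could run in a separate cell. The `required` list is an assumption based on the imports above, and the suggested pip command assumes you installed Jupyter with pip rather than Anaconda.

```python
import importlib.util

# Packages imported in the first cell of this tutorial
required = ["pandas", "matplotlib", "seaborn"]

# find_spec returns None when a top-level package cannot be located
missing = [name for name in required if importlib.util.find_spec(name) is None]

if missing:
    print("Missing packages:", ", ".join(missing))
    print("Install them with: pip install " + " ".join(missing))
else:
    print("All required packages are available.")
```

After installing anything that was missing, restart the kernel and run the import cell again.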

Step 4: Load Data

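You can use seaborn's sample datasets, such as Iris, for practice. Note that load_dataset downloads the file from seaborn's online data repository on first use and caches it locally, so it needs an internet connection the first time: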

iris = sns.load_dataset('iris')
iris.head()

This loads the dataset into a pandas DataFrame, and head() displays its first five rows.
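
If you are working offline, or want to analyze your own data, the usual alternative is to read a CSV file with pd.read_csv. The sketch below reads from an in-memory string so that it runs anywhere. The three rows are made-up illustration values, not real Iris measurements, and in practice you would pass a file path (for example a hypothetical iris.csv) instead of the StringIO object.

```python
import io

import pandas as pd

# Hypothetical stand-in for a local CSV file; the values are invented for illustration
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.0,3.4,1.5,0.2,setosa
6.0,2.8,4.5,1.4,versicolor
6.5,3.0,5.5,2.0,virginica
"""

# read_csv accepts a file path or any file-like object
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.shape)   # (rows, columns)
print(sample.head())
```

The column names here mirror those of the seaborn Iris dataset, so the later steps would work on a full CSV with the same layout.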

Step 5: Explore the Data

Check for missing values, get summary statistics, and inspect the data types. By default, describe() summarizes only the numeric columns:

print(iris.isnull().sum())
print(iris.describe())
print(iris.dtypes)
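
The Iris checks above are expected to report no missing values, so it can be hard to see what these calls are for. As a rough illustration, here is the same kind of check on a tiny made-up DataFrame (the `toy` data is invented for this example) that has one deliberately missing value, plus one simple way to handle it.

```python
import numpy as np
import pandas as pd

# Made-up toy data with one deliberately missing value
toy = pd.DataFrame({
    "petal_length": [1.4, np.nan, 5.1],
    "species": ["setosa", "versicolor", "virginica"],
})

# Count of missing (NaN) values in each column
missing_per_column = toy.isnull().sum()
print(missing_per_column)

# include="all" extends the summary to non-numeric columns as well
print(toy.describe(include="all"))

# One option among several: drop every row that contains a missing value
cleaned = toy.dropna()
print(len(cleaned))
```

Dropping rows is only one choice. Depending on the analysis, filling the gaps (for instance with fillna) may be more appropriate.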

Step 6: Visualize the Data

Create a pairplot to visualize the pairwise relationships between the numeric columns, colored by species:

sns.pairplot(iris, hue='species')
plt.show()
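
Each off-diagonal panel of a pairplot is essentially a scatter plot of two columns, colored by group. The sketch below builds one such panel by hand with matplotlib on made-up data and saves it to an image file. The `toy` values and the output file name petal_scatter.png are assumptions chosen for illustration.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import pandas as pd

# Made-up measurements, two rows per species
toy = pd.DataFrame({
    "petal_length": [1.4, 1.5, 4.5, 4.7, 5.5, 6.0],
    "petal_width": [0.2, 0.3, 1.4, 1.5, 2.0, 2.3],
    "species": ["setosa", "setosa", "versicolor", "versicolor",
                "virginica", "virginica"],
})

fig, ax = plt.subplots(figsize=(5, 4))

# Draw one scatter series per species so each gets its own color and legend entry
for species, group in toy.groupby("species"):
    ax.scatter(group["petal_length"], group["petal_width"], label=species)

ax.set_xlabel("petal_length")
ax.set_ylabel("petal_width")
ax.legend(title="species")

# Save the figure to the system temp directory (hypothetical file name)
out_path = os.path.join(tempfile.gettempdir(), "petal_scatter.png")
fig.savefig(out_path, dpi=100)
print("Saved plot to", out_path)
```

Saving a figure this way is useful when you want to include a plot in a report outside the notebook.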

Step 7: Analyze the Data

Group the data and calculate the mean petal length by species:

mean_petal_length = iris.groupby('species')['petal_length'].mean()
print(mean_petal_length)
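
To see what groupby is doing, it can help to try it on data small enough to check by hand. The sketch below uses a made-up four-row DataFrame (the values are invented for illustration) and also shows agg, which computes several statistics per group in one call.

```python
import pandas as pd

# Made-up data: two rows per species so the means are easy to verify by hand
toy = pd.DataFrame({
    "species": ["setosa", "setosa", "virginica", "virginica"],
    "petal_length": [1.0, 2.0, 5.0, 6.0],
})

# Same pattern as above: split by species, then average petal_length in each group
mean_by_species = toy.groupby("species")["petal_length"].mean()
print(mean_by_species)  # setosa 1.5, virginica 5.5

# agg with a list of function names returns one column per statistic
summary = toy.groupby("species")["petal_length"].agg(["mean", "min", "max"])
print(summary)
```

The same agg call should work unchanged on the iris DataFrame if you want the minimum and maximum alongside the mean.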

Step 8: Save Your Work

Save your progress using: