1. Pandas DataFrame

A DataFrame is a two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled axes (rows and columns). It is the core data structure of the Pandas library, which is widely used for data manipulation and analysis in Python. Think of a DataFrame as a table, similar to a spreadsheet or a SQL table, where each column can have a different data type (e.g., integers, floats, strings).

Key Features of a DataFrame:

  • Labeled Axes: Both rows and columns are labeled, which makes it easy to manipulate and access data using these labels.
  • Size-Mutable: You can add or remove rows and columns.
  • Heterogeneous Data: Different columns can store different types of data (e.g., numerical, categorical).
  • Alignment: DataFrames automatically align data for arithmetic operations based on the row and column labels.
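The four features above can be seen in a small example. This is only a sketch: the frame contents, the row labels "a", "b", "c", and the variable names are made up for illustration.

```python
import pandas as pd

# Illustrative data with explicit row labels (assumed values, not from a real dataset).
df = pd.DataFrame(
    {"Name": ["Alice", "Bob"], "Age": [25, 30]},
    index=["a", "b"],
)

# Labeled axes: pick one value by row label and column label.
alice_age = df.loc["a", "Age"]

# Size-mutable: add a column, then build a copy without it.
df["City"] = ["New York", "San Francisco"]
without_city = df.drop(columns="City")

# Heterogeneous data: each column keeps its own dtype.
age_is_integer = pd.api.types.is_integer_dtype(df["Age"])

# Alignment: arithmetic matches on labels, not on position.
left = pd.DataFrame({"x": [1, 2]}, index=["a", "b"])
right = pd.DataFrame({"x": [10, 20]}, index=["b", "c"])
total = left + right  # only row "b" exists in both; rows "a" and "c" become NaN

print(alice_age)
print(total)
```

Rows that appear in only one operand are kept in the result but filled with NaN, which is why label alignment matters when combining data from different sources.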

Creating a DataFrame

There are multiple ways to create a DataFrame in Pandas:

From a Dictionary:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'San Francisco', 'Los Angeles']
}

df = pd.DataFrame(data)
print(df)

# Output:
      Name  Age           City
0    Alice   25       New York
1      Bob   30  San Francisco
2  Charlie   35    Los Angeles

From a List of Lists:

data = [
    ['Alice', 25, 'New York'],
    ['Bob', 30, 'San Francisco'],
    ['Charlie', 35, 'Los Angeles']
]

df = pd.DataFrame(data, columns=['Name', 'Age', 'City'])
print(df)

# Output:
      Name  Age           City
0    Alice   25       New York
1      Bob   30  San Francisco
2  Charlie   35    Los Angeles
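As a quick check, the dictionary form and the list-of-lists form should describe the same table. The sketch below builds both and compares them; the variable names df_from_dict and df_from_rows are chosen here for illustration.

```python
import pandas as pd

# Column-oriented input: keys become column names.
df_from_dict = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "San Francisco", "Los Angeles"],
})

# Row-oriented input: column names must be supplied separately.
df_from_rows = pd.DataFrame(
    [
        ["Alice", 25, "New York"],
        ["Bob", 30, "San Francisco"],
        ["Charlie", 35, "Los Angeles"],
    ],
    columns=["Name", "Age", "City"],
)

# equals() compares shape, labels, dtypes, and values.
same_table = df_from_dict.equals(df_from_rows)
print(same_table)
```

The dictionary form is convenient when data arrives column by column; the list-of-lists form suits data that arrives one record at a time.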

From SQL Queries:

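import sqlite3

import pandas as pd

# 'database.db' and 'table_name' are placeholders for your own database file and table.
conn = sqlite3.connect('database.db')
query = "SELECT * FROM table_name"
df = pd.read_sql(query, conn)  # run the query and load the result set into a DataFrame
conn.close()  # release the database connection once the data is loaded
print(df)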
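The snippet above only runs if database.db and the table already exist. A self-contained variant, sketched below, uses an in-memory SQLite database instead. The table name people, its columns, and the age filter are assumptions made for this example.

```python
import sqlite3

import pandas as pd

# An in-memory database keeps the example self-contained (nothing is written to disk).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Alice", 25, "New York"),
        ("Bob", 30, "San Francisco"),
        ("Charlie", 35, "Los Angeles"),
    ],
)
conn.commit()

# params passes the value separately from the SQL text instead of
# formatting it into the query string.
sql_df = pd.read_sql(
    "SELECT name, age, city FROM people WHERE age >= ?",
    conn,
    params=(30,),
)
conn.close()

print(sql_df)
```

Passing values through params rather than building the query with string formatting lets the database driver handle quoting, which helps avoid SQL injection.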

2. Heap

TODO