Student Testimonials:
- The instructor knows the material, and has detailed explanation on every topic he discusses. Has clarity too, and warns students of potential pitfalls. He has a very logical explanation, and it is easy to follow him. I highly recommend this class, and would look into taking a new class from him. – Diana
- This is excellent, and I cannot compliment the instructor enough. Extremely clear, relevant, and high quality – with helpful practical tips and advice. Would recommend this to anyone wanting to learn pandas. Lessons are well constructed. I’m actually surprised at how well done this is. I don’t give many 5 stars, but this has earned it so far. – Michael
- This course is very thorough, clear, and well thought out. This is the best Udemy course I have taken thus far. (This is my third course.) The instruction is excellent! – James
Welcome to the most comprehensive Pandas course available on Udemy! An excellent choice for both beginners and experts looking to expand their knowledge on one of the most popular Python libraries in the world!
Data Analysis with Pandas and Python offers 19+ hours of in-depth video tutorials on the most powerful data analysis toolkit available today. Lessons include:
- installing
- sorting
- filtering
- grouping
- aggregating
- de-duplicating
- pivoting
- munging
- deleting
- merging
- visualizing
and more!
Why learn pandas?
If you’ve spent time in spreadsheet software like Microsoft Excel, Apple Numbers, or Google Sheets and are eager to take your data analysis skills to the next level, this course is for you!
Data Analysis with Pandas and Python introduces you to the popular Pandas library built on top of the Python programming language.
Pandas is a powerhouse tool that allows you to do anything and everything with colossal data sets — analyzing, organizing, sorting, filtering, pivoting, aggregating, munging, cleaning, calculating, and more!
I call it “Excel on steroids”!
Over the course of more than 19 hours, I’ll take you step-by-step through Pandas, from installation to visualization! We’ll cover hundreds of different methods, attributes, features, and functionalities packed away inside this awesome library. We’ll dive into tons of different datasets, short and long, broken and pristine, to demonstrate the incredible versatility and efficiency of this package.
Data Analysis with Pandas and Python is bundled with dozens of datasets for you to use. Dive right in and follow along with my lessons to see how easy it is to get started with pandas!
Whether you’re a new data analyst or have spent years (*cough* too long *cough*) in Excel, Data Analysis with Pandas and Python offers you an incredible introduction to one of the most powerful data toolkits available today!
Installation and Setup
1. Introduction to the Course
Welcome to Data Analysis with Pandas and Python! In this lesson, we'll introduce the pandas library, the Python language, the structure of the course, and the prerequisites.
2. macOS - Download and Install the Anaconda Distribution
In this lesson, we download and install the Anaconda distribution for macOS computers. Windows users are welcome to skip this lesson.
3. Windows - Download and Install the Anaconda Distribution
In this lesson, we download and install the Anaconda distribution for Windows computers. macOS users are welcome to skip this lesson.
4. How to Uninstall the Anaconda Distribution
Learn how to uninstall the Anaconda distribution on both Windows and macOS computers.
5. Use Anaconda Navigator to Create a New Environment
Anaconda Navigator is a graphical program for creating and managing conda environments. In this lesson, we create a new environment for the course and install pandas within it. We also set up the Jupyter Lab coding environment where we'll be writing our Python/pandas code.
6. Download Course Materials
Download the course materials (datasets and Jupyter Notebooks) for the course.
7. Unpack Course Materials + The Startup and Shutdown Process
In this lesson, we walk through the process of starting up and shutting down Jupyter Lab, our coding environment. We open some sample Jupyter Notebooks and describe how a Python server runs continuously in the background, waiting to execute the contents of a code cell.
8. Intro to the Jupyter Lab Interface
In this lesson, we introduce the Jupyter Lab interface. A Notebook consists of cells, which can have different types. We introduce some common actions like creating cells, cutting and pasting cells, stopping the kernel, and restarting the Jupyter server.
9. Code Cell Execution
In this lesson, we observe how to execute Python code cells in Jupyter Lab. Use Shift + Enter to run a cell and navigate to the next one. Use Ctrl + Enter to execute a cell and stay within it.
10. Import Libraries into Jupyter Lab
To conserve memory, Jupyter won't load Python packages into your Notebook automatically. In this lesson, we use Python's import keyword to bring the pandas library into a Notebook. We also talk about assigning aliases with the as keyword.
11. Installation and Setup
Test your knowledge of the concepts introduced in this section in this multiple-choice quiz.
Python Crash Course
12. Comments
A comment is a line ignored by the Python interpreter when the program/cell runs. Declare a comment with a hash (#) symbol.
13. Basic Data Types
In this lesson, we introduce the most common data types in Python including integers, floating-points, strings, Booleans, and None.
14. Operators
In this lesson, we discuss common mathematical and logical operators including addition, subtraction, multiplication, division, concatenation, modulo, equality, inequality, and more.
15. Variables
A variable is a name we assign to a value in our program. In this lesson, we practice declaring variables and discuss Python community conventions for naming them.
16. Declare Variables
17. Built-in Functions
A function is a reusable procedure, a sequence of steps to follow in order. In this lesson, we introduce Python's built-in functions and the syntax for invoking them.
18. Built-in Functions
19. Custom Functions
Now it's time to build our own custom functions. In this lesson, we walk through defining a temperature conversion function from start to finish.
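As a rough idea of what such a function could look like, here is a minimal sketch. The function name and the choice of Celsius-to-Fahrenheit are assumptions made for illustration; the lesson may build the conversion differently.

```python
# Hypothetical name and conversion direction, chosen for illustration only
def convert_to_fahrenheit(celsius_temp):
    """Return the Fahrenheit equivalent of a Celsius temperature."""
    return celsius_temp * 9 / 5 + 32


# Invoke the function by passing an argument for its single parameter
boiling_point = convert_to_fahrenheit(100)
freezing_point = convert_to_fahrenheit(0)
print(boiling_point, freezing_point)
```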
20. Custom Functions
21. String Methods
A method is a function attached to an object. It's a command or action we can ask the object to take. In this lesson, we explore some common string methods and discuss mutable vs. immutable objects.
22. String Methods
23. Lists
A list is a mutable collection of ordered values. In this lesson, we learn the syntax for declaring lists as well as some common methods like pop and append.
24. Creating Lists
25. Index Positions and Slicing
Python assigns each list element and each string character an index position that reflects its place in line. In this lesson, we learn how to extract elements and characters from their lists/strings using square bracket notation.
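A short sketch of the square bracket syntax on both a list and a string. The sample values are invented for illustration.

```python
letters = ["a", "b", "c", "d", "e"]
word = "pandas"

first_letter = letters[0]    # index positions start at 0
last_letter = letters[-1]    # negative positions count from the end
middle_letters = letters[1:4]  # a slice includes the start, excludes the end

first_three_chars = word[:3]   # strings use the same syntax
print(first_letter, last_letter, middle_letters, first_three_chars)
```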
26. Index Positions and Slicing
27. Dictionaries
A dictionary is a mutable collection of key-value pairs. A key serves as a unique identifier for a value. The keys must be unique, while the values can contain duplicates. In this lesson, we practice declaring some dictionary objects.
28. Creating Dictionaries
29. Classes
A class is a blueprint/template for creating an object. In this lesson, we walk through the terminology and provide a real-world analogy.
30. Navigating Libraries using Jupyter Lab
In this lesson, we import the pandas library and explore some of its available classes, functions, and modules.
31. Python Crash Course
Test your knowledge of the concepts introduced in this course section.
Series
32. Create a Series Object from a List
A pandas Series is a one-dimensional labelled array that combines the best features of a list and a dictionary. In this lesson, we instantiate our first Series objects and introduce the index, the collection of identifiers for the Series' values.
33. Create a Series Object from a Dictionary
In this lesson, we practice creating Series objects with dictionaries as the data source. Pandas will use the keys for the Series' index labels and the values for the Series' values.
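The two constructions can be sketched as follows. The sample data and variable names are made up for illustration and are not taken from the course datasets.

```python
import pandas as pd

# From a list: pandas generates a default numeric index starting at 0
ice_cream = ["Chocolate", "Vanilla", "Strawberry"]
flavors = pd.Series(ice_cream)

# From a dictionary: keys become index labels, values become Series values
sushi = {"Salmon": "Orange", "Tuna": "Red", "Eel": "Brown"}
colors = pd.Series(sushi)

print(flavors)
print(colors)
```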
34. Create a Series Object
35. Intro to Series Methods
In this lesson, we invoke some sample methods like sum, product, and mean on Series objects.
36. Intro to Attributes
An attribute is a piece of data that lives on an object. It's a fact, a detail, a characteristic of the object. In this lesson, we access various attributes on the Series and introduce the concept of composition. We also explore the underlying numpy.ndarray object that holds the Series' values.
37. Attributes and Methods on a Series
38. Parameters and Arguments
A parameter is the name for an expected input to a function/method/class instantiation. An argument is the concrete value we provide for a parameter during invocation. In this lesson, we discuss the data and index parameters of the Series constructor.
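To make the distinction concrete, here is a small sketch using the data and index parameters of the Series constructor. The lists are invented sample values.

```python
import pandas as pd

fruits = ["Apple", "Orange", "Plum"]
weekdays = ["Monday", "Tuesday", "Wednesday"]

# Keyword arguments name the parameter each value is meant for
by_keyword = pd.Series(data=fruits, index=weekdays)

# Positional arguments are matched to parameters in order: data, then index
by_position = pd.Series(fruits, weekdays)

print(by_keyword)
```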
39. Parameters and Arguments
40. Import Series with the pd.read_csv Function
A CSV is a plain text file that uses line breaks to separate rows and commas to separate row values. In this lesson, we use the pd.read_csv function to import 2 CSV datasets into pandas. We also introduce the 2-dimensional DataFrame object and learn how to convert it to a 1-dimensional Series.
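A minimal sketch of the import step. To stay self-contained, it reads CSV text from an in-memory buffer instead of one of the course's files, and the column names are assumptions. The squeeze call is one way to turn a single-column DataFrame into a Series; the lesson may use a different route.

```python
import io

import pandas as pd

# Stand-in for a CSV file on disk; read_csv also accepts a file path
csv_text = "city,population\nOslo,700000\nLima,10000000\n"

# By default, read_csv returns a 2-dimensional DataFrame
cities = pd.read_csv(io.StringIO(csv_text))

# Keep one column, then squeeze the one-column DataFrame into a Series
city_names = pd.read_csv(io.StringIO(csv_text), usecols=["city"]).squeeze("columns")

print(type(cities).__name__, type(city_names).__name__)
```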
41. Import Series with the read_csv Function
42. The head and tail Methods
The head method returns a number of rows from the beginning of the Series. The complementary tail method returns a number of rows from the end of the Series.
43. The head and tail Methods
44. Passing Series to Python Built-In Functions
In this lesson, we pass a Series to Python's built-in functions including len, type, list, dict, sorted, max, and min.
45. Check for Inclusion with Python's in Keyword
In this lesson, we practice using Python's in and not in keywords to check for inclusion among the Series' values and index labels.
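One detail worth seeing in code: on a Series, the in keyword looks at the index labels, so checking the values requires going through the values attribute. A small sketch with made-up data:

```python
import pandas as pd

prices = pd.Series([10, 20, 30], index=["a", "b", "c"])

label_found = "a" in prices         # checks the index labels
value_as_label = 10 in prices       # 10 is a value, not a label
value_found = 10 in prices.values   # checks the underlying values
label_missing = "z" not in prices

print(label_found, value_as_label, value_found, label_missing)
```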
46. Check for Inclusion with Python's in Keyword
47. The sort_values Method
The sort_values method sorts a Series' values in order. In this lesson, we invoke the method on both our alphabetical and numeric Series and also learn how to customize the sort order.
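A brief sketch on one numeric and one text Series, using invented values. The ascending parameter controls the direction of the sort.

```python
import pandas as pd

prices = pd.Series([30, 10, 20])
names = pd.Series(["pear", "Apple", "banana"])

ascending_prices = prices.sort_values()                  # smallest to largest
descending_prices = prices.sort_values(ascending=False)  # largest to smallest
sorted_names = names.sort_values()                       # alphabetical

print(descending_prices)
```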
48. The sort_values Method
49. The sort_index Method
In this lesson, we set a custom index on our Series and learn how to sort an index using the sort_index method.
50. The sort_index Method
51. Extract Series Values by Index Position
In this lesson, we use the iloc accessor to extract a Series value by its index position. iloc is short for "integer location" and requires a special square bracket syntax.
52. Extract Series Values by Index Label
In this lesson, we use the loc accessor to extract a Series value by its index label. loc requires a special square bracket syntax.
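The two accessors side by side, as a small sketch with made-up data. Note that a position slice with iloc excludes its end point, while a label slice with loc includes it.

```python
import pandas as pd

prices = pd.Series([10, 20, 30], index=["a", "b", "c"])

first_by_position = prices.iloc[0]    # value at position 0
second_by_label = prices.loc["b"]     # value with label "b"

position_slice = prices.iloc[0:2]     # positions 0 and 1 (end excluded)
label_slice = prices.loc["a":"c"]     # labels "a" through "c" (end included)

print(first_by_position, second_by_label)
```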
53. Extract Series Values by Index Position or Index Label
54. The get Method
In this lesson, we introduce the get method for retrieving a Series value by index label and providing a fallback value in case the label does not exist.
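A quick sketch of get with and without a fallback, on invented data:

```python
import pandas as pd

prices = pd.Series([10, 20, 30], index=["a", "b", "c"])

found = prices.get("a")                      # label exists
fallback = prices.get("z", default=0)        # label missing, fallback returned
no_fallback = prices.get("z")                # label missing, returns None

print(found, fallback, no_fallback)
```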
55. Overwrite a Series Value
In this lesson, we show the syntax to overwrite a Series value. We first target it with the iloc/loc accessor, then provide an equal sign and the new value that replaces the original.
56. The copy Method
In this lesson, we discuss the differences between a copy and a view in pandas. We also learn the benefits of the copy method in creating a clone of a pandas object.
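A minimal sketch of the copy side of that distinction, with made-up values. View behavior varies between pandas versions, so this example only shows that changes to a copy leave the original alone.

```python
import pandas as pd

original = pd.Series([1, 2, 3])

# copy creates an independent clone of the Series
clone = original.copy()
clone.iloc[0] = 100

print(original.tolist(), clone.tolist())
```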
57. Math Methods on Series Objects
In this lesson, we run through some common mathematical methods on Series including count, sum, product, mean, max, min, median, mode, and more.
58. Broadcasting
Broadcasting describes the process of applying an arithmetic operation to an array. We can combine mathematical operators with a Series to apply the mathematical operation to every value.
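For instance, with an invented Series of prices, a single operator applies to every value at once:

```python
import pandas as pd

prices = pd.Series([10, 20, 30])

# Each operation returns a new Series; the original is unchanged
plus_five = prices + 5
doubled = prices * 2
is_expensive = prices > 15

print(plus_five.tolist(), doubled.tolist(), is_expensive.tolist())
```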
59. The value_counts Method
In this lesson, we explore the value_counts method, which returns the number of times each unique value occurs in the Series. The normalize parameter returns the relative frequencies / percentages of the values instead of the counts.
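A short sketch of both forms on made-up data:

```python
import pandas as pd

types = pd.Series(["Fire", "Water", "Fire", "Fire"])

counts = types.value_counts()                 # occurrences of each unique value
shares = types.value_counts(normalize=True)   # relative frequencies (sum to 1)

print(counts)
print(shares)
```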
60. The value_counts Method
61. The apply Method
In this lesson, we use the apply method to invoke a function for every Series value. Pandas collects the results in a new Series. The advantage of apply is that we can utilize basic Python code to achieve whatever manipulation we want. If we don't know a specific Series method but can accomplish the same result with Python constructs, apply can be a useful tool.
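As an illustration, the sketch below passes both a built-in function and a custom one to apply. The grade function and the sample data are hypothetical.

```python
import pandas as pd

words = pd.Series(["apple", "fig"])
scores = pd.Series([40, 75])


# Hypothetical helper written with plain Python constructs
def grade(score):
    return "pass" if score >= 50 else "fail"


word_lengths = words.apply(len)   # built-in function, called once per value
grades = scores.apply(grade)      # custom function, called once per value

print(word_lengths.tolist(), grades.tolist())
```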
62. The map Method
The map method connects each Series value to a value from another data structure. It provides a mapping/connection/association/bridge to the other value. In this lesson, we practice using the method with arguments of a dictionary and a Series.
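A small sketch with invented data. Values without a match in the lookup structure come back as missing (NaN).

```python
import pandas as pd

animals = pd.Series(["cat", "dog", "cow"])
sounds = {"cat": "meow", "dog": "woof"}

# Dictionary argument: each Series value is looked up among the keys
mapped_from_dict = animals.map(sounds)

# Series argument: each Series value is looked up among the index labels
mapped_from_series = animals.map(pd.Series(sounds))

print(mapped_from_dict)
```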
63. Series
Test your knowledge of the Series concepts introduced in this section in this multiple-choice quiz.
DataFrames I: Introduction
64. Methods and Attributes between Series and DataFrames
A DataFrame is a 2-dimensional table with an index. In this lesson, we introduce this new data structure and explore some of the methods and attributes it shares with the Series object. We also identify some unique attributes that exist only on one object but not the other.
65. Differences between Shared Methods
In this lesson, we do a deeper dive into the sum method and how it operates differently between Series and DataFrame objects.
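The difference can be sketched like this, using a tiny made-up DataFrame. On a Series, sum returns a single number; on a DataFrame, it returns one total per column, or one per row when the axis is switched.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

series_total = df["a"].sum()           # a single number
column_totals = df.sum()               # a Series: one total per column
row_totals = df.sum(axis="columns")    # a Series: one total per row

print(series_total, column_totals.tolist(), row_totals.tolist())
```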
66. Select One Column from a DataFrame
In this lesson, we introduce two syntax options to extract a column from a DataFrame: attribute access and square brackets. We also discuss the tradeoffs between the two approaches. Pandas returns a view when extracting a single Series from a DataFrame.
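A short sketch of the two options with invented column names. One tradeoff it shows: attribute access cannot reach a column whose name contains a space, while square brackets can.

```python
import pandas as pd

df = pd.DataFrame({"Salary": [100, 200], "Team Name": ["Hawks", "Nets"]})

by_attribute = df.Salary       # attribute access
by_brackets = df["Salary"]     # square brackets, same column

# A name with a space only works with square brackets
teams = df["Team Name"]

print(type(by_brackets).__name__, teams.tolist())
```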
67. Select One Column from a DataFrame
68. Select Multiple Columns from a DataFrame
In this lesson, we learn how to extract multiple DataFrame columns by passing a list of column names inside the square brackets. Pandas returns a copy/new DataFrame when extracting multiple columns.
69. Select Multiple Columns from a DataFrame
70. Add New Column to DataFrame
In this lesson, we add a new column to a DataFrame using square bracket notation. We show how to populate the new Series with a single value or a dynamic calculation from performing an operation on another Series' values.
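Both ways of populating a new column, sketched on made-up data with hypothetical column names:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ann", "Bo"], "Salary": [100, 200]})

# A single value is repeated for every row
df["Sport"] = "Basketball"

# A calculation on an existing column produces one result per row
df["Doubled Salary"] = df["Salary"] * 2

print(df)
```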
71. A Review of the value_counts Method
In this lesson, we quickly review the value_counts method on a Series, which counts the number of occurrences of every unique value in a Series.
72. Drop DataFrame Rows with Missing Values
In this lesson, we practice using the dropna method to remove DataFrame rows consisting of missing/NaN values. We discuss how to target rows that only hold missing values as well as rows with a missing value in a target column.
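The variations described here can be sketched on a small invented DataFrame, using the how and subset parameters of dropna:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ann", None, "Cy", "Di"],
    "Salary": [100.0, None, None, 300.0],
})

any_missing = df.dropna()                      # drop rows with at least one missing value
all_missing = df.dropna(how="all")             # drop rows where every value is missing
salary_missing = df.dropna(subset=["Salary"])  # drop rows missing a Salary value

print(all_missing)
```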
73. Drop DataFrame Rows with Missing Values
74. Fill in Missing Values with the fillna Method
In this lesson, we explore an alternative approach for dealing with missing values: using the fillna method to populate missing values with a static value. We practice the method on both a DataFrame and a Series.
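A minimal sketch of fillna on a Series and on a DataFrame, with made-up numeric data and 0 as the static replacement value:

```python
import pandas as pd

salaries = pd.Series([1.0, None, 3.0])
df = pd.DataFrame({"a": [1.0, None], "b": [None, 2.0]})

filled_series = salaries.fillna(0)   # replaces the missing value in the Series
filled_frame = df.fillna(0)          # replaces missing values in every column

print(filled_series.tolist())
print(filled_frame)
```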
75. The astype Method I
In this lesson, we introduce the astype method for converting the data types in a Series. We practice converting our floating-point columns to store integers.
76. The astype Method II
The category type is helpful when a Series has a small number of unique values. In this lesson, we convert two columns in our nba DataFrame to store category values.
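Both conversions can be sketched as below. The data is invented rather than taken from the nba dataset. The sketch fills missing values before converting to integers, because astype cannot cast NaN to an integer type.

```python
import pandas as pd

salaries = pd.Series([1.0, 2.0, None])
positions = pd.Series(["G", "F", "G", "C"])

# Floating-point to integer: handle missing values first
int_salaries = salaries.fillna(0).astype(int)

# Text with few unique values to category
position_categories = positions.astype("category")

print(int_salaries.tolist(), position_categories.dtype)
```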
77. The astype Method
78. Sort a DataFrame with the sort_values Method I
In this lesson, we explore the sort_values method on a DataFrame. The default sort order is ascending (smallest to greatest, alphabetical), but we can customize the order with the ascending parameter. We also discuss the na_position parameter for placing the NaN values at the beginning or end of the sorted values.
79. Sort a DataFrame with the sort_values Method II
In this lesson, we sort a DataFrame by multiple columns by passing a list of column names to the by parameter. We also customize the sort order for each column by passing a list to the ascending parameter.
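A multi-column sort, sketched on made-up data: teams in ascending order, and within each team, salaries in descending order.

```python
import pandas as pd

df = pd.DataFrame({
    "Team": ["B", "A", "B", "A"],
    "Salary": [10, 30, 20, 40],
})

# The two lists line up: Team ascending, Salary descending
sorted_df = df.sort_values(by=["Team", "Salary"], ascending=[True, False])

print(sorted_df)
```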
80. Sort a DataFrame with the sort_values Method
81. Sort DataFrame with the sort_index Method
The sort_index method sorts a DataFrame by the index labels. In this lesson, we explore the method and a few of its parameters.
82. Rank Series Values with the rank Method
In this lesson, we learn the rank method for ordering and ranking the values in a Series. We use it to rank our NBA players by their salaries.
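A small sketch with invented salaries. Passing ascending=False gives the largest value rank 1. The method returns floating-point ranks, which can be converted to integers when there are no ties or missing values.

```python
import pandas as pd

salaries = pd.Series([300, 100, 200])

# The highest salary receives rank 1
salary_ranks = salaries.rank(ascending=False)
integer_ranks = salary_ranks.astype(int)

print(salary_ranks.tolist(), integer_ranks.tolist())
```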
83. DataFrames I
Test your knowledge of this section's DataFrames concepts in this multiple-choice quiz.