Assignment: Pandas Groupby with Hurricane Data
Contents
Assignment: Pandas Groupby with Hurricane Data#
Import Numpy, Pandas and Matplotlib and set the display options.
Use the following code to load a CSV file of the NOAA IBTrACS hurricane dataset:
url = 'https://www.ncei.noaa.gov/data/international-best-track-archive-for-climate-stewardship-ibtracs/v04r00/access/csv/ibtracs.ALL.list.v04r00.csv'
df = pd.read_csv(url, parse_dates=['ISO_TIME'], usecols=range(12),
skiprows=[1], na_values=[' ', 'NOT_NAMED'],
keep_default_na=False, dtype={'NAME': str})
df.head()
Basin Key: (NI - North Indian, SI - South Indian, WP - Western Pacific, SP - Southern Pacific, EP - Eastern Pacific, NA - North Atlantic)
How many rows does this dataset have?
How many North Atlantic hurricanes are in this dataset?
1) Get the unique values of the BASIN
, SUBBASIN
, and NATURE
columns#
2) Rename the WMO_WIND
and WMO_PRES
columns to WIND
and PRES
#
3) Get the 10 largest rows in the dataset by WIND
#
You will notice some names are repeated.
4) Group the data on SID
and get the 10 largest hurricanes by WIND
#
5) Make a bar chart of the wind speed of the 20 strongest-wind hurricanes#
Use the name on the x-axis.
6) Plot the count of all datapoints by Basin#
as a bar chart
7) Plot the count of unique hurricanes by Basin#
as a bar chart.
8) Make a hexbin
of the location of datapoints in Latitude and Longitude#
9) Find Hurricane Katrina (from 2005) and plot its track as a scatter plot#
First find the SID of this hurricane.
Next get this hurricane’s group and plot its position as a scatter plot. Use wind speed to color the points.
10) Make time the index on your dataframe#
11) Plot the count of all datapoints per year as a timeseries#
You should use resample
12) Plot all tracks from the North Atlantic in 2005#
You will probably have to iterate through a GroupBy
object
13) Create a filtered dataframe that contains only data since 1970 from the North Atlantic (“NA”) Basin#
Use this for the rest of the assignment
14) Plot the number of datapoints per day from this filtered dataframe#
Make sure you figure is big enough to actually see the plot
15) Calculate the climatology of datapoint counts as a function of dayofyear
#
Plot the mean and standard deviation on a single figure
16) Use transform
to calculate the anomaly of daily counts from the climatology#
Resample the anomaly timeseries at annual resolution and plot a line with dots as markers.
Which years stand out as having anomalous hurricane activity?