A Guide For Time Series Forecasting With Prophet In Python 3

Introduction

In previous tutorials, we showed how to visualize and manipulate time series data, and how to leverage the ARIMA method to produce forecasts from time series data. We noted that the correct parametrization of ARIMA models can be a complicated manual process that requires a certain amount of time.

Other statistical programming languages such as R provide automated ways to solve this issue, but those have yet to be officially ported over to Python. Fortunately, the Core Data Science team at Facebook recently published a new method called Prophet, which enables data analysts and developers alike to perform forecasting at scale in Python 3.

Prerequisites

This guide will cover how to do time series analysis on either a local desktop or a remote server. Working with large datasets can be memory intensive, so in either case, the computer will need at least 2GB of memory to perform some of the calculations in this guide.

For this tutorial, we'll be using Jupyter Notebook to work with the data. If you do not have it already, you should follow our tutorial to install and set up Jupyter Notebook for Python 3.

Step 1 Pull Dataset and Install Packages

To set up our environment for time series forecasting with Prophet, let's first move into our local programming environment or server-based programming environment:

  • cd environments
  • . my_env/bin/activate

From here, let's create a new directory for our project. We will call it timeseries and then move into the directory. If you give the project a different name, be sure to substitute your name for timeseries throughout the guide:

  • mkdir timeseries
  • cd timeseries

We'll be working with the Box and Jenkins (1976) Airline Passengers dataset, which contains time series data on the monthly number of airline passengers between 1949 and 1960. You can save the data by using the curl command with the -O flag to write output to a file and download the CSV:

curl -O /images/article/prophet_fig-1.png/AirPassengers.csv

This tutorial will require the pandas, matplotlib, numpy, cython and fbprophet libraries. Like most other Python packages, we can install the pandas, numpy, cython and matplotlib libraries with pip:

pip install pandas matplotlib numpy cython

In order to compute its forecasts, the fbprophet library relies on the STAN programming language, named in honor of the mathematician Stanislaw Ulam. Before installing fbprophet, we therefore need to make sure that the pystan Python wrapper to STAN is installed:

pip install pystan

Once this is done we can install Prophet by using pip:

pip install fbprophet
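
Before moving on, it can help to confirm that the libraries are importable from the active environment. The sketch below uses only the standard library; missing_packages is a hypothetical helper name, and the check for both fbprophet and prophet is an assumption based on newer releases of the library being published under the shorter prophet name.

```python
import importlib.util

def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Libraries this guide relies on; adjust the list if your setup differs.
required = ["pandas", "matplotlib", "numpy"]
print("Missing:", missing_packages(required) or "none")

# Prophet may be importable as `fbprophet` (older releases) or `prophet` (newer ones).
prophet_found = not missing_packages(["fbprophet"]) or not missing_packages(["prophet"])
print("Prophet importable:", prophet_found)
```

If anything is reported as missing, re-run the corresponding pip install command inside the activated environment.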

Now that we are all set up, we can start working with the installed packages.

Step 2 Import Packages and Load Data

To begin working with our data, we will start up Jupyter Notebook:

  • jupyter notebook

To create a new notebook file, select New > Python 3 from the top right pull-down menu:

Create a new Python 3 notebook

This will open a notebook which allows us to load the required libraries.

As is best practice, start by importing the libraries you will need at the top of your notebook (notice the standard shorthands used to reference pandas and matplotlib):

%matplotlib inline
import pandas as pd
from fbprophet import Prophet

import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')

Notice how we have also defined the fivethirtyeight matplotlib style for our plots.

After each code block in this tutorial, you should press ALT + ENTER to run the code and move into a new code block within your notebook.

Let's start by reading in our time series data. We can load the CSV file and print out the first 5 lines with the following commands:

df = pd.read_csv('AirPassengers.csv')

df.head(5)

DataFrame

Our DataFrame contains a Month column and an AirPassengers column. The Prophet library expects as input a DataFrame with one column containing the time information and another column containing the metric that we wish to forecast. Importantly, the time column is expected to be of the datetime type, so let's check the type of our columns:

df.dtypes
Output
Month            object
AirPassengers     int64
dtype: object

Because the Month column is not of the datetime type, we'll need to convert it:

df['Month'] = pd.DatetimeIndex(df['Month'])
df.dtypes
Output
Month            datetime64[ns]
AirPassengers             int64
dtype: object

We now see that our Month column is of the correct datetime type.

Prophet also imposes the strict condition that the input columns be named ds (the time column) and y (the metric column), so let's rename the columns in our DataFrame:
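
The same conversion can also be written with pd.to_datetime and an explicit format string, which makes the expected layout of the Month values visible in the code. This is a minimal sketch on a few inline rows standing in for the CSV file; it assumes the Month values look like 1949-01, and the sample rows are for illustration only.

```python
import io
import pandas as pd

# A few inline rows standing in for AirPassengers.csv (illustrative sample only).
sample_csv = io.StringIO(
    "Month,AirPassengers\n"
    "1949-01,112\n"
    "1949-02,118\n"
    "1949-03,132\n"
)

sample = pd.read_csv(sample_csv)
print(sample.dtypes)  # Month is read as a plain string (object) column

# Convert with an explicit year-month format; each value becomes the first day of its month.
sample["Month"] = pd.to_datetime(sample["Month"], format="%Y-%m")
print(sample.dtypes)
```

Stating the format explicitly means that an unexpected value in the file raises an error instead of being silently misread.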

df = df.rename(columns={'Month': 'ds',
                        'AirPassengers': 'y'})

df.head(5)

DataFrame
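
Because Prophet is strict about this layout, a small sanity check before fitting can catch a forgotten rename or conversion early. The function below is a hypothetical helper written for this guide, not part of Prophet or pandas, and it only checks the two requirements described above.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype

def check_prophet_input(frame):
    """Return a list of problems with the ds/y layout (hypothetical helper)."""
    problems = []
    for column in ("ds", "y"):
        if column not in frame.columns:
            problems.append(f"missing column: {column}")
    if "ds" in frame.columns and not is_datetime64_any_dtype(frame["ds"]):
        problems.append("ds is not a datetime column")
    if "y" in frame.columns and not is_numeric_dtype(frame["y"]):
        problems.append("y is not numeric")
    return problems

# Small illustrative frames: one in the expected layout, one before renaming.
good = pd.DataFrame({"ds": pd.to_datetime(["1949-01-01", "1949-02-01"]), "y": [112, 118]})
bad = pd.DataFrame({"Month": ["1949-01", "1949-02"], "AirPassengers": [112, 118]})

print(check_prophet_input(good))
print(check_prophet_input(bad))
```

An empty list means the frame has the shape described above; any other result names what still needs fixing.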

It is good practice to visualize the data we are going to be working with, so let's plot our time series:

ax = df.set_index('ds').plot(figsize=(12, 8))
ax.set_ylabel('Monthly Number of Airline Passengers')
ax.set_xlabel('Date')

plt.show()

Time Series Plot

With our data now prepared, we are ready to use the Prophet library to produce forecasts of our time series.

Step 3 Time Series Forecasting with Prophet

In this section, we will describe how to use the Prophet library to predict future values of our time series. The authors of Prophet have abstracted away many of the inherent complexities of time series forecasting and made it more intuitive for analysts and developers alike to work with time series data.

To start, we must instantiate a new Prophet object. Prophet enables us to specify a number of arguments. For example, we can specify the desired range of our uncertainty interval by setting the interval_width parameter.

# set the uncertainty interval to 95% (the Prophet default is 80%)
my_model = Prophet(interval_width=0.95)
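
To make the interval width concrete: a central interval of width w is commonly read as the band between the (1 - w)/2 and (1 + w)/2 quantiles of a set of simulated values. The sketch below illustrates that reading in plain Python on randomly generated numbers. The helper names and the simulated data are assumptions made for illustration, and this is not Prophet's internal code.

```python
import random

def interval_bounds(width):
    """Lower and upper quantile levels for a central interval of the given width."""
    tail = (1.0 - width) / 2.0
    return tail, 1.0 - tail

def empirical_interval(samples, width):
    """Approximate central interval of `samples` using nearest-rank quantiles."""
    ordered = sorted(samples)
    lower_q, upper_q = interval_bounds(width)
    last = len(ordered) - 1
    return ordered[round(lower_q * last)], ordered[round(upper_q * last)]

random.seed(0)  # fixed seed so the sketch is repeatable
draws = [random.gauss(100.0, 10.0) for _ in range(10_000)]

narrow = empirical_interval(draws, 0.80)  # the default width mentioned above
wide = empirical_interval(draws, 0.95)    # the width chosen in this guide
print(narrow, wide)
```

The 95% band always contains the 80% band, which is why raising interval_width widens the shaded region in the forecast plot.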

Now that our Prophet model has been initialized, we can call its fit method with our DataFrame as input. The model fitting should take no longer than a few seconds.

my_model.fit(df)

You should receive output similar to this:

Output
<fbprophet.forecaster.Prophet at 0x110204080>

In order to obtain forecasts of our time series, we must provide Prophet with a new DataFrame containing a ds column that holds the dates for which we want predictions. Conveniently, we do not have to concern ourselves with manually creating this DataFrame, as Prophet provides the make_future_dataframe helper function:

future_dates = my_model.make_future_dataframe(periods=36, freq='MS')
future_dates.tail()

DataFrame

In the code chunk above, we instructed Prophet to generate 36 datestamps in the future.

When working with Prophet, it is important to consider the frequency of our time series. Because we are working with monthly data, we explicitly specified the desired frequency of the timestamps (in this case, MS is the start of the month). Therefore, make_future_dataframe generated 36 monthly timestamps for us. In other words, we are looking to predict future values of our time series 3 years into the future.

The DataFrame of future dates is then used as input to the predict method of our fitted model.
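
To see what 36 month-start timestamps look like without fitting a model, the same dates can be generated directly with pandas. This is a sketch of the idea rather than Prophet's own implementation, and it assumes the last observation in the dataset is December 1960. Note that make_future_dataframe also includes the historical dates in its result by default, whereas this sketch keeps only the future ones.

```python
import pandas as pd

last_observed = pd.Timestamp("1960-12-01")  # assumed final month in the dataset

# Generate month-start stamps beginning at the last observation, then keep
# only the dates strictly after it.
candidates = pd.date_range(start=last_observed, periods=37, freq="MS")
future_only = candidates[candidates > last_observed]

print(len(future_only), future_only[0].date(), future_only[-1].date())
```

Choosing a frequency that does not match the data (for example daily stamps for monthly observations) would ask the model for predictions at dates unlike anything it was fitted on.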

forecast = my_model.predict(future_dates)
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()

Predict Model

Prophet returns a large DataFrame with many interesting columns, but we subset our output to the columns most relevant to forecasting, which are:

  • ds: the datestamp of the forecasted value
  • yhat: the forecasted value of our metric (in Statistics, yhat is a notation traditionally used to represent the predicted values of a value y)
  • yhat_lower: the lower bound of our forecasts
  • yhat_upper: the upper bound of our forecasts

A variation in values from the output shown above is to be expected, as Prophet relies on Monte Carlo sampling methods (including Markov chain Monte Carlo, MCMC, when full Bayesian sampling is enabled) to generate the uncertainty intervals of its forecasts. Sampling is a stochastic process, so values will be slightly different each time.

Prophet also provides a handy function to quickly plot the results of our forecasts:

my_model.plot(forecast,
              uncertainty=True)

Forecast Plot

Prophet plots the observed values of our time series (the black dots), the forecasted values (blue line) and the uncertainty intervals of our forecasts (the blue shaded regions).

One other particularly strong feature of Prophet is its ability to return the components of our forecasts. This can help reveal how daily, weekly and yearly patterns of the time series contribute to the overall forecasted values:

my_model.plot_components(forecast)

Components of Forecasts Plots

The plot above provides interesting insights. The first plot shows that the monthly volume of airline passengers has been linearly increasing over time. The second plot highlights the fact that the weekly count of passengers peaks towards the end of the week and on Saturday, while the third plot shows that the most traffic occurs during the holiday months of July and August.
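
Prophet's default model is additive, so each forecasted value can be read as a trend term plus seasonal terms. The toy sketch below shows that idea in plain Python with a linear trend and a 12-month cycle that peaks in July. All numbers and function names are assumptions chosen for illustration and are not values fitted by Prophet.

```python
import math

def toy_trend(month_index):
    """Linear growth: an assumed 2.5 extra passengers per month from a base of 100."""
    return 100.0 + 2.5 * month_index

def toy_yearly(month_index):
    """Smooth 12-month cycle, highest in July (position 6 within each year)."""
    return 20.0 * math.cos(2 * math.pi * ((month_index % 12) - 6) / 12)

def toy_forecast(month_index):
    """Additive combination of the two components."""
    return toy_trend(month_index) + toy_yearly(month_index)

# Month index 0 is a January; 6 is the following July, and so on.
for month_index in (0, 6, 12, 18):
    print(month_index,
          round(toy_trend(month_index), 1),
          round(toy_yearly(month_index), 1),
          round(toy_forecast(month_index), 1))
```

Reading the component plots this way explains why the forecast line keeps rising while repeating the same seasonal shape every year.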

Conclusion

In this tutorial, we described how to use the Prophet library to perform time series forecasting in Python. We have been using out-of-the-box parameters, but Prophet enables us to specify many more arguments. In particular, Prophet provides the functionality to bring your own knowledge about time series to the table.

Here are a few more things you could try:

  • Assess the effect of holidays by including your prior knowledge on holiday months (for example, we know that the month of December is a holiday month). The official documentation on modeling holidays will be helpful.
  • Change the range of your uncertainty intervals, or forecast further into the future.

For more practice, you could also try to load another time series dataset to produce your own forecasts. Overall, Prophet offers a number of interesting features, including the opportunity to tailor the forecasting model to the requirements of the user.
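
As a starting point for the first suggestion, Prophet accepts holiday information as a DataFrame with a holiday column and a ds column, plus optional lower_window and upper_window columns. The sketch below builds such a frame for every December from 1949 to 1963 using pandas only. The label december_travel is a hypothetical name, and marking the first day of each December is an assumption that suits the monthly, start-of-month timestamps used in this guide.

```python
import pandas as pd

# One row per December across the observed and forecast period (1949 to 1963).
december_starts = pd.to_datetime([f"{year}-12-01" for year in range(1949, 1964)])

holidays = pd.DataFrame({
    "holiday": "december_travel",  # hypothetical label
    "ds": december_starts,
    "lower_window": 0,             # no extra days before the date
    "upper_window": 0,             # no extra days after the date
})
print(holidays.head(3))

# With Prophet installed, this frame would be passed at construction time:
# my_model = Prophet(interval_width=0.95, holidays=holidays)
```

After refitting with the holidays argument, the components plot gains an extra panel, which makes it possible to compare the estimated December effect against the yearly seasonality.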

Reference: digitalocean