Time series forecasting is one of the most powerful tools for planning your business — and it can be surprisingly simple. In this blog, we’ll walk through a time series-based approach to sales forecasting with Prophet, a Python library designed for simplicity and accuracy. This method doesn’t require complex models or detailed knowledge of external drivers like marketing campaigns or economic conditions.

This approach works best for sales environments that are relatively stable, where seasonality and trends are the main influencers, and external factors don’t fluctuate dramatically.

You’ll see how to:

  • Clean and prepare raw historical data
  • Explore monthly trends
  • Forecast sales using time alone
  • Add custom seasonality for better accuracy using sales forecasting with Prophet

Step 1: Load and Combine Sales Data

We begin by loading multiple Excel files, each representing part of our historical sales records, to prepare for sales forecasting with Prophet.

import pandas as pd
import numpy as np
from datetime import datetime
from prophet import Prophet
import plotly.graph_objects as go


DF2024 = pd.read_excel('/content/drive/MyDrive/D&I /Website Contents/Blogs/Data/Sales Prediction Project/DF_2024.xlsx')
DF2025_1 = pd.read_excel('/content/drive/MyDrive/D&I /Website Contents/Blogs/Data/Sales Prediction Project/DFforRCOECSJan.xlsx')
DF2025_2 = pd.read_excel('/content/drive/MyDrive/D&I /Website Contents/Blogs/Data/Sales Prediction Project/DFforRCOECSFeb.xlsx')
DF2025_3 = pd.read_excel('/content/drive/MyDrive/D&I /Website Contents/Blogs/Data/Sales Prediction Project/DFforRCOECSMar.xlsx')
DF2023 = pd.read_excel('/content/drive/MyDrive/D&I /Website Contents/Blogs/Data/Sales Prediction Project/DF_FY2023.xlsx')

We only extract the relevant columns for our analysis:

DF2024_extract = DF2024[["Sales Document", 'Document Date (Date Received/Sent)', 'Sales Organization', 'Initial Order Quantity',
         'Order quantity in EB'
             ]]
DF2023_extract = DF2023[["Sales Document", 'Document Date (Date Received/Sent)', 'Sales Organization', 'Initial Order Quantity',
         'Order quantity in EB'
             ]]
DF2025_1_extract = DF2025_1[["Sales Document", 'Document Date (Date Received/Sent)', 'Sales Organization', 'Initial Order Quantity',
         'Order quantity in EB'
             ]]
DF2025_2_extract = DF2025_2[["Sales Document", 'Document Date (Date Received/Sent)', 'Sales Organization', 'Initial Order Quantity',
         'Order quantity in EB'
             ]]
DF2025_3_extract = DF2025_3[["Sales Document", 'Document Date (Date Received/Sent)', 'Sales Organization', 'Initial Order Quantity',
         'Order quantity in EB'
             ]]

Then, we merge them into one complete dataset:

DF_concat = pd.concat([DF2024_extract, DF2023_extract, DF2025_1_extract, DF2025_2_extract, DF2025_3_extract], axis=0)
DF_concat.rename(columns = {'Sales Document':'Sales_Document', 'Document Date (Date Received/Sent)':'Document_Date'}, inplace = "True")

We format and extract month/year for time grouping:

DF_concat['Document_Date'] = pd.to_datetime(DF_concat['Document_Date'])
DF_concat['Document_Month'] = DF_concat['Document_Date'].dt.month
DF_concat['Document_Year'] = DF_concat['Document_Date'].dt.year

Now, we aggregate the data by month and year, grouped by sales organization:

DF_concat_groupby= DF_concat.groupby(['Document_Month', 'Document_Year', 'Sales Organization']).agg({'Initial Order Quantity':'sum', 'Order quantity in EB':'sum',                                                          'Sales_Document': pd.Series.nunique})
DF_concat_groupby = DF_concat_groupby.reset_index()

Let’s focus on Sales Organization Hong Kong:

Here’s a plot of how sales documents vary month to month:

import plotly.express as px


# Group by Document_Year and Document_Month and sum Sales_Document


sub_organization = DF_concat_groupby[DF_concat_groupby['Sales Organization'] == 8601]


sales_by_month_year = sub_organization.groupby(['Document_Year', 'Document_Month'])['Sales_Document'].sum().reset_index()


# Create a new 'Month Year' column for labeling (format: Jan-2023, Feb-2023, etc.)
sales_by_month_year['Month Year'] = pd.to_datetime(
    sales_by_month_year['Document_Year'].astype(str) + '-' +
    sales_by_month_year['Document_Month'].astype(str) + '-01'
).dt.strftime('%b-%Y')  # e.g., Jan-2023


# Sort values properly
sales_by_month_year = sales_by_month_year.sort_values(by=['Document_Year', 'Document_Month'])


# Plot interactive line chart
fig = px.line(
    sales_by_month_year,
    x='Month Year',
    y='Sales_Document',
    title='Sum of Sales Documents by Month-Year - HK',
    labels={'Month Year': 'Month-Year', 'Sales_Document': 'Sum of Sales Documents'},
    markers=True
)


fig.update_layout(xaxis_tickangle=-45)
fig.show()
Seasonality is highlighted in the year end period

Step 3: Format Data for Prophet

Sales forecasting with Prophet requires two columns:

  • ds: datestamp
  • y: the value we want to predict

We convert and split the data into training and testing:

sales_HK = DF_concat_groupby[DF_concat_groupby['Sales Organization'] == 8601]
sales_HK = sales_HK[['Document_Month', 'Document_Year', 'Sales_Document', 'Month Year']]
sales_HK_test = sales_HK[sales_HK['Document_Year'] == 2025]
sales_HK_test['ds'] = pd.to_datetime(sales_HK_test['Document_Year'].astype(str) + '-' + sales_HK_test['Document_Month'].astype(str) + '-01')
sales_HK_test = sales_HK_test.rename(columns={'Sales_Document': 'y'})
sales_HK_test = sales_HK_test[["ds", "y"]]
sales_HK_train = sales_HK[sales_HK['Document_Year'] != 2025]
sales_HK_train['ds'] = pd.to_datetime(sales_HK_train['Document_Year'].astype(str) + '-' + sales_HK_train['Document_Month'].astype(str) + '-01')
sales_HK_train = sales_HK_train.rename(columns={'Sales_Document': 'y'})
sales_HK_train = sales_HK_train[["ds", "y"]]
sales_HK_train.head()
sales_hk_all = pd.concat([sales_HK_train, sales_HK_test], axis=0)

Step 4: Build a Basic Forecast with Prophet

Now we begin sales forecasting with Prophet by fitting our first model:

model = Prophet(yearly_seasonality=True)
model.fit(sales_HK_train)
future = model.make_future_dataframe(periods=6, freq='MS')  # MS = Month Start
forecast = model.predict(future)

Let’s compare actual vs predicted values:

forecast_result = forecast[['ds', 'yhat']].merge(sales_hk_all[['ds', 'y']], on='ds', how='left')


# Plot actual vs forecast
fig = go.Figure()


# Actuals
fig.add_trace(go.Scatter(x=forecast_result['ds'], y=forecast_result['y'],
                         mode='lines+markers', name='Actual'))


# Forecast
fig.add_trace(go.Scatter(x=forecast_result['ds'], y=forecast_result['yhat'],
                         mode='lines+markers', name='Forecast (yhat)',
                         line=dict(dash='dash')))


fig.update_layout(title='Actual vs Forecast of Sales_Document',
                  xaxis_title='Date',
                  yaxis_title='Sales_Document',
                  xaxis_tickformat='%b-%Y')


fig.show()
The forecast doesn’t fit with the actual line → we need to adjust the model

Step 5: Add Custom Seasonality

To improve the model, we add domain-specific seasonality. For example:

  • December peaks (end-of-year rush)
december_peak = pd.DataFrame({
    'holiday': 'december_peak',
    'ds': pd.date_range(start='2023-12-01', end='2025-12-01', freq='12MS'),  # Dec 1 each year
    'lower_window': 0,
    'upper_window': 0
})
model_season = Prophet(
    yearly_seasonality=True,  
    holidays=december_peak
)


model_season.add_country_holidays(country_name='HK')
model_season.fit(sales_HK_train)
future_season = model_season.make_future_dataframe(periods=6, freq='MS')  # 6 future months
forecast_season = model_season.predict(future_season)

Step 6: Final Forecast Plot

Let’s compare the improved forecast:

forecast_season_result = forecast_season[['ds', 'yhat']].merge(sales_hk_all[['ds', 'y']], on='ds', how='left')


# Plot actual vs forecast_season
fig = go.Figure()


# Actuals
fig.add_trace(go.Scatter(x=forecast_season_result['ds'], y=forecast_season_result['y'],
                         mode='lines+markers', name='Actual'))


# forecast_season
fig.add_trace(go.Scatter(x=forecast_season_result['ds'], y=forecast_season_result['yhat'],
                         mode='lines+markers', name='Forecast (yhat)',
                         line=dict(dash='dash')))


fig.update_layout(title='Actual vs Forecast of Sales_Document',
                  xaxis_title='Date',
                  yaxis_title='Sales_Document',
                  xaxis_tickformat='%b-%Y')


fig.show()
The model with additional seasonal trend is more aligned to the actual values. It makes The forecast much smoother and realistic, especially near recent months.

To evaluate the accuracy and practicality of the adjusted Prophet model, I compared its results with two other forecasting approaches: Excel’s built-in forecast and a SARIMA model.

Excel’s forecast, while fast and accessible, significantly overshot the actual sales for mid-to-late 2025—largely due to its strong reliance on historical seasonality without adapting to the recent downward trend in sales.

The SARIMA model performed better in this regard, capturing the decline in early 2025 more accurately and providing confidence intervals for risk-aware planning. However, it still slightly underpredicted in some months and can be less intuitive for non-technical stakeholders.

In contrast, the adjusted Prophet model struck the best balance: it incorporated both trend shifts and seasonal effects, including holidays, and tracked actual 2025 sales more closely. This highlights how tailored, domain-aware forecasting models can outperform generic tools, especially in dynamic retail environments.

Conclusion

This whole approach is based on time series prediction, which means we rely purely on past patterns in time and volume — no product-level, pricing, promotion, or channel data needed.

That’s why sales forecasting with Prophet is ideal for:

  • Retail or B2B environments with regular ordering patterns
  • Forecasting sales where external events (ads, macroeconomic changes, etc.) don’t have a huge impact
  • Businesses that want quick wins from forecasting without building complex models

Of course, this method won’t capture things like sudden marketing pushes, product launches, or viral demand surges. But when your sales flow is fairly steady and seasonal, time-based models like Prophet are a perfect fit.

Want to explore Prophet in more detail? Check out its official website.

Leave a Reply

Your email address will not be published. Required fields are marked *