Sales Forecasting Using Azure Machine Learning and Time Series Forecasting: End-to-End Example

The key elements of the solution are:

  1. Actors:

    • User: Interacts with the system by sending sales data for predictions.
    • Data Scientist: Configures the Azure Machine Learning workspace, submits experiments, and trains the model.
  2. Components:

    • Azure ML Workspace: Central platform where experiments and model training are managed.
    • Experiment: Groups the training runs submitted to Azure ML under one name so they can be tracked and compared.
    • AutoML Model: The model trained using Azure AutoML for time series forecasting.
    • Deployed Service: The deployed web service that predicts future sales.
    • Azure Storage: Stores the trained model artifacts and other necessary data.
    • Azure Monitor: Tracks the performance and logs predictions for the deployed model.
  3. Database:

    • Historical Sales Data: Represents the data source containing past sales information.
  4. Interactions:

    • User sends data to the Deployed Service for prediction.
    • The Deployed Service interacts with the AutoML Model to make predictions.
    • Data Scientist configures the Azure ML Workspace and submits the Experiment to train the model using historical data.
    • Azure Storage stores both the model and other relevant data.
    • Azure Monitor tracks the service's performance, including predictions and logs.

This guide demonstrates how to forecast future sales using Azure Machine Learning (Azure ML) and Time Series Forecasting. It covers data preparation, model training, evaluation, and deployment.

Step 1: Prerequisites

  1. Azure Account: Sign up for an Azure account if you don't already have one.
  2. Azure ML Workspace: Set up an Azure Machine Learning workspace.
  3. Tools:
    • Python installed locally or via Azure ML Notebook.
    • Libraries: azureml-sdk (the v1 Python SDK, which provides azureml.core), azureml-train-automl (for AutoML), pandas, numpy, matplotlib, scikit-learn, requests.

Step 2: Data Preparation

  1. Collect Historical Sales Data:

    • Example columns: Date, Sales, Product, Region.
    • Ensure data is clean and formatted properly (e.g., consistent date format).
  2. Load Data:

    import pandas as pd
    
    # Load your data
    data = pd.read_csv('sales_data.csv')
    data['Date'] = pd.to_datetime(data['Date'])
    data.set_index('Date', inplace=True)
    print(data.head())
  3. Visualize Sales Trends:

    import matplotlib.pyplot as plt
    
    data['Sales'].plot(figsize=(10, 6))
    plt.title("Historical Sales Data")
    plt.xlabel("Date")
    plt.ylabel("Sales")
    plt.show()
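
Before modeling, it helps to confirm that the series has one row per date and no missing days. The following is a minimal sketch, assuming daily data with the Date and Sales columns listed above. The function name and the choice to interpolate gaps linearly are illustrative and not part of the original workflow, and the sample rows are toy values.

```python
import pandas as pd


def regularize_daily_sales(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per calendar day, indexed by Date, with gaps filled.

    Assumes columns named 'Date' and 'Sales' (taken from the example columns).
    """
    out = df.copy()
    out["Date"] = pd.to_datetime(out["Date"])
    # Collapse duplicate dates (e.g. several products) into a daily total.
    daily = out.groupby("Date")["Sales"].sum().sort_index()
    # Reindex to a continuous daily calendar so missing days become NaN.
    full_range = pd.date_range(daily.index.min(), daily.index.max(), freq="D")
    daily = daily.reindex(full_range)
    # Assumption: linear interpolation is acceptable for short gaps.
    daily = daily.interpolate(method="linear")
    daily.index.name = "Date"
    return daily.to_frame()


# Toy input: unsorted, one duplicated date, one missing day (2024-01-02).
raw = pd.DataFrame(
    {
        "Date": ["2024-01-03", "2024-01-01", "2024-01-01", "2024-01-04"],
        "Sales": [30.0, 10.0, 5.0, 40.0],
    }
)
clean = regularize_daily_sales(raw)
print(clean)
```

Summing duplicates is only appropriate when a single total series is wanted; to forecast each product or region separately, the grouping would need to include those columns.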

Step 3: Create and Configure Azure ML Workspace

  1. Connect to Azure ML:

    from azureml.core import Workspace
    
    ws = Workspace.from_config()
    print("Workspace loaded:", ws.name)
  2. Create an Experiment:

    from azureml.core import Experiment
    
    experiment = Experiment(workspace=ws, name='sales-forecasting')
    print("Experiment created:", experiment.name)
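
Workspace.from_config() reads the workspace details from a config.json file, which it looks for in the current directory or in a .azureml subdirectory. The sketch below writes such a file with the three keys the SDK expects. The helper name is hypothetical, the values are placeholders to replace with your own, and a temporary directory stands in for your project folder.

```python
import json
import tempfile
from pathlib import Path


def write_workspace_config(directory, subscription_id, resource_group, workspace_name):
    """Write .azureml/config.json under `directory` and return its path."""
    config_dir = Path(directory) / ".azureml"
    config_dir.mkdir(parents=True, exist_ok=True)
    config_path = config_dir / "config.json"
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    config_path.write_text(json.dumps(config, indent=2))
    return config_path


# Placeholder values: replace with the details of your own workspace.
# In practice, pass your project directory instead of a temporary one.
path = write_workspace_config(
    tempfile.mkdtemp(),
    subscription_id="<subscription-id>",
    resource_group="<resource-group>",
    workspace_name="<workspace-name>",
)
print("Wrote", path)
```

The same file can also be downloaded from the workspace overview page in the Azure portal, which avoids typing the values by hand.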

Step 4: Time Series Forecasting Model

  1. Split Data:

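    # Feature engineering: Lagged features, rolling averages, etc.
    data = data.sort_index()
    data['Lag1'] = data['Sales'].shift(1)
    data = data.dropna()
    
    # Split chronologically: a random split would leak future sales into training.
    # Assumes a single series with one row per date.
    split_point = int(len(data) * 0.8)
    train_data, test_data = data.iloc[:split_point], data.iloc[split_point:]
    
    X_test = test_data.drop(columns=['Sales'])
    y_test = test_data['Sales']
    
  2. Train a Model (AutoML or Custom):

    • Use Azure AutoML for simplicity:
      from azureml.automl.core.forecasting_parameters import ForecastingParameters
      from azureml.core.dataset import Dataset
      from azureml.train.automl import AutoMLConfig
      
      # Upload the training split and register it as a tabular dataset
      # (the dataset name is illustrative)
      train_dataset = Dataset.Tabular.register_pandas_dataframe(
          train_data.reset_index(),
          target=ws.get_default_datastore(),
          name="sales-train",
      )
      
      forecasting_parameters = ForecastingParameters(
          time_column_name="Date",
          forecast_horizon=len(test_data),  # cover the whole test period
      )
      
      # Without a compute_target the run executes locally
      automl_config = AutoMLConfig(
          task="forecasting",
          primary_metric="normalized_root_mean_squared_error",
          training_data=train_dataset,
          label_column_name="Sales",
          forecasting_parameters=forecasting_parameters,
          max_concurrent_iterations=4,
          experiment_timeout_minutes=30,
      )
      
      remote_run = experiment.submit(automl_config, show_output=True)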
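
Step 4 mentions lagged features and rolling averages but only builds a single lag. The sketch below shows one way to build several lags and a rolling mean without leaking the current value into its own feature. It assumes a DataFrame with a Sales column sorted by date; the function name, the column names Lag7 and RollMean7, and the window of 7 are illustrative choices. AutoML can generate similar features itself, so this is mainly relevant when training a custom model.

```python
import pandas as pd


def add_lag_features(df: pd.DataFrame, lags=(1, 7), window=7) -> pd.DataFrame:
    """Add lagged Sales columns and a trailing rolling mean, then drop
    the leading rows that have no complete history."""
    out = df.copy()
    for lag in lags:
        out[f"Lag{lag}"] = out["Sales"].shift(lag)
    # Shift by one before rolling so the window ends at the previous row
    # and never includes the value being predicted.
    out[f"RollMean{window}"] = out["Sales"].shift(1).rolling(window).mean()
    return out.dropna()


# Toy series: sales of 1.0 through 10.0 on ten consecutive days.
sales = pd.DataFrame(
    {"Sales": [float(v) for v in range(1, 11)]},
    index=pd.date_range("2024-01-01", periods=10, freq="D"),
)
features = add_lag_features(sales)
print(features)
```

With ten rows and a longest lag of 7, only the last three rows keep a complete set of features, which illustrates how much history each added lag costs.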

Step 5: Evaluate the Model

  1. View Best Model:

    best_run, fitted_model = remote_run.get_output()
    print("Best model:", best_run)
  2. Test Model Performance:

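    from sklearn.metrics import mean_squared_error
    
    # forecast() needs the time column and returns (predictions, transformed data)
    y_pred, _ = fitted_model.forecast(X_test.reset_index())
    
    mse = mean_squared_error(y_test, y_pred)
    print("Mean Squared Error:", mse)
    print("Root Mean Squared Error:", mse ** 0.5)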
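
An error value is easier to judge next to a baseline. The sketch below computes RMSE, a range-normalized RMSE, and a naive "repeat the last value" forecast to compare against. Dividing by the range of the actual values is an assumption about how the normalized metric is scaled, so check the Azure ML documentation for the exact definition. All numbers are toy values and the function names are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error of two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def normalized_rmse(actual, predicted):
    """RMSE divided by the range of the actual values (assumed scaling)."""
    value_range = max(actual) - min(actual)
    if value_range == 0:
        raise ValueError("actual values are constant; range is zero")
    return rmse(actual, predicted) / value_range


def naive_forecast(history, horizon):
    """Baseline: repeat the last observed value for every future step."""
    return [history[-1]] * horizon


history = [100.0, 110.0, 120.0]    # toy training tail
actual = [130.0, 140.0, 150.0]     # toy test values
model_pred = [128.0, 141.0, 149.0]  # toy model output

baseline_pred = naive_forecast(history, len(actual))
print("Model RMSE:", rmse(actual, model_pred))
print("Baseline RMSE:", rmse(actual, baseline_pred))
print("Model normalized RMSE:", normalized_rmse(actual, model_pred))
```

A model that does not clearly beat the naive baseline on held-out data is usually not worth deploying.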

Step 6: Deploy the Model

  1. Register the Model:

    from azureml.core.model import Model
    
    # Register the model file that the best AutoML run saved to its outputs folder
    model = best_run.register_model(model_name="sales_forecast_model", model_path="outputs/model.pkl")
    print("Model registered:", model.name)
  2. Deploy as Web Service:

    from azureml.core.model import InferenceConfig
    from azureml.core.webservice import AciWebservice
    
    # Reuse the environment of the best run so the service has the same packages
    env = best_run.get_environment()
    
    inference_config = InferenceConfig(
        entry_script="score.py",  # Scoring script
        environment=env,         # Azure ML environment
    )
    
    deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
    
    service = Model.deploy(
        workspace=ws,
        name="sales-forecasting-service",
        models=[model],
        inference_config=inference_config,
        deployment_config=deployment_config,
    )
    service.wait_for_deployment(show_output=True)
    print("Service deployed at:", service.scoring_uri)
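
The deployment references score.py but does not show it. An Azure ML entry script defines init(), which runs once when the container starts, and run(), which handles each request. The sketch below is one possible score.py, not the author's. It assumes the registered file is named model.pkl, that requests carry a JSON object with a "data" list of rows, and that the model exposes forecast() returning predictions plus a transformed frame, as AutoML forecasting models do.

```python
import json
import os

import pandas as pd

model = None  # populated by init()


def init():
    """Called once when the container starts: load the registered model."""
    global model
    import joblib  # imported here so run() can be exercised without it

    # AZUREML_MODEL_DIR is set by Azure ML inside the deployed container.
    # Assumption: the registered model file is named model.pkl.
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model.pkl")
    model = joblib.load(model_path)


def run(raw_data):
    """Called per request: parse JSON rows and return forecasts."""
    try:
        rows = json.loads(raw_data)["data"]
        frame = pd.DataFrame(rows)
        if "Date" in frame.columns:
            frame["Date"] = pd.to_datetime(frame["Date"])
        # Assumption: forecast() returns (predictions, transformed frame).
        predictions, _ = model.forecast(frame)
        return {"forecast": [float(p) for p in predictions]}
    except Exception as exc:
        # Return the message so the caller can see what went wrong.
        return {"error": str(exc)}
```

Returning errors in the response body keeps debugging simple during development; a production service might log them instead and return a generic message.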

Step 7: Test the Deployed Service

  1. Send Test Data:
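    import requests
    
    # The payload must match what score.py expects; the values here are illustrative
    payload = {"data": [{"Date": "2024-01-01", "Lag1": 120}]}
    response = requests.post(service.scoring_uri, json=payload)
    response.raise_for_status()
    print("Predicted Sales:", response.json())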

Step 8: Monitor and Improve

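  1. Use Azure Monitor to track request latency, failures, and logged predictions for the deployed service. Application Insights can be enabled on the service with service.update(enable_app_insights=True).
  2. Retrain the model regularly on new sales data, and redeploy it when forecast accuracy on recent data degrades.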
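
One way to decide when to update the model is to compare the error on recent predictions with the error measured at training time. The sketch below keeps a sliding window of squared errors and flags retraining when recent RMSE exceeds the baseline by a chosen factor. The class name, the window size, and the 1.5x tolerance are illustrative assumptions and do not come from Azure Monitor; the sample values are toy data.

```python
import math
from collections import deque


class ForecastErrorMonitor:
    """Track recent forecast errors and flag when they drift above a baseline."""

    def __init__(self, baseline_rmse, window=30, tolerance=1.5):
        self.baseline_rmse = baseline_rmse
        self.tolerance = tolerance
        # Only the most recent `window` squared errors are kept.
        self.squared_errors = deque(maxlen=window)

    def record(self, actual, predicted):
        """Store the squared error of one prediction once the actual is known."""
        self.squared_errors.append((actual - predicted) ** 2)

    def recent_rmse(self):
        """RMSE over the current window, or 0.0 if nothing was recorded."""
        if not self.squared_errors:
            return 0.0
        return math.sqrt(sum(self.squared_errors) / len(self.squared_errors))

    def needs_retraining(self):
        """True when recent RMSE exceeds tolerance times the baseline RMSE."""
        return self.recent_rmse() > self.tolerance * self.baseline_rmse


monitor = ForecastErrorMonitor(baseline_rmse=10.0, window=3)
for actual, predicted in [(100, 95), (110, 100), (120, 90), (130, 100)]:
    monitor.record(actual, predicted)
print("Recent RMSE:", monitor.recent_rmse())
print("Retrain:", monitor.needs_retraining())
```

The actual sales values arrive later than the predictions, so in practice record() would be called from a scheduled job that joins logged predictions with newly loaded sales data.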
