How to build a simple sales forecasting model in KNIME

Introduction

Hi, I am Akira, the editor-in-chief of Data Without Code. In our previous Use Case tutorial, we built an incredible tool to calculate the ROI of your marketing campaigns automatically. You can now tell your boss exactly what happened in the past.

But what if your CEO asks you a much harder question? “Based on our growth this year, what will our revenue be next quarter?”

In Excel, you might try to answer this by creating a line chart and adding a “trendline.” While that looks nice in a PowerPoint presentation, a line drawn on a chart is hard to reuse: you cannot easily feed it new months, rerun it when the data changes, or plug it into a larger workflow. If you want a repeatable way to predict future sales based on historical data, you need to step into the world of Machine Learning.

Do not panic! As a DX manager who transitioned from a non-tech background, I promise you do not need to learn Python or complex math. In this tutorial, I will show you how to build a simple sales forecasting model in KNIME using basic Linear Regression.

What is Linear Regression?

Linear Regression is one of the most fundamental machine learning algorithms used in business. In plain English, it looks at the relationship between an independent variable (like “Time” or “Marketing Spend”) and a dependent variable (like “Sales”).

The algorithm draws the mathematical “line of best fit” through your historical data so that it can predict what the Sales will be at a future point in Time.
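For readers who are curious about what “line of best fit” means, here is the idea written as a formula. This is entirely optional, and the symbols are my own notation rather than anything KNIME shows you: t is the Month Index, y is the Total Revenue, a bar marks an average, a is the starting level, and b is the growth per month.

```latex
\[
\hat{y} = a + b\,t,
\qquad
b = \frac{\sum_{i=1}^{n} (t_i - \bar{t})(y_i - \bar{y})}{\sum_{i=1}^{n} (t_i - \bar{t})^2},
\qquad
a = \bar{y} - b\,\bar{t}
\]
```

In other words, the line is chosen so that the squared gaps between the line and your historical sales points are as small as possible overall.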

In KNIME, we build this predictive model using two special nodes: the Learner and the Predictor.

Step 1: Prepare Your Historical Data

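To train a model, you need a clean dataset. Let’s assume you have a table showing your monthly sales over the last three years, which means 36 rows. You should have a numeric column for “Month Index” (1, 2, 3… up to 36) and a numeric column for “Total Revenue”. Store the index as a number rather than as text such as “Month 1”, because a text column is treated as a set of categories and cannot carry a trend forward to a month the model has never seen.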

(Tip: If your data only has raw dates, make sure to read my guide on extracting months and years from messy dates first, and use the GroupBy node to calculate the total monthly revenue.)
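If it helps to see the logic of that preparation step outside of KNIME, here is a minimal Python sketch of the same idea: sum the revenue per calendar month, then number the months 1, 2, 3 and so on. You do not need this to follow the tutorial. The sample rows and the function name monthly_revenue are made up for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw sales rows: (order date, amount). Not real data.
raw_sales = [
    (date(2023, 1, 5), 1200.0),
    (date(2023, 1, 20), 800.0),
    (date(2023, 2, 3), 1500.0),
    (date(2023, 3, 14), 900.0),
    (date(2023, 3, 28), 1300.0),
]


def monthly_revenue(rows):
    """Sum amounts per (year, month), then number the months 1, 2, 3, ..."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[(day.year, day.month)] += amount
    # Sort chronologically so Month Index 1 is the earliest month.
    ordered = sorted(totals.items())
    return [
        {"Month Index": i, "Total Revenue": total}
        for i, (_, total) in enumerate(ordered, start=1)
    ]


monthly_table = monthly_revenue(raw_sales)
for row in monthly_table:
    print(row)
```

One caveat with this simple numbering: a month with no sales at all would be skipped, which shifts every later index by one. If your data has gaps, fill in the empty months with zero revenue first.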

Step 2: Train the Model (Linear Regression Learner)

Now we need to teach KNIME how your sales have grown over time.

  1. Search for the Linear Regression Learner node and connect it to your historical data.
  2. Double-click the node to open the configuration window.
  3. At the top, you will see a box for the Target Column. This is what you want to predict. Select “Total Revenue”.
  4. In the lower box, select the column that drives the change. Move “Month Index” to the green “Include” side.

Execute the node. When the traffic light turns green, congratulations! The Learner node has just analyzed your 36 months of history and mathematically “learned” the growth trend. Notice that its output port is a blue square. This means it holds a trained Model, not standard data.
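For the curious, here is a rough Python sketch of what “learning the trend” amounts to when there is a single input column. It is a simplified illustration of ordinary least squares, not KNIME’s actual implementation, and the 36 months of revenue are invented numbers that grow by exactly 500 per month so the result is easy to check by eye.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input column: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How strongly x and y move together, relative to how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up history: revenue starts at 10,000 and grows by 500 every month.
month_index = list(range(1, 37))
total_revenue = [10_000 + 500 * m for m in month_index]

intercept, slope = fit_line(month_index, total_revenue)
print(f"intercept={intercept:.1f}, slope={slope:.1f}")
```

The two numbers that come out, the intercept and the slope, are essentially everything a model like this has to remember. Real sales data will not sit perfectly on a line, so the fitted slope will be an average growth rate rather than an exact one.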

Step 3: Predict the Future (Regression Predictor)

Now that the model is trained, we need to ask it a question: “What will the revenue be in Month 37, 38, and 39?”

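To do this, create a tiny new dataset (using the Table Creator node) with a single column containing the numbers 37, 38, and 39. Give this column exactly the same name and number type as the one you used for training (“Month Index”); otherwise the Predictor will not be able to match it to the model.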

  1. Search for the Regression Predictor node and drag it to your canvas.
  2. Connect the blue square output from the Learner node into the blue square input of the Predictor node.
  3. Connect your new “Future Months” dataset into the standard black triangle input of the Predictor node.

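Double-click the Predictor node, simply click OK, and execute it. When you right-click and view the “Predicted data,” KNIME will show a brand new column with the forecasted revenue for months 37, 38, and 39. Keep in mind that these numbers are estimates that extend your historical straight-line trend, not guarantees. A simple model like this knows nothing about seasonality or one-off events.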
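If you want to demystify the prediction step, it boils down to plugging each future month into the equation of the line. The sketch below assumes the model is just an intercept and a slope. The two coefficient values are hypothetical placeholders, and the output column name “Prediction (Total Revenue)” is my assumption about how the new column is labeled, so check the name in your own output table.

```python
# Hypothetical coefficients, standing in for what a trained model might hold.
INTERCEPT = 10_000.0  # assumed baseline revenue at Month Index 0
SLOPE = 500.0         # assumed revenue growth per month


def predict(rows, intercept, slope):
    """Return copies of the rows with a prediction column appended."""
    return [
        {**row, "Prediction (Total Revenue)": intercept + slope * row["Month Index"]}
        for row in rows
    ]


# The "Future Months" table: one column, same name as the training column.
future_months = [{"Month Index": m} for m in (37, 38, 39)]

forecast = predict(future_months, INTERCEPT, SLOPE)
for row in forecast:
    print(row)
```

Notice that the function looks up the column by its name. That is the same reason the column in your Table Creator node has to be called “Month Index”.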

Conclusion: Your Next Steps

You have just built your very first machine learning pipeline without writing any code! By combining a Learner and a Predictor node, you have elevated your skills from simple historical reporting to true predictive analytics.

Predicting sales is fantastic for executives, but what about the operations team? If you know your sales are going to increase next month, you need to make sure you have enough products in the warehouse.

Are you ready to optimize your supply chain? Join me in our next Use Case tutorial where we tackle inventory management: Calculating safety stock using KNIME!
