NextLytics Blog

Automated Time Series Forecasting using Machine Learning

Written by Jasmin | Feb 18, 2021 3:22:32 PM

Forecasts of future values play an important role in many business areas because they support better-founded decisions. Many companies still calculate them in Excel today. In the digital age, and in exceptional circumstances such as the current Corona pandemic, adaptable, flexible and innovative alternatives from the machine learning (ML) domain are needed all the more urgently. They allow dependencies to be modeled, planning processes to be implemented and executed across the enterprise, and error rates to be reduced substantially.

In this article, you will learn where time series forecasting can be used, what automated time series forecasting is, and how the forecasting models are applied. The models presented range from those available in your ERP system to advanced algorithms that can be implemented in Python.

Use cases for time series forecasting with machine learning

To begin, here is a rough overview of typical use cases for time series analysis in the business environment:



Cash flow and revenue forecasting

Machine learning significantly increases the accuracy of cash flow and revenue forecasting. By incorporating internal and external sources, investment capital can be reliably planned.

Demand Forecasting

Historical sales data is analyzed to forecast future sales volumes. When additional information (e.g. marketing activities) is taken into account, market trends can be identified at an early stage, which enables better production planning.

Logistics & Supply Chain

Multidimensional forecasts of consumption or sales make it possible to place the necessary orders early and in line with demand, which saves considerable time along the process chain.

Employee turnover and personnel planning

Personnel deployment can also be optimized through time series forecasting. For example, characteristics such as inquiry volume and demand development are considered over time.

This selection of use cases can be extended considerably, as time series analyses can be used wherever there is historical data over time. The time interval (hourly, daily, monthly) can be flexibly selected and adapted.

What is Automated Forecasting? 

The term time series comes from mathematics and describes the temporal course of observations or measured variables. Predicting future values of a measured variable from the existing time series, i.e. from historical data, is called time series forecasting. As with most ML methods, the amount of available data is a crucial factor, and too little data is one of the typical pitfalls. The available data is divided into training and test data. For time series forecasting, the training data is usually the earliest available data and the test data is the latest available data, so that the test simulates a real forecast. A model learns certain characteristics from the training data, such as seasonality, trends, and holiday effects. To determine its accuracy, the model is then evaluated on the test data, for which the true values are known. To improve accuracy, the model parameters can be adjusted accordingly.
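
The split-and-test procedure described above can be sketched in a few lines of plain Python. The monthly values, the function names and the seasonal-naive baseline (repeat the last observed season) are assumptions made for illustration; they do not come from a specific tool.

```python
def chronological_split(series, test_size):
    """Split a time-ordered list: earliest values for training, latest for testing."""
    if not 0 < test_size < len(series):
        raise ValueError("test_size must be between 1 and len(series) - 1")
    return series[:-test_size], series[-test_size:]


def seasonal_naive_forecast(train, horizon, season_length):
    """Baseline model: repeat the last observed season for the forecast horizon."""
    last_season = train[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between known and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Assumption: hypothetical monthly values, three years of a repeating
# pattern that rises by 2 units per year.
pattern = [10, 12, 15, 14, 18, 22, 25, 24, 19, 15, 12, 11]
series = [value + 2 * year for year in range(3) for value in pattern]

# The first two years are training data, the last year simulates the forecast.
train, test = chronological_split(series, test_size=12)
forecast = seasonal_naive_forecast(train, horizon=len(test), season_length=12)
mae = mean_absolute_error(test, forecast)
print(f"MAE on the held-out year: {mae:.2f}")
```

Because the baseline ignores the yearly increase, its error on the held-out year equals that increase. A model that also captures the trend would score better on the same test data.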

A time series can be described by different mathematical models, each of which has its own parameters. A classical example is the so-called ARIMA model. ARIMA(p, d, q) stands for Auto-Regressive (AR) Integrated (I) Moving Average (MA) and is described by three parameters: p is the order of the autoregressive part, d is the degree of differencing, and q is the order of the moving-average part.

In classical forecasting, the parameters must be determined using various methods, such as the computation and visualization of the Autocorrelation and Partial Autocorrelation Function (ACF/PACF).
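
To make the ACF more concrete, here is a minimal sketch of the sample autocorrelation in plain Python. The function name and the example series are assumptions for illustration. The PACF is not computed here, and in practice a statistics library would normally provide both functions along with the plots.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation of a series for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered)
    acf = []
    for lag in range(max_lag + 1):
        # Covariance between the series and a copy of itself shifted by `lag`.
        covariance = sum(centered[t] * centered[t - lag] for t in range(lag, n))
        acf.append(covariance / variance)
    return acf


# Assumption: hypothetical series that repeats every four observations.
series = [1, 3, 5, 3] * 6
acf = autocorrelation(series, max_lag=8)
for lag, value in enumerate(acf):
    print(f"lag {lag}: {value:+.2f}")
```

For this series the autocorrelation is strongly positive at lags 4 and 8 and strongly negative at lag 2, which is the kind of pattern an analyst reads off an ACF plot to spot a seasonal period before choosing model parameters.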

For example, after intensive analysis and transformations, a forecast for a sample passenger data set was modeled with an ARIMA(2, 1, 2) model (source: https://www.kaggle.com/freespirit08/time-series-for-beginners-with-arima).

In automated forecasting, on the other hand, the parameter values are determined automatically by a search procedure rather than by hand, with the aim of achieving a better forecast. This allows users to compare different models more quickly and easily and thus get the best out of their data. Auto-ARIMA implementations, for example, exist for several technologies and determine p, d, and q automatically. Another well-known tool for automated time series forecasting is the Prophet framework developed by Facebook, which is particularly suitable for time series with strong seasonal effects.
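
The idea of letting a search procedure pick the parameters can be illustrated with a deliberately small stand-in. The sketch below is not Auto-ARIMA: it tunes the single smoothing parameter alpha of simple exponential smoothing by trying a grid of candidates and keeping the one with the smallest one-step-ahead error. All names and the example data are assumptions.

```python
def one_step_sse(series, alpha):
    """Sum of squared one-step-ahead errors of simple exponential smoothing."""
    level = series[0]
    sse = 0.0
    for value in series[1:]:
        sse += (value - level) ** 2                   # forecast is the current level
        level = alpha * value + (1 - alpha) * level   # update level with new value
    return sse


def auto_select_alpha(series, candidates=None):
    """Try every candidate alpha and return the one with the smallest error."""
    if candidates is None:
        candidates = [i / 20 for i in range(1, 20)]   # 0.05, 0.10, ..., 0.95
    return min(candidates, key=lambda alpha: one_step_sse(series, alpha))


# Assumption: hypothetical series with a steady upward trend.
trending = [100 + 5 * t for t in range(30)]
best_alpha = auto_select_alpha(trending)
print(f"selected alpha: {best_alpha}")
```

On a steadily trending series the search favors a large alpha, because recent values carry the most information. Auto-ARIMA implementations follow the same principle with a larger search space (p, d, q) and typically an information criterion instead of a plain error sum.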

Other variants of automated forecasting include:

  • Automatic selection of the most suitable algorithm:
    Different time series models are automatically tested according to different criteria and error measures, and the one that best fits the training data is selected.
  • Automated feature engineering:
    The most important information is extracted from the data and used to generate new features (columns of a data matrix). This includes, for example, whether a certain day was a holiday. In addition, missing values can be automatically replaced with suitable values.
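
As a minimal sketch of the feature engineering variant, the following plain-Python example derives calendar features (including a holiday flag) from dates and fills missing values with the previous known value. The holiday calendar, the sales figures and all function names are hypothetical.

```python
from datetime import date, timedelta

# Assumption: hypothetical holiday calendar; a real project would load
# the calendar for its own region.
HOLIDAYS = {date(2021, 1, 1), date(2021, 1, 6)}


def fill_missing(values):
    """Replace None with the previous known value (forward fill).

    Leading gaps take the first known value."""
    last = next(v for v in values if v is not None)
    filled = []
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def build_features(start, values):
    """Return one feature row (a dict) per day, starting at `start`."""
    filled = fill_missing(values)
    rows = []
    for offset, value in enumerate(filled):
        day = start + timedelta(days=offset)
        rows.append({
            "date": day,
            "value": value,
            "weekday": day.weekday(),           # 0 = Monday, 6 = Sunday
            "is_weekend": day.weekday() >= 5,
            "is_holiday": day in HOLIDAYS,
            "was_missing": values[offset] is None,
        })
    return rows


# Assumption: hypothetical daily sales starting on 2021-01-01, with two gaps.
daily_sales = [120, None, 135, 140, None, 90, 150]
rows = build_features(date(2021, 1, 1), daily_sales)
for row in rows:
    print(row)
```

Each dictionary key corresponds to one column of the data matrix. Keeping a `was_missing` flag next to the imputed value lets a model distinguish real observations from filled-in ones.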

How is it applied?

Forecasts can be generated using various tools. In a separate article, for example, we gave an overview of how to generate forecasts using SAP Analytics Cloud (SAC). Other ERP systems such as Microsoft Dynamics also offer forecasting capabilities. The disadvantage of most of these tools, however, is that they usually offer only a small number of algorithms, allow no parameter tuning, and produce models that are difficult to reproduce. Such black box models make the results hard to interpret, which increases the risk of misinterpretation and thus the susceptibility to errors.

Forecasting using SAP HANA Predictive Analytics Library (PAL)

SAP HANA PAL provides a compromise for forecasting in the SAP ecosystem. Here, users have a significantly higher number of algorithms at their disposal, as well as various options for time series analysis. Detailed documentation and theoretical foundations for each model provide more clarity and traceability.

If you want to learn more about the features of PAL, we recommend our overview article.

However, PAL also has disadvantages that should not be underestimated. The number of algorithms is still limited, the library is not intuitive to apply, and its user-friendliness could be improved.

Forecasting with Python

If you want to work with advanced artificial intelligence, however, there is no way around classical programming. Programming languages such as Python offer one of the best options for automated time series forecasting.

Various libraries (e.g. fbprophet and pmdarima) are available, so that Prophet and Auto-ARIMA, for example, can be implemented and executed within a very short time. Thanks to detailed documentation and a wide range of publications, the functions are easy to understand.

Another library worth mentioning is tsfresh. It can automatically extract hundreds of features from a time series and thereby transform the forecasting problem into a regression problem. Powerful regression models can then be used to predict future values.
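
The transformation from a forecasting problem into a regression problem can be sketched without any library. The example below does not use tsfresh: three hand-written window features stand in for its automatically extracted ones, and a k-nearest-neighbour regressor is chosen only because it fits in a few lines. All names and the data are assumptions.

```python
def window_features(window):
    """Summarize a window of past values with a few hand-picked features."""
    return (
        sum(window) / len(window),   # mean level of the window
        window[-1],                  # most recent value
        window[-1] - window[0],      # change across the window
    )


def make_regression_dataset(series, window_size):
    """Build (features, target) pairs: each window predicts the value after it."""
    X, y = [], []
    for end in range(window_size, len(series)):
        X.append(window_features(series[end - window_size:end]))
        y.append(series[end])
    return X, y


def knn_predict(X, y, query, k=3):
    """k-nearest-neighbour regression: mean target of the k closest feature rows."""
    def squared_distance(row):
        return sum((a - b) ** 2 for a, b in zip(row, query))

    nearest = sorted(range(len(X)), key=lambda i: squared_distance(X[i]))[:k]
    return sum(y[i] for i in nearest) / len(nearest)


# Assumption: hypothetical series that repeats every four observations.
series = [1, 3, 5, 3] * 6
X, y = make_regression_dataset(series, window_size=4)

# Forecast the next value from the features of the most recent window.
next_value = knn_predict(X, y, window_features(series[-4:]))
print(f"predicted next value: {next_value}")
```

Once the series is in this tabular form, any regression model can take the place of the nearest-neighbour rule, which is why automated feature extraction combines well with models such as gradient boosting.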

One of the newest libraries, which uses tsfresh as well as gradient boosting, is atspy (Automated Time Series in Python). With just one command, multiple time series models can be applied and even combined. The library appeared only last year and is still under active development, but it promises users a significant reduction in model development effort.
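
What applying and combining several models in a single call can look like is sketched below in plain Python. This is not the atspy API: the three baseline models, the function names and the data are assumptions chosen to keep the example self-contained.

```python
def naive(train, horizon):
    """Repeat the last observed value."""
    return [train[-1]] * horizon


def drift(train, horizon):
    """Extend the straight line between the first and last observation."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + slope * (h + 1) for h in range(horizon)]


def historical_mean(train, horizon):
    """Repeat the mean of all training values."""
    return [sum(train) / len(train)] * horizon


MODELS = {"naive": naive, "drift": drift, "mean": historical_mean}


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def run_all_models(series, horizon):
    """Fit every model on the early data and score it on the last `horizon` values."""
    train, test = series[:-horizon], series[-horizon:]
    forecasts = {name: model(train, horizon) for name, model in MODELS.items()}
    scores = {name: mean_absolute_error(test, fc) for name, fc in forecasts.items()}
    best = min(scores, key=scores.get)
    # Simple combination: average the forecasts of all models per time step.
    ensemble = [
        sum(fc[h] for fc in forecasts.values()) / len(forecasts)
        for h in range(horizon)
    ]
    scores["ensemble"] = mean_absolute_error(test, ensemble)
    return best, scores


# Assumption: hypothetical series with a linear upward trend.
series = [50 + 3 * t for t in range(24)]
best, scores = run_all_models(series, horizon=6)
print(f"best single model: {best}")
for name, score in scores.items():
    print(f"{name}: MAE = {score:.2f}")
```

On this trending series the drift model wins because it is the only one that extrapolates the trend. Note that the sketch scores the models on held-out data; selecting by fit to the training data is also possible but tends to favor overly flexible models.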

Time Series Forecasting - Our Summary

Process automation brings many advantages: manual work is replaced, repetitive tasks are automated and productivity increases. Forecasts of any kind, in a wide variety of business areas, can benefit from these advantages. The field of automated forecasting is evolving constantly. In the Python environment in particular, new libraries and innovative developments appear all the time, so that an implementation in Python can give business users excellent forecast quality. However, learning a programming language is time-consuming. Before falling back directly on the forecasting functions of your ERP system, we therefore recommend taking a look at the various algorithms of SAP HANA PAL, whose high forecasting quality we can confirm from our own experience and productive use.

In our whitepaper "SAP BW and State of the Art Machine Learning", we highlight other approaches that combine machine learning and SAP BW in addition to SAP HANA PAL. Furthermore, we introduce an open-source supported approach based on the NextLytics Python Software Development Kit (NLY-SDK) and give clear recommendations on how you can derive tangible value from your data.