The lack of predictability across sectors has made businesses more conscious of their risk management. To get a glimpse of the future, forward-thinking business leaders started exploring the advantages of time series analysis and applying them to their own niches.
While there were many successful cases, none of them was a walk in the park. First attempts started with combing through volumes of data, only to end up with a non-functional model with a high error rate on their hands. Yet, giving up wasn’t an option: resorting to guesswork would have cost CFOs a lot more in the long run.
Nevertheless, enriching performance with time series analysis and forecasting doesn’t have to be complicated. To set the right expectations, we’ll review the most complex aspects and provide a detailed breakdown of techniques and methods for analyzing time series data and building prediction models.
Time series analysis and forecasting definition
First, it's necessary to break down the meaning of time series. A time series is a sequence of data points recorded in chronological order over a certain period of time. These data points lie at the core of time series analysis and forecasting. Depending on the problem that needs to be solved (the time series problem), data for time series analysis can be univariate or multivariate.
Univariate. Only one time series is used for analysis as both the input and the output sequence. Example: forecasting future sales of a specific product in a physical store based on its past sales data.
Multivariate. Several time series are used as input and output sequences. This approach to time series data can use either equal inputs and outputs or different inputs and outputs.
- Equal input/output example: forecasting the sales rate of a certain product across several physical stores based on past performance data.
- Different input/output example: snow forecasting for the upcoming winter using the temperature and snowfall data for the last winter.
While time series is often tied in with forecasting, it should be noted that time series forecasting and time series analysis aren't synonyms. Time series forecasting is used to predict future changes and trends. Those predictions are based on past performance data. However, accurate time series forecasting is only possible with time series analysis.
Unlike time series forecasting, which is focused on future results, time series analysis explores how data changes over constant time intervals (minutes, days, weeks, months, or years) to figure out what drives these changes and how they can impact future behavior. For that reason, forecasting is always preceded by time series analysis. The analysis relies on autocorrelation between time series values: because current values are related to earlier ones, past performance data provides insights into future behavior patterns.
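To make the idea of autocorrelation concrete, here is a minimal sketch in plain Python. The `autocorrelation` helper and the `sales` figures are made up for illustration; the point is only that a repeating pattern shows up as a strong correlation at the matching lag.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive lag."""
    n = len(series)
    mean = sum(series) / n
    # Total variation of the series around its mean.
    denom = sum((x - mean) ** 2 for x in series)
    # Co-variation between each value and the value `lag` steps earlier.
    numer = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom


# Hypothetical sales figures that repeat every four periods.
sales = [10, 20, 30, 20, 10, 20, 30, 20, 10, 20, 30, 20]

acf_lag4 = autocorrelation(sales, 4)  # values one full cycle apart move together
acf_lag2 = autocorrelation(sales, 2)  # values half a cycle apart move in opposition
```

A clearly positive value at lag 4 and a clearly negative value at lag 2 hint at a four-period cycle, which is the kind of structure a forecasting model can then exploit.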
In contrast to time series forecasting, time series analysis helps businesses do more than look into the future. It also enables them to accomplish the following goals:
- Make sense of the historical datasets and use that information for more accurate decision-making
- Gain a better understanding of the relationship between businesses and their environment
- Pinpoint factors impacting the trends or patterns within a specific time period
- Remove irrelevant or insignificant data (white noise) that doesn’t affect trends or patterns and shouldn’t be used when making decisions
- Prioritize the right steps and purchases when planning sales for future periods
Time series analysis types
Logically, precise time series forecasting requires the right approach to analyzing time series data. This, in turn, necessitates the understanding of time series analysis models.
BASED ON TIME INTERVAL:
- Sliding window time series analysis uses a fixed time interval (the most recent month, three months, year, or five years) to forecast future results.
- Expanded window time series analysis uses all the historical data available (non-fixed time interval) to forecast future results and performance—for example, evaluating the sales rate for the next year based on the sales results for previous years.
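The difference between the two window types can be sketched with a deliberately naive forecaster that predicts the next value as an average. The function names and the `monthly_sales` figures are hypothetical; real models would replace the average with something more capable, but the windowing logic stays the same.

```python
def sliding_window_forecast(history, window):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def expanding_window_forecast(history):
    """Forecast the next value as the mean of every observation so far."""
    return sum(history) / len(history)


# Hypothetical monthly sales figures with an upward trend.
monthly_sales = [100, 110, 120, 130, 140, 150]

sliding = sliding_window_forecast(monthly_sales, window=3)  # uses 130, 140, 150
expanding = expanding_window_forecast(monthly_sales)        # uses all six months
```

On trending data like this, the sliding window reacts faster to recent changes, while the expanding window smooths over the whole history.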
BASED ON THE FORECASTING HORIZON:
- Single-step time series models are the simplest ones as they are used for predicting one time-step into the future. For instance, when a business needs to forecast the sales rate for a certain product for the next month or week, it should use a single-step time series model.
- Multistep single-shot time series models involve planning several time-steps ahead, for example, calculating the sales rate for the product for the next several months. Compared to single-step models, multistep single-shot time series models are more complicated and less likely to yield accurate forecasting results. The more time-steps into the future a business tries to forecast, the harder it becomes to predict underlying factors that may affect trends or patterns. As a result, maintaining accurate forecasts turns into a complicated task.
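- Multistep autoregressive time series models make one prediction at a time and feed each forecast back in as input for the next time-step. Such models can provide more accurate results than multistep single-shot time series analysis models. But they take a highly skilled data science team to manage: an error made in an earlier forecast will inevitably be carried over into the next ones.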
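As a rough illustration of forecasting horizons, the sketch below fits a hypothetical one-lag linear model and uses it first for a single step and then recursively, where each prediction becomes the input for the next one. The model form, function names, and `history` values are assumptions chosen for brevity, not a recommended production setup.

```python
def fit_one_lag_model(series):
    """Least-squares fit of y[t] = a * y[t-1] + b."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    a = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    b = mean_y - a * mean_x
    return a, b


def forecast_single_step(series, a, b):
    """Predict exactly one time-step ahead from the last observed value."""
    return a * series[-1] + b


def forecast_recursive(series, a, b, steps):
    """Predict several steps ahead, feeding each forecast back in as input."""
    forecasts = []
    last = series[-1]
    for _ in range(steps):
        last = a * last + b  # any error in `last` propagates to later steps
        forecasts.append(last)
    return forecasts


# Hypothetical sales history growing by 4 units per period.
history = [100.0, 104.0, 108.0, 112.0, 116.0]
a, b = fit_one_lag_model(history)

next_value = forecast_single_step(history, a, b)
next_three = forecast_recursive(history, a, b, steps=3)
```

Because every recursive step starts from the previous forecast rather than from an observed value, a small error in the fitted coefficients compounds over the horizon.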
Methods to analyze time series data and build prediction models
With time series analysis basics out of the way, it is now possible to take a closer look at the multitude of forecasting time series data methods. Based on the complexity and type of data used, they can be divided into time series-only approaches and feature engineering approaches.
Time series-only approaches
When a company has only time series data on its hands or needs to make sense of a particular sequence related to a particular project or product, it’s more likely to use methods and models for processing time series data exclusively. These are the most conventional approaches to time series analysis that don’t require other data to deliver accurate and insightful results.
- ARIMA (Autoregressive Integrated Moving Average)
The ARIMA method is well-known for its ability to make sense of non-stationary data and provide managers with actionable guidelines.
Whenever ARIMA models are mentioned, ARMA models are brought up next by association. Therefore, before we explain how ARIMA models work, it makes sense to avoid potential misconceptions and clarify some differences between the two.
UNDERSTANDING ARMA AND ARIMA
ARMA:
- Is the baseline for ARIMA
- Assumes that all the data is stationary, works with pre-differenced data sets
- Used for baseline forecasting
ARIMA:
- Extends ARMA
- Transforms non-stationary data with the help of differencing
- Includes trend differencing
Essentially, ARIMA models are made from the ARMA baseline with the inclusion of the integrated (I) component that makes them more flexible and less dependent on well-behaved data only.
HOW ARIMA WORKS
Autoregression (AR)
Autoregression models use the past values of the analyzed variable (time series data) to find the correlation between the current value and past values, and make a forecast based on that correlation.
Integrated (I)
The integrated component is responsible for differencing, i.e., replacing each value with its change from the previous value to turn non-stationary data into stationary data. Differencing removes trends and stabilizes the mean of the time series, which makes further analysis possible.
Moving average (MA)
The moving average models add flexibility to the ARIMA approach by analyzing regression error and finding the linear correlation between the current value and the linear combinations of error terms across different points in time.
Just like ARMA models, ARIMA models rely on statistical properties of past performance data that are expected to remain unchanged over time. But they can also take non-stationary data and convert it into stationary data that can be used for gleaning insights, which ARMA models can’t do.
This flexibility, as well as their relatively simple interpretability and maintenance, makes ARIMA models a go-to time series analysis method across a large number of industries. However, there are downsides to such simplicity, and the inability to account for turning points is one of them. Since ARIMA models are based on generalizing data, they rely on a limited set of parameters to build predictions. This makes them unfit for working with seasonal time series, which is why data scientists use more robust SARIMA (Seasonal + ARIMA) models for analyzing non-stationary seasonal time series while preserving the convenience of ARIMA.
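The differencing step behind the integrated (I) component is simple enough to show in a few lines. This is a sketch of first-order differencing only, with hypothetical helper names and data; it is not a full ARIMA implementation.

```python
def difference(series):
    """First-order differencing: each value minus the one before it."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def undifference(first_value, diffs):
    """Invert differencing by accumulating the changes from a starting value."""
    values = [first_value]
    for d in diffs:
        values.append(values[-1] + d)
    return values


# Hypothetical non-stationary series: the level keeps rising by 3 per period.
trending = [10, 13, 16, 19, 22, 25]

diffs = difference(trending)                  # the trend is gone, only changes remain
restored = undifference(trending[0], diffs)   # forecasts are mapped back the same way
```

After differencing, the series fluctuates around a constant level, which is the kind of input ARMA-style models assume. Forecasts made on the differenced scale are converted back to the original scale by accumulating them, as `undifference` does.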
- Recurrent neural networks
Recurrent neural networks (RNNs) are designed for accurate sequence processing—since time series is such a sequence, RNNs are often implemented for time series forecasting. What makes RNNs applicable for working with time series is their ability to "memorize" input data similar to how brain neurons store information.
One of the most outstanding examples of RNN is Long Short-Term Memory (LSTM), which provides a polished and more efficient version of a vanilla recurrent neural network. LSTM models are able to identify a wide set of long-term and short-term patterns and account for non-linear relationships that can affect the forecast. These qualities make LSTM models a great choice for working with seasonal time series data and injecting accuracy into multistep single-shot time series analysis—which is why LSTM is actively used in healthcare (long-term prognosis building), long-term tourism demand forecasting, and sales forecasting.
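The "memory" of a recurrent network comes from a hidden state that is carried from one time-step to the next. The sketch below is a single-neuron vanilla RNN with hand-picked weights, purely for illustration; it is not an LSTM, which adds gating on top of this idea, and a real model would learn its weights from data.

```python
import math


def rnn_forward(inputs, w_x, w_h, bias):
    """Run a one-neuron vanilla RNN over a sequence and return every hidden state."""
    hidden = 0.0
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous state.
        hidden = math.tanh(w_x * x + w_h * hidden + bias)
        states.append(hidden)
    return states


# A single spike followed by silence (hypothetical input sequence).
sequence = [1.0, 0.0, 0.0]

with_memory = rnn_forward(sequence, w_x=1.0, w_h=0.5, bias=0.0)
without_memory = rnn_forward(sequence, w_x=1.0, w_h=0.0, bias=0.0)
```

With a non-zero recurrent weight `w_h`, the spike at the first step still influences the state two steps later; with `w_h` set to zero, the network forgets it immediately.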
- Forecast framework
Forecast frameworks are hybrid forecasting methods that combine techniques from different methods to offer more effective and flexible solutions for specific problems. The development of forecasting frameworks allows for bypassing the limitations of existing time series analysis methods and improving the strongest elements of the technique.
Speaking of what forecast frameworks can do, there is no better and more illustrative example than Prophet—a forecasting tool developed and open-sourced by Facebook. Having played a major role in Facebook’s decision-making process and making trustworthy forecasts, Prophet now does the same for businesses, providing benefits and features for covering their forecasting needs.
Mostly, Prophet is designed for seasonal time series forecasting. Based on an additive regression model that automatically identifies the changes in trends, it fits those trends with seasonality (daily, weekly, yearly) and operates on a customizable list of holidays. The latter provides users with a number of parameters they can adjust, allowing them to make forecasts for specific seasonal activities relevant to their businesses.
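To give a feel for what an additive model means, here is a simplified sketch in plain Python that adds up a trend, a weekly seasonal component, and optional holiday effects. It does not use or reproduce Prophet's API or fitting procedure; the function names, the straight-line trend, and the data are all assumptions made for illustration.

```python
def fit_linear_trend(values):
    """Least-squares straight line through the series, indexed by time-step."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    slope = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values)) / sum(
        (t - mean_t) ** 2 for t in range(n)
    )
    return slope, mean_y - slope * mean_t


def fit_seasonality(values, slope, intercept, period):
    """Average what is left after removing the trend, per position in the cycle."""
    buckets = [[] for _ in range(period)]
    for t, y in enumerate(values):
        buckets[t % period].append(y - (slope * t + intercept))
    return [sum(b) / len(b) for b in buckets]


def predict_additive(t, slope, intercept, seasonal, holiday_effects=None):
    """Forecast = trend + seasonal component + optional holiday adjustment."""
    holiday = (holiday_effects or {}).get(t, 0.0)
    return slope * t + intercept + seasonal[t % len(seasonal)] + holiday


# Hypothetical four weeks of daily demand: upward trend plus a weekly pattern.
weekly_pattern = [-6, 0, 3, 6, 3, 0, -6]
demand = [50 + 2 * t + weekly_pattern[t % 7] for t in range(28)]

slope, intercept = fit_linear_trend(demand)
seasonal = fit_seasonality(demand, slope, intercept, period=7)

plain_day = predict_additive(28, slope, intercept, seasonal)
# Assume day 28 is a holiday that historically adds 15 units of demand.
holiday_day = predict_additive(28, slope, intercept, seasonal, {28: 15.0})
```

Because the components are simply added together, each one can be inspected or adjusted on its own, which is a large part of what makes additive models easy to explain to business users.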
Feature engineering approaches
Within the framework of feature engineering, data scientists enrich the time series data set with large amounts of other data (location, price, weather, revenue) to increase forecasting accuracy. When using other features, time series tasks can be approached as standard supervised machine learning (ML) problems.
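One common way to turn a time series into an ML-ready table is to use past values (lags) as features alongside the extra data. The helper below is a minimal sketch with hypothetical names and numbers; it only shows the reshaping, not the model training.

```python
def make_supervised(series, n_lags, extra_features=None):
    """Build (features, target) rows from a series and optional per-step features."""
    rows = []
    for t in range(n_lags, len(series)):
        # The previous `n_lags` values become input features.
        features = list(series[t - n_lags:t])
        if extra_features is not None:
            # Extra data known for step t (e.g. weather) is appended.
            features.extend(extra_features[t])
        rows.append((features, series[t]))
    return rows


# Hypothetical daily sales and the temperature recorded on each day.
sales = [10, 12, 15, 14, 18]
temperature = [[20.0], [22.0], [25.0], [24.0], [27.0]]

rows = make_supervised(sales, n_lags=2, extra_features=temperature)
```

Each row now pairs "the last two sales values plus today's temperature" with "today's sales", so any tabular ML algorithm can be trained on it.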
- Linear regression models
The goal of linear regression models is to identify a linear relationship between variables. This means they need to analyze a number of data points, find the correlation, and outline the relationship between the variables. All linear regression models are based on dependent and independent variables. The dependent variable is the one that needs a forecast, while the independent variable is the input used to estimate the dependent variable as accurately as possible.
Linear regression models are often used for relatively straightforward forecasting tasks, e.g., forecasting energy consumption by making it the dependent variable and using the day of the week as an independent variable.
Linear regression models can include more than one independent variable, meaning it's possible to use a linear regression model to forecast energy consumption using both day of the week and temperature as independent variables. Still, there can be only one dependent variable.
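The energy example with two independent variables can be sketched as an ordinary least squares fit. Everything below is an illustrative assumption: the weekend flag stands in for the day of the week, the observations are invented, and the small solver exists only to keep the example free of external libraries.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b using Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_linear_regression(features, targets):
    """Ordinary least squares; returns [intercept, coefficient_1, ...]."""
    design = [[1.0] + list(row) for row in features]  # leading 1.0 = intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_energy(coefs, row):
    """Apply the fitted linear relationship to one set of independent variables."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))


# Hypothetical observations: (is_weekend, temperature) -> energy consumption.
observations = [(0, 10.0), (0, 20.0), (1, 15.0), (1, 25.0), (0, 30.0)]
consumption = [250.0, 300.0, 245.0, 295.0, 350.0]

coefs = fit_linear_regression(observations, consumption)
weekend_at_20_degrees = predict_energy(coefs, (1, 20.0))
```

The fitted coefficients are directly readable as "effect of a weekend" and "effect of one degree of temperature", which is the interpretability advantage linear regression is known for.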
- Gradient boosting
Lauded by data scientists as an extremely effective ML algorithm for building time series analysis and forecasting models, gradient boosting indeed has many advantages to offer.
This ML algorithm is based on ensemble learning, which combines a collection of weaker models into a stronger one: each new model is trained to correct the errors made by the ensemble built so far. In gradient boosting, the ensemble typically consists of multiple decision trees added one at a time, with each new tree fit to the remaining errors of the trees already in the model. Two of the most popular gradient boosting frameworks are LightGBM and XGBoost.
Another noteworthy feature of the gradient boosting algorithm is that it doesn't need data normalization, unlike linear regression. In turn, linear regression models are easier to interpret, compared to gradient boosting models.
Generally, gradient boosting is considered to be the most efficient algorithm for tabular data, but its compatibility with extra features made it a popular choice among data scientists wanting to make the most out of time series analysis data.
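The core loop of gradient boosting can be sketched with the weakest possible trees: one-split "stumps" on a single feature, each fit to the errors left by the previous ones. This is a teaching sketch under squared-error loss with hypothetical names and data; it is not how LightGBM or XGBoost are implemented, and those libraries should be used for real work.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree (a stump) to the current residuals."""
    best = None
    # Try every threshold that leaves at least one point on each side.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left) + sum(
            (r - right_mean) ** 2 for r in right
        )
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosting(xs, ys, n_rounds, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a stump to what is still wrong."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [
            p + learning_rate * predict_stump(stump, x)
            for p, x in zip(predictions, xs)
        ]
    return {"base": base, "rate": learning_rate, "stumps": stumps}


def predict_boosted(model, x):
    correction = sum(predict_stump(s, x) for s in model["stumps"])
    return model["base"] + model["rate"] * correction


def mean_squared_error(model, xs, ys):
    return sum((y - predict_boosted(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Hypothetical data: temperature vs. demand with a non-linear shape.
temps = [1, 2, 3, 4, 5, 6, 7, 8]
demand = [5, 6, 8, 12, 20, 21, 22, 30]

mse_mean_only = mean_squared_error(fit_boosting(temps, demand, 0), temps, demand)
mse_one_round = mean_squared_error(fit_boosting(temps, demand, 1), temps, demand)
mse_twenty_rounds = mean_squared_error(fit_boosting(temps, demand, 20), temps, demand)
```

The training error shrinks as rounds are added because every new stump targets the residuals of the ensemble so far. In practice, the number of rounds and the learning rate are tuned on held-out data to avoid overfitting.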
- Dense neural networks
Dense neural networks (DNN) provide a complicated structure of hidden layers that are interconnected, exchanging information similarly to the way a human brain would. Each DNN layer contains nodes (neurons), and every node sends its output to the nodes of the next layer. The final layer sends the processed data to the output as the model's prediction.
The weights of a dense neural network are optimized through gradient descent, which provides the versatility necessary for covering a wide range of forecasting tasks. However, what makes dense neural networks an effective baseline for time series analysis is not their learning ability but their potential to evolve and expand their capacity for more complex data.
Without activation functions, a stack of dense layers can only model linear relationships, but DNN models can be trained to find non-linear ones. That requires introducing a non-linear activation function, which makes stacking multiple layers worthwhile and ultimately increases the model’s capacity for learning and processing complex data.
One of the most frequently used non-linear activation functions is the Rectified Linear Unit (ReLU). Being very simple and easy to compute, ReLUs are used to accelerate DNN model training and improve testing performance.
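The role of the activation function is easy to see in a tiny forward pass. The sketch below wires up a two-layer dense network by hand (the weights are chosen manually rather than learned, and all names are hypothetical) so that it computes the absolute value of its input, something no purely linear model can do.

```python
def relu(values):
    """Rectified Linear Unit: keep positive values, replace negatives with zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One dense layer: every output neuron is connected to every input."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Apply each (weights, biases) layer in turn, with ReLU between layers."""
    activations = inputs
    for i, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if i < len(layers) - 1:  # no activation after the final (output) layer
            activations = relu(activations)
    return activations


# Hand-picked weights: the hidden layer computes relu(x) and relu(-x),
# and the output layer adds them up, which yields |x|.
layers = [
    ([[1.0], [-1.0]], [0.0, 0.0]),  # hidden layer: 1 input -> 2 neurons
    ([[1.0, 1.0]], [0.0]),          # output layer: 2 neurons -> 1 output
]

negative_case = forward([-3.0], layers)
positive_case = forward([2.0], layers)
```

Removing the `relu` call would collapse the two layers into a single linear function of the input, which is why the non-linearity is what gives depth its value.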
Don’t look for a silver bullet
The simplest yet the most relevant tactic by far.
It may seem that machine learning time series analysis is superior to statistical methods or vice versa, which is why some businesses prefer to chase after the most innovative models instead of figuring out what works for them. As a result, they forget that each method has its weaknesses and downsides that can negatively impact their business due to low accuracy or poor ROI.
The truth is, perfect time series analysis methods don’t exist. There are only methods that align with the specific forecasting problems of your business and the computational resources available. Selecting the right time series analysis method should be based on goals and priorities—and the volume of data you want to convert into future insights. It's unnecessary to invest in a costly forecast framework development when an ARIMA-based model can deliver the same results.
If you know your priorities and needs but need advice with choosing the most fitting time series analysis model—let’s chat. During a consultative session, our ML engineers and data scientists will fill in the gaps and help you map out the time series analysis model that will become your window into the future.