The lack of predictability across sectors made businesses more deliberate about their risk management. To get a glimpse of the future, forward-thinking business leaders started exploring the advantages of time series analysis and applying them to their own niche.
While there were many success stories, none of them was a walk in the park. First attempts often began with combing through volumes of data, only to end with a non-functional model and a high error rate. Yet giving up wasn't an option: resorting to guesswork would have cost CFOs far more in the long run.
Nevertheless, enriching performance with time series analysis and forecasting doesn't have to be complicated. To set the right knowledge and expectations, we'll review the most complex aspects and provide a detailed breakdown of techniques and methods for analyzing time series data and building prediction models.
First, it's necessary to break down the meaning of time series. A time series is a sequence of data points recorded in chronological order over a certain period of time. These data points lie at the core of time series analysis and forecasting. Depending on the problem that needs to be solved, the data used for time series analysis can be univariate (a single variable tracked over time) or multivariate (several variables tracked together).
While time series is often tied in with forecasting, it should be noted that time series forecasting and time series analysis aren't synonyms. Time series forecasting is used to predict future changes and trends, and those predictions are based on past performance data. However, accurate time series forecasting is only possible with time series analysis.
Unlike time series forecasting, which is focused on future results, time series analysis explores how data changes over constant time intervals (minutes, days, weeks, months, years) to figure out what drives those changes and how they can shape future behavior. For that reason, forecasting is always preceded by time series analysis. Time series analysis also relies on the autocorrelation between time series values: past performance data provides insights into future behavior patterns.
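To make the autocorrelation idea concrete, here is a minimal pure-Python sketch of the lag-k autocorrelation coefficient; the monthly sales figures are illustrative, not real data.

```python
def autocorrelation(series, lag):
    """Correlation between the series and itself shifted by `lag` steps."""
    n = len(series)
    mean = sum(series) / n
    # Denominator: total variation around the mean.
    var = sum((x - mean) ** 2 for x in series)
    # Numerator: co-variation between points `lag` steps apart.
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return cov / var

# Illustrative monthly sales with a visible seasonal swing.
sales = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118]
print(round(autocorrelation(sales, 1), 2))  # strong positive lag-1 correlation
```

A value close to 1 at a given lag means past values are highly informative about future ones, which is exactly the property forecasting methods exploit.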
Logically, precise time series forecasting requires the right approach to analyzing time series data. This, in turn, necessitates the understanding of time series analysis models.
Recursive multistep models, which feed each one-step forecast back in as the input for the next step, provide more accurate results than multistep single-shot models that predict the entire horizon at once. But they take a highly skilled data science team to manage: an error made during a previous forecasting step is immediately carried over into the next ones.
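The recursive approach, and why errors compound, can be sketched in a few lines. The one-step "model" below is a toy AR(1)-style rule chosen purely for illustration; in practice it would be a fitted ARIMA, LSTM, or similar model.

```python
def one_step_forecast(last_value):
    # Hypothetical fitted one-step model (illustrative coefficient only).
    return 0.9 * last_value

def recursive_forecast(last_observed, horizon):
    """Apply a one-step model repeatedly, feeding each forecast back in."""
    forecasts = []
    current = last_observed
    for _ in range(horizon):
        current = one_step_forecast(current)  # prediction becomes next input
        forecasts.append(current)
    return forecasts

print(recursive_forecast(100.0, 3))  # each step builds on the previous forecast
```

Because step two consumes the output of step one, any error in step one is baked into every later step, which is the trade-off noted above.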
With time series analysis basics out of the way, it is now possible to take a closer look at the multitude of methods for forecasting time series data. Based on the complexity and type of data used, they can be divided into time series-only approaches and feature engineering approaches.
When a company has only time series data on its hands or needs to make sense of a particular sequence related to a particular project or product, it’s more likely to use methods and models for processing time series data exclusively. These are the most conventional approaches to time series analysis that don’t require other data to deliver accurate and insightful results.
The ARIMA method is well-known for its ability to make sense of non-stationary data and provide managers with actionable guidelines.
Whenever ARIMA models are mentioned, ARMA models are brought up next by association. Therefore, before we explain how ARIMA models work, it makes sense to avoid potential misconceptions and clarify some differences between the two.
Essentially, ARIMA models are built on the ARMA baseline with the addition of the integrated (I) component, which makes them more flexible and less dependent on well-behaved, stationary data.
Just like ARMA models, ARIMA models outline statistical properties in the past performance data that are expected to remain unchanged over time. But ARIMA models can also analyze non-stationary data and convert it into stationary data that can be used for gleaning insights, which ARMA models can't do.
This flexibility, along with their relatively simple interpretability and maintenance, makes ARIMA models a go-to time series analysis method across a large number of industries. However, there are downsides to such simplicity, and the inability to account for turning points is one of them. Since ARIMA models generalize from data with a limited set of parameters, they are unfit for working with seasonal time series. That's why data scientists use the more robust SARIMA (Seasonal ARIMA) models, which analyze non-stationary seasonal time series while preserving the convenience of ARIMA.
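The mechanics of the "I" component can be illustrated with a deliberately simplified, pure-Python sketch of the ARIMA(1, 1, 0) idea: difference a trending (non-stationary) series, fit a single AR coefficient to the differences, then integrate the forecast back to the original scale. This is a teaching sketch, not a production ARIMA implementation (no intercept, no MA terms, no diagnostics).

```python
def difference(series):
    """I step: first differences turn a trending series into a stationary one."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

def fit_ar1(series):
    """Least-squares AR(1) coefficient of x[t] on x[t-1], no intercept."""
    num = sum(series[t - 1] * series[t] for t in range(1, len(series)))
    den = sum(x * x for x in series[:-1])
    return num / den if den else 0.0

def arima_110_forecast(series, horizon):
    diffs = difference(series)       # make the series stationary
    phi = fit_ar1(diffs)             # AR(1) on the differenced data
    last_diff, level = diffs[-1], series[-1]
    forecasts = []
    for _ in range(horizon):
        last_diff = phi * last_diff  # forecast the next difference
        level += last_diff           # integrate back to the original scale
        forecasts.append(level)
    return forecasts

trending = [10, 12, 15, 19, 24, 30]  # clearly non-stationary (upward trend)
print(arima_110_forecast(trending, 2))
```

Libraries such as statsmodels handle the same differencing-fit-integrate cycle, plus the MA terms and seasonal (SARIMA) extensions, automatically.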
Recurrent neural networks (RNNs) are designed for accurate sequence processing—since time series is such a sequence, RNNs are often implemented for time series forecasting. What makes RNNs applicable for working with time series is their ability to "memorize" input data similar to how brain neurons store information.
One of the most outstanding examples of an RNN is Long Short-Term Memory (LSTM), a polished and more efficient version of the vanilla recurrent neural network. LSTM models can identify a wide range of long-term and short-term patterns and account for non-linear relationships that affect the forecast. These qualities make LSTM models a great choice for working with seasonal time series data and for injecting accuracy into multistep single-shot time series analysis, which is why LSTM is actively used in healthcare (long-term prognosis building), long-term tourism demand forecasting, and sales forecasting.
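The "memory" that sets LSTM apart comes from its gated cell state. Below is a hedged, single-unit sketch of one LSTM forward step in pure Python with scalar, untrained weights; real implementations (e.g., in Keras or PyTorch) use weight matrices over many units and learn them from data.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One forward step of a single-unit LSTM cell.

    `w` maps each gate name to (input weight, recurrent weight, bias);
    the scalar weights here are arbitrary, for illustration only.
    """
    f = sigmoid(w["f"][0] * x + w["f"][1] * h_prev + w["f"][2])    # forget gate
    i = sigmoid(w["i"][0] * x + w["i"][1] * h_prev + w["i"][2])    # input gate
    o = sigmoid(w["o"][0] * x + w["o"][1] * h_prev + w["o"][2])    # output gate
    g = math.tanh(w["g"][0] * x + w["g"][1] * h_prev + w["g"][2])  # candidate
    c = f * c_prev + i * g   # cell state: keep some of the past, add some new
    h = o * math.tanh(c)     # hidden state passed to the next time step
    return h, c

# Run a short series through the cell with arbitrary (untrained) weights.
weights = {k: (0.5, 0.4, 0.1) for k in ("f", "i", "o", "g")}
h, c = 0.0, 0.0
for x in [0.2, 0.5, 0.9]:
    h, c = lstm_step(x, h, c, weights)
print(h, c)
```

The forget gate decides how much old memory to keep and the input gate how much new information to store, which is what lets LSTMs hold both short-term and long-term patterns.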
Forecast frameworks are hybrid forecasting methods that combine techniques from different methods to offer more effective and flexible solutions for specific problems. The development of forecasting frameworks allows for bypassing the limitations of existing time series analysis methods while combining their strongest elements.
Speaking of what forecast frameworks can do, there is no better and more illustrative example than Prophet, a forecasting tool developed and open-sourced by Facebook. Having played a major role in Facebook's decision-making by delivering trustworthy forecasts, Prophet now does the same for other businesses, providing the features needed to cover their forecasting needs.
Prophet is designed primarily for seasonal time series forecasting. Based on an additive regression model that automatically identifies changes in trends, it fits those trends with seasonality (daily, weekly, yearly) and operates on a customizable list of holidays. That list gives users a number of parameters they can adjust, allowing them to make forecasts for the specific seasonal activities relevant to their businesses.
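The additive structure behind Prophet-style models is simple to demonstrate: the forecast is a sum of a trend, a seasonal component, and a holiday effect. The components and numbers below are a hand-built toy, not Prophet's actual fitted terms.

```python
import math

def trend(day):
    return 100 + 0.5 * day  # illustrative steady linear growth

def weekly_seasonality(day):
    # A smooth component that repeats every 7 days.
    return 10 * math.sin(2 * math.pi * day / 7)

def holiday_effect(day, holidays):
    return 25 if day in holidays else 0  # bump on customizable holiday dates

def additive_forecast(day, holidays=frozenset()):
    # Prophet-style decomposition: trend + seasonality + holidays.
    return trend(day) + weekly_seasonality(day) + holiday_effect(day, holidays)

holidays = {358, 359}                    # illustrative day numbers
print(additive_forecast(14))             # ordinary day
print(additive_forecast(358, holidays))  # holiday gets an extra bump
```

Prophet fits each of these components from the data automatically; the customizable part is exactly the holiday list and the seasonality settings shown here as toy functions.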
Within the framework of feature engineering, data scientists enrich the time series data set with large amounts of other data (location, price, weather, revenue) to increase forecasting accuracy. With these extra features, time series tasks can be approached as supervised machine learning problems.
The goal of linear regression models is to identify a linear relationship between variables: analyze a number of data points, find the correlation, and outline the relationship. All linear regression models are based on dependent and independent variables. The dependent variable is the one that needs a forecast, while the independent variable enables the most accurate estimation of the dependent variable.
Linear regression models are often used for complicated forecasting tasks, e.g., forecasting energy consumption by making it the dependent variable and using the day of the week as an independent variable.
Linear regression models can include more than one independent variable, meaning it's possible to use a linear regression model to forecast energy consumption using both day of the week and temperature as independent variables. Still, there can be only one dependent variable.
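A minimal least-squares fit makes the dependent/independent split concrete. The sketch below uses temperature as the single independent variable and energy consumption as the dependent variable; the data points are illustrative, and extending to several independent variables would mean solving the same least-squares problem over more columns.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for one independent variable."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    # Slope = covariance(x, y) / variance(x); intercept anchors the means.
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return slope, intercept

temperature = [18, 21, 25, 28, 31, 34]        # independent variable, deg C
consumption = [210, 225, 260, 280, 310, 330]  # dependent variable, kWh

slope, intercept = fit_linear(temperature, consumption)

def predict(t):
    return slope * t + intercept

print(round(predict(30), 1))  # estimated consumption at 30 deg C
```

The fitted slope directly answers a business question: roughly how many extra kWh each additional degree costs.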
Lauded by data scientists as an extremely effective ML algorithm for building time series analysis and forecasting models, gradient boosting indeed has many advantages to offer.
This ML algorithm is based on ensemble learning, which involves a collection of weaker models that "learn" from each other to ultimately build a stronger model, one that remembers the errors made by its previous versions and rectifies them. An ensemble learning model consists of multiple decision trees; each new decision tree builds on the information gathered by the previously integrated decision trees, enhancing the model. The two most popular gradient boosting frameworks are LightGBM and XGBoost.
Another noteworthy feature of the gradient boosting algorithm is that it doesn't need data normalization, unlike linear regression. In turn, linear regression models are easier to interpret, compared to gradient boosting models.
Generally, gradient boosting is considered to be the most efficient algorithm for tabular data, but its compatibility with extra features made it a popular choice among data scientists wanting to make the most out of time series analysis data.
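A bare-bones sketch shows the core loop that frameworks like LightGBM and XGBoost execute at scale: each round fits a depth-1 "stump" to the current residuals, i.e., the errors of the ensemble so far. This toy version (squared-error loss, one feature) is for illustration only.

```python
def fit_stump(xs, residuals):
    """Find the single split on x that best reduces squared error."""
    best = None
    for threshold in xs:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, threshold, lmean, rmean)
    _, threshold, lmean, rmean = best
    return lambda x: lmean if x <= threshold else rmean

def gradient_boost(xs, ys, rounds=20, lr=0.3):
    base = sum(ys) / len(ys)                  # start from the mean prediction
    prediction = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, prediction)]
        stump = fit_stump(xs, residuals)      # weak learner on the errors
        stumps.append(stump)
        prediction = [p + lr * stump(x) for p, x in zip(prediction, xs)]
    return lambda x: base + sum(lr * s(x) for s in stumps)

xs = [1, 2, 3, 4, 5, 6]
ys = [5, 6, 7, 20, 21, 22]                    # sharp jump the stumps can find
model = gradient_boost(xs, ys)
print(model(2), model(5))                     # close to the group means 6 and 21
```

Each stump corrects what the previous rounds got wrong, which is exactly the "learning from earlier errors" behavior described above; note that no normalization of `xs` was needed.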
Dense neural networks (DNNs) provide a complicated structure of interconnected hidden layers that exchange information similarly to the way a human brain would. Each DNN layer is made up of nodes (neurons) that send their output to the next layer, and the final layer sends the processed data to the output.
Dense neural networks optimize their weights through gradient descent, which provides the versatility necessary for covering a wide range of forecasting tasks. However, what makes dense neural networks an effective baseline for time series analysis is not their learning ability but their potential to evolve and expand their capacity for more complex data.
While a DNN built from purely linear layers can only capture linear relationships, it can be trained to find non-linear ones. That requires introducing a non-linear activation function, which deepens the neural network by making the stacking of multiple layers meaningful and ultimately increases the model's capacity for learning and processing complex data.
One of the most frequently used non-linear activation functions is the Rectified Linear Unit (ReLU). Being very simple and easy to compute, ReLUs are used to accelerate DNN model training and improve testing performance.
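The effect of ReLU is easy to see in a tiny two-layer forward pass. The hand-picked weights below are illustrative (no training involved): with ReLU in the hidden layer, this small network computes the absolute difference of its two inputs, a genuinely non-linear function no single linear layer could produce.

```python
def relu(x):
    return max(0.0, x)  # cheap to compute, and cheap to differentiate

def dense_layer(inputs, weights, biases, activation=None):
    """One fully connected layer: weighted sum per neuron, then activation."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs

# Hidden layer: 2 inputs -> 2 ReLU neurons; output layer: plain linear sum.
hidden_w, hidden_b = [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]
output_w, output_b = [[1.0, 1.0]], [0.0]

def forward(x):
    hidden = dense_layer(x, hidden_w, hidden_b, activation=relu)
    return dense_layer(hidden, output_w, output_b)[0]

print(forward([3.0, 1.0]), forward([1.0, 3.0]))  # |x1 - x2| via ReLU
```

Drop the `activation=relu` argument and the whole network collapses into one linear map, which is why the non-linearity, not the depth alone, is what unlocks complex patterns.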
The simplest yet most relevant tactic by far is choosing the method that actually fits your business.
It may seem that machine learning time series analysis is superior to statistical methods or vice versa, which is why some businesses prefer to chase after the most innovative models instead of figuring out what works for them. As a result, they forget that each method has its weaknesses and downsides that can negatively impact their business due to low accuracy or poor ROI.
The truth is, perfect time series analysis methods don’t exist. There are only methods that align with the specific forecasting problems of your business and the computational resources available. Selecting the right time series analysis method should be based on goals and priorities—and the volume of data you want to convert into future insights. It's unnecessary to invest in a costly forecast framework development when an ARIMA-based model can deliver the same results.
If you know your priorities and needs but want advice on choosing the most fitting time series analysis model, let's chat. During a consultative session, our ML engineers and data scientists will fill in the gaps and help you map out the time series analysis model that will become your window into the future.