A survey of current chaotic trends and a discussion on managing unpredictability
“Hindsight is 20/20,” as the adage goes. Foresight is almost blind: often little can be said about the future with confidence. Predictive analytics is an attempt to bridge this gap. By learning patterns from past data and building models of outcomes, trends are projected forward to future time points and a best guess for the state of the future is constructed.
The simplest models build future predictions directly from past data. This can be highly effective when patterns of variation are consistent. When patterns change, however, these models must be updated to reflect the altered process, and when patterns are volatile and constantly in flux, simple smoothing and forecasting models are unlikely to be useful. In these environments, more complex models that incorporate additional variables may perform better. These additional variables, called covariates, should explain some or all of the volatility observed in the outcomes: they “co-vary” with the outcomes of interest, which is what makes them useful for predicting future outcome values. Effective covariates combined with past data yield more accurate future predictions.
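As a concrete illustration of the “simplest models” case, the sketch below applies simple exponential smoothing to a hypothetical series whose pattern shifts partway through. The series, the smoothing factor, and the horizon are illustrative assumptions, not part of any analysis in this article; the point is only that a level-only smoother has no way to anticipate a new regime, which is the gap covariates are meant to fill.

```python
import numpy as np

def ses_forecast(y, alpha=0.2, horizon=6):
    """Simple exponential smoothing: a weighted average of past observations,
    with weights decaying geometrically for older points."""
    level = y[0]
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    # A level-only model projects the same value for every future period.
    return np.full(horizon, level)

# Hypothetical series: 36 stable periods, then a shift to a higher, noisier regime.
rng = np.random.default_rng(0)
y = np.concatenate([100 + rng.normal(0, 2, 36), 120 + rng.normal(0, 8, 6)])
print(ses_forecast(y))  # forecasts still partly reflect the old regime
```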
There are a few considerations when dealing with chaotic trends that this article highlights:
These days a plethora of unexpected, volatile patterns can be observed in everything from climate data to real estate valuations to financial markets. The next section looks at a few of the current wild trends, specifically relating to US labor markets. The three considerations listed above are highlighted as the trends are discussed. The subsequent section turns to quantitative analysis and gives a general overview of some methods for handling such extreme data for purposes of prediction.
Many current trends show drastic departures from past behaviors. Below, several abnormal trends and patterns relating to labor markets are reviewed and discussed. These abnormalities can in turn drive other irregularities in economic data. Abnormal trends, particularly toward extreme patterns, are typical in systems with reinforcement loops. That is, feedback from prior outcomes or actions drives continued intensification of trends (e.g., in anticipation of an angry response, an individual becomes defensive, inducing an angrier response from the other party, leading to more defensiveness). After a while, excessive positive reinforcement loops can trigger negative reinforcement loops (e.g., a global pandemic kills the most vulnerable, increasing death rates; but with reduced vulnerable populations, death rates after the pandemic may decrease).
The labor data from the United States appears to show a mass disabling event occurring from 2020 through 2023. This data is collected by the US Bureau of Labor Statistics and aggregated and published by the research division of the Federal Reserve Bank of St. Louis in its Federal Reserve Economic Data (FRED) database.
From July 2010 to July 2020, the total disabled population increased from about 26 million to 29 million. That is an increase of about 11.5% over 10 years, or an annual increase of about 1.1%. From July 2020 to July 2023, the disabled population increased from 29 million to about 34 million. That translates to an increase of 17.2% over 3 years, or an annual growth of 5.4%.
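For reproducibility, the annualized figures above follow from a standard compound-growth calculation; the snippet below is a minimal sketch using the approximate counts quoted in the text (in millions).

```python
# Annualized growth implied by the disability counts quoted above (in millions).
def annualized_growth(start, end, years):
    return (end / start) ** (1 / years) - 1

print(f"2010-2020: {annualized_growth(26, 29, 10):.1%} per year")  # ~1.1%
print(f"2020-2023: {annualized_growth(29, 34, 3):.1%} per year")   # ~5.4%
```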
The causes of this sudden rise in the disabled population are contested. Some theories tie the increase in disability to long COVID. While some studies have shown that most long COVID symptoms abate within a year, others contend that certain effects, such as chronic fatigue, may persist.
Another troubling labor trend is the recent plummet in labor productivity. The year 2022 saw severe drops in labor productivity as measured by the Bureau of Labor Statistics, the fastest declines observed since 1947. Similar declines have been observed in other parts of the world, such as Europe. These declines are all the more perplexing given that worker productivity rose at record rates at the start of the pandemic in 2020 and maintained modest growth in 2021 (Q1 4.3%, Q3 2.3%). Proponents and opponents of remote work variously credit or blame the work-from-home trend for the productivity surge of 2020-21 and the decline of 2022. There are many other possible explanations. Some attribute the initial rise in labor productivity to pandemic-era layoffs removing the least productive employees and raising productivity averages. Others attribute the 2022 decline to the ‘great resignation,’ the mass quitting of many workers after the pandemic period, which created a costly need for hiring and training. Mentions of ‘burnout’ and ‘overworking’ have increased, which some point to as another possible cause of the productivity declines. Still others question the measurement methodology and wonder whether some degree of measurement error may be driving the observed volatility.
Labor productivity declines continued in Q1 of 2023, with a further 1.2% drop.
Mental health trends can be indicative of the overall health of a society and directly impact labor productivity. Current US mental health trends present a bleak picture. One study aggregated data from the National Health Interview Survey from 1997 to 2016 to assess ADHD diagnoses in the US and found an increase in diagnoses from 6.1% in 1997-1998 to 10.2% in 2015-2016.
Volatile patterns make prediction difficult, but that does not mean nothing can be done; it simply requires a more sophisticated organizational structure and expanded data sources. Here, several tools and tactics for dealing with volatile and shifting data patterns are mentioned. The discussion is high level and general; to learn more, read further on each of these topics.
To deal with shifting patterns in data, we must first identify them. This is the idea of change point detection, which is closely related to outlier (or anomaly) detection. Both seek to detect large deviations from an established pattern: outlier detection identifies individual points that deviate beyond some threshold, whereas change point detection identifies a wholesale shift in the underlying pattern, that is, a sustained departure from normal.
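As a minimal sketch of the idea, the snippet below implements a simple CUSUM-style detector that flags a sustained shift in the mean of a synthetic series rather than a single outlying point. The threshold and drift parameters are illustrative assumptions, and dedicated change point libraries (for example, the Python package ruptures) offer more rigorous offline methods.

```python
import numpy as np

def cusum_changepoints(y, threshold=5.0, drift=0.5):
    """Flag indices where a cumulative-sum statistic signals a sustained
    shift in the mean, rather than a single outlying point."""
    mean = y[0]
    n = 1                       # observations seen in the current regime
    pos = neg = 0.0
    changes = []
    for i, x in enumerate(y[1:], start=1):
        pos = max(0.0, pos + (x - mean) - drift)
        neg = max(0.0, neg - (x - mean) - drift)
        if pos > threshold or neg > threshold:
            changes.append(i)
            mean, n = x, 1      # restart the reference level after a detection
            pos = neg = 0.0
        else:
            n += 1
            mean += (x - mean) / n   # running mean of the current regime
    return changes

# Hypothetical series: 100 points around 0, then a sustained shift to around 4.
rng = np.random.default_rng(1)
series = np.concatenate([rng.normal(0, 1, 100), rng.normal(4, 1, 100)])
print(cusum_changepoints(series))   # expect a detection shortly after index 100
```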
Volatility present in data may be partially explained by variables in other data sets. These variables can then be used as covariate predictors to improve the accuracy of predictions for the original outcome. Many models can accommodate covariates. Common AR, ARMA, and ARIMA models have vector versions (e.g., vector autoregression, or VAR) and variants with exogenous regressors that incorporate several variables. Panel regression models from econometrics provide a statistical framework for modeling trends in panel-formatted data. Bayesian dynamic forecasting models provide a convenient methodology for continual model updating. Modern machine learning methods such as neural networks and decision trees have also been applied to complex prediction problems.
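As one concrete option among the models listed above, the sketch below fits an ARIMA model with an exogenous covariate using statsmodels' SARIMAX class. The synthetic data, the model order, and the train/forecast split are illustrative assumptions.

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(1)

# Hypothetical data: an outcome partly driven by a single covariate.
n = 120
X = rng.normal(size=(n, 1))                   # covariate (e.g., a leading indicator)
y = 10 + 3 * X[:, 0] + rng.normal(0, 1, n)    # outcome "co-varies" with X

# AR(1) with a constant and an exogenous regressor; the order is an illustrative choice.
result = SARIMAX(y[:100], exog=X[:100], order=(1, 0, 0), trend="c").fit(disp=False)

# Forecasting future outcomes requires future (or assumed) covariate values.
forecast = result.forecast(steps=20, exog=X[100:])
print(forecast[:5])
```

Note that forecasting with covariates requires values of the covariates for the future periods, either observed in advance (leading indicators) or assumed as part of a scenario.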
Faced with volatile conditions and changing patterns, sensitivity analysis provides a means for assessing their impacts. Sensitivity analysis is fundamentally about analyzing how changes in certain input variables affect the values of output variables. Different ranges of inputs are used and different operating scenarios are built to wargame possible outcomes. This gives the enterprise some idea of how sensitive outcomes are to fluctuations of different magnitudes in the inputs. An example of a computational approach to sensitivity analysis is using simulation (e.g., simulating calls to a call center or arrivals and departures at a hospital) to model different scenarios.
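The call-center example might look something like the following minimal Monte Carlo sketch, which sweeps the arrival rate for a single-agent queue and reports how the average caller wait responds. The rates, the single-agent setup, and the exponential call durations are all illustrative assumptions.

```python
import numpy as np

def mean_wait_hours(arrival_rate, service_rate, n_calls=20_000, seed=0):
    """Average caller wait for a single-agent queue (Lindley recursion)."""
    rng = np.random.default_rng(seed)
    interarrival = rng.exponential(1 / arrival_rate, n_calls)   # hours between calls
    service = rng.exponential(1 / service_rate, n_calls)        # handling time per call
    wait, total = 0.0, 0.0
    for gap, handle in zip(interarrival, service):
        wait = max(0.0, wait + handle - gap)   # wait inherited by the next caller
        total += wait
    return total / n_calls

# Sensitivity analysis: sweep call volume and observe how average wait responds.
service_rate = 10.0                      # calls one agent can handle per hour (assumed)
for arrival_rate in (6, 8, 9, 9.5):      # incoming calls per hour (scenarios)
    w = mean_wait_hours(arrival_rate, service_rate)
    print(f"{arrival_rate:>4} calls/hour -> mean wait {w * 60:5.1f} minutes")
```

Even this toy setup shows the kind of nonlinearity sensitivity analysis is meant to expose: waits grow slowly at moderate call volumes and explode as volume approaches capacity.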
Simulation outputs are analyzed with statistics. Statistical methods reveal patterns of variation in the data and quantify uncertainty. One subdomain of statistics that may be useful when dealing with volatile data is extreme value theory, a branch of statistics that analyzes how the most extreme observations in a data set behave. Instead of the variance-based methods typically used with normally distributed data, a common extreme value approach is to examine peaks over a threshold. Exceedances over a high threshold tend to follow a generalized Pareto distribution, which can be used to analyze extreme deviations in a data set and to make predictions about the magnitude and frequency of extreme deviations in the future. Such extreme value approaches are commonly used in risk modeling.
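A minimal peaks-over-threshold sketch, assuming SciPy is available: exceedances above a high quantile of a synthetic heavy-tailed series are fit with a generalized Pareto distribution, and the fitted model is used to estimate a rare "1-in-1,000 observation" level. The data, the 95th-percentile threshold, and the return period are illustrative assumptions.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(2)
# Hypothetical volatile series: heavy-tailed daily changes.
data = rng.standard_t(df=3, size=5_000)

# Peaks over threshold: keep only the amount by which points exceed a high quantile.
threshold = np.quantile(data, 0.95)
excesses = data[data > threshold] - threshold

# Fit a generalized Pareto distribution to the excesses (location fixed at zero).
shape, _, scale = genpareto.fit(excesses, floc=0)

# Estimate the level exceeded roughly once per 1,000 observations.
p_exceed = (data > threshold).mean()                      # ~0.05 by construction
return_level = threshold + genpareto.ppf(1 - (1 / 1000) / p_exceed,
                                         shape, loc=0, scale=scale)
print(f"shape={shape:.2f}, scale={scale:.2f}, 1-in-1,000 level ~ {return_level:.2f}")
```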
In short, there is an array of techniques and frameworks for analyzing changing patterns in volatile data. Leveraging sophisticated approaches and covariate data can help overcome data complexities. Any successful application will be rooted in familiarity with the core data sets and an understanding of the underlying trends.