How Data Scientists Can Train and Update Models to Prepare for COVID-19 Recovery

The COVID-19 pandemic has affected everything, and building predictions during this time is difficult. Data science teams need to update their models to prepare for the recovery, and know how to properly train models on 2020 data so they learn from the coronavirus anomaly.

By Xuxu Wang, Chief Data Officer at PredictHQ.


The COVID-19 pandemic has catapulted our world into unprecedented times. Beyond the terrifying health situation, the global economic impact is stark. Companies are dealing with exceptional circumstances, and it is imperative that data science teams know how to properly update their models to prepare for the forthcoming recovery.

While it is still too early to know exactly when economic recovery will start, it is important to remain transparent with your customers so they can prepare their demand forecasting models and associated strategies. Knowing in advance when demand will start to recover is critical, as many companies will need to re-hire and upskill their teams and re-engage their supply chains.

Data shows that there will be a significant cluster of rescheduled events in September and October 2020, with many more new and rescheduled events to be logged over the coming months. This means companies need to develop recovery plans now, and their models will require substantial updates. Below are a few of the most common themes when looking at effective recovery planning.


Resist the temptation to throw out data from 2020


Most teams have decided not to invest much time in short-term demand forecasting over the next one to three months. That period will likely be chaotic, because it is extremely difficult for models to forecast government responses, a key variable.

Instead, teams are focusing on updating their models for the final two quarters of this year, and into 2021. Several teams have mentioned dropping data from 2020 out of their demand forecasting models entirely, and instead relying on 2019 and 2018 data.

However, this is not something you want to do. This tactic is based on the incorrect assumption that the world will return to exactly what it used to be, prior to COVID-19.

Even if we can quickly contain the virus, recovery will take time. The ongoing fear of the virus, broken businesses, and millions of unemployed people will have long-term impacts. Data-driven companies will be able to navigate this, but effective strategies cannot be built on incorrect assumptions.
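
One way to keep 2020 in the training set without letting it distort the baseline is to label the anomalous period explicitly, so a model can learn its effect as a separate signal. The following is a minimal Python sketch of that idea, not a prescribed method. The `add_anomaly_flag` helper, the field names, the window dates, and the demand figures are all hypothetical placeholders.

```python
from datetime import date

# Hypothetical anomaly window; the real boundaries depend on your market.
ANOMALY_START = date(2020, 3, 1)
ANOMALY_END = date(2020, 8, 31)


def add_anomaly_flag(rows, start=ANOMALY_START, end=ANOMALY_END):
    """Return copies of the rows with an `in_anomaly` 0/1 feature added."""
    flagged = []
    for row in rows:
        # Keep every row, including 2020, and mark the ones inside the window.
        flagged.append({**row, "in_anomaly": int(start <= row["day"] <= end)})
    return flagged


# Made-up demand history, for illustration only.
history = [
    {"day": date(2019, 4, 6), "demand": 120.0},
    {"day": date(2020, 4, 4), "demand": 35.0},
    {"day": date(2020, 10, 3), "demand": 90.0},
]
training_rows = add_anomaly_flag(history)
```

All three rows stay in `training_rows`; only the April 2020 row carries `in_anomaly = 1`, which a forecasting model could then use as an input feature.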


Don’t ignore anomalies – train your models to guide you through them


Training your models to recognize and understand anomalies builds intelligence for when a company encounters a similar impact. While we are unlikely to face such a severe global impact anytime soon, city or regional shutdowns due to severe weather and natural disasters are far more common than pandemics and can have a huge localized impact.

Investing time into training your models to understand the impact, duration, and recovery rate for your business during abnormal circumstances – such as shelter-in-place mandates – will enable your team to make smarter decisions at scale faster.

You will need to decompose the COVID-19 anomaly to be able to build models to steer your business through it. The COVID-19 anomaly has three main aspects, and clearly understanding each will enable you to build models that work well.

  • The first aspect was the big drop in demand, so your model needs to detect the downward shift in the normal demand curve, including long-term and short-term business trends as well as seasonalities.
  • The second aspect will be tracking the recovery rates and creating models to identify what your company's recovery is likely to look like so you can be prepared.
  • The third aspect will be an increase in demand during the recovery, which will be spurred by the sizable volume of rescheduled events. Your company needs to move swiftly to prepare for rescheduled events, so your model needs to pick these up immediately and accurately.
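
The three aspects above can be sketched as a simple decomposition of observed demand against an expected baseline. This is an illustrative Python sketch under stated assumptions, not a production method: the `decompose_anomaly` function, its output keys, and the sample numbers are hypothetical, and the baseline is assumed to be strictly positive and already adjusted for trend and seasonality.

```python
from statistics import mean


def decompose_anomaly(baseline, actual):
    """Summarize an anomaly as drop depth, recovery rate, and surge periods.

    baseline: expected demand per period (assumed strictly positive)
    actual:   observed demand for the same periods
    """
    # Ratio of observed to expected demand; 1.0 means "back to normal".
    ratios = [a / b for a, b in zip(actual, baseline)]

    # Aspect 1: the drop, measured at the lowest point of the ratio.
    trough = min(range(len(ratios)), key=ratios.__getitem__)
    drop_depth = 1.0 - ratios[trough]

    # Aspect 2: the recovery rate, as the average per-period gain after the trough.
    after = ratios[trough:]
    steps = [later - earlier for earlier, later in zip(after, after[1:])]
    recovery_rate = mean(steps) if steps else 0.0

    # Aspect 3: periods after the trough where demand exceeds the baseline.
    surge_periods = [i for i, r in enumerate(ratios) if i > trough and r > 1.0]

    return {
        "trough_index": trough,
        "drop_depth": drop_depth,
        "recovery_rate": recovery_rate,
        "surge_periods": surge_periods,
    }


# Made-up weekly figures, for illustration only.
baseline = [100, 100, 100, 100, 100, 100, 100]
actual = [100, 60, 30, 45, 60, 75, 110]
summary = decompose_anomaly(baseline, actual)
```

In this toy series the trough is the third period, demand falls to roughly 30% of baseline, and the final period is flagged as a surge. A real model would estimate these quantities per market and per product line rather than from a single series.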


Track demand data in recovering markets even if you don’t operate in them


In times of chaos and new situations, data scientists look to precedents and horizontal trends to construct new models or update their existing ones.

The most valuable source of insight for the recovery rate in key markets will be tracking how demand returns in economies that are further along in their recovery journey. There are many variables, but only as more countries recover will we begin to gather insights into what revised baselines should be. A few key inputs for identifying your business's recovery rate include:

  • Data scientists will need to build a time-varying feature by learning from the national economies that appear to be recovering earlier than others, such as China and South Korea.
  • Companies need to shift expectations about the frequency of demand forecasting model iteration. The post-COVID-19 era will require a much higher frequency of updates and reviews of models compared to normal demand forecasting models. The cadence is likely to be closer to that of stock trading than to the set-and-forget demand forecasting algorithms from more productive and stable times.
  • Industry-specific rebounds will be essential data sources because the scale and velocity of impacts of COVID-19 on different industries have been shown to vary considerably. We can assume that the recovery rates will be similarly different as well.
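
As one possible illustration of the first input above, a recovery index observed in an earlier-recovering market can be shifted forward in time and supplied as a feature for a later-recovering market. This is only a sketch: the `lagged_feature` helper, the choice of lag, and the index values are hypothetical, and in practice the lag itself would need to be estimated and revisited as new data arrives.

```python
def lagged_feature(leading_index, lag):
    """Shift a leading market's recovery index forward by `lag` periods.

    Periods with no lagged value available yet are filled with None.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    n = len(leading_index)
    # Pad the front with None, then trim so the output has the original length.
    return ([None] * lag + list(leading_index))[:n]


# Made-up recovery index (share of pre-crisis demand) in a leading market.
leading_market_index = [0.30, 0.40, 0.55, 0.70, 0.80]

# Assume, hypothetically, that the target market trails by two periods.
feature = lagged_feature(leading_market_index, lag=2)
```

Here `feature` starts with two empty periods followed by the leading market's first three values, so each period in the target market sees what the leading market looked like two periods earlier.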


Update your baselines and identify your demand catalysts


Once companies have an updated baseline for the first few weeks and months of the COVID-19 recovery, they will be able to re-engage with suppliers and to re-hire or train staff to meet that demand.

Including events in your demand forecasting should be a focus area during this stage, as they are both indicators of demand you can prepare for in advance and catalysts for demand. Optimizing for these is even more important when demand is low and businesses have weathered months of dwindling revenue.

Enterprises need to be able to focus their efforts on the surges of demand that occur. Events drive these surges, so tracking events from massive down to minor is key. Even small events can cluster to create significant impact; these clusters are often referred to as perfect storms of demand.
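
To make the clustering idea concrete, the sketch below sums a per-event impact score by day and flags days where the combined impact crosses a threshold, even when no single event does. It is a hedged illustration only: the `find_demand_storms` function, the impact scores, the dates, and the threshold are all hypothetical and do not represent any particular events data source or API.

```python
from collections import defaultdict


def find_demand_storms(events, threshold):
    """Return the days whose combined event impact reaches `threshold`.

    events: iterable of (day, impact) pairs, where impact is a numeric score
    """
    daily = defaultdict(float)
    for day, impact in events:
        daily[day] += impact
    return sorted(day for day, total in daily.items() if total >= threshold)


# Made-up events: three minor events on one day, one mid-sized event the next.
events = [
    ("2020-09-12", 40.0),
    ("2020-09-12", 35.0),
    ("2020-09-12", 30.0),
    ("2020-09-13", 60.0),
]
storms = find_demand_storms(events, threshold=100.0)
```

With these numbers only 2020-09-12 is flagged: its three minor events sum to 105, while the single larger event on the following day stays below the threshold.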

With hundreds of thousands of events postponed worldwide, enterprises will need a programmatic way to know when events are rescheduled. Because there are so many events, it is impossible to track them manually, and it is a waste of a data science team's time to find, verify, and standardize this data when they could be building and iterating on models.


Prepare for demand returning + more perfect storms of demand


The coming recovery period is going to be a dynamic time for the events industry as a whole. Before the pandemic, thousands of high-impact events took place worldwide every week. This frequency will resume with time, but the level of rescheduling and changes will make the recovery particularly intense.


Bio: Dr. Xuxu Wang is the Chief Data Officer at PredictHQ. She brings comprehensive data science and advanced machine learning expertise, as well as industry-leading intelligence product R&D insights, after more than a decade of experience leading R&D teams at Baidu in Beijing and Workday in California. In her role with PredictHQ, she is responsible for data intelligence R&D, machine learning features, and intelligence product development. She grows and leads a team of data scientists, engineers, and analysts who work on complex challenges, such as event impact prediction models that make sense of millions of events worldwide. Dr. Wang is passionate about using data, algorithms, and intelligence to have an impact on the world. Outside of work, she is an avid lover of hiking, history, and food.