Interview: Xia Wang, AstraZeneca on Big Data and the Promise of Effective Healthcare
We discuss challenges in analyzing text data, Big Data impact on translational bioinformatics, advice, desired skills in data scientists, and more.
Prior to stepping into the clinical domain, Xia was with the AstraZeneca innovational medicines unit, where she focused on informatics and computational modeling to accelerate candidate drug identification and optimization in the early discovery phases. Xia holds a Ph.D. in computational chemistry and has extensive training in broad areas of informatics.
First part of interview
Here is the second and last part of my interview with her:
Anmol Rajpurohit: Q5. A vast amount of clinical information is stored as text. How structured is this text data? What are the major challenges in analyzing this clinical data in text format?
Xia Wang: How text data are structured varies across data sources and their particular usage. In clinical notes, the text typically sits within individual structured sections, e.g. family history or medication history. Within each section, there is generally free text recording the physician’s specific observations or decisions.
While NLP has proven quite powerful at pulling well-defined concepts and the corresponding numerical or categorical data elements from text, huge challenges remain in applying NLP approaches to extract deeper language structures, such as the details around decision-making.
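As a rough illustration of the structure described above, here is a minimal Python sketch that splits a clinical note into its sections and pulls simple numerical elements out of the free text. The note text, the section-header layout, and the helper names `split_sections` and `extract_doses` are all assumptions made up for this example; real clinical notes are far messier, and a plain regular expression like this one is a stand-in for a proper NLP pipeline, not a description of any system used at AstraZeneca.

```python
import re

# Hypothetical note: the section names and layout are assumptions for illustration.
NOTE = """\
Family History:
Mother had type 2 diabetes. Father had hypertension.
Medication History:
Metformin 500 mg twice daily. Lisinopril 10 mg once daily.
"""

# A section header is assumed to be a capitalized phrase alone on a line, ending in a colon.
SECTION_HEADER = re.compile(r"^(?P<name>[A-Z][A-Za-z ]+):[ \t]*$", re.MULTILINE)

# A well-defined concept with a numerical element: drug name, amount, unit.
DOSE = re.compile(
    r"(?P<drug>[A-Z][a-z]+)\s+(?P<amount>\d+(?:\.\d+)?)\s*(?P<unit>mcg|mg|g)\b"
)


def split_sections(note):
    """Map each section header to the free text recorded beneath it."""
    sections = {}
    headers = list(SECTION_HEADER.finditer(note))
    for i, header in enumerate(headers):
        # The section body runs until the next header, or to the end of the note.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(note)
        sections[header.group("name")] = note[header.end():end].strip()
    return sections


def extract_doses(text):
    """Return (drug, amount, unit) tuples found in a block of free text."""
    return [
        (m.group("drug"), float(m.group("amount")), m.group("unit"))
        for m in DOSE.finditer(text)
    ]


sections = split_sections(NOTE)
doses = extract_doses(sections["Medication History"])
print(doses)
```

Extracting drug names and doses this way is the easy part. Nothing in the sketch attempts the harder problem raised in the answer, namely recovering why a physician made a particular decision.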
AR: Q6. What is the best advice you have got in your career?
XW: I started my industry career as a computational modeler, implementing computer-aided drug design in the very early phase of medicine discovery. I feel very fortunate that, in this industry, I was able to move further into translational safety research and then into the late phase of medicine development.