Interview: Nicholas Marko, Geisinger on the Skills Needed for Healthcare Analytics

We discuss challenges of dealing with healthcare data, trends in healthcare analytics, important skills for data scientists and more.

Nicholas Marko, MD is Chief Data Officer for Geisinger Health System in Danville, PA. He is a data scientist with expertise in predictive analytics, organizational data strategy, data-driven collaboration, and data science consulting. He heads the Department of Data Science & Engineering in Geisinger’s Division of Applied Research and Clinical Informatics (DARCI) and co-directs the High Performance Computing Center in the organization’s Institute for Advanced Applications (IAA). His specific academic interests include integration of heterogeneous data sources and application of advanced mathematical methods and modeling strategies to generate measurable value for organizations in the healthcare sector and beyond.

Dr. Marko is also a practicing neurosurgeon and serves as Geisinger Medical Center’s Director of Neurosurgical Oncology. His clinical practice focuses on surgical management of patients with malignant brain and spine tumors.

First part of interview

Second part of interview

Here is third and last part of my interview with him:

Anmol Rajpurohit: Q8. What are the most underrated challenges of working with healthcare data?

Nicholas Marko: Healthcare data is often unstructured, sparse, noisy, and heterogeneous. It contains a lot of latent knowledge and frequently involves an important temporal dimension. We almost never see a complete data table of numeric measurements that directly capture what they are trying to measure. This can make analyses challenging, but it also makes it a great place to do data science research.

I like to say that this is what makes healthcare a great place to learn about data science – there is nothing you won’t see, no challenge you won’t encounter, and no rule that won’t be violated. It’s a great place for data scientists to hone their craft.

AR: Q9. Which trends in Healthcare Analytics are of the most interest to you? Why?

NM: I think the most forward-thinking healthcare analytics experts are starting to look at nontraditional data sources. I like this, because 99% of a patient’s life happens outside the view of the electronic health record. By looking at data that is generated as a part of life outside of the hospital (and there is a lot of it!) we can better understand what makes our patients who they are. This is key, because good health doesn’t happen in a vacuum. It happens in the context of real life, and that is where we should be looking if we want to improve patient experience and outcomes.

I also like the evolution of tools that are particularly useful for analyzing unstructured data. Graph analytics and other topological methods are particularly interesting, because they help us to see relationships that we might not have previously expected. Knowledge discovery and data-driven hypothesis generation are critically important to our work, and it is exciting to see algorithms, tools, and systems evolving that can really support this for the first time.

AR: Q10. What is the best advice you have got in your career?

NM: Innovate or die. It’s that simple. In this space there is no room for resting on previous accomplishments, no room for the comfort that stability brings, and no role for slowing down. If you want to be a leader in the data and analytics world then you have to think fast, move fast, and decide fast.

AR: Q11. What key qualities do you look for when interviewing for Data Science related positions on your team?

NM: I will start by clarifying that when I say “data science” I mean data science. Not data architecture, not computer systems engineering, not information management, and not data engineering. Those are all different jobs with different requirements that require different skill sets.

To me a data scientist is the person who looks at a pile of data and studies it. They know everything about the data, everything about the question, and everything about the path that they are taking to extract actionable knowledge from this pile. It is a very algorithmic, very mathematical, and very computational field. The data scientists are the “SEAL Team Six” of the information world. The buck stops with them.

I look for two general types of data scientists, but admittedly these types overlap (and overlap a lot in the best data scientists). The first are programmers. By this I mean real programmers – they speak multiple programming languages, they would rather be on the command line than anywhere else, and they almost refuse to turn on a box that is running Windows or Mac OS. They can move information from any language to any language, and they understand when and why it is appropriate to do so. There is often a big distinction between a “computer scientist” and a real “programmer,” and I will take the latter every time.

The second type are mathematicians. When I look for people to solve problems with data, build data models, answer questions, and really innovate in the algorithmic space then I want someone who knows mathematics inside and out. I will go for either theoretical or applied math people, depending on how they conceptualize the work that they do. Occasionally someone with an engineering or physics background can fill this role, but these are usually the kind of engineers or physicists who are really mathematicians who were scared that they couldn’t make a living from doing math. Either way, these are the people who write more in Greek symbols than they do in their native language. They usually have a pile of books on their desk, a whiteboard full of writing on the wall, and an inherent aversion to small talk. They’re awesome people with awesome minds.

I generally avoid more applied disciplines, like computer science, operations research, or some of these new “data science,” “informatics,” or “predictive analytics” type degree programs that have surfaced lately. Not saying these are bad ideas, just that they usually aren’t a good fit for the kind of data science that my group does.

AR: Q12. What was the last book that you read and liked? What do you like to do when you are not working?

NM: I recently read the biography of Dirac (by Farmelo) which I thoroughly enjoyed. When I’m not working I read, I travel as much as I possibly can, and I do some photography. I also like anything to do with tech gadgets. And I’ve recently started scuba diving again, which is basically a vehicle for playing with tech gadgets underwater.