Interview: Taylor Phillips, Square on Why Finance Needs Machine Learning and Data Science

We discuss the role of data science at Square, common machine learning use cases, transition to real-time architecture, major challenges, expectations from data science, key qualities for data scientists, and more.

Taylor Phillips leads the machine learning teams at Square, which focus on fraud detection, helping our sellers grow their businesses and improving customer experiences. Taylor brings a back-end, infrastructure lens to the machine learning domain, helping to architect a reliable, highly-available, real-time machine learning pipeline to support the rigorous requirements of the payments industry. Before Square, Taylor spent several years building and commanding an army of online poker bots.

Here is my interview with him:

Anmol Rajpurohit (AR): Q1. What does Square do? What role does Data Science play in your firm's strategy and day-to-day operations?

Taylor Phillips (TP): Square makes commerce easy for everyone. We got started by enabling anyone to instantly accept credit card payments on their mobile device. Now we are focused on providing an unparalleled suite of business tools around our Square Register, but we have also grown into peer-to-peer payments with Square Cash and are now extending small-business financing with Square Capital. Data science is essential for scaling underwriting and risk for our entire suite of products, and we also use it to generate powerful insights and analytics.

AR: Q2. What are the most common Machine Learning (ML) use cases at Square?

TP: Risk, fraud and underwriting are the bread-and-butter use cases for ML at Square. If you look at our major innovations of 1) enabling people who couldn't use credit card processing and 2) settling with our sellers in 24 hours, they are both enabled by powerful machine learning behind the scenes. We are also starting to explore new ways to add value with machine learning. One awesome example of this is Square Capital, which we just recently launched. Square’s unique understanding of our merchants allows us to offer capital to growing businesses in a simple and fast way.

AR: Q3. What was the motivation behind transitioning from batch, offline ML pipeline to highly-available, real-time architecture? What was the key learning from this experience?

TP: This isn't a decision to be taken lightly.  At our smaller scale, batch offline worked great - it's a conceptually and technically simple model.  At larger scale, real-time gives us tons of operational flexibility: we can detect/correct issues sooner, we can further increase the speed at which we are giving sellers their funds, and we can make more efficient use of our operations team doing manual review. Going real-time will also enable us to get our risk models into the synchronous payment flow, which can provide us with new and creative opportunities to use ML.

The key learning experience is that real-time architecture to move money is hard.  You need engineers who can build reliable systems, data scientists to build innovative models and the magical people who span both to glue everything together and make sure nothing is lost in translation.

AR: Q4. In your experience, what are the biggest challenges on Machine Learning and other Data Science projects?

TP: The biggest challenge with ML projects in the wild is the data. These days it’s relatively easy to download a static dataset onto your laptop, train a one-off model and then use that model to predict stuff - that’s hardly representative of doing industrial ML.
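The laptop-scale workflow being contrasted here can be sketched in a few lines (a hypothetical toy example using a hand-rolled 1-nearest-neighbour classifier rather than any particular library; the labels and data are invented):

```python
# Toy "static dataset on a laptop" workflow: load data, train a one-off
# model, predict. This is the easy part that industrial ML goes far beyond.

def train_1nn(rows):
    """'Training' a 1-nearest-neighbour model is just memorising the data."""
    return list(rows)

def predict(model, point):
    """Predict the label of the closest memorised example (squared distance)."""
    def dist(row):
        features, _label = row
        return sum((a - b) ** 2 for a, b in zip(features, point))
    _features, label = min(model, key=dist)
    return label

# A static, in-memory dataset of (features, label) pairs.
dataset = [((1.0, 1.0), "legit"), ((9.0, 9.0), "fraud"), ((8.5, 9.5), "fraud")]
model = train_1nn(dataset)
print(predict(model, (9.2, 8.8)))  # → fraud
```

The points Taylor raises next - ownership of data pipelines, hot vs. cold storage, training speed at scale - are exactly what this sketch leaves out.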

In practice, there’s a lot of contradicting goals with data:

  • Engineers working with data care about different things (e.g. data immutability and data temporality) than the engineers shipping the products that generate the data.
  • Easy access to the data is essential, but as data gets larger, the need for different types of data storage arrives (e.g. cold and hot storage), which introduces derivative challenges like new technologies and maintaining data consistency across stores.
  • When your data scales up, so do your model training and evaluation times. Keeping those times as low as possible is important to enable data scientists to try out ideas and iterate quickly.
  • Cutting edge data tools offer new features, but old tools are reliable and time-tested.
  • Ad-hoc data exploration requires different methodologies from automating the stuff you know you need.  The former wants speed and random access while the latter wants repeatability and stability - very different engineering optimization problems.

AR: Q5. What are your major recommendations to data scientists working in the financial domain on problems such as loss prevention and risk management?

TP: Focusing on driving down the same error metric for an extended period of time is going to yield diminishing returns.  Sure, at huge scale tweaking a button color can make a big difference to the bottom line, but thinking outside the box has unbounded potential.  Along these lines, I think it’s important to periodically revisit assumptions, redefine metrics and goals, and find new areas where existing techniques can apply. One way to do this is to check in with customers and get their feedback. Another example is around tool-sets - it’s easy to fall into a rut and accept slow tools or brittle processes, but data science is evolving so quickly right now that I think more organizations would benefit from baking tool exploration into their normal process.

AR: Q6. How do you think the expectations from Data Science have evolved over time? Where do you see them headed in the future?

TP: Data Science is a loaded term and can mean a lot of different things depending on the context. Off the top of my head, I’d break it into a few different roles:
  • Data Science Research - Explores data and prototypes new features and models to improve business metrics and provide new insights. They use tools like R, Matlab and Python.
  • Data Engineer - Focuses on obtaining and maintaining the data in a variety of usable forms.  They own the data pipelines (e.g. Kafka, Flume) and data storage (e.g. HDFS, MySQL).
  • Data Science Engineer - Implements the features and models and makes them go live in production. These are the unicorns who bridge the gap between research and practice.

AR: Q7. What key qualities do you look for when interviewing for Data Science related positions on your team?

TP: The best hires are the people who can do both the ML work and the engineering work, but that’s a lot to ask for.  They can have a massive impact very quickly because they are able to single-handedly prototype new ideas and then do the engineering legwork to ship them to production.

Hiring pure R or Python hackers can be great, but the ability to write production-quality code and interface with engineers is essential.  Likewise, hiring pure software engineers can be great, but basic skills in math, stats and ML go a long way.

AR: Q8. What are your favorite books on Data Science? What do you like to do when you are not working?

TP: My favorite place to learn new things related to data is Mike Bostock’s homepage.  I had the privilege of sitting next to Mike at Square for a bit - his work is elegant, inspiring and often accessible to non-techies.

Outside of work, I love automating things I enjoy, like online poker and video games. Besides that, it’s great to get out in the sun and behave like humans used to!