Interview: Thomas Levi, PlentyOfFish on What does Big Data tell us about Romance

We discuss interesting research on the state of romance in the US, how PlentyOfFish manages competition, his personal journey from String Theory to Data Science, career advice and more.

Thomas Levi started out with a doctorate in Theoretical Physics and String Theory from the University of Pennsylvania in 2006. His postdoctoral studies in cosmology and string theory, where he wrote 19 papers garnering 650+ citations, then took him to NYU and finally UBC. In 2012, he decided to move into industry, and took on the role of Senior Data Scientist at POF. Thomas has been involved in diverse projects such as behavior analysis, social network analysis, scam detection, bot detection, matching algorithms, topic modelling and semantic analysis.

First part of interview.

Here is the second and last part of my interview with him:

Anmol Rajpurohit: Q5. How does PlentyOfFish differentiate itself from the competition such as Match, OkCupid and Lavalife?

Thomas Levi: It’s free: This has always been one of our strongest differentiators, particularly in an industry in which some of our biggest competitors follow a subscription model. We’ve been able to acquire users at a faster rate because we offer a free service.

It offers selection: We’re the largest online dating site in the world, which means a large selection of potential matches for our users. Even when you filter your search down by location (if you live in a small town), or by religion (if you want to meet someone who shares your religious beliefs), you’ll still find a huge selection of users.

It’s always generated revenue: As one of the Google AdSense pioneers, we were the first to break the $1M/month milestone with Google. Years later, we built a proprietary self-serve online ad platform that is widely used by affiliate marketers and advertisers.

It’s a privately held company: Having achieved this level of success with no investment, Markus Frind remains the sole shareholder and helms one of the few independently held dating sites in the world. Today, unlike the founders of its competitors, Markus Frind continues to lead the company's day-to-day operations as founding CEO. With just 70 employees, PlentyOfFish is still in “startup” mode: the team can iterate quickly and effectively, and the company continues to dominate in a space where the top competitors have TV marketing budgets exceeding 250 million dollars.

AR: Q6. A few months ago, PlentyOfFish released some interesting research on "Where do the most romantic US singles live?" on the occasion of Valentine's Day. What were the key attributes in identifying the romance level of singles? Was it based on their profiles, on their activity on PlentyOfFish, or both?

TL: This study was based on the LDA interests algorithm mentioned above. When the model was being built out, we noticed that there were categories focused around romantic interests, e.g. “candlelight dinners” and “long walks on the beach”. Another researcher I work with had the clever idea that we could average over people’s archetypes based on location to determine which states had the highest membership in those categories. The result is that study. In the future we could look at a host of other things, including which states are good “matches” for each other or where a person should live based on their interests.
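
As a rough illustration of the averaging idea described in this answer (this is not PlentyOfFish's code; the topic names, the numbers, and the function names are all hypothetical), suppose each user already has a vector of topic memberships of the kind an LDA model produces. Averaging those vectors per state and sorting by one topic gives a simple ranking:

```python
from collections import defaultdict

# Hypothetical topic labels; a real LDA model yields unlabeled topics
# that a person would name after inspecting their top words.
TOPICS = ["romance", "outdoors", "music"]

# Invented example data: one topic-membership vector per user.
users = [
    {"state": "WA", "topics": [0.6, 0.3, 0.1]},
    {"state": "WA", "topics": [0.2, 0.5, 0.3]},
    {"state": "TX", "topics": [0.1, 0.2, 0.7]},
    {"state": "TX", "topics": [0.3, 0.3, 0.4]},
]

def mean_topic_membership(users):
    """Average the topic-membership vectors of all users in each state."""
    sums = defaultdict(lambda: [0.0] * len(TOPICS))
    counts = defaultdict(int)
    for user in users:
        for i, weight in enumerate(user["topics"]):
            sums[user["state"]][i] += weight
        counts[user["state"]] += 1
    return {state: [total / counts[state] for total in sums[state]]
            for state in sums}

def rank_states(users, topic):
    """Order states from highest to lowest mean membership in one topic."""
    idx = TOPICS.index(topic)
    means = mean_topic_membership(users)
    return sorted(means, key=lambda state: means[state][idx], reverse=True)

if __name__ == "__main__":
    print(rank_states(users, "romance"))
```

With real data the per-user vectors would come from the fitted model rather than being typed in by hand, and a state with very few users would need some care, since its average is noisy.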

AR: Q7. You have a very interesting background. After your PhD in string theory and cosmology, how did you end up in a Data Scientist role for an online dating site? Do you see any connection between Data Science and String Theory?

TL: A large chunk of it was negative things having to do with academia and string theory. The job market for full-time faculty positions on the tenure track at elite research universities is pretty grim these days. In string theory there have been a few years where only two or three people are hired in that specialty in all of North America. The alternative, and where I was career-wise, was to continue doing postdoctoral fellowships, which are quite low-paying, two- to three-year contract positions. The work-life balance for me was way out of whack and I found myself wanting more control over where I lived and some stability. It’s frowned on to talk about money, but I won’t pretend that wasn't a factor. I've eaten a lot of ramen in my life.

I also found myself wanting to work on something more concrete and connected to the everyday world. String theory is amazing, and I loved spending my time thinking about how the universe began and the building blocks of space and time. That said, it’s not likely to produce a testable result in my lifetime (never say never though!). I wanted something where I could actually see the impact of my work on regular people. I thought a bit about finance, but tech just appealed to me more. I liked the idea of creating something, and in the case of PlentyOfFish being able to look at how people interact and meet each other is incredibly interesting to me. I’m also a closet romantic, so it sounds corny, but I really like bringing people together, especially ones who otherwise wouldn't have met.

There are a lot of connections between data science and string theory, though not at the obvious level. Yes, both things involve math and analysis, but I haven’t gotten to the stage of doing algebraic topology to understand dating quite yet. I think the approach to asking questions and solving problems is very similar. For nearly every problem I've solved at PlentyOfFish, I had to go off and learn a lot of things. For the interest matching we’re talking about here, I taught myself LDA and some Monte Carlo techniques to solve it. In addition, I didn’t just hit on LDA; I spent a couple of months trying out different approaches and exploring various options, learning as I went. That has a lot in common with how I worked in physics. When I decided to move more towards cosmology, I had to teach myself modern cosmology and inflationary theory, and rapidly come to grips with the current state of the art. I also had to boil large questions down to something I could actually write equations for and make progress on. The same is true in data science.

AR: Q8. What is the best advice you have got in your career?

TL: The best advice I've received is to constantly make sure you’re happy doing what you’re doing. I decided at 16 to be a theoretical physicist. I waited until I was in my 30s to question if that was still what I wanted and whether it was making me happy.

It’s never too late to pursue what you’re passionate about or make a change, just be prepared to work for it.

Reading that over, it sounds a bit cliche, but I really do believe it.

AR: Q9. On a personal note, are there any good books that you’ve been reading lately and would like to recommend?

TL: On the technical side, I’m a big fan of “All of Statistics” by Larry Wasserman. It gives a great crash course in statistical thinking, with proofs and lots of examples for all of the key concepts. I also really like David Barber’s “Bayesian Reasoning and Machine Learning” as I can’t emphasize enough how important conditional probability and Bayesian statistics are for my job. I couldn't make a list of books for a Data Scientist and not throw in Tom Mitchell’s “Machine Learning”. It’s a classic. On the less technical side, I think every aspiring Data Scientist and just about anyone in business should read Nate Silver’s “The Signal and the Noise”. It’s a popular-audience-level book on how to understand probability, how to think about it, and what goes wrong if you don’t. It’s also a pretty entertaining read.

On the just for fun side of things, Patrick Rothfuss and his Kingkiller Chronicles is possibly the best series I've read, and the first book “The Name of the Wind” might just be the greatest novel I've ever read. I could say similar things about Ernie Cline’s “Ready Player One”.

Right now, I’m reading Michael Lewis’ new book “Flash Boys” about high frequency trading, and his earlier effort “The Big Short” is another great read.