KDnuggets, Oct 2017, Opinions, Interviews (17:n41)

It Only Takes One Line of Code to Run Regression


I learned how important it is to understand data before running algorithms, how important it is to know the context and the industry before jumping to insights, how easy it is to make models but how tough it is to get them to work for you, and finally, how it only takes one line of code to run linear regression on your dataset.



By Kritika Jalan

I am a typical IT graduate from an Indian tier-2 engineering college. And like a typical student, I used to read just enough course material to not flunk the semester-end exams. I was the kind of girl who believed ‘College gets you ready for the real world’. So wrong, I know! As a result of my belief, I learned nothing substantial during my college years. But I always wanted to stand out from the crowd, always wanted to be the black sheep, and thus began my journey to become relevant.

I will try to be relevant
And participate in an event
I will apply for GSoC
And submit a code-block
I will give my cent percent
And continue to be relevant

I started trying every other thing: taking a course on parallel programming, registering with HackerEarth, building applications in Java, learning Android and creating apps, taking up a project in NLP for word sense disambiguation using SVM, writing a paper on GCD circuits, participating in hackathons. Phew! Anything and everything. You guessed it right, I was going nowhere.

Amidst all this, I got placed in MuSigma, Bangalore. These guys gave an amazing pre-placement talk, and they made sure we knew this from HBR. It stuck in my mind, and I loved how I started playing with data. I switched jobs, worked with different clients, different data sources, and different teams, but there was this one constant: DATA!

When you work with data-centric companies, you keep hearing about advanced machine learning techniques. Terms that fascinate you, and scare you. Regression was my Pennywise! So much so that I never dared use Google to learn about it. Although I had read the theory behind almost every machine learning technique in college, I had never implemented any of them. I thought it was the coolest thing there is. I always assumed it takes a lot of learning, coding and understanding. But I had to start somewhere. I had to learn how to make data tell stories about itself.

I came back to taking online courses. I started with, yes you guessed it right, Machine Learning by Andrew Ng on Coursera. It got a little too much for me and I went astray for a few months. I had to pull myself back and I started with another course and a few more, before I stumbled upon The Analytics Edge on edX. The course synopsis says ‘Through inspiring examples and stories, discover the power of data and use analytics to provide an edge to your career and your life’. This course broke a lot of myths that I had been carrying for years. This is precisely what got me started on my journey to discovering the power of data and analytics. From there on,

I learned how important it is to understand data before running algorithms on it, how important it is to know the context and the industry before jumping to insights, how easy it is to make models but how tough it is to get them to work for you, and finally, how it only takes one line of code to run linear regression on your dataset.

Below is my myth buster: the one-liner to find linear patterns in your data (plus one more line to predict with it).

# Build the model on training data
lmMod <- lm(DependentVariable ~ IndependentVariable, data = trainingData)

# Run prediction on test data
testPred <- predict(lmMod, newdata = testData)
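
If you want to try those two lines end to end, here is a minimal sketch. It rests on assumptions that are mine and not part of the snippet above: R's built-in mtcars dataset stands in for your data, mpg and wt are picked as the dependent and independent variables purely for illustration, and the 24/8 train/test split and the RMSE check are just one reasonable way to look at the result.

```r
# Assumption: the built-in mtcars dataset stands in for "your dataset".
# mpg (dependent) and wt (independent) are illustrative choices only.
set.seed(42)                                  # make the random split reproducible
trainIdx <- sample(nrow(mtcars), size = 24)   # 24 of the 32 rows go to training
trainingData <- mtcars[trainIdx, ]
testData <- mtcars[-trainIdx, ]               # the remaining 8 rows are held out

# Build the model on training data (the one-liner)
lmMod <- lm(mpg ~ wt, data = trainingData)

# Run prediction on test data
testPred <- predict(lmMod, newdata = testData)

# Root mean squared error on the held-out rows, as a rough quality check
rmse <- sqrt(mean((testData$mpg - testPred)^2))

print(coef(lmMod))   # fitted intercept and slope
print(rmse)
```

Calling summary(lmMod) afterwards prints the coefficients along with R-squared and p-values, which is usually the next thing you will want to look at.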

It was easier to learn everything else after I killed my Pennywise. I completed that course, enrolled in another one, participated in Kaggle competitions, played with Twitter data, created word clouds, made chatbots and never stopped exploring. I was becoming relevant, finally! If not for the world, then for myself. And if I stop learning, I cease being relevant.

If you are scared to do that one thing which you always assumed to be the coolest on planet Earth, let me tell you one thing: Elon Musk’s Mars mission is the coolest thing right now, and you are probably just scared of your regression model. Go out, talk with people, come back, exploit the internet, learn, create, showcase.

Hope this post helped you get started with your coolest thing. Tell me more about your Pennywise in the comments below and we can together starve him to death!

Bio: Kritika Jalan is an experienced business analyst working in management consulting. She is skilled in R, Python, SQL, and other data analysis tools and machine learning techniques.
