How to Set Up Your Data Science Stack on a Budget
Whether you’re working independently or setting up a stack for a company, you need an affordable stack option. Here’s how you can set up your stack without spending too much.
Data science can be an expensive undertaking. Physical infrastructure and devices, cloud hosting services, database access, and more can quickly amount to considerable costs. That can make it difficult to get started in the industry.
Most small businesses spend upwards of $10,000 a year on data analytics, but most individuals can’t afford that. Whether you’re working independently or setting up a stack for a company, you need a more affordable option. Here’s how you can set up your stack without spending too much.
1. Look for Service Providers With Free Tiers
Service providers like web hosting companies are an essential but often costly part of data science. Thankfully, many of them also offer free or low-cost tiers for entry-level users. Even industry leaders like AWS provide services like S3 and AWS Lambda for free, with limits.
You won’t be able to use all of a provider’s services in a free tier, and you may have limited storage or access frequency. Determine what you need for your projects, then compare various options to see which best fits your needs.
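One way to make that comparison concrete is to write down your project's needs and filter the candidate tiers against them. The sketch below is only an illustration: the provider names and limits are hypothetical placeholders, not real quotas, so check each provider's current pricing page before relying on any number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """Limits of one free tier (all values here are made up for illustration)."""
    name: str
    storage_gb: float
    monthly_requests: int


def tiers_that_fit(tiers, needed_storage_gb, needed_requests):
    """Return the names of tiers whose limits cover the project's needs."""
    return [
        t.name
        for t in tiers
        if t.storage_gb >= needed_storage_gb
        and t.monthly_requests >= needed_requests
    ]


# Hypothetical limits; replace with figures from the providers you are comparing.
tiers = [
    Tier("provider_a_free", storage_gb=5, monthly_requests=20_000),
    Tier("provider_b_free", storage_gb=1, monthly_requests=1_000_000),
    Tier("provider_c_free", storage_gb=10, monthly_requests=50_000),
]

# Example project: 3 GB of data and roughly 30,000 requests a month.
fits = tiers_that_fit(tiers, needed_storage_gb=3, needed_requests=30_000)
print(fits)
```

If the list comes back empty, that is a signal to either shrink the project or budget for the cheapest paid tier.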
2. Prefer Web-Based Software
When shopping for software tools to use, aim for web-based options over traditional, on-device apps. If you move most or all of your operations to the web, your physical device needs won’t be as high. You can then spend less on a computer, server, or other infrastructure because you won’t need as much storage or processing power.
While you’re looking for web-based options, make sure you know how they’ll charge you. Many billing options for Kubernetes operations charge per cluster per hour, which can quickly grow expensive. Make sure the as-a-service option won’t cost you more than an on-premises solution.
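A quick back-of-the-envelope calculation can show how per-cluster, per-hour billing adds up. The following is a minimal sketch, assuming a hypothetical hourly rate, hardware price, lifespan, and upkeep cost; none of these figures come from a real provider, so substitute your own quotes.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8,760 hours / 12)


def monthly_managed_cost(rate_per_cluster_hour, clusters, hours=HOURS_PER_MONTH):
    """Cost of a managed service billed per cluster per hour."""
    return rate_per_cluster_hour * clusters * hours


def monthly_on_prem_cost(hardware_cost, lifespan_months, monthly_upkeep):
    """Hardware cost spread evenly over its lifespan, plus power and maintenance."""
    return hardware_cost / lifespan_months + monthly_upkeep


# All inputs below are hypothetical placeholders.
managed = monthly_managed_cost(rate_per_cluster_hour=0.10, clusters=2)
on_prem = monthly_on_prem_cost(hardware_cost=3600, lifespan_months=36, monthly_upkeep=40)

print(f"managed: ${managed:.2f}/month, on-premises: ${on_prem:.2f}/month")
cheaper = "managed" if managed < on_prem else "on-premises"
print(f"cheaper option under these assumptions: {cheaper}")
```

The comparison ignores setup time and staffing, which can tip the balance either way, so treat the result as a starting point rather than a verdict.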
3. Rethink What’s Necessary
Another way you can cut down your stack’s costs is by leaving some options out. Many features and processes can be expensive, but you may not need them. For example, web hosting often ranges between $1,000 and $4,000, but you don’t necessarily need a unique domain.
When reviewing your budget and goals, rethink whether you need each item on your list. Some features may be helpful but won’t impact your end product significantly, so it’s best to leave them out for now.
4. Use Open-Source Databases
Another aspect of data science that can incur high expenses is your database. Gathering your own data is slow and requires extensive infrastructure costs, and many publicly available databases are costly. You can avoid these costs by training your programs on open-source databases instead.
Many open-source databases will give you limited access for free. Some service providers, like Supabase, even offer free tiers with full access to their databases, which are often built on open-source options. When using these open databases, though, be sure to review their security and clean the data before processing.
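What "cleaning" means depends on the data, but a common first pass is trimming whitespace, dropping rows with missing values, and removing duplicates. Here is a minimal sketch using only Python's standard library; the column names and sample rows are invented for illustration, and `clean_rows` is a hypothetical helper, not part of any library.

```python
import csv
import io


def clean_rows(rows, required):
    """Trim whitespace, drop rows missing a required field, and remove duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        # Normalize every value; treat missing values (None) as empty strings.
        normalized = {
            key: (value or "").strip()
            for key, value in row.items()
            if key is not None
        }
        # Skip rows where any required field is empty.
        if any(not normalized.get(field) for field in required):
            continue
        # Skip rows already seen, judged by the required fields.
        identity = tuple(normalized[field] for field in required)
        if identity in seen:
            continue
        seen.add(identity)
        cleaned.append(normalized)
    return cleaned


# Invented sample data: one stray space, one duplicate, one missing city.
raw = "city,temp_c\nOslo, 4\nOslo,4\n,7\nLima,19\n"
rows = list(csv.DictReader(io.StringIO(raw)))
cleaned = clean_rows(rows, required=("city", "temp_c"))
print(cleaned)
```

Real datasets usually need more than this, such as type checks and outlier handling, but a pass like this catches the most common problems before they reach your analysis.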
5. Start Small
Finally, you can keep your costs down by tempering your ambitions. Large, groundbreaking, or disruptive projects will likely have complexity and storage needs beyond your limited budget. Focus on smaller, less intensive projects at first, planning to expand as you pull in more revenue.
Smaller projects will make the relatively limited utility of free resources feel less restrictive. If you can hold back until you make more money to expand, free databases and hosting tools can take you a long way.
Data Science Doesn’t Have to Be Expensive
Data science can be an imposing field at first, especially considering how much some businesses spend on it. While these expenses can grow to be astronomical, they don’t have to be, especially for new data science operations.
Following these five steps will help you establish your stack without spending much. If you already have some tools, you may even be able to start working for free. You can then begin to grow your operation to move on to bigger things in the future.
Devin Partida is a big data and technology writer, as well as the Editor-in-Chief of ReHack.com