- Complex logic at breakneck speed: Try Julia for data science - May 20, 2020.
We benchmark Julia against equivalent Python code to show why Julia is great for data science and machine learning.
Benchmark, Data Science, Julia, numpy, Python
- Fundamental Breakthrough in 2 Decade Old Algorithm Redefines Big Data Benchmarks - Sep 28, 2017.
Read on to find out how the two-decade-old computational barrier of minwise hashing has been overcome with a significantly more efficient alternative.
Algorithms, Benchmark, Big Data, Hashing, Research
- Benchmarking Big Data SQL Platforms in the Cloud - Sep 21, 2017.
TPC-DS benchmarks demonstrate Databricks Runtime 3.0's superior performance. Sign up for a Databricks account to get the fastest performance.
Apache Spark, AWS, Benchmark, Cloud Computing, Databricks, Presto
- Lessons Learned From Benchmarking Fast Machine Learning Algorithms - Aug 16, 2017.
Boosted decision trees are responsible for more than half of the winning solutions in machine learning challenges hosted at Kaggle, and require minimal tuning. We evaluate two popular tree boosting software packages, XGBoost and LightGBM, and draw four important lessons.
Benchmark, Decision Trees, Kaggle, Machine Learning, Microsoft, XGBoost
- Grunion, Query Optimization Tool for Data Science and Big Data - Mar 14, 2017.
Grunion is a patent-pending query optimization, translation, and federation framework built to help bridge the gap between data science and data engineering teams. Read more to request access.
Apache Spark, Benchmark, Data Workflow, Datascience.com, NoSQL, SQL
- How is my county doing? Benchmarking 3,143 US counties - Oct 26, 2016.
Could you have imagined a few years back how useful open data could be for gaining insights about your county: its changes, population, health, crime, education, and many other aspects? How are other counties doing compared to yours? This article presents just such a benchmarking case study of US counties.
Benchmark, OnlyBoth, Raul Valdes-Perez, USA
- Avoiding Tunnel Vision in Peer Comparisons - Nov 12, 2015.
Comparing yourself to peers (benchmarking) lets you understand how you’re doing and identify performance gaps. Benchmarking is widespread but frequently misses useful and actionable insights. The proposed approach helps avoid tunnel vision in benchmarking.
Benchmark, Insight Quality, OnlyBoth, Raul Valdes-Perez
- How to Lead a Data Science Contest without Reading the Data - May 17, 2015.
We examine a “wacky” boosting method that lets you climb the public leaderboard without even looking at the data. But there is a catch, so read on before trying to win Kaggle competitions with this approach.
Accuracy, Benchmark, Competition, Kaggle, Model Performance