- What Comes After HDF5? Seeking a Data Storage Format for Deep Learning - Nov 9, 2021.
HDF5 is one of the most popular and reliable formats for non-tabular, numerical data, but it is not optimized for deep learning work. This article suggests what an ML-native data format should look like to truly serve the needs of modern data scientists.
Data Management, Deep Learning, Python
- CSV Files for Storage? No Thanks. There’s a Better Option - Aug 31, 2021.
Saving data to CSVs is costing you both money and disk space. It’s time to end it.
Data Management, Pandas, Parquet, Python
- Overcoming the Simplicity Illusion with Data Migration - Jun 1, 2021.
What’s the key to a smooth data migration experience? It comes down to this primary issue: whether or not you can rapidly determine your dataset composition.
Data Management, Metadata
- The Best Tool for Data Blending is KNIME - Jan 13, 2021.
These are the lessons and best practices I learned in many years of experience in data blending, and the software that became my most important tool in my day-to-day work.
Data Exploration, Data Management, ETL, Knime
- Data Versioning: Does it mean what you think it means? - Aug 26, 2020.
Does data versioning mean what you think it means? Read this overview with use cases to see what data versioning really is, and the tools that can help you manage it.
Data Lake, Data Management, Data Science, Version Control
- How A Single Source of Truth Can Benefit Your Organization - Aug 7, 2020.
A single source of truth provides stakeholders with a clear picture of the enterprise assets and the potential complications that can disrupt the data strategy. Find out how you can implement this single source of truth in your enterprise ecosystem.
Business Intelligence, Data Management, Data Quality, Decision Making
- A Holistic Framework for Managing Data Analytics Projects - May 22, 2020.
Agile project management for Data Science development continues to be an effective framework that enables flexibility and productivity in a field that can experience continuous changes in data and evolving stakeholder expectations. Learn more about the leading approaches for developing Data Science models, and apply them to your next project.
Agile, CRISP-DM, Data Analytics, Data Management, Data Mining, Decision Management, Development, Software Engineering
- The Benefits & Examples of Using Apache Spark with PySpark - Apr 21, 2020.
Apache Spark runs fast, offers robust, distributed, fault-tolerant data objects, and integrates beautifully with the world of machine learning and graph analytics. Learn more here.
Apache Spark, Data Management, Python, SQL
- How Bad Data is Affecting Your Organization’s Operational Efficiency - Mar 5, 2020.
Despite recognizing the importance of data quality, many companies still fail to implement a data quality framework that could protect them from making costly mistakes. Poor data does not just cause revenue loss – it’s the reason your company could lose employees, customers and reputation!
Business, Data Management, Data Operations, Data Quality, Efficiency
- Top KDnuggets tweets, Oct 30 – Nov 05: Everything a Data Scientist Should Know About Data Management - Nov 6, 2019.
Which Data Science Skills are core and which are hot/emerging ones?; The 4 Quadrants of Data Science Skills and 7 Principles for Creating a Viral DataViz; Microsoft open sources #SandDance, a visual data exploration tool.
Data Management, Data Visualization, Microsoft, Top tweets

- Everything a Data Scientist Should Know About Data Management - Oct 22, 2019.
For full-stack data science mastery, you must understand data management along with all the bells and whistles of machine learning. This high-level overview is a road map for the history and current state of the expansive options for data storage and infrastructure solutions.
Data Management, Data Scientist, Hadoop
- How LinkedIn, Uber, Lyft, Airbnb and Netflix are Solving Data Management and Discovery for Machine Learning Solutions - Aug 22, 2019.
As machine learning evolves, the need for tools and platforms that automate the lifecycle management of training and testing datasets is becoming increasingly important. Fast growing technology companies like Uber or LinkedIn have been forced to build their own in-house data lifecycle management solutions to power different groups of machine learning models.
AirBnB, Data Management, LinkedIn, Machine Learning, Netflix, Uber
- Manual Coding or Automated Data Integration – What’s the Best Way to Integrate Your Enterprise Data? - Aug 19, 2019.
What’s the best way to execute your data integration tasks: writing manual code or using an ETL tool? Find out the approach that best fits your organization’s needs and the factors that influence it.
Advice, Data Integration, Data Management, Data Science, Data Science Platform, ETL
- Bayer: Principal Clinical Data Manager [Whippany, NJ] - Jun 24, 2019.
Seeking a candidate to assume ownership and leadership for all Clinical Data Management owned deliverables within assigned compound, project and study and provide leadership to respective CDM (Clinical Data Management) staff, interfacing function and team in order to support and achieve defined business goals. Your success will be driven by your demonstration of our LIFE values.
Bayer, Data Management, Manager, NJ, Whippany
- Advance Your Data and Analytics Skills, Your Way - Apr 8, 2019.
Find the topics and learning style that resonate with you and your team! Join us for essential training in analytics, data management, business intelligence, machine learning, and more. Save 20% on TDWI seminars with code KD20.
Analytics, BI, Boston, Data Management, DC, Kansas City, MA, Machine Learning, MO, New York City, NY, TDWI, Washington
- Tomorrow, Nov 8 Webinar: Transform Your Stagnant Data Swamp into a Pristine Data Lake - Nov 1, 2018.
We explore how to implement an Enterprise Data Management strategy that will unleash your data to power decisions, examine a real-world digital-transformation use case from a Tier-1 bank, and see a demo of Trifacta Wrangler.
Caserta, Data Lake, Data Management, Enterprise, Trifacta
- Customer Data Unicorns: Why how we manage their data is the secret to finding, taming and riding them - Sep 20, 2018.
The process of how we listen, think, talk and do using this data is not possible without the effective management thereof. This skill enables the business to exploit this asset and ride these Majestic Unicorns.
Customer Engagement, Data Management, Unicorn
- Sell Your Boss on TDWI Orlando, where in-depth vendor-neutral analytics + data management training = immediate impact - Sep 10, 2018.
Hey Boss, I was hoping to attend TDWI Orlando... Where in-depth vendor-neutral analytics + data management training = immediate impact, and I can save a lot with code KD20.
Analytics, Data Management, FL, Orlando, TDWI, Training
- We Speak Data at TDWI Las Vegas, Feb 11-16. Save w. code KD30 thru Dec 15 - Nov 17, 2017.
TDWI provides the in-depth, vendor-neutral training in business analytics, data science, and data management, including a certificate track. Save 30% thru Dec 15, 2017 with code KD30.
Analytics, Certificate, Data Management, Data Science, Las Vegas, TDWI, Training
- Updates & Upserts in Hadoop Ecosystem with Apache Kudu - Oct 27, 2017.
A new open source Apache Hadoop ecosystem project, Apache Kudu, completes Hadoop's storage layer to enable fast analytics on fast data.
Apache, Big Data, Data Management, Hadoop, Java, NoSQL
- TDWI Orlando, where we bring the future of data and analytics to life, Dec 3-8 - Sep 21, 2017.
Our comprehensive agenda covers the most important topics and success factors for high-impact data insights, with expert instructors whose only goal is to get you to the next level. Big savings when you register by Oct 13 with priority code KD20.
Analytics, Big Data, Data Management, FL, Orlando, TDWI, Training
- Simplifying Data Pipelines in Hadoop: Overcoming the Growing Pains - May 18, 2017.
Moving to Hadoop is not without its challenges—there are so many options, from tools to approaches, that can have a significant impact on the future success of a business’ strategy. Data management and data pipelining can be particularly difficult.
Data Management, Data Platform, Hadoop, SVDS
- You Scored 200 Dollars Off Open Source Data Event in Boston - May 2, 2017.
Use code KDPV17 to save on Postgres Vision, June 26-28, 2017, at the Royal Sonesta Boston. Co-hosted by EnterpriseDB and MIT, the event sponsors include Amazon Web Services, Avnet, credativ, EnterpriseDB, IBM, Microsoft, MIT, NEC, Palisade Compliance, Quest, TechData, and The Executive Council.
Boston, Data Management, MA, Open Source, Postgres
- The dynamics between AI and IoT - Apr 18, 2017.
We see the need for a new type of Engineer who will combine knowledge from Electronics & IoT with Machine learning, AI, Robotics, Cloud and Data management (devops).
AI, Cloud Computing, Data Management, DevOps, Engineer, IoT, Robots
- Open Source is Central to the Data Management Conversation, Boston, June 26-28 - Apr 18, 2017.
Open source dominates the data management conversation. Postgres Vision, June 26-28, Boston, explores the business value realized from innovative solutions and strategies. Use code KDPV17 to save.
Boston, Data Management, MA, Open Source, Postgres
- Help Define the Future of Open Source Data Management, Boston, June 26-28 - Apr 10, 2017.
Postgres Vision, June 26-28, Boston, will be a forum for the sharpest minds in open source as organizations strive to harvest greater strategic value and actionable insight from their data. Use code KDPV17 to save.
Boston, Data Management, MA, Open Source, Postgres
- How To Stay Competitive In Machine Learning Business - Jan 4, 2017.
To stay competitive in the machine learning business, you have to be superior to your rivals, not the best possible, says one of the leading machine learning experts. Simple rules are defined here to make that happen. Let’s see how.
Business, Business Analytics, Data Management, Machine Learning, Research
- CRN Top Data Management Technologies Vendors 2016 - May 26, 2016.
The CRN editorial team has released its annual Big Data 100 report for 2016. Check out which companies made the list of Data Management Vendors.
Big Data, Big Data Vendors, CRN, Data Management
- 5 Ways in Which Big Data Can Help Leverage Customer Data - May 25, 2016.
Every business enterprise realizes the importance of big data but rarely puts the customer data it possesses to good use. Here are a few ways enterprises can leverage customer data.
Analytics, Big Data, Data Management, Data Mining
- Data Science Data Architecture - Sep 10, 2015.
Data scientists are a rare breed who juggle data science, business, and IT. But they understand less IT than an IT person and less business than a business person, which demands a specific workflow and data architecture.
Big Data Architecture, Data Management, Data Science, Olav Laudy
- InformationWeek 9 NoSQL Pioneers Who Modernized Data Management - Sep 7, 2015.
The age of Big Data wouldn’t have been possible without these NoSQL pioneers. Learn more about the people who changed the data landscape and revolutionized the big data movement.
Data Management, InformationWeek, NoSQL
- Data Hierarchy of Needs - Aug 28, 2015.
The Data Hierarchy of Needs helps explain the steps in Big Data processing. Before moving on to advanced data modeling (the top of the pyramid), organizations need to fill the huge holes they frequently have at the base of the pyramid, where they lack a reliable, complete data flow.
Data Management, Data-Driven Business, Yanir Seroussi
- Statistics Denial Myth: Repackaging Statistics With Straddling Terms - Jul 16, 2015.
Data science is nothing but old wine in new bottles: statistics repackaged under different field names. Here, we bust the myth that the data scientist is new and different from the traditional statistician.
Data Analysis, Data Management, Data Science Skills, Myths, Randy Bartlett, Statistics
- Interview: Linda Powell, Consumer Financial Protection Bureau (CFPB) on Data Governance for Finance Industry - May 22, 2015.
We discuss the chief data officer role at CFPB, big data opportunities and challenges, ontology, vintage data, data governance trends, advice, and more.
CFPB, Data Governance, Data Management, Interview, Linda Powell, Ontology, Policies, Standards
- CRN 2015 Big Data Management Companies - May 21, 2015.
Big Data and its ease of use play a key role in this year’s ‘CRN Big Data 100: Top 30 Data Management companies’. New additions include AtScale, Databricks, and Tamr. A majority of these companies develop open-source NoSQL database technology.
Big Data, CA, CRN, Data Management, Israel, MA
- Geisinger Health System: Associate VP, Enterprise Data Management - May 19, 2015.
Set the strategic direction and manage a high functioning, world class analytic platform and service department.
AVP, Danville, Data Management, Geisinger Health System, PA
- Interview: Michael Lurye, Time Warner Cable on Big Data and the Insatiable Demand for BI - Apr 13, 2015.
We discuss EDM at Time Warner Cable, data sources, complementing legacy data warehouses with Big Data solutions, vendor selection, and the build vs. buy decision.
Big Data, Business Intelligence, Data Management, Data Warehouse, Hadoop, Interview, Mike Lurye, Time Warner Cable
- Interview: Anthony Bak, Ayasdi on Managing Data Complexity through Topology - Jan 28, 2015.
We discuss the definition of Topology, its relevance to Big Data and compare Topological Data Analysis (TDA) with other approaches.
Anthony Bak, Ayasdi, Data Analysis, Data Management, Predictive Modeling, Statistical Analysis, Topological Data Analysis, Topology
- U. Tartu: Professor of Data Management and Analytics - Dec 19, 2014.
U. of Tartu, the highest ranked university in the Baltic States, invites applications from outstanding academics for a full-time permanent position of Professor of Data Management and Analytics.
Analytics, Data Management, Estonia, Faculty, Tartu
- Interview: Christophe Toum, Talend on Why Big Data Needs Big Governance - Aug 2, 2014.
We discuss the priority order of data governance for Big Data initiatives, impact of increasing shift towards Hadoop and NoSQL, data quality, current trends, talent crunch, advice and more.
Christophe Toum, Data Governance, Data Management, Data Quality, Hiring, Talend, Trends
- 100 Big Data Companies Analyzed - Jun 29, 2014.
We analyze the CRN Big Data 100 for insights into trends in the future of Big Data companies, including changes in database solutions, active regions, and what industries are undergoing the most change right now.
Big Data, Big Data Vendors, Business Analytics, Companies, CRN, Data Management, Hadoop, NoSQL
- CRN 25 Big Data Management Companies - Jun 26, 2014.
We examine the top 25 Big Data Management companies, part of the CRN Big Data 100, including Actian, Couchbase, and MemSQL. A large fraction of these companies develop NoSQL solutions.
Big Data, Companies, CRN, Data Management, NoSQL
- Interview: Dale Russell, CTO, Talksum on Building Talksum Router and Real-time Analytics - May 21, 2014.
We discuss challenges in building the Talksum data stream solution, current trends in real-time analytics, advice for Data Science aspirants and more.
Advice, Analytics, Dale Russell, Data Management, Interview, Real-time, Talksum
- Interview: Dale Russell, CTO, Talksum on Winning the IE Big Data Startup Award - May 20, 2014.
We discuss the Talksum data stream router and cross-domain networking with real-time data management using data streams.
Awards, Dale Russell, Data Management, IE Group, Interview, Startup, Talksum
- Forrester Research: Transform Your Organization with Strong Data Management - May 13, 2014.
A new Forrester Research report shows how to build a more elastic and flexible data management practice to meet new data demands. Free download, compliments of Lavastorm Analytics.
Data Management, Forrester, Lavastorm, Report