When it comes to explaining how the most successful internet businesses in the world got to where they are today, and how they’ll continue to dominate in the future, many experts point to their proficiency in data mining:

  • Amazon’s data mining capabilities fuel its robust recommendation engines. They’re also why the company’s decision to enter the grocery store business isn’t as crazy as it sounds.
  • Facebook became a multi-billion dollar company in large part by data mining its massive user base in order to serve hyper-specific, high-converting advertisements.
  • Google owns a staggering 80 percent of the global search engine market share by providing more relevant search results than the competitors—thanks to data mining.

As a small business, you have a ton of data at your disposal: invoices, inventory records and spreadsheets on top of spreadsheets on top of spreadsheets. Surely you can data mine your way to the top too, right?

[Image: What most people think data mining is.]

And it’s here where the term “data mining” becomes a massive misnomer. It implies that all you need to do it successfully is effort; that if you keep manually picking away at your data, eventually gold nuggets of information and insight that can better your organization will surely appear.

Sadly, that’s not how it works. However, that doesn’t mean data mining is out of reach for your small business. On the contrary, just as it is for Amazon and Google, data mining is vital to your future.

Here, we’ll look at what data mining really entails and how it can be applied, and we’ll highlight some affordable data mining tools that small businesses should consider.

What is data mining?

Gartner defines data mining as “the process of discovering meaningful correlations, patterns and trends by sifting through large amounts of data stored in repositories.”

Put more simply, data mining uses statistics and modeling to analyze significant chunks of disparate information and see if they connect in some way.

How do I mine for data? Or, wait, am I mining for data?

Isn’t that frustrating? The term “data mining” implies that you’re mining for new data, but it’s actually the data you already have that’s being mined for those correlations, patterns and trends mentioned earlier. Welcome to the world of technology terminology.

As for how you do it, data mining in the business world is generally a three-step process:

Step 1: Identify your data sources

If you’re a retail store, every time a customer inserts their credit card in your POS system, you’re collecting data like the customer’s name, the time of purchase, and what they bought. If you’re an online business, every time someone visits your website, you can learn if they came from Twitter or YouTube and what pages they viewed. That’s valuable information! If you’re not collecting it, start doing so, be it in spreadsheets or in software. Afterwards, you need to decide which sources you want to data mine for possible trends.

Step 2: Pick the data points from your sources that you want to analyze

It would be great if you could just feed two gigantic databases into a computer and learn every which way they correlate, but we’re not there yet technology-wise. Software needs instructions; it needs to know where to look for patterns. A best practice is to pick data points that likely have some sort of cause-and-effect relationship. Your monthly revenue and the number of customers you have named Daniel are almost certainly unrelated. The time of day and how many people are in your store though? There’s likely something there.

Step 3: Apply and test a model that best connects the data points

This is the step that loses a lot of people, so it’s best explained with a simplified example using Microsoft Excel. Let’s say I drive an ice cream truck, and I want to know how the weather is affecting my sales. For the next 30 days, I record the highest temperature and the number of ice cream cones I sell:

To the naked eye, it looks like there might be a trend here. But it’s not certain. More importantly, just looking at these dots, it’s not quantifiable how related these data sets really are.

Adding a linear trendline—a straight line that Excel can add to a plot like this to best show how the number of ice creams I’m selling is changing as the temperature changes—reveals a bit more. This is the model we’re applying:
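
If you'd rather see roughly what a spreadsheet does when it draws that line, here is a minimal Python sketch that fits a straight line with ordinary least squares, the standard way to fit a linear trendline. The function name `fit_trendline` and the temperature and sales numbers are made up for illustration; they are not the data from my ice cream truck.

```python
def fit_trendline(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical, so no line can be fitted")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sample: daily high temperature and cones sold (invented numbers).
sample_temps = [68, 72, 75, 79, 83, 88, 91]
sample_cones = [21, 25, 24, 33, 36, 41, 47]

sample_slope, sample_intercept = fit_trendline(sample_temps, sample_cones)
print(f"cones sold = {sample_slope:.2f} * temperature + {sample_intercept:.2f}")
```

The slope is the useful part for a business owner: it estimates how many extra cones each additional degree is worth, at least within the range of temperatures you recorded.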

Now we can see that as the temperature increases, my ice cream sales tend to increase as well, outside of a few outliers. But we can do better. We can also apply a simple formula to these two data sets (=CORREL) to get what’s known as a correlation coefficient: a number between +1 and -1 that indicates how closely two data sets follow a straight-line relationship. +1.00 indicates a perfect positive correlation (every point falls exactly on an upward-sloping line, so a hotter day always means proportionally more cones sold), -1.00 indicates a perfect negative correlation (every point falls exactly on a downward-sloping line, so a hotter day always means proportionally fewer cones sold) and 0.00 indicates no linear correlation at all.

The correlation coefficient for my data is 0.67, which indicates a moderate to strong positive correlation.
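
For readers who want to check the arithmetic outside a spreadsheet, here is a small Python sketch of the Pearson correlation coefficient, which is the statistic Excel's CORREL function returns. The function name `pearson_r` and the tiny data sets are hypothetical and chosen only to make the behavior easy to see.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list never changes")
    return sxy / math.sqrt(sxx * syy)


# Invented examples: points on a rising line, a steeper rising line,
# and a falling line.
print(pearson_r([1, 2, 3], [2, 4, 6]))
print(pearson_r([1, 2, 3], [10, 20, 30]))
print(pearson_r([1, 2, 3], [6, 4, 2]))
```

The first two calls give the same result even though the second line is five times steeper. The coefficient reports how tightly the points hug a straight line, not how steep that line is.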

And that, in essence, is data mining! We just took two disparate sets of data and found the pattern that connects them. From here, we would keep adding data to test the model and make sure it’s accurate.

I want to stress that this is a very simplified example. Imagine that instead of 30 days, we looked at data from 30,000 days. We then accounted for other variables affecting these numbers, like whether it rained or not each day or if I ran out of chocolate ice cream. Oh, and the best fit line isn’t linear, it’s a crazy curve. It starts to make sense why software is necessary to data mine effectively (which we’ll get to in a bit).

How can I use data mining to my advantage?

Finding hidden patterns in your business data can be a fascinating venture, but it’s ultimately all for naught if you can’t apply what you’ve learned to optimize your business. Here are just some of the many ways that companies can use data mining to gain a competitive advantage:

  • Identify seasonality. Businesses can use data mining to determine the best times to raise or lower prices. Macy’s uses advanced data mining software to analyze the demand and inventory numbers for its online store and adjust the pricing for over 73 million items in real time.
  • Encourage more spending. With data mining, businesses can learn which items are often bought or viewed together in certain scenarios to encourage bulk purchases. Besides batteries and emergency equipment, Walmart also learned through data mining that when customers are expecting bad weather, they buy more…strawberry Pop Tarts. The company increased its Pop Tart supply in locations with inclement weather and increased revenue.
  • Hire better workers. Employers are utilizing data mining more and more to find the qualities that best predict which job candidates will become high-performing employees. Xerox learned that “creative types” identified through a personality test lasted longer through their job training programs, cutting attrition rates by 20 percent.
  • Improve marketing efforts. Data mining customer data will reveal new ways to market towards different customer segments with email campaigns and social media. Intermix used data mining to segment its customers into three distinct groups, and targeted each with different email savings offers—increasing annual revenue 15 percent.
  • Lower costs. Besides making money, data mining can also help you run operations more efficiently and save money. UPS famously discovered how to save 39 million gallons of fuel a year simply by eliminating left turns, thanks to data mining.
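
To make the "bought together" idea from the list above a little more concrete, here is a rough Python sketch that counts how often each pair of items appears in the same purchase. It is a toy version of what retail data mining tools do at scale, not a description of any company's actual system. The function name `count_item_pairs` and the baskets are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def count_item_pairs(baskets):
    """Count how many baskets contain each unordered pair of items."""
    pair_counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so ("a", "b") and ("b", "a") count as one pair.
        items = sorted(set(basket))
        pair_counts.update(combinations(items, 2))
    return pair_counts


# Hypothetical purchases recorded before a storm (invented data).
sample_baskets = [
    ["batteries", "flashlight", "pop tarts"],
    ["pop tarts", "batteries"],
    ["water", "batteries", "pop tarts"],
    ["water", "flashlight"],
]

for pair, count in count_item_pairs(sample_baskets).most_common(3):
    print(pair, count)
```

Pairs that show up far more often than you would expect by chance are candidates for bundling, shelf placement or a joint promotion.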

What are some good data mining tools for small businesses?

Overwhelmed yet? I don’t blame you! Data mining is incredibly complex, and on the surface, it seems like something only suited for highly qualified data scientists who work for the largest companies in the world.

That’s simply not true anymore. Gartner estimates that the business intelligence and analytics software market will grow to more than $22 billion by 2020, with much of that growth coming from small and mid-sized businesses (SMBs). More and more, affordable data mining tools are becoming available to help the little guys make sense of their data.

Here are a few top-rated small business data mining tools you should consider:

For marketers: Google Analytics

Lucky for us, Google isn’t hogging all that data mining prowess for itself. With Google Analytics, online businesses can link their website for free to uncover all sorts of valuable web traffic metrics: from total traffic to time on site, preferred browser and more. Data mining experts can take the platform further by integrating it with the popular statistical programming language R. Here’s a tutorial to help you get started.

For sales teams: InsightSquared

InsightSquared can take your CRM data and mine it every which way to better visualize sales patterns and trends over time. Gain better insight into your sales funnel, understand where your biggest opportunities are coming from and generate a myriad of reports for stakeholders. Staffing and recruiting agencies can also use InsightSquared to place better job candidates, faster and cheaper.

For power users: Periscope Data

Those with more advanced needs should check out Periscope Data. Data teams can integrate multiple databases with Periscope to get access to real-time data visualizations, dashboards and reports that can be shared with customized user groups throughout the company. An embedded SQL editor can facilitate query writing while data mining capabilities can allow businesses to dive deeper into their historical data than ever before.

This is just a small sample. Head to our small business data mining software page to see more products.

Additional reading material

If you’re not ready to take the data mining dive just yet, that’s OK. Hopefully this article has at least given you some of the basics to help you move forward. If you’re interested in learning more, here are some additional resources:
