Originally posted on Data Science Central
Big data has been a big topic for a few years now, and it’s only going to grow bigger as we get our hands on more sophisticated forms of technology and new applications in which to use them. The problem now is beginning to shift; originally, tech developers and researchers were all about gathering greater quantities of data. Now, with all this data in tow, consumers and developers are both eager for new ways to condense, interpret, and take action on this data.
One of the newest and most talked-about methods for this is data visualization, a system of reducing or illustrating data in simplified, visual ways. The buzz around data visualization is strong and growing, but is the trend all it’s cracked up to be?
The Need for Data Visualization
There’s no question that data visualization can be a good thing, and it’s already helped thousands of marketers and analysts do their jobs more efficiently. Human pattern recognition revolves around sensory inputs, for obvious reasons. We’re hard-wired to recognize visual patterns at a glance, but not to crunch complicated figures and associate them with abstract concepts. Accordingly, representing those figures as integrated visual patterns allows us to tap into our natural analytic abilities.
The Problems With Visualization
Unfortunately, there are a few current and forthcoming problems with the concept of data visualization:
The oversimplification of data. One of the biggest draws of visualization is its ability to take big swaths of data and simplify them to more basic, understandable terms. However, it’s easy to go too far with this; trying to take millions of data points and confine their conclusions to a handful of pictorial representations could lead to unfounded conclusions, or neglect significant modifiers that could completely change the assumptions you walk away with. As an example from outside the world of data, consider basic real-world tests, such as alcohol intoxication tests, which try to reduce complex systems to simple “yes” or “no” results. As Monder Law Group points out, these tests can be unreliable and flat-out inaccurate.
The human limitations of algorithms. This is the biggest potential problem, and also the most complicated. Any algorithm used to reduce data to visual illustrations is based on human inputs, and human inputs can be fundamentally flawed. For example, a human developing an algorithm may highlight different pieces of data that are “most” important to consider, and throw out other pieces entirely; this doesn’t account for all companies or all situations, especially if there are data outliers or unique situations that demand an alternative approach. The problem is compounded by the fact that most data visualization systems are rolled out on a national scale; they evolve to become one-size-fits-all algorithms, and fail to address the specific needs of individuals.
Overreliance on visuals. This is more of a problem with consumers than it is with developers, but it undermines the potential impact of visualization in general. When users start relying on visuals they can read at a glance to interpret data, they could easily start over-relying on this mode of input. For example, they may take their conclusions as absolute truth, never digging deeper into the data sets responsible for producing those visuals. The conclusions you draw this way may be broadly applicable, but they won’t tell you everything about your audiences or campaigns.
The inevitability of visualization. Already, there are dozens of tools available to help us understand complex data sets with visual diagrams, charts, and illustrations, and data visualization is too popular to ever go away. We’re on a fast course to visualization taking over in multiple areas, and there’s no real going back at this point. To some, this may not seem like a problem, but consider some of the effects—companies racing to develop visualization products, and consumers only seeking products that offer visualization. These effects may feed into user overreliance on visuals, and compound the limitations of human errors in algorithm development (since companies will want to go to market as soon as possible).
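The oversimplification and overreliance problems above can be made concrete with a small sketch. It assumes two made-up campaigns whose daily results are reduced to one averaged value each, the way a one-bar-per-campaign chart would show them. The data and the names `campaign_a`, `campaign_b`, `chart_value`, and `detail` are hypothetical and serve only as illustration.

```python
from statistics import mean, median

# Hypothetical daily conversion counts for two campaigns (made-up numbers).
campaign_a = [10, 10, 10, 10, 10, 10, 10]   # steady performance
campaign_b = [2, 2, 2, 2, 2, 2, 58]         # one outlier day carries the week


def chart_value(values):
    """The single number a bar chart of averages would plot for a campaign."""
    return mean(values)


def detail(values):
    """A slightly fuller summary that a reader would have to dig for."""
    return {
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    # Both campaigns produce the same bar height...
    print(chart_value(campaign_a), chart_value(campaign_b))
    # ...but the underlying data tells two very different stories.
    print(detail(campaign_a))
    print(detail(campaign_b))
```

On a chart of averages the two campaigns would look identical. Only the fuller summary, or the raw data itself, reveals that the second campaign depends on a single outlier day.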
There’s no stopping the development of data visualization, and we’re not arguing that it should be stopped. If it’s developed in the right ways, it can be an extraordinary tool for development in countless different areas—but collectively, we need to be aware of the potential problems and biggest obstacles data visualization will need to overcome.