When it comes to data analytics, there are four Vs to consider: volume, variety, velocity, and veracity. Volume refers to how much data there is, variety refers to the different types and categories of data, velocity refers to how fast the data is generated and processed, and veracity refers to the reliability of the data. While it would be great to max out all four Vs, we don’t live in a perfect world.
Analysts determine what data, and how much of it, is needed for a specific project. You will usually start with a baseline set of data and then either cut down or add to the volume. The analyst also needs to ensure veracity by validating the data and its source. Veracity is critical in data analytics because important business decisions will be made based on it. Velocity covers everything from data generation to collection and analysis, and it depends on the hardware and the type of data being captured. Depending on what the data is being analyzed for, it may be time critical. Variety is simply the reference points being captured, and they may grow or shrink depending on business needs.
Let’s walk through an example. Imagine you are capturing data from your store. A customer comes in and makes a purchase. Volume will depend on how much foot traffic that store gets, plus traffic at any other stores the company owns, including online. There will be a variety of data types, such as email addresses and phone numbers. Velocity will depend on how active the store is, and veracity will depend on how accurately the sensors capture data. For example, traffic should be counted per unique visitor, but a sensor may capture the same person multiple times and count them as different entities if they go in and out multiple times. Even in an online store, visitors may use a VPN, which can skew the numbers.
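To make the unique-visitor problem concrete, here is a minimal Python sketch. It assumes each sensor event carries some kind of visitor identifier, which a real door sensor may not provide. The `events` list, the `visitor_id` values, and the `count_traffic` function are all hypothetical and only illustrate the gap between raw entries and unique visitors.

```python
from collections import Counter

# Hypothetical sensor events as (visitor_id, time) pairs.
# Assumption: the sensor can tell visitors apart; many cannot.
events = [
    ("v1", "09:00"),
    ("v2", "09:05"),
    ("v1", "09:30"),  # v1 stepped out and came back in
    ("v3", "10:00"),
    ("v1", "11:15"),  # v1 again
]

def count_traffic(events):
    """Return (raw entries, unique visitors, entries per visitor)."""
    raw_count = len(events)
    entries_per_visitor = Counter(visitor_id for visitor_id, _ in events)
    unique_count = len(entries_per_visitor)
    return raw_count, unique_count, entries_per_visitor

raw, unique, per_visitor = count_traffic(events)
print(f"raw entries: {raw}, unique visitors: {unique}")
```

With this sample data, the raw count reports five entries while only three distinct people visited. That gap is the veracity issue described above, and an analyst would need to decide which of the two numbers a traffic report should use.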
