Big Data Characteristics

Big Data is distinguished from traditional data by one or more of the four V’s: Volume, Velocity, Variety and Variability.

1) Volume:

Volume is the amount of data generated that must be understood to make data-based decisions.

For scale, a text file is typically a few kilobytes, a sound file is a few megabytes, and a full-length movie is a few gigabytes.
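To make the jump between these units concrete, here is a minimal sketch that formats raw byte counts in human-readable units. The function name `human_readable` and the sample sizes are assumptions chosen for illustration, and the sketch uses binary units (1 KB = 1024 bytes).

```python
def human_readable(num_bytes: int) -> str:
    """Format a byte count using binary units (1 KB = 1024 bytes)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the number below 1024,
        # or at the largest unit we know about.
        if size < 1024 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024


# Hypothetical sizes, chosen only to illustrate the scale.
samples = [("text file", 4_000), ("sound file", 5_000_000), ("movie", 4_000_000_000)]
for label, size in samples:
    print(label, human_readable(size))
```

Each step up in unit is a factor of 1024, so a few-gigabyte movie is roughly a million times larger than a few-kilobyte text file.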

Example:

Amazon handles 15 million customer clickstream records per day to recommend products.

An extremely large volume of data is a major characteristic of Big Data.

2) Velocity:

Velocity measures how fast data is produced and modified, and the speed with which it needs to be processed. An increasing number of data sources, both machine-generated and human-generated, drives velocity.

Example:

72 hours of video are uploaded to YouTube every minute; this is velocity.

An extremely high velocity of data is another major characteristic of Big Data.
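As a rough worked calculation, the upload figure quoted above can be converted into other rates. This is only arithmetic on the number in the example; the variable names are illustrative.

```python
HOURS_UPLOADED_PER_MINUTE = 72  # figure quoted in the example above

# Minutes of video arriving for every minute of wall-clock time.
minutes_of_video_per_minute = HOURS_UPLOADED_PER_MINUTE * 60

# Hours of video arriving over a full day (24 hours * 60 minutes).
hours_of_video_per_day = HOURS_UPLOADED_PER_MINUTE * 60 * 24

print(minutes_of_video_per_minute)  # 4320
print(hours_of_video_per_day)       # 103680
```

In other words, video arrives 4,320 times faster than a single viewer could watch it, which is why such data has to be processed as it streams in.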

3) Variety:

Variety refers to data coming from new sources both inside and outside of an enterprise. It can be structured, semi-structured or unstructured.

Structured data:

It is typically found in tables with columns and rows of data. Each cell at the intersection of a row and a column holds a value, and each row is usually identified by a “key” that queries can use to refer to it. Because the data is organized into related tables, these databases are commonly referred to as relational databases. A retail outlet that stores its sales data (name of person, product sold, amount) in an Excel spreadsheet or CSV file is an example of structured data.

Example:

A Product table in a database is an example of structured data:

Product_id  Product_name  Product_price
1           Pen           $5.95
2           Paper         $8.95
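The Product table can be sketched with Python's built-in `sqlite3` module to show how a key is used in a query. The lowercase table and column names and the in-memory database are assumptions made for this illustration, and prices are stored as plain numbers without the currency symbol.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    "product_id INTEGER PRIMARY KEY, "
    "product_name TEXT, "
    "product_price REAL)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "Pen", 5.95), (2, "Paper", 8.95)],
)

# Look up one row by its key.
row = conn.execute(
    "SELECT product_name, product_price FROM product WHERE product_id = ?",
    (2,),
).fetchone()
print(row)  # ('Paper', 8.95)
conn.close()
```

Because every row has the same columns and a unique key, a single query is enough to find exactly the value that is needed.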

 

Semi-structured data:

Semi-structured data is also organized, but without a rigid table structure: tags or other markers describe each element, so the data can be read and manipulated more flexibly. XML files or an RSS feed for a webpage are examples of semi-structured data.

Example: XML file

<product>
  <name>Pen</name>
  <price>$7.95</price>
</product>

<product>
  <name>Paper</name>
  <price>$8.95</price>
</product>
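A minimal sketch of reading this XML with Python's standard `xml.etree.ElementTree` module follows. The `<products>` wrapper element is an assumption added here, since an XML document needs a single root element; converting the price to a number is likewise just one possible choice.

```python
import xml.etree.ElementTree as ET

# The two <product> elements from the example, wrapped in a hypothetical
# <products> root because an XML document needs a single root element.
xml_text = """
<products>
  <product><name>Pen </name><price>$7.95</price></product>
  <product><name>Paper </name><price>$8.95</price></product>
</products>
""".strip()

root = ET.fromstring(xml_text)
products = []
for product in root.findall("product"):
    # Tags tell us what each value means, so no fixed column order is needed.
    name = product.findtext("name").strip()
    price = float(product.findtext("price").strip().lstrip("$"))
    products.append((name, price))

print(products)  # [('Pen', 7.95), ('Paper', 8.95)]
```

The tags carry the structure, so a product could gain or drop an element without breaking the rest of the file.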

Unstructured data:

Unstructured data generally has no organizing structure, and Big Data technologies use different ways to add structure to it. A typical example of unstructured data is a heterogeneous data source containing a combination of simple text files, images, videos, etc.

Example:

Output returned by a Google search.
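One simple way to add structure to free-form text is to count how often each word appears. The sketch below does this with the standard library only; the sample sentence is made up for illustration, and real Big Data tools apply far more elaborate techniques.

```python
import re
from collections import Counter

# Hypothetical free-form text standing in for an unstructured source.
text = "Big data is big. Data arrives fast, and data comes in many forms."

# Lowercase the text and keep only runs of letters as words.
words = re.findall(r"[a-z]+", text.lower())
counts = Counter(words)

print(counts.most_common(2))  # [('data', 3), ('big', 2)]
```

The result is a small table of words and counts, which is structured data derived from unstructured input.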

4) Variability:

Variability refers to the inconsistency that data can show at times, which hampers the ability to handle and manage it effectively.

You can see that a few values, marked with ?, are missing in the table below:

Department  Year  Minimum sales  Maximum sales
1           2010  ?              1500
2           2011  10000          ?
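A minimal sketch of detecting such gaps programmatically is shown below. It represents the table as a list of dictionaries, with `None` standing in for each "?"; the key names are assumptions chosen for this example.

```python
# The sales table, with None standing in for each "?".
rows = [
    {"department": 1, "year": 2010, "min_sales": None, "max_sales": 1500},
    {"department": 2, "year": 2011, "min_sales": 10000, "max_sales": None},
]

# Collect (department, column) pairs wherever a value is missing.
missing = [
    (row["department"], column)
    for row in rows
    for column, value in row.items()
    if value is None
]
print(missing)  # [(1, 'min_sales'), (2, 'max_sales')]
```

Knowing where the gaps are is the first step; whether to fill, drop, or flag them depends on the analysis being done.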

Available data can sometimes be messy and may be difficult to trust. With the wide variety of Big Data types being generated, quality and accuracy are difficult to control.

Example: A Twitter post has hashtags, typos and abbreviations.
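The kind of cleanup such a post needs can be sketched as follows. The post text and the replacement table are both made up for illustration; a real pipeline would rely on much larger dictionaries or a spell-checking library.

```python
import re

# Hypothetical post, written to include hashtags, a typo and abbreviations.
post = "OMG teh new phone is gr8!! #BigData #tech"

# Pull the hashtags out as their own field.
hashtags = re.findall(r"#(\w+)", post)

# Hypothetical lookup table; a real one would be far larger.
replacements = {"omg": "oh my god", "teh": "the", "gr8": "great"}

# Drop the hashtags, lowercase the rest, and fix known typos and abbreviations.
body = re.sub(r"#\w+", "", post).lower()
tokens = re.findall(r"[a-z0-9]+", body)
cleaned = " ".join(replacements.get(token, token) for token in tokens)

print(hashtags)  # ['BigData', 'tech']
print(cleaned)   # oh my god the new phone is great
```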
