Big data refers to massive, complex, structured and unstructured data sets that are rapidly generated and transmitted from a wide variety of sources. These attributes make up the three Vs of big data:
Volume: The huge amounts of data being stored.
Velocity: The lightning speed at which data streams must be processed and analyzed.
Variety: The different sources and forms from which data is collected, such as numbers, text, video, images and audio.
The importance of big data doesn’t revolve around how much data you have, but what you do with it. You can take data from any source and analyze it to find answers that enable 1) cost reductions, 2) time reductions, 3) new product development and optimized offerings, and 4) smart decision making. When you combine big data with high-powered analytics, you can accomplish business-related tasks such as:
Determining root causes of failures, issues and defects in near-real time.
Generating coupons at the point of sale based on the customer’s buying habits.
Recalculating entire risk portfolios in minutes.
Detecting fraudulent behavior before it affects your organization.
The diversity of big data makes it inherently complex, resulting in the need for systems capable of processing its various structural and semantic differences.
Big data requires specialized NoSQL databases that can store the data in a way that doesn't require strict adherence to a particular model. This provides the flexibility needed to cohesively analyze seemingly disparate sources of information to gain a holistic view of what is happening, how to act and when to act.
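As a minimal sketch of that schema flexibility, the snippet below stores records of very different shapes in a single collection of a document database. MongoDB and the pymongo driver are assumed here purely for illustration; the connection string, database and collection names are invented, and any schema-flexible NoSQL store would behave similarly.

```python
# Minimal sketch: schema-less storage in a document database.
# Assumes a local MongoDB instance and the pymongo driver; neither is
# prescribed by the article -- this only illustrates storing data without
# strict adherence to a single model.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
events = client["bigdata_demo"]["events"]  # hypothetical database/collection names

# Documents from disparate sources, each with a different structure,
# can live side by side without a predefined schema.
events.insert_many([
    {"source": "pos",    "sku": "A-1001", "price": 19.99, "ts": "2024-05-01T10:03:00Z"},
    {"source": "sensor", "engine_id": 7, "readings": [812.4, 815.1, 809.9]},
    {"source": "social", "user": "u42", "text": "great service!", "tags": ["review"]},
])

# A single query can still pull the pieces together for analysis.
for doc in events.find({"source": "sensor"}):
    print(doc["engine_id"], doc["readings"])
```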
When aggregating, processing and analyzing big data, it is often classified as either operational or analytical data and stored accordingly.
Operational systems serve large batches of data across multiple servers and include day-to-day inputs such as inventory, customer data and purchases, the routine information within an organization. Analytical systems are more sophisticated, handling complex analysis across that data to provide decision-making insights.
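To make the distinction concrete, here is a hedged sketch contrasting an operational write path with an analytical query over the same purchase data. It uses SQLite from the Python standard library only so the example is self-contained; the table, column and customer names are invented, and real deployments would typically use separate operational and analytical systems.

```python
# Sketch: operational vs. analytical access to the same purchase data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT, item TEXT, amount REAL, day TEXT)")

# Operational workload: capture day-to-day transactions as they happen.
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        ("c1", "keyboard", 49.0, "2024-05-01"),
        ("c2", "monitor", 199.0, "2024-05-01"),
        ("c1", "mouse", 25.0, "2024-05-02"),
    ],
)

# Analytical workload: scan across the accumulated data to support decisions.
for customer, total in conn.execute(
    "SELECT customer, SUM(amount) FROM purchases GROUP BY customer"
):
    print(customer, total)
```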
The New York Stock Exchange generates about one terabyte of new trade data per day.
More than 500 terabytes of new data are ingested into the databases of the social media site Facebook every day, mainly in the form of photo and video uploads, message exchanges and comments.
A single jet engine can generate more than 10 terabytes of data in 30 minutes of flight time. With many thousands of flights per day, data generation reaches many petabytes.
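A rough back-of-the-envelope check of that jet-engine figure follows. The 10 terabytes per 30 minutes comes from the text above; the daily flight count and average flight length are assumptions chosen only to illustrate the order of magnitude.

```python
# Back-of-the-envelope check of the jet-engine figure.
TB_PER_HALF_HOUR = 10             # from the text
flights_per_day = 25_000          # assumed number of daily commercial flights
avg_flight_hours = 2              # assumed average flight duration

tb_per_flight = TB_PER_HALF_HOUR * (avg_flight_hours * 2)   # 40 TB per flight
total_pb_per_day = flights_per_day * tb_per_flight / 1_000  # TB -> PB

print(f"~{total_pb_per_day:,.0f} PB of engine data per day")  # ~1,000 PB
```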
Personalized e-commerce shopping experiences
Financial market modeling
Compiling trillions of data points to speed up cancer research
Media recommendations from streaming services like Spotify, Hulu and Netflix
Predicting crop yields for farmers
Analyzing traffic patterns to lessen congestion in cities
Data tools recognizing retail shopping habits and optimal product placement
Big data helping sports teams maximize their efficiency and value
Recognizing trends in education habits from individual students, schools and districts