Data Science, Machine Learning, and Artificial Neural Networks

Data Science is used to discover insight in data. Data Science is a multidisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insight from structured and unstructured data.

Data Scientists must become detectives when trying to figure out patterns: they must investigate leads and try to understand the characteristics of their data sets, which requires a significant amount of analytical creativity.

Data Science is an approach to unify statistics, data analysis, and machine learning in order to understand and analyze the actual narrative found within the data.

Data Science is about diving in at a granular level to mine, observe, and comprehend complex behaviors, trends, and inferences in order to uncover insight found within the data.

Data Science employs techniques and theories drawn from many fields: mathematics, statistics, information science, and computer science. Data Science is often used interchangeably with earlier concepts like Business Analytics, Business Intelligence, Predictive Modeling, and Statistics. 

Data Analytics is used in many industries to allow companies and organizations to make better business decisions and in the sciences to verify or disprove existing models or theories.

Data Analytics is the discovery, interpretation, and communication of meaningful patterns in data, and the application of those patterns to draw conclusions and make effective decisions. Data Analytics is the science of examining raw data with the purpose of drawing conclusions about that information, increasingly with the aid of specialized systems and software.

Data Analytics focuses on inference, the process of deriving a conclusion based solely on what is already known by the researcher. Data Analytics involves many processes that include extracting, categorizing, and analyzing the various patterns, connections, and relationships within various data sets. 
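The "extracting, categorizing, and analyzing" steps above can be sketched in a few lines. This is a minimal illustration using made-up transaction records and only the Python standard library, not a real analytics workflow:

```python
from collections import defaultdict

# Hypothetical transaction records: (customer_segment, amount) pairs.
# These values are invented purely for illustration.
transactions = [
    ("retail", 25.0), ("wholesale", 310.0), ("retail", 40.0),
    ("wholesale", 275.0), ("retail", 32.5),
]

# Categorize: group amounts by segment.
by_segment = defaultdict(list)
for segment, amount in transactions:
    by_segment[segment].append(amount)

# Analyze: summarize each category to support a conclusion.
for segment, amounts in sorted(by_segment.items()):
    print(segment, len(amounts), sum(amounts) / len(amounts))
```

Even this toy version shows the shape of the process: raw records go in, categories and summary patterns come out, and the researcher infers a conclusion from them.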

Business Analytics initiatives can help businesses increase revenues, improve operational efficiency, optimize marketing campaigns and customer service efforts, respond more quickly to emerging market trends, gain a competitive edge over rivals, and boost overall business performance.

Organizations Apply Analytics to Business Data to Describe, Predict, and Improve Business Performance.

Since analytics can require extensive computation, the algorithms and software used for analytics harness the most current methods in computer science, statistics, and mathematics.

Big Data deals with Data Sets that are too large or complex to be dealt with by traditional data-processing application software. 

Big Data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, information privacy, and data sources.

Current usage of the term Big Data tends to refer to the use of predictive analytics, user behavior analytics, or certain other advanced data analytics methods that extract value from data, and seldom to a particular size of data set. “There is little doubt that the quantities of data now available are indeed large, but that’s not the most relevant characteristic of this new data ecosystem.”

Analysis of Data Sets can find new correlations to “Discover Business Trends, Prevent Diseases, Combat Crime, and so on.”

Scientists, business executives, medical practitioners, advertisers, and governments alike regularly meet difficulties with large data sets in areas including Internet searches, fintech, urban informatics, and business informatics. Scientists encounter limitations in e-Science work, including meteorology, genomics, connectomics, complex physics simulations, biology, and environmental research.

Data sets grow rapidly, in part because they are increasingly gathered by cheap and numerous information-sensing Internet of Things devices such as mobile devices, aerial (remote-sensing) equipment, software logs, cameras, microphones, radio-frequency identification (RFID) readers, and wireless sensor networks. The world’s technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s. By 2025, IDC predicts there will be 163 zettabytes of data.

Relational Database Management Systems, desktop statistics packages, and software used to visualize data often have difficulty handling big data. The work may require “massively parallel software running on tens, hundreds, or even thousands of servers”.

What qualifies as being “big data” varies depending on the capabilities of the users and their tools, and expanding capabilities make big data a moving target. “For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration.”

Data Mining is the process of Discovering Patterns in Large Data Sets involving methods at the Intersection of Machine Learning, Statistics, and Database Systems.

Data Mining is an interdisciplinary subfield of computer science and statistics with the overall goal of extracting information (with intelligent methods) from a data set and transforming it into a comprehensible structure for further use. Data Mining is the analysis step of the “Knowledge Discovery in Databases” (KDD) process.

The goal of Data Mining is the extraction of patterns and knowledge from large amounts of data.

Data Mining is frequently applied to any form of large-scale data or information processing (collection, extraction, warehousing, analysis, and statistics) as well as any application of computer decision support system, including artificial intelligence (e.g., machine learning) and business intelligence. 

The actual data mining task is the semi-automatic or automatic analysis of large quantities of data to extract previously unknown interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection), and dependencies (association rule mining, sequential pattern mining). This usually involves using database techniques such as spatial indices. 
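Of the tasks listed above, anomaly detection is the easiest to show in miniature. The sketch below flags unusual records with a simple standard-deviation rule over invented daily transaction counts; real mining systems use far more sophisticated statistical and machine-learning models:

```python
from statistics import mean, stdev

# Hypothetical daily transaction counts; one day is clearly unusual.
counts = [102, 98, 105, 99, 101, 240, 97, 103]

mu, sigma = mean(counts), stdev(counts)

# Flag records more than two standard deviations from the mean --
# a simple stand-in for the anomaly-detection task described above.
anomalies = [x for x in counts if abs(x - mu) > 2 * sigma]
print(anomalies)
```

The same "find records that don't fit the bulk of the data" idea underlies fraud detection and fault monitoring, just at much larger scale.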

The difference between data analysis and data mining is that data analysis is used to test models and hypotheses on the data set, e.g., analyzing the effectiveness of a marketing campaign, regardless of the amount of data; in contrast, data mining uses machine-learning and statistical models to uncover hidden patterns in a large volume of data.
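The marketing-campaign example of data analysis can be made concrete. Here is a minimal sketch that tests a specific hypothesis ("the campaign lifted sales") on invented before/after figures; the numbers are illustrative only:

```python
from statistics import mean

# Hypothetical average daily sales before and after a campaign
# (made-up numbers for illustration, not real data).
before = [120, 135, 128, 140, 131]
after = [152, 149, 160, 155, 158]

lift = mean(after) - mean(before)
pct = 100 * lift / mean(before)
print(f"Average daily sales rose by {lift:.0f} units ({pct:.1f}%)")
```

Note how this differs from mining: the analyst starts with a question and checks it against the data, rather than letting an algorithm surface unexpected patterns.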

Machine Learning refers to systems that can learn by themselves: systems that get smarter and smarter over time without human intervention.

Machine Learning (ML) is commonly used alongside AI but they are not the same thing. ML is a subset of AI.

Deep Learning (DL) is Machine Learning that uses many-layered Artificial Neural Networks, typically applied to large data sets.

Most AI work now involves Machine Learning because intelligent behavior requires considerable knowledge, and learning is the easiest way to get that knowledge. The image below captures the relationship between AI, ML, and DL.

[Image: Artificial Intelligence, Machine Learning, and Deep Learning and how they all stack up (Sonix)]

Artificial intelligence (AI) is the overarching discipline that covers anything related to making machines smart. 

Whether it’s a robot, a refrigerator, a car, or a software application, if you are making them smart, then it’s AI. Below we attempt to explain the important parts of artificial intelligence and how they fit together. 

Here are the most commonly used acronyms and their definitions:

  • Artificial Intelligence (AI) - the broad discipline of creating intelligent machines
  • Machine Learning (ML) - refers to systems that can learn from experience
  • Deep Learning (DL) - refers to systems that learn from experience on large data sets
  • Artificial Neural Networks (ANN) - refers to models inspired by human neural networks that are designed to help computers learn
  • Natural Language Processing (NLP) - refers to systems that can understand language
  • Automated Speech Recognition (ASR) - refers to the use of computer hardware- and software-based techniques to identify and process the human voice

The principal underlying technologies are Automated Speech Recognition (ASR) and Natural Language Processing (NLP). ASR is the processing of speech to text, whereas NLP is the processing of the text to understand meaning. Because humans speak with informalities and abbreviations, it takes extensive computer analysis of natural language to drive accurate outputs.
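One small piece of that analysis is normalizing the informalities and abbreviations mentioned above. The toy function below expands a few informal words in ASR-style output; the abbreviation table is a made-up example, not a real NLP library or API:

```python
# Tiny, invented table of informal spellings -> standard forms.
INFORMAL = {"u": "you", "gonna": "going to", "thx": "thanks"}

def normalize(text: str) -> str:
    # Lowercase the text, then expand each known informal word.
    return " ".join(INFORMAL.get(word, word) for word in text.lower().split())

print(normalize("Thx u gonna love this"))
```

Production NLP systems handle this with statistical language models rather than fixed lookup tables, but the goal is the same: turn messy human language into something a machine can reason about.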

Automated Speech Recognition and Natural Language Processing fall under Artificial Intelligence. Machine Learning and Natural Language Processing have some overlap as Machine Learning is often used for Natural Language Processing tasks. Automated Speech Recognition also overlaps with Machine Learning. It has historically been a driving force behind many Machine Learning Techniques.

There are many techniques and approaches to Machine Learning. One of those approaches is Artificial Neural Networks (ANN), sometimes just called Neural Networks. 

A good example of an Artificial Neural Network is Amazon’s Recommendation Engine.

Amazon uses Artificial Neural Networks to generate recommendations for its customers. Amazon suggests products by showing you “customers who viewed this item also viewed” and “customers who bought this item also bought”. Amazon assimilates data from all its users’ browsing experiences and uses that information to make effective product recommendations.
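The simplest version of “customers who bought this item also bought” is co-occurrence counting. The sketch below is a toy illustration with invented shopping baskets, not Amazon’s actual system, which relies on neural networks and far richer signals:

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories: one set of product IDs per customer.
baskets = [
    {"book", "lamp"}, {"book", "lamp", "desk"},
    {"book", "desk"}, {"lamp", "desk"},
]

# Count how often each pair of products is bought together.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

def also_bought(item):
    # "Customers who bought this also bought" = most co-purchased items.
    related = Counter()
    for (a, b), n in pair_counts.items():
        if a == item:
            related[b] += n
        elif b == item:
            related[a] += n
    return [product for product, _ in related.most_common()]

print(also_bought("book"))
```

A neural-network recommender learns these associations from data instead of counting them explicitly, which lets it generalize to items that have rarely been bought together.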

Neural Networks are built to replicate the web of neurons in the human brain. By analyzing vast amounts of digital data, these neural nets can learn all sorts of useful tasks, like identifying photos, recognizing commands spoken into a smartphone, and, as it turns out, responding to Internet search queries. In some cases, they can learn a task so well that they outperform humans. They can do it better. They can do it faster. And they can do it at a much larger scale.
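At its core, a neural network is just layers of simple units, each applying a weighted sum and a nonlinearity. The sketch below is a minimal 2-2-1 feed-forward network whose weights are hand-picked to compute XOR, purely for illustration; in a real network these weights would be learned from data:

```python
import math

def sigmoid(x):
    # Squashes any input into the range (0, 1), like a neuron "firing".
    return 1.0 / (1.0 + math.exp(-x))

def forward(x1, x2):
    # Hand-picked weights (illustrative, not learned):
    h1 = sigmoid(20 * x1 + 20 * x2 - 10)    # fires if x1 OR x2
    h2 = sigmoid(-20 * x1 - 20 * x2 + 30)   # fires unless x1 AND x2
    return sigmoid(20 * h1 + 20 * h2 - 30)  # fires if both hidden units fire

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, round(forward(a, b)))
```

Stacking many such layers, and training the weights on vast amounts of data instead of setting them by hand, is what turns this simple mechanism into systems that can identify photos or answer search queries.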

This approach, called Deep Learning, is rapidly reinventing so many of the Internet’s most popular services, from Facebook to Twitter, and even Google.


Deep Learning and Artificial Neural Networks have reinvented the way Google Search Operates.

Google Search controls much of what we see when we use our devices. Over 92% of the world’s Internet users rely on Google Search to find the “right answers to their questions” because “everyone trusts Google Search and its ability to consistently provide users the right solutions to their problems.”

63,000 Searches per second
3.8 Million Searches per minute
228 Million Searches per hour
5.6 Billion Searches per day

Google’s objective is to provide its users with the highest-quality content from the most credible sources.