6 pages/≈1650 words | 6 Sources | APA | IT & Computer Science | Coursework | English (U.S.) | MS Word
Machine Learning and Big Data (Coursework Sample)

Instructions:

Machine learning and big data

Content:


CLA 1
Name
Institutional Affiliation
CLA 1
Introduction
The world is witnessing rapid technological advancement and an ever-growing need to make sense of data; today, data is money and data is power. Extraordinary volumes of data are being generated from sources that were previously unseen and unheard of. Technologies now exist to capture, manage, and process these data, though numerous issues and challenges remain to be tackled. Many researchers are working in these directions to better understand big data and to draw significant insights from it. Nearly every field of study, whether applied science, basic science, social science, engineering, or biomedical science, now involves big data, and all of these sectors deal with massive datasets. Much of this work aims to better process and harness big data using machine learning techniques, which hold considerable potential for handling current data challenges.
Machine learning and big data
The concept of machine learning is not new to computing, but it has taken on a new form as the world's requirements have changed. Machine learning is a subset of artificial intelligence in which computer algorithms learn from data autonomously. With the growth of the internet, vast amounts of digital information are being created, meaning there is more data available for machines to analyze and learn from; as a result, we are seeing a resurgence of machine learning. Today, machine learning enables communication between humans and computers, drives cars autonomously, writes and publishes sports match reports, and spots terrorist suspects and activities. Machine learning is among the fastest-growing fields in computer science (George et al., 2016). Common machine learning methods include time series analysis, topic modeling, regression, collaborative filtering, classification, association rules, dimensionality reduction, and cluster analysis. These are used to perform analytics and to predict future trends from the correlations and patterns among the data in a given dataset.
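As a minimal illustration of one of the methods listed above, the following sketch implements classification with a 1-nearest-neighbor rule in plain Python. The toy data points and labels are invented for illustration; real systems would use a library and far larger datasets.

```python
# 1-nearest-neighbor classification: label a new point with the label of
# its closest training point. Data and labels below are toy examples.
import math

def nearest_neighbor(train, labels, point):
    """Return the label of the training point closest to `point`."""
    best_i = min(range(len(train)),
                 key=lambda i: math.dist(train[i], point))
    return labels[best_i]

# Toy 2-D dataset: two clusters labeled "a" and "b".
train = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
labels = ["a", "a", "b", "b"]

print(nearest_neighbor(train, labels, (0.3, 0.1)))  # near cluster "a"
print(nearest_neighbor(train, labels, (4.8, 5.1)))  # near cluster "b"
```

The same pattern-matching idea, scaled up, underlies the analytics and trend prediction described above.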
Big data processing is a major focus of study, and numerous techniques and frameworks have been proposed in the recent past by different researchers. Big data matters today because many private and public organizations collect large amounts of domain-specific information that can contain useful insights into problems such as cybersecurity, national intelligence, medical informatics, marketing, and fraud detection. Firms like Microsoft and Google analyze huge data volumes for business analysis and decisions, shaping existing and future technology. Big data analytics helps firms improve business efficiency, though numerous issues and challenges remain in big data processing and analytics.
Machine learning-based techniques and their applications are a vital part of big data processing. With the rise of big data, an opportunity has sprung up for these two fields to come together: machine learning methods that can handle modern kinds of data by drawing on computational and statistical intelligence to navigate massive amounts of information with little or no human supervision (L'heureux et al., 2017). Machine learning algorithms have been pushed to the forefront and help ensure accurate and timely predictions. They are used to extract value from big data and to process massive data volumes at unprecedented speed, driving tremendous change.
Neural networks and machine learning
Machine learning is needed for complex tasks that humans cannot code directly. Some tasks are so difficult that it is impractical, if not impossible, for people to work out all of the code and nuances explicitly. Instead, a large amount of data is given to a machine learning algorithm, which works the task out by exploring the data and searching for a model that achieves what the programmers intended. Machine learning algorithms based on neural networks normally do not need to be programmed with specific rules describing what to expect from the input. Instead, the neural network's learning algorithm processes the many labeled examples supplied during training, using the answer key to learn which input traits are needed to construct the correct output. Once enough examples have been processed, the neural network begins to process unseen, new inputs and successfully returns accurate results (Kim et al., 2018). The more examples and the wider the range of inputs the program sees, the more accurate its results naturally become, as the program learns with experience.
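The learn-from-labeled-examples loop described above can be sketched with a single perceptron, the simplest neural-network unit. The training data (the logical AND function), learning rate, and epoch count are assumptions chosen for illustration; no rules are hand-coded, and the weights are adjusted purely from labeled examples.

```python
# A single perceptron trained on labeled examples of the AND function.

def predict(weights, bias, x):
    """Threshold activation: output 1 if the weighted sum clears zero."""
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0

def train(examples, epochs=20, lr=0.1):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in examples:
            error = target - predict(weights, bias, x)
            # Nudge weights toward the correct answer for this example.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Labeled training examples: the "answer key" for logical AND.
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(examples)
print([predict(weights, bias, x) for x, _ in examples])  # [0, 0, 0, 1]
```

After training, the perceptron reproduces every label correctly; real neural networks stack many such units in layers and learn with gradient-based updates instead of this simple rule.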
To illustrate the concept, consider the simple problem of deciding whether an image contains a particular object. While this is fairly easy for a person, it is much harder to train a computer to recognize an object in a picture using classical models. Given the many possible ways an object might appear in an image, writing code to account for every scenario is nearly impossible. But using machine learning, and more precisely neural networks, a program can take a generalized approach to understanding image content. By applying several layers of functions that decompose the picture into data points a computer can use, the neural network begins to identify trends across the many examples it processes and to categorize pictures by their similarities (Tajbakhsh & Suzuki, 2017). When evaluating a new picture, the neural network compares the new picture's data points to the model built from all past evaluations, then uses simple statistics to decide whether the picture contains the object, based on how closely it matches the model. Firms such as Facebook, Google, and Microsoft use neural networks and machine learning algorithms today.
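The final matching step described above can be sketched very simply: build a "model" as the average feature vector of past positive examples, then decide by distance whether a new picture matches it. The feature vectors and the distance threshold below are invented for illustration; real networks learn far richer features than this.

```python
# Compare a new picture's feature vector to a centroid ("model") built
# from past examples; decide by distance whether it contains the object.
import math

def centroid(vectors):
    """Mean feature vector of all past positive examples."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def contains_object(model, features, threshold=1.0):
    """True if the new picture's features lie close to the model."""
    return math.dist(model, features) < threshold

# Hypothetical feature vectors from pictures known to contain the object.
positives = [[0.9, 0.1, 0.8], [1.0, 0.2, 0.7], [0.8, 0.0, 0.9]]
model = centroid(positives)

print(contains_object(model, [0.95, 0.1, 0.8]))  # close match -> True
print(contains_object(model, [0.0, 0.9, 0.1]))   # far away    -> False
```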
Different types of statistical methods applied to Big Data analysis
Current big data methodologies can be loosely categorized into resampling-based methods, divide-and-conquer methods, and online updating. Among the resampling-based methods is the bag of little bootstraps (BLB). BLB provides both point estimates and quality measures such as variance or confidence intervals. It blends subsampling, the m-out-of-n bootstrap, and the bootstrap to achieve computational efficiency. BLB first draws subsamples of a given size from the original data. For each subsample it then draws bootstrap samples of the original data size, rather than the subsample size, and obtains point estimates and quality measures from those bootstrap samples. The per-subsample point estimates and quality measures are then combined to give the overall point estimates and quality measures (Qiu et al., 2016). In summary, BLB has two nested procedures: an inner procedure that applies the bootstrap to a subsample, and an outer procedure that combines the multiple bootstrap estimates. Another resampling-based approach is leveraging, which was proposed to enable scientific discovery from big data using limited computing resources. In leveraging, a small proportion of the data is sampled with carefully chosen weights from the full sample, and the intended computations for the full sample are then carried out on the small subsample as a surrogate. The key to leveraging's success is constructing non-uniform sampling probabilities and weights so that influential data points are sampled with high probability.
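The nested BLB procedure can be sketched in plain Python for the simplest possible estimator, the mean. The simulated data, subsample size, and resample counts are assumptions chosen small enough to run quickly; note that the inner bootstrap resamples have the full data size `n`, not the subsample size.

```python
# Bag of little bootstraps (BLB) for the mean of a large sample.
import random
import statistics

def blb_mean(data, n_subsamples=5, subsample_size=50, n_boot=100, seed=0):
    rng = random.Random(seed)
    n = len(data)  # inner bootstrap resamples use the FULL data size n
    sub_estimates, sub_spreads = [], []
    for _ in range(n_subsamples):
        sub = rng.sample(data, subsample_size)     # outer: subsample
        boot_means = []
        for _ in range(n_boot):                    # inner: bootstrap
            resample = rng.choices(sub, k=n)       # size n, drawn from sub
            boot_means.append(statistics.fmean(resample))
        sub_estimates.append(statistics.fmean(boot_means))
        sub_spreads.append(statistics.stdev(boot_means))
    # Combine: average per-subsample point estimates and quality measures.
    return statistics.fmean(sub_estimates), statistics.fmean(sub_spreads)

rng = random.Random(1)
data = [rng.gauss(10, 2) for _ in range(2000)]     # simulated "big" sample
estimate, spread = blb_mean(data)
print(round(estimate, 2))  # close to the true mean of 10
```

Each inner loop only ever touches the 50 subsampled points, which is what makes BLB computationally cheap on data too large to bootstrap directly.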
The second statistical approach used in big data is divide and conquer. A divide-and-conquer algorithm commonly has three phases: partitioning a big dataset into blocks of a specified size, processing each block separately, and aggregating the per-block solutions into a final answer for the full data. In divide and conquer we use aggregated estimating equations. For a linear regression model, the least-squares estimator of the regression coefficients for the full data can be written as a weighted average of the per-block least-squares estimators, with each weight being the inverse of the estimated variance matrix (L'heureux et al., 2017). The success of this approach for linear regression depends on the linearity of the estimating equations: the estimating equation for the full data is simply the sum of the estimating equations over all blocks.
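The summation property described above can be shown concretely for simple linear regression y = a + b·x. Each block contributes its normal-equation pieces (X'X and X'y); summing them over blocks and solving recovers exactly the full-data least-squares fit. This is an equivalent formulation of the inverse-variance weighted average, and the toy data below is an assumption for illustration.

```python
# Divide-and-conquer least squares for y = a + b*x.

def block_pieces(xs, ys):
    """Normal-equation sums for one block: X'X (2x2) and X'y (2-vector)."""
    n = len(xs)
    sx = sum(xs); sxx = sum(x * x for x in xs)
    sy = sum(ys); sxy = sum(x * y for x, y in zip(xs, ys))
    return [[n, sx], [sx, sxx]], [sy, sxy]

def combine_and_solve(pieces):
    """Sum per-block pieces (the estimating equations are additive),
    then solve the 2x2 system for the intercept a and slope b."""
    A = [[0.0, 0.0], [0.0, 0.0]]
    v = [0.0, 0.0]
    for M, w in pieces:
        for i in range(2):
            v[i] += w[i]
            for j in range(2):
                A[i][j] += M[i][j]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    a = (v[0] * A[1][1] - v[1] * A[0][1]) / det
    b = (A[0][0] * v[1] - A[1][0] * v[0]) / det
    return a, b

# Data generated from y = 1 + 2x, processed in two separate blocks.
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 3, 5, 7, 9, 11]
pieces = [block_pieces(xs[:3], ys[:3]), block_pieces(xs[3:], ys[3:])]
print(combine_and_solve(pieces))  # (1.0, 2.0): intercept and slope recovered
```

Because the blocks are processed independently, this computation parallelizes naturally across machines, which is the practical appeal of divide and conquer for big data.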
Compare different types of databases in terms of data visualization

...
