Sign In
Not register? Register Now!
You are here: HomeResearch PaperSocial Sciences
Pages:
5 pages/≈1375 words
Sources:
3 Sources
Level:
APA
Subject:
Social Sciences
Type:
Research Paper
Language:
English (U.S.)
Document:
MS Word
Date:
Total cost:
$ 21.6
Topic:

What Data Mining Approach should be used? (Research Paper Sample)

Instructions:
This task involved researching the Data mining techniques used in obtaining and analyzing data from many sources to create information that can increase revenues and decrease costs. Data mining is the forerunner of business intelligence. Drawing inferences from information frequently collected in a disorganized and raw condition may be complex. the business intelligence team can use extensive data mining to make inferences. Data mining is a tool used by businesses to turn meaningless data into knowledge. Companies may be capable of better comprehending their clients by looking for trends in vast amounts of data. Companies may boost sales, cut costs, and develop better marketing strategies. The PageRank algorithm, the Apriori algorithm, the kNN method, and the support vector machines algorithm are a few of the most commonly used data mining algorithms. The AdaBoost algorithm, the C4.5 program, the EM algorithm, the PageRank algorithm, and the K-means algorithm will all be presented in this paper as examples of data mining techniques businesses use for business intelligence. source..
Content:
Data Mining Techniques Students Name Institutional Affiliation Introduction Data mining is obtaining and analyzing data from many sources to create information that can increase revenues and decrease costs. Data mining is the forerunner of business intelligence. Drawing inferences from information frequently collected in a disorganized and raw condition may be complex. Parker (2018) asserts that the business intelligence team can use extensive data mining to make inferences. Data mining is a tool used by businesses to turn meaningless data into knowledge. Companies may be capable of better comprehending their clients by looking for trends in vast amounts of data. Companies may boost sales, cut costs, and develop better marketing strategies. The PageRank algorithm, the Apriori algorithm, the kNN method, and the support vector machines algorithm are a few of the most commonly used data mining algorithms. The AdaBoost algorithm, the C4.5 program, the EM algorithm, the PageRank algorithm, and the K-means algorithm will all be presented in this paper as examples of data mining techniques businesses use for business intelligence.  C4.5 Data Mining Algorithm: The decision tree variant of the C4.5 technique is applied to data mining. We can rely on this data mining to support our recommendations for the current situation because the decision tree's output will help us properly articulate the facts. Inferences can be drawn from the data set using this data mining technique because of how the data collection is organized (Witten et al., 2011). This kind of data classifier uses patterns to categorize data, making it possible to extract useful information from the collection. Data collection included sales figures for individual items and market segments. The data is separated by product sales in each region so that we may understand which items work well in which locations for more significant sales. Kmeans Clustering: The KMeans Clustering approach divides objects into groups as dissimilar as possible while being as similar as possible within each group. KMeans Clustering is used to classify similar items or bits of data. The properties of an entity decide what belongs in groups. This strategy can identify underrepresented groups in large, complex data sets and validate business hypotheses about feasible classifications. It identifies previously known data categories. After the algorithm has been run and the groups have been produced, any new data can be easily and quickly assigned to the relevant group. For example, a bank may classify loan applicants as low, medium, or high risk. Additionally, KMeans Clustering can divide a population into groups according to preferences, construct demographic details based on those choices, and identify market trends. After each segment is created, it is possible to send customized advertising messages and even items (Parker, 2018). A corporation is predicted to perform better in the market if it concentrates on higher-quality sectors. EM algorithm: A statistical model called a latent variable model provides the local probabilistic elements of a prediction method. It was created in 1977 by Arthur Dempster, Nan Laird, and Donald Rubin. Machine learning typically uses the term EM (Expectation-Maximization) to construct maximum likelihood estimates of frequently visible variables and sporadically not. Latent data, also called unobserved data, are likewise covered by this. It can be used in applications like data mining and machine learning, among other things, to establish the mode of the bayesian peripheral distribution of attributes. Additionally, it is a method for figuring out maximum likelihood estimation when latent variables are present. The EM approach uses observable data from datasets to identify missing latent variables. There are several uses for the EM method and latent variable model in machine learning. Here are a few illustrations: In statistical genomics and multiple regression analysis, such as the Gaussian Mixture Model, the EM technique is utilized to estimate parameter values. It has to do with data clustering in machine learning and is widely used in computer vision and natural language processing (Witten et al., 2011). It is also used to evaluate item features and latent skills in IRT models in the medical and healthcare sectors, such as camera calibration and structural engineering. AdaBoost Algorithm: The AdaBoost algorithm, commonly referred to as adaptive boosting, is a machine-learning ensemble technique that uses augmenting approach. The process of allocating higher weights to incorrectly categorized occurrences is known as "adaptive boosting," which entails transferring loads to each occurrence. Before applying AdaBoost to any dataset, it should be divided into a test set and a model railway. The training set is eager to test the AdaBoost model after being partitioned into train and test sets. Input and output are both included in this data. Our algorithm will utilize the test data after the training data to try and forecast the outcome (Coenen, 2011). The inputs are the only part of the test data. The test data's outcome is unknown to the model. Depending on the issue statement, this can help us make conclusions about the efficiency of our modeling and the level of precision that can be accepted. If there is a clinical emergency, accuracy should be more than 90%. 70% accuracy is considered good. Aside from model type, several factors affect accuracy. A potent ensemble technique for regression and classification issues is adaptive boosting. It is frequently employed to address classification problems. The following procedures demonstrate that it surpasses all other designs in model coherence. Starting with decision trees and moving on to the random forest are two methods for utilizing the boost and AdaBoost. Accuracy increases as we go through the method defined above. The AdaBoost approach uses a weight-assigning mechanism after each iteration, which sets it apart from other augmenting algorithms. PageRank Algorithm: Google ranks websites in results from search engines using a system called Page Rank. This claim is a play by Larry Page, one of the founders of Google. Web page significance is gauged by page rank. According to Google, Page Rank analyzes the quality and quantity of inbound links to each page to determine the relat...
Get the Whole Paper!
Not exactly what you need?
Do you need a custom essay? Order right now:

Other Topics:

  • Benefit-Cost Analysis FEMA Hazard Mitigation
    Description: Benefit-Cost Analysis FEMA Hazard Mitigation Social Sciences Research Paper...
    6 pages/≈1650 words| 4 Sources | APA | Social Sciences | Research Paper |
  • Ethical Dilemmas
    Description: Ethical Dilemmas Social Sciences Research Paper...
    5 pages/≈1375 words| 10 Sources | APA | Social Sciences | Research Paper |
  • African-American Culture Runs On Solidarity
    Description: African-Americans have had a complicated history in the United States. Despite the abuse the community has suffered over the years, it has been helpful in forming part of the American culture. While a lot could be discussed about this group, the following is a literature review focusing on the African...
    3 pages/≈825 words| 5 Sources | APA | Social Sciences | Research Paper |
Need a Custom Essay Written?
First time 15% Discount!