Big Data R program
Week 14 Assignment – 100 points
Objective: Use Naïve Bayes to predict flight delays.
Given the FlightDelay.csv file, use a Naïve Bayes model to determine whether each flight
is delayed or arrives at its destination on time.
Start by clicking “Install” in your R plot window, then type and install the following
packages, one at a time: naivebayes, dplyr, ggplot2, and psych.
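The same installation can also be done from the R console. A minimal sketch, assuming a CRAN mirror is reachable; the variable names pkgs and missing_pkgs are illustrative only:

```r
# The four packages named in the assignment.
pkgs <- c("naivebayes", "dplyr", "ggplot2", "psych")

# Install only the packages that are not already present.
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```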
After all the packages are installed, load them into memory with these commands:
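A minimal sketch of the loading step, assuming the four packages above are already installed:

```r
# Attach each package named in the assignment to the current session.
library(naivebayes)  # naive_bayes() and its predict() method
library(dplyr)       # data manipulation helpers
library(ggplot2)     # plotting
library(psych)       # descriptive statistics
```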
Next, load the .csv file and check its structure as follows:
setwd("C:/RData") # your working directory
FlightDelay <- read.csv("FlightDelay.csv") # load the file into a data frame
str(FlightDelay) # check the structure and column types of the data
. . . continue from here!
• You need to split your data into training data (tdata) and validation data (vdata).
• Use tdata to build the Naïve Bayes model and use vdata to evaluate its predictions.
• The dependent variable (y) of the model is delay.
• The independent variables are dest, origin, carrier, deptime, weather, and dayweek.
• Show your conclusion.
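The steps listed above can be sketched as follows. This is one possible approach, not a prescribed solution. The helper name run_naive_bayes, the 80/20 split, the seed, and the Laplace smoothing value are all assumptions. The column names are taken from the list above and may need adjusting to match the actual CSV. Every column is treated as categorical, which is also an assumption; deptime in particular could instead be binned or left numeric.

```r
library(naivebayes)

# Hypothetical helper: split the data, fit the model on tdata, evaluate on vdata.
run_naive_bayes <- function(flights, train_frac = 0.8, seed = 1234) {
  # Column names follow the assignment text.
  predictors <- c("dest", "origin", "carrier", "deptime", "weather", "dayweek")
  flights <- flights[, c("delay", predictors)]

  # Assumption: treat every column as categorical. Converting before the
  # split keeps factor levels identical in tdata and vdata.
  flights[] <- lapply(flights, factor)

  # Random split into training (tdata) and validation (vdata) rows.
  set.seed(seed)
  train_idx <- sample(nrow(flights), size = floor(train_frac * nrow(flights)))
  tdata <- flights[train_idx, ]
  vdata <- flights[-train_idx, ]

  # Fit on tdata; laplace = 1 avoids zero probabilities for category
  # values that never occur with a given class in the training rows.
  model <- naive_bayes(delay ~ ., data = tdata, laplace = 1)

  # Predict the class of each validation row and compare with the truth.
  pred <- predict(model, newdata = vdata[, predictors], type = "class")
  confusion <- table(predicted = pred, actual = vdata$delay)
  accuracy <- mean(as.character(pred) == as.character(vdata$delay))

  list(model = model, confusion = confusion, accuracy = accuracy)
}

# Run on the assignment file only if it is in the working directory.
if (file.exists("FlightDelay.csv")) {
  result <- run_naive_bayes(read.csv("FlightDelay.csv"))
  print(result$confusion)  # counts of predicted vs. actual classes
  print(result$accuracy)   # share of validation rows predicted correctly
}
```

The confusion matrix and accuracy on vdata are the natural evidence for the conclusion: compare the accuracy with the share of the majority class in delay to judge whether the model does better than always predicting the most common outcome.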
Mandatory video on Naïve Bayes classification using R programming: