What is Hadoop?
A. Java classes for HDFS types and MapReduce job management and HDFS
B. Java classes for HDFS types and MapReduce job management and the MapReduce paradigm
C. MapReduce paradigm and HDFS
D. MapReduce paradigm and massive unstructured data storage on commodity hardware
Consider these itemsets:
(hat, scarf, coat)
(hat, scarf, coat, gloves)
(hat, scarf, gloves)
(hat, gloves)
(scarf, coat, gloves)
What is the confidence of the rule (hat, scarf) -> gloves?
A. 66%
B. 40%
C. 50%
D. 60%
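The confidence of a rule X -> Y is the number of itemsets containing both X and Y divided by the number containing X. A minimal sketch of that arithmetic, assuming the five itemsets above are the full transaction list (the function name `confidence` is illustrative, not taken from any library):

```python
def confidence(transactions, antecedent, consequent):
    """Confidence of antecedent -> consequent: count(A and C) / count(A)."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Keep only the transactions that contain every antecedent item.
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    if not with_antecedent:
        raise ValueError("antecedent never occurs in the transactions")
    # Of those, count the ones that also contain every consequent item.
    with_both = [t for t in with_antecedent if consequent <= set(t)]
    return len(with_both) / len(with_antecedent)


transactions = [
    ("hat", "scarf", "coat"),
    ("hat", "scarf", "coat", "gloves"),
    ("hat", "scarf", "gloves"),
    ("hat", "gloves"),
    ("scarf", "coat", "gloves"),
]
rule_confidence = confidence(transactions, {"hat", "scarf"}, {"gloves"})
print(f"{rule_confidence:.1%}")  # 2 of the 3 (hat, scarf) itemsets contain gloves
```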
Your customer provided you with 2,000 unlabeled records and asked you to separate them into three groups. What is the correct analytical method to use?
A. K-means clustering
B. Linear regression
C. Naive Bayesian classification
D. Logistic regression
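For reference, grouping unlabeled records into a fixed number of groups can be sketched with a plain k-means loop. The version below is a small illustration only: the farthest-first initialization, the function names, and the nine sample records are assumptions made for the example, not part of the question.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    # Farthest-first initialization (an assumption of this sketch): start at
    # the first point, then keep adding the point farthest from all centroids.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


# Hypothetical unlabeled records forming three loose groups.
records = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (8.0, 8.0), (8.2, 7.9), (7.8, 8.1),
    (1.0, 8.0), (1.1, 8.2), (0.9, 7.9),
]
centroids, labels = kmeans(records, k=3)
print(labels)
```

No labels are supplied anywhere in the input; the group numbers come only from the distances between records.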
What describes a true property of the logistic regression method?
A. It is robust with redundant variables and correlated variables.
B. It handles missing values well.
C. It works well with discrete variables that have many distinct values.
D. It works well with variables that affect the outcome in a discontinuous way.
When is the GROUP BY ROLLUP clause used in an OLAP query?
A. All subtotals and grand totals are to be included in the output
B. Only subtotals are to be included in the output
C. Only grand totals are to be included in the output
D. Specific subtotals and grand totals for a combination of variables are only to be included in the output
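In standard SQL, GROUP BY ROLLUP(d1, ..., dn) aggregates over each prefix of the listed columns: (d1, ..., dn), (d1, ..., dn-1), down to (d1), and finally the empty grouping for the grand total. The sketch below emulates that behavior in plain Python rather than calling a database; the `rollup` helper, the `region`/`product`/`amount` field names, and the sample rows are all hypothetical, and `None` stands in for the NULL that SQL places in rolled-up columns.

```python
from collections import defaultdict


def rollup(rows, dims, measure):
    """Sum `measure` over every prefix of `dims`, as GROUP BY ROLLUP would."""
    totals = defaultdict(int)
    for row in rows:
        # depth == len(dims) is the finest grouping; depth == 0 is the grand total.
        for depth in range(len(dims), -1, -1):
            key = tuple(row[d] for d in dims[:depth]) + (None,) * (len(dims) - depth)
            totals[key] += row[measure]
    return dict(totals)


# Hypothetical sales rows.
sales = [
    {"region": "East", "product": "hat", "amount": 100},
    {"region": "East", "product": "scarf", "amount": 50},
    {"region": "West", "product": "hat", "amount": 70},
]
totals = rollup(sales, dims=["region", "product"], measure="amount")
for key, value in totals.items():
    print(key, value)
```

Note that the result has a subtotal per region but none per product alone, because (product) is not a prefix of (region, product).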
On analyzing your time series data, you suspect that the data represented as y_1, y_2, y_3, ..., y_{n-1}, y_n may have a trend component that is quadratic in nature. Which pattern of data will indicate that the trend in the time series data is quadratic in nature?
A. (y_3 - y_2) - (y_2 - y_1) = ... = (y_n - y_{n-1}) - (y_{n-1} - y_{n-2})
B. (y_2 - y_1) = (y_3 - y_2) = ... = (y_n - y_{n-1})
C. ((y_2 - y_1) / y_1) * 100% = ... = ((y_n - y_{n-1}) / y_{n-1}) * 100%
D. (y_4 - y_2) - (y_3 - y_1) = ... = (y_n - y_{n-2}) - (y_{n-1} - y_{n-3})
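The expressions in the options can be checked numerically by differencing a series. A short sketch, assuming a made-up series generated from t^2 + 3t + 2 (the series and the helper name `differences` are illustrative only):

```python
def differences(values):
    """Differences between consecutive terms: y2 - y1, y3 - y2, ..."""
    return [b - a for a, b in zip(values, values[1:])]


# Hypothetical series with an exactly quadratic trend.
series = [t * t + 3 * t + 2 for t in range(1, 8)]

first = differences(series)   # first differences
second = differences(first)   # differences of the first differences

print(series)  # [6, 12, 20, 30, 42, 56, 72]
print(first)   # [6, 8, 10, 12, 14, 16]
print(second)  # [2, 2, 2, 2, 2]
```

For this series the first differences grow linearly while the second differences are constant; on real data the equality would hold only approximately because of noise.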
If R factors are categorical variables, to which data classification level are they most closely related?
A. Nominal
B. Ordinal
C. Interval
D. Ratio
What is one modeling or descriptive statistical function in MADlib that is typically not provided in a standard relational database?
A. Linear regression
B. Expected value
C. Variance
D. Quantiles
Which of the following is an example of quasi-structured data?
A. OLAP
B. OLTP
C. Customer record table
D. Clickstream data
Consider the following itemsets:
(hat, scarf, coat)
(hat, scarf, coat, gloves)
(hat, scarf, gloves)
(hat, gloves)
(scarf, coat, gloves)
What is the confidence of the rule (hat, scarf) => gloves?
A. 40%
B. 50%
C. 60%
D. 66%