
DSA-C02 Online Practice Questions and Answers

Questions 4

Which of the following processes best covers all of the following characteristics?

Collecting descriptive statistics like min, max, count and sum.

Collecting data types, length and recurring patterns.

Tagging data with keywords, descriptions or categories.

Performing data quality assessment, risk of performing joins on the data.

Discovering metadata and assessing its accuracy.

Identifying distributions, key candidates, foreign-key candidates, functional dependencies, embedded value dependencies, and performing inter-table analysis.

A. Data Visualization

B. Data Virtualization

C. Data Profiling

D. Data Collection

Correct Answer: C

Explanation:

Data processing and analysis cannot happen without data profiling, that is, reviewing source data for content and quality. As data gets bigger and infrastructure moves to the cloud, data profiling is increasingly important.

What is data profiling?

Data profiling is the process of reviewing source data, understanding structure, content and interrelationships, and identifying potential for data projects.

Data profiling is a crucial part of:

Data warehouse and business intelligence (DW/BI) projects: data profiling can uncover data quality issues in data sources, and what needs to be corrected in ETL.

Data conversion and migration projects: data profiling can identify data quality issues, which you can handle in scripts and data integration tools copying data from source to target. It can also uncover new requirements for the target system.

Source system data quality projects: data profiling can highlight data which suffers from serious or numerous quality issues, and the source of the issues (e.g. user inputs, errors in interfaces, data corruption).

Data profiling involves:

Collecting descriptive statistics like min, max, count and sum.

Collecting data types, length and recurring patterns.

Tagging data with keywords, descriptions or categories.

Performing data quality assessment, risk of performing joins on the data.

Discovering metadata and assessing its accuracy.

Identifying distributions, key candidates, foreign-key candidates, functional dependencies, embedded value dependencies, and performing inter-table analysis.
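
A few of the profiling steps listed above can be sketched in plain Python. This is only an illustration, not a full profiling tool: the sample rows, the column names and the helper profile_column are hypothetical, and "key candidate" is simplified here to "no nulls and no repeated values".

```python
# Hypothetical sample data; None stands for a missing value.
rows = [
    {"id": 1, "city": "Oslo", "amount": 10.0},
    {"id": 2, "city": "Bergen", "amount": 25.5},
    {"id": 3, "city": "Oslo", "amount": None},
    {"id": 4, "city": "Oslo", "amount": 4.5},
]


def profile_column(rows, column):
    """Return a small profile (counts, types, simple statistics) for one column."""
    values = [row[column] for row in rows]
    present = [v for v in values if v is not None]
    distinct = set(present)
    profile = {
        "count": len(present),
        "nulls": len(values) - len(present),
        "types": sorted({type(v).__name__ for v in present}),
        "max_length": max((len(str(v)) for v in present), default=0),
        "distinct": len(distinct),
        # Simplified rule: a key candidate has no nulls and no repeated values.
        "key_candidate": len(present) == len(values) == len(distinct),
    }
    # Descriptive statistics only make sense for purely numeric columns.
    if present and all(isinstance(v, (int, float)) for v in present):
        profile.update(min=min(present), max=max(present), sum=sum(present))
    return profile


profiles = {column: profile_column(rows, column) for column in rows[0]}

for column, profile in profiles.items():
    print(column, profile)
```

In this sample, id comes out as a key candidate, city does not because "Oslo" repeats, and amount reports one null alongside its min, max and sum.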

Questions 5

Which command is used to install Jupyter Notebook?

A. pip install jupyter

B. pip install notebook

C. pip install jupyter-notebook

D. pip install nbconvert

Correct Answer: A

Explanation:

Jupyter Notebook is a web-based interactive computational environment. The command used to install Jupyter Notebook is pip install jupyter. The command used to start Jupyter Notebook is jupyter notebook.

Questions 6

Which Python method can a data scientist use to remove duplicates?

A. remove_duplicates()

B. duplicates()

C. drop_duplicates()

D. clean_duplicates()

Correct Answer: C

Explanation:

The pandas drop_duplicates() method removes duplicate rows. Its signature is dataframe.drop_duplicates(subset, keep, inplace, ignore_index). For example, to remove duplicate rows from a DataFrame:

import pandas as pd

# The second "Mary" row repeats every value of the first, so it is a duplicate.
data = {
    "name": ["Peter", "Mary", "John", "Mary"],
    "age": [50, 40, 30, 40],
    "qualified": [True, False, False, False],
}

df = pd.DataFrame(data)

# By default the first occurrence is kept and later duplicates are dropped.
newdf = df.drop_duplicates()

Questions 7

As a data scientist looking to use a reader account, which of the following are correct considerations about reader accounts for third-party access?

A. Reader accounts (formerly known as "read-only accounts") provide a quick, easy, and cost-effective way to share data without requiring the consumer to become a Snowflake customer.

B. Each reader account belongs to the provider account that created it.

C. Users in a reader account can query data that has been shared with the reader account, but cannot perform any of the DML tasks that are allowed in a full account, such as data loading, insert, update, and similar data manipulation operations.

D. Data sharing is only possible between Snowflake accounts.

Correct Answer: D

Explanation:

Data sharing is only supported between Snowflake accounts. As a data provider, you might want to share data with a consumer who does not already have a Snowflake account or is not ready to become a licensed Snowflake customer. To facilitate sharing data with these consumers, you can create reader accounts. Reader accounts (formerly known as "read-only accounts") provide a quick, easy, and cost-effective way to share data without requiring the consumer to become a Snowflake customer.

Each reader account belongs to the provider account that created it. As a provider, you use shares to share databases with reader accounts; however, a reader account can only consume data from the provider account that created it. So a provider can share data with consumers who are not Snowflake customers by way of a reader account.

Questions 8

Which method is used for detecting data outliers in Machine learning?

A. Scaler

B. Z-Score

C. BOXI

D. CMIYC

Correct Answer: B

Explanation:

What are outliers?

Outliers are values that look different from the other values in the data. They can appear at both extremes of the data.

Reasons for outliers in data

Errors during data entry or a faulty measuring device (a faulty sensor may result in extreme readings).

Natural occurrence (salaries of junior-level employees vs C-level employees).

Problems caused by outliers

Outliers in the data may cause problems during model fitting (especially for linear models). Outliers may also inflate error metrics that give higher weight to large errors (for example, mean squared error and RMSE).

The Z-score method is one of the methods for detecting outliers. It is generally used when a variable's distribution looks close to Gaussian. The Z-score is the number of standard deviations a value of a variable is away from the variable's mean.

Z-Score = (X - mean) / Standard deviation

The IQR method and box plots are further examples of methods used to detect outliers in data science.
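
The Z-score formula above can be applied with a short sketch like the following. It uses only the Python standard library; the sample readings, the helper name zscore_outliers and the cutoff of 3 standard deviations are assumptions made for illustration (3 is a common convention, not something the text above prescribes).

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute Z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [x for x in values if abs((x - mu) / sigma) > threshold]


# Hypothetical sensor readings with one extreme value.
readings = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 95]
outliers = zscore_outliers(readings, threshold=3.0)
print(outliers)
```

With very small samples a single extreme value also inflates the standard deviation, so it may never reach a cutoff of 3; a lower threshold or the IQR method may be a better fit in that case.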

Questions 9

Consider a data frame df with 10 rows and index ['r1', 'r2', 'r3', 'row4', 'row5', 'row6', 'r7', 'r8', 'r9', 'row10']. What does the aggregate method shown in the code below do?

g = df.groupby(df.index.str.len())

g.aggregate({'A':len, 'B':np.sum})

A. Computes Sum of column A values

B. Computes length of column A

C. Computes length of column A and Sum of Column B values of each group

D. Computes length of column A and Sum of Column B values

Correct Answer: C

Explanation:

The code groups the rows by the length of their index label, then computes the length of column A and the sum of column B values for each group.
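
A small sketch of this behaviour, assuming pandas is installed. The column values are made up for illustration, and the string "sum" is used in place of np.sum from the question; both request the per-group sum.

```python
import pandas as pd

index = ["r1", "r2", "r3", "row4", "row5", "row6", "r7", "r8", "r9", "row10"]

# Hypothetical column values; the question does not specify them.
df = pd.DataFrame(
    {"A": list(range(10)), "B": list(range(10, 110, 10))},
    index=index,
)

# Group rows by the length of their index label: 2, 4 or 5 characters.
g = df.groupby(df.index.str.len())

# Per group: number of rows in column A and sum of column B.
result = g.aggregate({"A": len, "B": "sum"})
print(result)
```

The result has one row per label length: six labels of length 2, three of length 4 and one of length 5.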

Questions 10

Which of the following is a Python-based web application framework for visualizing data and analyzing results in a more efficient and flexible way?

A. StreamBI

B. Streamlit

C. Streamsets

D. Rapter

Correct Answer: B

Explanation:

Streamlit is a Python-based web application framework for visualizing data and analyzing results in a more efficient and flexible way. It is an open source library that helps data scientists and academics develop machine learning (ML) visualization dashboards in a short period of time. We can build and deploy powerful data applications with just a few lines of code.

Why Streamlit?

Real-world applications are in high demand, and developers keep creating new libraries and frameworks to make on-the-go dashboards easier to build and deploy. Streamlit is a library that can reduce dashboard development time from days to hours. Following are some reasons to choose Streamlit:

It is a free and open-source library.

Installing Streamlit is as simple as installing any other Python package.

It is easy to learn: no web development experience is needed, and a basic understanding of Python is enough to build a data application.

It is compatible with almost all machine learning frameworks, including TensorFlow, PyTorch and scikit-learn, and with visualization libraries such as Seaborn, Altair, Plotly, and many others.

Questions 11

In a simple linear regression model (one independent variable), if we change the input variable by 1 unit, how much will the output variable change?

A. by 1

B. no change

C. by intercept

D. by its slope

Correct Answer: D

Explanation:

What is linear regression?

Linear regression analysis is used to predict the value of a variable based on the value of another variable. The variable you want to predict is called the dependent variable. The variable you are using to predict the other variable's value is called the independent variable.

Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data. One variable is considered to be an explanatory variable, and the other is considered to be a dependent variable. For example, a modeler might want to relate the weights of individuals to their heights using a linear regression model. A linear regression line has an equation of the form Y = a + bX, where X is the explanatory variable and Y is the dependent variable. The slope of the line is b, and a is the intercept (the value of Y when X = 0).

For linear regression, Y = a + bX + error.

Neglecting the error term, Y = a + bX. If X increases by 1, the new value is a + b(X + 1) = (a + bX) + b, which is the old value plus b. So Y changes by b, the slope.

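
The same point can be checked numerically. The sketch below fits a line by ordinary least squares in plain Python and compares two predictions one unit apart; the height and weight figures and the helper names fit_line and predict are hypothetical and chosen only for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b


# Hypothetical data: heights in cm and weights in kg.
heights = [150, 160, 170, 180, 190]
weights = [50, 58, 66, 74, 82]

a, b = fit_line(heights, weights)


def predict(x):
    return a + b * x


# Increasing the input by 1 unit changes the prediction by the slope.
change = predict(171) - predict(170)
print(b, change)
assert math.isclose(change, b)
```

The difference does not depend on the starting point: any two inputs one unit apart give predictions that differ by b.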

Questions 12

Which of the following are not types of window functions in Snowflake? Choose 2.

A. Rank-related functions.

B. Window frame functions.

C. Aggregation window functions.

D. Association functions.

Correct Answer: CD

Explanation:

Window Functions

A window function operates on a group ("window") of related rows. Each time a window function is called, it is passed a row (the current row in the window) and the window of rows that contain the current row. The window function returns one output row for each input row. The output depends on the individual row passed to the function and the values of the other rows in the window passed to the function. Some window functions are order-sensitive. There are two main types of order-sensitive window functions:

Rank-related functions.

Window frame functions.

Rank-related functions list information based on the "rank" of a row. For example, if you rank stores in descending order by profit per year, the store with the most profit will be ranked 1; the second-most profitable store will be ranked 2, etc.

Window frame functions allow you to perform rolling operations, such as calculating a running total or a moving average, on a subset of the rows in the window.
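
What these two kinds of functions compute can be illustrated outside SQL with a plain-Python sketch. This is not Snowflake syntax; the store names and profit figures are hypothetical, ties in the ranking are ignored for simplicity, and the moving average is assumed to cover the current row and the one before it.

```python
from itertools import accumulate

# Rank-related: rank stores in descending order by profit.
store_profit = {"north": 120, "south": 300, "east": 200}
ordered = sorted(store_profit, key=store_profit.get, reverse=True)
rank = {store: position for position, store in enumerate(ordered, start=1)}

# Window frame: rolling operations over rows in a fixed order (for example, by year).
yearly_profit = [100, 150, 90, 200]

# Running total: each row sees all rows up to and including itself.
running_total = list(accumulate(yearly_profit))

# Moving average: each row sees itself and the one preceding row.
moving_average = []
for i in range(len(yearly_profit)):
    frame = yearly_profit[max(0, i - 1): i + 1]
    moving_average.append(sum(frame) / len(frame))

print(rank)
print(running_total)
print(moving_average)
```

In both cases there is one output value per input row, which matches the description of window functions above.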

Questions 13

Which of the following is a common evaluation metric for binary classification?

A. Accuracy

B. F1 score

C. Mean squared error (MSE)

D. Area under the ROC curve (AUC)

Correct Answer: D

Explanation:

The area under the ROC curve (AUC) is a common evaluation metric for binary classification, which measures the performance of a classifier at different threshold values for the predicted probabilities. Other common metrics include accuracy, precision, recall, and F1 score, which are based on the confusion matrix of true positives, false positives, true negatives, and false negatives.
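
These metrics can be sketched in plain Python as follows. The labels and scores are made up, the 0.5 decision threshold is an assumption, and AUC is computed through its pairwise interpretation (the fraction of positive/negative pairs in which the positive example gets the higher score, with ties counted as half) rather than by tracing the ROC curve.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical true labels and predicted probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc_value = auc(labels, scores)

# Confusion-matrix metrics need a fixed threshold; 0.5 is assumed here.
predictions = [1 if s >= 0.5 else 0 for s in scores]
tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)

accuracy = (tp + tn) / len(labels)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(auc_value, accuracy, precision, recall, f1)
```

AUC is computed from the scores alone, while accuracy, precision, recall and F1 change whenever the threshold changes.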

Exam Code: DSA-C02
Exam Name: SnowPro Advanced: Data Scientist Certification (DSA-C02)
Last Update: May 26, 2026
Questions: 65
