DATABRICKS-CERTIFIED-DATA-ENGINEER-ASSOCIATE Online Practice Questions and Answers

Question 4

A data engineer is attempting to drop a Spark SQL table my_table and runs the following command:

DROP TABLE IF EXISTS my_table;

After running this command, the engineer notices that the data files and metadata files have been deleted from the file system.

Which of the following describes why all of these files were deleted?

A. The table was managed

B. The table's data was smaller than 10 GB

C. The table's data was larger than 10 GB

D. The table was external

E. The table did not have a location

Correct Answer: A

Explanation: For a managed table, both the data files and the metadata are managed by the metastore, so dropping the table deletes both. For an external table, the data is stored in an external location that the metastore does not manage, so dropping the table removes only the metadata and the data files remain.
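
The difference between dropping a managed table and dropping an external table can be modeled in a few lines of Python. This is only a toy sketch: `ToyMetastore`, its methods, the table `ext_table`, and the paths are hypothetical names invented for illustration, not part of Spark or Databricks.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMetastore:
    """Toy stand-in for a metastore; not a Spark or Databricks API."""
    tables: dict = field(default_factory=dict)   # table name -> (kind, path)
    storage: dict = field(default_factory=dict)  # path -> list of data files

    def create_table(self, name, path, managed, files):
        self.tables[name] = ("managed" if managed else "external", path)
        self.storage[path] = list(files)

    def drop_table_if_exists(self, name):
        entry = self.tables.pop(name, None)  # metadata is removed in both cases
        if entry is None:
            return  # IF EXISTS: dropping a missing table is a no-op
        kind, path = entry
        if kind == "managed":
            # Managed: the metastore owns the data, so the files are deleted too.
            self.storage.pop(path, None)
        # External: the files at `path` are left untouched.


metastore = ToyMetastore()
metastore.create_table("my_table", "/warehouse/my_table", managed=True,
                       files=["part-0.parquet"])
metastore.create_table("ext_table", "/mnt/raw/ext_table", managed=False,
                       files=["part-0.parquet"])
metastore.drop_table_if_exists("my_table")
metastore.drop_table_if_exists("ext_table")
```

After both drops, neither table is registered any more, but only the external table's path still holds its data file.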

Question 5

A data engineer is working with two tables. Each of these tables is displayed below in its entirety.

The data engineer runs the following query to join these tables together:

Which of the following will be returned by the above query?

A. Option A

B. Option B

C. Option C

D. Option D

E. Option E

Correct Answer: C

Question 6

A data engineering team has noticed that their Databricks SQL queries are running too slowly when they are submitted to a non-running SQL endpoint. The data engineering team wants this issue to be resolved.

Which of the following approaches can the team use to reduce the time it takes to return results in this scenario?

A. They can turn on the Serverless feature for the SQL endpoint and change the Spot Instance Policy to "Reliability Optimized."

B. They can turn on the Auto Stop feature for the SQL endpoint.

C. They can increase the cluster size of the SQL endpoint.

D. They can turn on the Serverless feature for the SQL endpoint.

E. They can increase the maximum bound of the SQL endpoint's scaling range

Correct Answer: D

Explanation: https://www.databricks.com/blog/2022/03/10/top-5-databricks-performance-tips.html

Question 7

A dataset has been defined using Delta Live Tables and includes an expectations clause:

CONSTRAINT valid_timestamp EXPECT (timestamp > '2020-01-01') ON VIOLATION DROP ROW

What is the expected behavior when a batch of data containing data that violates these constraints is processed?

A. Records that violate the expectation are dropped from the target dataset and loaded into a quarantine table.

B. Records that violate the expectation are added to the target dataset and flagged as invalid in a field added to the target dataset.

C. Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log.

D. Records that violate the expectation are added to the target dataset and recorded as invalid in the event log.

E. Records that violate the expectation cause the job to fail.

Correct Answer: C

Explanation: With the defined constraint and expectation clause, when a batch of data is processed, any records that violate the expectation (in this case, where the timestamp is not greater than '2020-01-01') will be dropped from the target dataset. These dropped records will also be recorded as invalid in the event log, allowing for auditing and tracking of the data quality issues without causing the entire job to fail. https://docs.databricks.com/en/delta-live-tables/expectations.html
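
The drop-and-record behavior of `ON VIOLATION DROP ROW` can be imitated in plain Python. This is a minimal sketch and not the Delta Live Tables API: the function `apply_drop_row_expectation`, the sample batch, and the metric names `passed_records` and `dropped_records` are assumptions made for illustration.

```python
from datetime import datetime


def apply_drop_row_expectation(records, predicate):
    """Keep records that satisfy the predicate; count the rest instead of failing."""
    kept = []
    dropped_count = 0
    for record in records:
        if predicate(record):
            kept.append(record)
        else:
            dropped_count += 1  # violation: dropped from the target, only counted
    # Hypothetical stand-in for the data quality metrics kept in an event log.
    metrics = {"passed_records": len(kept), "dropped_records": dropped_count}
    return kept, metrics


cutoff = datetime(2020, 1, 1)
batch = [
    {"id": 1, "timestamp": datetime(2021, 5, 3)},
    {"id": 2, "timestamp": datetime(2019, 12, 31)},
    {"id": 3, "timestamp": datetime(2020, 1, 1)},  # equal to the cutoff, not greater
]

# Mirrors EXPECT (timestamp > '2020-01-01'): a strict comparison.
target, metrics = apply_drop_row_expectation(
    batch, lambda record: record["timestamp"] > cutoff
)
```

Only the record with `id` 1 reaches `target`. The other two are counted in `metrics` and processing continues without raising an error, which is the behavior that distinguishes dropping rows from failing the update.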

Question 8

A data engineer wants to schedule their Databricks SQL dashboard to refresh once per day, but they only want the associated SQL endpoint to be running when it is necessary.

Which of the following approaches can the data engineer use to minimize the total running time of the SQL endpoint used in the refresh schedule of their dashboard?

A. They can ensure the dashboard's SQL endpoint matches each of the queries' SQL endpoints.

B. They can set up the dashboard's SQL endpoint to be serverless.

C. They can turn on the Auto Stop feature for the SQL endpoint.

D. They can reduce the cluster size of the SQL endpoint.

E. They can ensure the dashboard's SQL endpoint is not one of the included query's SQL endpoint.

Correct Answer: C

Question 9

Which of the following describes a scenario in which a data engineer will want to use a single-node cluster?

A. When they are working interactively with a small amount of data

B. When they are running automated reports to be refreshed as quickly as possible

C. When they are working with SQL within Databricks SQL

D. When they are concerned about the ability to automatically scale with larger data

E. When they are manually running reports with a large amount of data

Correct Answer: A

Explanation: A Single Node cluster is a cluster consisting of an Apache Spark driver and no Spark workers. A Single Node cluster supports Spark jobs and all Spark data sources, including Delta Lake. A Standard cluster requires a minimum of one Spark worker to run Spark jobs.

Question 10

A data engineer has been given a new record of data:

id STRING = 'a1'

rank INTEGER = 6

rating FLOAT = 9.4

Which of the following SQL commands can be used to append the new record to an existing Delta table my_table?

A. INSERT INTO my_table VALUES ('a1', 6, 9.4)

B. my_table UNION VALUES ('a1', 6, 9.4)

C. INSERT VALUES ( 'a1' , 6, 9.4) INTO my_table

D. UPDATE my_table VALUES ('a1', 6, 9.4)

E. UPDATE VALUES ('a1', 6, 9.4) my_table

Correct Answer: A
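
The `INSERT INTO ... VALUES` form is standard SQL, so its append behavior can be tried outside Databricks. The sketch below uses Python's built-in `sqlite3` module as a stand-in for a Delta table. The column types (`TEXT`, `REAL`) are SQLite approximations of `STRING` and `FLOAT`, and the pre-existing row `'z9'` is a hypothetical value added only to show that the new record is appended rather than replacing anything.

```python
import sqlite3

# In-memory SQLite database standing in for the existing table (an assumption;
# the question itself concerns a Delta table on Databricks).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id TEXT, rank INTEGER, rating REAL)")
conn.execute("INSERT INTO my_table VALUES ('z9', 1, 7.5)")  # hypothetical existing row

# The statement from option A: appends one record, positionally matched to columns.
conn.execute("INSERT INTO my_table VALUES ('a1', 6, 9.4)")

rows = conn.execute(
    "SELECT id, rank, rating FROM my_table ORDER BY rowid"
).fetchall()
conn.close()
```

`rows` now holds both the original row and the appended one. On Databricks the same statement could be run in a SQL cell or passed as a string to `spark.sql`.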

Question 11

A data engineer who is new to Python needs to create a Python function that adds two integers together and returns the sum. Which of the following code blocks can the data engineer use to complete this task?

A. Option A

B. Option B

C. Option C

D. Option D

E. Option E

Correct Answer: D

Explanation: https://www.w3schools.com/python/python_functions.asp
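
The answer options for this question were images and are not reproduced on this page, so the following is not a reconstruction of Option D. It is only a minimal sketch of the kind of function the question describes; the name `add_integers` and its parameter names are assumptions.

```python
def add_integers(x: int, y: int) -> int:
    """Return the sum of two integers."""
    # `def` declares the function; `return` hands the result back to the caller.
    # Printing the sum instead of returning it would make the function return None.
    return x + y


total = add_integers(2, 3)
```

The points usually tested by this kind of question are the `def` keyword, the colon and indentation, and the use of `return` rather than `print`.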

Question 12

A data engineer needs access to a table new_table, but they do not have the correct permissions. They can ask the table owner for permission, but they do not know who the table owner is. Which of the following approaches can be used to identify the owner of new_table?

A. Review the Permissions tab in the table's page in Data Explorer

B. All of these options can be used to identify the owner of the table

C. Review the Owner field in the table's page in Data Explorer

D. Review the Owner field in the table's page in the cloud storage solution

E. There is no way to identify the owner of the table

Correct Answer: C

Question 13

In which of the following file formats is data from Delta Lake tables primarily stored?

A. Delta

B. CSV

C. Parquet

D. JSON

E. A proprietary, optimized format specific to Databricks

Correct Answer: C

Explanation: https://docs.delta.io/latest/delta-faq.html

Exam Code: DATABRICKS-CERTIFIED-DATA-ENGINEER-ASSOCIATE
Exam Name: Databricks Certified Data Engineer Associate
Last Update: Jun 08, 2025
Questions: 132

2025 Copyright @ exam2pass.com All trademarks are the property of their respective vendors. We are not associated with any of them.