DEA
Data Processing & Transformations
medium
Question 14 of 62

A data engineer has a PySpark DataFrame orders and wants to calculate the total order amount grouped by customer_id. Which PySpark code correctly achieves this?

A. orders.select(col("customer_id"), sum(col("amount")))
B. orders.groupBy("customer_id").agg(sum(col("amount")).alias("total_amount"))
C. orders.filter(col("customer_id")).count()
D. orders.withColumn("total_amount", sum(col("amount")))
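
To make the grouped-sum semantics concrete, here is a minimal plain-Python sketch of what a per-customer total computes: rows are bucketed by customer_id and the amount values in each bucket are added, giving one output row per customer. This is the result the groupBy(...).agg(sum(...)) pattern in option B describes, assuming sum and col there come from pyspark.sql.functions. The sample rows and the helper name total_amount_by_customer are hypothetical and used only for illustration; this is not PySpark itself.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the `orders` DataFrame.
orders = [
    {"customer_id": "c1", "amount": 10.0},
    {"customer_id": "c2", "amount": 5.0},
    {"customer_id": "c1", "amount": 2.5},
]


def total_amount_by_customer(rows):
    """Return one row per customer_id with the summed amount.

    Plain-Python analogue of the PySpark pattern in option B:
        orders.groupBy("customer_id").agg(sum(col("amount")).alias("total_amount"))
    """
    totals = defaultdict(float)
    for row in rows:
        # Accumulate each order's amount into its customer's bucket.
        totals[row["customer_id"]] += row["amount"]
    # Sort by customer_id so the output order is deterministic.
    return [
        {"customer_id": cid, "total_amount": total}
        for cid, total in sorted(totals.items())
    ]


result = total_amount_by_customer(orders)
for row in result:
    print(row)
```

The key property is that the output has one row per distinct customer_id rather than one row per order, which is what distinguishes a grouped aggregation from a row-wise column expression.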

Educational Content: CertQnA practice questions are written against official exam objectives, covering the same domains tested on the real exam. All content is original and independent. These are not actual exam questions, and CertQnA is not affiliated with any certification vendor.
