Let’s say we have an input dataframe with 3 columns (user: Int, item: String, and purchased: Int), as shown below:
+----+----+---------+
|user|item|purchased|
+----+----+---------+
|   1|   A|        1|
|   1|   B|        2|
|   2|   A|        3|
|   2|   C|        4|
|   3|   A|        3|
|   3|   B|        2|
|   3|   D|        6|
+----+----+---------+
Task: We need to produce an output dataframe that looks like this:
+----+----+---------+
|user|item|purchased|
+----+----+---------+
|   1|   A|        1|
|   1|   B|        2|
|   1|   C|        0|
|   1|   D|        0|
|   2|   A|        3|
|   2|   B|        0|
|   2|   C|        4|
|   2|   D|        0|
|   3|   A|        3|
|   3|   B|        2|
|   3|   C|        0|
|   3|   D|        6|
+----+----+---------+
The output dataframe should be calculated as follows:
Every possible combination of (user, item) must appear in the output dataframe. If purchased is missing for a combination of (user, item), treat it as 0.
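To make the rule concrete, here is a minimal sketch of the logic in plain Python, with rows held as tuples rather than in a real dataframe. The function name fill_missing_combinations is made up for illustration, and the sketch assumes each (user, item) pair appears at most once in the input. In a dataframe library, the same three steps would typically be a cross join of the distinct users and distinct items, a left join back to the input, and filling the resulting nulls with 0.

```python
from itertools import product

# Input rows as (user, item, purchased), copied from the example table.
rows = [
    (1, "A", 1), (1, "B", 2),
    (2, "A", 3), (2, "C", 4),
    (3, "A", 3), (3, "B", 2), (3, "D", 6),
]


def fill_missing_combinations(rows):
    """Return one row per (user, item) pair, using 0 where no purchase exists.

    Hypothetical helper; assumes each (user, item) pair occurs at most once.
    """
    # Step 1: collect the distinct users and the distinct items.
    users = sorted({user for user, _, _ in rows})
    items = sorted({item for _, item, _ in rows})
    # Step 2: index the known purchases by (user, item).
    purchased = {(user, item): count for user, item, count in rows}
    # Step 3: walk every combination and default missing ones to 0.
    return [(u, i, purchased.get((u, i), 0)) for u, i in product(users, items)]


result = fill_missing_combinations(rows)
for row in result:
    print(row)
```

With 3 users and 4 items, the result has 12 rows, one for each combination, in the same order as the expected output table.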
Hope you understand the question!
Happy Learning!