If you’d like another practice problem to get ready for your perfect 10 on tomorrow’s quiz, go to ChatGPT (or another generative AI) and paste in this prompt:
here's a dataset my class used to help us learn how to perform decision
tree induction:
caffeine_cups,free_hours_sat,dorm,attends_movie
0,6,Eagle,Yes
4,5,Jefferson,Yes
2,4,Randolph,No
1,7,Eagle,Yes
2,3,Jefferson,No
3,2,Randolph,No
2,1,Randolph,Yes
5,8,Eagle,Yes
0,1,Jefferson,No
3,4,Randolph,No
1,6,Jefferson,Yes
we were allowed to split each numeric attribute only once (splitting it into
a low group and a high group), considering all possible split points. the
target label is the last column (attends_movie). the decision tree algorithm
we used is greedy, and uses "# of examples correct if we stop branching
there" as the metric for determining what should go at a node (not entropy
or information gain).
for the example above, it turned out that free_hours_sat, split between
1-4 and 5-8, was the best feature to put at the root of the tree, since
that split got 10 out of 11 examples correct with just one feature.
can you please make up another dataset for me so I can practice this kind
of decision tree induction for an upcoming quiz? also tell me what "the
right answer" is (i.e., what node should be at the root, at each of the
branches, etc., when using the greedy decision tree induction algorithm).
Then, see how well you do on whatever data it gives you.
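If you'd like to double-check the AI's "right answer" (and your own work), the root-selection step is short enough to script. Here is a minimal sketch in Python of the greedy metric from the prompt: for every candidate split, predict the majority label in each branch and count how many examples that gets right. The high/low handling of numeric attributes follows the prompt; treating the categorical dorm attribute as one branch per value is an assumption about how the class handled it.

from collections import Counter

# the class's dataset, transcribed from the prompt above
data = [
    (0, 6, "Eagle", "Yes"),
    (4, 5, "Jefferson", "Yes"),
    (2, 4, "Randolph", "No"),
    (1, 7, "Eagle", "Yes"),
    (2, 3, "Jefferson", "No"),
    (3, 2, "Randolph", "No"),
    (2, 1, "Randolph", "Yes"),
    (5, 8, "Eagle", "Yes"),
    (0, 1, "Jefferson", "No"),
    (3, 4, "Randolph", "No"),
    (1, 6, "Jefferson", "Yes"),
]
features = ["caffeine_cups", "free_hours_sat", "dorm"]

def correct_if_stop(groups):
    # "# of examples correct if we stop branching there":
    # each branch predicts its majority label
    return sum(max(Counter(labels).values()) for labels in groups if labels)

best = None
for i, name in enumerate(features):
    values = sorted({row[i] for row in data})
    if isinstance(values[0], str):
        # categorical attribute: one branch per value (an assumption)
        groups = [[row[-1] for row in data if row[i] == v] for v in values]
        candidates = [(correct_if_stop(groups), name)]
    else:
        # numeric attribute: try every split point between adjacent values
        candidates = []
        for lo, hi in zip(values, values[1:]):
            cut = (lo + hi) / 2
            low = [row[-1] for row in data if row[i] <= cut]
            high = [row[-1] for row in data if row[i] > cut]
            candidates.append((correct_if_stop([low, high]), f"{name} <= {cut}"))
    for score, description in candidates:
        if best is None or score > best[0]:
            best = (score, description)

print(f"best root: {best[1]} ({best[0]}/{len(data)} correct)")

On the class's data this prints "best root: free_hours_sat <= 4.5 (10/11 correct)", matching the prompt's claim that splitting between 1-4 and 5-8 is best. Swap in whatever dataset the AI generates and you can grade both its "right answer" for the root and your own.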