(23 points) This question considers random forests (RFs).
(a) (5 points) In the context of classification, clearly describe how Forest-RI (random input selection) RFs are trained and how prediction is done on a test point. Your answer can assume the use of the CART methodology without describing that methodology.
(b) (3 points) Briefly describe the three key parameters in Forest-RI RFs: d, the tree depth; m, the number of attributes randomly selected as candidates for each split; and T, the total number of trees.
(c) (8 points) RFs are built by bootstrap sampling, i.e., given an original set of n samples, the bootstrapped sample is obtained by sampling with replacement n times. Assuming n is large, what is the expected number of unique samples from the original set of n samples in the bootstrapped sample?
(d) (5 points) Professor Very Random Forest claims to have a brilliant idea to make Forest-RI RFs more powerful: since RFs prefer trees that are diverse, i.e., not strongly correlated, Professor Forest proposes setting m=1 for Forest-RI, where m is the number of random features considered at each node of each decision tree. Professor Forest claims that this will improve accuracy while reducing variance. Do you agree with Professor Forest’s claims? Clearly explain your answer.
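
For reference, the bootstrap procedure described in part (c) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the question: the function names bootstrap_sample and count_unique, the seed, and the value n = 10,000 are all assumptions chosen for the example. It draws one bootstrapped sample and counts how many distinct original points it contains, which gives an empirical check against the analytical answer.

```python
import random


def bootstrap_sample(n, rng):
    """Draw n indices from range(n) uniformly at random, WITH replacement.

    Each index stands for one point of the original sample of size n.
    (Hypothetical helper, named here for illustration only.)
    """
    return [rng.randrange(n) for _ in range(n)]


def count_unique(sample):
    """Number of distinct original points that appear in the sample."""
    return len(set(sample))


# Assumed values for the illustration: a fixed seed and a "large" n.
rng = random.Random(0)
n = 10_000

sample = bootstrap_sample(n, rng)
unique = count_unique(sample)

print(f"n = {n}, unique original points in the bootstrapped sample = {unique}")
print(f"fraction of original points that appear = {unique / n:.4f}")
```

Repeating the draw with different seeds and averaging the printed fraction approximates the expectation asked for in part (c).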