Suppose we have a linearly separable dataset, and we divide the data into training and validation sets. Will a perceptron learned on the training dataset (assuming gradient decent works perfectly well) be guaranteed to have i) 0 error on the training dataset ii) 0 error on the validation dataset. Brie y explain. (10 points)