We would like to try k-NN and random forest. To evaluate these methods, randomly select 520000 samples as the training set and use the remaining samples as the test set. Since the classes are imbalanced, conduct a "stratified" split, so that the training and test sets keep the same class proportions as the full data set.
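A minimal sketch of the stratified split and the two classifiers, assuming scikit-learn is available. The data here is a small synthetic stand-in generated with `make_classification`; the sizes (5000 samples, 4000 for training), the class weights, and the hyperparameters (`n_neighbors=5`, `n_estimators=50`) are illustrative assumptions, not values from the task. For the real data, `train_size` would be 520000.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic imbalanced data standing in for the real data set (assumption).
X, y = make_classification(
    n_samples=5000, n_features=10, weights=[0.9, 0.1], random_state=0
)

# An integer train_size is an absolute number of samples; stratify=y keeps
# the class proportions of y in both the training and the test part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=4000, stratify=y, random_state=0
)

# Fraction of the positive class in each part; these should nearly match.
train_pos_rate = y_train.mean()
test_pos_rate = y_test.mean()
print(f"positive rate: train={train_pos_rate:.3f}, test={test_pos_rate:.3f}")

# Illustrative hyperparameters, not tuned values.
models = {
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "random forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # plain accuracy on the test set
    print(f"{name}: test accuracy = {scores[name]:.3f}")
```

Plain accuracy can look good on imbalanced data even when the minority class is predicted poorly, so a per-class metric may be worth reporting alongside it.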
As in the earlier work, if the data set is too large to handle, start with a small subset and gradually increase its size.
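One way to grow the training subset step by step is sketched below, again assuming scikit-learn and a synthetic stand-in for the real data. The schedule in `subset_sizes` is hypothetical; each subset is itself drawn with stratification so the class proportions stay comparable across sizes, and the elapsed time gives a rough sense of when the data becomes too large to handle.

```python
import time

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic imbalanced data standing in for the real data set (assumption).
X, y = make_classification(
    n_samples=5000, n_features=10, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=4000, stratify=y, random_state=0
)

# Hypothetical schedule of subset sizes; each must be smaller than the
# training set, because train_test_split needs a non-empty remainder.
subset_sizes = [500, 1000, 2000]

results = []
for n in subset_sizes:
    # Draw a stratified subset of n training samples; the rest is discarded.
    X_sub, _, y_sub, _ = train_test_split(
        X_train, y_train, train_size=n, stratify=y_train, random_state=0
    )
    start = time.perf_counter()
    model = KNeighborsClassifier(n_neighbors=5).fit(X_sub, y_sub)
    accuracy = model.score(X_test, y_test)
    elapsed = time.perf_counter() - start  # fit plus prediction time
    results.append((n, accuracy, elapsed))
    print(f"n={n}: test accuracy={accuracy:.3f}, time={elapsed:.2f}s")
```

The same loop can be repeated with a random forest in place of k-NN; the test set is kept fixed so that scores at different subset sizes are comparable.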