When working with data at terabyte scale and beyond, computing even basic descriptive statistics can be a challenge, and computing finer statistical properties such as correlations can be far from trivial. We will describe recent work implementing Randomized Linear Algebra algorithms for this feature selection problem (as well as the related nonnegative matrix factorization (NMF) and principal component analysis (PCA) problems) in parallel and distributed environments, on inputs ranging in size from one to tens of terabytes. We will also describe the application of these implementations to specific scientific problems in areas such as mass spectrometry imaging and climate modeling.