Ideally I'd write this with a single pass over the data, but as far as I know that isn't possible:
    def separate(r: RDD[A \/ B]): (RDD[A], RDD[B]) = ???

I'd settle for something like this, where the As are dumped to a file and the Bs stay in the RDD. It's kind of like observeW from scalaz-stream:
    def observeLefts(r: RDD[A \/ B], filename: String): RDD[B] = ???

The best I can find is to carry the value as a tuple to the end and then either use multiple outputs or manually manage the writers in a mutable map, both of which were discussed in this SO answer.
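For reference, the two-pass baseline I'd like to improve on looks roughly like this. It's a minimal sketch that assumes scalaz's `\/` and that caching the input RDD is acceptable; the `ClassTag` bounds are my addition, since `RDD.collect` with a partial function requires them.

```scala
import scala.reflect.ClassTag

import org.apache.spark.rdd.RDD
import scalaz.{-\/, \/, \/-}

// Two-pass baseline (not the single-pass version asked for above).
// Each derived RDD traverses r separately when an action runs on it,
// so r is cached to avoid recomputing its lineage twice.
def separate[A: ClassTag, B: ClassTag](r: RDD[A \/ B]): (RDD[A], RDD[B]) = {
  r.cache()
  val lefts  = r.collect { case -\/(a) => a } // keep only the left values
  val rights = r.collect { case \/-(b) => b } // keep only the right values
  (lefts, rights)
}
```

Caching trades memory for the second traversal of the source, but the data is still walked twice, which is what I'm hoping to avoid.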
Can I do better?