Pre-processing and text normalization