Introduction to tokenizer factories – finding words in a character stream