Subword Tokenization: Byte Pair Encoding