LUCENE-2906: add a filter that processes the output of StandardTokenizer/ICUTokenizer and creates overlapping bigrams for CJK text