PDFBOX-2247: fixed encoding issue when extracting text by taking the char offset (FirstChar-1) into account