Tokenizer implementation for literal blocks, preserving whitespace.
Source for this file: /Document/src/document/pdf/tokenizer/literal.php
ezcDocumentPdfTokenizer | --ezcDocumentPdfLiteralTokenizer
Version: | //autogen// |
From ezcDocumentPdfTokenizer: | |
---|---|
ezcDocumentPdfTokenizer::FORCED | Constant indicating a forced breaking point without rendering a space character. |
ezcDocumentPdfTokenizer::SPACE | Constant indicating a breaking point, including a rendered space. |
ezcDocumentPdfTokenizer::WRAP | Constant indicating a possible breaking point without rendering a space character. |
Modifiers and return type | Method and summary |
---|---|
protected string | convertTabs( $string, [ $tabwidth = 8 ], [ $offset = 0 ] ): Convert tabs to spaces. |
public array | tokenize( $string ): Split string into words. |
From ezcDocumentPdfTokenizer | |
---|---|
public abstract array | ezcDocumentPdfTokenizer::tokenize(): Split string into words. |
Convert tabs to spaces.
Convert all tabs to spaces, using 8 spaces per tab by default ($tabwidth = 8).
Name | Type | Description |
---|---|---|
$string | string | |
$tabwidth | int | |
$offset | int | |
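The page does not say how `$offset` is used or whether a tab always expands to the full `$tabwidth`. The following is a minimal sketch under one plausible reading, not the library's implementation: each tab advances to the next tab stop (a column that is a multiple of `$tabwidth`), and `$offset` is the column at which the string starts. The function name `convertTabsSketch` is hypothetical, and the sketch counts bytes, so it assumes single-byte characters and a single line of input.

```php
<?php
/**
 * Hypothetical stand-alone helper, NOT ezcDocumentPdfLiteralTokenizer::convertTabs().
 * Assumption: tabs expand to the next tab stop; $offset is the starting column.
 */
function convertTabsSketch(string $string, int $tabwidth = 8, int $offset = 0): string
{
    $result = '';
    $column = $offset;              // current output column
    $length = strlen($string);      // byte length; multibyte text is not handled

    for ($i = 0; $i < $length; $i++) {
        $char = $string[$i];
        if ($char === "\t") {
            // Number of spaces needed to reach the next multiple of $tabwidth.
            $spaces  = $tabwidth - ($column % $tabwidth);
            $result .= str_repeat(' ', $spaces);
            $column += $spaces;
        } else {
            $result .= $char;
            $column++;
        }
    }

    return $result;
}
```

Under these assumptions, `convertTabsSketch("a\tb", 4)` returns `a`, three spaces, then `b`, because the tab starts in column 1 and the next stop is column 4. If a fixed-width replacement is wanted instead, `str_replace("\t", str_repeat(' ', $tabwidth), $string)` would do that.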
Split string into words.
This function takes a string and splits it into words. Possible splitting points in the resulting word stream are indicated by the breaking-point constants inherited from ezcDocumentPdfTokenizer: SPACE, WRAP, and FORCED.
Non-breaking spaces must not split a string into multiple words, so no break is applied at them.
Name | Type | Description |
---|---|---|
$string | string | |
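To illustrate what a whitespace-preserving word stream could look like, here is a rough sketch. It is not the library's code. The function name `tokenizeLiteralSketch` and the constants `SKETCH_SPACE` and `SKETCH_FORCED` are hypothetical stand-ins for `ezcDocumentPdfTokenizer::SPACE` and `::FORCED`, whose real values are not given here. The sketch assumes that every regular space yields one SPACE marker, every line ends with a FORCED marker, and only the plain space (U+0020) splits words, so a non-breaking space stays inside its word.

```php
<?php
// Hypothetical marker values; the real constants' values are not documented on this page.
const SKETCH_SPACE  = 0;
const SKETCH_FORCED = 2;

/**
 * Hypothetical sketch of a literal-block tokenizer, NOT the library method.
 * Returns a flat array of word strings and integer break markers.
 */
function tokenizeLiteralSketch(string $string): array
{
    $tokens = [];

    // Literal blocks keep their line structure: one FORCED marker per line.
    foreach (preg_split('/\r\n|\r|\n/', $string) as $line) {
        // Splitting on U+0020 only keeps non-breaking spaces inside words.
        foreach (explode(' ', $line) as $index => $piece) {
            if ($index > 0) {
                $tokens[] = SKETCH_SPACE;   // one marker per space, so runs are preserved
            }
            if ($piece !== '') {
                $tokens[] = $piece;
            }
        }
        $tokens[] = SKETCH_FORCED;
    }

    return $tokens;
}
```

With these assumptions, `tokenizeLiteralSketch("foo  bar\nbaz")` yields `'foo'`, two SPACE markers, `'bar'`, a FORCED marker, `'baz'`, and a final FORCED marker. Because words are strings and markers are integers, a consumer can tell them apart with a strict comparison (`===`).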
Method | Description |
---|---|
ezcDocumentPdfTokenizer::tokenize() | Split string into words |