String Manipulation

You can use the following string functions in Drill queries:

Function Return Type
BYTE_SUBSTR BINARY or VARCHAR
CHAR_LENGTH INTEGER
CONCAT VARCHAR
INITCAP VARCHAR
LENGTH INTEGER
LOWER VARCHAR
LPAD VARCHAR
LTRIM VARCHAR
POSITION INTEGER
REGEXP_REPLACE VARCHAR
RPAD VARCHAR
RTRIM VARCHAR
STRPOS INTEGER
SUBSTR VARCHAR
TRIM VARCHAR
UPPER VARCHAR

BYTE_SUBSTR

Returns in binary format a substring of a string.

BYTE_SUBSTR Syntax

BYTE_SUBSTR( string-expression, start  [, length [(string-expression)]] )

string-expression is the entire string, a column name having string values for example. start is a start position in the string. 1 is the first position. length is the number of characters to the right of the start position to include in the output expressed in either of the following ways: * As an integer. For example, 19 includes 19 characters to the right of the start position in the output. * AS length(string-expression). For example, length(my_string) includes the number of characters in my_string minus the number of the start position.

BYTE_SUBSTR Usage Notes

Combine the use of BYTE_SUBSTR and CONVERT_FROM to separate parts of a HBase composite key for example.

BYTE_SUBSTR Examples

A composite HBase row key consists of strings followed by a reverse timestamp (long). For example: AMZN_9223370655563575807. Use BYTE_SUBSTR and CONVERT_FROM to separate parts of a HBase composite key.

SELECT CONVERT_FROM(BYTE_SUBSTR(row_key,6,19),'UTF8') FROM root.`mydata` LIMIT 1;
+---------------------+
|       EXPR$0        |
+---------------------+
| 9223370655563575807 |
+---------------------+
1 rows selected (0.271 seconds)

SELECT CONVERT_FROM(BYTE_SUBSTR(row_key,6,length(row_key)),'UTF8') FROM root.`mydata` LIMIT 1;
+---------------------+
|       EXPR$0        |
+---------------------+
| 9223370655563575807 |
+---------------------+
1 rows selected (0.271 seconds)

CHAR_LENGTH

Returns the number of characters in a string.

CHAR_LENGTH Syntax

CHAR_LENGTH(string)

CHAR_LENGTH Usage Notes

You can use the alias CHARACTER_LENGTH.

CHAR_LENGTH Example

SELECT CHAR_LENGTH('Drill rocks') FROM sys.version;

+------------+
|   EXPR$0   |
+------------+
| 11         |
+------------+
1 row selected (0.127 seconds)

CONCAT

Concatenates arguments.

CONCAT Syntax

CONCAT(string [, string [, ...] )

CONCAT Example

SELECT CONCAT('Drill', ' ', 1.0, ' ', 'release') FROM sys.version;

+------------+
|   EXPR$0   |
+------------+
| Drill 1.0 release |
+------------+
1 row selected (0.221 seconds)

Alternatively, you can use the string concatenation operation to concatenate strings.

INITCAP

Returns the string using initial caps.

INITCAP Syntax

INITCAP(string)

INITCAP Examples

SELECT INITCAP('apache drill release 1.0') FROM sys.version;

+------------+
|   EXPR$0   |
+------------+
| Apache Drill Release 1.0 |
+------------+
1 row selected (0.124 seconds)

LENGTH

Returns the number of characters in the string.

LENGTH Syntax

LENGTH( string [, encoding] )

LENGTH Example

SELECT LENGTH('apache drill release 1.0') FROM sys.version;

+------------+
|   EXPR$0   |
+------------+
| 24         |
+------------+
1 row selected (0.127 seconds)

SELECT LENGTH(row_key, 'UTF8') FROM root.`students`;

+------------+
|   EXPR$0   |
+------------+
| 8          |
| 8          |
| 8          |
| 8          |
+------------+
4 rows selected (0.259 seconds)

LOWER

Converts characters in the string to lowercase.

LOWER Syntax

LOWER (string)

LOWER Example

SELECT LOWER('Apache Drill') FROM sys.version;

+------------+
|   EXPR$0   |
+------------+
| apache drill |
+------------+
1 row selected (0.113 seconds)

LPAD

Pads the string to the length specified by prepending the fill or a space. Truncates the string if longer than the specified length. .

LPAD Syntax

LPAD (string, length [, fill text])

LPAD Example

SELECT LPAD('Release 1.0', 27, 'of Apache Drill 1.0') FROM sys.version;

+------------+
|   EXPR$0   |
+------------+
| of Apache Drill Release 1.0 |
+------------+
1 row selected (0.112 seconds)

LTRIM

Removes any characters from the beginning of string1 that match the characters in string2.

LTRIM Syntax

LTRIM(string1, string2)

LTRIM Examples

SELECT LTRIM('Apache Drill', 'Apache ') FROM sys.version;

+------------+
|   EXPR$0   |
+------------+
| Drill      |
+------------+
1 row selected (0.131 seconds)

SELECT LTRIM('A powerful tool Apache Drill', 'Apache ') FROM sys.version;

+------------+
|   EXPR$0   |
+------------+
| owerful tool Apache Drill |
+------------+
1 row selected (0.07 seconds)

POSITION

Returns the location of a substring.

POSITION Syntax

POSITION('substring' in 'string')

POSITION Example

SELECT POSITION('c' in 'Apache Drill') FROM sys.version;

+------------+
|   EXPR$0   |
+------------+
| 4          |
+------------+
1 row selected (0.12 seconds)

REGEXP_REPLACE

Substitutes new text for substrings that match POSIX regular expression patterns.

REGEXP_REPLACE Syntax

REGEXP_REPLACE(source_char, pattern, replacement)

source is the character expression to be replaced.

pattern is the regular expression.

replacement is the string to substitute for the source.

REGEXP_REPLACE Examples

Replace a's with b's in this string.

SELECT REGEXP_REPLACE('abc, acd, ade, aef', 'a', 'b') FROM sys.version;
+------------+
|   EXPR$0   |
+------------+
| bbc, bcd, bde, bef |
+------------+

Use the regular expression a followed by a period (.) in the same query to replace all a's and the subsequent character.

SELECT REGEXP_REPLACE('abc, acd, ade, aef', 'a.','b') FROM sys.version;
+------------+
|   EXPR$0   |
+------------+
| bc, bd, be, bf |
+------------+
1 row selected (0.099 seconds)

RPAD

Pads the string to the length specified. Appends the text you specify after the fill keyword using spaces for the fill if you provide no text or insufficient text to achieve the length. Truncates the string if longer than the specified length.

RPAD Syntax

RPAD (string, length [, fill text])

RPAD Example

SELECT RPAD('Apache Drill ', 22, 'Release 1.0') FROM sys.version;
+------------+
|   EXPR$0   |
+------------+
| Apache Drill Release 1 |
+------------+
1 row selected (0.15 seconds)

RTRIM

Removes any characters from the end of string1 that match the characters in string2.

RTRIM Syntax

RTRIM(string1, string2)

RTRIM Examples

SELECT RTRIM('Apache Drill', 'Drill ') FROM sys.version;

+------------+
|   EXPR$0   |
+------------+
| Apache     |
+------------+
1 row selected (0.135 seconds)

SELECT RTRIM('1.0 Apache Tomcat 1.0', 'Drill 1.0') from sys.version;
+------------+
|   EXPR$0   |
+------------+
| 1.0 Apache Tomcat |
+------------+
1 row selected (0.088 seconds)

STRPOS

Returns the location of the substring in a string.

STRPOS Syntax

STRPOS(string, substring)

STRPOS Example

SELECT STRPOS('Apache Drill', 'Drill') FROM sys.version;

+------------+
|   EXPR$0   |
+------------+
| 8          |
+------------+
1 row selected (0.22 seconds)

SUBSTR

Extracts characters from position 1 - x of the string an optional y times.

SUBSTR Syntax

SUBSTR(string, x, y)

SUBSTR Usage Notes

You can use the alias SUBSTRING for this function.

SUBSTR Example

SELECT SUBSTR('Apache Drill', 8) FROM sys.version;

+------------+
|   EXPR$0   |
+------------+
| Drill      |
+------------+
1 row selected (0.134 seconds)

SELECT SUBSTR('Apache Drill', 3, 2) FROM sys.version;

+------------+
|   EXPR$0   |
+------------+
| ac         |
+------------+
1 row selected (0.129 seconds)

TRIM

Removes any characters from the beginning, end, or both sides of string2 that match the characters in string1.

TRIM Syntax

TRIM ([leading | trailing | both] [string1] from string2)

TRIM Example

SELECT TRIM(trailing 'l' from 'Drill') FROM sys.version;
+------------+
|   EXPR$0   |
+------------+
| Dri        |
+------------+
1 row selected (0.172 seconds)

SELECT TRIM(both 'l' from 'long live Drill') FROM sys.version;
+------------+
|   EXPR$0   |
+------------+
| ong live Dri |
+------------+
1 row selected (0.087 seconds)

SELECT TRIM(leading 'l' from 'long live Drill') FROM sys.version;
+------------+
|   EXPR$0   |
+------------+
| ong live Drill |
+------------+
1 row selected (0.077 seconds)

UPPER

Converts characters in the string to uppercase.

UPPER Syntax

UPPER (string)

UPPER Example

SELECT UPPER('Apache Drill') FROM sys.version;

+------------+
|   EXPR$0   |
+------------+
| APACHE DRILL |
+------------+
1 row selected (0.104 seconds)