PREHOOK: query: create table url_t (key string, fullurl string) PREHOOK: type: CREATETABLE PREHOOK: Output: database:default PREHOOK: Output: default@url_t POSTHOOK: query: create table url_t (key string, fullurl string) POSTHOOK: type: CREATETABLE POSTHOOK: Output: database:default POSTHOOK: Output: default@url_t PREHOOK: query: insert overwrite table url_t select * from ( select '1', 'http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1' from src tablesample (1 rows) union all select '2', 'https://www.socs.uts.edu.au:80/MosaicDocs-old/url-primer.html?k1=tps#chapter1' from src tablesample (1 rows) union all select '3', 'ftp://sites.google.com/a/example.com/site/page' from src tablesample (1 rows) union all select '4', cast(null as string) from src tablesample (1 rows) union all select '5', 'htttp://' from src tablesample (1 rows) union all select '6', '[invalid url string]' from src tablesample (1 rows) ) s PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Output: default@url_t POSTHOOK: query: insert overwrite table url_t select * from ( select '1', 'http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1' from src tablesample (1 rows) union all select '2', 'https://www.socs.uts.edu.au:80/MosaicDocs-old/url-primer.html?k1=tps#chapter1' from src tablesample (1 rows) union all select '3', 'ftp://sites.google.com/a/example.com/site/page' from src tablesample (1 rows) union all select '4', cast(null as string) from src tablesample (1 rows) union all select '5', 'htttp://' from src tablesample (1 rows) union all select '6', '[invalid url string]' from src tablesample (1 rows) ) s POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Output: default@url_t POSTHOOK: Lineage: url_t.fullurl EXPRESSION [] POSTHOOK: Lineage: url_t.key EXPRESSION [] PREHOOK: query: describe function parse_url_tuple PREHOOK: type: DESCFUNCTION POSTHOOK: query: describe function parse_url_tuple POSTHOOK: type: DESCFUNCTION parse_url_tuple(url, partname1, partname2, ..., partnameN) - extracts N (N>=1) parts from a URL. It takes a URL and one or multiple partnames, and returns a tuple. All the input parameters and output column types are string. PREHOOK: query: describe function extended parse_url_tuple PREHOOK: type: DESCFUNCTION POSTHOOK: query: describe function extended parse_url_tuple POSTHOOK: type: DESCFUNCTION parse_url_tuple(url, partname1, partname2, ..., partnameN) - extracts N (N>=1) parts from a URL. It takes a URL and one or multiple partnames, and returns a tuple. All the input parameters and output column types are string. Partname: HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, USERINFO, QUERY: Note: Partnames are case-sensitive, and should not contain unnecessary white spaces. Example: > SELECT b.* FROM src LATERAL VIEW parse_url_tuple(fullurl, 'HOST', 'PATH', 'QUERY', 'QUERY:id') b as host, path, query, query_id LIMIT 1; > SELECT parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1') as (ho, pa, qu, re, pr, fi, au, us, qk1) from src a; PREHOOK: query: explain select a.key, b.* from url_t a lateral view parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1') b as ho, pa, qu, re, pr, fi, au, us, qk1 order by a.key PREHOOK: type: QUERY POSTHOOK: query: explain select a.key, b.* from url_t a lateral view parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1') b as ho, pa, qu, re, pr, fi, au, us, qk1 order by a.key POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE Lateral View Forward Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: key Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE Lateral View Join Operator outputColumnNames: _col0, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13 Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: string), _col5 (type: string), _col6 (type: string), _col7 (type: string), _col8 (type: string), _col9 (type: string), _col10 (type: string), _col11 (type: string), _col12 (type: string), _col13 (type: string) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9 Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string), _col5 (type: string), _col6 (type: string), _col7 (type: string), _col8 (type: string), _col9 (type: string) Select Operator expressions: fullurl (type: string), 'HOST' (type: string), 'PATH' (type: string), 'QUERY' (type: string), 'REF' (type: string), 'PROTOCOL' (type: string), 'FILE' (type: string), 'AUTHORITY' (type: string), 'USERINFO' (type: string), 'QUERY:k1' (type: string) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9 Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE UDTF Operator Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE function name: parse_url_tuple Lateral View Join Operator outputColumnNames: _col0, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13 Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: string), _col5 (type: string), _col6 (type: string), _col7 (type: string), _col8 (type: string), _col9 (type: string), _col10 (type: string), _col11 (type: string), _col12 (type: string), _col13 (type: string) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9 Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string), _col5 (type: string), _col6 (type: string), _col7 (type: string), _col8 (type: string), _col9 (type: string) Reduce Operator Tree: Select Operator expressions: KEY.reducesinkkey0 (type: string), VALUE._col0 (type: string), VALUE._col1 (type: string), VALUE._col2 (type: string), VALUE._col3 (type: string), VALUE._col4 (type: string), VALUE._col5 (type: string), VALUE._col6 (type: string), VALUE._col7 (type: string), VALUE._col8 (type: string) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9 Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select a.key, b.* from url_t a lateral view parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1') b as ho, pa, qu, re, pr, fi, au, us, qk1 order by a.key PREHOOK: type: QUERY PREHOOK: Input: default@url_t #### A masked pattern was here #### POSTHOOK: query: select a.key, b.* from url_t a lateral view parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1') b as ho, pa, qu, re, pr, fi, au, us, qk1 order by a.key POSTHOOK: type: QUERY POSTHOOK: Input: default@url_t #### A masked pattern was here #### 1 facebook.com /path1/p.php k1=v1&k2=v2 Ref1 http /path1/p.php?k1=v1&k2=v2 facebook.com NULL v1 2 www.socs.uts.edu.au /MosaicDocs-old/url-primer.html k1=tps chapter1 https /MosaicDocs-old/url-primer.html?k1=tps www.socs.uts.edu.au:80 NULL tps 3 sites.google.com /a/example.com/site/page NULL NULL ftp /a/example.com/site/page sites.google.com NULL NULL 4 NULL NULL NULL NULL NULL NULL NULL NULL NULL 5 NULL NULL NULL NULL NULL NULL NULL NULL NULL 6 NULL NULL NULL NULL NULL NULL NULL NULL NULL PREHOOK: query: explain select parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1') as (ho, pa, qu, re, pr, fi, au, us, qk1) from url_t a order by ho, pa, qu PREHOOK: type: QUERY POSTHOOK: query: explain select parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1') as (ho, pa, qu, re, pr, fi, au, us, qk1) from url_t a order by ho, pa, qu POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: fullurl (type: string), 'HOST' (type: string), 'PATH' (type: string), 'QUERY' (type: string), 'REF' (type: string), 'PROTOCOL' (type: string), 'FILE' (type: string), 'AUTHORITY' (type: string), 'USERINFO' (type: string), 'QUERY:k1' (type: string) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9 Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE UDTF Operator Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE function name: parse_url_tuple Reduce Output Operator key expressions: c0 (type: string), c1 (type: string), c2 (type: string) sort order: +++ Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE value expressions: c3 (type: string), c4 (type: string), c5 (type: string), c6 (type: string), c7 (type: string), c8 (type: string) Reduce Operator Tree: Select Operator expressions: KEY.reducesinkkey0 (type: string), KEY.reducesinkkey1 (type: string), KEY.reducesinkkey2 (type: string), VALUE._col0 (type: string), VALUE._col1 (type: string), VALUE._col2 (type: string), VALUE._col3 (type: string), VALUE._col4 (type: string), VALUE._col5 (type: string) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8 Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1') as (ho, pa, qu, re, pr, fi, au, us, qk1) from url_t a order by ho, pa, qu PREHOOK: type: QUERY PREHOOK: Input: default@url_t #### A masked pattern was here #### POSTHOOK: query: select parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1') as (ho, pa, qu, re, pr, fi, au, us, qk1) from url_t a order by ho, pa, qu POSTHOOK: type: QUERY POSTHOOK: Input: default@url_t #### A masked pattern was here #### NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL facebook.com /path1/p.php k1=v1&k2=v2 Ref1 http /path1/p.php?k1=v1&k2=v2 facebook.com NULL v1 sites.google.com /a/example.com/site/page NULL NULL ftp /a/example.com/site/page sites.google.com NULL NULL www.socs.uts.edu.au /MosaicDocs-old/url-primer.html k1=tps chapter1 https /MosaicDocs-old/url-primer.html?k1=tps www.socs.uts.edu.au:80 NULL tps PREHOOK: query: -- should return null for 'host', 'query', 'QUERY:nonExistCol' explain select a.key, b.ho, b.qu, b.qk1, b.err1, b.err2, b.err3 from url_t a lateral view parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1', 'host', 'query', 'QUERY:nonExistCol') b as ho, pa, qu, re, pr, fi, au, us, qk1, err1, err2, err3 order by a.key PREHOOK: type: QUERY POSTHOOK: query: -- should return null for 'host', 'query', 'QUERY:nonExistCol' explain select a.key, b.ho, b.qu, b.qk1, b.err1, b.err2, b.err3 from url_t a lateral view parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1', 'host', 'query', 'QUERY:nonExistCol') b as ho, pa, qu, re, pr, fi, au, us, qk1, err1, err2, err3 order by a.key POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE Lateral View Forward Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: key Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE Lateral View Join Operator outputColumnNames: _col0, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15, _col16 Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: string), _col5 (type: string), _col7 (type: string), _col13 (type: string), _col14 (type: string), _col15 (type: string), _col16 (type: string) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string), _col5 (type: string), _col6 (type: string) Select Operator expressions: fullurl (type: string), 'HOST' (type: string), 'PATH' (type: string), 'QUERY' (type: string), 'REF' (type: string), 'PROTOCOL' (type: string), 'FILE' (type: string), 'AUTHORITY' (type: string), 'USERINFO' (type: string), 'QUERY:k1' (type: string), 'host' (type: string), 'query' (type: string), 'QUERY:nonExistCol' (type: string) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12 Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE UDTF Operator Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE function name: parse_url_tuple Lateral View Join Operator outputColumnNames: _col0, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15, _col16 Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: string), _col5 (type: string), _col7 (type: string), _col13 (type: string), _col14 (type: string), _col15 (type: string), _col16 (type: string) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string), _col5 (type: string), _col6 (type: string) Reduce Operator Tree: Select Operator expressions: KEY.reducesinkkey0 (type: string), VALUE._col0 (type: string), VALUE._col1 (type: string), VALUE._col2 (type: string), VALUE._col3 (type: string), VALUE._col4 (type: string), VALUE._col5 (type: string) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 12 Data size: 426 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select a.key, b.ho, b.qu, b.qk1, b.err1, b.err2, b.err3 from url_t a lateral view parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1', 'host', 'query', 'QUERY:nonExistCol') b as ho, pa, qu, re, pr, fi, au, us, qk1, err1, err2, err3 order by a.key PREHOOK: type: QUERY PREHOOK: Input: default@url_t #### A masked pattern was here #### POSTHOOK: query: select a.key, b.ho, b.qu, b.qk1, b.err1, b.err2, b.err3 from url_t a lateral view parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1', 'host', 'query', 'QUERY:nonExistCol') b as ho, pa, qu, re, pr, fi, au, us, qk1, err1, err2, err3 order by a.key POSTHOOK: type: QUERY POSTHOOK: Input: default@url_t #### A masked pattern was here #### 1 facebook.com k1=v1&k2=v2 v1 NULL NULL NULL 2 www.socs.uts.edu.au k1=tps tps NULL NULL NULL 3 sites.google.com NULL NULL NULL NULL NULL 4 NULL NULL NULL NULL NULL NULL 5 NULL NULL NULL NULL NULL NULL 6 NULL NULL NULL NULL NULL NULL PREHOOK: query: explain select ho, count(*) from url_t a lateral view parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1') b as ho, pa, qu, re, pr, fi, au, us, qk1 where qk1 is not null group by ho PREHOOK: type: QUERY POSTHOOK: query: explain select ho, count(*) from url_t a lateral view parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1') b as ho, pa, qu, re, pr, fi, au, us, qk1 where qk1 is not null group by ho POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE Lateral View Forward Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE Select Operator Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE Lateral View Join Operator outputColumnNames: _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13 Statistics: Num rows: 9 Data size: 319 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col5 (type: string) outputColumnNames: _col5 Statistics: Num rows: 9 Data size: 319 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count() keys: _col5 (type: string) mode: hash outputColumnNames: _col0, _col1 Statistics: Num rows: 9 Data size: 319 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 9 Data size: 319 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: bigint) Select Operator expressions: fullurl (type: string), 'HOST' (type: string), 'PATH' (type: string), 'QUERY' (type: string), 'REF' (type: string), 'PROTOCOL' (type: string), 'FILE' (type: string), 'AUTHORITY' (type: string), 'USERINFO' (type: string), 'QUERY:k1' (type: string) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9 Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE UDTF Operator Statistics: Num rows: 6 Data size: 213 Basic stats: COMPLETE Column stats: NONE function name: parse_url_tuple Filter Operator predicate: c8 is not null (type: boolean) Statistics: Num rows: 3 Data size: 106 Basic stats: COMPLETE Column stats: NONE Lateral View Join Operator outputColumnNames: _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13 Statistics: Num rows: 9 Data size: 319 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col5 (type: string) outputColumnNames: _col5 Statistics: Num rows: 9 Data size: 319 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count() keys: _col5 (type: string) mode: hash outputColumnNames: _col0, _col1 Statistics: Num rows: 9 Data size: 319 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 9 Data size: 319 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: bigint) Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) keys: KEY._col0 (type: string) mode: mergepartial outputColumnNames: _col0, _col1 Statistics: Num rows: 4 Data size: 141 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 4 Data size: 141 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select ho, count(*) from url_t a lateral view parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1') b as ho, pa, qu, re, pr, fi, au, us, qk1 where qk1 is not null group by ho PREHOOK: type: QUERY PREHOOK: Input: default@url_t #### A masked pattern was here #### POSTHOOK: query: select ho, count(*) from url_t a lateral view parse_url_tuple(a.fullurl, 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'FILE', 'AUTHORITY', 'USERINFO', 'QUERY:k1') b as ho, pa, qu, re, pr, fi, au, us, qk1 where qk1 is not null group by ho POSTHOOK: type: QUERY POSTHOOK: Input: default@url_t #### A masked pattern was here #### facebook.com 1 www.socs.uts.edu.au 1