测试HIVE如何处理NUMBER类型数据
- 以TEST.NUMTEST为例
Environment
Product | Version |
---|---|
HDP | 2.4.0.0-169 |
Sqoop | 1.4.6.2.4.0.0-169 |
Hive | 1.2.1000.2.4.0.0-169 |
Oracle | 11g.203 |
TEST.NUMTEST表结构:
表内容:
Purpose
测试出Hive处理NUMBER类型的结果
Procedure
1、不进行map-column-hive,直接导入,sqoop语句如下:1
sqoop import --connect 'jdbc:oracle:thin:@10.143.90.12:1521:infodb' --username bdc --password bdc --table TEST.NUMTEST --hive-import --hive-database TEST --hive-table numtest --fields-terminated-by "\t" --lines-terminated-by "\n" --hive-overwrite --delete-target-dir --hive-drop-import-delims --null-string '\\N' --null-non-string '\\N' -m 1
HIVE处理结果:1
2
3
4
5
6
7
8
9
10
11
12
13
14hive> desc NUMTEST;
OK
three double
nine double
ten double
fifteen double
nineteen double
twentytwo double
Time taken: 0.429 seconds, Fetched: 6 row(s)
hive> select * from NUMTEST;
OK
0.0 1.23456789E8 1.23456789E9 1.23456789123456E14 1.23456789123456794E18 1.2345678901234568E21
10.0 NULL NULL NULL NULL NULL
Time taken: 0.165 seconds, Fetched: 2 row(s)
2、转化规则如下:
–map-column-hive THREE=tinyint,FIFTEEN=bigint,TEN=int,NINE=int,TWENTYTWO=bigint,NINETEEN=bigint
HIVE处理结果:1
2
3
4
5
6
7
8
9
10
11
12
13
14hive> desc NUMTEST;
OK
three tinyint
nine int
ten int
fifteen bigint
nineteen bigint
twentytwo bigint
Time taken: 0.427 seconds, Fetched: 6 row(s)
hive> select * from NUMTEST;
OK
0 123456789 1234567890 123456789123456 1234567891234567890 NULL
10 NULL NULL NULL NULL NULL
Time taken: 0.089 seconds, Fetched: 2 row(s)
3、单独测试,NUMBER(15)转为INT
HIVE处理结果:1
2
3
4
5
6
7
8
9
10
11
12
13
14hive> desc NUMTEST;
OK
three tinyint
nine int
ten int
fifteen int
nineteen bigint
twentytwo bigint
Time taken: 0.424 seconds, Fetched: 6 row(s)
hive> select * from NUMTEST;
OK
0 123456789 1234567890 NULL 1234567891234567890 NULL
10 NULL NULL NULL NULL NULL
Time taken: 0.135 seconds, Fetched: 2 row(s)
4、单独测试,NUMBER(19)转为INT1
2
3
4
5
6
7
8
9
10
11
12
13
14hive> desc NUMTEST;
OK
three tinyint
nine int
ten int
fifteen int
nineteen int
twentytwo bigint
Time taken: 0.437 seconds, Fetched: 6 row(s)
hive> select * from NUMTEST;
OK
0 123456789 1234567890 NULL NULL NULL
10 NULL NULL NULL NULL NULL
Time taken: 0.113 seconds, Fetched: 2 row(s)
5、测试NUMBER(22)转为string:
–map-column-hive THREE=tinyint,FIFTEEN=bigint,TEN=int,NINE=int,TWENTYTWO=string,NINETEEN=bigint
HIVE结果:1
2
3
4
5
6
7
8
9
10
11
12
13
14hive> desc NUMTEST;
OK
three tinyint
nine int
ten int
fifteen bigint
nineteen bigint
twentytwo string
Time taken: 0.426 seconds, Fetched: 6 row(s)
hive> select * from NUMTEST;
OK
0 123456789 1234567890 123456789123456 1234567891234567890 1234567890123456789012
10 NULL NULL NULL NULL NULL
Time taken: 0.132 seconds, Fetched: 2 row(s)
6、测试NUMBER(11)转为INT和BIGINT
INT的HIVE结果:1
2
3
4
5
6
7
8
9
10
11
12
13
14
15hive> desc NUMTEST;
OK
three tinyint
nine int
ten int
fifteen bigint
nineteen bigint
twentytwo string
eleven int
Time taken: 0.43 seconds, Fetched: 7 row(s)
hive> select * from NUMTEST;
OK
0 123456789 1234567890 123456789123456 1234567891234567890 1234567890123456789012 NULL
10 NULL NULL NULL NULL NULL NULL
Time taken: 0.1 seconds, Fetched: 2 row(s)
BIGINT的HIVE结果:1
2
3
4
5
6
7
8
9
10
11
12
13
14
15hive> desc NUMTEST;
OK
three tinyint
nine int
ten int
fifteen bigint
nineteen bigint
twentytwo string
eleven bigint
Time taken: 0.426 seconds, Fetched: 7 row(s)
hive> select * from NUMTEST;
OK
0 123456789 1234567890 123456789123456 1234567891234567890 1234567890123456789012 12345678901
10 NULL NULL NULL NULL NULL NULL
Time taken: 0.132 seconds, Fetched: 2 row(s)
7、测试NUMBER无长度,无精度类型,需要转成string
Hive的结果:1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17hive> desc numtest;
OK
three tinyint
nine int
ten int
fifteen bigint
nineteen bigint
twentytwo string
eleven bigint
aa string
bb double
Time taken: 0.425 seconds, Fetched: 9 row(s)
hive> select * from numtest;
OK
0 123456789 1234567890 123456789123456 1234567891234567890 1234567890123456789012 12345678901 123456789012345 1.234567812E7
10 NULL NULL NULL NULL NULL NULL 123.99 NULL
Time taken: 0.176 seconds, Fetched: 2 row(s)
8、双精度double类型,转成string1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17hive> desc numtest;
OK
three tinyint
nine int
ten int
fifteen bigint
nineteen bigint
twentytwo string
eleven bigint
aa string
bb string
Time taken: 0.553 seconds, Fetched: 9 row(s)
hive> select * from numtest;
OK
0 123456789 1234567890 123456789123456 1234567891234567890 1234567890123456789012 12345678901 123456789012345 12345678.12
10 NULL NULL NULL NULL NULL NULL 123.99 NULL
Time taken: 0.162 seconds, Fetched: 2 row(s)
Conclusion
HIVE转化Oracle的NUMBER数据类型结论如下:
Orace数据类型 | Hive数据类型 |
---|---|
NUMBER(1)~NUMBER(3) | TINYINT |
NUMBER(4)~NUMBER(10) | INT |
NUMBER(11)~NUMBER(19) | BIGINT |
NUMBER(20)以上 | STRING |
HIVE数据类型转换
所有类型 | BOOLEAN | TINYINT | SMALLINT | INT | BIGINT | FLOAT | DOUBLE | DECIMAL | STRING | VARCHAR | TIMESTAMP | DATE | BINARY |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
BOOLEAN | true | false | false | false | false | false | false | false | false | false | false | false | false |
TINYINT | false | true | true | true | true | true | true | true | true | true | false | false | false |
SMALLINT | false | false | true | true | true | true | true | true | true | true | false | false | false |
INT | false | false | false | true | true | true | true | true | true | true | false | false | false |
BIGINT | false | false | false | false | true | true | true | true | true | true | false | false | false |
FLOAT | false | false | false | false | false | true | true | true | true | true | false | false | false |
DOUBLE | false | false | false | false | false | false | true | true | true | true | false | false | false |
DECIMAL | false | false | false | false | false | false | false | true | true | true | false | false | false |
STRING | false | false | false | false | false | false | true | true | true | true | false | false | false |
VARCHAR | false | false | false | false | false | false | true | true | true | true | false | false | false |
TIMESTAMP | false | false | false | false | false | false | false | false | true | true | true | false | false |
DATE | false | false | false | false | false | false | false | false | true | true | false | true | false |
BINARY | false | false | false | false | false | false | false | false | false | false | false | false | true |