测试Hive INT数据类型

测试HIVE如何处理NUMBER类型数据


  • 以TEST.NUMTEST为例

Environment

Product Version
HDP 2.4.0.0-169
Sqoop 1.4.6.2.4.0.0-169
Hive 1.2.1000.2.4.0.0-169
Oracle 11g.203

TEST.NUMTEST表结构:

表内容:

Purpose

测试出Hive处理NUMBER类型的结果

Procedure

1、不进行map-column-hive,直接导入,sqoop语句如下:

1
sqoop import --connect 'jdbc:oracle:thin:@10.143.90.12:1521:infodb' --username bdc  --password bdc --table TEST.NUMTEST --hive-import --hive-database TEST --hive-table  numtest  --fields-terminated-by "\t" --lines-terminated-by "\n"    --hive-overwrite --delete-target-dir  --hive-drop-import-delims --null-string '\\N' --null-non-string '\\N'  -m 1

HIVE处理结果:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
hive> desc NUMTEST;
OK
three double
nine double
ten double
fifteen double
nineteen double
twentytwo double
Time taken: 0.429 seconds, Fetched: 6 row(s)
hive> select * from NUMTEST;
OK
0.0 1.23456789E8 1.23456789E9 1.23456789123456E14 1.23456789123456794E18 1.2345678901234568E21
10.0 NULL NULL NULL NULL NULL
Time taken: 0.165 seconds, Fetched: 2 row(s)

2、转化规则如下:
–map-column-hive THREE=tinyint,FIFTEEN=bigint,TEN=int,NINE=int,TWENTYTWO=bigint,NINETEEN=bigint

HIVE处理结果:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
hive> desc NUMTEST;
OK
three tinyint
nine int
ten int
fifteen bigint
nineteen bigint
twentytwo bigint
Time taken: 0.427 seconds, Fetched: 6 row(s)
hive> select * from NUMTEST;
OK
0 123456789 1234567890 123456789123456 1234567891234567890 NULL
10 NULL NULL NULL NULL NULL
Time taken: 0.089 seconds, Fetched: 2 row(s)

3、单独测试,NUMBER(15)转为INT

HIVE处理结果:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
hive> desc NUMTEST;
OK
three tinyint
nine int
ten int
fifteen int
nineteen bigint
twentytwo bigint
Time taken: 0.424 seconds, Fetched: 6 row(s)
hive> select * from NUMTEST;
OK
0 123456789 1234567890 NULL 1234567891234567890 NULL
10 NULL NULL NULL NULL NULL
Time taken: 0.135 seconds, Fetched: 2 row(s)

4、单独测试,NUMBER(19)转为INT

1
2
3
4
5
6
7
8
9
10
11
12
13
14
hive> desc NUMTEST;
OK
three tinyint
nine int
ten int
fifteen int
nineteen int
twentytwo bigint
Time taken: 0.437 seconds, Fetched: 6 row(s)
hive> select * from NUMTEST;
OK
0 123456789 1234567890 NULL NULL NULL
10 NULL NULL NULL NULL NULL
Time taken: 0.113 seconds, Fetched: 2 row(s)

5、测试NUMBER(22)转为string:
–map-column-hive THREE=tinyint,FIFTEEN=bigint,TEN=int,NINE=int,TWENTYTWO=string,NINETEEN=bigint

HIVE结果:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
hive> desc NUMTEST;
OK
three tinyint
nine int
ten int
fifteen bigint
nineteen bigint
twentytwo string
Time taken: 0.426 seconds, Fetched: 6 row(s)
hive> select * from NUMTEST;
OK
0 123456789 1234567890 123456789123456 1234567891234567890 1234567890123456789012
10 NULL NULL NULL NULL NULL
Time taken: 0.132 seconds, Fetched: 2 row(s)

6、测试NUMBER(11)转为INT和BIGINT

INT的HIVE结果:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
hive> desc NUMTEST;
OK
three tinyint
nine int
ten int
fifteen bigint
nineteen bigint
twentytwo string
eleven int
Time taken: 0.43 seconds, Fetched: 7 row(s)
hive> select * from NUMTEST;
OK
0 123456789 1234567890 123456789123456 1234567891234567890 1234567890123456789012 NULL
10 NULL NULL NULL NULL NULL NULL
Time taken: 0.1 seconds, Fetched: 2 row(s)

BIGINT的HIVE结果:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
hive> desc NUMTEST;
OK
three tinyint
nine int
ten int
fifteen bigint
nineteen bigint
twentytwo string
eleven bigint
Time taken: 0.426 seconds, Fetched: 7 row(s)
hive> select * from NUMTEST;
OK
0 123456789 1234567890 123456789123456 1234567891234567890 1234567890123456789012 12345678901
10 NULL NULL NULL NULL NULL NULL
Time taken: 0.132 seconds, Fetched: 2 row(s)

7、测试NUMBER无长度,无精度类型,需要转成string

Hive的结果:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
hive> desc numtest;
OK
three tinyint
nine int
ten int
fifteen bigint
nineteen bigint
twentytwo string
eleven bigint
aa string
bb double
Time taken: 0.425 seconds, Fetched: 9 row(s)
hive> select * from numtest;
OK
0 123456789 1234567890 123456789123456 1234567891234567890 1234567890123456789012 12345678901 123456789012345 1.234567812E7
10 NULL NULL NULL NULL NULL NULL 123.99 NULL
Time taken: 0.176 seconds, Fetched: 2 row(s)

8、双精度double类型,转成string

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
hive> desc numtest;
OK
three tinyint
nine int
ten int
fifteen bigint
nineteen bigint
twentytwo string
eleven bigint
aa string
bb string
Time taken: 0.553 seconds, Fetched: 9 row(s)
hive> select * from numtest;
OK
0 123456789 1234567890 123456789123456 1234567891234567890 1234567890123456789012 12345678901 123456789012345 12345678.12
10 NULL NULL NULL NULL NULL NULL 123.99 NULL
Time taken: 0.162 seconds, Fetched: 2 row(s)

Conclusion

HIVE转化Oracle的NUMBER数据类型结论如下:

Orace数据类型 Hive数据类型
NUMBER(1)~NUMBER(3) TINYINT
NUMBER(4)~NUMBER(10) INT
NUMBER(11)~NUMBER(19) BIGINT
NUMBER(20)以上 STRING

HIVE数据类型转换

所有类型 BOOLEAN TINYINT SMALLINT INT BIGINT FLOAT DOUBLE DECIMAL STRING VARCHAR TIMESTAMP DATE BINARY
BOOLEAN true false false false false false false false false false false false false
TINYINT false true true true true true true true true true false false false
SMALLINT false false true true true true true true true true false false false
INT false false false true true true true true true true false false false
BIGINT false false false false true true true true true true false false false
FLOAT false false false false false true true true true true false false false
DOUBLE false false false false false false true true true true false false false
DECIMAL false false false false false false false true true true false false false
STRING false false false false false false true true true true false false false
VARCHAR false false false false false false true true true true false false false
TIMESTAMP false false false false false false false false true true true false false
DATE false false false false false false false false true true false true false
BINARY false false false false false false false false false false false false true
文章目录
  1. 1. 测试HIVE如何处理NUMBER类型数据
    1. 1.0.1. Environment
    2. 1.0.2. Purpose
    3. 1.0.3. Procedure
    4. 1.0.4. Conclusion
,