sqoop-export
Purpose
The export tool exports a set of files from HDFS back to an RDBMS. The target table must already exist in the database. The input files are read and parsed into a set of records according to the user-specified delimiters.
- Purpose: export data from HDFS into an RDBMS
- The target table must already exist in the database
Syntax
- Basic export syntax
sqoop export (generic-args) (export-args)
- The main arguments
- The main export control arguments
Notes on several important arguments from the table above:
--columns: any column not listed after it must either have a default value or allow NULL values.
By default, all columns within a table are selected for export. You can select a subset of columns and control their ordering by using the --columns argument. Note that columns that are not included in the --columns parameter need to have either a defined default value or allow NULL values.
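As a rough illustration of the --columns behavior described above, the sketch below exports only three columns. The connection string, user, table, column names, and HDFS path are all hypothetical placeholders, and the `run` helper is an assumption added here so the command is printed for review instead of executed.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Export only id, customer_id and amount (placeholder names).
# Any other column of the target table must have a default value
# or allow NULL, because the export never supplies a value for it.
export_subset() {
  run sqoop export \
    --connect "jdbc:mysql://db.example.com/shop" \
    --username etl_user -P \
    --table orders \
    --columns "id,customer_id,amount" \
    --export-dir /user/hive/warehouse/orders_subset
}

export_subset
```

The order of the names in --columns should match the order of the fields in the input files.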
--export-dir: the HDFS directory to export from; it is required and must be combined with either --table or --call.
The --export-dir argument and one of --table or --call are required.
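A minimal sketch of an export that supplies only the required pieces (--export-dir plus --table) might look like the following. Every value here is a made-up placeholder: the JDBC URL, the user, the table name, the HDFS directory, and the comma delimiter are assumptions about the environment, and the `run` helper is a hypothetical wrapper that prints the command by default.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# The target table (placeholder name "orders") must already exist
# in the database before the export starts.
export_orders() {
  run sqoop export \
    --connect "jdbc:mysql://db.example.com/shop" \
    --username etl_user -P \
    --table orders \
    --export-dir /user/hive/warehouse/orders \
    --input-fields-terminated-by ','
}

export_orders
```

Setting DRY_RUN=0 in the environment makes the helper execute the command instead of printing it.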
--input-null-string and --input-null-non-string
The --input-null-string and --input-null-non-string arguments are optional. If --input-null-string is not specified, then the string “null” will be interpreted as null for string-type columns. (1) If --input-null-non-string is not specified, then both the string “null” and the empty string will be interpreted as null for non-string columns. (2)
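- Both arguments are optional
- If argument (1) is not specified, the string “null” is interpreted as a null value for string-type columns
- If argument (2) is not specified, both the string “null” and the empty string are interpreted as null values for non-string columns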
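To make the two null-handling arguments concrete, here is a sketch that sets both explicitly. It assumes the input files mark missing values with \N, which is common for files written by Hive but is only an assumption about the data. The connection details, table, and path are placeholders, and `run` is a hypothetical dry-run wrapper.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Treat \N in the input files as NULL for both string and
# non-string columns (assumes the files use \N for missing values).
export_with_nulls() {
  run sqoop export \
    --connect "jdbc:mysql://db.example.com/shop" \
    --username etl_user -P \
    --table orders \
    --export-dir /user/hive/warehouse/orders \
    --input-null-string '\\N' \
    --input-null-non-string '\\N'
}

export_with_nulls
```

The backslash is doubled inside the single quotes because Sqoop applies its own escape processing to the value it receives.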
Inserts and Updates
By default, sqoop-export appends new rows to a table; each input record is transformed into an INSERT statement that adds a row to the target database table. If you specify the --update-key argument, Sqoop will instead modify an existing dataset in the database. Each input record is treated as an UPDATE statement that modifies an existing row.
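- By default, sqoop-export appends each input record to the table as a new row
- This is equivalent to executing a SQL INSERT statement per record
- If --update-key is specified, the export updates existing rows instead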
CREATE TABLE foo(
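Two update modes (--update-mode)
- updateonly: the default mode; updates records that already exist and does not insert new ones
- allowinsert: also inserts rows that do not exist yet, similar to an append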
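update-key
The behavior depends on whether the column given to --update-key is a primary key:
- Not a primary key:
  - updateonly: only updates existing rows
  - allowinsert: behaves like an append, so duplicate rows can result
- Primary key:
  - updateonly: only updates existing rows
  - allowinsert: updates matching rows and inserts the ones that do not exist yet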
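The update modes above could be exercised with a command along these lines. This is only a sketch: the table, the key column `id`, the HDFS path, and the connection details are placeholders, `run` is a hypothetical dry-run wrapper, and whether allowinsert is supported depends on the database and connector in use.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Update rows whose id already exists and insert the rest.
# "id" is assumed to be the primary key of the placeholder table;
# without a unique key, allowinsert can produce duplicate rows.
upsert_orders() {
  run sqoop export \
    --connect "jdbc:mysql://db.example.com/shop" \
    --username etl_user -P \
    --table orders \
    --export-dir /user/hive/warehouse/orders_delta \
    --update-key id \
    --update-mode allowinsert
}

upsert_orders
```

Dropping the --update-mode line falls back to updateonly, in which case input records with no matching row are not inserted.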
demo
- Full export
sqoop export
- Incremental export
sqoop export