sqooopp
sqoop import--rdbms to hdfs
export--hdfs to rdbms
data injection or migration
import--individual tables
export---
target tables exists in rdbms
Sqoop Eval--to run a query on db
in mysql:
----------
cltr+l to clear screen;
show databases;
use retail_db;
show tables;
list of databases
sqoop-list-databases \
--connect "jdbc:mysql://ipaddress:portno" \
--username name \
--password pass
list tables
sqoop-list-tables \
--connect "jdbc:mysql://ip address:portno/retail_db" \
--username name \
--password pass
eval:
sqoop-eval \
--connect "jdbc:mysql://ip address:portno" \
--username name \
--password pass
--query "select * from retail_db.customers limit 10"
for oracle : orc in place of mysql
to use in cloudxlab:
sqoop-list-databases \
--connect "jdbc:mysql://10.142.1.2/sqoopex" \
--username sqoopuser \
--password NHkkP876rp
sqoop-list-tables \
--connect "jdbc:mysql://10.142.1.2/sqoopex" \
--username sqoopuser \
--password NHkkP876rp
sqoop-eval \
--connect "jdbc:mysql://10.142.1.2/sqoopex" \
--username sqoopuser \
--password NHkkP876rp \
--query "select * from retail_db.customers limit 10"
sqoop import:
------------------
mapreduce job---only mapper no reducer as no aggregation
default 4 mapper
we can change the no of mappers,
mapper divide the works based on primaey key.
if no primay key--then
mapper don't know how to devide the works.
1.change no of mapper to 1.
2. using split by column(multiple mapper can be used)
if no primay key then it will fail--no primay key found
cmd
-----------------
sqoop import \
--connect "jdbc:mysql://10.142.1.2/retail_db" \
--username sqoopuser \
--password NHkkP876rp \
--table Covid19 \
--target-dir /dirname
import on cloudxlab:
--------------------------------
sqoop import --connect "jdbc:mysql://10.142.1.2/retail_db" --username sqoopuser --password NHkkP876rp --table Covid19
--target-dir ./Covid19data
no pk:
-----------
sqoop import \
--connect "jdbc:mysql://10.142.1.2/retail_db" \
--username sqoopuser \
--password NHkkP876rp \
--table orders \
-m -1 \
--target-dir ./orders
--------------
if don't mention then sqoop use txt file format.
to import all tables from mysql:
------------------------------------------
sqoop-import-all-tables \
--connect "jdbc:mysql://10.142.1.2/retail_db" \
--username sqoopuser \
--password NHkkP876rp \
--table orders \
-m -4 \
--target-dir ./orders
-------------------------
target dir vs warehouse dir
-----------------------------------
in target dir--path mention final path where all data will be there.
/data
in warehouse dir:
system will create a sub dir with the table name
/data/employee
warehouse dir -- is production format.
supports 4 file formats:
text file
Sequence file
Avro file format
Parquet file format
Comments
Post a Comment