sqooopp

 sqoop import--rdbms to hdfs

export--hdfs to rdbms


data injection or migration

import--individual tables


export---

target tables exists in rdbms


Sqoop Eval--to run a query on db


in mysql:

----------


cltr+l to clear screen;


show databases;


use retail_db;


show tables;


list of databases


sqoop-list-databases \

--connect "jdbc:mysql://ipaddress:portno" \

--username name \

--password pass


list tables


sqoop-list-tables \

--connect "jdbc:mysql://ip address:portno/retail_db" \

--username name \

--password pass


eval:


sqoop-eval \

--connect "jdbc:mysql://ip address:portno" \

--username name \

--password pass

--query "select * from retail_db.customers limit 10"


for oracle : orc in place of mysql



to use in cloudxlab:


sqoop-list-databases \

--connect "jdbc:mysql://10.142.1.2/sqoopex" \

--username sqoopuser \

--password NHkkP876rp



sqoop-list-tables \

--connect "jdbc:mysql://10.142.1.2/sqoopex" \

--username sqoopuser \

--password NHkkP876rp


sqoop-eval \

--connect "jdbc:mysql://10.142.1.2/sqoopex" \

--username sqoopuser \

--password NHkkP876rp \

--query "select * from retail_db.customers limit 10"


sqoop import:

------------------


mapreduce job---only mapper no reducer as no aggregation


default 4 mapper

we can change the no of mappers,


mapper divide the works based on primaey key.


if no primay key--then 

mapper don't know how to devide the works.


1.change no of mapper to 1.

2. using split by column(multiple mapper can be used)


if no primay key then it will fail--no primay key found


cmd

-----------------


sqoop import \

--connect "jdbc:mysql://10.142.1.2/retail_db" \

--username sqoopuser \

--password NHkkP876rp \

--table Covid19 \

--target-dir /dirname


import on cloudxlab:

--------------------------------

sqoop import --connect "jdbc:mysql://10.142.1.2/retail_db" --username sqoopuser --password NHkkP876rp --table Covid19 

--target-dir ./Covid19data


no pk:

-----------

sqoop import \

--connect "jdbc:mysql://10.142.1.2/retail_db" \

--username sqoopuser \

--password NHkkP876rp \

--table orders \

-m -1 \

--target-dir ./orders


--------------

if don't mention then sqoop use txt file format.


to import all tables from mysql:

------------------------------------------

sqoop-import-all-tables \

--connect "jdbc:mysql://10.142.1.2/retail_db" \

--username sqoopuser \

--password NHkkP876rp \

--table orders \

-m -4 \

--target-dir ./orders


-------------------------


target dir vs warehouse dir

-----------------------------------

in target dir--path mention final path where all data will be there.

/data


in warehouse dir:

system will create a sub dir with the table name

/data/employee


warehouse dir -- is production format.


supports 4 file formats:


text file

Sequence file

Avro file format

Parquet file format




Comments

Popular posts from this blog

scala-4