Syntax error on incremental append with sqoop in putty - hdfs

When I am trying an incremental append with Sqoop in PuTTY, it throws a syntax error.
mysql> sqoop import --connect 'jdbc:mysql://localhost:3306/retail_db'
--username retail_dba --password cloudera
--table sample --target-dir /Aravind1/sqoopdemo01
--check-column id --incremental append
--last-value 2 -m 1;
Error:
ERROR 1064 (42000): You have an error in your SQL syntax; check the
manual that corresponds to your MySQL server version for the right
syntax to use near 'sqoop import --connect
'jdbc:mysql://localhost:3306/retail_db' --usernameretail_' at line 1"
I am trying to append a new record inserted into the table "Sample" to the file. Can anyone help me out with this issue?
Thanks,
Aravind
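For reference, a minimal sketch of the same import as it would typically be entered at the Linux shell prompt rather than at the mysql> prompt (all values taken verbatim from the question); Sqoop is invoked from the operating-system shell, not from inside the MySQL client:
sqoop import --connect jdbc:mysql://localhost:3306/retail_db \
  --username retail_dba --password cloudera \
  --table sample --target-dir /Aravind1/sqoopdemo01 \
  --check-column id --incremental append \
  --last-value 2 -m 1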

Related

Permission denied on pg_catalog function while importing DB in cloudsql

I have exported multiple DBs from the GCP Cloud SQL instance (Postgres 14) to a GCP bucket using the gcloud sql export command, but when I try to import those DBs (after deleting the original DBs and recreating them with the same names) using the gcloud sql import command, the import operation gives the following error:
"Error:exit status 3 stdout(capped at 100k bytes): SET SET SET SET SET set_config----------(1 row) SET SET SET SET REVOKE REVOKE GRANT GRANT stderr: ERROR: permission denied for function pg_replication_origin_advance"
I also tried the import using the "postgres" user in the --user argument of the sql import command, but ended up with the same error.
It gives this error when trying to execute the "GRANT ALL ON FUNCTION pg_catalog.pg_replication_origin_advance(text, pg_lsn) to cloudsqlsuperuser" statement in the SQL dump file created by gcloud sql export.
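For context, a minimal sketch of the export/import flow described above, with hypothetical instance, bucket, and database names:
gcloud sql export sql my-instance gs://my-bucket/dump.sql --database=mydb
gcloud sql import sql my-instance gs://my-bucket/dump.sql --database=mydb --user=postgres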

Cannot find database driver: org.postgresql.Driver for RDS postgres connection-Liquibase

I am trying to connect to RDS Postgres from Liquibase.
Maven dependency is:
<dependency>
    <groupId>software.aws.rds</groupId>
    <artifactId>aws-postgresql-jdbc</artifactId>
    <version>0.1.0</version>
</dependency>
I am not using a properties file because the credentials are confidential and I cannot put them in a properties file, hence I am calling it from the command line:
-liquibase --url:jdbc:postgressql://xxxx.rds.amazonaws.com:5432/database_name --username = $username --password = $password --changeLogFile=changelog_aurora.xml update
Also tried:
-liquibase --url:jdbc:postgressql://xxxx.rds.amazonaws.com:5432/database_name --username = $username --password = $password --changeLogFile=changelog_aurora.xml --driver=org.postgressql.Driver update
I am getting below error:
Unexpected error running liquibase: java.lang.RuntimeException:Cannot
find database driver: org.postgres.Driver
Please help me here. Do I need to install the Postgres dependency as well? If so, what version should it be? Should it match the AWS RDS Postgres version?
I have the following recommendations for you:
You can download the PostgreSQL JDBC driver file from here if you haven't already; also make sure you have Java installed and configured.
Preferably keep your postgresql-<version>.jar file in the same folder as your changelog file, OR add it to the path (ref), OR use --classpath=path:anotherPath with your command line to tell Liquibase where to look for the PostgreSQL JDBC driver file.
Also, if you are familiar with AWS SSM, you can keep all your secrets in SSM and read them at runtime from within your property file (FYI).
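Putting these together, a rough sketch of the command line (the driver jar path and version are placeholders; note the spellings jdbc:postgresql and org.postgresql.Driver):
liquibase --classpath=/path/to/postgresql-42.x.x.jar \
  --driver=org.postgresql.Driver \
  --url=jdbc:postgresql://xxxx.rds.amazonaws.com:5432/database_name \
  --username=$username --password=$password \
  --changeLogFile=changelog_aurora.xml update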

Not able to override the input value for Google Cloud Dataproc query

We are migrating our existing jobs from Hadoop to the GCP environment, so we have a requirement to convert an existing Beeline command to Cloud Dataproc. In the Hadoop environment we were using the following command to query a table in Hive:
beeline -u BEELINE_URL --hivevar HIVE_CORE_DB=$HIVE_CORE_DB --hivevar HIVE_CORE_TBL=$HIVE_CORE_TBL -f table.hql
The input file table.hql contains the following:
select count(*) from ${hivevar:HIVE_CORE_DB}${hivevar:HIVE_CORE_TBL};
When converting the same code to a Cloud Dataproc command, I am using the following:
gcloud dataproc jobs submit hive --cluster=cluster_name --region=region_name --params HIVE_CORE_DB=$HIVE_CORE_DB --params HIVE_CORE_TBL=$HIVE_CORE_TBL --file=table.hql
And the table.hql file again contains the same input as earlier, i.e.
select count(*) from ${hivevar:HIVE_CORE_DB}${hivevar:HIVE_CORE_TBL};
However, I am not able to override the values of the variables contained in the input file table.hql.
I am getting the following error:
Error: Error while compiling statement: FAILED: ParseException line 1:15 cannot recognize input near '$' '{' 'hivevar' in table name (state=42000,code=40000)
TIA
Instead of specifying input values under separate --params flags, you need to use only one.
Try changing the
--params HIVE_CORE_DB=$HIVE_CORE_DB --params HIVE_CORE_TBL=$HIVE_CORE_TBL
part of your command to
--params HIVE_CORE_DB=$HIVE_CORE_DB,HIVE_CORE_TBL=$HIVE_CORE_TBL.
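In other words, the full command would look roughly like this (cluster_name and region_name remain the placeholders used in the question):
gcloud dataproc jobs submit hive --cluster=cluster_name --region=region_name \
  --params HIVE_CORE_DB=$HIVE_CORE_DB,HIVE_CORE_TBL=$HIVE_CORE_TBL \
  --file=table.hql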

How to Sqoop import data from different databases into HDFS

$cat > import.txt
import
--connect
jdbc:mysql://localhost/hadoopdb
--username
hadoop
-password
abc
I have kept the JDBC URL, username, and password in one text file, and when I run a Sqoop command I call it as follows:
sqoop --options-file /user/cloudera/import.txt --table employee
But I want to import from multiple databases into HDFS. How should I approach this for multiple databases?
I tried searching for this but didn't find any proper resource. Can anyone help me with this?
I have accomplished this by writing a shell script with multiple sqoop statements, one sqoop statement per job. You could have each statement within the shell script reference its own options file.
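A minimal sketch of that approach, assuming two hypothetical options files import_db1.txt and import_db2.txt, one per source database:
#!/bin/bash
# each Sqoop statement reads its own options file (connection string, username, password)
sqoop --options-file /user/cloudera/import_db1.txt --table employee
sqoop --options-file /user/cloudera/import_db2.txt --table employee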
You can create a workflow.xml for the Sqoop action by parameterising each field, e.g.:
import
--connect
${connection_string}
--username
${user_name}
--password-file
${password_file_path}
--table
${table_name}
Assign the value of each variable in the job.properties file and run it through Oozie:
oozie job -oozie http://XXXX.XX.iroot.adidom.com:XXXX/oozie -config job.properties -run
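A hedged sketch of what the corresponding job.properties might contain (all values below are hypothetical placeholders):
nameNode=hdfs://namenode-host:8020
jobTracker=resourcemanager-host:8032
oozie.wf.application.path=${nameNode}/user/cloudera/sqoop-workflow
connection_string=jdbc:mysql://localhost/hadoopdb
user_name=hadoop
password_file_path=/user/cloudera/.password-file
table_name=employee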
You can also schedule it through coordinator.xml.
Thanks,

mysqlimport: Error: 1227 Access denied with MySQL 8.0 and Amazon RDS

We are using MySQL 8.0.* and a .csv file for importing data into Amazon RDS. We are executing this command from the app server command line.
Error:
mysqlimport: Error: 1227 Access denied; you need (at least one of) the SUPER, SYSTEM_VARIABLES_ADMIN or SESSION_VARIABLES_ADMIN privilege(s) for this operation
Command:
mysqlimport --local --compress --columns='col1,col2,col3,col4' -h dbhost -u dbusername -pdbpassword dbname --fields-terminated-by='|' file_path/table_name.csv
We have already granted DBA permissions to the DB user.
As the error suggests, the user you are running the import command as does not have the SESSION_VARIABLES_ADMIN privilege.
You can grant it as shown below.
GRANT SESSION_VARIABLES_ADMIN ON *.* TO 'user'@'%';
OR
GRANT SESSION_VARIABLES_ADMIN ON *.* TO 'user'@'specific-host';
It should resolve the issue.
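To double-check that the privilege is now in place, you can list the user's grants (same hypothetical account as above):
SHOW GRANTS FOR 'user'@'%';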
Comment out the parameters TEMP_LOG_BIN and GTID_PURGED in the mysql dump and save it. Then try to import the dump file into the target DB. It should work.