Channel: Teradata Downloads - Hadoop

Sqoop export fails with untranslatable character error


Hi, we have a Sqoop job that exports data to Teradata hourly. Sometimes the job fails with an untranslatable character error. We would like to know if there is any Java function to check whether a string is translatable by Teradata. This is high priority. Please let us know if there are any questions.
2015-04-19 22:59:56,734 WARN org.apache.hadoop.mapred.Child: Error running child
com.teradata.hadoop.exception.TeradataHadoopSQLException: java.sql.BatchUpdateException: [Teradata JDBC Driver] [TeraJDBC 14.00.00.01] [Error 1338] [SQLState HY000] A failure occurred while executing a PreparedStatement batch request. Details of the failure can be found in the exception chain that is accessible with getNextException.
    at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeBatchUpdateException(ErrorFactory.java:147)
    at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeBatchUpdateException(ErrorFactory.java:136)
    at com.teradata.jdbc.jdbc_4.TDPreparedStatement.executeBatchDMLArray(TDPreparedStatement.java:239)
    at com.teradata.jdbc.jdbc_4.TDPreparedStatement.executeBatch(TDPreparedStatement.java:1951)
    at com.teradata.hadoop.mapreduce.TeradataRecordWriter.write(TeradataRecordWriter.java:60)
    at com.teradata.hadoop.mapreduce.TeradataRecordWriter.write(TeradataRecordWriter.java:23)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:558)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:106)
    at com.teradata.hadoop.mapreduce.TeradataTextFileExportMapper.map(TeradataTextFileExportMapper.java:32)
    at com.teradata.hadoop.mapreduce.TeradataTextFileExportMapper.map(TeradataTextFileExportMapper.java:12)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: com.teradata.jdbc.jdbc_4.util.JDBCException: [Teradata Database] [TeraJDBC 14.00.00.01] [Error 6706] [SQLState HY000] The string contains an untranslatable character.
    at com.teradata.jdbc.jdbc_4.util.ErrorFactory.cloneJDBCException(ErrorFactory.java:169)
    at com.teradata.jdbc.jdbc_4.statemachine.PreparedBatchStatementController.handleRunException(PreparedBatchStatementController.java:93)
    at com.teradata.jdbc.jdbc_4.statemachine.StatementController.runBody(StatementController.java:126)
    at com.teradata.jdbc.jdbc_4.statemachine.PreparedBatchStatementController.run(PreparedBatchStatementController.java:56)
    at com.teradata.jdbc.jdbc_4.TDStatement.executeStatement(TDStatement.java:372)
    at com.teradata.jdbc.jdbc_4.TDPreparedStatement.executeBatchDMLArray(TDPreparedStatement.java:219)
    ... 16 more

    at com.teradata.hadoop.mapreduce.TeradataRecordWriter.write(TeradataRecordWriter.java:64)
    at com.teradata.hadoop.mapreduce.TeradataRecordWriter.write(TeradataRecordWriter.java:23)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:558)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:106)
    at com.teradata.hadoop.mapreduce.TeradataTextFileExportMapper.map(TeradataTextFileExportMapper.java:32)
    at com.teradata.hadoop.mapreduce.TeradataTextFileExportMapper.map(TeradataTextFileExportMapper.java:12)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
2015-04-19 22:59:56,737 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
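
In case it helps others, one approach we are looking at is to pre-screen strings on the Hadoop side before the export, using java.nio's CharsetEncoder. This is only a minimal sketch and assumes the target Teradata columns use the LATIN character set (roughly ISO-8859-1); if the columns are defined as UNICODE a different charset name would apply.

import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class TranslatableCheck {

    // CharsetEncoder is not thread-safe, so create one per thread/mapper.
    private final CharsetEncoder encoder =
            Charset.forName("ISO-8859-1").newEncoder();

    /** Returns true if every character in the value fits the assumed target charset. */
    public boolean isTranslatable(String value) {
        return value == null || encoder.canEncode(value);
    }

    public static void main(String[] args) {
        TranslatableCheck check = new TranslatableCheck();
        System.out.println(check.isTranslatable("plain ascii"));   // true
        System.out.println(check.isTranslatable("café"));          // true: é exists in ISO-8859-1
        System.out.println(check.isTranslatable("\u20AC price"));  // false: the euro sign is not in ISO-8859-1
    }
}

Rows that fail such a check could be cleaned or routed to an error file before the Sqoop export; alternatively, the target columns could be defined as UNICODE.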


Teradata connector JAVA API document

Big Analytics Appliance


I saw a presentation on the Big Analytics Appliance, which gives you Hadoop, Aster and Teradata all in one appliance, plus a language called SQL-H which allows querying across all three.
A question I had is that the load on Hadoop grows very fast, because people pour in a lot of data which is not used very frequently (at least in my case).
Is it possible that, if I just need to grow the Hadoop nodes, I install the Teradata Hadoop distribution on a commodity server (a cheaper standard server from HP/Dell) and then add it to the cluster?
Or should all nodes be Teradata hardware?


Teradata Commodity Hardware for Hadoop


Suppose a customer is using Teradata already and wants to move some of the unused data to Hadoop.
Suppose we use the "Teradata Commodity Offering for Hadoop" [1] to build a Hadoop cluster using commodity Dell servers.
Can we still use things like the Teradata high-performance Hadoop connector to move data back and forth? What about SQL-H? Can that be used as well?
Or will SQL-H and high-performance data movement be available only when we buy the Big Analytics Appliance?

I want to know what the interoperability of the commodity offering is with the Teradata, Aster and Big Analytics appliances.
 
1. http://assets.teradata.com/resourceCenter/downloads/Brochures/EB7660_2.pdf?processed=1


Will sqoop import command work with Teradata Dbase 15?


Hello, I am new to Teradata. For our project, we are currently using Teradata 13.10, and we are planning an upgrade to version 15 soon.
The client is doing "Big Data analysis". To pull the data into Hadoop, they are using this command:

sqoop import --connect jdbc:teradata://xxx.xx.xxx.xx/DATABASE=MY_DBASE --driver com.teradata.jdbc.TeraDriver --username MyUser --password Password4MyUser

Questions:
1) Can anyone tell me if this command will work the same in version 15 as it does now in 13.10?
2) Are there any issues to be aware of?
3) Is there something that would be better to use than this command? If so, why would it be better?
While researching this myself, I found this forum thread, which seems relevant:
http://forums.teradata.com/forum/tools/tpt-15-00-teradata-hdp-sanbox-data-movement
Thanks a lot; I appreciate any help or guidance.
 


Free Webinar on Big Data and Hadoop Developer


Hi Everyone,
Learn the Big Data and Hadoop Developer course with GreyCampus and get better job opportunities with bigger organisations on the technology front.
Topics:

  1. What is big data?
  2. What is the need for big data in the present corporate world?
  3. What objectives are covered in the big data course? (Brief on the objectives)
  4. How is big data useful to learners?
  5. What are the prerequisites for big data?
  6. How is the current big data market in terms of job opportunities for participants?

Register here for the free webinar
http://www.greycampus.com/training/big-data-hadoop-webinar


Grave Mistakes that Companies Make in Big Data Projects


Hi All,

Big Data helps enterprises capture information using software. Do you know what mistakes enterprises make while implementing Big Data projects?

Ubiquitous in contemporary industry, big data and analytics are being deployed by just about every organization to improve business outcomes. One of the primary purposes of a big data implementation is to incorporate additional sets of data into the existing data infrastructure, so as to give companies the ability to ask anything of the resulting data set.

But big data is not restricted to the mere handling of large volumes of data, and there are certain common mistakes that enterprises need to avoid while implementing big data projects in order to attain better decision support processes and analytical insights.

Continue reading


Big Data Governance – What is it? How to get started?


Can we recommend TDCH for Extract-Load process


Hi,
We all know TDCH uses JDBC connectivity to establish a connection between Hadoop and Teradata and to move data across these platforms.
My concern is that, using JDBC, if a FastLoad operation is interrupted for some reason it must be restarted from the beginning. This issue is quite frequent when a large record set is loaded, because the checkpoint feature is not available over JDBC connectivity. It seems that even in the recent JDBC 15.x release the checkpoint feature has not been introduced.
With this open item, could someone guide me on the following:
1. Can we recommend TDCH (Hadoop-to-Teradata) to move data at regular intervals for a large IDW system?
2. Does QueryGrid (LOAD_FROM_HCATALOG) use JDBC connectivity or a different mechanism?
3. What is the internal latency involved on Teradata while using TDCH/QueryGrid, specifically with respect to AWT usage?
Thanks!!
TDThrottle


Free Webinar on Big Data and Hadoop Developer


Hi,

 

GreyCampus is conducting a free webinar on Big Data and Hadoop Developer. Learn Big Data & Hadoop Developer and become a Hadoop expert by mastering MapReduce, YARN, Pig, Hive, HBase, Oozie, Flume and Sqoop while working on industry-based use cases and projects.

 

Topics to be covered:

  1. Brief introduction to Big Data
  2. Explanation of any one module
  3. Prerequisites of Big Data
  4. Current Big Data market in terms of job opportunities

Time and date: Wednesday, July 15, 2015, 8:00 PM to 11:00 PM IST

 

Register here for the free webinar


Multithreaded/MultiProcess inserts in teradata (Sqoop with teradata)


Hi all,
I am trying to export a Hive table to Teradata using Sqoop with the Teradata JDBC driver. The Sqoop export works well only when I use a single mapper. With multiple mappers, the JDBC driver throws a '[SQLState 40001] Transaction ABORTed due to deadlock' error. This is consistently reproducible when the target Teradata table does not have a primary index on it. When the target table has a primary index, the issue is generally reproducible by increasing the number of mappers (i.e. if it works with 4, it will most likely fail with 10 mappers). I am using the default values for sqoop.export.records.per.statement and sqoop.export.statements.per.transaction during the Sqoop export. Any idea what could be causing this?
Just to check how Teradata behaves when inserts into the same table are done by multiple threads (or processes, as in the case of multiple mappers with a Sqoop export), I created a multithreaded JDBC application and tried inserting some data into the target table. Each thread in this application created a new connection and did batch inserts (records.per.statement=100, statements.per.transaction=100) into the target table.
Below are my observations.
With a NO PI table: I was able to consistently reproduce the transaction deadlock issue (no duplicate records).
With a PI table: Surprisingly, I was not able to reproduce it even with 12 threads, each trying to insert 100,000 rows. I was able to reproduce the issue when I tried to run multiple instances of this application (each instance running 4 threads). This happened when the inserted data had duplicate records.
Questions:
1. What could be causing the above transaction deadlock issue?
2. Are JDBC batch insertions using multiple threads/process expected to work with Teradata?
3. I saw a few posts on the web suggesting to lower the sqoop export batch size (records.per.statement / statements.per.transaction) to make it work. But does that actually fix the issue? I think it just reduces the chances of a deadlock, and there could be an impact on performance if we decrease the batch size.
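
For reference, the test harness mentioned above looked roughly like this. It is only a sketch: the connection URL, credentials, and table name are placeholders, and the batch sizes mirror the records.per.statement / statements.per.transaction values quoted above.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelInsertTest {

    private static final String URL  = "jdbc:teradata://<host>/DATABASE=<db>"; // placeholder
    private static final String USER = "<user>";
    private static final String PASS = "<password>";
    private static final int THREADS = 12;
    private static final int ROWS_PER_THREAD = 100_000;
    private static final int RECORDS_PER_STATEMENT = 100;      // addBatch calls per executeBatch
    private static final int STATEMENTS_PER_TRANSACTION = 100; // executeBatch calls per commit

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        for (int t = 0; t < THREADS; t++) {
            final int threadId = t;
            pool.submit(() -> insertRows(threadId));
        }
        pool.shutdown();
    }

    private static void insertRows(int threadId) {
        String sql = "INSERT INTO target_table (id, payload) VALUES (?, ?)"; // hypothetical table
        try (Connection conn = DriverManager.getConnection(URL, USER, PASS);
             PreparedStatement ps = conn.prepareStatement(sql)) {
            conn.setAutoCommit(false);
            int statementsInTxn = 0;
            for (int i = 0; i < ROWS_PER_THREAD; i++) {
                ps.setInt(1, threadId * ROWS_PER_THREAD + i);
                ps.setString(2, "row-" + i);
                ps.addBatch();
                if ((i + 1) % RECORDS_PER_STATEMENT == 0) {
                    ps.executeBatch();
                    if (++statementsInTxn == STATEMENTS_PER_TRANSACTION) {
                        conn.commit();
                        statementsInTxn = 0;
                    }
                }
            }
            ps.executeBatch(); // flush any remaining rows
            conn.commit();
        } catch (Exception e) {
            e.printStackTrace(); // the deadlock (SQLState 40001) surfaces here
        }
    }
}

Each thread uses its own Connection, since a JDBC Connection is not meant to be shared across threads for concurrent batch work.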
 


Free Webinar on Big Data and Hadoop Developer


Hi,

 

GreyCampus is conducting a free webinar on Big Data and Hadoop Developer. Learn Big Data & Hadoop Developer and become a Hadoop expert by mastering MapReduce, YARN, Pig, Hive, HBase, Oozie, Flume and Sqoop while working on industry-based use cases and projects.

 

Topics to be covered:

  1. Brief introduction to Big Data
  2. Explanation of any one module
  3. Prerequisites of Big Data
  4. Current Big Data market in terms of job opportunities

Time and date: Monday, August 3, 2015, 8:00 PM to 11:00 PM IST

 

Register here for the free webinar


Can the TDCH be used to load data into view (in teradata)


I was trying to use the TDCH connector to load data from a Hive table into a Teradata table, but I want to load the data into the target table (in Teradata) via a VIEW instead of accessing the table directly.
So, is there a way to load the data into the target table through the VIEW?
There is an option called "tdch.output.teradata.data.dictionary.use.xview", but setting this option to true didn't help me either. I couldn't find much information on the use of this option (tdch.output.teradata.data.dictionary.use.xview), so I'm just curious what this option is for.
Below is a sample TDCH job I was using.

hadoop jar $TDCH_JAR \
com.teradata.connector.common.tool.ConnectorExportTool \
-Dmapred.job.queue.name=<queuename> \
-libjars $HIVE_LIB_JARS \
-classname com.teradata.jdbc.TeraDriver \
-url jdbc:teradata://<IPadress>/ \
-username xxxxx \
-password xxxxx \
-jobtype hive \
-fileformat textfile \
-nummappers 10 \
-method internal.fastload \
-separator "\u0009" \
-sourcedatabase <database> \
-sourcetable <table> \
-sourcefieldnames "<all the columns>" \
-targettable <target table> \
-targetfieldnames "<all the target columns>" \
-stagedatabase <stage database> \
-forcestage true


Need to get common users to multiple users mapped to it


Hi,
I have a table which has one user ID mapped to multiple user IDs. I want to get the common user IDs mapped to each pair of user IDs.
My data looks like this:

User ID	User ID 1
1	2
1	3
1	4
1	5
1	6
1	7
1	8
1	9
2	1
2	3
2	5
2	7
2	9
2	11
2	13
2	15
2	17
.	.
.	.
.	.
.	.
.	.
.	.
2000	2001

I need to get, for every User ID combination starting from (1, 2) up to (1999, 2000), how many common User ID 1 values there are.
Thanks in advance for any solution


Need to get common users for every combination of user IDs


Hi,
I have a table which has one user ID mapped to multiple other user IDs. I want to get the common user IDs mapped to every pair of user IDs.

My data looks like this:

User ID	User ID 1
1	2
1	3
1	4
1	5
1	6
1	7
1	8
1	9
2	1
2	3
2	5
2	7
2	9
2	11
2	13
2	15
2	17
.	.
.	.
.	.
.	.
.	.
.	.
2000	2001

User ID and User ID 1 are two different columns. For each user ID in column 1 there are multiple user ID mappings in column 2.
I need to get, for every User ID combination in column 1 (e.g. (1, 2), (1, 3), and so on), starting from (1, 2) up to (1999, 2000), how many common User ID 1 values there are.
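
For illustration, a self-join along the following lines is what I think it might look like, but I am not sure it is the right approach. The table and column names (user_map, user_id, user_id1) and the connection details are placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CommonUserCounts {
    public static void main(String[] args) throws Exception {
        String sql =
            "SELECT a.user_id AS user_a, b.user_id AS user_b, COUNT(*) AS common_user_id1 " +
            "FROM user_map a " +
            "JOIN user_map b " +
            "  ON a.user_id1 = b.user_id1 " + // rows that map to the same User ID 1
            " AND a.user_id < b.user_id " +   // count each pair once, e.g. (1,2) but not (2,1)
            "GROUP BY a.user_id, b.user_id " +
            "ORDER BY a.user_id, b.user_id";

        // Placeholders for the Teradata connection details.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:teradata://<host>/DATABASE=<db>", "<user>", "<password>");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                System.out.printf("(%d, %d) -> %d common%n",
                        rs.getInt("user_a"), rs.getInt("user_b"), rs.getLong("common_user_id1"));
            }
        }
    }
}

Pairs of user IDs that share no common User ID 1 would simply not appear in this output.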
 
Thanks in advance for any solution


Splitsize for the Hive table in TDCH


I'm using TDCH to export Hive data into a Teradata table. For this I need to specify the number of mappers for my TDCH job. So my question is: is the number-of-mappers option we give to the TDCH job just a hint, or will the number of mappers created by TDCH always be equal to the number given in the option?
My assumption is that the number of mappers depends mainly on the split size rather than on the given number of mappers (in the TDCH job options). Is my assumption correct for TDCH jobs?
Also, for a Hive table, how is the split size defined? Is it defined based on the number of rows, or just based on the size of the data (like 60 MB or 120 MB, etc.), similar to cases like text files?


Hive table export to Teradata using TDCH failing... due to connection reset

When exporting 2 billion+ records from Hadoop into Teradata using TDCH (Teradata Connector for Hadoop) with the command below and "batch.insert":

hadoop jar teradata-connector-1.3.2-hadoop210.jar com.teradata.connector.common.tool.ConnectorExportTool \
-D mapreduce.job.queuename=<queuename> \
-libjars ${LIB_JARS} \
-classname com.teradata.jdbc.TeraDriver \
-url <jdbc_connection_string> \
-username <user_id> \
-password "********" \
-jobtype hive \
-sourcedatabase <hive_src_dbase> \
-sourcetable <hive_src_table> \
-fileformat orcfile \
-stagedatabase <stg_db_in_tdata> \
-stagetablename <stg_tbl_in_tdata> \
-targettable <target_tbl_in_tdata> \
-nummappers 25 \
-batchsize 13000 \
-method batch.insert \
-usexviews false \
-keepstagetable true \
-queryband '<queryband>'


The data loads successfully into the stage table, but the export job then fails with "Connection Reset" before inserting the records from the stage table into the target table.

Can someone please help me identify the reason for this and how to fix it? Thanks a lot in advance!

How to load data from a hive table defined with custom serde to teradata using TDCH/TPT.


Currently TDCH supports four file formats (RCFILE | ORCFILE | SEQUENCEFILE | TEXTFILE). Is there a way to load data from a table stored with a custom SerDe format?
Any suggestions, please?
 
Thanks,
RN


Why Big Data? - Live Webinar


Big Data refers to large sets of data that cannot be analyzed with traditional tools. Techfetch.com is conducting a webinar, "Why Big Data?", on Oct 23, 2015. You can learn about the various career opportunities available in the field of Big Data.
Date: October 23, 2015 (Friday)
Time: 11:00 AM - 12:00 PM PST
Presenter: Kannan R Srinivasan
Registration Link: http://bit.ly/1NNkgQA
Topics covered:
Here are the topics covered during this webinar session:
1. Hadoop Ecosystem Introduction
2. Understanding Big Data and Hadoop
3. What is Hadoop?
4. Introduction to Pig, MapReduce, Hive and HBase.
5. Q & A
 
 Don’t miss it! Enroll now @ http://bit.ly/1NNkgQA


Teradata Binary file to Hadoop Sequence File
