I would like to know how to perform unit tests on ETL jobs developed in Talend. My ETLs read files, generate files, and connect to an SAP system (reading/writing IDocs). Are there any tools for this, or would it take developing a small Java test framework?
Yes, Mohcine, Talend introduced test case automation in version 6 as part of its overall Continuous Integration framework. You right-click on a component in a job and select "Create Test Case". It will create a skeleton test case job, which you can extend to perform a variety of tests, including DB connectivity and results checks. It will take some time to learn the tool well enough to make it useful, but it's worth the effort. Also, this feature may only be available in the subscription version of Talend; I am not sure whether it's available in Open Studio.
Here is an example: the diagram is a very simple job that loads a file into a DB table.
Here is the test case I created by first generating the skeleton, then modifying it for my specific purposes.
Here is the assert where I match the number of rows read from the file with the number of rows inserted into the db table.
For further info check out this tutorial.
I have seen a few different unit testing approaches online using qunit and k4unit, but I can only get them to test single functions. I was hoping I could run a unit test covering the daily checks I execute each day, such as "have the night jobs run correctly?", "are the dashboards on the WebUI up?", and "did the deploy script run with no errors?". Is there built-in kdb+ functionality for these kinds of tests, or a clean way to adapt the qunit or k4unit unit tests? Or will it require a script written from scratch?
Thanks
I don't think a unit test is what you're looking for here. Some kind of reporting mechanism for jobs would be more appropriate. Within your existing jobs you could generate some kind of alert to indicate the job's success/failure. This qmail library may be useful for that.
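As a sketch of that alerting idea (in Python rather than q, purely for illustration; the check functions, their inputs, and the alert format are all assumptions, not kdb+ or qmail APIs):

```python
# Minimal daily-checks sketch: run each named check, collect failures,
# and build an alert message summarizing the results.

def check_night_jobs_ran(log_lines):
    """Pass if every night-job log line contains a 'done' marker."""
    return all("done" in line for line in log_lines)

def check_deploy_errors(log_lines):
    """Pass if the deploy log contains no 'ERROR' entries."""
    return not any("ERROR" in line for line in log_lines)

def run_checks(checks):
    """Run each (name, fn, arg) check; return the names of failed checks.
    A check that raises is counted as a failure rather than aborting."""
    failures = []
    for name, fn, arg in checks:
        try:
            ok = fn(arg)
        except Exception:
            ok = False
        if not ok:
            failures.append(name)
    return failures

if __name__ == "__main__":
    night_log = ["job1 done", "job2 done"]
    deploy_log = ["deploy started", "deploy finished"]
    checks = [
        ("night jobs ran", check_night_jobs_ran, night_log),
        ("deploy had no errors", check_deploy_errors, deploy_log),
    ]
    failed = run_checks(checks)
    # In a real setup this message would be handed to a mailer such as qmail.
    print("ALERT: " + ", ".join(failed) if failed else "all checks passed")
```

The point is that each "daily check" becomes a plain boolean function, and the wrapper turns failures into a single alert you can email, rather than forcing them into a unit-test framework.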
I'm not sure what kind of system you're using, but AquaQ Analytics' TorQ system has a reporter process which can (amongst other things) email alerts for specific processes.
(Disclaimer: I'm an employee of AquaQ Analytics)
I have designed a job in Talend. The job fetches data from a database, converts it into JSON, and uploads that JSON to a server. I want to write test cases for my job, like we write unit tests in Java projects. I have searched a lot for how to write test cases for a Talend job but did not find anything. If anyone knows how to test a Talend job, please advise.
You can simply create a job which calls your job (via tRunJob, or tSoap if your job is SOAP-exposed):
Init your database
call your job
check the result on the server (or mock the server call by overriding context parameters)
use tAssert to make your check
use tAssertCatcher->tLogRow to print test result
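Outside of Talend itself, the same harness pattern can be sketched in plain Python (the job here is a stand-in function, not a real tRunJob call, and SQLite stands in for your database; table names are invented for the example):

```python
import sqlite3

def init_database(conn):
    """Step 1: init your database with known test data."""
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)",
                     [(1, "alice"), (2, "bob")])

def run_job(conn):
    """Step 2: call your job. Here a stand-in that copies rows to a
    target table, the way an ETL job might load them."""
    conn.execute("CREATE TABLE customers_out AS SELECT * FROM customers")

def check_result(conn):
    """Steps 3-4: check the result and assert on it, as tAssert would."""
    src = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    dst = conn.execute("SELECT COUNT(*) FROM customers_out").fetchone()[0]
    assert src == dst, f"row counts differ: {src} != {dst}"
    return src, dst

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_database(conn)
    run_job(conn)
    print(check_result(conn))  # step 5: print the test result
```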
I built CI for our project (internally) with a basic Java application: a telnet wrapper around the Talend Command Line API (listJob, runJob, ...) that then generates a JUnit XML result file. Everything is called by Jenkins.
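A sketch of the JUnit-XML-generation half of such a wrapper (in Python rather than Java, for brevity; the job names are invented and the telnet layer is assumed to have already produced the per-job results):

```python
import xml.etree.ElementTree as ET

def to_junit_xml(results):
    """Render {job_name: error_message_or_None} as a JUnit XML string
    that Jenkins can pick up as a test report."""
    failures = sum(1 for err in results.values() if err is not None)
    suite = ET.Element("testsuite", name="talend-jobs",
                       tests=str(len(results)), failures=str(failures))
    for job, err in results.items():
        case = ET.SubElement(suite, "testcase", classname="talend", name=job)
        if err is not None:
            ET.SubElement(case, "failure", message=err)
    return ET.tostring(suite, encoding="unicode")

if __name__ == "__main__":
    # Results as the Command Line wrapper might have collected them.
    print(to_junit_xml({"load_customers": None, "export_idoc": "timeout"}))
```

Writing the returned string to a `*.xml` file in the Jenkins workspace is enough for the standard JUnit report publisher to chart pass/fail per job.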
It seems that nothing really exists to test Talend jobs properly :-(
Good luck.
In Talend 6.0.1 I found a tab named "Test Cases", which seems new to me. At https://help.talend.com/display/TalendRealtimeBigDataPlatformStudioUserGuide60EN/6.10+Testing+Jobs+using+test+cases you can find an explanation of writing such test cases. I'm not sure if it's what you wanted, but I will have a look at it.
For end-to-end testing, we run two versions of the job, ask the user which version to compare with which, dynamically create the tables on the fly, and compare the results on the DB side. This is just an attempt.
Yeah, there is no JUnit OOB (out of the box).
Let me begin by saying I'm a ColdFusion newbie.
I'm trying to research whether it's possible to do the following and what would be the best approach to achieve it.
Whenever a developer checks code into SVN, I would like to get all the new changes/files and do an automatic build to check whether the code can be deployed successfully to the production server. I guess there are two parts to it: first syntax checking, and second integration testing (whether functionality is working as expected). For the latter part, some unit test tools would have to be used.
Can someone comment on their experience doing something similar for coldfusion.
Sorry for being a bit vague... I know it's a very open-ended question, but any feedback would be appreciated.
Thanks
There's a project called "Cloudy With A Chance of Tests" that purports to do what you require. In particular it brings together a number of other CFML code analysis projects (VarScope & QueryParam) to check code, as well as unit testing. I am not currently using it myself but did have a look at it some time ago (more than 12 months) and it appeared to be quite good.
https://github.com/mhenke/Cloudy-With-A-Chance-Of-Tests
Personally I run MXUnit tests in Jenkins using the instructions from the MXUnit site - available here:
http://wiki.mxunit.org/display/default/Continuous+Integration+--+Running+tests+with+Jenkins
Essentially this is set up as an ant task in Jenkins, which executes the MXUnit tests and reports back the results.
We're not doing fully continuous integration, but we have a process that automates some of the drudgery of our builds:
replace the site's application.cf(m|c) with one that tells users that the app is being deployed (we had QA staff raising defects that were due to re-deployments)
read a database manifest XML that lists all the SQL scripts making up the current release. We concatenate the scripts into a single upgrade script, suitable for shipping
execute the SQL script against the server's DB, noting any errors. The concatenation process also adds a line of SQL after each imported script that writes to a runlog table, so we can see what ran, how long it took, and which build it was associated with. If you're looking to replicate this step, take a look at Liquibase
deploy the latest code
make an http call to a ?reset=true type URL to tell the app to re-initialize
execute any tests
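The concatenation-plus-runlog step above can be sketched like this (the runlog table name and columns are assumptions for illustration, not the poster's actual schema):

```python
def build_upgrade_script(scripts, build_id):
    """Concatenate (name, sql) release scripts into one upgrade script,
    appending a runlog INSERT after each one so you can later see what
    ran and which build it belonged to."""
    parts = []
    for name, sql in scripts:
        parts.append(f"-- begin {name}")
        parts.append(sql.rstrip(";") + ";")
        parts.append(
            f"INSERT INTO runlog (script_name, build_id, ran_at) "
            f"VALUES ('{name}', '{build_id}', CURRENT_TIMESTAMP);"
        )
    return "\n".join(parts)

if __name__ == "__main__":
    scripts = [
        ("001_add_column.sql", "ALTER TABLE orders ADD status VARCHAR(20)"),
        ("002_backfill.sql", "UPDATE orders SET status = 'open'"),
    ]
    print(build_upgrade_script(scripts, build_id="1.4.2"))
```

Diffing the runlog timestamps between consecutive entries gives per-script durations for free.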
The build is requested manually through the build servers we have, but you click a button, make tea and it's done.
We've just extended the above to cope with multiple servers in a cluster and it ticks along nicely. I think the above suggestion of using the Jenkins SVN plugin to automate the process sounds like the way to go.
I want to test my database as part of a set of integration tests. I've got all my code unit tested against mocks etc. for speed, but I need to make sure all the stored procedures and code are working as they should when persisting. I did some Googling yesterday and found a nice article here http://msdn.microsoft.com/en-us/magazine/cc163772.aspx, but it seemed a little old. I wondered if there is any current 'better' way of clearing out the database, restoring it to an expected state, or rolling back ready for each test? I'm coding in C# 4 and MVC 3, using SQL Server 2008.
We are using DbUnit to set up and/or tear down the database between tests as well as to assert database state during test.
It's stupid-simple, so it may not be exactly what you need, but what I've done is keep a backup of the database at a given sane state - usually whatever state the current production database is at. Then, for each build we restore that database (using Jenkins, NAnt and SQLCMD), apply the current build's update scripts, and run our test suite. This has the advantage of both giving you a database that is a 'known quantity' and verifying that your upgrade scripting is working.
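A sketch of that restore-and-upgrade loop, expressed as the sqlcmd invocations a NAnt or Jenkins step might issue (the server, database, and file names are placeholders; the `-S`, `-d`, `-i`, and `-Q` flags are real sqlcmd options):

```python
def build_pipeline(server, db, backup_file, update_scripts):
    """Build the sequence of sqlcmd command lines for one build:
    restore the known-state backup, then apply each update script."""
    restore = ["sqlcmd", "-S", server, "-Q",
               f"RESTORE DATABASE [{db}] FROM DISK = '{backup_file}' WITH REPLACE"]
    steps = [restore]
    for script in update_scripts:
        steps.append(["sqlcmd", "-S", server, "-d", db, "-i", script])
    return steps

if __name__ == "__main__":
    for cmd in build_pipeline("localhost", "AppDb", r"C:\backups\appdb.bak",
                              ["001_schema.sql", "002_data.sql"]):
        print(" ".join(cmd))
    # In a real build you would pass each cmd to subprocess.run(cmd, check=True)
    # and then launch the integration test suite against the restored database.
```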
I am about to embark on a project using Apache Hadoop/Hive which will involve a collection of Hive query scripts to produce data feeds for various downstream applications. These scripts seem like ideal candidates for some unit testing - they represent the fulfillment of an API contract between my data store and client applications, and as such, it's trivial to write what the expected results should be for a given set of starting data. My issue is how to run these tests.
If I were working with SQL queries, I could use something like SQLite or Derby to quickly bring up test databases, load test data, and run a collection of query tests against them. Unfortunately, I am unaware of any such tools for Hive. At the moment, my best thought is to have the test framework bring up a local Hadoop instance and run Hive against that, but I've never done that before and I'm not sure it will work, or that it's the right path.
Also, I'm not interested in a pedantic discussion about if what I am doing is unit testing or integration testing - I just need to be able to prove my code works.
Hive has a special standalone mode, specifically designed for testing purposes. In this mode it can run without Hadoop. I think it is exactly what you need.
Here is a link to the documentation:
http://wiki.apache.org/hadoop/Hive/HiveServer
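One way to drive such a standalone instance from a test harness is to shell out to the Hive CLI and compare its output against expected rows. This is a sketch: it assumes a `hive` binary on the PATH and uses its real `-e` flag; the runner is injectable so the comparison logic can be exercised without Hive installed.

```python
import subprocess

def run_hive_query(query):
    """Run a query through the Hive CLI (`hive -e`) and return the
    output rows as a list of non-empty lines."""
    result = subprocess.run(["hive", "-e", query],
                            capture_output=True, text=True, check=True)
    return [line for line in result.stdout.splitlines() if line.strip()]

def assert_query_result(query, expected_rows, runner=run_hive_query):
    """Compare actual vs expected rows; `runner` is injectable so tests
    can substitute a fake instead of a real Hive instance."""
    actual = runner(query)
    assert actual == expected_rows, f"expected {expected_rows}, got {actual}"

if __name__ == "__main__":
    # With a fake runner, so the sketch is runnable without Hive installed.
    fake = lambda q: ["1\talice", "2\tbob"]
    assert_query_result("SELECT id, name FROM users",
                        ["1\talice", "2\tbob"], runner=fake)
    print("query test passed")
```

Loading the fixture data before each test (e.g. via `LOAD DATA LOCAL INPATH` statements run through the same CLI) would complete the bring-up/load/assert cycle described in the question.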
I'm working as part of a team to support a big data and analytics platform, and we also have this kind of issue.
We've been searching for a while, and we found two pretty promising tools:
https://github.com/klarna/HiveRunner
https://github.com/bobfreitas/HadoopMiniCluster
HiveRunner is a framework built on top of JUnit to test Hive queries. It starts a standalone HiveServer with an in-memory HSQL database as the metastore. With it you can stub tables and views, mock samples, etc.
There are some limitations on Hive versions, though, but I definitely recommend it.
Hope it helps you =)
You may also want to consider the following blog post which describes automating unit testing using a custom utility class and ant: http://dev.bizo.com/2011/04/hive-unit-testing.html
I know this is an old thread, but just in case someone comes across it: I have followed up on the whole minicluster & Hive testing question, and found that things have changed with MR2 and YARN, but in a good way. I have put together an article and a GitHub repo to give some help with it:
http://www.lopakalogic.com/articles/hadoop-articles/hive-testing/
Hope it helps!