When I try to run a word-count program in MapReduce using Oozie, it just reads the input records and writes them back out; I suspect it is not even invoking my mapper and reducer classes. Since I am using the new API, I have also included the new-api property tags in workflow.xml.
Map-reduce snippet:
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCount {

    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }
}
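For comparison, when the same job is run directly with hadoop jar, the configuration that the Oozie action below carries would normally live in a new-API driver. The driver is not shown in the question, so this is only a minimal sketch (Hadoop 2 style; the class name and argument handling are assumptions):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "wordcount");
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(WordCount.Map.class);
        job.setReducerClass(WordCount.Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. .../input-data
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // e.g. .../output-data
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}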
My workflow.xml:
<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns='uri:oozie:workflow:0.1' name="wordcount">
<start to="wc-node" />
<action name="wc-node">
<map-reduce>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/user/${wf:user()}/${wordcountRoot}/output-data/${outputDir}"/>
</prepare>
<configuration>
<property>
<name>mapred.mapper.new-api</name>
<value>true</value>
</property>
<property>
<name>mapred.reducer.new-api</name>
<value>true</value>
</property>
<property>
<name>mapreduce.map.class</name>
<value>WordCount.Map</value>
</property>
<property>
<name>mapreduce.reduce.class</name>
<value>WordCount.Reduce</value>
</property>
<property>
<name>mapred.output.key.class</name>
<value>org.apache.hadoop.io.Text</value>
</property>
<property>
<name>mapred.output.value.class</name>
<value>org.apache.hadoop.io.IntWritable</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>1</value>
</property>
<property>
<name>mapred.input.dir</name>
<value>/user/${wf:user()}/${wordcountRoot}/input-data</value>
</property>
<property>
<name>mapred.output.dir</name>
<value>/user/${wf:user()}/${wordcountRoot}/output-data/${outputDir}</value>
</property>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
<property>
<name>mapreduce.job.acl-view-job</name>
<value>*</value>
</property>
<property>
<name>oozie.launcher.mapreduce.job.acl-view-job</name>
<value>*</value>
</property>
</configuration>
</map-reduce>
<ok to="end" />
<error to="fail" />
</action>
<kill name="fail">
<message>Map/Reduce failed</message>
</kill>
<end name="end" />
I referred to this link https://cwiki.apache.org/OOZIE/map-reduce-cookbook.html, but still no luck.
If anyone has come across this issue, please guide me as to where I am going wrong.
Thanks in advance.
Issue resolved.
When using the new MapReduce API, the nested mapper and reducer classes must be referred to by their binary names, i.e. with a "$" separating the outer class from the inner class (and the package prefix included):
<property>
<name>mapreduce.map.class</name>
<value>oozie.WordCount$Map</value>
</property>
<property>
<name>mapreduce.reduce.class</name>
<value>oozie.WordCount$Reduce</value>
</property>
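For context, the "$" form is simply the binary name the JVM uses for nested classes, which is what Oozie needs when it loads the class reflectively. A quick check (assuming, as in the fix above, that the classes live in package oozie):

public class NamePeek {
    public static void main(String[] args) throws Exception {
        // Prints the binary names to use in mapreduce.map.class / mapreduce.reduce.class.
        System.out.println(oozie.WordCount.Map.class.getName());    // oozie.WordCount$Map
        System.out.println(oozie.WordCount.Reduce.class.getName()); // oozie.WordCount$Reduce
        // Reflective loading only resolves the $ form, not the dotted form:
        Class.forName("oozie.WordCount$Map");
    }
}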
We have a file-based SAML IdP configuration for WSO2AM-2.1.0 (similar to this one), and we'd like to migrate to wso2am-2.6.0.
Using the same IdP configuration file, the IdP is not configured, and in the logs we see:
ERROR - IdentityProvider Error while building default provisioning connector config for IDP oamidp.
Cause : No configured name found for ProvisioningConnectorConfig Building rest of the IDP configs
It's the XML configuration file in repository/conf/identity/identity-providers/.
I found an example configuration documented here: https://docs.wso2.com/display/IS570/Configuring+a+SP+and+IdP+Using+Configuration+Files
I believe our configuration is compliant with the example (which does not mention any ProvisioningConnectorConfig tag).
The DefaultProvisioningConnectorConfig element needs to be commented out when it is empty:
<IdentityProvider>
<IdentityProviderName>oamidp</IdentityProviderName>
<DisplayName>oamidp</DisplayName>
<IdentityProviderDescription>Access Manager DEV</IdentityProviderDescription>
<Alias>oamidp</Alias>
<IsPrimary/>
<IsEnabled>true</IsEnabled>
<IsFederationHub/>
<HomeRealmId/>
<ProvisioningRole/>
<FederatedAuthenticatorConfigs>
<saml2>
<Name>SAMLSSOAuthenticator</Name>
<DisplayName>samlsso</DisplayName>
<IsEnabled>true</IsEnabled>
<Properties>
<property>
<Name>IdpEntityId</Name>
<Value>http://localhost/simplesaml/saml2/idp/metadata.php</Value>
</property>
<property>
<Name>IsLogoutEnabled</Name>
<Value>true</Value>
</property>
<property>
<Name>SPEntityId</Name>
<Value>https://wso2am-test/sp</Value>
</property>
<property>
<Name>SSOUrl</Name>
<Value>http://localhost/simplesaml/saml2/idp/SSOService.php</Value>
</property>
<property>
<Name>isAssertionSigned</Name>
<Value>false</Value>
</property>
<property>
<Name>commonAuthQueryParams</Name>
<Value/>
</property>
<property>
<Name>IsUserIdInClaims</Name>
<Value>false</Value>
</property>
<property>
<Name>IsLogoutReqSigned</Name>
<Value>false</Value>
</property>
<property>
<Name>IsAssertionEncrypted</Name>
<Value>false</Value>
</property>
<property>
<Name>IsAuthReqSigned</Name>
<Value>true</Value>
</property>
<!-- there was a typo in the code, we have both values to be sure -->
<property>
<Name>ISAuthnReqSigned</Name>
<Value>true</Value>
</property>
<property>
<Name>IsAuthnRespSigned</Name>
<Value>true</Value>
</property>
<property>
<Name>LogoutReqUrl</Name>
<Value>https://logon-test.mycomp.com/oamfed/idp/samlv20</Value>
<!-- Value>false</Value -->
</property>
</Properties>
</saml2>
</FederatedAuthenticatorConfigs>
<DefaultAuthenticatorConfig>SAMLSSOAuthenticator</DefaultAuthenticatorConfig>
<ProvisioningConnectorConfigs/>
<!-- DefaultProvisioningConnectorConfig/ -->
<ClaimConfig>
<LocalClaimDialect>true</LocalClaimDialect>
<ClaimMappings>
</ClaimMappings>
</ClaimConfig>
<Certificate>MII....ZNYg=</Certificate>
<PermissionAndRoleConfig/>
<JustInTimeProvisioningConfig>
<UserStoreClaimUri/>
<ProvisioningUserStore/>
<IsProvisioningEnabled>false</IsProvisioningEnabled>
</JustInTimeProvisioningConfig>
</IdentityProvider>
I have implemented secondary sort in MapReduce and am trying to execute it using Oozie (from Hue).
Though I have set the partitioner class in the properties, the partitioner is not being executed, so I'm not getting the expected output.
The same code runs fine when run with the hadoop command.
Here is my workflow.xml:
<workflow-app name="MyTriplets" xmlns="uri:oozie:workflow:0.5">
<start to="mapreduce-598d"/>
<kill name="Kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<action name="mapreduce-598d">
<map-reduce>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.output.dir</name>
<value>/test_1109_3</value>
</property>
<property>
<name>mapred.input.dir</name>
<value>/apps/hive/warehouse/7360_0609_rx/day=06-09-2017/hour=13/quarter=2/,/apps/hive/warehouse/7360_0609_tx/day=06-09-2017/hour=13/quarter=2/,/apps/hive/warehouse/7360_0509_util/day=05-09-2017/hour=16/quarter=1/</value>
</property>
<property>
<name>mapred.input.format.class</name>
<value>org.apache.hadoop.hive.ql.io.RCFileInputFormat</value>
</property>
<property>
<name>mapred.mapper.class</name>
<value>PonRankMapper</value>
</property>
<property>
<name>mapred.reducer.class</name>
<value>PonRankReducer</value>
</property>
<property>
<name>mapred.output.value.comparator.class</name>
<value>PonRankGroupingComparator</value>
</property>
<property>
<name>mapred.mapoutput.key.class</name>
<value>PonRankPair</value>
</property>
<property>
<name>mapred.mapoutput.value.class</name>
<value>org.apache.hadoop.io.Text</value>
</property>
<property>
<name>mapred.reduce.output.key.class</name>
<value>org.apache.hadoop.io.NullWritable</value>
</property>
<property>
<name>mapred.reduce.output.value.class</name>
<value>org.apache.hadoop.io.Text</value>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>1</value>
</property>
<property>
<name>mapred.partitioner.class</name>
<value>PonRankPartitioner</value>
</property>
<property>
<name>mapred.mapper.new-api</name>
<value>False</value>
</property>
</configuration>
</map-reduce>
<ok to="End"/>
<error to="Kill"/>
</action>
<end name="End"/>
When running with the hadoop jar command, I set the partitioner class using the JobConf.setPartitionerClass API.
I'm not sure why my partitioner is not executed when running via Oozie, in spite of adding:
<property>
<name>mapred.partitioner.class</name>
<value>PonRankPartitioner</value>
</property>
Any idea what I'm missing when running it from Oozie?
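For reference, the hadoop jar run presumably configures the job along these lines with the old mapred API (consistent with mapred.mapper.new-api=False in the workflow); the actual driver is not shown in the question, so this is only a sketch using the class names from the workflow:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class PonRankDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(PonRankDriver.class);
        conf.setMapperClass(PonRankMapper.class);
        conf.setReducerClass(PonRankReducer.class);
        conf.setPartitionerClass(PonRankPartitioner.class);        // what mapred.partitioner.class should supply
        conf.setOutputValueGroupingComparator(PonRankGroupingComparator.class); // grouping comparator for secondary sort
        conf.setMapOutputKeyClass(PonRankPair.class);
        conf.setMapOutputValueClass(Text.class);
        conf.setNumReduceTasks(1);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }
}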
Solved this by rewriting the MapReduce job using the new APIs.
The property used in the Oozie workflow for the partitioner was mapreduce.partitioner.class.
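To illustrate what the rewrite involves, with the new API the partitioner extends org.apache.hadoop.mapreduce.Partitioner and is picked up through mapreduce.partitioner.class. A sketch (PonRankPair's accessor is an assumption, since its definition isn't shown in the question):

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// New-API partitioner: routes records by the natural (grouping) part of the
// composite key so that secondary sort groups values correctly per reducer.
public class PonRankPartitioner extends Partitioner<PonRankPair, Text> {
    @Override
    public int getPartition(PonRankPair key, Text value, int numPartitions) {
        // Assumes PonRankPair exposes its grouping field via getNaturalKey().
        return (key.getNaturalKey().hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}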
When I run the mapred job manually, it produces a valid Avro file with the .avro extension. But when I run it from an Oozie workflow, it produces a text file, which is a corrupt Avro file. Here is my workflow:
<workflow-app name='sample-wf' xmlns="uri:oozie:workflow:0.2">
<start to='start_here'/>
<action name='start_here'>
<map-reduce>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/user/hadoop/${workFlowRoot}/final-output-data"/>
</prepare>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
<property>
<name>mapred.reducer.new-api</name>
<value>true</value>
</property>
<property>
<name>mapred.mapper.new-api</name>
<value>true</value>
</property>
<property>
<name>mapred.input.dir</name>
<value>/user/hadoop/${workFlowRoot}/input-data</value>
</property>
<property>
<name>mapred.output.dir</name>
<value>/user/hadoop/${workFlowRoot}/final-output-data</value>
</property>
<property>
<name>mapreduce.mapper.class</name>
<value>org.apache.avro.mapred.HadoopMapper</value>
</property>
<property>
<name>mapreduce.reducer.class</name>
<value>org.apache.avro.mapred.HadoopReducer</value>
</property>
<property>
<name>avro.mapper</name>
<value>com.flipkart.flap.data.batch.mapred.TestAvro$CFDetectionMapper</value>
</property>
<property>
<name>avro.reducer</name>
<value>com.flipkart.flap.data.batch.mapred.TestAvro$CFDetectionReducer</value>
</property>
<property>
<name>mapreduce.input.format.class</name>
<value>org.apache.avro.mapreduce.AvroKeyInputFormat</value>
</property>
<property>
<name>avro.schema.input.key</name>
<value>{... schema ...}</value>
</property>
<property>
<name>mapreduce.mapoutput.key.class</name>
<value>org.apache.hadoop.io.AvroKey</value>
</property>
<property>
<name>avro.map.output.schema.key</name>
<value>{... schema ...}</value>
</property>
<property>
<name>mapreduce.mapoutput.value.class</name>
<value>org.apache.hadoop.io.Text</value>
</property>
<property>
<name>mapreduce.output.format.class</name>
<value>org.apache.avro.mapred.AvroKeyValueOutputFormat</value>
</property>
<property>
<name>mapreduce.output.key.class</name>
<value>org.apache.avro.mapred.AvroKey</value>
</property>
<property>
<name>mapreduce.output.value.class</name>
<value>org.apache.avro.mapred.AvroValue</value>
</property>
<property>
<name>avro.schema.output.key</name>
<value>{ .... schema .... }</value>
</property>
<property>
<name>avro.schema.output.value</name>
<value>"string"</value>
</property>
<property>
<name>mapreduce.output.key.comparator.class</name>
<value>org.apache.avro.mapred.AvroKeyComparator</value>
</property>
<property>
<name>io.serializations</name>
<value>org.apache.hadoop.io.serializer.WritableSerialization,org.apache.avro.mapred.AvroSerialization
</value>
</property>
</configuration>
</map-reduce>
<ok to='end'/>
<error to='fail'/>
</action>
<kill name='fail'>
<message>MapReduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name='end'/>
</workflow-app>
And my mapper and reducer are defined like this:
public static class CFDetectionMapper extends
Mapper<AvroKey<AdClickFraudSignalsEntity>, NullWritable, AvroKey<AdClickFraudSignalsEntity>, Text> {
}
public static class CFDetectionReducer extends
Reducer<AvroKey<AdClickFraudSignalsEntity>, Text, AvroKey<AdClickFraudSignalsEntity>, AvroValue<CharSequence>> {
}
Can you please tell me what is wrong here?
You are using some wrong property names:
<property>
<name>mapreduce.mapoutput.key.class</name>
<value>org.apache.hadoop.io.AvroKey</value>
</property>
[...]
<property>
<name>mapreduce.mapoutput.value.class</name>
<value>org.apache.hadoop.io.Text</value>
</property>
should be:
<property>
<name>mapreduce.map.output.key.class</name>
<value>org.apache.hadoop.io.AvroKey</value>
</property>
[...]
<property>
<name>mapreduce.map.output.value.class</name>
<value>org.apache.hadoop.io.Text</value>
</property>
(note the added dot). The spelling was different for the earlier mapred.* property names, but the new mapreduce.* names use the dotted form; see http://hadoop.apache.org/docs/r2.5.2/hadoop-project-dist/hadoop-common/DeprecatedProperties.html.
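As a sanity check, these are the names that the new-API Job setters write into the configuration, so the corrected properties correspond to a driver fragment roughly like this (a sketch; note that AvroKey lives in org.apache.avro.mapred, not org.apache.hadoop.io):

import org.apache.avro.mapred.AvroKey;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class MapOutputClasses {
    static Job configure() throws Exception {
        Job job = Job.getInstance(new Configuration());
        // These setters populate mapreduce.map.output.key.class and
        // mapreduce.map.output.value.class in the job configuration.
        job.setMapOutputKeyClass(AvroKey.class);
        job.setMapOutputValueClass(Text.class);
        return job;
    }
}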
The filter I wrote threw ClassCastException
[Ljava.security.cert.X509Certificate; cannot be cast to java.security.cert.X509Certificate
when I tried to cast an Object extracted from the ServletRequest attribute, i.e.
public void doFilter(ServletRequest req, ServletResponse res, FilterChain filterChain)
        throws IOException, ServletException {
    X509Certificate cert = (X509Certificate) req.getAttribute("javax.servlet.request.X509Certificate");
    System.out.println("cert dn " + cert.getSubjectDN().toString());
    filterChain.doFilter(req, res);
}
As I dug deeper, I understood that an exception like this is most probably caused by different classloaders, even though the objects are of the same class type. How do I resolve this?
Thanks
I used the following Spring 3 configuration to load Jetty 7 piecemeal:
<bean class="org.eclipse.jetty.server.Server"
init-method="start" destroy-method="stop">
<property name="connectors">
<list>
<bean id="SSLConnector" class="org.eclipse.jetty.server.ssl.SslSocketConnector">
<property name="port" value="8553"/>
<property name="maxIdleTime" value="3600000"/>
<property name="soLingerTime" value="-1"/>
<property name="needClientAuth" value="true"/>
<property name="sslContext">
<ref bean="sslContext"/>
</property>
</bean>
</list>
</property>
<property name="handler">
<bean name="contexts" class="org.eclipse.jetty.server.handler.ContextHandlerCollection">
<property name="handlers">
<list>
<bean class="org.eclipse.jetty.servlet.ServletContextHandler">
<property name="contextPath">
<value>/caas</value>
</property>
<property name="resourceBase" value="src/main/secure_webapp"/>
<property name="sessionHandler">
<bean class="org.eclipse.jetty.server.session.SessionHandler"/>
</property>
<property name="servletHandler">
<bean class="org.eclipse.jetty.servlet.ServletHandler">
<property name="filters">
<list>
<bean class="org.eclipse.jetty.servlet.FilterHolder">
<property name="name" value="myfilter"/>
<property name="filter">
<bean class="com.acme.MyFilter"/>
</property>
</bean>
</list>
</property>
<property name="filterMappings">
<list>
<bean class="org.eclipse.jetty.servlet.FilterMapping">
<property name="pathSpec">
<value>/*</value>
</property>
<property name="filterName"
value="myfilter"/>
</bean>
</list>
</property>
<property name="servlets">
<list>
<bean class="org.eclipse.jetty.servlet.ServletHolder">
<property name="name" value="default"/>
<property name="servlet">
<bean class="org.eclipse.jetty.servlet.DefaultServlet"/>
</property>
</bean>
</list>
</property>
<property name="servletMappings">
<list>
<bean class="org.eclipse.jetty.servlet.ServletMapping">
<property name="pathSpecs">
<list>
<value>/</value>
</list>
</property>
<property name="servletName" value="default"/>
</bean>
</list>
</property>
</bean>
</property>
</bean>
</list>
</property>
</bean>
</property>
</bean>
I don't think it's a duplicate class problem in this case, because X509Certificate is contained in the core JRE libraries. There is, afaik, no other library which provides this abstract class.
I think the problem is that getAttribute() returns an array of X509Certificate objects, whereas you cast it to a single object. The [L at the beginning of the ClassCastException message indicates that the returned object is an array.
Try casting to an array of certificates:
X509Certificate[] cert = (X509Certificate[])
req.getAttribute("javax.servlet.request.X509Certificate");
Also, I think you should retrieve the object from getAttribute() and use instanceof checks to see whether it contains the desired type, and perhaps handle the cases differently.
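A defensive version of the filter along those lines might look like the sketch below; the attribute name is the standard servlet one, while the class name and logging are just illustrative:

import java.io.IOException;
import java.security.cert.X509Certificate;
import javax.servlet.*;

public class ClientCertFilter implements Filter {
    @Override public void init(FilterConfig cfg) {}
    @Override public void destroy() {}

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        Object attr = req.getAttribute("javax.servlet.request.X509Certificate");
        if (attr instanceof X509Certificate[]) {
            X509Certificate[] certs = (X509Certificate[]) attr;
            if (certs.length > 0) {
                // The first element of the chain is the client's own certificate.
                System.out.println("cert dn " + certs[0].getSubjectDN());
            }
        }
        chain.doFilter(req, res);
    }
}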
I wrote a test case that extends AbstractTransactionalJUnit4SpringContextTests. The single test case I've written creates an instance of class User and attempts to write it to the database using Hibernate. The test code then uses SimpleJdbcTemplate to execute a simple select count(*) against the user table to determine whether the user was persisted to the database. The test always fails, though. This made me suspicious, because the Spring controller I wrote saves an instance of User to the db successfully.
So I added the @Rollback(false) annotation to the unit test and, sure enough, the data is written to the database; I can even see it in the appropriate table -- so the transaction isn't rolled back when the test case finishes.
Here's my test case:
@ContextConfiguration(locations = {
        "classpath:context-daos.xml",
        "classpath:context-dataSource.xml",
        "classpath:context-hibernate.xml"})
public class UserDaoTest extends AbstractTransactionalJUnit4SpringContextTests {

    @Autowired
    private UserDao userDao;

    @Test
    @Rollback(false)
    public void testCreateUser() {
        try {
            UserModel user = randomUser();
            String username = user.getUserName();
            long id = userDao.create(user);
            String query = "select count(*) from public.usr where usr_name = '%s'";
            long count = simpleJdbcTemplate.queryForLong(String.format(query, username));
            Assert.assertEquals("User with username should be in the db", 1, count);
        }
        catch (Exception e) {
            e.printStackTrace();
            Assert.fail("testCreateUser: " + e.getMessage());
        }
    }
}
I think I was remiss by not adding the configuration files.
context-hibernate.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd">
<bean id="namingStrategy" class="org.springframework.beans.factory.config.FieldRetrievingFactoryBean">
<property name="staticField">
<value>org.hibernate.cfg.ImprovedNamingStrategy.INSTANCE</value>
</property>
</bean>
<bean id="sessionFactory" class="org.springframework.orm.hibernate3.LocalSessionFactoryBean" destroy-method="destroy" scope="singleton">
<property name="namingStrategy">
<ref bean="namingStrategy"/>
</property>
<property name="dataSource" ref="dataSource"/>
<property name="mappingResources">
<list>
<value>com/company/model/usr.hbm.xml</value>
</list>
</property>
<property name="hibernateProperties">
<props>
<prop key="hibernate.dialect">org.hibernate.dialect.PostgreSQLDialect</prop>
<prop key="hibernate.show_sql">true</prop>
<prop key="hibernate.use_sql_comments">true</prop>
<prop key="hibernate.query.substitutions">yes 'Y', no 'N'</prop>
<prop key="hibernate.cache.provider_class">org.hibernate.cache.EhCacheProvider</prop>
<prop key="hibernate.cache.use_query_cache">true</prop>
<prop key="hibernate.cache.use_minimal_puts">false</prop>
<prop key="hibernate.cache.use_second_level_cache">true</prop>
<prop key="hibernate.current_session_context_class">thread</prop>
</props>
</property>
</bean>
<bean id="transactionManager" class="org.springframework.orm.hibernate3.HibernateTransactionManager">
<property name="sessionFactory" ref="sessionFactory"/>
<property name="nestedTransactionAllowed" value="false" />
</bean>
<bean id="transactionInterceptor" class="org.springframework.transaction.interceptor.TransactionInterceptor">
<property name="transactionManager">
<ref local="transactionManager"/>
</property>
<property name="transactionAttributes">
<props>
<prop key="create">PROPAGATION_REQUIRED</prop>
<prop key="delete">PROPAGATION_REQUIRED</prop>
<prop key="update">PROPAGATION_REQUIRED</prop>
<prop key="*">PROPAGATION_SUPPORTS,readOnly</prop>
</props>
</property>
</bean>
</beans>
context-dataSource.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-3.0.xsd">
<bean id="dataSource" class="com.mchange.v2.c3p0.ComboPooledDataSource" destroy-method="close">
<property name="driverClass" value="org.postgresql.Driver" />
<property name="jdbcUrl" value="jdbc\:postgresql\://localhost:5432/company_dev" />
<property name="user" value="postgres" />
<property name="password" value="postgres" />
</bean>
</beans>
context-daos.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-3.0.xsd">
<bean id="extendedFinderNamingStrategy" class="com.company.dao.finder.impl.ExtendedFinderNamingStrategy"/>
<bean id="finderIntroductionAdvisor" class="com.company.dao.finder.impl.FinderIntroductionAdvisor"/>
<bean id="abstractDaoTarget" class="com.company.dao.impl.GenericDaoHibernateImpl" abstract="true" depends-on="sessionFactory">
<property name="sessionFactory">
<ref bean="sessionFactory"/>
</property>
<property name="namingStrategy">
<ref bean="extendedFinderNamingStrategy"/>
</property>
</bean>
<bean id="abstractDao" class="org.springframework.aop.framework.ProxyFactoryBean" abstract="true">
<property name="interceptorNames">
<list>
<value>transactionInterceptor</value>
<value>finderIntroductionAdvisor</value>
</list>
</property>
</bean>
<bean id="userDao" parent="abstractDao">
<property name="proxyInterfaces">
<value>com.company.dao.UserDao</value>
</property>
<property name="target">
<bean parent="abstractDaoTarget">
<constructor-arg>
<value>com.company.model.UserModel</value>
</constructor-arg>
</bean>
</property>
</bean>
</beans>
Some of this I've inherited from someone else. I wouldn't have used the proxying that is going on here because I'm not sure it's needed but this is what I'm working with.
Any help much appreciated.
IIRC, Hibernate (and other O/R mappers) delays the database INSERT and UPDATE statements until the transaction commits. The SELECT then does not see the data, as it has not yet been written. Try explicitly requesting a session flush.
See also the documentation for an explanation and example:
http://docs.jboss.org/hibernate/core/3.3/reference/en/html/objectstate.html#objectstate-flushing
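For example, inside the test above the session could be flushed before the JDBC count runs. A minimal sketch, assuming the SessionFactory (org.hibernate.SessionFactory) can be @Autowired into the test class and that getCurrentSession() returns the session bound to the test transaction:

@Autowired
private SessionFactory sessionFactory;

@Test
public void testCreateUser() {
    UserModel user = randomUser();
    long id = userDao.create(user);

    // Push the pending INSERT to the database so the plain-JDBC
    // count query below can see the row inside the same transaction.
    sessionFactory.getCurrentSession().flush();

    String query = "select count(*) from public.usr where usr_name = '%s'";
    long count = simpleJdbcTemplate.queryForLong(String.format(query, user.getUserName()));
    Assert.assertEquals("User with username should be in the db", 1, count);
}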
Doesn't the transactionInterceptor automatically commit?
http://www.docjar.com/html/api/org/springframework/transaction/interceptor/TransactionInterceptor.java.html
Line 117:
commitTransactionAfterReturning(txInfo);
By default the flush mode should be AUTO, so there should be an implicit flush when the commit is performed. I have the same problem and had to solve it by manually calling flush. That is very unsatisfying, because my existing code depended on auto-flush before migrating to the Spring transaction manager. I wish there were a way to tell org.springframework.orm.hibernate3.HibernateTransactionManager to flush automatically at commit time.