We need your assistance with the AWS Glue ETL issue below.
We are trying to read JSON files using an AWS Glue dynamic frame.
Example input JSON data:
{"type":"TripLog","count":"2","def":["CreateTimestamp","UUID","DataTimestamp","VIN","DrivingRange","DrivingRangeUnit","FinishPos.Datum","FinishPos.Event","FinishPos.Lat","FinishPos.Lon","FinishPos.Odo","FinishPos.Time","FuelConsumption1Trip","FuelConsumption1TripUnit","FuelConsumptionTripA","FuelConsumptionTripAUnit","FuelUsed","FuelUsedUnit","Mileage","MileageUnit","ODOUnit","Score.AcclAdviceMsg","Score.AcclScore","Score.AcclScoreUnit","Score.BrakeAdviceMsg","Score.BrakeScore","Score.BrakeScoreUnit","Score.ClassJudge","Score.IdleAdviceMsg","Score.IdleScore","Score.IdleScoreUnit","Score.IdleStopTime","Score.LifetimeTotalScore","Score.TotalScore","Score.TotalScoreUnit","StartPos.Datum","StartPos.Lat","StartPos.Lon","StartPos.Odo","StartPos.Time","TripDate","TripId"],"data":[["2017-10-17 08:47:17.930","xxxxxxx","20171017084659"," xxxxxxxxxxx ","419","mile","WGS84","Periodic intervals during IG ON","38,16,39.846","-77,30,45.230","33559","20171017-033104","50.1","M-G - mph(U.S. gallon)","36.0","M-G - mph(U.S. gallon)","428.1","cm3",null,null,"km",null,null,"%",null,null,"%",null,null,null,"%","0x0",null,null,"%","WGS84","39,12,50.988","-76,38,36.417","33410","20171017-015103","20171017-015103","0"],["2017-10-17 08:47:17.930"," xxxxxxx ","20171017084659","xxxxxxxxxxx","414","mile","WGS84","Periodic intervals during IG ON","38,12,12.376","-77,29,57.915","33568","20171017-033604","50.1","M-G - mph(U.S. gallon)","36.0","M-G - mph(U.S. gallon)","838.0","cm3",null,null,"km",null,null,"%",null,null,"%",null,null,null,"%","0x0",null,null,"%","WGS84","39,12,50.988","-76,38,36.417","33410","20171017-015103","20171017-015103","0"]]}
Step 1) Code to read the JSON file into a dynamic frame (landing_location is our file location):
dyf = glueContext.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={
        "paths": [landing_location],
        "recurse": True,
        "groupFiles": "inPartition",
        "groupSize": "1048576"
    },
    format="json",
    transformation_ctx="dyf"
)
dyf.printSchema()
root
|-- type: string
|-- count: string
|-- def: array
|    |-- element: string
|-- data: array
|    |-- element: array
|    |    |-- element: choice
|    |    |    |-- int
|    |    |    |-- string
Step 2) Converting into a Spark data frame and exploding the data:
dtcadf = dyf.toDF()
dtcadf.show(truncate=False)
dtcadf.registerTempTable('dtcadf')
data=spark.sql('select explode(data) from dtcadf')
data.show(1,False)
We get the following error:
An error occurred while calling o270.showString.
Note: the same file is processed successfully when we read it directly with a Spark data frame instead of an AWS Glue dynamic frame.
Can you please help us resolve the issue? Let us know if you need any further information from our end.
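For reference, the record shape that the explode step is aiming for can be sketched in plain Python, without Glue or Spark: each inner array of "data" is one record whose values line up with the names in "def". This is only an illustration on a shortened, made-up payload (three columns instead of the full list), not a fix for the showString error, and rows_from_payload is a hypothetical helper name.

```python
import json


def rows_from_payload(payload):
    """Pair each inner "data" array with the column names listed in "def"."""
    columns = payload["def"]
    rows = []
    for values in payload["data"]:
        # Every record is expected to carry one value per declared column.
        if len(values) != len(columns):
            raise ValueError("row length does not match 'def'")
        rows.append(dict(zip(columns, values)))
    return rows


# Shortened stand-in for the real file (assumption: same layout, fewer columns).
sample = json.loads(
    '{"type":"TripLog","count":"2",'
    '"def":["VIN","DrivingRange","Mileage"],'
    '"data":[["xxxxxxxxxxx","419",null],["xxxxxxxxxxx","414",null]]}'
)

records = rows_from_payload(sample)
for record in records:
    print(record)
```

JSON null values come through as None, so the null entries in the sample data stay distinguishable from empty strings.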
Hello, I have a regex problem.
This is the text structure:
TK00123456: Change a lot gibberish 16:34. --- access : [ more
gibberish Module](http://somewebsite.com/selectedModuleCode=Support
form.aspx longblob) summary --- | Properties | | --- Creator | more
gibberish | 16/01/2018 16:26:53 Manager | External Status |
Working on Resolution
Proper English Text
This is my regex:
re.match(r'(?s)Change(.*?)Working', text)
Output:
None
Using the same regex on https://regex101.com/ gives:
Match 1 Full match 12-270
`Change a lot gibberish 16:34. --- access :
[ more gibberish
Module](http://somewebsite.com/selectedModuleCode=Support form.aspx
longblob) summary --- | Properties | | --- Creator | more gibberish |
16/01/2018 16:26:53 Manager | External Status |
Working`
I have Python 2.6.6 on RHEL and I can't upgrade to Python 2.7, if that is the problem.
Any suggestions?
You are looking for re.search() rather than re.match():
import re
string = """
TK00123456: Change a lot gibberish 16:34. --- access : [ more gibberish Module](http://somewebsite.com/selectedModuleCode=Support form.aspx longblob) summary --- | Properties | | --- Creator | more gibberish | 16/01/2018 16:26:53 Manager | External Status |
Working on Resolution
Proper English Text
"""
rx = re.compile(r'(?s)Change(.*?)Working')
print(rx.search(string).group(0))
Explanation: re.match() only matches at the beginning of the string, and there is no Change there (the text starts with TK00123456:). re.search() scans the whole string for the first position where the pattern matches.
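A small side-by-side check of the two calls on the text from the question, as a sketch: the line breaks in the text variable are an assumption about how the original string is wrapped, and the names anchored and scanned are only for illustration. The code avoids anything newer than Python 2.6 syntax.

```python
import re

text = (
    "TK00123456: Change a lot gibberish 16:34. --- access : [ more\n"
    "gibberish Module](http://somewebsite.com/selectedModuleCode=Support\n"
    "form.aspx longblob) summary --- | Properties | | --- Creator | more\n"
    "gibberish | 16/01/2018 16:26:53 Manager | External Status |\n"
    "Working on Resolution\n"
    "Proper English Text\n"
)

pattern = re.compile(r"(?s)Change(.*?)Working")

# match() only tries the pattern at position 0, where "TK00123456:" sits.
anchored = pattern.match(text)

# search() tries successive positions until the pattern matches somewhere.
scanned = pattern.search(text)

print(anchored)
print(scanned.start())
print(scanned.group(0))
```

Here match() returns None while search() finds the span that begins at "Change", which is consistent with the full match reported by regex101.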
I am trying to upload a file of around 10 GB from my local machine to S3 (inside a Camel route). The file gets uploaded in around 3-4 minutes, but the following exception is also thrown:
2014-06-26 13:53:33,417 | INFO | ads.com/outbound | FetchRoute | 167 - com.ut.ias - 2.0.3 | Download complete to local. Pushing file to S3
2014-06-26 13:54:19,465 | INFO | manager-worker-6 | AmazonHttpClient | 144 - org.apache.servicemix.bundles.aws-java-sdk - 1.5.1.1 | Unable to execute HTTP request: The target server failed to respond
org.apache.http.NoHttpResponseException: The target server failed to respond
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:95)[142:org.apache.httpcomponents.httpclient:4.2.5]
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)[142:org.apache.httpcomponents.httpclient:4.2.5]
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)[141:org.apache.httpcomponents.httpcore:4.2.4]
at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)[141:org.apache.httpcomponents.httpcore:4.2.4]
at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)[142:org.apache.httpcomponents.httpclient:4.2.5]
at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)[142:org.apache.httpcomponents.httpclient:4.2.5]
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)[141:org.apache.httpcomponents.httpcore:4.2.4]
.......
at java.util.concurrent.FutureTask.run(FutureTask.java:262)[:1.7.0_55]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)[:1.7.0_55]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)[:1.7.0_55]
at java.lang.Thread.run(Thread.java:744)[:1.7.0_55]
2014-06-26 13:55:08,991 | INFO | ads.com/outbound | FetchRoute | 167 - com.ut.ias - 2.0.3 | Upload complete.
Because of this, the Camel route doesn't stop and continuously throws InterruptedException:
2014-06-26 13:55:11,182 | INFO | ads.com/outbound | SftpOperations | 110 - org.apache.camel.camel-ftp - 2.12.1 | JSCH -> Disconnecting from cxportal.integralads.com port 22
2014-06-26 13:55:11,183 | INFO | lads.com session | SftpOperations | 110 - org.apache.camel.camel-ftp - 2.12.1 | JSCH -> Caught an exception, leaving main loop due to Socket closed
2014-06-26 13:55:11,183 | WARN | lads.com session | eventadmin | 139 - org.apache.felix.eventadmin - 1.3.2 | EventAdmin: Exception: java.lang.InterruptedException
java.lang.InterruptedException
at EDU.oswego.cs.dl.util.concurrent.LinkedQueue.offer(Unknown Source)[139:org.apache.felix.eventadmin:1.3.2]
at EDU.oswego.cs.dl.util.concurrent.PooledExecutor.execute(Unknown Source)[139:org.apache.felix.eventadmin:1.3.2]
at org.apache.felix.eventadmin.impl.tasks.DefaultThreadPool.executeTask(DefaultThreadPool.java:101)[139:org.apache.felix.eventadmin:1.3.2]
at org.apache.felix.eventadmin.impl.tasks.AsyncDeliverTasks.execute(AsyncDeliverTasks.java:105)[139:org.apache.felix.eventadmin:1.3.2]
at org.apache.felix.eventadmin.impl.handler.EventAdminImpl.postEvent(EventAdminImpl.java:100)[139:org.apache.felix.eventadmin:1.3.2]
at org.apache.felix.eventadmin.impl.adapter.LogEventAdapter$1.logged(LogEventAdapter.java:281)[139:org.apache.felix.eventadmin:1.3.2]
at org.ops4j.pax.logging.service.internal.LogReaderServiceImpl.fire(LogReaderServiceImpl.java:134)[50:org.ops4j.pax.logging.pax-logging-service:1.7.1]
at org.ops4j.pax.logging.service.internal.LogReaderServiceImpl.fireEvent(LogReaderServiceImpl.java:126)[50:org.ops4j.pax.logging.pax-logging-service:1.7.1]
at org.ops4j.pax.logging.service.internal.PaxLoggingServiceImpl.handleEvents(PaxLoggingServiceImpl.java:180)[50:org.ops4j.pax.logging.pax-logging-service:1.7.1]
at org.ops4j.pax.logging.service.internal.PaxLoggerImpl.inform(PaxLoggerImpl.java:145)[50:org.ops4j.pax.logging.pax-logging-service:1.7.1]
at org.ops4j.pax.logging.internal.TrackingLogger.inform(TrackingLogger.java:86)[18:org.ops4j.pax.logging.pax-logging-api:1.7.1]
at org.ops4j.pax.logging.slf4j.Slf4jLogger.info(Slf4jLogger.java:476)[18:org.ops4j.pax.logging.pax-logging-api:1.7.1]
at org.apache.camel.component.file.remote.SftpOperations$JSchLogger.log(SftpOperations.java:359)[110:org.apache.camel.camel-ftp:2.12.1]
at com.jcraft.jsch.Session.run(Session.java:1621)[109:org.apache.servicemix.bundles.jsch:0.1.49.1]
at java.lang.Thread.run(Thread.java:744)[:1.7.0_55]
Please see my code below and let me know where I am going wrong:
TransferManager tm = new TransferManager(S3Client.getS3Client());
// TransferManager processes all transfers asynchronously,
// so this call will return immediately.
Upload upload = tm.upload(
        Utils.getProperty(Constants.BUCKET),
        getS3Key(file.getName()), file);
try {
    upload.waitForCompletion();
    logger.info("Upload complete.");
} catch (AmazonClientException amazonClientException) {
    logger.warn("Unable to upload file, upload was aborted.");
    amazonClientException.printStackTrace();
}
The stack trace doesn't contain any reference to my code, so I couldn't determine where the issue is.
Any help or pointers would be really appreciated.
Thanks
I have set up the version information table in my version.rc file as follows:
+------------------+-----------------------------------------+
| Key              | Value                                   |
+------------------+-----------------------------------------+
| CompanyName      | MyCompany                               |
| FileDescription  | A test application that does something. |
| InternalName     | TestApp                                 |
| FileVersion      | 1.0.0                                   |
| OriginalFilename | TestApp.exe                             |
| ProductVersion   | 1.0.0                                   |
| ProductName      | Test Application                        |
+------------------+-----------------------------------------+
Whenever my application crashes, an antivirus prompt asks for permission, or any other event displays my application name, the name shown is "A test application that does something", i.e. FileDescription is taken as the application name. I am using this article as my reference.
What I see:
What I Want to See:
To achieve the second image, I edited FileDescription to "Test Application".
But now Task Manager (and other places where the description is used) also shows the edited text instead of the original description:
After Editing FileDescription to "Test Application":
Before Editing FileDescription:
I want to know whether there is a way to tell the OS to use ProductName in the first case above (and other similar cases) and FileDescription in the second case above (and other similar cases).