AWS Lambda download a file using Chromedriver

I have a container that is built to run selenium-chromedriver with python to download an excel(.xlsx) file from a website.
I am using SAM to build and deploy this image to run in AWS Lambda.
When I build the container and invoke it locally, the program executes as expected: the download occurs and I can see the file placed in the root directory of the container.
The problem is that when I deploy this image to AWS and invoke my Lambda function, I get no errors, but the download is never executed. The file never appears in my root directory.
My first thought was that maybe I didn't allocate enough memory to the Lambda instance. I gave it 512 MB, and the logs said it was using 416 MB. Maybe there wasn't enough room to fit another file? So I increased the memory to 1024 MB, but still no luck.
My next thought was that maybe the download was just taking a long time, so I also made the program wait for 5 minutes after clicking the download to give it time to complete. Still no luck.
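(A side note on the waiting step: rather than a fixed sleep, the wait can be implemented as a simple poll of the download directory. A minimal sketch, assuming the file is expected to land in /tmp with an .xlsx extension; the helper below is illustrative, not code from the original post:)

import os
import time

def wait_for_download(directory="/tmp", suffix=".xlsx", timeout=300):
    # Poll until a finished download appears, or give up after `timeout` seconds.
    # Chrome writes in-progress downloads with a .crdownload suffix, so wait
    # for those to disappear as well.
    deadline = time.time() + timeout
    while time.time() < deadline:
        names = os.listdir(directory)
        done = [n for n in names if n.endswith(suffix)]
        partial = [n for n in names if n.endswith(".crdownload")]
        if done and not partial:
            return os.path.join(directory, done[0])
        time.sleep(1)
    return None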
I have also tried setting the following options for chromedriver (full list of chromedriver options posted at bottom):
options.add_argument(f"--user-data-dir={'/tmp'}"),
options.add_argument(f"--data-path={'/tmp'}"),
options.add_argument(f"--disk-cache-dir={'/tmp'}")
and also setting tempfolder = mkdtemp() and passing that into the Chrome options above in place of /tmp. Still no luck.
Since this application runs in a container, it should behave the same locally as it does on AWS. So I am wondering whether something in the configuration outside the container is blocking my ability to download a file. Maybe the request is going out, but the response is not being allowed back in?
Please let me know if there is anything I need to clarify -- Any help on this issue is greatly appreciated!
Full list of Chromedriver options
options.binary_location = '/opt/chrome/chrome'
options.headless = True
options.add_argument('--disable-extensions')
options.add_argument('--no-first-run')
options.add_argument('--ignore-certificate-errors')
options.add_argument('--disable-client-side-phishing-detection')
options.add_argument('--allow-running-insecure-content')
options.add_argument('--disable-web-security')
options.add_argument('--lang=' + random.choice(language_list))
options.add_argument('--user-agent=' + fake_user_agent.user_agent())
options.add_argument('--no-sandbox')
options.add_argument("--window-size=1920x1080")
options.add_argument("--single-process")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--disable-dev-tools")
options.add_argument("--no-zygote")
options.add_argument(f"--user-data-dir={'/tmp'}")
options.add_argument(f"--data-path={'/tmp'}")
options.add_argument(f"--disk-cache-dir={'/tmp'}")
options.add_argument("--remote-debugging-port=9222")
options.add_argument("start-maximized")
options.add_argument("enable-automation")
options.add_argument("--headless")
options.add_argument("--disable-browser-side-navigation")
options.add_argument("--disable-gpu")
driver = webdriver.Chrome("/opt/chromedriver", options=options)

Just in case anybody stumbles across this question in the future: adding the following to the Chrome options solved my issue:
prefs = {
    "profile.default_content_settings.popups": 0,
    "download.default_directory": r"/tmp",
    "directory_upgrade": True
}
options.add_experimental_option("prefs", prefs)
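For anyone piecing this together later, here is a minimal sketch of how those prefs slot into the driver setup from the question (same Selenium 3-style constructor and paths as above; /tmp is used because it is the only writable path inside a Lambda container):

from tempfile import mkdtemp

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.binary_location = "/opt/chrome/chrome"
options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")
options.add_argument(f"--user-data-dir={mkdtemp()}")

# The key part: tell Chrome where downloads are allowed to land.
prefs = {
    "profile.default_content_settings.popups": 0,
    "download.default_directory": r"/tmp",
    "directory_upgrade": True
}
options.add_experimental_option("prefs", prefs)

driver = webdriver.Chrome("/opt/chromedriver", options=options)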

Related

hadoop mapreduce job is not running

I have created a basic MapReduce program and built a jar file from it. When I try to run it from the console like:
[cloudera@localhost ~]$ hadoop jar /home/cloudera/Desktop/csvjar.jar testpackage.Mapreduce /import/climate /output5
nothing happens: no error, no MapReduce status. It just displays
[cloudera@localhost ~]$
Mapreduce is the class where the map, reduce, and main functions reside. The jar file is kept on the local machine and also on HDFS, and I have tried both paths; nothing happened in either case. The output5 folder does not exist in HDFS.
I also ran into the same issue. In my code, I had missed the closing brace in the argument check in the driver code. I am attaching that part of the code, with the "}" included, for reference.
if (otherArgs.length != 3) {
    System.err.println("Number of argument passed is not 3");
    System.exit(1);
}
I hope this helps.

Using file.managed for downloading a file in Salt

salt.states.file.managed takes source_hash as an argument to verify a downloaded file. This blocks me from using file.managed for a file on an online server I don't have control over. The file also changes regularly. My configuration looks like this.
download_stuff:
  file.managed:
    - name: localfile.tar.gz
    - source: http://someserver.net/onlinefile.tar.gz
    - source_hash: ???
I don't want to use cmd.run with curl or wget, because this would always download the file, even when it's already on the local machine.
I would like to know whether one of the options below is possible or already exists:
online md5 calculation service. Is there any way of getting an md5 hash of the file, using a free web service? I'm thinking of something like http://md5service.net?url={url-to-file}.
salt-internal conversion or workaround. Is it possible to handle this in Salt? Maybe by leaving out source_hash somehow?
alternative state. Is there another state in Salt for doing something like this, without losing the benefit of only downloading the file when needed?
If you can't control the other server, please make sure that you can trust it before downloading its content. Not using a hash will prevent you from detecting partial or corrupted downloads, and there is also no way to react to a file that has changed on the remote server.
Nevertheless, you could use a state like this to work around the missing hash. The creates part will prevent a second download once the file has been downloaded:
bootstrap:
  cmd.run:
    - name: curl -L https://bootstrap.saltstack.com -o /etc/salt/cloud.deploy.d/bootstrap-salt.sh
    - creates: /etc/salt/cloud.deploy.d/bootstrap-salt.sh
Since version 2016.3.0, downloading a file with file.managed can be done even if you don't have access to the hash, by adding skip_verify: True. For the example given, it would be:
download_stuff:
  file.managed:
    - name: localfile.tar.gz
    - source: http://someserver.net/onlinefile.tar.gz
    - skip_verify: True
From the docs:
If True, hash verification of remote file sources (http://, https://, ftp://) will be skipped, and the source_hash argument will be ignored.

Why are my files smaller after I FTP them using this Python program?

I'm trying to send some files (a zip and a Word doc) to a directory on a server using ftplib. I have the broad strokes sorted out:
import ftplib

session = ftplib.FTP('ftp.server', 'user', 'pass')
filewpt = open(file, mode)
readfile = open(file, mode)
session.cwd('new/work/directory')
session.storbinary('STOR filename.zip', filewpt)
session.storbinary('STOR readme.doc', readfile)
print "filename.zip and readme.doc were sent to the folder on ftp"
readfile.close()
filewpt.close()
session.quit()
This may give someone else what they are after, but not me. I have been using FileZilla as a check to make sure the files were transferred. When I see they have made it to the server, I notice that they are both far smaller, or even zero K in the case of the readme.doc file. Now I'm guessing this has something to do with the fact that I stored the file in 'binary transfer mode' <--- whatever that means.
This is where my problems lie. I have no idea at all (yet) what is meant by binary transfer mode. Is it simply that I have to use retrbinary to return the files to their original state?
Could someone please explain to me like I'm a two year old what has happened to my files? If there's any more info required, please let me know.
This is a fantastic resource and solved most of my problems. I'm still trying to work out the intricacies of FTP, but I guess I will save that for another day. The link below builds a function to effortlessly upload files to an FTP server without the partial-upload problem that I've seen more than one Stack Exchanger run into.
http://effbot.org/librarybook/ftplib.htm
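For completeness, here is a minimal sketch (not the effbot function itself) of the same upload with the files opened in binary mode ('rb'); text-mode reads are a common cause of smaller or zero-byte files on the server. The server name, credentials, and paths are placeholders:

import ftplib

# Placeholder server, credentials, and paths.
session = ftplib.FTP("ftp.example.com", "user", "pass")
session.cwd("new/work/directory")

# Open each local file in binary mode ("rb") so nothing is translated or
# truncated on the way to storbinary.
with open("filename.zip", "rb") as filewpt:
    session.storbinary("STOR filename.zip", filewpt)
with open("readme.doc", "rb") as readfile:
    session.storbinary("STOR readme.doc", readfile)

session.quit()
print("filename.zip and readme.doc were sent to the FTP folder")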

Using "rundll32.exe" to access SpeechUX.dll

Good Day,
I have searched the Internet tirelessly trying to find an example of how to start Windows Speech Training from within my VB.Net Speech Recognition Application.
I have found a couple of examples, which I cannot get working to save my life.
One such example is on the Visual Studio Forums:
HERE
This particular example uses the "Process.Start" call to try to start the Speech Training session. However, this does not work for me. Here is the example from that thread:
Process.Start("rundll32.exe", "C:\Windows\system32\speech\speechux\SpeechUX.dll, RunWizard UserTraining")
What happens is I get an error that says:
There was a problem starting
C:\Windows\system32\speech\speechux\SpeechUX.dll
The specified module could not be found
So I tried creating a shortcut (.lnk) file, thinking I could access the DLL that way. My shortcut does much the same thing: in the shortcut I call "rundll32.exe" with the parameters:
C:\Windows\System32\rundll32.exe "C:\Windows\system32\speech\speechux\SpeechUX.dll" RunWizard UserTraining
Then in my VB.Net application I use the "Process.Start" and try to run the shortcut.
This also gives me the same error. However, the shortcut itself will start the Speech Training session. Weird?!?
So, I then took it one step further, to see if it has something to do with my VB.Net Application and the "Process.Start" Call.
I created a VBScript, and using "Wscript.Shell" I point to the Shortcut.
Running the VBScript calls the shortcut and, lo and behold, the Speech Training starts!
Great! But...
when I try to run the VBScript from my VB.Net application, I get that error again.
What the heck is going on here?
Your problem is likely that your program is compiled as 32-bit and your OS is 64-bit; thus, when you try to access "C:\Windows\System32\Speech\SpeechUX\SpeechUX.dll" from your program, you're really accessing "C:\Windows\SysWOW64\Speech\SpeechUX\SpeechUX.dll", which, as rundll32.exe reports, doesn't exist.
Compile your program as 64-bit instead or try the pseudo directory %SystemRoot%\sysnative.
Also, instead of rundll32.exe, you may want to just run SpeechUXWiz.exe with an argument.
E.g.:
private Process StartSpeechMicrophoneTraining()
{
    Process process = new Process();
    process.StartInfo.FileName = System.IO.Path.Combine(Environment.SystemDirectory, "speech\\speechux\\SpeechUXWiz.exe");
    process.StartInfo.Arguments = "MicTraining";
    process.Start();
    return process;
}

private Process StartSpeechUserTraining()
{
    Process process = new Process();
    process.StartInfo.FileName = System.IO.Path.Combine(Environment.SystemDirectory, "speech\\speechux\\SpeechUXWiz.exe");
    process.StartInfo.Arguments = "UserTraining";
    process.Start();
    return process;
}
Hope that helps.
Read more about Windows 32-bit on Windows 64-bit at http://en.wikipedia.org/wiki/WoW64
or your problem specifically at http://en.wikipedia.org/wiki/WoW64#Registry_and_file_system
If you are using a 64-bit OS and want to access the System32 folder from a 32-bit process, you must use the directory alias, which is "sysnative".
"C:\Windows\sysnative" will give you access to the System32 folder and all its contents.
Honestly, whoever decided this at Microsoft was just being silly!

Verify digital signature within system32/drivers folder

I've spent all night researching this without a solution.
I'm trying to verify the digital signature of a file in the drivers folder (C:\Windows\System32\drivers\*.sys); pick whichever one you want. I know that the code is correct, because if you move the file from that folder to C:\ the test works.
WinVerifyTrust gives error 80092003
http://pastebin.com/nLR7rvZe
CryptQueryObject gives error 80092009
http://pastebin.com/45Ra6eL4
What's the deal?
0x80092003 = CRYPT_E_FILE_ERROR = An error occurred while reading or writing to the file.
0x80092009 = CRYPT_E_NO_MATCH = No match when trying to find the object.
I'm guessing you're running on a 64-bit machine and WOW64 file system redirection is redirecting you to syswow64\drivers, which is empty. You can disable redirection with Wow64DisableWow64FsRedirection().
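(For illustration only: the same redirection toggle can be exercised from Python via ctypes. The linked pastebin code is native, so treat this purely as a sketch of the call sequence; it only has an effect when the calling process is 32-bit on 64-bit Windows.)

import ctypes
import os

kernel32 = ctypes.windll.kernel32

# Temporarily disable WOW64 file system redirection so that
# C:\Windows\System32\drivers refers to the real 64-bit directory
# instead of the (empty) SysWOW64\drivers alias.
old_value = ctypes.c_void_p()
if kernel32.Wow64DisableWow64FsRedirection(ctypes.byref(old_value)):
    try:
        print(os.listdir(r"C:\Windows\System32\drivers")[:5])
    finally:
        kernel32.Wow64RevertWow64FsRedirection(old_value)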
If you right-click and view the properties of the file, can you see a digital signature? Most likely your file is part of a catalogue, and you need to use the catalogue API to extract the cert from the cert DB and verify it.