Automating a Website Search and Download of hundreds of Files Using Python

I am trying to automate a download process from a website for my data curation. It involves a few searches and selecting the right file to download if the search term matches. Could you please give me some insights or sample code to start with?
The flow of the process is:
Go to website --> Click on Advanced Search --> Click on "Title" --> Select from a few dropdowns to fix the search space --> Enter the search string --> Select the correct result and download the latest release.
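
A minimal starting sketch for the last two steps of that flow (selecting the correct result and downloading the latest release), written for Python 3 with only the standard library. The question does not name the website, so the search step itself is left out. The result fields (`title`, `released`, `url`), the helper names `pick_latest` and `download`, and the example.com URLs are all hypothetical; filling in the Advanced Search form would need a browser-automation or HTTP library once the site's markup is known.

```python
import shutil
import urllib.request
from datetime import date


def pick_latest(results, term):
    """Return the newest result whose title equals term (case-insensitive), or None.

    `results` is assumed to be a list of dicts with hypothetical keys
    'title', 'released' (a date) and 'url', as scraped from a results page.
    """
    wanted = term.strip().lower()
    matches = [r for r in results if r["title"].strip().lower() == wanted]
    if not matches:
        return None
    # "Latest release" is taken to mean the most recent release date.
    return max(matches, key=lambda r: r["released"])


def download(url, dest_path):
    """Stream the file at url to dest_path and return dest_path."""
    with urllib.request.urlopen(url) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest_path


# Hypothetical search results, standing in for what the site would return.
SAMPLE_RESULTS = [
    {"title": "Dataset A", "released": date(2016, 1, 10), "url": "https://example.com/a-v1.zip"},
    {"title": "Dataset A", "released": date(2017, 3, 2), "url": "https://example.com/a-v2.zip"},
    {"title": "Dataset B", "released": date(2018, 5, 5), "url": "https://example.com/b-v1.zip"},
]
```

For hundreds of files, the same two helpers could be called in a loop over the list of search strings, skipping any term for which `pick_latest` returns `None`.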

Related

How to share Newman htmlextra report?

This may be a basic question but I cannot figure out the answer. I have a simple Postman collection that is run through Newman:
newman run testPostman.json -r htmlextra
That generates a nice dynamic HTML report of the test run.
How can I then share that with someone else? i.e. via email. The HTML report is just created through a local URL and I can't figure out how to save it so it stays in its dynamic state. Right clicking and Save As .html saves the file, but you lose the ability to click around in it.
I realize that I can change the export path so it saves to some shared drive somewhere, but aside from that is there any other way?

It has already been saved to newman/ in the current working directory, so there is no need to 'Save As' again. You can zip it and send it via email.
If you want to change the location of generated report, check this.

Executing Alexa Tutorial Code ALWAYS Fails - Beginner

I'm new to Alexa Skill development and I'm sure this issue is process/environmental due to lack of experience.
Whenever I try to use a sample from an official Alexa tutorial, I can never get the skill to pass the first TEST - always getting an error :(
In this case I am trying to run and fiddle with this tutorial:
https://developer.amazon.com/blogs/post/TxHGKH09BL2VA1/New-Alexa-Skills-Kit-Template-Step-by-Step-Guide-to-Build-a-Decision-Tree-Skill
What is happening / What I've done:
I download the Node SDK from the Git link, I also download the sample from the Git link. I then create a new ZIP that contains the sample code with the Node SDK included in the path /src/alexa-sdk/
I go to AWS and create a new function, not using a blueprint. I 'author from scratch' and create a function with the Skills Kit as a trigger. I name the function and use Node 6.10 runtime.
I upload my ZIP file and leave all boxes default, for Role I choose Custom Role then pick Basic Execution from the Role screen.
I leave the rest blank, go to NEXT and CREATE.
The function is created okay, but I do see this error 'This function contains external libraries. Uploading a new file will override these libraries.'
Here's the problem - this is the point of failure on all tutorials I've tried so far. I go to Configure Test Event, I choose ALEXA START SESSION as the template and click Save And Test...
EXECUTION RESULT FAILED:
{
"errorMessage": "Cannot find module '/var/task/index'",
"errorType": "Error",
"stackTrace": [
"require (internal/module.js:20:19)"
]
}
Here's something from associated error logs, unsure if it's useful:
Unable to import module 'index': Error
at Module.require (module.js:497:17)
at require (internal/module.js:20:19)
I have noticed two things that I suspect may be an issue:
1) When I go to the CODE tab for this function, I see this message:
Your Lambda function "testprojectx" cannot be edited inline since the file name specified in the handler does not match a file name in your deployment package.
2) When I look at the code that's inserted into the test when I choose ALEXA SESSION START, I see many instances of 'unique value here':
amzn1.echo-api.session.[unique-value-here]
Although, there is no mention of this in the tutorial link I am referencing.
I'm really downhearted about it now as this is the 3rd tutorial I've tried to configure. Can anybody with experience follow the steps I've taken and point me in the right direction?
Thank you SO MUCH in advance if so.
EDIT: Absolute Clarification on how I am creating the ZIP file
I'm using Windows 10 and Chrome to download the files from GitHub.
I download the skill-sample-nodejs-decision-tree-master ZIP file from GitHub,
I do not know how to use NPM so I do this simply via downloading to desktop.
I then download the alexa-skills-kit-sdk-for-nodejs-master.ZIP file to desktop.
I unzip the contents of decision-tree-master into a folder on the desktop also called alexa-skills-kit-sdk-for-nodejs-master.
Within this folder, I navigate to /src/ and create a new folder called 'node_modules' within /src/.
Within /src/node_modules/ I now create another new folder called 'alexa-sdk'.
I unzip the contents of alexa-skills-kit-sdk-for-nodejs-master.zip into /src/node_modules/alexa-sdk/.
I have tried two approaches from here - both fail:
1) I ZIP only the contents of /src/ (not including the /src/ folder itself) and upload to Amazon.
2) I ZIP the entire 'decision-tree-master' folder and upload to Amazon.
I must be missing something, as I said this is just one of many Alexa tutorials I've tried to get working and this always happens :( So disheartened now.

This is a common issue that I have seen in many posts. In most cases the problem is the way the files are zipped. Instead of zipping the folder, select all the files inside it and zip those, so that the handler file sits at the top level of the archive rather than inside a subfolder.
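
If zipping by hand keeps going wrong, the same thing can be scripted. This is only a sketch in Python 3: the function name `zip_folder_contents` is made up for illustration, and it assumes the handler file is `index.js` directly inside the folder being zipped, as the `Cannot find module '/var/task/index'` error suggests.

```python
import os
import zipfile


def zip_folder_contents(src_dir, zip_path):
    """Zip everything inside src_dir, storing paths relative to src_dir.

    The folder itself is not included, so src_dir/index.js becomes
    index.js at the top level of the archive. zip_path should live
    outside src_dir so the archive does not try to include itself.
    Returns the sorted list of archive member names.
    """
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                # Zip member names always use forward slashes.
                arcname = os.path.relpath(full, src_dir).replace(os.sep, "/")
                zf.write(full, arcname)
                names.append(arcname)
    return sorted(names)
```

Printing the returned list is a quick way to confirm that `index.js` appears without any leading folder name before uploading.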

WOWZA LiveAutoRecord

I am tired of one problem so please make things clear to me.
Please read these following three points and help me out.
(1)
I have simply followed this https://www.wowza.com/docs/how-to-start-and-stop-live-stream-recordings-programmatically-livestreamrecordautorecord-example#documentation
I have attached my Application.xml. When I publish a live stream named "test1" via FMLE, it gets recorded on the server, but when I run a different instance of FMLE on a different PC and publish a live stream named "test2", it does not get recorded; I think it goes into the previously recorded file "test1" (meaning no separate file is recorded, although there should be two recorded files, test1 and test2).
Why is this happening?
Is com.wowza.wms.plugin.livestreamrecord.module.ModuleAutoRecordAdvancedExample meant for single-stream recording? That is, if I publish streams A, B, C and D, will it record them all in one single file (probably A.mp4, as A was the first published stream)?
(2) What is the module at https://www.wowza.com/docs/how-to-start-and-stop-live-stream-recordings-programmatically-imediastreamactionnotify3#comments for?
I have implemented this code in Eclipse, put the jar in the lib folder and configured everything. Again I am not able to record different streams under their corresponding names. That is, if I publish stream1 and stream2, the desired output is two different files (in the content folder), but again I see only one single file being recorded.
(3) Can I use ModuleLiveStreamRecord.java? This was in an older version of WOWZA, but I have properly imported the required jar and tested it.
My requirement is very simple:
As soon as users start publishing, WOWZA should start live recording. If 10 users are publishing live, 10 files should be generated.

Don't make things more difficult than necessary (assuming you have Wowza 4.x; if you still have 3.x then I highly recommend upgrading, which is free).
Open the Engine Manager (http://your.server.com:8088)
Go to "Applications" from the top menu
Select your application from the left menu (e.g. "live")
In the setup window for this application, click the blue Edit button
Enable "Record all incoming streams"
Click "Save"
Click the orange "Restart now" button at the top
Done
Every stream that is published via this application will now automatically be recorded. The default folder for recordings is the /content folder in your Wowza installation. You can change this on the same page under "Streaming File Directory" (make sure it's a directory on your local system, unless you understand very well how Wowza works).
The filename is always the streamname + ".mp4", but when you start a new recording while the file already exists, the old file will be renamed first.
Want to control recording manually? Start publishing first, then select "Incoming streams" from the left menu and use the big red dot button behind a stream name to start recording.
If your server produces any different behavior with regards to the file (re)naming or recording, then you may need to review your Wowza setup.

I appreciate your response KBoek.
I sorted out the issue, but it really did need debugging, as is the case when writing a custom module. I had to write a custom module for live auto-recording because I wanted HTTP authentication and custom names for the live recordings.
Thanks again.

FinalBuilder 7 - Where are log files?

I found a file named XXX.fbl7 whose type is "FinalBuilder Log File".
If this is the file, how can I open it?
When I click it in Windows Explorer I get "The project file specified on the command line was not found or invalid".

In the desktop application, in the Tools->Options->Logging section, there is a tab that allows you to export the logs to text, XML and HTML.
In a Final Builder Project, you can also use a Final Builder Action to create a separate log file in the format you want.
Personally I just use the Final Builder Server Notification options to let me know when things have failed, then go to the web server and review the full log in the web page.

Final Builder log files are binary files. You can't open them without having the product installed. As far as I remember, there was an option to export them as text files. But I haven't used FB for years, so I might be wrong.

How to embed a hgactivity graph in hgweb

I would like to embed an activity graph created by hgactivity inside my hgweb web interface. What's the best method to do so?
An hgactivity graph (screenshot not included here) shows the number of commits through time to a Mercurial repository.

The difficulty you'll have is where to put the chart so it can be served. If you're okay with having a standard view that everyone sees you could use a cron job to run hg activity and save the image to a standard filename in with the hgweb static files (the css, etc.). Then just tweak your hgweb template to include an img tag that references the image file. If your cron job is overwriting that file periodically (daily, hourly?) you'll be good to go.
If you need something more dynamic (user-specific queries, particular date ranges, etc.) you might want to look at (my) hg chart extension. It's not as fully featured as hg activity, but it does have the advantage of spitting out Google Chart API URLs rather than image files. Example:
https://chart.apis.google.com/chart?cht=lxy&chs=400x400&chd=e:AAAKAaAjAtA6BHBQBaBkBtB3CACKCUChCqC0C9DHDRDaDkDuD3EBEOEXEhExE7FIFRFbFlFuF4GBGOGeGyG7HFHOHbHlHyIFIVIiIyI8JMJcJlJyJ8KcK2LGLWL8MQMwNDNTNgNqNzOAONOaOjOtO3PAPKPUPdPnPwP6QEQNQXQhQqQ0Q-RORXRnR0SBSLSUSeSrS0S-TITRTeTuT7UIUVUeUoU1VFVPVbVoVyV8WFWPWYWiWsW1W.XJXSXcXmXvX5YGYSYfYpYzY8ZGZTZcZpZzZ8aGaQaZajata6bDbNbWbgbwcDcQcacjc0c9dHdQdadkdtd3eBeNeae3fEfOfXfnf0gOgegug4hBhVhhhrh1h-iIiSibiliyjFjVjlj.kSkckpk1lClSlflvmDmMmWmfmpmznAnJnTncnmnwn5oDoNoWogoqo2pApKpTpdpnpwp9qHqUqdqnq3rArRrkr0r-sKsXshsqs0tLtbtkt0uEuRuou7vFvOvYvivrv4wFwPwfwowyw7xFxPxYxlxvx4yFyVyfypyyy8zGzPzZzizsz20D0M0W0g0p0z081J1T1d1m1w152J2g3Q3q3z4E4Q4g4t5B5K5U5k5u536B6R6r677E7R7h707-8O8b8l8x879F9S9b9o9y97-P-f-o-y-8.F.P.Y.i.s.1..,VnFsKVETK.eWNyCaLTTrSnBdN.MKMVTTHuL8SLLBAbENHZD.HrE8CEKSC1G1H9CiSeJiMb..ItFLFDmnDBIhMKCVFcDbFaCAOuNUEsBtepD3DuBTA6DfGjBoDdDLAuHpAVFWEjI5CYCzAtGWGqFTAhfrDFGxHbFVNZBjE7EBAbDjEaK2CjJXAnHeDpFyGhRSD2OWGJajC.KGHreDISCqGtKVHUCZKbFtCHhId8GrB2EpHRJqItR5A5OSSrOJHgDpKmBHA4D2C1BbE4KBHbCtFHKQW7QpQuKRJDMSEGfDDrDZAeB2VqEPGkHlFHJrHuFFJ-IcB5DQFaGZAaArATA4AJALDaBmCTCkCoAlEtAkEPHpCwE.ETGbFfC9BZJtMJBNBwBPCZHzA3CEAUEiCBBqPdcDIwLnPjFPH3B9S-GNFbDqDaOfdOKcGDKaHeK8IODGJdDXCUCdHADbBQDKCIB1DGAzDCWKLREaCGAFAeA7DEPCA0BZC5FSc0OTC9N7ANKGDGQMEPPfN.BSFHBwJeHiH-FvJlXxEuF1K-M0COEbHHDfB-FKA-TpaADISdHoXiMUMGETE2HnBFBqIYAVATAWA2F5DOEELxNmElS-EDBFFRBBHaEFAyE2AbI9SHDKDSDSFqBtCyFQFZFeBCHhAuCKAibPDlCjXXMRDYKXCq&chxt=y,x&chxl=1:%7c05/03/05%7c03/17/06%7c01/30/07%7c12/15/07%7c10/29/08&chxr=0,0,7166
With that approach there are no files to save or serve. You tweak your template to invoke a little code that runs hg chart, insert the URL into the page's HTML, and let Google create and serve the image.

I came up with the following setup:
Add a folder named activity to the static folder of the template
Add a changegroup hook called activity in hgwebconfig:
[hooks]
changegroup.activity = hg activity --filename /usr/share/mercurial/templates/static/activity/${PWD##*/}.png
The ${PWD##*/} will be replaced by the folder name of the repository (a hook script is run in the root of a repository).
Upon triggering (push or pull of one or more changesets) an activity graph is placed in the static/activity folder of the (default) template folder.
Now you can add the following HTML to the template page of your preference
<img src="{staticurl}/activity/{repo}.png"/>
This will load the most recent activity graph for the current repository.
Caveat:
You need at least one push after activation of this hook before the image is created.
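
For anyone who would rather drive this from a script (for example a cron job instead of a hook), here is a rough Python 3 sketch of what the hook line does: it takes the repository's folder name, as `${PWD##*/}` does, and builds the `hg activity --filename ...` command from it. The function name `activity_command` is hypothetical; only the `--filename` option shown in the hook above is assumed to exist.

```python
import os


def activity_command(repo_root, static_dir):
    """Build the hg activity command writing <static_dir>/<repo folder name>.png.

    os.path.basename(os.path.normpath(...)) mirrors the shell expansion
    ${PWD##*/}: it keeps only the last component of the repository path.
    """
    repo_name = os.path.basename(os.path.normpath(repo_root))
    target = os.path.join(static_dir, repo_name + ".png")
    # Returned as an argument list so no shell quoting is needed.
    return ["hg", "activity", "--filename", target]
```

The returned list can be passed to `subprocess.call` with `cwd` set to the repository root, so that hg runs against the right repository.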

I started a project that has this built in. You can see a demo at
http://hg.python-works.com. It is Pylons-based and has an activity graph.
