Headless Chrome CLI in production - Clojure

I will be doing some PDF generation for my application. Currently, my plan is to create HTML using templates and convert it to PDF.
The PDFs aren't long, three pages at most, and we will be generating roughly 100 documents a day.
I was happy with the results I got from chrome --headless on my local machine. I called the CLI command directly from my Clojure code. So far so good. Looking at the number of wrappers available (Browserless, Chromeless, Puppeteer, ...), I'm not sure about the scalability factor in production.
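The call is essentially the following (a minimal sketch; the binary name and paths are placeholders for whatever your install uses):

    (require '[clojure.java.shell :as shell])

    ;; Shell out to headless Chrome to print an HTML file to PDF.
    ;; Returns the shell result map; a non-zero :exit is treated as failure.
    (defn html->pdf [html-path pdf-path]
      (let [{:keys [exit err] :as result}
            (shell/sh "google-chrome"
                      "--headless"
                      "--disable-gpu"
                      (str "--print-to-pdf=" pdf-path)
                      html-path)]
        (when-not (zero? exit)
          (throw (ex-info "Chrome PDF conversion failed" {:exit exit :err err})))
        result))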
Is it safe to call the Chrome CLI directly on production boxes?
What will I miss if I skip these wrappers?
My server side stack is Clojure/Compojure/Leiningen. Any insights/alternatives are appreciated.

I'm using Athena PDF for PDF generation in combination with Clojure:
https://github.com/arachnys/athenapdf
It has a REST interface, and since it runs in Docker it's easy to scale.
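For example, converting a page from Clojure with clj-http looks roughly like this (a sketch assuming the "weaver" microservice from the athenapdf repo is running locally; the port and auth key here are its documented defaults at the time of writing, so check the README for your version):

    (require '[clj-http.client :as http]
             '[clojure.java.io :as io])

    ;; Sketch: the weaver service converts the page at `url` and
    ;; returns the PDF bytes in the response body.
    (defn url->pdf [url out-file]
      (let [resp (http/get "http://localhost:8080/convert"
                           {:query-params {"auth" "arachnys-weaver"
                                           "url"  url}
                            :as :byte-array})]
        (with-open [out (io/output-stream out-file)]
          (.write out ^bytes (:body resp)))))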

Instead of detouring through HTML and Chrome, I'd just use a PDF-creating library such as clj-pdf. Here is a nice blog post about it.
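For short, templated documents like the ones described above, the whole document is just Clojure data; a minimal sketch (the content is illustrative):

    (require '[clj-pdf.core :refer [pdf]])

    ;; clj-pdf builds the PDF directly from a data structure --
    ;; no HTML or browser in the pipeline.
    (pdf
     [{:title "Example document"}
      [:heading "Invoice #1042"]
      [:paragraph "Generated straight from Clojure data."]
      [:table {:header ["Item" "Qty" "Price"]}
       ["Widget" "2" "$10.00"]
       ["Gadget" "1" "$25.00"]]]
     "invoice.pdf")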
p.s. If you don't mind running a third program to generate the PDF, I would use Emacs with org-mode (or, heck, even write it in Elisp altogether) ;)

Related

HTML to PDF on Google AppEngine

We're currently trying to convert HTML files to PDF on AppEngine using Python. The HTML files are from a third-party vendor, so we have no control over their format. Both the Flexible and Standard environments are options, but every path we go down seems to lead to a roadblock:
PDFkit - requires a wkhtmltopdf install and no pip package is available, though it converts perfectly offline
xhtml2pdf / PISA - works even on GAE Standard, but doesn't support many features, such as CSS floats or badly formatted HTML
WeasyPrint - its C dependencies would in theory run on the Flexible environment, but no pip packages are available for dependencies including Cairo and Pango
Has anyone got a robust solution running on AppEngine with any of the above? Or with other libraries I am missing?
I ran into this same problem a year back and concluded that this is currently not possible on App Engine, at least with a good quality conversion. (Someone please point out if things have changed.)
xhtml2pdf - I was able to run it successfully on App Engine Standard, but I was not at all happy with the conversion quality.
PDFkit - I ran into a similar problem and came up with a different solution: I hosted PDFkit on a Compute Engine instance and exposed an endpoint where a POST request with the HTML file returns the converted PDF as the response. This gave me the best results in terms of quality and processing speed.
It did incur some extra charges, but I was able to use the instance for other things too ;). I chose the smallest possible configuration initially, since I was not storing anything on the instance.

Foswiki: Uploading and downloading topics without FTP

I have a Foswiki wiki on a server. Is it possible to script the following without FTP access (for various reasons I can't use it):
Download a topic's wikitext, modify it locally, then upload it again (overwriting the topic)
Upload wikitext to a new topic
I've been doing these tasks manually, but I'd like to automate them. I've looked into the Foswiki API and a few plugins, but nothing seems capable of doing this.
Is there a way? (any programming language)
If you have web access, you could drive the bin/view and bin/save scripts remotely from a script.
Take a look at our BuildContrib upload target for an example. It gets a strikeone key and downloads the original topic to recover any form data. It then uploads the topic text, creating a new version. It's written in Perl and uses LWP.
https://github.com/foswiki/distro/blob/master/BuildContrib/lib/Foswiki/Contrib/BuildContrib/Targets/upload.pm
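The same idea in outline, sketched here in Clojure with clj-http since the question allows any language. All hosts, names, and credentials below are placeholders, and the sketch glosses over the validation ("strikeone") key that most Foswiki installs require on save; see the BuildContrib target above for how to fetch and echo that back:

    (require '[clj-http.client :as http]
             '[clj-http.cookies :as cookies])

    (def base "http://wiki.example.org/bin")

    ;; Fetch a topic's raw wikitext via bin/view?raw=text.
    (defn download-topic [cs web topic]
      (:body (http/get (str base "/view/" web "/" topic)
                       {:query-params {"raw" "text"} :cookie-store cs})))

    ;; Save wikitext via bin/save; saving to a topic that doesn't
    ;; exist yet creates it.
    (defn upload-topic [cs web topic text]
      (http/post (str base "/save/" web "/" topic)
                 {:form-params {"text" text} :cookie-store cs}))

    ;; Log in once (TemplateLogin assumed) and reuse the session cookies.
    (let [cs (cookies/cookie-store)]
      (http/post (str base "/login")
                 {:form-params {"username" "me" "password" "secret"}
                  :cookie-store cs})
      (let [text (download-topic cs "Sandbox" "MyTopic")]
        (upload-topic cs "Sandbox" "MyTopic"
                      (str text "\n   * edited remotely\n"))))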
The following isn't(!) the right solution (surely a nicer Foswiki-native approach exists), but if you know Perl, you can do anything with the following:
Install Firefox
Install the MozRepl add-on into it
Install the WWW::Mechanize::Firefox Perl module
Now you can script anything you could do directly from the browser, e.g. logging into Foswiki, clicking buttons, saving topics, and so on. The drawback: it isn't an easy route, and you need to know many details.
I use this technique myself for testing.

Image processing on a web server

I want to run image processing algorithms on a server that can interact easily with web apps. The algorithms are compute-heavy and won't be found in off-the-shelf libraries. Currently I am using Ruby on Rails on Heroku for my website.
What would be the best architecture to achieve this: take images from the website, run an image processing algorithm on them, and display the results back on the website?
Most of my image processing code is in C/C++.
Can I call C/C++ code from Ruby on Rails directly? Is this possible on Heroku?
Or should I design a system where the C/C++ code exposes APIs that the Ruby on Rails server can call?
Heroku typically uses small virtual machine instances, so depending on just how heavy your processing is, it may not be the best choice of architecture. However, if you do use it I would do this:
Use a background task gem to do your processing, and have it run in a separate process (called a worker rather than a dyno in Heroku terminology). Delayed Job is a tried and tested solution for background tasks, with a wealth of online information about integrating it with Heroku; there are also newer options like Sidekiq, which use the threading system in modern versions of Ruby and would allow everything to be done in a single dyno. Either way, I'd keep all background processing away from the webserver dynos, so Delayed Job (or similar) would be fine.
As for integrating C/C++, I haven't needed to do this yet. However, I know it is possible to create gems that integrate C or C++ code and compile natively. As long as you're using Ruby rather than JRuby, I don't think Heroku should have a problem with them. There are other ways of accomplishing this; look at SO questions specifically about this topic, such as:
How can I call C++ functions from within ruby
It seems that you need to create an extension, then create a gem to contain it. These links may or may not help.
http://www.rubyinside.com/how-to-create-a-ruby-extension-in-c-in-under-5-minutes-100.html
http://guides.rubygems.org/gems-with-extensions/
I recommend making a gem, as I think it may be difficult to otherwise get libraries or executables onto a Heroku instance. You can store the gem in your vendor directory if you don't want to make it public.
Overall, I would have the webserver upload to S3 or wherever you're storing the images (this can be done directly from the browser using the AWS JS API, without using the webserver as a stepping stone; have a look for gems to help).
Then the webserver can request a background task to process the image.
If you're not storing them, things become a little more interesting... You'll need a database if you're using background tasks, so perhaps you could pass the image data to the worker as a blob in the database.
I really wouldn't do all the processing just in the webserver dyno, unless you're really only hitting this thing very occasionally. With multiple users you'd hit a bottleneck very quickly.
The background process can set a flag on the image's table row so the webserver can let the user know when processing is complete. (You can poll for status from the upload-complete screen using AJAX.)
Of course, there are many other ways of accomplishing this, depending on a number of factors.
Apologies that the answer is vague, but the question is quite open-ended.
Good luck.

ColdFusion continuous integration

Let me begin by saying I'm a ColdFusion newbie.
I'm trying to research whether it's possible to do the following, and what the best approach would be.
Whenever a developer checks code into SVN, I would like to fetch all the new changes/files and do an automatic build to check whether the code can be deployed successfully to the production server. I guess there are two parts to it: first syntax checking, and second integration testing (whether the functionality works as expected). For the latter part, some unit testing tools would have to be used.
Can someone comment on their experience doing something similar for ColdFusion?
Sorry for being a bit vague... I know it's a very open-ended question, but any feedback would be appreciated.
Thanks
There's a project called "Cloudy With A Chance of Tests" that purports to do what you require. In particular it brings together a number of other CFML code analysis projects (VarScope & QueryParam) to check code, as well as unit testing. I am not currently using it myself but did have a look at it some time ago (more than 12 months) and it appeared to be quite good.
https://github.com/mhenke/Cloudy-With-A-Chance-Of-Tests
Personally I run MXUnit tests in Jenkins using the instructions from the MXUnit site - available here:
http://wiki.mxunit.org/display/default/Continuous+Integration+--+Running+tests+with+Jenkins
Essentially this is set up as an ant task in Jenkins, which executes the MXUnit tests and reports back the results.
We're not doing fully continuous integration, but we have a process which automates some of the drudgery of our builds:
replace the site's application.cf(m|c) with one that tells users that the app is being deployed (we had QA staff raising defects that were due to re-deployments)
read a database manifest XML which lists all the SQL scripts that make up the current release, and concatenate the scripts into a single upgrade script suitable for shipping
execute the SQL script against the server's DB, noting any errors. The concatenation process also adds a line of SQL after each imported script that writes to a runlog table, so we can see what ran, how long it took, and which build it was associated with. If you're looking to replicate this step, take a look at Liquibase
deploy the latest code
make an HTTP call to a ?reset=true type URL to tell the app to re-initialize
execute any tests
The build is requested manually through the build servers we have, but you click a button, make tea and it's done.
We've just extended the above to cope with multiple servers in a cluster and it ticks along nicely. I think the above suggestion of using the Jenkins SVN plugin to automate the process sounds like the way to go.

Offline use of the Google Earth plugin

I have a use case that requires offline access to Google Earth. I know that Google Earth Enterprise offers a disconnected product; however, we may not have access to that product, and Google Earth Enterprise is prohibitively expensive at $25K for a dev license.
I would prefer to use the Google Earth plugin, since I am building an application and would like to use the JS API. Is it possible to host the Google Earth plugin on my own disconnected server? We would use Google Earth connected to a standalone offline WMS server for access to imagery.
Said another way: can I host the plugin and the corresponding JavaScript on my own server?
I don't know if I understand your problem correctly, but I can explain what I'm currently working on.
In my current application using the Google Earth plugin JS API, I'm able to start the plugin even when offline, but one requirement is to have cached data.
If you have cached data and you start the plugin offline, then zooming to a level with higher resolution than the one in your cached data will have no effect (imagery will not be updated to a higher resolution).
But depending on what you really need: yes, you can start the plugin offline.
This is not really answering your original question, but if you are interested, just tell me :-)
I tried to cache Google Earth with a proxy server, but I couldn't.
Furthermore, I think the API is validated against Google's servers every time it loads, and doesn't allow offline use.
It's been some months now since I worked with this.
I'll try to explain what I can remember :-)
In the HTML where I have my plugin, I removed:
<script type="text/javascript" src="https://www.google.com/jsapi"></script>
but I saved this jsapi.js file locally. I also saved loader_1-008.js locally.
Then, in my code (C++, Qt), I call evaluateJavaScript(QString source) twice,
where source is the text read from my two .js files.
These two evaluateJavaScript calls need to be made before loading my HTML (the one with the plugin) in my QWebView.
I can't remember much more, but I hope this can start to help you.