WORD and PDF to text Web Service - web-services

I'm looking to write (or use an existing) web service that takes an MS Word or PDF file, extracts its content, and returns it as text.
Does anyone know of such a service, or how to write one?

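For Word-to-text you can use antiword and pass its output to the client.
For PDF, there's PdfTk, though its dump_data operation reports document metadata (page count, bookmarks, info fields) rather than the page text. For the text itself, a tool such as pdftotext is the usual choice.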
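As a rough sketch of how a service could dispatch to these tools, the Python below picks a command by file extension and captures its standard output. It assumes antiword and pdftotext are installed and on the PATH. pdftotext (from Poppler or Xpdf) is an assumption not taken from the answer above; it is used here because PdfTk's dump_data returns metadata rather than text. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_command(path):
    """Return the external command that prints the file's text to stdout."""
    suffix = Path(path).suffix.lower()
    if suffix == ".doc":
        # antiword writes the extracted text to stdout by default.
        return ["antiword", str(path)]
    if suffix == ".pdf":
        # "-" as the output file tells pdftotext to write to stdout.
        return ["pdftotext", str(path), "-"]
    raise ValueError(f"unsupported file type: {suffix!r}")


def extract_text(path):
    """Run the extractor and return its output as a string."""
    result = subprocess.run(build_command(path), capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")
```

A web service would save the uploaded file to a temporary path, call extract_text on it, and return the string in the response body.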

Related

How to read a PDF file using Amazon Polly?

In an AWS blog post (second line of the first paragraph), it's mentioned that we can convert the text in a PDF document to speech.
I tried to find documentation on reading PDFs, but I still haven't found a solution.
This might help:
After logging on to the Amazon Polly console, choose Get started, and then choose the Text-to-Speech tab.
Choose the Plain text tab.
Type or paste this text into the input box.
Hi.
How are you,
I am testing this service.
For Choose a language and region, choose English US, then choose the voice you want to use for this text.
To listen to the speech immediately, choose Listen to speech.
To save the speech to a file, do one of the following:
a. Choose Save speech to MP3.
b. To change to a different file format, choose Change file format, choose the file format you want, and then choose Change.
For more details, click here.
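The console steps above take plain text, so a programmatic version would need the PDF text extracted first. The sketch below assumes that has already been done and that boto3 is installed with AWS credentials configured. It splits the text into chunks and sends each one to Polly's synthesize_speech call. The 3000-character chunk size, the Joanna voice, and the function names are assumptions for illustration; check the current Polly quotas before relying on the limit.

```python
import textwrap

MAX_CHARS = 3000  # assumed per-request text limit; verify against Polly quotas


def build_requests(text, voice_id="Joanna", output_format="mp3"):
    """Split text on whitespace and build one parameter set per chunk."""
    chunks = textwrap.wrap(text, width=MAX_CHARS)
    return [
        {"Text": chunk, "VoiceId": voice_id, "OutputFormat": output_format}
        for chunk in chunks
    ]


def synthesize_to_file(text, out_path):
    """Send each chunk to Polly and append the returned audio to one file."""
    import boto3  # imported here so build_requests works without AWS set up

    polly = boto3.client("polly")
    with open(out_path, "wb") as out:
        for params in build_requests(text):
            response = polly.synthesize_speech(**params)
            # AudioStream is a streaming body; read() returns the audio bytes.
            out.write(response["AudioStream"].read())
```

Calling synthesize_to_file(extracted_text, "speech.mp3") would then be the scripted equivalent of Save speech to MP3.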

Specify output filename of Cloud Vision request

I'm using Google Cloud Vision (for Node.js) and trying to dynamically upload a document to a Google Cloud bucket, process it with the Cloud Vision API, and then download the resulting .json. However, when Cloud Vision processes my request and writes the extracted text to my bucket, it appends output-1-to-n.json to the end of the filename. So if I process an 8-page file called foo.pdf, the output will not be foo.json (even though I specified that), but rather foooutput-1-to-8.json.
Of course, this could be remedied by checking the page count of the PDF before uploading it and appending that to the path I search for when downloading, but that seems like an unnecessarily hacky solution. I can't find anything in the documentation about not appending output-1-to-n to outputs. I'd be extremely happy for any pointers!
You can't specify a single output file for asyncBatchAnnotate because, depending on your input, many files may be created. The output config is only a prefix, and you have to do a wildcard search in GCS for your given prefix (so make sure your prefix is unique).
For more details see this answer.
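The prefix search can be sketched as follows. The question uses Node.js; this example is in Python and assumes the google-cloud-storage package for the listing step. The suffix pattern is inferred from the output-1-to-n.json naming described in the question, and the function names are hypothetical.

```python
import re

# Suffix that Cloud Vision appends to the output prefix, per the question.
OUTPUT_SUFFIX = re.compile(r"output-(\d+)-to-(\d+)\.json")


def find_outputs(object_names, prefix):
    """Return output files for this prefix, ordered by first page number."""
    matches = []
    for name in object_names:
        if not name.startswith(prefix):
            continue
        m = OUTPUT_SUFFIX.fullmatch(name[len(prefix):])
        if m:
            matches.append((int(m.group(1)), int(m.group(2)), name))
    return [name for _, _, name in sorted(matches)]


def list_bucket_outputs(bucket_name, prefix):
    """List the bucket by prefix and keep only the Vision output files."""
    from google.cloud import storage  # imported lazily; needs credentials

    client = storage.Client()
    names = [blob.name for blob in client.list_blobs(bucket_name, prefix=prefix)]
    return find_outputs(names, prefix)
```

Matching the full suffix rather than only the prefix keeps a sibling document such as foobar from being picked up when searching for foo, which is the reason the prefix should be unique.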

JitterBit HTTP Endpoint

I am setting up an HTTP Endpoint in JitterBit. Another system will call this endpoint and pass parameters to it through the URL.
For example:
http://[server]:[server port]/EndPoint?Id={SalesForceID}&Status={updated status in SF}
Would I need to use the Text File, JSON, or XML method for this? As a follow-up, if it is JSON or XML, what would the file that is uploaded while creating the endpoint look like? I have tried the text file version with no success.
Any help would be great.
I'm just seeing your question now. You may have found a solution, but this took me a while to figure out, so I'll respond anyway.
To get the passed values, go ahead and create your HTTP Endpoint and add a new operation triggered by it. Then, in your new operation create a script with something like the following:
$SalesForceID = $jitterbit.networking.http.query.Id
$UpdatedStatus = $jitterbit.networking.http.query.Status
You can then use these variables elsewhere in your operation chain.
If you want to use these values to feed into another RESTful web service (i.e. an HTTP Source), you'll have to create a separate transformation operation with the HTTP Source. You'd set that source URL to be: http://mysfapp.com/call?Id=[SalesForceID]&Status=[UpdatedStatus]. I'm not sure why, but you can't have the script that extracts the parameters from the Endpoint and the HTTP Source that uses those in the same operation.
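To make the data flow concrete outside JitterBit, here is a small Python sketch of the same two steps: reading Id and Status from the incoming query string, then building the outgoing URL from them. It is an illustration only and not JitterBit script. The sample ID and status values are made up, and the function names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def extract_params(request_url):
    """Read the Id and Status query parameters from the incoming URL."""
    query = parse_qs(urlsplit(request_url).query)
    return {
        "SalesForceID": query["Id"][0],
        "UpdatedStatus": query["Status"][0],
    }


def build_downstream_url(base_url, params):
    """Build the URL for the second call from the extracted values."""
    query = urlencode(
        {"Id": params["SalesForceID"], "Status": params["UpdatedStatus"]}
    )
    return f"{base_url}?{query}"
```

Note that values containing spaces or other reserved characters must be URL-encoded on the way out, which urlencode handles here.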
Cheers

How To Pass Different Parameters To web Service Every Time It Is Called With Jmeter

I am testing a web service using JMeter. The web service has several methods that take parameters. What I need to do is pass different parameters every time a user (thread) calls a web service method.
I know I can do something like that by writing SOAP messages to XML files and giving JMeter the path of the folder containing them, but JMeter picks those files at random, so the same file may be used twice or more. I want JMeter to use a different, unique SOAP message for every request.
Can anyone help me?
Use a CSV Data Set Config and prepare a CSV file containing your parameters.
Set Recycle on EOF to False.
Set Stop thread on EOF to True.
Set Sharing mode to All threads.
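The CSV file itself can be generated with a short script. The sketch below writes one unique row per request. The two columns (a user name and an order ID) are hypothetical placeholders for whatever parameters your SOAP methods take, and it assumes the names are entered in the Variable Names field of the CSV Data Set Config, so no header row is written.

```python
import csv


def write_parameter_rows(stream, count):
    """Write `count` rows, each with values that appear nowhere else."""
    writer = csv.writer(stream, lineterminator="\n")
    for i in range(1, count + 1):
        # Both columns are derived from the row index, so no row repeats.
        writer.writerow([f"user{i:04d}", 1000 + i])


# Usage (writes params.csv with 500 unique rows):
#   with open("params.csv", "w", newline="") as f:
#       write_parameter_rows(f, 500)
```

With Variable Names set to userName,orderId, the SOAP request body can reference ${userName} and ${orderId}. Because rows are shared across all threads and not recycled, each request consumes a row that no other request sees, and threads stop once the file runs out.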

Upload Photos using Silverlight - RIA Services

I'm trying to find a good example of uploading and downloading images using only Silverlight and RIA Services. I looked for one but couldn't find any, so any help would be appreciated.
Thank you all in advance.
I just found a useful walkthrough here. Make sure to read the follow-up, which improves the save process and the image used.
We did it by saving the images on disk (not in a DB) - like this:
Upload image:
Write a Domain Service with an operation like void UploadJPGImage(string uniqueName, byte[] jpgBytes). This needs to be marked with the attribute for ClientAccess. The (server-side) implementation saves the image on the disk.
For the uniqueName, we generate a GUID client-side.
Download image:
HTTP Handler - write an HTTP handler for downloading the image using a URL containing the unique name parameter passed by the client when uploading the image
Or, one could write a Domain Service operation, like byte[] DownloadJPGImage(string uniqueName)
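The save-to-disk pattern described above can be sketched independently of RIA Services. The Python below is a stand-in for the server-side logic only, not Silverlight or Domain Service code. The function names mirror the operations named in the answer, and the storage directory argument is an assumption. Validating the name as a GUID before building the path is an added precaution, so a client cannot pass a name that escapes the storage folder.

```python
import uuid
from pathlib import Path


def _image_path(storage_dir, unique_name):
    """Build the file path, rejecting any name that is not a valid GUID."""
    guid = uuid.UUID(unique_name)  # raises ValueError for anything else
    return Path(storage_dir) / f"{guid}.jpg"


def upload_jpg_image(storage_dir, unique_name, jpg_bytes):
    """Server side of the upload: write the bytes to disk under the GUID."""
    path = _image_path(storage_dir, unique_name)
    path.write_bytes(jpg_bytes)
    return path


def download_jpg_image(storage_dir, unique_name):
    """Server side of the download: return the stored bytes for the GUID."""
    return _image_path(storage_dir, unique_name).read_bytes()
```

The client would generate the name with the equivalent of str(uuid.uuid4()), send it with the image bytes, and later use the same name to request the image back.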