MQ - Data Compression on queue

Problem: MQ 7 has a hard limit of 100 MB per JMS message. For large payloads (XML) that come close to that limit, can the message be compressed on the queue to shorten the data length?
I tried compressing a 7 MB JMS string message using MQ ZLIB compression on the svr.def.conn channel, and it made no difference to the data length of the JMS message. I set compression on that one channel only and expected that the channel in use would compress the data going into the queue.
MQ Server: 7.5
Client: JAVA
Message Type: String

Channel level compression is used to compress the data in transit between the two ends of the channel, in your case between the JMS client and the MQ SVRCONN channel. The messages themselves will be compressed while going over the network but not while sitting on the queue.

I would recommend compressing the payload and using a BytesMessage. Message properties can be used to qualify the payload type, similar to HTTP headers, e.g. "Content-Encoding" and "Content-Type":
String payload = ...; // the xml
Session session = ...;
BytesMessage bytesMessage = session.createBytesMessage();
bytesMessage.writeBytes(compressGZIP(payload, StandardCharsets.UTF_8));
bytesMessage.setStringProperty("Content-Encoding", "gzip");
bytesMessage.setStringProperty("Content-Type", "text/xml; charset=utf-8");
Here is the compressGZIP method:
private byte[] compressGZIP(String string, Charset charset) throws IOException {
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
try (GZIPOutputStream out = new GZIPOutputStream(byteArrayOutputStream)) {
StringReader stringReader = new StringReader(string);
// IOUtils from apache commons-io
IOUtils.copy(stringReader, out, charset);
}
return byteArrayOutputStream.toByteArray();
}
The consumer can then ask for the message properties, decompress and re-create the xml based on the "Content-Encoding" and "Content-Type" message properties.
Something like this
public void onMessage(Message message) {
BytesMessage bytesMessage = (BytesMessage) message;
long bodyLength = bytesMessage.getBodyLength();
byte[] rawPayload = new byte[(int) bodyLength];
bytesMessage.readBytes(rawPayload); // copy the message body into the buffer before wrapping it
InputStream payloadInputStream = new ByteArrayInputStream(rawPayload);
String contentEncoding = bytesMessage.getStringProperty("Content-Encoding");
if("gzip".equals(contentEncoding)) {
payloadInputStream = new GZIPInputStream(payloadInputStream);
}
String contentType = bytesMessage.getStringProperty("Content-Type");
MimeType mimeType = new MimeType(contentType); // from javax.activation
if("text".equals(mimeType.getPrimaryType())) {
if("xml".equals(mimeType.getSubType())) {
Charset charset;
String charsetString = mimeType.getParameter("charset");
if(charsetString != null) {
charset = Charset.forName(charsetString);
} else {
charset = StandardCharsets.UTF_8; // default
}
Reader reader = new InputStreamReader(payloadInputStream, charset);
String xml = IOUtils.toString(reader);
IOUtils.closeQuietly(reader);
}
}
}
The advantage of this solution is that you stay on the standard JMS api instead of using a provider specific configuration.
The disadvantage is that the sender and receiver must implement content-type handling.
Thus you have to make a decision between portability and implementation effort.
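To get a feel for how much a repetitive XML payload shrinks before wiring this into JMS, the compress/decompress round trip can be tried on its own. The following is a minimal sketch that uses only the JDK (no JMS and no commons-io); the class name GzipPayload and the generated sample XML are made up for illustration and are not part of the answer above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: gzip a String payload and restore it again.
final class GzipPayload {

    /** Encodes the text with the given charset and gzips the resulting bytes. */
    static byte[] compress(String text, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(charset));
        } // closing the stream writes the gzip trailer
        return buffer.toByteArray();
    }

    /** Gunzips the bytes and decodes them with the given charset. */
    static String decompress(byte[] compressed, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }
        return new String(buffer.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        // Build a repetitive sample document (illustrative data only).
        StringBuilder xml = new StringBuilder("<orders>");
        for (int i = 0; i < 10_000; i++) {
            xml.append("<order id=\"").append(i).append("\"><status>NEW</status></order>");
        }
        xml.append("</orders>");
        String payload = xml.toString();

        byte[] raw = payload.getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(payload, StandardCharsets.UTF_8);
        System.out.println("raw bytes: " + raw.length + ", gzip bytes: " + packed.length);
        System.out.println("round trip ok: " + payload.equals(decompress(packed, StandardCharsets.UTF_8)));
    }
}
```

The actual ratio depends on the data; highly repetitive XML tends to compress well, while already-compressed content (images, archives) will not shrink much.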

Related

How to make POST request to a web server with C++ and Core Foundation APIs for macOS?

I'm trying to follow this example to let me make a POST request to a web server and receive its response in pure C++ using Core Foundation functions. I'll copy and paste it here:
void PostRequest()
{
// Create the POST request payload.
CFStringRef payloadString = CFStringCreateWithFormat(kCFAllocatorDefault, NULL, CFSTR("{\"test-data-key\" : \"test-data-value\"}"));
CFDataRef payloadData = CFStringCreateExternalRepresentation(kCFAllocatorDefault, payloadString, kCFStringEncodingUTF8, 0);
CFRelease(payloadString);
//create request
CFURLRef theURL = CFURLCreateWithString(kCFAllocatorDefault, CFSTR("https://httpbin.org/post"), NULL); //https://httpbin.org/post returns post data
CFHTTPMessageRef request = CFHTTPMessageCreateRequest(kCFAllocatorDefault, CFSTR("POST"), theURL, kCFHTTPVersion1_1);
CFHTTPMessageSetBody(request, payloadData);
//add some headers
CFStringRef hostString = CFURLCopyHostName(theURL);
CFHTTPMessageSetHeaderFieldValue(request, CFSTR("HOST"), hostString);
CFRelease(hostString);
CFRelease(theURL);
if (payloadData)
{
CFStringRef lengthString = CFStringCreateWithFormat(kCFAllocatorDefault, NULL, CFSTR("%ld"), CFDataGetLength(payloadData));
CFHTTPMessageSetHeaderFieldValue(request, CFSTR("Content-Length"), lengthString);
CFRelease(lengthString);
}
CFHTTPMessageSetHeaderFieldValue(request, CFSTR("Content-Type"), CFSTR("charset=utf-8"));
//create read stream for response
CFReadStreamRef requestStream = CFReadStreamCreateForHTTPRequest(kCFAllocatorDefault, request);
CFRelease(request);
//set up on separate runloop (with own thread) to avoid blocking the UI
CFReadStreamScheduleWithRunLoop(requestStream, CFRunLoopGetCurrent(), kCFRunLoopCommonModes);
CFOptionFlags optionFlags = (kCFStreamEventHasBytesAvailable | kCFStreamEventErrorOccurred | kCFStreamEventEndEncountered);
CFStreamClientContext clientContext = {0, (void *)payloadData, RetainSocketStreamHandle, ReleaseSocketStreamHandle, NULL};
CFReadStreamSetClient(requestStream, optionFlags, ReadStreamCallBack, &clientContext);
//start request
CFReadStreamOpen(requestStream);
if (payloadData)
{
CFRelease(payloadData);
}
}
And the callback:
void LogData(CFDataRef responseData)
{
CFIndex dataLength = CFDataGetLength(responseData);
UInt8 *bytes = (UInt8 *)malloc(dataLength);
CFDataGetBytes(responseData, CFRangeMake(0, CFDataGetLength(responseData)), bytes);
CFStringRef responseString = CFStringCreateWithBytes(kCFAllocatorDefault, bytes, dataLength, kCFStringEncodingUTF8, TRUE);
CFShow(responseString);
CFRelease(responseString);
free(bytes);
}
static void ReadStreamCallBack(CFReadStreamRef readStream, CFStreamEventType type, void *clientCallBackInfo)
{
CFDataRef passedInData = (CFDataRef)(clientCallBackInfo);
CFShow(CFSTR("Passed In Data:"));
LogData(passedInData);
//append data as we receive it
CFMutableDataRef responseBytes = CFDataCreateMutable(kCFAllocatorDefault, 0);
CFIndex numberOfBytesRead = 0;
do
{
UInt8 buf[1024];
numberOfBytesRead = CFReadStreamRead(readStream, buf, sizeof(buf));
if (numberOfBytesRead > 0)
{
CFDataAppendBytes(responseBytes, buf, numberOfBytesRead);
}
} while (numberOfBytesRead > 0);
//once all data is appended, package it all together - create a response from the response headers, and add the data received.
//note: just having the data received is not enough, you need to finish the response by retrieving the response headers here...
CFHTTPMessageRef response = (CFHTTPMessageRef)CFReadStreamCopyProperty(readStream, kCFStreamPropertyHTTPResponseHeader);
if (responseBytes)
{
if (response)
{
CFHTTPMessageSetBody(response, responseBytes);
}
CFRelease(responseBytes);
}
//close and cleanup
CFReadStreamClose(readStream);
CFReadStreamUnscheduleFromRunLoop(readStream, CFRunLoopGetCurrent(), kCFRunLoopCommonModes);
CFRelease(readStream);
//just keep the response body and release requests
CFDataRef responseBodyData = CFHTTPMessageCopyBody(response);
if (response)
{
CFRelease(response);
}
//get the response as a string
if (responseBodyData)
{
CFShow(CFSTR("\nResponse Data:"));
LogData(responseBodyData);
CFRelease(responseBodyData);
}
}
I understood how it works, and started implementing it ..... only to get this error:
'CFReadStreamCreateForHTTPRequest' is deprecated: first deprecated in
macOS 10.11 - Use NSURLSession API for http requests
There are absolutely zero examples of how to use NSURLSession from C++, or of how to bypass that idiotic "is deprecated" error.
Any help on how I am supposed to code this in C++ now?
PS. I don't want to use any third-party libraries. This is a simple task that was available with simple API calls (as I showed above).
PS2. Sorry, I am not an Apple developer, and I'm not used to features being deprecated on a whim.

There are three options:
Ignore the warning.
Use the Objective-C runtime.
Use libcurl.
The first is the easiest, and the second is the hardest given your background. The third is easy and the most capable: if you later extend your software with new features, CFNetwork may lack the functionality you need.

How to get download file size before download using C/C++ in Linux environment [duplicate]

I want to get the size of an http:/.../file before I download it. The file can be a webpage, image, or a media file. Can this be done with HTTP headers? How do I download just the file HTTP header?

Yes, assuming the HTTP server you're talking to supports/allows this:
public long GetFileSize(string url)
{
long result = -1;
System.Net.WebRequest req = System.Net.WebRequest.Create(url);
req.Method = "HEAD";
using (System.Net.WebResponse resp = req.GetResponse())
{
if (long.TryParse(resp.Headers.Get("Content-Length"), out long ContentLength))
{
result = ContentLength;
}
}
return result;
}
If using the HEAD method is not allowed, or the Content-Length header is not present in the server reply, the only way to determine the size of the content on the server is to download it. Since this is not particularly reliable, most servers will include this information.

Can this be done with HTTP headers?
Yes, this is the way to go. If the information is provided, it's in the header as the Content-Length. Note, however, that this is not necessarily the case.
Downloading only the header can be done using a HEAD request instead of GET. Maybe the following code helps:
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://example.com/");
req.Method = "HEAD";
long len;
using(HttpWebResponse resp = (HttpWebResponse)(req.GetResponse()))
{
len = resp.ContentLength;
}
Notice the property for the content length on the HttpWebResponse object – no need to parse the Content-Length header manually.

Note that not every server accepts HTTP HEAD requests. One alternative approach to get the file size is to make an HTTP GET call to the server requesting only a portion of the file to keep the response small and retrieve the file size from the metadata that is returned as part of the response content header.
The standard System.Net.Http.HttpClient can be used to accomplish this. The partial content is requested by setting a byte range on the request message header as:
request.Headers.Range = new RangeHeaderValue(startByte, endByte)
The server responds with a message containing the requested range as well as the entire file size. This information is returned in the response content headers (response.Content.Headers) under the key "Content-Range".
Here's an example of the content range in the response message content header:
{
"Key": "Content-Range",
"Value": [
"bytes 0-15/2328372"
]
}
In this example the header value implies the response contains bytes 0 to 15 (i.e., 16 bytes total) and the file is 2,328,372 bytes in its entirety.
Here's a sample implementation of this method:
public static class HttpClientExtensions
{
public static async Task<long> GetContentSizeAsync(this System.Net.Http.HttpClient client, string url)
{
using (var request = new System.Net.Http.HttpRequestMessage(System.Net.Http.HttpMethod.Get, url))
{
// In order to keep the response as small as possible, set the requested byte range to [0,0] (i.e., only the first byte)
request.Headers.Range = new System.Net.Http.Headers.RangeHeaderValue(from: 0, to: 0);
using (var response = await client.SendAsync(request))
{
response.EnsureSuccessStatusCode();
if (response.StatusCode != System.Net.HttpStatusCode.PartialContent)
throw new System.Net.WebException($"expected partial content response ({System.Net.HttpStatusCode.PartialContent}), instead received: {response.StatusCode}");
var contentRange = response.Content.Headers.GetValues(@"Content-Range").Single();
var lengthString = System.Text.RegularExpressions.Regex.Match(contentRange, @"(?<=^bytes\s[0-9]+\-[0-9]+/)[0-9]+$").Value;
return long.Parse(lengthString);
}
}
}
}
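The parsing step of the Content-Range approach is independent of the HTTP client in use. As a rough sketch, here is how the total size could be extracted from a header value such as "bytes 0-15/2328372" in Java. The class and method names (ContentRange, totalLength) are hypothetical, and the sketch assumes the header value has already been read from the response.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extract the total size from a Content-Range header value.
final class ContentRange {

    // Expected shape: "bytes <first>-<last>/<total>".
    // A server may send "*" as the total when it does not know the full length.
    private static final Pattern BYTE_RANGE =
            Pattern.compile("^bytes\\s+(\\d+)-(\\d+)/(\\d+|\\*)$");

    /** Returns the total length, or empty if the header is missing, malformed, or the total is unknown. */
    static OptionalLong totalLength(String headerValue) {
        if (headerValue == null) {
            return OptionalLong.empty();
        }
        Matcher m = BYTE_RANGE.matcher(headerValue.trim());
        if (!m.matches() || "*".equals(m.group(3))) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(Long.parseLong(m.group(3)));
    }
}
```

Returning an empty OptionalLong rather than throwing keeps the caller free to fall back to another strategy (a HEAD request, or downloading the file) when the server does not report a usable total.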

WebClient webClient = new WebClient();
webClient.OpenRead("http://stackoverflow.com/robots.txt");
long totalSizeBytes= Convert.ToInt64(webClient.ResponseHeaders["Content-Length"]);
Console.WriteLine((totalSizeBytes));

HttpClient client = new HttpClient(
new HttpClientHandler() {
Proxy = null, UseProxy = false
} // skips proxy detection, which removes the delay before the server responds if you do not use a proxy
);
public async Task<long?> GetContentSizeAsync(string url) {
// ResponseHeadersRead completes as soon as the headers arrive, so the body is not downloaded first
using (HttpResponseMessage response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
return response.Content.Headers.ContentLength;
}

Akka actor for http request Java

Hello, I am looking for a simple example in Akka (Java) of creating an HTTP client in an actor. So far I am able to create a request and get the response HTTP entity. I need to migrate it to an actor so that I can call multiple actors in parallel with a timeout.
final ActorSystem system = ActorSystem.create();
final Materializer materializer = ActorMaterializer.create(system);
final List<HttpRequest> httpRequests = Arrays.asList(
HttpRequest.create(url) // Content-Encoding: gzip in response
);
Unmarshaller<ByteString, BitTweet> unmarshal = Jackson.byteStringUnmarshaller(BitTweet.class);
JsonEntityStreamingSupport support = EntityStreamingSupport.json();
final Http http = Http.get(system);
final Function<HttpResponse, HttpResponse> decodeResponse = response -> {
// Pick the right coder
final Coder coder;
if (HttpEncodings.gzip().equals(response.encoding())) {
coder = Coder.Gzip;
} else if (HttpEncodings.deflate().equals(response.encoding())) {
coder = Coder.Deflate;
} else {
coder = Coder.NoCoding;
}
// Decode the entity
return coder.decodeMessage(response);
};
List<CompletableFuture<HttpResponse>> futureResponses = httpRequests.stream()
.map(req -> http.singleRequest(req, materializer)
.thenApply(decodeResponse))
.map(CompletionStage::toCompletableFuture)
.collect(Collectors.toList());
for (CompletableFuture<HttpResponse> futureResponse : futureResponses) {
final HttpResponse httpResponse = futureResponse.get();
system.log().info("response is: " + httpResponse.entity()
.toStrict(1, materializer)
.toCompletableFuture()
.get());
HttpEntity.Strict entity_ = HttpEntities.create(ContentTypes.APPLICATION_JSON, httpResponse.entity().toString());
Source<BitTweet, Object> BitTweet =
entity_.getDataBytes()
.via(support.framingDecoder()) // apply JSON framing
.mapAsync(1, // unmarshal each element
bs -> unmarshal.unmarshal(bs, materializer)
);

How can I issue a POST request that contains a basic authentication header, and a JSON body?

I am trying to use the CPPRESTSDK (a.k.a. Casablanca) to POST data to a RESTful server. To do this, I create a request, and assign a header:
// create request, and add header information
web::http::http_request req(methods::POST);
req.headers().add(header_names::authorization, authStr); // authStr is base64 representation of username & password
req.headers().add(header_names::content_type, http::details::mime_types::application_json);
Next, I make a web::json::value object that contains all the key-value pairs:
web::json::value obj = json::value::object();
obj[U("Key1")] = web::json::value::string(U("Val1"));
obj[U("Key2")] = web::json::value::string(U("Val2"));
obj[U("Key3")] = web::json::value::string(U("Val3"));
I then store this object in the request's body by calling:
req.set_body(obj);
Finally, I send the request to the server using an http_client:
// create http client
web::http::client::http_client client(addr); // addr is wstring
return client.request(req).then([](http_response response) {
return response;
});
The problem is that this doesn't do anything. If I place a breakpoint on this line, I get information about "400 Bad Request." I would assume that the request's body is somehow malformed, but it could also be that I am missing some information in the header. This error does not happen when I issue a GET request on the same URL, so it is definitely a problem with POSTs specifically. What do you think?
Here is a working example:
// create a new channel
pplx::task<web::http::http_response> postChannel(http_client client, std::wstring authStr, std::wstring cDesc, std::wstring cName, std::string cDiagCap, int cNormFloat, int cWriteDuty,
int cWriteMeth, std::string cItemPersist, std::wstring cItemPersistDat) {
// create request
http_request req(methods::POST);
req.headers().add(header_names::authorization, authStr);
std::wstring url = L"/config/v1/project/channels";
req.set_request_uri(url);
json::value obj = json::value::object();
obj[U("common.ALLTYPES_DESCRIPTION")] = json::value::string(cDesc);
obj[U("common.ALLTYPES_NAME")] = json::value::string(cName);
obj[U("servermain.CHANNEL_DIAGNOSTICS_CAPTURE")] = json::value(cDiagCap == "true" || cDiagCap == "t");
obj[U("servermain.CHANNEL_NON_NORMALIZED_FLOATING_POINT_HANDLING")] = json::value(cNormFloat);
obj[U("servermain.CHANNEL_WRITE_OPTIMIZATIONS_DUTY_CYCLE")] = json::value(cWriteDuty);
obj[U("servermain.CHANNEL_WRITE_OPTIMIZATIONS_METHOD")] = json::value(cWriteMeth);
obj[U("servermain.MULTIPLE_TYPES_DEVICE_DRIVER")] = json::value::string(U("Simulator")); // right now, Simulator channels are the only option
obj[U("simulator.CHANNEL_ITEM_PERSISTENCE")] = json::value(cItemPersist == "true" || cItemPersist == "t");
obj[U("simulator.CHANNEL_ITEM_PERSISTENCE_DATA_FILE")] = json::value::string(cItemPersistDat);
req.set_body(obj);
return client.request(req).then([](http_response response) {
return response;
});
}
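The question describes authStr only as the base64 representation of the username and password. Under HTTP Basic authentication (RFC 7617), the Authorization header value is the word "Basic", a space, and the base64 encoding of "username:password". As an illustration of that format only, here is a small Java sketch; the class and method names are hypothetical, and the credentials used in the usage note are the example pair from the RFC, not anything from the question.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: build the value of an HTTP Basic Authorization header.
final class BasicAuth {

    /** Returns "Basic " followed by base64("username:password"), encoded as UTF-8. */
    static String headerValue(String username, String password) {
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }
}
```

For the RFC's example credentials, BasicAuth.headerValue("Aladdin", "open sesame") yields "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==". Whether the server in the question expects exactly this scheme is an assumption; it is worth checking the value against the server's documentation.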

Whats the Efficient way to call http request and read inputstream in spark MapTask

Please see the below code sample
JavaRDD<String> mapRDD = filteredRecords
.map(new Function<String, String>() {
@Override
public String call(String url) throws Exception {
BufferedReader in = null;
URL formatURL = new URL((url.replaceAll("\"", ""))
.trim());
try {
HttpURLConnection con = (HttpURLConnection) formatURL
.openConnection();
in = new BufferedReader(new InputStreamReader(con
.getInputStream()));
return in.readLine();
} finally {
if (in != null) {
in.close();
}
}
}
});
Here each url is an HTTP GET request, for example:
http://ip:port/cyb/test?event=movie&id=604568837&name=SID&timestamp_secs=1460494800&timestamp_millis=1461729600000&back_up_id=676700166
This piece of code is very slow. The IP and port vary and the load is distributed, so the IP can take 20 different values (each with a port); I don't see a bottleneck there.
When I comment out
in = new BufferedReader(new InputStreamReader(con
.getInputStream()));
return in.readLine();
the code is very fast.
NOTE: The input data to process is 10 GB. I am using Spark to read it from S3.
Is there anything wrong with how I am using BufferedReader or InputStreamReader? Is there an alternative?
I can't use foreach in Spark, as I have to get the response back from the server and need to save the JavaRDD as a text file on HDFS.
If we use mapPartitions, the code looks something like this:
JavaRDD<String> mapRDD = filteredRecords.mapPartitions(new FlatMapFunction<Iterator<String>, String>() {
@Override
public Iterable<String> call(Iterator<String> tuple) throws Exception {
final List<String> rddList = new ArrayList<String>();
Iterable<String> iterable = new Iterable<String>() {
@Override
public Iterator<String> iterator() {
return rddList.iterator();
}
};
while(tuple.hasNext()) {
URL formatURL = new URL((tuple.next().replaceAll("\"", ""))
.trim());
HttpURLConnection con = (HttpURLConnection) formatURL
.openConnection();
try(BufferedReader br = new BufferedReader(new InputStreamReader(con
.getInputStream()))) {
rddList.add(br.readLine());
} catch (IOException ex) {
return rddList;
}
}
return iterable;
}
});
Here, too, we are doing the same thing for each record, aren't we?

Currently you are using the map function, which creates a URL request for each row in the partition.
You can use mapPartitions instead, which can make the code run faster because connection setup can be done once per partition rather than once per row.
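The once-per-partition idea can be sketched in plain Java without any Spark or HTTP dependency. In the sketch below, the expensive client is created a single time, and each URL is then fetched lazily as the returned iterator is consumed. All names here (PartitionFetcher, fetchPartition, clientFactory, fetch) are hypothetical. In a Spark job the method body would correspond roughly to what is passed to mapPartitions, and fetch would wrap whatever HTTP client is actually used; how much this helps depends on whether that client can really reuse connections across the URLs in a partition.

```java
import java.util.Iterator;
import java.util.function.BiFunction;
import java.util.function.Supplier;

// Hypothetical helper: process one partition with a single shared client.
final class PartitionFetcher {

    /**
     * Creates the client once, then maps every URL in the partition through it.
     * The result is lazy: a URL is fetched only when next() is called, so the
     * responses of a whole partition are never held in memory at once.
     */
    static <C> Iterator<String> fetchPartition(Iterator<String> urls,
                                               Supplier<C> clientFactory,
                                               BiFunction<C, String, String> fetch) {
        final C client = clientFactory.get(); // expensive setup happens once per partition
        return new Iterator<String>() {
            @Override
            public boolean hasNext() {
                return urls.hasNext();
            }

            @Override
            public String next() {
                // Same clean-up as in the question: strip quotes and surrounding whitespace.
                String url = urls.next().replace("\"", "").trim();
                return fetch.apply(client, url);
            }
        };
    }
}
```

Passing the HTTP call in as a function keeps the sketch testable with a stub and avoids tying it to a particular client library.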

A big cost here is setting up TCP/HTTPS connections. This is exacerbated by the fact that, even if you only read the first (short) line of a large file, modern HTTP clients try to read() to the end of the file in an attempt to re-use HTTP/1.1 connections better, thereby avoiding aborting the connection. This is a good strategy for small files, but not for those measured in MB.
There is a solution: set the content-length on the read, so that only a smaller block is read in, reducing the cost of the close(); the connection recycling then reduces HTTPS setup costs. This is what the latest Hadoop/Spark S3A client does if you set fadvise=random on the connection: it requests blocks rather than the entire multi-GB file. Be aware, though, that this design performs really badly if you are going byte-by-byte through a file...
