Based on a Wireshark capture, the cohttp client appears to close its connection automatically once the response to a GET request has been received.
Is there a way to keep this connection alive (to make it persistent)?
If not, is there another HTTP library that can create persistent connections?
Looking at the code on GitHub, there doesn't seem to be such an option.
let call ?(ctx=default_ctx) ?headers ?(body=`Empty) ?chunked meth uri =
  ...
  Net.connect_uri ~ctx uri >>= fun (conn, ic, oc) ->
  let closefn () = Net.close ic oc in
  ...
  read_response ~closefn ic oc meth
Where read_response is:
let read_response ~closefn ic oc meth =
  ...
  match has_body with
  | `Yes | `Unknown ->
    let reader = Response.make_body_reader res ic in
    let stream = Body.create_stream Response.read_body_chunk reader in
    let closefn = closefn in
    Lwt_stream.on_terminate stream closefn;
    let gcfn st = closefn () in
    Gc.finalise gcfn stream;
    let body = Body.of_stream stream in
    return (res, body)
If I am reading this correctly, the connection is closed as soon as the body stream terminates or the GC finalises the stream.
I am new to Rust and I am trying to port Go code that I wrote previously. The Go code downloaded files from S3 and, without writing them to disk, gunzipped and parsed them directly.
Currently the only solution I have found is to save the gzipped files to disk, then gunzip and parse them.
The ideal pipeline would gunzip and parse them directly.
How can I accomplish this?
const ENV_CRED_KEY_ID: &str = "KEY_ID";
const ENV_CRED_KEY_SECRET: &str = "KEY_SECRET";
const BUCKET_NAME: &str = "bucketname";
const REGION: &str = "us-east-1";

use anyhow::{anyhow, bail, Context, Result}; // (xp) (thiserror in prod)
use aws_sdk_s3::{config, ByteStream, Client, Credentials, Region};
use std::env;
use std::fs::{create_dir_all, File};
use std::io::{BufWriter, Write};
use std::path::Path;
use tokio_stream::StreamExt;

#[tokio::main]
async fn main() -> Result<()> {
    let client = get_aws_client(REGION)?;
    let keys = list_keys(&client, BUCKET_NAME, "CELLDATA/year=2022/month=06/day=06/").await?;
    println!("List:\n{}", keys.join("\n"));
    let dir = Path::new("input/");
    let key: &str = &keys[0];
    download_file_bytes(&client, BUCKET_NAME, key, dir).await?;
    println!("Downloaded {key} in directory {}", dir.display());
    Ok(())
}

async fn download_file_bytes(client: &Client, bucket_name: &str, key: &str, dir: &Path) -> Result<()> {
    // VALIDATE
    if !dir.is_dir() {
        bail!("Path {} is not a directory", dir.display());
    }

    // create file path and parent dir(s)
    let mut file_path = dir.join(key);
    let parent_dir = file_path
        .parent()
        .ok_or_else(|| anyhow!("Invalid parent dir for {:?}", file_path))?;
    if !parent_dir.exists() {
        create_dir_all(parent_dir)?;
    }
    file_path.set_extension("json");

    // BUILD - aws request
    let req = client.get_object().bucket(bucket_name).key(key);

    // EXECUTE
    let res = req.send().await?;

    // STREAM result to file
    let mut data: ByteStream = res.body;
    let file = File::create(&file_path)?;
    let Some(bytes) = data.try_next().await?;
    let mut gzD = GzDecoder::new(&bytes);
    let mut buf_writer = BufWriter::new(file);
    while let Some(bytes) = data.try_next().await? {
        // write_all, so that a short write cannot silently drop bytes
        buf_writer.write_all(&bytes)?;
    }
    buf_writer.flush()?;
    Ok(())
}

fn get_aws_client(region: &str) -> Result<Client> {
    // get the id/secret from env
    let key_id = env::var(ENV_CRED_KEY_ID).context("Missing S3_KEY_ID")?;
    let key_secret = env::var(ENV_CRED_KEY_SECRET).context("Missing S3_KEY_SECRET")?;

    // build the aws cred
    let cred = Credentials::new(key_id, key_secret, None, None, "loaded-from-custom-env");

    // build the aws client
    let region = Region::new(region.to_string());
    let conf_builder = config::Builder::new().region(region).credentials_provider(cred);
    let conf = conf_builder.build();

    // build aws client
    let client = Client::from_conf(conf);
    Ok(client)
}
Your snippet doesn't tell where GzDecoder comes from, but I'll assume it's flate2::read::GzDecoder.
flate2::read::GzDecoder is already built in a way that it can wrap anything that implements std::io::Read:
GzDecoder::new expects an argument that implements Read => deflated data in
GzDecoder itself implements Read => inflated data out
Therefore, you can use it just like a BufReader: wrap your reader and use the wrapped value in its place:
use flate2::read::GzDecoder;
use std::fs::File;
use std::io::Cursor;

fn main() {
    // Placeholder bytes: real input must be a valid gzip stream,
    // otherwise reading from the decoder returns an error.
    let data = [0, 1, 2, 3];
    // Something that implements `std::io::Read`
    let c = Cursor::new(data);
    // A dummy output
    let mut out_file = File::create("/tmp/out").unwrap();
    // Using the raw data would look like this (with `c` declared `mut`):
    // std::io::copy(&mut c, &mut out_file).unwrap();
    // To inflate on the fly, "pipe" the data through the decoder, i.e. wrap the reader
    let mut stream = GzDecoder::new(c);
    // Consume the `Read`er somehow
    std::io::copy(&mut stream, &mut out_file).unwrap();
}
You don't mention what "and parse them" entails, but the same concept applies: If your parser can read from an impl Read (e.g. it can read from a std::fs::File), then it can also read directly from a GzDecoder.
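To make that last point concrete, here is a minimal sketch using only the standard library. `count_records` is a hypothetical stand-in for "a parser" (it is not from the question); because it accepts any `Read`, a `GzDecoder` wrapping another reader could presumably be passed to it in exactly the same way as the in-memory readers used here.

```rust
use std::io::{BufRead, BufReader, Cursor, Read, Result};

/// Hypothetical stand-in for a parser: counts the non-empty lines of any reader.
fn count_records<R: Read>(source: R) -> Result<usize> {
    let mut count = 0;
    for line in BufReader::new(source).lines() {
        if !line?.trim().is_empty() {
            count += 1;
        }
    }
    Ok(count)
}

fn main() -> Result<()> {
    // An in-memory reader; a `File` could be passed the same way,
    // because the function only relies on `Read`.
    let plain = Cursor::new("a\n\nb\nc\n");
    println!("records: {}", count_records(plain)?);

    // Readers compose: `take` wraps a reader and is itself a reader,
    // just as a decoder wraps compressed input and yields inflated output.
    let limited = Cursor::new("a\nb\nc\n").take(4);
    println!("records in the first 4 bytes: {}", count_records(limited)?);
    Ok(())
}
```

The parser never learns what kind of reader it was given, which is why inserting a decompression layer requires no change on the parsing side.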
I'm looking to build multiple concurrent servers on different ports with Rust and Tokio:
let mut core = Core::new().unwrap();
let handle = core.handle();
// I want to bind to multiple port here if it's possible with simple addresses
let addr = "127.0.0.1:80".parse().unwrap();
let addr2 = "127.0.0.1:443".parse().unwrap();
// Or here if there is a special function on the TcpListener
let sock = TcpListener::bind(&addr, &handle).unwrap();
// Or here if there is a special function on the sock
let server = sock.incoming().for_each(|(client_stream, remote_addr)| {
    // And then retrieve the current port in the callback
    println!("Receive connection on {}!", mysterious_function_to_retrieve_the_port);
    Ok(())
});
core.run(server).unwrap();
Is there an option with Tokio to listen on multiple ports, or do I need to create a separate thread for each port and run Core::new() in each?
Thanks to rust-scoped-pool, I have:
let pool = Pool::new(2);
let mut listening_on = ["127.0.0.1:80", "127.0.0.1:443"];
pool.scoped(|scope| {
    for address in &mut listening_on {
        scope.execute(move || {
            let mut core = Core::new().unwrap();
            let handle = core.handle();
            let addr = address.parse().unwrap();
            let sock = TcpListener::bind(&addr, &handle).unwrap();
            let server = sock.incoming().for_each(|(client_stream, remote_addr)| {
                println!("Receive connection on {}!", address);
                Ok(())
            });
            core.run(server).unwrap();
        });
    }
});
rust-scoped-pool is the only solution I have found to execute multiple threads and wait forever after spawning them. I think it's working but I was wondering if a simpler solution existed.
You can run multiple servers from one thread. core.run(server).unwrap(); is just a convenience method and not the only/main way to do things.
Instead of running the single ForEach to completion, spawn each individually and then just keep the thread alive:
let mut core = Core::new().unwrap();
let handle = core.handle();

let addr1 = "127.0.0.1:80".parse().unwrap();
let addr2 = "127.0.0.1:443".parse().unwrap();

// One listener per port, all registered with the same reactor
let sock1 = TcpListener::bind(&addr1, &handle).unwrap();
let sock2 = TcpListener::bind(&addr2, &handle).unwrap();

// Each closure captures the address it serves, so the port is known in the callback.
// Handle::spawn needs a future with Error = (), hence the map_err.
let server1 = sock1.incoming().for_each(move |(_client_stream, _remote_addr)| {
    println!("Receive connection on {}!", addr1);
    Ok(())
}).map_err(|err| eprintln!("server on {} failed: {}", addr1, err));
let server2 = sock2.incoming().for_each(move |(_client_stream, _remote_addr)| {
    println!("Receive connection on {}!", addr2);
    Ok(())
}).map_err(|err| eprintln!("server on {} failed: {}", addr2, err));

handle.spawn(server1);
handle.spawn(server2);

// Keep the thread alive and drive both servers
loop {
    core.turn(None);
}
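The question also asks how to retrieve the port inside the callback. One option is to ask the listener itself via `local_addr()`. The sketch below is a minimal illustration using only `std::net` with one thread per listener, which is an assumption made to keep it self-contained; it is not the tokio-core code discussed in this thread. `bind_local` and `serve_once` are hypothetical helper names, and port 0 (let the OS pick a free port) is used so the example runs without privileges.

```rust
use std::io;
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Binds a listener on localhost and returns it with the port it actually got.
/// Passing 0 asks the OS for any free port.
fn bind_local(port: u16) -> io::Result<(TcpListener, u16)> {
    let listener = TcpListener::bind(("127.0.0.1", port))?;
    let bound_port = listener.local_addr()?.port();
    Ok((listener, bound_port))
}

/// Accepts a single connection on its own thread and reports the port that served it.
fn serve_once(listener: TcpListener, port: u16) -> thread::JoinHandle<io::Result<u16>> {
    thread::spawn(move || {
        let (_stream, remote) = listener.accept()?;
        println!("Receive connection on {} from {}!", port, remote);
        Ok(port)
    })
}

fn main() -> io::Result<()> {
    let (first, first_port) = bind_local(0)?;
    let (second, second_port) = bind_local(0)?;
    let workers = vec![serve_once(first, first_port), serve_once(second, second_port)];

    // Connect once to each listener so that both threads finish.
    TcpStream::connect(("127.0.0.1", first_port))?;
    TcpStream::connect(("127.0.0.1", second_port))?;

    for worker in workers {
        let port = worker.join().expect("listener thread panicked")?;
        println!("listener on port {} handled a connection", port);
    }
    Ok(())
}
```

In an event-loop setup the same idea applies: capture the bound address next to each listener before handing the listener to the loop.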
I'd just like to follow up that there seems to be a slightly less manual way to do things than 46bit's answer (at least as of 2019).
// Assumes tokio 0.1, where TcpListener::bind takes only the address
let addr1 = "127.0.0.1:80".parse().unwrap();
let addr2 = "127.0.0.1:443".parse().unwrap();
let sock1 = TcpListener::bind(&addr1).unwrap();
let sock2 = TcpListener::bind(&addr2).unwrap();
// Spawned futures must have Error = (), so the I/O error is handled here
let server1 = sock1.incoming().for_each(|_| Ok(())).map_err(|err| eprintln!("server1 error: {}", err));
let server2 = sock2.incoming().for_each(|_| Ok(())).map_err(|err| eprintln!("server2 error: {}", err));
let mut runtime = tokio::runtime::Runtime::new().unwrap();
runtime.spawn(server1);
runtime.spawn(server2);
runtime.shutdown_on_idle().wait().unwrap();
I'm new to Erlang and I'm trying to implement a register/login server. I have a function to register new users, and it works well:
reg(Sock) ->
    receive
        {tcp, _, Usr} ->
            io:format("User: ~p ~n",[Usr])
    end,
    gen_tcp:send(Sock, [Usr]),
    receive
        {tcp, _, Pass} ->
            io:format(" a pass ~p~n",[Pass])
    end,
    gen_tcp:send(Sock, [Pass]),
    receive
        {tcp, _, Msg} ->
            case Msg of
                <<"condutor\n">> ->
                    condutor ! {register, {Usr, Pass}};
                <<"passageiro\n">> ->
                    io:format("passageiro~n")
            end
    end.
But now I want another function that checks whether a user wants to log in or register and dispatches to the proper function. When I add this function, however, the user's input is no longer read:
gestor(Sock) ->
    receive
        {tcp, _, Msg} ->
            case Msg of
                <<"login\n">> ->
                    login(Sock);
                <<"registo\n">> ->
                    gen_tcp:send(Sock, "OK"),
                    reg(Sock)
            end
    end.
It receives the user's choice and dispatches to the right function, but then it doesn't read anything. I can't understand this: if I call reg directly it works fine, but if I call it from another function, I can't read anything from the socket.
If anyone could help me, I would appreciate it very much.
EDITED:
Thanks a lot for your replies. I'm trying to implement a Java client and an Erlang server communicating through a TCP socket. The user is meant to type "registo", then a username, a password, and "condutor" or "passageiro".
The socket works fine, because if I call reg(Sock) instead of gestor(Sock) everything works as expected. The problem arises when I call reg(Sock) inside gestor(Sock): in that case I can't receive the user input in the reg function.
-module(server2).
-export([server/1]).

server(Port) ->
    {ok, LSock} = gen_tcp:listen(Port, [binary, {packet, line}, {reuseaddr, true}]),
    Condutor = spawn(fun()-> regUtente([]) end),
    register(condutor, Condutor),
    acceptor(LSock).

acceptor(LSock) ->
    {ok, Sock} = gen_tcp:accept(LSock),
    spawn(fun() -> acceptor(LSock) end),
    io:format("Ligação estabelecida~n"),
    % reg(Sock). --- Calling reg directly, without passing through gestor, WORKS FINE.
    gestor(Sock).
Java Client:
import java.io.*;
import java.net.*;

public class EchoClient {
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println(
                "Usage: java EchoClient <host name> <port number>");
            System.exit(1);
        }
        String hostName = args[0];
        int portNumber = Integer.parseInt(args[1]);
        try (
            Socket echoSocket = new Socket(hostName, portNumber);
            PrintWriter out =
                new PrintWriter(echoSocket.getOutputStream(), true);
            BufferedReader in =
                new BufferedReader(
                    new InputStreamReader(echoSocket.getInputStream()));
            BufferedReader stdIn =
                new BufferedReader(
                    new InputStreamReader(System.in))
        ) {
            String userInput;
            while ((userInput = stdIn.readLine()) != null) {
                out.println(userInput);
                System.out.println("echo: " + in.readLine());
            }
        } catch (UnknownHostException e) {
            System.err.println("Don't know about host " + hostName);
            System.exit(1);
        } catch (IOException e) {
            System.err.println("Couldn't get I/O for the connection to " +
                hostName);
            System.exit(1);
        }
    }
}
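Send "OK\n" as the response to "registo". Your Java code reads the reply with readLine, which blocks until it sees a line terminator. Since "OK" arrives without one, the client never gets to send the next line, so reg has nothing to read. Calling reg directly works because it echoes each received line back with its newline intact.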
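The effect of the terminator can be reproduced with a small loopback exchange. This is a minimal sketch in Rust using only the standard library, not the Erlang or Java code from the question: `exchange` is a hypothetical helper and "alice" is a made-up user name. The server thread acknowledges and then waits for the next line, as gestor and reg do; the client waits for a full line before sending anything, as the Java client does.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Runs one acknowledge-then-read exchange over loopback.
/// Returns (what the client read, what the server read).
fn exchange(reply: &'static str) -> io::Result<(String, String)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    // Server: send the acknowledgement, then wait for the next line (the user name).
    let server = thread::spawn(move || -> io::Result<String> {
        let (stream, _) = listener.accept()?;
        let mut writer = stream.try_clone()?;
        writer.write_all(reply.as_bytes())?;
        let mut line = String::new();
        BufReader::new(stream).read_line(&mut line)?;
        Ok(line)
    });

    // Client: like Java's readLine(), read_line() returns only once it has seen
    // a newline (or the end of the stream), so the reply must be terminated.
    let stream = TcpStream::connect(addr)?;
    let mut writer = stream.try_clone()?;
    let mut ack = String::new();
    BufReader::new(stream).read_line(&mut ack)?;
    writer.write_all(b"alice\n")?;

    let received = server.join().expect("server thread panicked")?;
    Ok((ack, received))
}

fn main() -> io::Result<()> {
    // With "OK" instead of "OK\n", both sides would wait on each other forever.
    let (ack, user) = exchange("OK\n")?;
    println!("client got {:?}, server got {:?}", ack, user);
    Ok(())
}
```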
I want to build a client-side js_of_ocaml application with a server in OCaml, under the constraints described below, and I would like to know whether the approach below is right or whether there is a more efficient one. The server can sometimes send large quantities of data (> 30MB).
To make the communication between client and server safer and more efficient, I am sharing types in a .mli file like this:
type client_to_server =
  | Say_Hello
  | Do_something_with of int

type server_to_client =
  | Ack
  | Print of string * int
Then, this type is marshalled into a string and sent on the network. I am aware that on the client side, some types are missing (Int64.t).
Also, in an XMLHttpRequest sent by the client, we want to receive more than one marshalled object from the server, sometimes in streaming mode (i.e. process each marshalled object as soon as it has been received during the loading state of the request, and not only in the done state).
These constraints force us to use the responseText field of the XMLHttpRequest with the content type application/octet-stream.
Moreover, when we get the response back from responseText, an encoding conversion is made because JavaScript strings are UTF-16. Since the marshalled object is binary data, we do what is necessary to retrieve the raw bytes (by overriding the charset with x-user-defined and by applying a mask to each character of the responseText string).
The server (HTTP server in OCaml) is doing something simple like this:
let process_request req =
  let res = process_response req in
  let s = Marshal.to_string res [] in
  send s
However, on the client side, the existing JavaScript primitive of js_of_ocaml for caml_marshal_data_size needs an MlString. In streaming mode, we don't want to convert the JavaScript string into an MlString (which may iterate over the full string); we prefer to do the size verification and unmarshalling (and to apply the mask for the encoding problem) only on the bytes read. Therefore, I have written my own marshal primitives in JavaScript.
The client code for processing requests and responses is:
external marshal_total_size : Js.js_string Js.t -> int -> int = "my_marshal_total_size"
external marshal_from_string : Js.js_string Js.t -> int -> 'a = "my_marshal_from_string"

let apply (f:server_to_client -> unit) (str:Js.js_string Js.t) (ofs:int) : int =
  let len = str##length in
  let rec aux pos =
    let tsize =
      try Some (pos + My_primitives.marshal_total_size str pos)
      with Failure _ -> None
    in
    match tsize with
    | Some tsize when tsize <= len ->
      let data = My_primitives.marshal_from_string str pos in
      f data;
      aux tsize
    | _ -> pos
  in
  aux ofs

let reqcallback f req ofs =
  match req##readyState, req##status with
  | XmlHttpRequest.DONE, 200 ->
    ofs := apply f req##responseText !ofs
  | XmlHttpRequest.LOADING, 200 ->
    ignore (apply f req##responseText !ofs)
  | _, 200 -> ()
  | _, i -> process_error i

let send (f:server_to_client -> unit) (order:client_to_server) =
  let order = Marshal.to_string order [] in
  let msg = Js.string (my_encode order) in (* Do some stuff *)
  let req = XmlHttpRequest.create () in
  req##_open(Js.string "POST", Js.string "/kernel", Js._true);
  req##setRequestHeader(Js.string "Content-Type",
                        Js.string "application/octet-stream");
  req##onreadystatechange <- Js.wrap_callback (reqcallback f req (ref 0));
  req##overrideMimeType(Js.string "application/octet-stream; charset=x-user-defined");
  req##send(Js.some msg)
And the primitives are:
//Provides: my_marshal_header_size
var my_marshal_header_size = 20;

//Provides: my_int_of_char
function my_int_of_char(s, i) {
  return (s.charCodeAt(i) & 0xFF); // utf-16 char to 8 binary bit
}

//Provides: my_marshal_input_value_from_string
//Requires: my_int_of_char, caml_int64_float_of_bits, MlStringFromArray
//Requires: caml_int64_of_bytes, caml_marshal_constants, caml_failwith
var my_marshal_input_value_from_string = function () {
  /* Quite the same thing but with a custom Reader which
     will call my_int_of_char for each byte read */
}

//Provides: my_marshal_data_size
//Requires: caml_failwith, my_int_of_char
function my_marshal_data_size(s, ofs) {
  function get32(s,i) {
    return (my_int_of_char(s, i) << 24) | (my_int_of_char(s, i + 1) << 16) |
           (my_int_of_char(s, i + 2) << 8) | (my_int_of_char(s, i + 3));
  }
  if (get32(s, ofs) != (0x8495A6BE|0))
    caml_failwith("MyMarshal.data_size");
  return (get32(s, ofs + 4));
}

//Provides: my_marshal_total_size
//Requires: my_marshal_data_size, my_marshal_header_size, caml_failwith
function my_marshal_total_size(s, ofs) {
  if ( ofs < 0 || ofs > s.length - my_marshal_header_size )
    caml_failwith("Invalid argument");
  else return my_marshal_header_size + my_marshal_data_size(s, ofs);
}
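For readers who want the framing logic in isolation, here is a minimal sketch of what `apply` and `my_marshal_total_size` do together, written in Rust over a plain byte buffer. It is an illustration only: the 20-byte header, the magic number and the length field at offset 4 are taken from the primitives above, while `frame`, `total_size` and `drain_complete` are hypothetical names, and nothing is actually unmarshalled.

```rust
/// Size in bytes of the marshal header, as in `my_marshal_header_size`.
const HEADER_SIZE: usize = 20;
/// Magic number checked by `my_marshal_data_size`.
const MAGIC: u32 = 0x8495_A6BE;

/// Reads a big-endian u32 at `pos`, if four bytes are available.
fn get32(buf: &[u8], pos: usize) -> Option<u32> {
    let bytes = buf.get(pos..pos.checked_add(4)?)?;
    Some(u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

/// Total size (header + data) of the value starting at `pos`, or None if the
/// header has not fully arrived yet or the magic number does not match.
fn total_size(buf: &[u8], pos: usize) -> Option<usize> {
    if buf.len().checked_sub(HEADER_SIZE)? < pos || get32(buf, pos)? != MAGIC {
        return None;
    }
    Some(HEADER_SIZE + get32(buf, pos + 4)? as usize)
}

/// Hands every complete value starting at `ofs` to `on_value` and returns the
/// offset of the first byte that has not been consumed.
fn drain_complete<F: FnMut(&[u8])>(buf: &[u8], ofs: usize, mut on_value: F) -> usize {
    let mut pos = ofs;
    while let Some(size) = total_size(buf, pos) {
        let end = pos + size;
        if end > buf.len() {
            break; // value only partially received so far
        }
        on_value(&buf[pos..end]);
        pos = end;
    }
    pos
}

/// Test helper: wraps a payload in a header with the right magic and length.
fn frame(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_SIZE + payload.len());
    out.extend_from_slice(&MAGIC.to_be_bytes());
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(&[0u8; 12]); // remaining header fields, not interpreted here
    out.extend_from_slice(payload);
    out
}

fn main() {
    let mut buf = frame(b"first");
    let second = frame(b"second");
    buf.extend_from_slice(&second[..second.len() - 2]); // second value still incomplete

    let mut seen = 0;
    let ofs = drain_complete(&buf, 0, |_| seen += 1);
    println!("processed {} value(s), next offset {}", seen, ofs);

    buf.extend_from_slice(&second[second.len() - 2..]); // the rest arrives
    let ofs = drain_complete(&buf, ofs, |_| seen += 1);
    println!("processed {} value(s), next offset {}", seen, ofs);
}
```

Carrying the returned offset from one call to the next is what prevents a value from being handed to the callback twice as more data arrives.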
Is this the most efficient way to transfer large OCaml values from server to client, or what would time- and space-efficient alternatives be?
Have you tried using EventSource (https://developer.mozilla.org/en-US/docs/Web/API/EventSource)?
You could stream JSON data instead of marshalled data.
Json.unsafe_input should be faster than unmarshalling.
class type eventSource =
  object
    method onmessage :
      (eventSource Js.t, event Js.t -> unit) Js.meth_callback
      Js.writeonly_prop
  end
and event =
  object
    method data : Js.js_string Js.t Js.readonly_prop
    method event : Js.js_string Js.t Js.readonly_prop
  end

let eventSource : (Js.js_string Js.t -> eventSource Js.t) Js.constr =
  Js.Unsafe.global##_EventSource

let send (f:server_to_client -> unit) (order:client_to_server) url_of_order =
  let url = url_of_order order in
  let es = jsnew eventSource(Js.string url) in
  es##onmessage <- Js.wrap_callback (fun e ->
    let d = Json.unsafe_input (e##data) in
    f d);
  ()
On the server side, you then need to rely on deriving_json (http://ocsigen.org/js_of_ocaml/2.3/api/Deriving_Json) to serialize your data:
type server_to_client =
  | Ack
  | Print of string * int
  deriving (Json)

let process_request req =
  let res = process_response req in
  let data = Json_server_to_client.to_string res in
  send data
Note 1: Deriving_json serializes OCaml values to JSON using the internal representation of values in js_of_ocaml. Json.unsafe_input is a fast deserializer for Deriving_json that relies on the browser's native JSON support.
Note 2: Deriving_json and Json.unsafe_input take care of OCaml string encoding.
I'm trying to upload a byte array to a remote WS with multipart/form-data in play! framework using Scala. My code is:
// create byte array from file
val myFile = new File(pathName)
val in = new FileInputStream(myFile)
val myByteArray = new Array[Byte](myFile.length.toInt)
in.read(myByteArray)
in.close()

// create parts
val langPart = new StringPart("lang", "pt")
val taskPart = new StringPart("task","echo")
val audioPart = new ByteArrayPart("sbytes", "myFilename", myByteArray, "default/binary", "UTF-8")

val client: AsyncHttpClient = WS.client
val request = client.preparePost(RemoteWS)
  .addHeader("Content-Type", "multipart/form-data")
  .addBodyPart(audioPart)
  .addBodyPart(taskPart)
  .addBodyPart(langPart).build()

val result = Promise[Response]()

// execute request
client.executeRequest(request, new AsyncCompletionHandler[AHCResponse]{
  override def onCompleted(response: AHCResponse): AHCResponse = {
    result.success(Response(response))
    response
  }
  override def onThrowable(t: Throwable) {
    result.failure(t)
  }
})

// handle async response
result.future.map(result => {
  Logger.debug("response: " + result.getAHCResponse.getResponseBody("UTF-8"))
})
Every time I execute this request, it throws this exception:
java.io.IOException: Unable to write on channel java.nio.channels.SocketChannel
The remote server is OK; I can make successful requests using, for example, Postman.
I'm using Play Framework:
play 2.2.3 built with Scala 2.10.3 (running Java 1.8.0_05)
Version of AsyncHttpClient:
com.ning:async-http-client:1.7.18
Any help is welcome!