I am trying to port one of our Python Lambda functions to Rust and got stuck right at the beginning: setting up an S3 client.
In Python it is simple: during the Lambda initialisation I declare a client variable. In Rust, however, it is a bit more complicated.
As a newcomer to Rust, dealing with async inside a lazy_static setup is difficult.
use aws_config::meta::region::RegionProviderChain;
use aws_config::SdkConfig;
use aws_sdk_s3::Client as S3Client;
use lambda_http::Error as LambdaHttpError;
use lambda_http::{run, service_fn, Body, Error, Request, Response};
use lazy_static::lazy_static;
async fn connect_to_s3() -> S3Client {
    let region_provider: RegionProviderChain =
        RegionProviderChain::default_provider().or_else("eu-west-1");
    let config = aws_config::from_env().region(region_provider).load().await;
    let client: S3Client = S3Client::new(&config);
    client
}
lazy_static! {
    static ref S3_CLIENT: S3Client = connect_to_s3();
}
This throws the following error:
error[E0308]: mismatched types
--> src/bin/lambda/rora.rs:20:38
|
20 | static ref S3_CLIENT: S3Client = connect_to_s3();
| -------- ^^^^^^^^^^^^^^^ expected struct `aws_sdk_s3::Client`, found opaque type
| |
| expected `aws_sdk_s3::Client` because of return type
|
note: while checking the return type of the `async fn`
--> src/bin/lambda/rora.rs:10:29
|
10 | async fn connect_to_s3() -> S3Client {
| ^^^^^^^^ checked the `Output` of this `async fn`, found opaque type
= note: expected struct `aws_sdk_s3::Client`
found opaque type `impl Future<Output = aws_sdk_s3::Client>`
help: consider `await`ing on the `Future`
|
20 | static ref S3_CLIENT: S3Client = connect_to_s3().await;
| ++++++
How can I initialize the connection to S3 during setup of the Lambda?
After reviewing a bunch of code on GitHub and talking to some Rust devs I know, here is the current best option that I am aware of:
use async_once::AsyncOnce;
use lazy_static::lazy_static;
use aws_config::meta::region::RegionProviderChain;
use aws_sdk_glue::Client as GlueClient;
use aws_sdk_s3::Client as S3Client;
lazy_static! {
    static ref S3_CLIENT: AsyncOnce<S3Client> = AsyncOnce::new(async {
        let region_provider = RegionProviderChain::default_provider().or_else("eu-west-1");
        let config = aws_config::from_env().region(region_provider).load().await;
        S3Client::new(&config)
    });
    static ref GLUE_CLIENT: AsyncOnce<GlueClient> = AsyncOnce::new(async {
        let region_provider = RegionProviderChain::default_provider().or_else("eu-west-1");
        let config = aws_config::from_env().region(region_provider).load().await;
        GlueClient::new(&config)
    });
}
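With this in place, each handler invocation awaits the one-time initialisation and then borrows the shared client. Here is a minimal usage sketch building on the lazy_static block above; the handler name, the list_buckets call, and the response body are my own illustration, not part of the original:

use lambda_http::{run, service_fn, Body, Error, Request, Response};

// Hypothetical handler showing how the lazily initialised client is used.
async fn handler(_event: Request) -> Result<Response<Body>, Error> {
    // `get()` awaits the one-time async initialisation and hands back a `&S3Client`.
    let s3 = S3_CLIENT.get().await;
    // Any S3 call goes through the shared client; list_buckets is just an example.
    let _output = s3.list_buckets().send().await?;
    Ok(Response::new(Body::from("ok")))
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    run(service_fn(handler)).await
}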
I am new to Rust and I am trying to port Go code that I had written previously. The Go code basically downloaded files from S3 and directly (without writing to disk) un-gzipped the files and parsed them.
Currently the only solution I have found is to save the gzipped files on disk, then un-gzip and parse them.
The perfect pipeline would be to un-gzip and parse them directly.
How can I accomplish this?
const ENV_CRED_KEY_ID: &str = "KEY_ID";
const ENV_CRED_KEY_SECRET: &str = "KEY_SECRET";
const BUCKET_NAME: &str = "bucketname";
const REGION: &str = "us-east-1";
use anyhow::{anyhow, bail, Context, Result}; // (xp) (thiserror in prod)
use aws_sdk_s3::{config, ByteStream, Client, Credentials, Region};
use std::env;
use std::fs::{create_dir_all, File};
use std::io::{BufWriter, Write};
use std::path::Path;
use tokio_stream::StreamExt;
#[tokio::main]
async fn main() -> Result<()> {
    let client = get_aws_client(REGION)?;
    // `list_keys` (not shown here) lists the object keys under the given prefix.
    let keys = list_keys(&client, BUCKET_NAME, "CELLDATA/year=2022/month=06/day=06/").await?;
    println!("List:\n{}", keys.join("\n"));
    let dir = Path::new("input/");
    let key: &str = &keys[0];
    download_file_bytes(&client, BUCKET_NAME, key, dir).await?;
    println!("Downloaded {key} in directory {}", dir.display());
    Ok(())
}
async fn download_file_bytes(client: &Client, bucket_name: &str, key: &str, dir: &Path) -> Result<()> {
    // VALIDATE
    if !dir.is_dir() {
        bail!("Path {} is not a directory", dir.display());
    }
    // create file path and parent dir(s)
    let mut file_path = dir.join(key);
    let parent_dir = file_path
        .parent()
        .ok_or_else(|| anyhow!("Invalid parent dir for {:?}", file_path))?;
    if !parent_dir.exists() {
        create_dir_all(parent_dir)?;
    }
    file_path.set_extension("json");
    // BUILD - aws request
    let req = client.get_object().bucket(bucket_name).key(key);
    // EXECUTE
    let res = req.send().await?;
    // STREAM result to file
    let mut data: ByteStream = res.body;
    let file = File::create(&file_path)?;
    // This is where I am stuck: the decoder only wraps the first chunk and is never used.
    let first_chunk = data.try_next().await?.ok_or_else(|| anyhow!("empty object body"))?;
    let mut _gz_decoder = GzDecoder::new(&first_chunk[..]);
    let mut buf_writer = BufWriter::new(file);
    while let Some(bytes) = data.try_next().await? {
        buf_writer.write_all(&bytes)?;
    }
    buf_writer.flush()?;
    Ok(())
}
fn get_aws_client(region: &str) -> Result<Client> {
    // get the id/secret from env
    let key_id = env::var(ENV_CRED_KEY_ID).context("Missing S3_KEY_ID")?;
    let key_secret = env::var(ENV_CRED_KEY_SECRET).context("Missing S3_KEY_SECRET")?;
    // build the aws cred
    let cred = Credentials::new(key_id, key_secret, None, None, "loaded-from-custom-env");
    // build the aws client
    let region = Region::new(region.to_string());
    let conf_builder = config::Builder::new().region(region).credentials_provider(cred);
    let conf = conf_builder.build();
    // build aws client
    let client = Client::from_conf(conf);
    Ok(client)
}
Your snippet doesn't tell where GzDecoder comes from, but I'll assume it's flate2::read::GzDecoder.
flate2::read::GzDecoder is already built in such a way that it can wrap anything that implements std::io::Read:
GzDecoder::new expects an argument that implements Read => deflated data in
GzDecoder itself implements Read => inflated data out
Therefore, you can use it just like a BufReader: wrap your reader and use the wrapped value in its place:
use flate2::read::GzDecoder;
use std::fs::File;
use std::io::BufReader;
use std::io::Cursor;
fn main() {
    // Dummy input bytes. Note: this is not a valid gzip stream, so the copy below
    // returns an error at runtime; the point here is only the wiring.
    let data = [0, 1, 2, 3];
    // Something that implements `std::io::Read`
    let c = Cursor::new(data);
    // A dummy output
    let mut out_file = File::create("/tmp/out").unwrap();
    // Using the raw data would look like this:
    // std::io::copy(&mut c, &mut out_file).unwrap();
    // To inflate on the fly, "pipe" the data through the decoder, i.e. wrap the reader
    let mut stream = GzDecoder::new(c);
    // Consume the `Read`er somehow
    std::io::copy(&mut stream, &mut out_file).unwrap();
}
playground
You don't mention what "and parse them" entails, but the same concept applies: If your parser can read from an impl Read (e.g. it can read from a std::fs::File), then it can also read directly from a GzDecoder.
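For example, if the files happen to contain JSON and you parse them with serde_json (an assumption on my part; substitute whatever parser you actually use), the decoder plugs in directly and the inflated data never touches the disk. A minimal sketch:

use flate2::read::GzDecoder;
use std::fs::File;

// Hypothetical record type; adjust the fields to your actual data.
#[derive(serde::Deserialize, Debug)]
struct Record {
    id: u64,
    value: String,
}

fn parse_gzipped_json(path: &str) -> anyhow::Result<Vec<Record>> {
    // Any `impl Read` works as the source; a File is just the simplest example.
    let reader = GzDecoder::new(File::open(path)?);
    // serde_json reads the inflated bytes straight out of the decoder.
    let records: Vec<Record> = serde_json::from_reader(reader)?;
    Ok(records)
}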
I'm new to Rust and I need to apply a small if statement to one of the options in a function, for example:
use isahc::{
    HttpClient,
    config::{
        RedirectPolicy,
        VersionNegotiation,
        SslOption,
    },
    prelude::*,
};
use std::time::Duration;

pub struct http {
    pub timeout: u64,
}

impl http {
    pub fn send(&self) -> HttpClient {
        let client = HttpClient::builder()
            .version_negotiation(VersionNegotiation::http11())
            .redirect_policy(RedirectPolicy::None)
            .timeout(Duration::from_secs(self.timeout))
            .ssl_options(SslOption::DANGER_ACCEPT_INVALID_CERTS | SslOption::DANGER_ACCEPT_REVOKED_CERTS);
        return client.build().unwrap();
    }
}

fn main() {
    let req = http { timeout: "20".parse().unwrap() };
    let test = req.send();
    test.get("https://www.google.com");
}
Now in my program the user will give me the options for the request (e.g. follow redirects or not), and that requires an if statement on those options. I tried to use one in this case, but I always end up with a mismatched return type or, as shown below, a moved-value error.
impl http {
    pub fn send(&self) -> HttpClient {
        let client = HttpClient::builder()
            .version_negotiation(VersionNegotiation::http11())
            .redirect_policy(RedirectPolicy::None)
            .ssl_options(SslOption::DANGER_ACCEPT_INVALID_CERTS | SslOption::DANGER_ACCEPT_REVOKED_CERTS);
        if 1 == 1 {
            client.timeout(Duration::from_secs(self.timeout));
        }
        return client.build().unwrap();
    }
}
Cargo Output
warning: type `http` should have an upper camel case name
--> src/sender.rs:14:12
|
14 | pub struct http {
| ^^^^ help: convert the identifier to upper camel case: `Http`
|
= note: `#[warn(non_camel_case_types)]` on by default
error[E0382]: use of moved value: `client`
--> src/sender.rs:32:16
|
24 | let client =
| ------ move occurs because `client` has type `HttpClientBuilder`, which does not implement the `Copy` trait
...
30 | client.timeout(Duration::from_secs(self.timeout));
| ------ value moved here
31 | }
32 | return client.build().unwrap();
| ^^^^^^ value used here after move
So what am I doing wrong?
I tried changing the function's return type, but then I can't use methods like client.get() on it.
Here is a clear example in Python to explain what I need to do:
options: dict = {
    "redirects": False,
    "cert_path": "~/cer.pem"
}

def send(opts):
    # not real httplib !
    r = httplib.get("http://stackoverflow.com")
    if opts.get('redirects') == True:
        r.redirects = True
    if opts.get('cert_path', ''):
        r.ssl = opts.get('cert_path')
    return r.send()

def main():
    send(options)
Thanks
Since this is a builder pattern, each function call consumes the builder and returns it back. So you need to capture client from the return value of the timeout function in order to continue using it. Note that you also need to make client mutable.
Something like
impl http {
    pub fn send(&self) -> HttpClient {
        let mut client = HttpClient::builder()
            .version_negotiation(VersionNegotiation::http11())
            .redirect_policy(RedirectPolicy::None)
            .ssl_options(SslOption::DANGER_ACCEPT_INVALID_CERTS | SslOption::DANGER_ACCEPT_REVOKED_CERTS);
        if 1 == 1 {
            client = client.timeout(Duration::from_secs(self.timeout));
        }
        return client.build().unwrap();
    }
}
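To get closer to the Python example, the same reassignment pattern works with real option flags instead of 1 == 1. A minimal sketch, assuming a hypothetical Options struct (the field names here are mine, not from the original code):

use isahc::{config::RedirectPolicy, HttpClient};
use std::time::Duration;

// Hypothetical options struct standing in for the Python `options` dict.
pub struct Options {
    pub follow_redirects: bool,
    pub timeout_secs: Option<u64>,
}

pub fn build_client(opts: &Options) -> HttpClient {
    // Each builder call consumes the builder, so keep reassigning it.
    let mut builder = HttpClient::builder();
    if opts.follow_redirects {
        builder = builder.redirect_policy(RedirectPolicy::Follow);
    }
    if let Some(secs) = opts.timeout_secs {
        builder = builder.timeout(Duration::from_secs(secs));
    }
    builder.build().unwrap()
}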
I need to generate a value with a different type from the type that is passed to me. This is the first time I am writing in an OCaml-like language; in Haskell, which I know better, I would use Data.Generics for this.
As far as I understand, I need to use a decorator and a PPX. I wrote a simple example:
let recordHandler = (loc: Location.t, _recFlag: rec_flag, _t: type_declaration, fields: list(label_declaration)) => {
  let (module Builder) = Ast_builder.make(loc);
  let test = [%str
    let schema: Schema = { name: "", _type: String, properties: [] }
  ]
  let moduleExpr = Builder.pmod_structure(test);
  [%str
    module S = [%m moduleExpr]
  ]
}

let str_gen = (~loc, ~path as _, (_rec: rec_flag, t: list(type_declaration))) => {
  let t = List.hd(t)
  switch t.ptype_kind {
  | Ptype_record(fields) => recordHandler(loc, _rec, t, fields);
  | _ => Location.raise_errorf(~loc, "schema is used only for records.");
  };
};

let name = "my_schema";

let () = {
  let str_type_decl = Deriving.Generator.make_noarg(str_gen);
  Deriving.add(name, ~str_type_decl) |> Deriving.ignore;
};
And
open Ppxlib;
let _ = Driver.run_as_ppx_rewriter()
But when using it in ReScript code:
module User = {
  #deriving(my_schema)
  type my_typ = {
    foo: int,
  };
};
I got:
schema is not supported
I made sure the PPX was wired up correctly: when I changed #deriving(my_schema) to #deriving(abcd) and #deriving(sschema),
I got a different error:
Ppxlib.Deriving: 'abcd' is not a supported type deriving generator.
My last experiment was to copy-paste an existing deriving library,
ppx_accessor
I copied it and renamed it to accessors_2, and I got the same error as in the first experiment:
accessors_2 is not supported
Also, I haven't found any examples for "ppx rescript". Can you please help me?
What am I doing wrong? (Everything, I know.)
I have found the answer in this article:
Dropping support for custom PPXes such as ppx_deriving (the deriving
attribute is now exclusively interpreted as bs.deriving)
I'm trying to verify an IP address in Rust, but I can't find a solution for converting a &str into a u8 that doesn't involve using nightly Rust:
use std::net::{IpAddr, Ipv4Addr};

fn verify_address(address: String) -> bool {
    let v: Vec<&str> = address.split('.').collect();
    let v_u8: Vec<u8> = v.iter().map(|c| *c.to_owned() as u8).collect();
    let addr = IpAddr::V4(Ipv4Addr::new(v_u8[0], v_u8[1], v_u8[2], v_u8[3]));
    //.expect("ERR: Error parsing IPv4 address!");
    if !addr.is_ipv4() {
        return false;
    }
    return true;
}
error[E0605]: non-primitive cast: `str` as `u8`
--> src/lib.rs:6:42
|
6 | let v_u8: Vec<u8> = v.iter().map(|c| *c.to_owned() as u8).collect();
| ^^^^^^^^^^^^^^^^^^^
|
= note: an `as` expression can only be used to convert between primitive types. Consider using the `From` trait
Please reread the chapter about error handling. You do not need all this:
use std::net::Ipv4Addr;

fn main() {
    let ip = "127.0.0.1".parse::<Ipv4Addr>();
    match ip {
        Ok(ip) => println!("valid: {}", ip),
        Err(e) => println!("invalid: {}", e),
    }
}
About your question: you can use a primitive cast for primitive types only. You must use From and Into, or parse, when converting a &str into another type.
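If you want to keep the original per-octet approach, each piece can be converted with str::parse::<u8>() instead of an as cast. A minimal sketch:

fn verify_address(address: &str) -> bool {
    // Split on '.', parse every piece as a u8, and require exactly four octets.
    let octets: Result<Vec<u8>, _> = address.split('.').map(|s| s.parse::<u8>()).collect();
    matches!(octets, Ok(v) if v.len() == 4)
}

fn main() {
    assert!(verify_address("127.0.0.1"));
    assert!(!verify_address("256.0.0.1")); // 256 does not fit in a u8
    assert!(!verify_address("1.2.3"));     // too few octets
}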
This is an experiment I'm doing while learning Rust and following Programming Rust.
Here's a link to the code in the playground.
I have a struct (Thing) with some inner state (xs). A Thing should be created with Thing::new and then started; after that the user should be able to call other functions such as get_xs.
But! In start, two threads are spawned which call other methods on the Thing instance that could mutate its inner state (say, add elements to xs), so they need a reference to self (hence the Arc). However, this causes a lifetime conflict:
error[E0495]: cannot infer an appropriate lifetime due to conflicting requirements
--> src/main.rs:18:30
|
18 | let self1 = Arc::new(self);
| ^^^^
|
note: first, the lifetime cannot outlive the anonymous lifetime #1 defined
on the method body at 17:5...
--> src/main.rs:17:5
|
17 | / fn start(&self) -> io::Result<Vec<JoinHandle<()>>> {
18 | | let self1 = Arc::new(self);
19 | | let self2 = self1.clone();
20 | |
... |
33 | | Ok(vec![handle1, handle2])
34 | | }
| |_____^
note: ...so that expression is assignable (expected &Thing, found &Thing)
--> src/main.rs:18:30
|
18 | let self1 = Arc::new(self);
| ^^^^
= note: but, the lifetime must be valid for the static lifetime...
note: ...so that the type `[closure#src/main.rs:23:20: 25:14
self1:std::sync::Arc<&Thing>]` will meet its required lifetime bounds
--> src/main.rs:23:14
|
23 | .spawn(move || loop {
| ^^^^^
Is there a way to spawn the state-mutating threads and still give ownership of the thing back to the code that is using it after start has run?
use std::io;
use std::sync::{Arc, LockResult, RwLock, RwLockReadGuard};
use std::thread::{Builder, JoinHandle};

struct Thing {
    xs: RwLock<Vec<String>>,
}

impl Thing {
    fn new() -> Thing {
        Thing {
            xs: RwLock::new(Vec::new()),
        }
    }

    fn start(&self) -> io::Result<Vec<JoinHandle<()>>> {
        let self1 = Arc::new(self);
        let self2 = self1.clone();

        let handle1 = Builder::new()
            .name("thread1".to_owned())
            .spawn(move || loop {
                self1.do_within_thread1();
            })?;

        let handle2 = Builder::new()
            .name("thread2".to_owned())
            .spawn(move || loop {
                self2.do_within_thread2();
            })?;

        Ok(vec![handle1, handle2])
    }

    fn get_xs(&self) -> LockResult<RwLockReadGuard<Vec<String>>> {
        return self.xs.read();
    }

    fn do_within_thread1(&self) {
        // read and potentially mutate self.xs
    }

    fn do_within_thread2(&self) {
        // read and potentially mutate self.xs
    }
}

fn main() {
    let thing = Thing::new();
    let handles = match thing.start() {
        Ok(hs) => hs,
        _ => panic!("Error"),
    };
    thing.get_xs();
    for handle in handles {
        handle.join();
    }
}
The error message says that the value passed to the Arc must live the 'static lifetime. This is because spawning a thread, be it with std::thread::spawn or std::thread::Builder, requires the passed closure to live this lifetime, thus enabling the thread to "live freely" beyond the scope of the spawning thread.
Let us expand the prototype of the start method:
fn start<'a>(self: &'a Thing) -> io::Result<Vec<JoinHandle<()>>> { ... }
The attempt to put a &'a self into an Arc creates an Arc<&'a Thing>, which is still constrained to the lifetime 'a and so cannot be moved into a closure that needs to live longer than that. Since we cannot move out of &self either, the solution is not to use &self for this method. Instead, we can make start accept an Arc directly:
fn start(thing: Arc<Self>) -> io::Result<Vec<JoinHandle<()>>> {
    let self1 = thing.clone();
    let self2 = thing;

    let handle1 = Builder::new()
        .name("thread1".to_owned())
        .spawn(move || loop {
            self1.do_within_thread1();
        })?;

    let handle2 = Builder::new()
        .name("thread2".to_owned())
        .spawn(move || loop {
            self2.do_within_thread2();
        })?;

    Ok(vec![handle1, handle2])
}
And pass reference-counted pointers at the consumer's scope:
let thing = Arc::new(Thing::new());
let handles = Thing::start(thing.clone()).unwrap_or_else(|_| panic!("Error"));
thing.get_xs().unwrap();
for handle in handles {
    handle.join().unwrap();
}
Playground. At this point the program will compile and run (although the workers are in an infinite loop, so the playground will kill the process after the timeout).