Spring Boot Multipart File Upload Size Inconsistency - amazon-web-services

I have an endpoint that uploads an image file to the server, then to S3.
When I run on localhost, the MultipartFile byte size is correct, and the upload is successful.
However, the moment I deploy it to my EC2 instance the uploaded file size is incorrect.
Controller Code
@PostMapping("/{id}/photos")
fun addPhotos(@PathVariable("id") id: Long,
              @RequestParam("file") file: MultipartFile,
              jwt: AuthenticationJsonWebToken) = ApiResponse.success(propertyBLL.addPhotos(id, file, jwt))
Within the PropertyBLL.addPhotos method, printing file.size results in the wrong size.
Actual file size is 649305 bytes, however when uploaded to my prod server it reads as 1189763 bytes.
My production server is an AWS EC2 instance behind HTTPS.
The Spring application yml files are the same. The only settings I overrode were the max file size properties.
I'm using Postman to POST the request. I'm passing the body as form-data, with the key named "file".
Again, it works perfectly when running locally.
I did another test where I wrote the uploaded file to the server so I could compare.
Uploaded file's first n bytes in Hex editor:
EFBFBD50 4E470D0A 1A0A0000 000D4948 44520000 03000000 02400802 000000EF BFBDCC96 01000000 0467414D 410000EF BFBDEFBF BD0BEFBF BD610500 00002063 48524D00 007A2600
Original file's first n bytes:
89504E47 0D0A1A0A 0000000D 49484452 00000300 00000240 08020000 00B5CC96 01000000 0467414D 410000B1 8F0BFC61 05000000 20634852 4D00007A 26000080 840000FA 00000080
They both appear to have the text "PNG" in them and also have the ending EXtdate:modify/create markers.
Per Request, the core contents of addPhoto:
val metadata = ObjectMetadata()
metadata.contentLength = file.size
metadata.contentType = "image/png"
LOGGER.info("Uploading image of size {} bytes, name: {}", file.size, file.originalFilename)
val request = PutObjectRequest(awsProperties.cdnS3Bucket, imageName, file.inputStream, metadata)
awsSdk.putObject(request)
This works when I run the web server locally. imageName is just a custom-built name. There is other code involving Hibernate models, but it is not relevant.
Update
This appears to be related to HTTPS or the API proxy. When I hit the EC2 node's HTTP URL, it works fine. However, when I go through the API proxy (https://api.thedomain.com), which proxies to the EC2 node, it fails. I will continue down this path.

After more debugging I discovered that when I POST to the EC2 instance directly, everything works as expected. Our primary public API URL proxies requests through Amazon's API Gateway service. For some reason, this service converts the data to Base64 instead of just passing the raw binary data through.
I found documentation on updating API Gateway to pass binary data through: here.
I am using the Content-Type value of multipart/form-data. Do not forget to also add it in your API Settings where you enable binary support.
I did not have to edit the header options. Additionally, I used the default "Method Request Passthrough" template.
And finally, don't forget to deploy your API changes.
It's now working as expected.
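The hex dumps in the question fit a specific kind of corruption: every byte that is not valid UTF-8 on its own (0x89, 0xB5, 0xB1, 0x8F, 0xFC) shows up as EF BF BD, the UTF-8 encoding of the replacement character U+FFFD. That is what you get when a binary body is decoded as UTF-8 text and then re-encoded. The Python sketch below reproduces the effect on the first bytes of the original file. The function name is made up for this illustration, and the claim that the gateway did exactly this is an assumption drawn from the dumps, not something confirmed by the original post.

```python
import base64


def mangle_as_utf8_text(raw: bytes) -> bytes:
    """Decode bytes as UTF-8 text (invalid bytes become U+FFFD), then re-encode."""
    return raw.decode("utf-8", errors="replace").encode("utf-8")


# First 8 bytes of the original file (the PNG signature), copied from the dump above.
original_head = bytes.fromhex("89504E470D0A1A0A")
mangled_head = mangle_as_utf8_text(original_head)
print(mangled_head.hex().upper())  # EFBFBD504E470D0A1A0A, as in the uploaded file

# For comparison: the size a plain Base64 encoding of a 649305-byte file would have.
base64_size = len(base64.b64encode(bytes(649305)))
print(base64_size)  # 865740
```

Since plain Base64 would give 865740 bytes rather than the 1189763 reported, the size in the question looks more like the result of this text round trip, in which each invalid byte grows to three bytes. Either way, the fix is the same: declare multipart/form-data as a binary media type so the body is not treated as text.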

Sorry, but many of the comments make no sense. file.size will return the size of the uploaded file in bytes, NOT the size of the request (which, yes, due to different filters could potentially be enhanced with additional information and increase in size). Spring can't just magically double the size of a PNG file (in your case adding almost another ~600kb of information on top of whatever you've sent). While I'd like to trust that you know what you're doing and the numbers you are giving us are indeed correct, to me, all evidence points to human error... please, double-, triple-, quadruple- check that you're indeed uploading the same file in all scenarios.
How did you get to 649305 bytes in the first place? Who gave you that number? Was it your code or did you actually look at the file on disk and see how big it was? The only way compression discussions make any sense in this context is if 649305 bytes is the already compressed size of the file when running locally (its actual size on disk being 1189763 bytes) and compression is, for some reason, not turned on when deployed to AWS, so you receive the full uncompressed file. We don't even know how you are deploying it. Is it really the same as locally? Are you running a standalone .jar in both cases? Are you perhaps deploying a .war to AWS instead? Are you really running the app in the same container and container version in both cases, or are you perhaps running Tomcat locally and Jetty on AWS? Are you sure your Postman request is not messed up and you're not sending something else by accident (or more than you think)?
EDIT:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.sandbox</groupId>
    <artifactId>spring-boot-file-upload</artifactId>
    <version>1.0-SNAPSHOT</version>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.1.6.RELEASE</version>
    </parent>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>
package com.sandbox;

import static org.springframework.http.ResponseEntity.ok;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

@SpringBootApplication
public class Application {

    public static void main(final String[] arguments) {
        SpringApplication.run(Application.class, arguments);
    }

    @RestController
    class ImageRestController {

        // Reports the size of the uploaded part as seen by Spring.
        @PostMapping(path = "/images")
        ResponseEntity<String> upload(@RequestParam(name = "file") final MultipartFile image) {
            return ok("{\"response\": \"Uploaded image having a size of '" + image.getSize() + "' byte(s).\"}");
        }
    }
}
The example is in Java because it was just faster to put together (the environment is a simple Java environment having the standalone .jar deployed - no extra configs or anything, except for the server port being on 5000). Either way, you can try it out yourself by sending POST requests to http://test123456.us-east-1.elasticbeanstalk.com/images
This is my Postman request and the response using the image you've provided:
Everything seems to be looking fine on my AWS EB instance and all the numbers add up as expected. If you are saying your setup is as simple as it sounds then I'm unfortunately just as puzzled as you are. I can only assume that there's more to what you have shared so far (however, I doubt the issue is related to Spring Boot... then it is more likely that it has to do with your AWS configs/setup).

CloudFormation Template snippet for achieving Kenny Cason's solution:
MyApi:
  Type: AWS::Serverless::Api
  Properties:
    BinaryMediaTypes:
      - "multipart/form-data"

Related

Bypassing Cloud Run 32mb error via HTTP2 end to end solution

I have an API query that runs during a POST request on one of my views to populate my dashboard page. I know the response size is ~35 MB (greater than the 32 MB limit set by Cloud Run). I was wondering how I could bypass this.
My Django web app is served as an ASGI app by a Hypercorn server. I have 2 minimum instances, with 1 GB RAM and 2 CPUs per instance. I have run this Docker container locally. I can't reduce the amount of data required, and I also do not want to store the data due to costs. This seems to be the cheapest route. Any pointers or ideas would be helpful. I understand that I can bypass the limit with an end-to-end HTTP/2 setup, but I am unable to do so currently. I haven't created any additional Hypercorn configuration. Any help appreciated!
The Cloud Run HTTP response limit is 32 MB and cannot be increased.
One suggestion is to compress the response data. Django ships a GZipMiddleware, or you can compress the payload yourself with Python's gzip or zlib modules.
import gzip
data = b"Lots of content to compress"
cdata = gzip.compress(data)
# return compressed data in response
Cloud Run supports HTTP/1.1 server side streaming, which has unlimited response size. All you need to do is use chunked transfer encoding.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Transfer-Encoding
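As a rough illustration of both suggestions, the sketch below compresses a payload with gzip and then splits the result into fixed-size chunks with a generator. The helper name iter_chunks and the 64 KiB chunk size are arbitrary choices for this example. In a Django view, a generator like this could be handed to StreamingHttpResponse so the body is sent incrementally, though whether that results in chunked transfer encoding end to end depends on the server setup and is not verified here.

```python
import gzip
from typing import Iterator

CHUNK_SIZE = 64 * 1024  # arbitrary chunk size chosen for this sketch


def iter_chunks(payload: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the payload piece by piece instead of as one large body."""
    for start in range(0, len(payload), chunk_size):
        yield payload[start:start + chunk_size]


# Stand-in for the large API response; highly repetitive, so it compresses well.
payload = b"Lots of content to compress" * 100_000
compressed = gzip.compress(payload)

# Materialized here only to inspect the result; a streaming response
# would consume the generator lazily.
chunks = list(iter_chunks(compressed))
print(len(payload), len(compressed), len(chunks))
```

Real dashboard data will compress less dramatically than this repetitive sample, so it is worth measuring the compressed size of an actual response before relying on compression alone to stay under the limit.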

How Do I Test A Camel Route That Reads From A Directory?

I am creating a Camel application that reads files dropped into a directory on an FTP server, converts them and uploads to a REST API.
If I create a route that reads a file like this:
from("file://target/input?delay=5s")
.choice()
.when(header(Exchange.FILE_NAME).endsWith(".xml"))
.unmarshal(jaxb)
The target/input directory has to exist which then makes testing a bit awkward as my test resources might not be in the correct directory.
In my test it would be easier to just get the contents of the test file and put it into the unmarshal part of the route
from("direct:xml")
.unmarshal(jaxb)
How do I make my Route flexible enough to test without just copying the whole route to a test class and modifying the input component?
You can simply make your from endpoint configurable and configure it through application properties.
For example if you use Spring:
@Value("${endpoint.consumer}")
private String consumerEndpoint;
...
from(consumerEndpoint)
...
You can then provide a test configuration that configures the endpoint as direct:input and use a ProducerTemplate to send test messages with different payloads to this endpoint.
Real application properties:
endpoint.consumer=file://target/input?delay=5s
Test properties:
endpoint.consumer=direct:input
Like this you get rid of any file endpoints in your tests.
If your route also has to(...) endpoints, you can make them configurable too. You can turn them into mocks in your tests by configuring them as mock:whatever.

Does boto2 use http or https to upload files to s3?

I noticed that uploading small files to S3 bucket is very slow. For a file with size of 100KB, it takes 200ms to upload. Both the bucket and our app are in Oregon. App is hosted on EC2.
I googled it and found some blogs; e.g. http://improve.dk/pushing-the-limits-of-amazon-s3-upload-performance/
It mentions that HTTP can be considerably faster than HTTPS.
We're using boto 2.45. Does boto use HTTPS or HTTP by default? And is there a parameter to configure this behavior in boto?
Thanks in advance!
The boto3 client includes a use_ssl parameter:
use_ssl (boolean) -- Whether or not to use SSL. By default, SSL is used. Note that not all services support non-ssl connections.
Looks like it's time for you to move to boto3!
I tried boto3, which has a nice "use_ssl" parameter in the connection constructor. However, it turned out that boto3 is significantly slower than boto2. There are actually already many posts online about this issue.
Finally, I found that boto2 also has a similar parameter, "is_secure":
self.s3Conn = S3Connection(config.AWS_ACCESS_KEY_ID, config.AWS_SECRET_KEY, host=config.S3_ENDPOINT, is_secure=False)
Setting is_secure to False saves us about 20ms. Not bad.

How to configure Jetty in spring-boot (easily?)

By following the tutorial, I could bring up the spring-boot with Jetty running using the following dependencies.
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
<exclusions>
<exclusion>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-tomcat</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-jetty</artifactId>
</dependency>
However, how could I configure the Jetty server such as:
Server threads (Queue thread pool)
Server connectors
Https configurations.
all those configuration available in Jetty...?
Is there an easy way to do in
application.yml?
Configuration class?
Any example would be greatly appreciated.
Many thanks!!
There are some general extension points for servlet containers and also options for plugging Jetty API calls into those, so I assume everything you would want is in reach. General advice can be found in the docs. Jetty hasn't received as much attention yet so there may not be the same options available for declarative configuration as with Tomcat, and for sure it won't have been used much yet. If you would like to help change that, then help is welcome.
It is possible to configure Jetty (in parts) programmatically; the following is from http://howtodoinjava.com/spring/spring-boot/configure-jetty-server/
// Note: JettyEmbeddedServletContainerFactory is the Spring Boot 1.x API.
@Bean
public JettyEmbeddedServletContainerFactory jettyEmbeddedServletContainerFactory() {
    JettyEmbeddedServletContainerFactory jettyContainer =
            new JettyEmbeddedServletContainerFactory();
    jettyContainer.setPort(9000);
    jettyContainer.setContextPath("/home");
    return jettyContainer;
}
If anyone is using Spring Boot, you can easily configure this in your application.properties like so:
server.max-http-post-size=n
where n is the maximum size to which you wish to set this property. For example I use:
server.max-http-post-size=5000000
As of the year 2020, while working on newer versions, this is what you need to do, to configure Jetty port, context path and thread pool properties. I tested this on Spring Boot version 2.1.6 while the document I referred to is for version 2.3.3
Create a server factory bean in a configuration file.
@Bean
public ConfigurableServletWebServerFactory webServerFactory() {
    JettyServletWebServerFactory factory = new JettyServletWebServerFactory();
    factory.setPort(8080);
    factory.setContextPath("/my-app");
    QueuedThreadPool threadPool = new QueuedThreadPool();
    threadPool.setMinThreads(10);
    threadPool.setMaxThreads(100);
    threadPool.setIdleTimeout(60000);
    factory.setThreadPool(threadPool);
    return factory;
}
Following is the link to Spring Docs:
customizing-embedded-containers
Spring Boot provides the following Jetty-specific configuration through the properties file:
server:
  jetty:
    connection-idle-timeout: # Time that the connection can be idle before it is closed.
    max-http-form-post-size: # Maximum size of the form content in any HTTP post request e.g. 200000B
    accesslog:
      enabled: # Enable access log e.g. true
      append: # Enable append to log e.g. true
      custom-format: # Custom log format
      file-date-format: # Date format to place in log file name
      filename: # Log file name, if not specified, logs redirect to "System.err"
      format: # Log format e.g ncsa
      ignore-paths: # Request paths that should not be logged
      retention-period: # Number of days before rotated log files are deleted e.g. 31
    threads:
      acceptors: # Number of acceptor threads to use. When the value is -1, the default, the number of acceptors is derived from the operating environment.
      selectors: # Number of selector threads to use. When the value is -1, the default, the number of selectors is derived from the operating environment.
      min: # Minimum number of threads e.g. 8
      max: # Maximum number of threads e.g. 200
      max-queue-capacity: # Maximum capacity of the thread pool's backing queue. A default is computed based on the threading configuration.
      idle-timeout: # Maximum thread idle time in milliseconds e.g. 60000ms
Please refer to the official Spring Boot documentation for more configuration details.

AppFabric Error ERRCA0017 SubStatus ES0006

Just installed Windows Server AppFabric 1.1 on my Windows 7 box and I'm trying to write some code in a console application against the cache client API as part of an evaluation of AppFabric. My app.config looks like the following:
<configSections>
<section name="dataCacheClient" type="Microsoft.ApplicationServer.Caching.DataCacheClientSection, Microsoft.ApplicationServer.Caching.Core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" allowLocation="true" allowDefinition="Everywhere" />
</configSections>
<dataCacheClient>
<hosts>
<host name="pa-chouse2" cachePort="22233" />
</hosts>
</dataCacheClient>
I created a new cache and added my domain user account as an allowed client account using the Powershell cmdlet Grant-CacheAllowedClientAccount. I'm creating a new DataCache instance like so:
using (DataCacheFactory cacheFactory = new DataCacheFactory())
{
this.cache = cacheFactory.GetDefaultCache();
}
When I call DataCache.Get, I end up with the following exception:
ErrorCode<ERRCA0017>:SubStatus<ES0006>:There is a temporary failure. Please retry later. (One or more specified cache servers are unavailable, which could be caused by busy network or servers. For on-premises cache clusters, also verify the following conditions. Ensure that security permission has been granted for this client account, and check that the AppFabric Caching Service is allowed through the firewall on all cache hosts. Also the MaxBufferSize on the server must be greater than or equal to the serialized object size sent from the client.)
I'd be very grateful if anyone could point out what I'm missing to get this working.
Finally figured out what my problem was after nearly ripping out what little hair I have left. Initially, I was creating my DataCache like so:
using (DataCacheFactory cacheFactory = new DataCacheFactory())
{
this.cache = cacheFactory.GetDefaultCache();
}
It turns out that DataCache doesn't like having the DataCacheFactory that created it disposed of. Once I refactored the code so that my DataCacheFactory stayed in scope for as long as I needed my DataCache, it worked as expected.