Upload Zip Folder to Data Management API - autodesk-data-management

We upload many small (1 KB) text files one at a time to the Data Management API, and the latency becomes a real issue as the number of files grows.
Is it possible to upload a zipped folder containing several text files and have the individual files appear inside a single folder in BIM 360?
Ideally we would compress the files into a single zip archive, upload that package once, and have the Data Management API extract all the files into a BIM 360 folder.

Related

Data Loss Prevention on Big Data files

I have migrated a big data application to the cloud, and the input files are stored in GCS. The files can be in different formats such as txt, csv, avro, and parquet, and they contain sensitive data that I want to mask.
I have also read that there is a quota restriction on file size; in my case a single file can contain 15M records.
I have tried the DLP UI as well as the client library to inspect those files, but it is not working.
GitHub page - https://github.com/Hitman007IN/DataLossPreventionGCPDemo
Under the resources there are 2 files: test.txt works, while test1.txt, which is the sample file I use in my application, does not.
Google Cloud DLP just launched support last week for scanning Avro files natively.
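For the client-library route, one approach that avoids per-request size limits on inspect_content is to point a DLP inspection job at the files in GCS instead of streaming their content. A minimal Python sketch, assuming the google-cloud-dlp library and placeholder project, bucket, and infoType values:

```python
# Minimal sketch: start a DLP inspection job over files stored in GCS,
# assuming the google-cloud-dlp client library; project ID, GCS URL, and
# infoTypes are placeholder values.
from google.cloud import dlp_v2

def inspect_gcs_files(project_id: str, gcs_url: str) -> None:
    client = dlp_v2.DlpServiceClient()
    parent = f"projects/{project_id}/locations/global"

    inspect_job = {
        # Point the job at the files to scan (wildcards are allowed).
        "storage_config": {
            "cloud_storage_options": {"file_set": {"url": gcs_url}}
        },
        # The infoTypes to look for; adjust to the sensitive data you expect.
        "inspect_config": {
            "info_types": [{"name": "EMAIL_ADDRESS"}, {"name": "PHONE_NUMBER"}]
        },
    }

    job = client.create_dlp_job(
        request={"parent": parent, "inspect_job": inspect_job}
    )
    print(f"Started DLP job: {job.name}")

inspect_gcs_files("my-project", "gs://my-bucket/resources/*.txt")
```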

Transferring Pdf files from Local folder to AWS

I have a monthly activity where I get hundreds of PDF files in a folder and I need to transfer them to an AWS server. Currently I do this manually, but I need to automate the transfer of all PDF files from my local folder to a specific folder in AWS.
This process also takes a lot of time (approx. 5 hours for 500 PDF files). Is there a way to speed it up?
While copying from local to AWS you must be using a tool like WinSCP or an SSH client, so you could automate the same step with a script:
scp [-r] /your/pdf/dir youruser@awshost:/home/user/path/
If you want more speed, you could run multiple scp commands in parallel (for example in multiple terminals), and you may want to split the files into logically grouped directories first, as sketched below.
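A minimal sketch of that parallel-scp idea, driven from Python rather than separate terminals; it assumes key-based SSH auth and placeholder host and path names:

```python
# Minimal sketch of running several scp copies in parallel from Python.
# Assumes key-based SSH auth; the host, user, and paths are placeholders.
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

REMOTE = "youruser@awshost:/home/user/path/"
LOCAL_DIR = Path("/your/pdf/dir")

def copy_one(pdf: Path) -> int:
    # One scp process per file; for many tiny files, batching whole
    # subdirectories per scp call may be faster in practice.
    return subprocess.run(["scp", str(pdf), REMOTE], check=False).returncode

pdfs = sorted(LOCAL_DIR.glob("*.pdf"))
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(copy_one, pdfs))

print(f"Copied {results.count(0)} of {len(pdfs)} files")
```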
You could also zip the files, transfer the archive, and unzip it after the transfer.
Alternatively, write a program that iterates over all the files in your folder and uploads them to S3 using the S3 API methods, as in the sketch below.
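A minimal boto3 sketch of that approach, with placeholder bucket, prefix, and directory names:

```python
# Minimal sketch: iterate over a local folder and upload each PDF to S3
# with boto3. The bucket name, key prefix, and local path are placeholders.
from pathlib import Path

import boto3

s3 = boto3.client("s3")
LOCAL_DIR = Path("/your/pdf/dir")
BUCKET = "my-bucket"
PREFIX = "monthly-pdfs/"

for pdf in sorted(LOCAL_DIR.glob("*.pdf")):
    key = PREFIX + pdf.name
    # upload_file handles multipart uploads for larger files automatically.
    s3.upload_file(str(pdf), BUCKET, key)
    print(f"Uploaded {pdf.name} -> s3://{BUCKET}/{key}")
```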

Does the Amazon S3 sync command upload the entire modified file again or just the delta in the file?

My system generates large log files continuously, and I want to upload all of them to Amazon S3. I am planning to use the s3 sync command for this. My system appends logs to the same file until it reaches about 50 MB and then creates a new log file. I understand that the sync command will sync the modified local log file to the S3 bucket, but I don't want to upload the entire log file whenever it changes, because the files are large and sending the same data again and again will consume my bandwidth.
So I am wondering: does the s3 sync command send the entire modified file or just the delta?
The documentation implies that it copies the whole updated file:
"Recursively copies new and updated files"
Plus, there would be no way to do this without downloading the file from S3, which would effectively double the cost of an upload, since you'd pay for both the download and the upload.

Download bulk objects from Amazon S3 bucket

I have a large bucket folder with over 30 million objects (images). I need to download only 700,000 objects (images) from that large folder.
I have the names of the objects (images) I need to download in a .txt file.
I can use the AWS CLI, but I am not sure whether it supports downloading that many specific objects in one command.
Is there a straightforward solution you would have in mind?
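One option is a short script that reads the key list and downloads the objects in parallel through the SDK. A minimal boto3 sketch, assuming one key per line in the .txt file and placeholder bucket and path names:

```python
# Minimal sketch: read object keys (one per line) from a text file and
# download them in parallel with boto3. Bucket, key file, and output
# directory are placeholders.
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

import boto3

s3 = boto3.client("s3")
BUCKET = "my-bucket"
OUT_DIR = Path("downloads")
OUT_DIR.mkdir(exist_ok=True)

def fetch(key: str) -> None:
    # Flatten the key to a file name; adjust if the names can collide.
    s3.download_file(BUCKET, key, str(OUT_DIR / Path(key).name))

with open("keys.txt") as f:
    keys = [line.strip() for line in f if line.strip()]

with ThreadPoolExecutor(max_workers=32) as pool:
    list(pool.map(fetch, keys))

print(f"Downloaded {len(keys)} objects")
```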

Merge multiple zip files on s3 into fewer zip files

We have a problem where some of the files in an S3 directory are in the ~500 MiB range, but many others are only kilobytes or bytes in size. I want to merge all the small files into fewer, bigger files on the order of ~500 MiB.
What is the most efficient way to rewrite the data in an S3 folder without having to download it, merge it locally, and push it back to S3? Is there some utility or AWS command I can use to achieve this?
S3 is a storage service and has no compute capability. For what you are asking you need compute (to do the merging), so you cannot do what you want without downloading, merging, and re-uploading, for example along the lines of the sketch below.
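A minimal boto3 sketch of that download/merge/re-upload loop; the bucket, prefixes, and size thresholds are placeholder assumptions, and it buffers each merged object in memory for simplicity:

```python
# Minimal sketch: concatenate small S3 objects into larger ones with boto3.
# Bucket, prefixes, and thresholds below are placeholder assumptions.
import io

import boto3

s3 = boto3.client("s3")
BUCKET = "my-bucket"
PREFIX = "data/"                    # where the small files live
SMALL_THRESHOLD = 1 * 1024 * 1024   # treat objects under 1 MiB as "small"
TARGET_SIZE = 500 * 1024 * 1024     # aim for ~500 MiB merged objects

# List the small objects first so the listing is not affected by the
# merged objects written afterwards.
paginator = s3.get_paginator("list_objects_v2")
small_keys = [
    obj["Key"]
    for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX)
    for obj in page.get("Contents", [])
    if obj["Size"] < SMALL_THRESHOLD
]

def flush(buf: io.BytesIO, index: int) -> None:
    # Write one merged object under a separate prefix.
    buf.seek(0)
    s3.put_object(Bucket=BUCKET, Key=f"merged/part-{index:05d}", Body=buf)

buffer, part = io.BytesIO(), 0
for key in small_keys:
    buffer.write(s3.get_object(Bucket=BUCKET, Key=key)["Body"].read())
    if buffer.tell() >= TARGET_SIZE:
        flush(buffer, part)
        buffer, part = io.BytesIO(), part + 1

if buffer.tell():
    flush(buffer, part)
```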