I have uploaded a simple 10-row CSV file (from S3) into the AWS ML website. It keeps giving me the error,
"We cannot find any valid records for this datasource."
There are records there, and the Y variable is continuous (not binary). I am pretty much stuck at this point because there is only one button to move forward and build the machine learning model. Does anyone know what I should do to fix it? Thanks!
The only way I have been able to upload .csv files that I created on my own to S3 is by downloading an existing .csv file from my S3 bucket, modifying the data, uploading it, and then changing the name in the S3 console.
Could you post the first few lines of the .csv file? I am able to upload my own .csv file along with a schema that I created, and it works. However, I did have an issue where Amazon ML was unable to create the schema for me.
Also, did you try saving the data in something like Sublime Text or Notepad++ to get a different format? On my Mac, a CSV saved from Microsoft Excel did not work, but when I re-saved the same file with LibreOffice on Windows, it worked perfectly.
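If re-saving the file from a different editor fixes it, the underlying problem is usually line endings or encoding (Excel on macOS in particular can write unusual line endings). A minimal sketch that re-writes the file as UTF-8 with plain "\n" newlines; the file names are placeholders:

# Read the original CSV in text mode (universal newlines, BOM stripped if present).
with open("data.csv", "r", encoding="utf-8-sig") as src:
    rows = src.read().splitlines()

# Write a cleaned copy with UTF-8 encoding and standard "\n" line endings.
with open("data_clean.csv", "w", encoding="utf-8", newline="\n") as dst:
    dst.write("\n".join(rows) + "\n")

Uploading the cleaned copy to S3 and pointing the datasource at it is a quick way to rule out a formatting issue.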
I have a general understanding question. I am building a Flutter app that relies on a content library containing text files, LaTeX equations, images, PDFs, videos, etc.
The content lives on an AWS Amplify backend. Depending on how the user navigates in the app, the corresponding data is fetched and displayed.
I am not sure about the correct way to fetch the data. The current method (which works) is that the data is stored in an S3 bucket; when data is requested, it is downloaded to a temporary directory and then opened and processed in the app. This is actually not slow, but I feel it is not the way it should be done.
When data is downloaded, a file-transfer notification pops up, which bothers me because it is shown all the time. I would also like to read the data directly with something like a GET request, without downloading the file first (especially for text files, which I would like to read directly into a String). But I don't know how this works, because I don't see a way to store files with the other Amplify services like DataStore or the REST API. The S3 bucket is an intuitive way of storing data that is easy for the content creators at my company to use, so to me S3 seems to be the way to go. However, with S3 I have only figured out the download method for fetching data.
Could someone give me a hint about the correct approach for this use case? Thank you very much!
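For what it's worth, you do not have to go through a temporary file: an object can be read straight into memory, or fetched over HTTP via a short-lived pre-signed URL. A rough sketch of the pattern in Python with boto3 (the bucket and key names are placeholders); in Flutter, the equivalent idea would be getting a URL from Amplify Storage and issuing a normal HTTP get:

import boto3

s3 = boto3.client("s3")

# Read the object body directly into a string, with no temp file involved.
obj = s3.get_object(Bucket="my-content-bucket", Key="articles/intro.txt")
text = obj["Body"].read().decode("utf-8")

# Alternatively, hand out a short-lived pre-signed URL and fetch it with a plain GET.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-content-bucket", "Key": "articles/intro.txt"},
    ExpiresIn=3600,  # seconds the URL stays valid
)

Either way, the content stays in S3 (which keeps things easy for the content creators) and the app only touches the local file system when it actually needs a file on disk, such as for PDFs or videos.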
I am uploading a TSV file for processing on Google Colab. The file is 4 GB, and the upload has not completed after a very long time (hours). Any pointers would be a great help.
It could be your internet connection. Colab's upload function works best for small files such as .py scripts. For huge files, I'd suggest you upload the file to Google Drive in your account and then simply move or copy it to your Colab instance (after mounting Drive; see the sketch after the steps below):
1. Copy the file you want to use:
%cp "path/to/the file/file_name.extension" "path/to/your/google-colab-instance"
The Google Colab instance path is usually /content/
Similarly,
2. Move the file you want to use:
%mv "path/to/the file/file_name.extension" "path/to/your/google-colab-instance"
The first quoted path is the location in your Drive where you uploaded the file.
Hope this helps. Let me know in the comments.
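One step that is easy to miss: the Drive path is only visible to %cp/%mv after Drive has been mounted in the Colab VM. A minimal sketch (the Drive path and file name are placeholders):

from google.colab import drive

# Mount Google Drive into the Colab VM; it appears under /content/drive.
drive.mount('/content/drive')

# Then copy the uploaded file into the Colab working directory.
%cp "/content/drive/MyDrive/data/my_file.tsv" "/content/"

Since the file stays in Drive, you only pay the slow browser upload once; subsequent sessions just re-mount and copy.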
I have migrated a big data application to the cloud, and the input files are stored in GCS. The files can be in different formats such as txt, csv, avro, and parquet, and they contain sensitive data that I want to mask.
Also, I have read that there is a quota restriction on file size. In my case, a single file can contain 15M records.
I have tried the DLP UI as well as the client library to inspect those files, but it's not working.
GitHub page - https://github.com/Hitman007IN/DataLossPreventionGCPDemo
Under resources there are 2 files: test.txt works, and test1.txt, which is the sample file I use in my application, does not.
Google Cloud DLP just launched support last week for scanning Avro files natively.
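For the client-library side of the question, creating an inspection job that points at the GCS file is the usual approach for large files, since inline content inspection has much tighter size limits. A rough sketch with the Python google-cloud-dlp client, based on the public samples (the project ID, bucket path, and info types are placeholders):

from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()
parent = "projects/my-project/locations/global"  # placeholder project ID

inspect_job = {
    "storage_config": {
        "cloud_storage_options": {
            "file_set": {"url": "gs://my-bucket/test1.txt"},  # placeholder path
        }
    },
    "inspect_config": {
        "info_types": [{"name": "EMAIL_ADDRESS"}, {"name": "PHONE_NUMBER"}],
        "min_likelihood": dlp_v2.Likelihood.POSSIBLE,
    },
}

# Returns a long-running DlpJob; findings are read back once the job finishes.
job = dlp.create_dlp_job(request={"parent": parent, "inspect_job": inspect_job})
print(job.name)

De-identification (masking) of structured formats like Avro and Parquet is more involved; for those it is common to read the records with the data pipeline itself and call DLP on the extracted values.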
I created a console application that stores all user input in a .txt file.
Now I want to upload this .txt file to something like Dropbox, or it could be something else.
Could you give me some tips on how to do this? What should I look for?
I found this link to a Microsoft page that could be the solution to my problem: MS-LINK
In this case, I would need an ASP.NET website to upload the file to.
Is there another possibility, like uploading the file directly to Dropbox, for example?
This should be a comment, but my rep seems to be too low. Why don't you just save the file to the folder that Dropbox/Google Drive checks for files to sync with the cloud?
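If writing into the local sync folder is not enough and you really do want to push the file to Dropbox yourself, the Dropbox SDK can upload it directly (there is also a .NET SDK for a console app; here is a rough Python sketch of the same idea, with the access token and file paths as placeholders):

import dropbox

# Placeholder access token generated in the Dropbox App Console.
dbx = dropbox.Dropbox("ACCESS_TOKEN")

# Upload the local log file, overwriting any previous version in Dropbox.
with open("user_inputs.txt", "rb") as f:
    dbx.files_upload(
        f.read(),
        "/user_inputs.txt",
        mode=dropbox.files.WriteMode("overwrite"),
    )

The sync-folder approach needs no credentials in your app, while the API approach works on machines where the Dropbox client is not installed; pick whichever constraint matters more.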
I want to create a form to upload files (txt, xls) to the server, not the database.
Does anyone know of an example showing how I can do this?
In order to get the file onto the database server's file system, you would first have to upload the file to the database, which it sounds like you are already familiar with. From there, you can use the UTL_FILE package to write the BLOB to the database server's file system.