Azure Data Factory HDFS dataset preview error

I'm trying to connect to HDFS from ADF. I created a folder and a sample file (ORC format) and put the file in the newly created folder.
In ADF I then successfully created a linked service for HDFS using my Windows credentials (the same user that was used to create the sample file).
But when trying to browse the data through a dataset, I get an error: "The response content from the data store is not expected, and cannot be parsed."
Is there something I'm doing wrong, or is it some kind of permissions issue?
Please advise.

This appears to be a generic issue: you need to point the dataset at a file with the appropriate extension rather than at the folder itself. Also make sure you are using an activity that supports this data store.
You can follow the official Microsoft documentation on using an HDFS server with Azure Data Factory.
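As a hedged illustration (the dataset, linked service, folder, and file names below are placeholders, not taken from the question), a minimal ORC dataset definition that points at a concrete file rather than a folder might look like this:

{
    "name": "HdfsOrcFile",
    "properties": {
        "type": "Orc",
        "linkedServiceName": {
            "referenceName": "HdfsLinkedService",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "HdfsLocation",
                "folderPath": "myfolder",
                "fileName": "sample.orc"
            }
        }
    }
}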

Related

Correct way to fetch data from an AWS server into a Flutter app?

I have a general understanding question. I am building a Flutter app that relies on a content library containing text files, LaTeX equations, images, PDFs, videos, etc.
The content lives on an AWS Amplify backend. Depending on the user's navigation in the app, the corresponding data is fetched and displayed.
I am not sure about the correct way of fetching the data. The current method (which works) is that the data is stored in an S3 bucket; when data is requested, it is downloaded to a temporary directory and then opened and processed in the app. This is actually not slow, but I feel it is not the way it should be done.
When data is downloaded, a file transfer notification pops up, which bothers me because it is shown all the time. I would also like to read the data directly with something like a GET request, without downloading the file first (especially for text files, which I would like to read directly into a String). But I don't know how that works, because I don't see a way to store files with the other Amplify services like DataStore or the REST API. On the other hand, the S3 bucket is an intuitive way of storing data that is easy for the content creators at my company to use, so S3 seems to be the way to go. However, with S3 I have only figured out the download method for fetching data.
Could someone give me a hint on the correct approach for this use case? Thank you very much!
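To illustrate the "read directly" idea the question asks about: an S3 object body can be fetched straight into memory without a temporary file. Below is a minimal Python/boto3 sketch with hypothetical bucket and key names; in Flutter the equivalent would be obtaining a URL for the object via Amplify Storage and issuing an HTTP GET.

import boto3

# Hypothetical bucket and key; substitute the Amplify-managed bucket.
s3 = boto3.client("s3")
response = s3.get_object(Bucket="my-content-bucket", Key="texts/chapter1.txt")

# Read the body straight into a string -- no temporary file on disk.
text = response["Body"].read().decode("utf-8")
print(text[:200])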

Passing test data to requests in a Postman monitor

I run my collection using test data from a CSV file; however, there is no option to upload the test data file when adding a monitor for the collection. Searching the internet, I could see that the test data file has to be provided via a URL (saved in the cloud, e.g. Google Drive), but I couldn't find a source for how to provide this URL to the collection. Can anyone please help?
https://www.postman.com/praveendvd-public/workspace/postman-tricks-and-tips/request/8296678-d06b3fc0-6b8b-4370-9847-aee0f526e7db
You cannot use a CSV file in a monitor, but you can store the content of the CSV as a variable and use that to drive the monitor. An example can be seen in the public workspace linked above.
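One way to prepare such a variable (a hedged sketch, not taken from the linked workspace; the file and variable names are assumptions): convert the CSV rows to a JSON array offline, paste the output into a collection variable such as testData, and have a pre-request script parse it to drive each iteration.

import csv
import json

# Convert the monitor's test data CSV into a JSON array. Paste the printed
# output into a collection variable (e.g. "testData"); a pre-request script
# can then parse it and feed one row per request.
with open("testdata.csv", newline="") as f:
    rows = list(csv.DictReader(f))

print(json.dumps(rows, indent=2))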

Download JSON in Informatica Application Integration

I am new to ICAI (Informatica Cloud Application Integration), and I have a requirement:
a. Create a service.
b. The user will upload a JSON file using this web service.
c. The JSON file will be downloaded and saved locally.
The solution path I was taking is: create a process which will accept 2 inputs (some generic text and the JSON file), which generated the below URL.
I tested the same in Postman and it is working fine, but I am not able to download the JSON to any location on the Informatica server.
Final solution, based on the feedback from Maciejg.
Steps taken:
a. Create a FileWriter app connection and set it up only for "eventtarget".
b. Create a process.
c. In Start, create an input field of type attachment.
d. In Start, create a temp field of type FileWriter connection.
e. Add an Assignment task.
f. In the Assignment task, add a field temp -> Content Format of type content -> attachment.
g. In the same Assignment task, add another field temp -> File Name of type formula.
The above steps are enough to save the uploaded file; if required, other steps (file type checks, authentication, etc.) can be added.
It seems you need to use a FileWriter Service. Check out this knowledgebase article for details.
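For reference, such a process can also be exercised outside Postman. Below is a minimal sketch using Python's requests library; the endpoint URL, field names, and credentials are placeholders, since the actual generated URL is not shown above.

import requests

# Placeholder endpoint; use the URL generated for your ICAI process.
url = "https://<pod>.ai.dm-us.informaticacloud.com/active-bpel/rt/<process-name>"

# The process accepts two inputs: some generic text and the JSON file.
data = {"genericText": "hello"}
files = {"jsonFile": ("data.json", open("data.json", "rb"), "application/json")}

resp = requests.post(url, data=data, files=files, auth=("user", "password"))
print(resp.status_code, resp.text)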

GCP AI Notebook can't access storage bucket

New to GCP. Trying to load a saved model file into an AI Platform notebook. Tried several approaches without success.
Most obvious approach seemed to be to set the value of a variable to the path copied from storage:
model_path = "gs://<my-bucket>/models/3B/export/1600635833/saved_model.pb"
Results: OSError: SavedModel file does not exist at: (the above path)
I know I can connect to the bucket and retrieve contents because I downloaded a csv file from the bucket and printed out the contents.
The OSError sounds to me like you are trying to access the GCS bucket with a regular file system API, which does not support gs:// paths (example: Python's open() function).
To access files in GCS, I recommend you use the client libraries: https://cloud.google.com/storage/docs/reference/libraries
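For example, a minimal sketch with the Python client library (google-cloud-storage), downloading the model file to local disk first and loading it from there; the bucket placeholder is the one from the question:

from google.cloud import storage

# Download the SavedModel file from GCS to the notebook's local disk;
# plain file APIs such as open() cannot read gs:// paths.
client = storage.Client()
bucket = client.bucket("<my-bucket>")  # placeholder bucket name
blob = bucket.blob("models/3B/export/1600635833/saved_model.pb")
blob.download_to_filename("saved_model.pb")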
Another option for testing is to connect over SSH and use the gsutil command (for example, gsutil cp to copy the file to local disk).
Note: I assume <my-bucket> was edited to replace your real GCS bucket name.
According to the GCP documentation linked here, you are able to access Cloud Storage; that page will guide you through using Cloud Storage with AI Platform Training.

Using S3 as target for AWS DMS: Uploaded File name doesn't change

We are using DMS to get data from SQL Server and load it into an S3 bucket, after which the data is loaded into Snowflake using Snowpipe (full load).
Now, for Snowpipe to know there is new data in the S3 bucket, the file name needs to be different from the last one. I have tried all the available task setting options (DROP_AND_CREATE, DO_NOTHING, TRUNCATE) to get a different file name, but it is still not working; it loads the file as LOAD00000001.csv every time.
The documentation says the file name will be incremental (e.g. LOAD00000001.csv, LOAD00000002.csv, and so on), but that is not happening, which is why Snowpipe is not able to register the changes.
https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.S3.html
Can someone please help?
With DMS, the incremental counter starts over from 1 each time the task is run; it does not have a "don't overwrite existing objects" feature.
Your best bet may be to handle the load yourself, either by looking for updated object timestamps in your folder or by setting up S3 event notifications.
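For the event notification route, one hedged sketch is a small Lambda function (Python/boto3) triggered by s3:ObjectCreated:* on the DMS output prefix, copying each new file to a uniquely named key that Snowpipe watches; the target prefix here is an assumption:

import time
import urllib.parse

import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # Invoked by an s3:ObjectCreated:* notification on the DMS output prefix.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # Copy LOAD00000001.csv to a timestamped key so each task run
        # produces a file name Snowpipe has not seen before.
        new_key = "snowpipe/%d-%s" % (int(time.time()), key.rsplit("/", 1)[-1])
        s3.copy_object(
            Bucket=bucket,
            Key=new_key,
            CopySource={"Bucket": bucket, "Key": key},
        )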