Google Analytics DataSet Type (CRMint) - google-cloud-platform

I am new to Google Analytics and I am using a tool called CRMint to import a custom audience into Google Analytics. A data scientist is using a model to predict whether one user is more likely to buy a product than another. Right now, I have a CSV file containing two columns: fullVisitorId and predictions.
On CRMint, I am using a job called "GaDataImporter" to import that CSV file into Google Analytics. As you can see in the picture below, I need to provide a GA Dataset ID.
I am currently trying to create a new Data Set from my Google Analytics dashboard, but I am not sure about the dataset type and the import behavior. Does anyone have any suggestions?

fullVisitorId is not an available dimension in Google Analytics (it exists only in the BigQuery export), so you cannot use it to link information to users in Google Analytics.
Instead, you should use the clientId, passed to Analytics as a custom dimension, and then use that as the key when importing the data as Custom Data.
(If you are new to Google Analytics this is not something that can be fully explained in a post, but the process described above is what you need.)
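For illustration, here is a rough Python sketch of how the CSV could be re-keyed on clientId before the CRMint import. The project, dataset and table names, the ga:dimension indexes, and the existence of a fullVisitorId-to-clientId mapping table are all assumptions about your setup, not something CRMint gives you out of the box:
from google.cloud import bigquery
import pandas as pd

bq = bigquery.Client(project="my-project")  # hypothetical project

# Join the predictions (assumed loaded into BigQuery) with a fullVisitorId -> clientId map
# built from the custom dimension that stores the clientId.
query = """
SELECT m.clientId AS client_id, p.predictions AS prediction
FROM `my-project.my_dataset.predictions` AS p
JOIN `my-project.my_dataset.visitor_client_map` AS m
  ON p.fullVisitorId = m.fullVisitorId
"""
df = bq.query(query).to_dataframe()

# GA Data Import expects the header row to use the dimension names of your Data Set schema.
df.columns = ["ga:dimension1", "ga:dimension2"]  # assumed dimension slots
df.to_csv("ga_import.csv", index=False)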

Related

Upload data from API into GCP dataprep

Is it possible to import data from a RESTful API directly into Dataprep?
I think there might be a couple of workarounds...
1: Save the results to a JSON file in a GCS bucket and import from there.
2: Import the results into a BigQuery table and then import into Dataprep from there.
It would be much smoother to just call an API and get a result set, as opposed to having to take an extra step. I just can't find anywhere that explains how to do this.
TIA!
Long story short: there's no real way to stream data directly into Dataprep. Even the new Dataprep Premium Edition expects that you'll have the data in some form of a database, though this does expand your options to Google Sheets, Salesforce, Oracle, Microsoft SQL Server, MySQL and PostgreSQL.
Personally, I've just gotten into the habit of writing directly into BigQuery (or into Firestore with a Firestore-to-BigQuery export) to get around this sort of thing. It also has the nice side effect of providing another kind of logging from your applications.
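As a rough sketch of that habit in Python (the endpoint, table name and schema are placeholders, and the target table is assumed to already exist):
import requests
from google.cloud import bigquery

API_URL = "https://example.com/api/orders"        # hypothetical REST endpoint
TABLE_ID = "my-project.staging.api_orders"        # hypothetical existing table

rows = requests.get(API_URL, timeout=30).json()   # assumes the API returns a JSON array of objects

client = bigquery.Client()
errors = client.insert_rows_json(TABLE_ID, rows)  # streaming insert; row keys must match the schema
if errors:
    raise RuntimeError(f"BigQuery insert errors: {errors}")
You can then point Dataprep (or anything else) at that table.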

How to update data in google cloud storage/bigquery for google data studio?

For context, we would like to visualize our data in Google Data Studio, and the dataset receives more entries each week. I have tried hosting our data sets in Google Drive, but it seems they're too large and this slows down Data Studio (the file is only 50 MB, am I doing something wrong?).
I have loaded our data into Google Cloud Storage --> Google BigQuery, and connected Data Studio to my BigQuery table. This has made the Data Studio dashboard much quicker!
I'm not sure what the best way is to update our data weekly in Google Cloud / BigQuery. I have found a slow way to do this by uploading the new weekly data to Google Cloud Storage and then appending the data to my table manually in BigQuery, but I'm wondering if there's a better (or at least more automated) way to do this.
I'm open to any suggestions, and if you think that BigQuery / Google Cloud Storage is not the answer for me, please let me know!
If I understand your question correctly, you want to automate the query that populates the table connected to Data Studio.
If this is the case, then you can use scheduled queries in BigQuery. A scheduled query lets you define a query whose results are written to a destination table. In particular, you can specify the repetition rule (minimum every 15 minutes), the execution schedule, and the destination write options (destination table and write mode: append or truncate).
In order to use scheduled queries your account must have the right permissions. You can have a look at the following documentation to better understand how to use scheduled queries [1].
Also, please note that on the front end the updated data in the BigQuery table will only appear in Data Studio on each refresh (clicking the refresh button in Data Studio). To refresh the front-end visualization automatically you can use the following plugin [2] or automate the click on the refresh button through browser console commands.
[1] https://cloud.google.com/bigquery/docs/scheduling-queries
[2] https://chrome.google.com/webstore/detail/data-studio-auto-refresh/inkgahcdacjcejipadnndepfllmbgoag?hl=en
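If you prefer to set this up in code rather than in the console, the same kind of scheduled query can be created with the BigQuery Data Transfer client. A minimal sketch, assuming placeholder project, dataset and table names and a placeholder query:
from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()

transfer_config = bigquery_datatransfer.TransferConfig(
    destination_dataset_id="reporting",                    # hypothetical dataset
    display_name="Weekly append for Data Studio",
    data_source_id="scheduled_query",
    schedule="every monday 09:00",
    params={
        "query": "SELECT * FROM `my-project.staging.weekly_upload`",  # placeholder query
        "destination_table_name_template": "dashboard_table",
        "write_disposition": "WRITE_APPEND",               # append rather than truncate
    },
)

client.create_transfer_config(
    parent=client.common_project_path("my-project"),
    transfer_config=transfer_config,
)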

Pulling Instagram data into Google Big Query

I am new to development, so I am sorry if this is a really basic question. I am trying to access some of the data available from Instagram's API as documented here: https://developers.facebook.com/docs/instagram-api/insights.
I would like some kind of data repository to pull the data into, so I am looking at Google BigQuery to see if I can pull in the data. (The ultimate destination will be Power BI so I can publish online.)
Looking at the Facebook request code, is it possible to put this into Google BigQuery to return the data?
I am replacing the 'instagram-business-user-id' with an ID I have already generated, but it feels like it perhaps needs more markup to let BigQuery know what language it is in.
Any help would be much appreciated.
GET graph.facebook.com/{instagram-business-user-id}/insights
?metric=impressions,reach,profile_views
&period=day
Looking at the Facebook request code, is it possible to put this into Google BigQuery to return the data?
Yes, it's absolutely possible using the BigQuery API or the BigQuery CLI.
You can use this pseudo-workflow as an example (using the BigQuery API):
Create a table in BigQuery with the desired schema. For this you have two options:
Save the result in one column containing the full JSON; this means your SELECT needs to use JSON_EXTRACT to fetch specific fields.
Process the JSON in your code and save it in specific columns to simplify the SELECT statement.
Call Instagram's API.
Call the BigQuery API or the BigQuery CLI to insert the data; this link provides one option for how to do this.
Call the BigQuery API or the BigQuery CLI to fetch the data; this link provides one option for how to do this.
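For illustration, a minimal Python sketch of that workflow using the first option (raw JSON in a single column); the access token, Instagram business user ID and table are placeholders:
import json
import requests
from google.cloud import bigquery

IG_USER_ID = "17841400000000000"                 # your instagram-business-user-id (placeholder)
ACCESS_TOKEN = "EAAG..."                         # a valid access token (placeholder)
TABLE_ID = "my-project.social.ig_insights_raw"   # existing table with a single STRING column "payload"

resp = requests.get(
    f"https://graph.facebook.com/{IG_USER_ID}/insights",
    params={
        "metric": "impressions,reach,profile_views",
        "period": "day",
        "access_token": ACCESS_TOKEN,
    },
    timeout=30,
)
resp.raise_for_status()

client = bigquery.Client()
errors = client.insert_rows_json(TABLE_ID, [{"payload": json.dumps(resp.json())}])
if errors:
    raise RuntimeError(errors)
You can then use JSON_EXTRACT in your SELECT to pull out the individual metrics, or flatten the response in your code first and load it into typed columns.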

How can I use a parameter in a MS Power Bi web data source string?

I have a URL that returns a JSON object with everything I need for my Power BI embedded report. I get the data for the report by adding a new web data source and pasting the URL in. A few transformations later and tada! Sexy report. The report shows lots of charts and graphs etc. However, I need to be able to change the data source URL depending on who is looking at it.
The report shows data for a single organization. You can only look at it if you're in that organization. How can I pass an organization's ID when embedding the report so that the data source will show different data?
For example, if my data source is defined in the originating .pbix as
Json.Document(Web.Contents("http://www.testdata.com/api/json?orgId=1"))
how can I change it to
Json.Document(Web.Contents("http://www.testdata.com/api/json?orgId=2"))
when I pull the report to embed on a page?
I know you can filter data, but that means I have to make the data source URL pull ALL the data, which would be huge and intensive just to have Power BI filter most of it out.
In short, I'm embedding a report on a website and that report's only way to get data is via a JSON endpoint. That endpoint requires the org ID of the user, so how do I pass it to Power BI, which in turn uses it in the data source URL?
Your only option for this scenario is to pull all the required data into your dataset. Then you can use either Row-Level Security (RLS) or the new JS API to filter the data for each user.
You should probably look at an Azure SQL data source as a more efficient, flexible and scalable back end for Power BI Embedded.
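If you go the RLS route, here is a rough sketch of how the embedding backend could pass the caller's organization through as an effective identity when requesting the embed token. The workspace/report/dataset IDs, the role name and the AAD token acquisition are all placeholders for your tenant, and this assumes an RLS role in the dataset that filters on USERNAME():
import requests

AAD_TOKEN = "eyJ..."            # Azure AD access token for the Power BI REST API (placeholder)
GROUP_ID = "<workspace-id>"
REPORT_ID = "<report-id>"
DATASET_ID = "<dataset-id>"

resp = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/groups/{GROUP_ID}/reports/{REPORT_ID}/GenerateToken",
    headers={"Authorization": f"Bearer {AAD_TOKEN}"},
    json={
        "accessLevel": "View",
        "identities": [
            {
                "username": "2",           # the org id your RLS rule reads via USERNAME()
                "roles": ["OrgFilter"],    # hypothetical RLS role defined in the dataset
                "datasets": [DATASET_ID],
            }
        ],
    },
    timeout=30,
)
resp.raise_for_status()
embed_token = resp.json()["token"]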

BigQuery API - retrieving rows limited

I'm trying to retrieve the result of a query with aggregates, based on the GA sessions tables, using the BigQuery API in Python, and then push it to my data warehouse.
Issue: I can only retrieve 8333 records of the aforementioned query result.
But there are always 40k+ records on any day of the year.
I tried setting 'allowLargeResults': True.
I read I should extract everything to Google Cloud Storage first and then retrieve it...
I also read somewhere in the Google docs that I might only be getting the first page?!
Has anybody faced the same situation?
See the section on paging through results in the BigQuery docs: https://cloud.google.com/bigquery/docs/data#paging
Alternatively, you can export your table to Google Cloud Storage: https://cloud.google.com/bigquery/exporting-data-from-bigquery
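For reference, a small Python sketch with the google-cloud-bigquery client, where the result iterator handles the pageToken plumbing so you get every row rather than just the first page (project and query are placeholders):
from google.cloud import bigquery

client = bigquery.Client(project="my-project")   # placeholder project

query = """
SELECT date, channelGrouping, COUNT(*) AS sessions
FROM `my-project.my_ga_dataset.ga_sessions_*`
GROUP BY date, channelGrouping
"""

rows = client.query(query).result(page_size=10000)  # RowIterator fetches subsequent pages transparently
all_rows = [dict(row) for row in rows]
print(len(all_rows))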