In PowerBI, I'd like to get data from a website requiring authentication (http://kdp.amazon.com/). Going to New Source, Web, Advanced, doesn't show me anything that looks promising. Hopefully I'm missing something.
My ideal would be to go to a specific webpage (post authentication), and click on a link that allows me to download an excel spreadsheet.
Thanks for any ideas/pointers.
It depends, and chances are slim for your case.
If it is a direct URL to where the data or file resides (e.g. the data is on the page, a file link, or a web API endpoint), then it depends on what kind of authentication method the website uses, and whether you can provide the credentials through the Web.Contents options (commonly used for web API authentication).
If it requires further navigation (e.g. click, type in info) to access the data / file after the authentication, then the answer is no.
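To illustrate the direct-URL case outside Power BI, here is a minimal Python sketch that attaches credentials to a single request. It assumes the site accepts HTTP Basic authentication, which a form-based login page will not; the URL, username, and password are hypothetical placeholders.

```python
import base64
import urllib.request


def build_authenticated_request(url, username, password):
    """Build a GET request that carries an HTTP Basic Authorization header."""
    # Basic auth is "username:password", base64-encoded.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


if __name__ == "__main__":
    # Hypothetical endpoint and credentials, for illustration only.
    req = build_authenticated_request(
        "https://example.com/api/report.xlsx", "user", "secret"
    )
    # The request is only built here; urllib.request.urlopen(req) would send it.
    print(req.get_header("Authorization"))
```

If the server answers a request like this with the file or data, the same credentials can likely be supplied through the Web connector; if it redirects to a login form instead, you are in the second case.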
That type of data scraping can be accomplished using a headless browser and a scripting/macro engine.
For example, xvfb (X virtual framebuffer) + Firefox + iMacros. I do consider this beyond Power BI's capabilities. If you wish to pursue this further, here are some references:
https://en.wikipedia.org/wiki/Xvfb
https://addons.mozilla.org/en-us/firefox/addon/imacros-for-firefox/
Again, similar but using an alternate toolset:
http://scraping.pro/use-headless-firefox-scraping-linux/
BTW, having done this once or twice before - this is not a great value proposition. If you have to resort to this sort of tactic, it may be time to consider why the developers didn't expose this functionality to you in an API - maybe there is a good reason?
Related
My question is: I'm developing a website and I want to monitor its traffic with Google Analytics. However, I've been reading articles about cookies, and I can't work out whether I need to program some kind of cookie handling into my website in order to use Google's tool, or whether I don't need to do anything on my website at all.
Thanks
To do tracking you simply need to insert the code snippet that you can get from the GA admin interface.
However since you are in the EU you need to point out to your visitors that they are being tracked on your web page and that the site uses cookies to do so (and I think you need to provide an opt-out, although that might be a German thing). This is mandated by the European Privacy directive, which is sometimes referred to as "Cookie Law" (technically incorrect, since it is neither a law nor specifically about cookies), so maybe this gave you the idea that you need to do extra programming.
I am writing this question after considerable investigation into this matter.
I have gone through Google's easy dashboards (gadash JS library), superProxy and plain analytics API, and couldn't find the best solution for my needs, although I can't believe my needs are so uncommon.
This is why I am turning to you, I have got a feeling I am missing something.
My requirement:
Display my own analytics account data to users on my website, preferably with Google's chart API or ga-dash, to resemble google analytics views as much as possible.
Users will not have to take part in authentication with Google API
Each user has his own query which is built dynamically !! (this is probably why superProxy cannot work for me because I think you need to manually set the queries in advance)
I use django-python as the basis for my website
problems with solutions I tried:
GAdash library - the problem is that each user has to be authenticated and is then shown their own data, meaning they would need access to my profile - that's simply not what I am looking for. It works great, but only for me. On the other hand, if there was a way to make my profile truly public...
superProxy - sounds like a solution for this need exactly, however I don't think that you can programmatically set the queries.
I did find a way to retrieve the data for a query on the server side using my own credentials, which is a bit hacky, but I am still missing a JS library that will parse this XML on the client side and display it as charts.
EDIT:
I ended up using Mark's solution (embeddedanalytics), since I could not find a better, easier solution.
Other alternatives were:
1. superProxy (lacks the ability to load new queries dynamically and programmatically)
2. gaDash library - requires authentication from each user
3. Implement my own server side querying, and display to the user with some js graphics library - which would require considerable work on my side.
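For a sense of what option 3 involves, here is a rough Python sketch of the server-side half: converting a report into the JSON literal shape that Google Charts' DataTable accepts. It assumes the report has already been fetched with your own credentials and is in the JSON form with `columnHeaders` and `rows` keys rather than XML; the function name and the type mapping are my own assumptions.

```python
import json

# Assumed mapping from the report's data types to Google Charts column types.
_TYPE_MAP = {
    "STRING": "string",
    "INTEGER": "number",
    "FLOAT": "number",
    "PERCENT": "number",
    "TIME": "number",
    "CURRENCY": "number",
}


def to_datatable(report):
    """Convert a report dict into a Google Charts DataTable-style dict."""
    cols = [
        {
            "id": header["name"],
            "label": header["name"].replace("ga:", ""),
            "type": _TYPE_MAP.get(header.get("dataType"), "string"),
        }
        for header in report.get("columnHeaders", [])
    ]
    rows = []
    for raw_row in report.get("rows", []):
        cells = []
        for col, value in zip(cols, raw_row):
            # Report values arrive as strings; charts need real numbers.
            if col["type"] == "number":
                value = float(value)
            cells.append({"v": value})
        rows.append({"c": cells})
    return {"cols": cols, "rows": rows}


if __name__ == "__main__":
    # Hand-written sample data, not real analytics output.
    sample = {
        "columnHeaders": [
            {"name": "ga:date", "dataType": "STRING"},
            {"name": "ga:visits", "dataType": "INTEGER"},
        ],
        "rows": [["20140101", "12"], ["20140102", "30"]],
    }
    print(json.dumps(to_datatable(sample), indent=2))
```

A Django view could return this dict as JSON for each user's dynamically built query, leaving only the chart drawing to the client. Authentication, caching, and quota handling are left out, and those are where most of the "considerable work" would go.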
Check out www.embeddedanalytics.com. This is a platform/service which will do exactly what you are looking to do (disclosure - I work with them).
We also support your requirement that each user has their own dynamically built query. This is what we call our CMS Integration version. Are you trying to create a dashboard system for a CMS you have built?
I am working on a system that needs to associate URLs with data based on keywords. I was hoping I could use a web service to automatically perform full-web searches based on keywords or tags, and the results would be in a machine-friendly format like JSON.
My first thought was Google, and their Google Custom Search service looks pretty good, and has proven itself in tests. It has a simple REST-like URL and returns results in JSON format. The only problem is that it has a limit of 100 queries per day. I need more like 1000. Their higher-quota pay option (Google Site Search) does not allow full-web searches, so is useless to me.
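For reference, the REST-like URL and JSON result can be sketched in Python as follows. This assumes the Custom Search JSON API endpoint with its `key`, `cx`, `q`, and `start` parameters and a response whose results sit under an `items` list; the key and engine ID shown are placeholders, and the helper names are my own.

```python
import json
from urllib.parse import urlencode

# Custom Search JSON API endpoint (assumed unchanged; check the current docs).
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, engine_id, query, start=1):
    """Build the GET URL for one page of search results."""
    params = {"key": api_key, "cx": engine_id, "q": query, "start": start}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def extract_links(response_text):
    """Pull the result URLs out of a JSON response body."""
    payload = json.loads(response_text)
    # A search with no hits has no "items" key at all.
    return [item["link"] for item in payload.get("items", [])]


if __name__ == "__main__":
    # Placeholder credentials; each real request counts against the daily quota.
    print(build_search_url("YOUR_API_KEY", "YOUR_ENGINE_ID", "power bi"))
    # Hand-written sample body, not a real API response.
    print(extract_links('{"items": [{"title": "Example", "link": "http://example.com/"}]}'))
```

Since every call to the URL counts toward the quota, it is worth caching results per keyword on your side.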
Surely others have wanted to do programmatic web searches before. Does Google offer another B2B search service that we could use? We are happy to pay per query, sign agreements, etc. I fear I am not looking in the right place on Google's site.
As I wrote this question I found Microsoft's Bing web services home page. At first blush it looks pretty good. I have a slight preference for Google, but am open to Microsoft. I would love to hear any advice about using Microsoft's APIs.
Google custom search offers a 'pay for >100 queries' option, I believe:
https://developers.google.com/custom-search/v1/overview
(see 'paid usage' section at the bottom)
@Sync found the right way in, and I believe I now understand the problem: Google has two control panels for custom search, and you can't get to one from the other.
I was on the panel for my Google Custom Search engine (www.google.com/cse/panel), which gives me control over low-level aspects of my search engine, and the only pay option was to convert to Google Site Search, but in so doing I would lose my full-web search power.
There is another, higher-level, control panel for all of Google's APIs (code.google.com/apis/console), of which Custom Search is a component. And from here, setting up billing to get a larger quota is clearly linked.
Sorry I am not providing proper links, as the relevant pages require login to access. While I consider this answer to be the authoritative one for my question, I am giving the green checkmark to @sync, without whose help I would not have been able to figure it out. I'd still love to see some comments on Bing's APIs, however!
I have found other solutions like this, but this example collects raw data. I need something a little more processed: user-centric analytics. The statistics are for advertising purposes, and I want to collect as much information as possible.
That method might add a lot of overhead to each page load, especially if you have a busy site / database server. I would recommend using Google Analytics on the template of each page instead. At least the analytics data is handled after the user's page load.
I believe Google has an API if you need to take the information from Analytics and display it on your site.
I've been poring over Facebook's Graph API, but I can't seem to find any documentation on retrieving the comments users leave when they Like an Open Graph object (in my case, a web page with a Like button).
At a minimum, I would assume that there would be some way to retrieve comments from users whose privacy settings allow anybody to see their updates (I guess this might require me to know the UID of everybody who Likes my web page). Alternatively, there might be some method where user comments may be associated with the Open Graph object's stream instead.
Anybody?
To see a user's wall feed via the API (even a public one), you will need to have them authenticate your app. Yeah, it's pretty crappy not to be able to query their public messages, but it is a current limitation of Facebook's API.