I'm writing my own service to track my growing library and notify me when books become available. I'm in the middle of 5 series, waiting for the next book in each to come out. I also pick some up locally and would like to grab similar titles from Amazon. How can I get my purchase history and similar titles? Is there an API for either? I haven't found anything from searches.
I don't think Amazon exposes an API for order history.
The closest thing seems to be the Product Advertising API: http://docs.aws.amazon.com/AWSECommerceService/latest/DG/Welcome.html
That would allow you to search for items, for example using ItemSearch:
http://docs.aws.amazon.com/AWSECommerceService/latest/DG/ItemSearch.html
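If you go the PA API route, here's a minimal sketch using the third-party bottlenose wrapper (the credentials are placeholders; you need an AWS key pair plus an Associates tag to sign requests):

```python
# Minimal ItemSearch sketch via the third-party "bottlenose" wrapper for
# the Product Advertising API. All three credentials are placeholders.
import bottlenose
import xml.etree.ElementTree as ET

amazon = bottlenose.Amazon(
    "AWS_ACCESS_KEY_ID",      # placeholder
    "AWS_SECRET_ACCESS_KEY",  # placeholder
    "ASSOCIATE_TAG",          # placeholder
)

# Search the Books index for a series you're following.
response = amazon.ItemSearch(
    SearchIndex="Books",
    Keywords="name of the next book in your series",
    ResponseGroup="Small",
)

# The API returns raw XML; print the item titles.
root = ET.fromstring(response)
for el in root.iter():
    if el.tag.endswith("}Title"):
        print(el.text)
```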
Alternatively, you could probably write a script to scrape the data by navigating through the order history pages, or one that helps you capture each page of results as you manually navigate your order history. You're on your own for this option, though.
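If you do try the scraping route, the skeleton might look something like this; note that the order-history URL, the paging parameter, and the CSS selector are all guesses that would need adjusting against the live pages, and you'd have to reuse an authenticated session (e.g. cookies copied from a logged-in browser):

```python
# Rough order-history scraping sketch. URL pattern, paging parameter, and
# selectors are assumptions; Amazon requires a logged-in session.
import requests
from bs4 import BeautifulSoup

session = requests.Session()
session.cookies.update({"session-id": "..."})  # copied from a logged-in browser

page = 0
while True:
    resp = session.get(
        "https://www.amazon.com/gp/css/order-history",
        params={"startIndex": page * 10},  # assumed paging parameter
    )
    soup = BeautifulSoup(resp.text, "html.parser")
    orders = soup.select(".order")  # assumed selector for one order block
    if not orders:
        break
    for order in orders:
        print(order.get_text(" ", strip=True))
    page += 1
```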
I am working with my team to prep a project for a potential client. We've researched the Amazon MWS API, and we're trying to develop an algorithm using data pulled from it.
Just want to make sure we understand the research correctly:
Is it possible to scrape data from Amazon.com like the plugins RevSeller or HowMany do? Then can we add that data to a database for use in an algorithm to determine whether or not an Amazon reseller should invest in reselling a product?
Thanks!
I am doing a similar project. I don't know the specifics of RevSeller or HowMany, but another very popular plugin is Amzpecty. If you use a tool like Fiddler, you can see the HTTP traffic and figure out what it does. It basically scrapes the ASINs and offer listing IDs from the page you are looking at and calls the Amazon Product Advertising API for each one; note that this is not the same thing as MWS. From the data returned, it produces a nice overlay that tells you all kinds of important stuff.
Instead of a browser plugin, I'm just writing an app that makes HTTP calls to the PA API from a list of ASINs, and then I run the results through my own algorithms. Hope that gives you a starting point.
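In case it helps, the core of that app is only a few lines; here's a sketch with the third-party bottlenose wrapper (the credentials and ASINs are placeholders):

```python
# Walk a list of ASINs and call the PA API's ItemLookup for each one.
import bottlenose

amazon = bottlenose.Amazon("ACCESS_KEY", "SECRET_KEY", "ASSOCIATE_TAG")  # placeholders

asins = ["B00EXAMPLE1", "B00EXAMPLE2"]  # placeholder ASINs
for asin in asins:
    xml = amazon.ItemLookup(ItemId=asin, ResponseGroup="Offers")
    # ...parse the XML and feed it into your own algorithms
    print(asin, len(xml), "bytes of XML returned")
```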
I'm trying to help an animal shelter deliver faster updates when a new pet is added to their website. This is likely to happen between 0-20 times a day.
The website is a simple data dump: animals are in tables with row delineation (easy to parse) and have unique IDs. When a new pet is added, ideally this would trigger a mobile notification to subscribed users (it could also be an email message). The faster updates are sent, the better, but checking every 30 minutes or so would be fine. Because this is for a charity, I want to spend as little as possible on resources, and I also want to be able to scale this up for other shelters that might want to use it.
For instant mobile notifications, Twitter seems to be a good candidate, and it looks like my needs won't run into fees or restrictions.
The part that I'm stuck on is how best to ping the site for updates and publish those updates to twitter. The two options I've come up with are:
Build my own system. Use a web crawler like Scrapy to periodically crawl the site and check for new pet IDs; using AWS, I think I could get by with a nano instance (~$57 a year), and using DynamoDB to cache existing pet IDs seems like a small additional cost. Then use the Twitter API to post updates (a rough sketch follows this list).
Use an RSS feed generator like Feedity, which has an API integrated with Twitter. These seem pretty expensive, though: Feedity is $180/year for hourly updates and $390/year for 15-minute updates.
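For concreteness, option 1 could be as small as a script run from cron every 30 minutes. Here is the kind of thing I have in mind, where the shelter URL, the table row/ID selectors, and the Twitter credentials are placeholders, and a local JSON file stands in for DynamoDB:

```python
# Check the shelter page for new pet IDs and tweet any additions.
import json
import requests
import tweepy
from bs4 import BeautifulSoup

SHELTER_URL = "https://example-shelter.org/pets"  # placeholder
SEEN_FILE = "seen_pets.json"

def fetch_pets():
    soup = BeautifulSoup(requests.get(SHELTER_URL).text, "html.parser")
    # Assumes each pet is a table row whose id attribute is its unique pet ID.
    return {row["id"]: row.get_text(" ", strip=True)
            for row in soup.select("table tr[id]")}

try:
    seen = set(json.load(open(SEEN_FILE)))
except FileNotFoundError:
    seen = set()

pets = fetch_pets()
new_ids = set(pets) - seen

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")  # placeholders
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")         # placeholders
api = tweepy.API(auth)

for pet_id in sorted(new_ids):
    api.update_status(f"New pet at the shelter: {pets[pet_id]}"[:280])

with open(SEEN_FILE, "w") as f:
    json.dump(sorted(seen | new_ids), f)
```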
I'd like to know if there are any better/simpler/cheaper/more obvious options I may be overlooking. Thanks!
I operate a number of content websites that have several million user sessions and need a reliable way to monitor some real-time metrics on particular pieces of content (key metrics being: pageviews/unique pageviews over time, unique users, referrers).
The use case here is for the stats to be visible to authors/staff on the site, as well as to act as source data for real-time content popularity algorithms.
We already use Google Analytics, but this does not update quickly enough (4-24 hours depending on traffic volume). Google Analytics does offer a real-time reporting API, but this is currently in closed beta (I have requested access several times, but no joy yet).
New Relic appears to offer a few analytics products, but they are quite expensive ($149 per 500k pageviews, and we have several times that volume).
Other answers I found on StackOverflow suggest building your own, but this was 3-5 years ago. Any ideas?
I've heard some good things about Woopra, and they offer 1.2m pageviews for the same price as New Relic.
https://www.woopra.com/pricing/
If that's too expensive, the alternative is live-loading your logs and using an Elasticsearch service to read them and extract the data you want, but you will need access to your logs while they are being written to.
A service like Loggly might suit you; it would enable you to "live tail" your logs (view them while they are being written), but again there is a cost to that.
Failing that, you could do something yourself, or get someone on Freelancer to knock something up for you that reads the logs and displays them in a format you recognise.
https://www.portent.com/blog/analytics/how-to-read-a-web-site-log-file.htm
If the metrics you need to track are limited to the ones you have listed (pageviews, unique users, referrers), you might consider collecting your web servers' logs and using a log analyzer.
There are several free tools available on the Internet to get real-time statistics out of those logs.
Take a look at www.elastic.co, for example.
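To get a feel for what falls out of the raw logs before committing to any tool, a quick pass over an Apache/nginx "combined"-format access log covers all three metrics (the log path is an assumption about your setup):

```python
# Count pageviews, unique client IPs, and top referrers from an access log.
import re
from collections import Counter

LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d+) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

pageviews = 0
ips = set()
referrers = Counter()

with open("/var/log/nginx/access.log") as f:  # assumed log location
    for line in f:
        m = LINE.match(line)
        if not m:
            continue
        pageviews += 1
        ips.add(m["ip"])
        referrers[m["referrer"]] += 1

print(pageviews, "pageviews,", len(ips), "unique IPs")
print("top referrers:", referrers.most_common(5))
```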
Hope this helps!
Google Analytics offers real-time data viewing now, if that's what you want:
https://support.google.com/analytics/answer/1638635?hl=en
I believe their API is now released, as we are looking at incorporating this ourselves!
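For what it's worth, here is a sketch against the v3 Real Time Reporting API using google-api-python-client; the view ID and service-account key file are placeholders, and access may still be gated per account:

```python
# Query active users per page path from the GA Real Time Reporting API (v3).
from googleapiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials

creds = ServiceAccountCredentials.from_json_keyfile_name(
    "service-account.json",  # placeholder key file
    ["https://www.googleapis.com/auth/analytics.readonly"],
)
analytics = build("analytics", "v3", credentials=creds)

result = analytics.data().realtime().get(
    ids="ga:12345678",        # placeholder view ID
    metrics="rt:activeUsers",
    dimensions="rt:pagePath",
).execute()

for path, users in result.get("rows", []):
    print(path, users)
```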
If you have access to your web server logs, then you can set up the ELK stack: Elasticsearch as the search engine, Logstash as the log parser, and Kibana as the front-end tool for analyzing the data.
For more information, see the Elasticsearch site linked above (www.elastic.co).
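Once logs are flowing through Logstash into Elasticsearch, Kibana covers most of the dashboarding, but you can also query the indices directly. Here is a sketch with the official Python client, where the index pattern and field name assume Logstash defaults:

```python
# Top requested pages in the last 5 minutes, straight from Elasticsearch.
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])  # assumed local cluster

result = es.search(
    index="logstash-*",  # Logstash's default index pattern
    body={
        "size": 0,
        "query": {"range": {"@timestamp": {"gte": "now-5m"}}},
        "aggs": {"by_page": {"terms": {"field": "request.keyword", "size": 10}}},
    },
)
for bucket in result["aggregations"]["by_page"]["buckets"]:
    print(bucket["key"], bucket["doc_count"])
```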
I am sure this question may seem a bit lacking, but I literally do not know where to begin. I want to develop a solution that will allow me to manage ALL of my Amazon and Rakuten/Buy.com inventory from my own website.
My main concern is keeping the inventory in sync, so the process would be as follows:
1. Fetch Amazon orders sold today
   a. Subtract the respective quantities
2. Fetch Rakuten orders sold today
   a. Subtract the respective quantities
3. Update the internal DB of products
   a. Send out updated feeds to Amazon and Rakuten.
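In pseudocode terms, this is the flow I have in mind (the fetchers and feed pushers below are hypothetical stubs standing in for the real MWS/Rakuten calls):

```python
# Shape of the daily sync loop; all helpers are hypothetical stubs.
from collections import namedtuple

Order = namedtuple("Order", ["sku", "quantity"])

def fetch_amazon_orders_sold_today():   # stub: would call the MWS Orders API
    return [Order("SKU-001", 2)]

def fetch_rakuten_orders_sold_today():  # stub: would use Rakuten's FTP/API
    return [Order("SKU-001", 1)]

def push_amazon_inventory_feed(quantities):   # stub: would call MWS SubmitFeed
    print("Amazon feed:", quantities)

def push_rakuten_inventory_feed(quantities):  # stub: would upload a Rakuten feed
    print("Rakuten feed:", quantities)

inventory = {"SKU-001": 10}  # stands in for the internal product DB

# Steps 1 and 2: fetch orders from each channel and subtract quantities.
for order in fetch_amazon_orders_sold_today() + fetch_rakuten_orders_sold_today():
    inventory[order.sku] -= order.quantity

# Step 3: persist the updated DB, then send feeds to both marketplaces.
push_amazon_inventory_feed(inventory)
push_rakuten_inventory_feed(inventory)
```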
Again, I apologize if this question seems a bit lacking, but I am having trouble understanding how exactly to implement this; any tips would be appreciated.
For the Amazon part look at https://developer.amazonservices.com/
For Rakuten, I think you will be able to do what you want via the FTP access; I'm still researching this, and if I find more I'll respond with a better answer.
In order to process orders, you'll need to be registered with Rakuten in order to get an authorisation token. For the API docs etc., try sending an email to support@rakuten.co.uk.
Incidentally, to send out updated feeds, you'll need to use the inventory API to update stock quantities (given that you'll be selling the same items on Amazon etc.).
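For the Amazon side of that, the stock-quantity update is the _POST_INVENTORY_AVAILABILITY_DATA_ feed submitted through the MWS SubmitFeed operation. A rough sketch of building the XML envelope, with a placeholder merchant ID and made-up SKUs:

```python
# Build an MWS inventory feed envelope (_POST_INVENTORY_AVAILABILITY_DATA_).
def build_inventory_feed(merchant_id, quantities):
    messages = "".join(
        f"<Message><MessageID>{i}</MessageID><OperationType>Update</OperationType>"
        f"<Inventory><SKU>{sku}</SKU><Quantity>{qty}</Quantity></Inventory></Message>"
        for i, (sku, qty) in enumerate(quantities.items(), start=1)
    )
    return (
        '<?xml version="1.0" encoding="utf-8"?>'
        "<AmazonEnvelope>"
        "<Header><DocumentVersion>1.01</DocumentVersion>"
        f"<MerchantIdentifier>{merchant_id}</MerchantIdentifier></Header>"
        "<MessageType>Inventory</MessageType>"
        f"{messages}"
        "</AmazonEnvelope>"
    )

feed = build_inventory_feed("M_EXAMPLE_123", {"SKU-001": 4, "SKU-002": 0})
# ...submit `feed` via MWS SubmitFeed with FeedType
# _POST_INVENTORY_AVAILABILITY_DATA_ (e.g. using the python-amazon-mws package)
```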
I’m preparing my graduation project in computer science. I made this website and it's running perfectly, but my supervisor asked me to apply data mining to the website.
But I don’t understand what I should do.
The website is a social network: each user has a profile and a blog, plus access to some e-books that require registration to download. The website also contains a music server with songs that a registered user can download or add as favorites on their profile page, and the website carries ads (I used the OpenX script). So these are most of the website services where I could perform data mining. The website is www.sy-stu.com.
I need ideas, and what is the best way to present this in the interview?
You can ask your professor what his intention was in requesting data mining. Data mining algorithms can do various tasks; you first need to define what you want to accomplish and then find suitable algorithms and check the technical possibilities.
Some ideas that came to my mind about usage of data mining in your project:
you can use data mining to find which songs (e-books, etc.) a user might favorite, based on other people's favorited songs (find similarities; association rules would probably be a good algorithm for this)
you can use some clustering algorithms to group users based on certain parameters and suggest that they connect with other people from the same group (if you have a feature like this)
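As a toy illustration of the first idea (the favorites data here is made up): count how often pairs of songs are favorited together, then recommend the strongest co-occurring songs the user doesn't already have.

```python
# Tiny co-occurrence recommender over made-up favorites data.
from collections import Counter
from itertools import combinations

favorites = {                      # user -> set of favorited song IDs
    "alice": {"s1", "s2", "s3"},
    "bob":   {"s1", "s2"},
    "carol": {"s2", "s3"},
}

pair_counts = Counter()
for songs in favorites.values():
    for a, b in combinations(sorted(songs), 2):
        pair_counts[(a, b)] += 1

def recommend(user, top_n=3):
    owned = favorites[user]
    scores = Counter()
    for (a, b), n in pair_counts.items():
        if a in owned and b not in owned:
            scores[b] += n
        elif b in owned and a not in owned:
            scores[a] += n
    return [song for song, _ in scores.most_common(top_n)]

print(recommend("bob"))  # -> ['s3']
```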
Good luck!:)
Firstly, ask for clarification from your supervisor. Don't say 'What do you mean?', but ask 'Are you expecting something like this?' because it shows that you've at least thought about it.
If you can't think of anything, or your supervisor is vague, perform some simple data retrieval and analysis, e.g.
the most active members
the most and least popular songs and books
the number of ads clicked
the most popular website features
Just elementary analysis should suffice; you aren't doing a statistics degree. Work out the most songs downloaded in a day and per user, the average number of songs per user, how many users visit each day, and how many sign up and never return.
The purpose is to demonstrate that your website is logging all activity, so that when you are asked 'How many books did the 20 most active users download in June?' you will be able to work out the answer.
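For example, assuming a downloads table along the lines of downloads(user_id, item_type, downloaded_at) (the table and column names here are made up), that June question becomes a single query:

```python
# Top 20 book downloaders for June, via a made-up downloads table.
import sqlite3

conn = sqlite3.connect("website.db")  # placeholder database file
rows = conn.execute("""
    SELECT user_id, COUNT(*) AS n
    FROM downloads
    WHERE item_type = 'book'
      AND downloaded_at BETWEEN '2016-06-01' AND '2016-06-30'
    GROUP BY user_id
    ORDER BY n DESC
    LIMIT 20
""").fetchall()

for user_id, n in rows:
    print(user_id, n)
```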
The alternative is a website that just runs while you have no knowledge of how your users are behaving or what they are doing, which means you aren't able to focus on the things they find important.
I don't know exactly what kind of data you are trying to mine, but have you checked out Google Analytics? It is very easy to set up: once you register, all you need to do is include the provided JavaScript in your web pages. Google Analytics will give you plenty of statistics about access to your site and visits. Is that what you need? The data produced is very easy to read as well, and I reckon it will be suitable for you to present.