How to create an API and then dynamically retrieve data from and add new data to it? - web-services

To start off, I am extremely sorry if my question is not clear but I have very little knowledge about web services in general and the vast nature of varying available information has driven me crazy over the past few weeks. So please do bear with me.
Summary: I want to create a live score update app for Android. (I haven't added android as a tag because I do know how to retrieve data from, say, Twitter's JSON API.) However, like the Twitter JSON API, I want to be able to add (POST, maybe?) data to the Apache 7.0 service that I have running. I then want the app to be able to retrieve this data that I have posted.
I had asked a more generic question earlier and was told that I should look up some APIs. I did that, but I have still been unable to make a breakthrough.
So my questions are:
Is setting up an API on my local web service the correct way to do this?
If so, how can I set up an API that will return JSON objects to the Android app? Also, I would need to be able to constantly update this API with new data.
Additionally, would I also need to set up a database for all this?
Any links to well explained matter would be appreciated too.
Note: I would like to carry this out using a RESTful Web Service through Jersey and use JSON Objects during retrieval.
Again, I am sorry about my terrible knowledge of web services in general, despite trying my best to research a lot. The best I could do was get my RESTful web service to respond to a GET with some pre-defined text that I had set in Eclipse.
Thanks.

If I understand you correctly, what you are trying to do is something like this:
There will be a match or multiple matches of some sort. Whenever a team/player scores, someone (i.e. you) will use the app to update the score. People who previously subscribed to the match will be notified and see the updated score.
Even though I'm not familiar with backends based on Java, the implementation should be fairly similar to that in other programming languages.
First of all, a few words on REST in general. A REST API is generally what you need when information has to be shared between multiple devices and/or users. This seems to be the case here. To implement REST you are going to need an API of some sort. On the web, APIs are implemented by web servers that answer certain predefined HTTP requests.
Thus setting up an API on a web server is the correct way.
Next, a few words on databases. A database is generally needed if you want to store information persistently. This might or might not be what you are planning to do. If there are only going to be a few matches at the same time and you don't care about persistence of the data, you can use Java to store a collection of match objects in memory. I'm just saying it is possible, not that it is a good idea: once your server crashes or runs out of memory for whatever reason, the data is lost. (Of course, in the actual implementation you will want to cache data for current matches in some way, and keeping objects in memory is one way to do so.)
I'd recommend using a database.
Within the database, you can then store and access information about the matches like the score, which users subscribed, who played, etc.
JSON is just a way to represent the data/objects that will be shared between the server and the client. You can use JSON to encode request and response data/bodies.
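As a rough illustration of that encoding step, here is a minimal sketch in Python (the answer is language-agnostic, and the field names `id`, `teams`, and `score` are assumptions made up for this example). It turns a match object into a JSON string for a response body and parses it back the way a client would.

```python
import json

# Hypothetical match object; the field names are illustrative only.
match = {"id": 1, "teams": ["someteam", "someotherteam"], "score": [0, 0]}

# Server side: encode the object as a JSON string for the response body.
body = json.dumps(match)

# Client side: decode the received body back into a native object.
decoded = json.loads(body)
```

In a Java backend the same round trip would be handled by whatever JSON library the framework is configured with; the principle is the same.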
The user has to be informed about the updated score. There are two basic ways to do so: push or pull. With pull, the client checks for updated scores at fixed intervals or after certain actions. With push, the server notifies the client about changed scores, which causes the client to update its information. Since you are planning a live application and are using Java anyway, push seems to be the better way to go.
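To make the pull variant concrete, here is a minimal Python sketch of a polling client. The names `poll`, `fetch_score`, and `on_change` are hypothetical; `fetch_score` stands in for whatever request the real client would send to the server.

```python
import time


def poll(fetch_score, on_change, interval_seconds=5.0, max_polls=3):
    """Ask for the score repeatedly and report it only when it has changed."""
    last = None
    for attempt in range(max_polls):
        score = fetch_score()
        if score != last:
            on_change(score)
            last = score
        # Wait before the next request, but not after the final one.
        if attempt < max_polls - 1:
            time.sleep(interval_seconds)
```

The trade-off is visible here: every poll costs a request even when nothing changed, which is why push tends to suit live updates better.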
Last but not least, let's have a look at a possible implementation with the following parts:
Webserver (API endpoints + database)
Administrator (keeps score updated)
User (receives updates)
We assume that the server will respond to HTTP requests (e.g. POST # /api/my-endpoint) with JSON objects.
Possible flow
1)
First the administrator creates a match
REQUEST
POST # /api/matches
body: team1=someteam&team2=someotherteam
The server will now create a match object and store it in the database. The response will contain information about the object and whether the action was successful.
2)
The user asks for a list of matches
REQUEST
GET # /api/matches/current
The response will be a JSON object containing a list of current matches.
RESPONSE
{
matches: [
{id: 1, teams:...}, ...
]
}
3)
(If push)
A user subscribes to a match
REQUEST
GET # /api/SOME_MATCH_ID/observe
The user will now be added as an observer for the match. Again, the response contains information about whether the action was successful or not.
4)
The administrator updates a score
REQUEST
PUT # /api/SOME_MATCH_ID
body: team1scored...
The score now gets updated on the server (in memory/database) and the user will be notified about the updated score.
5)
The user gets the updated score
REQUEST
GET # /api/SOME_MATCH_ID
RESPONSE
... (Updated score in some way)
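The flow above can be sketched without any HTTP layer at all. The following Python class is only an in-memory stand-in for the server (the answer itself is language-agnostic): each method corresponds to one of the five requests, and a plain callback plays the role of the push channel. The class name, method names, and JSON fields are all assumptions made for this sketch.

```python
import json
from itertools import count


class MatchStore:
    """In-memory stand-in for the server side of the flow (no HTTP, no database)."""

    def __init__(self):
        self._ids = count(1)
        self._matches = {}    # match id -> match data
        self._observers = {}  # match id -> list of notification callbacks

    def create_match(self, team1, team2):
        # 1) The administrator creates a match.
        match_id = next(self._ids)
        match = {"id": match_id, "teams": [team1, team2], "score": [0, 0]}
        self._matches[match_id] = match
        self._observers[match_id] = []
        return json.dumps({"success": True, "match": match})

    def current_matches(self):
        # 2) The user asks for a list of matches.
        return json.dumps({"matches": list(self._matches.values())})

    def observe(self, match_id, callback):
        # 3) A user subscribes; the callback stands in for a push channel.
        if match_id not in self._matches:
            return json.dumps({"success": False})
        self._observers[match_id].append(callback)
        return json.dumps({"success": True})

    def update_score(self, match_id, team_index):
        # 4) The administrator updates the score and observers are notified.
        match = self._matches[match_id]
        match["score"][team_index] += 1
        for notify in self._observers[match_id]:
            notify(json.dumps(match))
        return json.dumps({"success": True, "match": match})

    def get_match(self, match_id):
        # 5) The user gets the updated score.
        return json.dumps(self._matches[match_id])


# Usage: one match, one subscriber, one score update.
store = MatchStore()
store.create_match("someteam", "someotherteam")
received = []
store.observe(1, received.append)
store.update_score(1, 0)
```

In a real backend each method would sit behind an endpoint and the dictionaries would be replaced by database tables, but the responsibilities stay the same.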

Related

How to click a button on website with C++

I'm designing a web crawler with C++, but there is a web page asking me "Are you at least 18 years of age?" when I first fetch the web page using URLDownloadToFileW, and of course I must click YES.
In JavaScript, I can use document.getElementsByTagName('button')[0].click(); to simulate a button click, so is there a similar way to solve this problem with C++?
That is not really easy to do, but if you want to do it, you need several requests.
What the click (i.e. document.getElementsByTagName('button')[0].click(); in JavaScript) does is to trigger an associated click event. Your first step should be to find the event handler code and take a look into it. The event may for example send another (AJAX) request to the website. If that is the case, you have to perform the request in C++ in your crawler, too. Many sites also use cookies to store the user's answer to such questions (or at least the fact that the user selected "I'm at least 18 years of age"). So your crawler has to accept such cookies, too, and store them between requests.
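The cookie part of this can be sketched briefly. The example is in Python rather than C++, purely to show the idea of one cookie store shared across requests. The function names and the two-URL flow are hypothetical, since the real request depends on what the site's click handler actually sends.

```python
import urllib.request
from http.cookiejar import CookieJar


def make_session():
    """Return an opener that stores cookies and sends them back on later requests."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def confirm_age_then_fetch(opener, confirm_url, page_url):
    # Hypothetical two-step flow: first repeat the request that the button's
    # click handler would send, then fetch the real page. Any cookie set by
    # the first response is sent automatically with the second request.
    opener.open(confirm_url).close()
    with opener.open(page_url) as response:
        return response.read()
```

A C++ crawler would need the equivalent: an HTTP client that keeps cookies between requests instead of issuing each download in isolation.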
I am aware of the fact that this answer is rather general, but it is difficult to give a more specific answer without knowing the exact website you are crawling.
Alternative approach: Instead of writing a crawler that downloads the website content directly, you might use a framework like Selenium. Selenium lets you automate a browser and is intended for testing, but one could also use it to crawl a website. The advantage is that you can perform things like clicks more easily in the browser, provided you know the ID or the XPath of the element you want to click. This might be easier to do than a "classical" crawler.
However, you should be aware that many websites have some kind of protection in place against being flooded with requests. That is, if you intend to make a lot of requests to the same server in a short amount of time, you might get blocked by the server. So try to limit the requests to the absolute minimum.

Handle Series of Web Requests in a specific way

I am sorry in advance; I am just learning Web development and my knowledge of it is quite limited.
I will describe my problem first.
I have a relatively large amount of data (1.8-2 GB), which should be hidden from public web access. However, a user should be able to request a specific small subset of the data via a URL call and see it on his / her webpage.
Ideally, I would like to write a program on a web server. Let's call it ./oracle, which stores the large amount of data in primary memory.
Each web user should be able to make specific string calls to oracle and see oracle's response on a web page as HTML elements.
There should be only one instance of oracle, and web users should make asynchronous calls to it.
Can I accomplish the above task with FastCGI or any other protocol?
If yes, could you please explain which tools / protocols I should use / learn?
I would recommend setting up an Apache server because it's very common and you'll be able to find a lot of answers to any specific questions here on StackOverflow already.
You could also look into things like http://Swagger.io which can help you generate your API.
Unfortunately, everything past this really depends on what you use to set up your server. Big picture though:
1. You'll need to open up a port to listen to incoming requests
2. You'll need to have requests include the parameters they want to send to oracle
   - You could accomplish this via the URI, like localhost/oracle-request?PARAMETER="foo"
   - You could alternatively use JSON in the body of the HTTP request
   - Again, this largely depends on how you set up step 1
3. You'll need to route those requests to the oracle
   - This implementation depends entirely on step 1
4. You'll need to capture the output from the oracle and return it to the user
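As one possible shape for the big picture above, here is a minimal Python sketch using only the standard library. It is plain HTTP rather than FastCGI, and the data, the `oracle` function, and the handler are all made up for illustration: a single process holds the data in memory, reads the parameter from the URI, asks the oracle, and returns the result as HTML.

```python
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Stand-in for the large private dataset, loaded once and kept in memory.
DATA = {"foo": "value for foo", "bar": "value for bar"}


def oracle(parameter):
    """Hypothetical lookup: return the small subset of data for one key."""
    return DATA.get(parameter)


class OracleHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        # Read the parameter from the URI, e.g. /oracle-request?PARAMETER=foo
        values = parse_qs(url.query).get("PARAMETER", [])
        # Route the request to the oracle.
        answer = None
        if url.path == "/oracle-request" and values:
            answer = oracle(values[0])
        # Return the oracle's output to the user as a small HTML fragment.
        status = 200 if answer is not None else 404
        text = answer if answer is not None else "not found"
        body = ("<p>%s</p>" % escape(text)).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    # Open a port and listen for incoming requests (blocks until interrupted).
    HTTPServer(("127.0.0.1", port), OracleHandler).serve_forever()
```

Because the data lives in the one server process, there is only ever a single instance of it, and the raw data set is never exposed, only the answers to individual requests.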
Once you decide on how you want to set up your server, feel free to edit your question and we may be able to provide more specific help.

Fetch data from website real time

OK, basically I'm fetching data from a website using curl and parsing the contents using CkHtmlToText.
My issue is how to fetch the new data the website keeps adding.
For example website contents are as follow:
-test1
-test2
After 1 second contents are :
-test1
-test2
-test3
How do I fetch only the next line the website wrote that I haven't got yet, which is "test3"?
Any ideas? Thank you.
The language I'm using is Visual C++.
HTTP requests are stateless. You make a request, you get a result, then you make another completely independent request, you get another result, and so on. If the resource you are trying to access is changing over time, you need to make multiple requests, where each time you will get the full updated resource.
I imagine you may be describing a web page that automatically updates while you are looking at it (like a Twitter feed, for example). In that case, the response contains a script that allows the browser to fetch new data and inject it into the DOM. Unless you also plan to build the DOM and use a JavaScript engine (basically implementing a web browser) to run the script, this is probably not useful to you. Instead, you are better off finding an API that gives you data in a format that is easy to parse and get updates for. If this API is a REST API (built on HTTP), then you will still need to make independent requests to get updates.
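Since each request returns the full resource, finding "only the new line" comes down to comparing the latest response with the previous one on the client. A minimal Python sketch of that comparison follows (the question uses Visual C++, but the logic carries over directly). The function name is hypothetical, and it assumes the page only ever appends lines.

```python
def new_lines(previous, current):
    """Return the lines of `current` that come after the already-seen prefix."""
    seen = previous.splitlines()
    lines = current.splitlines()
    # Assumption: the page only ever appends. If the earlier lines changed,
    # treat everything as new rather than guess.
    if lines[:len(seen)] == seen:
        return lines[len(seen):]
    return lines


first = "-test1\n-test2"
second = "-test1\n-test2\n-test3"
fresh = new_lines(first, second)  # ['-test3']
```

The crawler would keep the last response it saw and call this after each new fetch.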

Can Datomic simplify querying data contained in dynamically accessed HTML documents?

I need to write an API which would provide access to data being served as HTML documents from a web server. I need for my users to be able to perform queries over the data.
Say on a web site there is a page which lists items and their owners. Then there is additional set of profile pages for owners which for each owner provide information about their reputation. An example query I may need to answer is "Give me ID's and owners of all items submitted in 2013 whose owners have reputation of at least 10".
Given a query to answer, I need to be able to screen scrape only the parts of the web site I need for answering the query at hand. And ideally cache the obtained information for future use with new queries.
I have no problem writing the screen scraping part, but I am struggling with designing the storage/query/cache part. Is there something about Clojure/Datomic that makes it an especially suitable technology choice for this kind of processing of data? I have been pointed in this direction before.
It seems a nice challenge, but I'm not sure about a few things: a) would you like to expose a Datalog query box to your users and so make them learn Datalog-like syntax? b) what exact kind of results do you wish to cache: raw DB responses, HTML-formatted text, JSON?
Anyway, I suggest you install the Datomic console and play with it a little to get a feel for it, if you haven't before, as it seems to me the closest thing to what you want to achieve at the moment: https://www.youtube.com/watch?v=jyuBnl0XQ6s http://blog.datomic.com/2013/10/datomic-console.html
For the API, I suggest you use http://clojure-liberator.github.io/liberator/ as it provides sane defaults for implementing REST services and lets you focus on your app's behaviour.

Is it feasible to have a single sign on page for multiple datasources?

I am in the beginning stages of planning a web application using ColdFusion and SQL Server 2012.
In researching the pros and cons of using multiple databases (one per customer) vs one large database, for my purposes I have decided multiple databases would be the best approach.
With this in mind I am now wondering the best way to proceed regarding logging clients in. I have two thoughts here:
I could use sub-domains with each one being for a specific client. The sub-domain also being the datasource name.
I could have a single sign on page with the datasource for this client stored in a universal users table.
I like the idea of option 2 best; however, I am wondering how this may work in the real world. Making each user unique would not be ideal (although I suppose I could base this on an email address instead of a username).
I was thinking of maybe adding something along the lines of a "company code" that would need to be entered along with the username and password.
I feel like this may be asking too much of clients though.
With all of this said, would you advise going with option 1 or option 2? Would also love to hear any thoughts or ideas that may differ.
Thanks!
If you are expecting to have a large amount of data per client, it may be a good idea to split each client into their own database.
You can create a global database that contains client information, client datasource, settings, etc. for each client and then set the client database in the application.cfc.
This also makes it easier later on if a client requests their data or you would like to remove a client from the system.
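The global-database lookup can be sketched as follows. This is Python with an in-memory SQLite database rather than ColdFusion and SQL Server, and the table name, column names, and sample rows are all invented for illustration. The point is only that a single shared table maps a login attribute (here a company code) to that client's datasource name.

```python
import sqlite3

# In-memory stand-in for the global database; names below are made up.
global_db = sqlite3.connect(":memory:")
global_db.execute(
    "CREATE TABLE clients (company_code TEXT PRIMARY KEY, datasource TEXT NOT NULL)"
)
global_db.executemany(
    "INSERT INTO clients VALUES (?, ?)",
    [("acme", "acme_db"), ("globex", "globex_db")],
)


def datasource_for(company_code):
    """Look up which per-client datasource a login should use, or None if unknown."""
    row = global_db.execute(
        "SELECT datasource FROM clients WHERE company_code = ?", (company_code,)
    ).fetchone()
    return row[0] if row else None
```

In the ColdFusion application the result of such a lookup would be what gets set as the client's datasource in application.cfc.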