Developing/Testing Twitter apps without slamming the API - web-services

I'm currently working on an app that works with Twitter, but while developing/testing (especially those parts that don't rely heavily on real Twitter data), I'd like to avoid constantly hitting the API or publishing junk tweets.
Is there a general strategy people use for taking it easy on the API (caching aside)? I was thinking of rolling my own library that would essentially intercept outgoing requests and return mock responses, but I wanted to make sure I wasn't missing anything obvious first.

I would probably start by mocking the specific parts of the API you need for your application. In fact, this may actually force you to come up with a cleaner design for your app, because it more or less requires you to think about your application in terms of "what" it should do rather than "how" it should do it.
For example, if you are using the Twitter Search API, your application most likely should not care whether or not you are using the JSON or the Atom format option. The ability to search Twitter using a given query and get results back represents the functionality you want, so you should mock the API at that level of abstraction. The output format is just an implementation detail.
By mocking the API in terms of functionality instead of in terms of low-level implementation details, you can ensure that the application does what you expect it to do, before you actually connect to Twitter for real. At that point, you've already verified that the app works as intended, so the only thing left is to write the code to make the REST requests and parse the responses, which should be fairly straightforward, so you probably won't end up hitting Twitter with a lot of junk data at that point.

Caching is probably the best solution. Besides that, I believe the API is limited to 100 requests per hour. So maybe make a function that keeps counting each request and as it gets close to 100, it says, OK, every 10 API requests I will pull data. It wouldn't be hard set, probably a gradient function that curbs off when you are nearing the limit.

I've used Tweet#, it caches and should do everything you need since it has 100% of twitter's api covered and then some...
http://dimebrain.com/2009/01/introducing-tweet-the-complete-fluent-c-library-for-twitter.html

Cache stuff in a database... If the cache is too old then request the latest data via the API.
Also think about getting your application account white-listed, it will allow you to have a 20,000 api request limit per hour vs the measly 100 (which is made for a user not an application).
http://twitter.com/help/request_whitelisting

Related

Building a web service: what options do I have?

I'm looking to build my first web service and I would like to be pointed in the right direction right from the start.
Here are the interactions that must take place:
First, a smartphone or computer will send a chunk of data to my web service.
The web service will persist the information to the database.
Periodically, an algorithm will access and modify the database.
The algorithm will periodically bundle data and send it out to smartphones or computer (how?)
The big question is: What basic things do I need to learn to in order to implement something like this?
Now here are the little rambling questions that I've also got rolling around in my head. Feel free to answer these if you wish. (...maybe consider them extra credit?)
I've heard a lot of good things about RESTful services, I've read the wiki article, and I've even played around with the Twitter's webservice which is RESTful. Is this the obvious way to go? Or should I still consider something else?
What programming language do I use to persist things to the database? I'm thinking php will be the first choice for this.
What programming language do I use to interact with the database? I'm thinking anything is probably acceptable, right?
Do I have to worry about concurrent access to the database, or does MySQL handle that for me? (I'm fairly new to databases too.)
How on earth do I send information back? Obviously, if it's a reply to an HTTP request that's no problem, but there will be times when the reply may take quite a long time to compute. Should I just let the HTTP request remain outstanding until I have the answer?
There will be times when I need to send information to a smartphone regardless of whether or not information has been sent to me. How can I push information to my users?
Other information that may help you know where I'm coming from:
I am pretty familiar with Java, C#, C++, and Python. I have played around with PHP, Javascript, and Ruby.
I am relatively new to databasing, but I get the basic idea.
I've already set up my server and I'm using a basic LAMP stack. My understanding of L, A, M and P is fairly rudimentary.
Language: Python for it's ease of use assuming the GIL is not a particular concern for your requirements (e.g. multi-threading). It has drivers for most databases and supports numerous protocols. There are several web frameworks for it - the most popular probably being Django.
Protocols: if you are HTTP focused study SOAP and REST. Note, SOAP tends to be verbose, which causes problems moving volumes of data. On the other hand, if you are looking at other options study socket programming and perhaps some sort of binary format such as Google's protocol buffers. Flash is also a possibility (see: Flash Remoting). Note, binary options require users install something onto their machine (e.g. applet or standalone app).
Replies: if the process is long running, and the client should be notified when it's done, I would recommend developing an app for the client. Browser's can be programmed with JavaScript to periodically poll, or a Flash movie can be embedded to real time updates, but these are somewhat tricky bits of browser programming. If you're dealing with wireless phones, look at SMS. Otherwise I would just provide a way for clients to get status, but not send out notification (e.g. push vs. pull). As #jcomea_ictx wrote, AJAX is an option if it's a browser based solution - study jQuery.
Concurrency: understand what ACID means with regards to databases. Think about what should happen if you receive multiple writes to the same data - database may not necessarily solve this problem the way you'd want.
Please, for the love of programming, don't use PHP if you're already comfortable with Python. The latter makes for far cleaner, more maintainable code. Not that it's impossible to write good code in PHP, but it's a relative rarity. You can use Python for all the server-end stuff including MySQL interaction, with the MySQLdb module. Either with standard CGI, or FCGI, or mod_python.
As for the database, use of transactions will eliminate conflicts. But you can usually design a system in such a way that conflicts will not happen. For example, use of auto-incrementing primary-key IDs on each insert will make sure that every entry is unique.
You can "pull" data with Javascript, perhaps using AJAX methodology, or "push" using SMS or other technologies.
When replies take a while to compute, you can "poll" using AJAX. This is a very common technique. The server just returns "we are working on this" (or equivalent) with a built-in refresh until the results are ready.
I'm no expert on REST, but AJAX, especially when using polling rather than simply responding to user input, can be said to violate RESTful principles. But you can be a purist, or you can do whatever works. It's up to you.
I don't believe I've ever used any "push" technologies other than SMS, and that was years ago when many companies had free SMS gateways. So if you want to "push" data, better hope someone else joins in the conversation!
Use Java. The latest version of Java EE 6 makes coding RESTful and SOAP services a breeze, and it interoperates with databases very easily as well.
The benefits of using a true language instead of a script are: full server-side state, strong typing, multithreading, and countless other things that may or may not come in handy, but knowing they are available makes your project future proof.

How can I scale a webapp with long response time, which currently uses django

I am writing a web application with django on the server side. It takes ~4 seconds for server to generate a response to the user. It makes use of a weather api. My application has to make ~50 query to that api for each user request.
Server side uses urllib of python for using the weather api. I used pythons threading to speed up the process because urllib is synchronous. I am using wsgi with apache. The problem is wsgi stack is fully synchronous and when many users use my application, they have to wait for one anothers request to finish. Since each request takes ~4 seconds, this is unacceptable.
I am kind of stuck, what can I do?
Thanks
If you are using mod_wsgi in a multithreaded configuration, or even a multi process configuration, one request should not block another from being able to do something. They should be able to run concurrently. If using a multithreaded configuration, are you sure that you aren't using some locking mechanism on some resource within your own application which precludes requests running through the same section of code? Another possibility is that you have configured Apache MPM and/or mod_wsgi daemon mode poorly so as to preclude concurrent requests.
Anyway, as mentioned in another answer, you are much better off looking at caching strategies to avoid the weather lookups in the first place, or offloading to client.
50 queries to an outside resource per request is probably a bad place to be, and probably not neccesary at all.
The weather doesn't change all that quickly, and so you can probably benefit enormously by just caching results for a while. Then it doesn't matter how many requests you're getting, you don't need to do more than a few queries per day
If that's not your situation, you might be able to get the client to do the work for you. Refactor the code so that the weather api aggregation happens on the client in javascript, rather than funneling it all through the server.
Edit: based on comments you've posted, what you are asking for probably cannot be optimized within the constraints of the API you are using. The problem is that the service is doing a good job of abstracting away the differences in the many sources of weather information they aggregate into a nearest location query. after all, weather stations provide only point data.
If you talk directly to the technical support people that provide the API, you might find that they are willing to support more complex queries (bounding box), for which they will give you instructions. More likely, though, they abstract that away because they don't want to actually reveal the resolution that their API actually provides, or because there is some technical reason in the way that they model their data or perform their calculations that would make such queries too difficult to support.
Without that or caching, you are just out of luck.

Rest vs. Soap. Has REST a better performance?

I read some questions already posted here regarding Soap and Rest
and I didn't find the answer I am looking for.
We have a system which has been built using Soap web services.
The system is not very performant and it is under discussion
to replace all Soap web services for REST web services.
Somebody has argued that Rest has a better performance.
I don't know if this is true. (This was my first question)
Assuming that this is true, is there any disadvantage using
REST instead of Soap? (Are we loosing something?)
Thanks in advance.
Performance is broad topic.
If you mean the load of the server, REST has a bit better performance because it bears minimal overhead on top of HTTP. Usually SOAP brings with it a stack of different (generated) handlers and parsers. Anyway, the performance difference itself is not that big, but RESTful service is more easy to scale up since you don't have any server side sessions.
If you mean the performance of the network (i.e. bandwidth), REST has much better performance. Basically, it's just HTTP. No overhead. So, if your service runs on top of HTTP anyway, you can't get much leaner than REST. Furthermore if you encode your representations in JSON (as opposed to XML), you'll save many more bytes.
In short, I would say 'yes', you'll be more performant with REST. Also, it (in my opinion) will make your interface easier to consume for your clients. So, not only your server becomes leaner but the client too.
However, couple of things to consider (since you asked 'what will you lose?'):
RESTful interfaces tend to be a bit more "chatty", so depending on your domain and how you design your resources, you may end up doing more HTTP requests.
SOAP has a very wide tool support. For example, consultants love it because they can use tools to define the interface and generate the wsdl file and developers love it because they can use another set of tools to generate all the networking code from that wsdl file. Moreover, XML as representation has schemas and validators, which in some cases may be a key issue. (JSON and REST do have similar stuff coming but the tool support is far behind)
SOAP requires an XML message to be parsed and all that <riduculouslylongnamespace:mylongtagname>extra</riduculouslylongnamespace:mylongtagname> stuff to be sent and receieved.
REST usually uses something much more terse and easily parsed like JSON.
However in practice the difference is not that great.
Building a DOM from XMLis usually done a superfast superoptimised piece of code like XERCES in C++ or Java whereas most JSON parsers are in the roll your own or interpreted catagory.
In fast network environment (LAN or Broadband) there is not much difference between sending a one or two K versus 10 to 15 k.
You phrase the question as if REST and SOAP were somehow interchangeable in an existing system. They are not.
When you use SOAP (a technology), you usually have a system that is defined in 'methods', since in effect you are dealing with RPC.
When you use REST (an architectural style, not a technology) then you are creating a system that is defined in terms of 'resources' and not at all in methods. There is no 1:1 mapping between SOAP and REST. The system architecture is fundamentally different.
Or are you merely talking about "RPC via URI", which often gets confused with REST?
I'm definitely not an expert when it comes to SOAP vs REST, but the only performance difference I know of is that SOAP has a lot of overhead when sending/receiving packets since it's XML based, requires a SOAP header, etc. REST uses the URL + querystring to make a request, and thus doesn't send that many kB over the wire.
I'm sure there are other ppl here on SO who can give you better and more detailed answers, but at least I tried ;)
One thing the other answers seem to overlook is REST support for caching and other benefits of HTTP. While SOAP uses HTTP, it does not take advantage HTTP's supporting infrastructure. The SOAP 1.1 binding only defines the use of the POST verb. This was fixed with version 1.2 with the introduction of GET bindings, however this may be an issue if using the older version or not using the appropriate bindings.
Security is another key performance concern. REST applications typically use TLS or other session layer security mechanisms. TLS is much faster than using application level security mechanisms such as WS Security (WS Security also suffers from security flaws).
However, I think that these are mostly minor issues when comparing SOAP and REST based services. You can find work arounds for either SOAP's or REST's performance issues. My personal opinion is that neither SOAP, nor REST (by REST I mean HTTP-based REST services) are appropriate for services requiring high throughput and low-latency. For those types of services, you probably want to go with something like Apache Thrift, 0MQ, or the myriad of other binary RPC protocols.
It all depends. REST doesn't really have a (good) answer for the situation where the request data may become large. I feel this point if sometimes overlooked when hyping REST.
Let's imagine a service that allows you to request informational data for thousands of different items.
The SOAP developer would define a method that would allow you retrieve the information for one or as many items as you like ... in a single call.
The REST developer would be concerned that his URI would become too long so he would define a GET method that would take a single item as parameter. You would then have to call this multiple times, once for each item, in order to get your data. Clean and easy to understand ... but.
In this case there would be a lot more round-trips required for the REST service to accomplish what can be done with a single call on the SOAP service.
Yes, I know there are workarounds for how to handle large request data in the REST scenario. For example you can pack stuff into the body of your request. But then you will have to define carefully (on both the server and the client side) how this is to be interpreted. In these situations you start to feel the pain that REST is not really a standard (like SOAP) but more of a way of doing things.
For situations where only relatively limited amount of data is exchanged REST is a very good choice. At the end of the day this is the majority of use cases.
Just to add a little to wuher's answer.
Http Header bytes when requesting this page using the Chrome web browser: 761
Bytes required for the sample soap message in wikipedia article: 299
My conclusion: It is not the size of bytes on the wire that allows REST to perform well.
It is highly unlikely that simply converting a SOAP service over to REST is going to gain any significant performance benefits. The advantage REST has is that if you follow the constraints then you can take advantage of the mechanisms that HTTP provides for producing scalable systems. Caching and partitioning are the tools in your toolbelt.
In general, a REST based Web service is preferred due to its simplicity, performance, scalability, and support for multiple data formats. SOAP is favored where service requires comprehensive support for security and transactional reliability.
The answer really depends on the functional and non-functional requirements. Asking the questions listed below will help you choose.
REf: http://java-success.blogspot.ca/2012/02/java-web-services-interview-questions.html
Does the service expose data or business logic? (REST is a better choice for exposing data, SOAP WS might be a better choice for logic).
Do the consumers and the service providers require a formal contract? (SOAP has a formal contract via WSDL)
Do we need to support multiple data formats?
Do we need to make AJAX calls? (REST can use the XMLHttpRequest)
Is the call synchronous or asynchronous?
Is the call stateful or stateless? (REST is suited for statless CRUD operations)
What level of security is required? (SOAP WS has better support for security)
What level of transaction support is required? (SOAP WS has better support for transaction management)
Do we have limited band width? (SOAP is more verbose)
What’s best for the developers who will build clients for the service? (REST is easier to implement, test, and maintain)
You don't have to make the choice, modern frameworks allow you to expose data in those formats with a minimum change. Follow your business requirements and load-test the specific implementation to understand the throughput, there is no correct answer to this question without correct load test of a specific system.

Is the end of SOAP near?

Following "A well earned retirement for the SOAP Search API" from Google announcing they have recently removed their SOAP APIs, I'm curious what the community thinks of SOAP in 2009. I can see its use for remoting and more verbose client-server stateless communication, but for more generalised [Ajax] web usage is it now redundent?
Have REST URLs removed the need for SOAP and that kind of web service once and for all?
There is nothing in REST that says you can't use form POST fields to PUT data for when you need to send complex requests over. You can even post big blocks of bulky XML if you want to try and make it as SOAPy as possible.
IMHO SOAP gives you nothing but a wrapper that you never needed to begin with. The thing that killed it for me was the way Axis and other engines compile stubs of your WSDL into their code and then every time you add something to the WSDL it breaks the consumers, even if everything was designed to be backward compatible. REST forever.
SOAP is here to stay - and rightfully so.
In an enterprise environment, things like self-describing services (with the help of WSDL), ability to use transactions and reliable messaging are paramount. They're much more important than the running after the "rave of the day".
REST has its good uses - but it cannot ever replace SOAP totally, nor should it. REST is great for lightweight communcation - twittering and the like. But there's also good reason to have and know about SOAP.
SOAP currently has the much better tooling support in most environments - it'll be some time before REST has something comparable.
SOAP allows for machine-readable service description and service discovery - REST has nothing like that, your REST service might - or might not - be documented, and the quality of the English prose documenting your REST services varies wildly.
Yes, REST is all the rage right now - and it does make a lot of fun scenarios a lot easier to handle. But I don't think it's ready for "prime-time, enterprise-level" use, really. Maybe some day - but not today.
Oh if only SOAP were dead. I can assure you that some companies are still pursuing SOAP-based, RPC strategies as fast as they can.
As you already said -- REST can't deal with the verbose cases.
If you can tell me how I can take an arbitrary number of complex arguments through a RESTful web service, without hitting URL length restrictions, I'd love to hear it.
But for complex queries of scientific data, we need something more than positional parameters or key/value pairs.
My prediction is that SOAP won't die until sometime after COBOL and Fortran.

Web Service vs Form posting

I have 2 websites(www.mysite1.com and myweb2.com, both sites are in ASP.NET with SQL server as backend ) and i want to pass data from one site to another.Now i am confused whether to use a web service or a form posting (from mysite1 to a page in myweb2)
Can any one tell me the Pros and Cons of both ?
By web service I assume you mean SOAP based web service?
Anyway both are equal with few advantages. Posting is more lightweight, while SOAP is standardized (sort of). I would go with more restful approach, because I think SOAP is too much overhead for simple tasks while not giving much of advantage.
Webservices are SOAP messages (the SOAP protocol uses XML to pass messages back and forth), so your server on both ends must understand SOAP and whatever extensions you want to talk about between them, and they probably (but don't have to) be able to grok WMDL files (that "explains" the various services endpoints and remote functionality available). Usually we call this the SOAP / WS-* stack, with emphasis on 'stack' as there's a few bits of software that needs to be available, and the more complex the SOAP calls, the more of this stack needs to be available and maintained.
Using POST, on the other hand, is mostly associated with RESTful behaviours, and as an example of a protocol of such, look to HTTP. Inside the POST you can of course post complex XML, but people tend to use plain POST to simplify the calling, and use HTTP responses as replies. You don't need any extra software, probably, as most if not all webkits has got HTTP support. My own bias leans towards REST, in case you wonder. Through using HATEOAS you can create really good infrastructure for self-aware systems that can modify themselves with load and availability in real-time as opposed to the SOAP way, and this lies at the centre of the argument for it; HTTP was designed for large distributed networks in mind, dealing with performance and stability. SOAP tends to be a one-stop if-it-breaks-you're-stuffed kinda thing. (Again, remeber my bias. I've written about this a lot on my blog, especially the architecture side and the impact of SOA vs. ROA. :)
There's a great debate as to which is "better", to which I can only say "it depends completely on what you want to do, how you prefer to do it, what you need it to do, your environment, your experience, the position of the sun and the moon(s), and the mood my cat is in." Eh, meaning, a lot.
I'm all for a healthy debate about this, but I tend to think that SOAP is a reinvention; SOAP is an envelope with a header and body, and if that sounds familiar, it is exactly how HTML was designed, a fact very few people tend to see. HTTP as just a protocol for shifting stuff around is well understood and extremely well supported, and SOAP uses it to shift their XML envelopes around. Is there a real difference between shifting SOAP and HTML around? Well, yes, the big difference is that SOAP reinvents all the niceties of HTTP (caching, addressability, state, scaling), and then use HTTP only for delivering the message and nothing else and let the stack itself have to deal with those niceities mentioned earlier. So, a lot of the goodness of HTTP is ignored and recreated in another layer (hence, you need a SOAP stack to deal with it), which to me seems wasteful, ignorant and adding complexity.
Next up is what you want to do. For really complex things, there's lots in the webservices stack of standards (I think it's about 1200 pages combined these days) that could help you out, but if your needs are more modest (ie. not that crazy about seriously complex security, for example) a simple POST (or GET) of a request and an envelope back with results might be good enough. Results in HTTP is, as you probably know, HTTP content-type, so lots is already supported but you can create your own, for example application/xml+myformat (or more correctly, application/x-xml+myformat if I remember correctly). Get the request, if it's a response code 200, and parse.
Both will work. One is heavy (WS-* stack) depending on what your needs are, the other is more lightweight and already supported. The rest is glue, as they say.
I would say the webservice is definitely the best choice. A few pro's:
If in the future you need to add another website, your infrastructure (the webservice) is already there
Cross-site form posting might give you problems when using cookies or
might trigger browser privacy restrictions
If you use form posting you have to
write the same code over and over
again, while with using the
webservice you write the code once,
then use it at multiple locations.
Easier to maintain, less code to
write.
Maintainability (this is related to
the above point) ofcourse, all the
code relevant to exchanging data is
all at one location (your webservice)
There's probably even more. Like design time support/code completion.
From my little experience I'd say that you'd be best using a web service since you can see the methods and structure of the service in your code, once you've created it at the recieving end that is.
Also using the form posting methos would eman you have to fake form submissions which isn't as neat as making a web service call.
Your third way would be to get the databases talking, though I'm guessing they're disparate and can't 'see' each other?
I would suggest a web service (or WCF). As Beanie said, with a service you are able to see the methods and types of the service (that you expose), which would make for much easier and cleaner moving of data.
I agree with AlexanderJohannesen that it is debatable whether SOAP webservices or RESTful apis are better, however if both sites are under your control and done with asp.net, definitely go with SOAP webservices. The tools that Visual Studio provides for creating and consuming webservices are just great, it won't take you more than a few minutes to create the link between the two sites.
In the site you want to receive communication create the web service by selecting add item in VS. Select web service and name it appropriately. Then just create a method with the logic you want to implement and add the attribute [WebMethod], eg.
[WebMethod]
public void AddComment(int UserId, string Comment) {
// do stuff
}
Deploy this on your test server, say tst.myweb2.com.
Now on the consuming side (www.myweb1.com), select Add Web Reference, point the url to the address of the webservice we just created, give it a name and click Add refence. You have a proxy class that you can call just like a local class. Easy as pie.