Should CFHTTP be used to make local server calls?

Should CFHTTP be used to make local server calls? - coldfusion

I've heard that CFHTTP calls that point to a local server can cause additional delays for the user. As this is a HTTP request, it seems to me that any delays would be negligible - though I don't know a whole lot about networking/systems and load-balancing.
Are there any downsides/cons to using CFHTTP to make local calls and if so, are there ways to mitigate those?

What is it that you're trying to do with these HTTP calls? If whatever you're fetching by HTTP is on the local file system, actually using the file system (fileRead(), include, etc) rather than the network to get them will be more expedient.
That said, if you do actually want to perform a request for whatever reason (which is completely legit in some situations), then I don't think the performance overhead would be something to worry about. Although I'd not want to be doing this sort of thing on every onRequestStart(), etc.
I think you need to elaborate on what it is you're setting out to achieve here.

it seems to me that any delays would be negligible
Perhaps, it would depend on your traffic and any other latency problems you're having. I did a simple test that fetched 5 paragraphs of pre-generated Lorem Ipsum dummy text that was served from the same CF server. The result took between 15 to 47 milliseconds, which as I said, depends on you to decide if it is negligible. Personally I think it's a bit high but from a user perspective, it wouldn't be noticeable in my environment.
As far mitigation, if you're trying to re-use code I'd consider putting your authentication into a cfc. This is beneficial because you can use it in multiple local applications as well as a web service if needed (which is technically what you're trying to do by using cfhttp).
What is the problem with including the process in a cfinclude as you said you were doing before? testing proved including the same pre-generated Lorem Ipsum text took 0 MS every time.
<cfset start = getTickCount()>
<cfhttp url="http://myServer.com/test/lipsum.cfm" method="get" >
<cfset end = getTickCount()-start>
<cfoutput>it took #end# MS to get the Lipsum.</cfoutput><br />
<cfset start = getTickCount()>
<!-- <cfinclude template="lipsum.cfm"> -->
<cfset end = getTickCount()-start>
<cfoutput>it took #end# MS to include the Lipsum.</cfoutput>
it took 43 MS to get the Lipsum.
it took 0 MS to include the Lipsum.

You don't specify whether you're calling back into the ColdFusion server itself or another service hosted locally. It might help to expand the question a little to provide a use-case.
The speed penalty of calling a local server will be relatively small and in some cases that may be fine, e.g. making a call to a local Solr server to get search results. It will however likely be much slower than calling a function locally within CF.
What I'd be concerned about would be calling back into the server you are executing the ColdFusion on. This is probably a sign that things could be better organised or arranged. I've done this in the past (as a cheap'n'cheerful way of running a background task), but wouldn't do it now that <cfthread> exists.

Related

Code to run that can help benchmark a new SAS drive?

We just got a new Server at work that will be used only for running SAS code, and I've been asked to run some tests and make sure that it's performing better than our other servers. I'm not an expert at this so I want to make sure I avoid making naive mistakes that don't properly measure the performance of the server. My header looks like this:
options fullstimer;
%LET BenchStartTime = %sysfunc(datetime(),22.);
Which I use as a check for the "real time" report in the log. I have a vague understanding of the difference between "user cpu time" and "system cpu time", but if anyone wants to offer up additional information on that, that would be helpful.
Anyway, the main point of this post is that I want to know if there are any standard benchmark tests that I should be using in order to see if this new server is better than the old ones. Currently I'm using something I found online which is just appending a bunch of copies of sashelp.class (but I think this might be a bad idea because pulling from the C: and loading it into a different drive might be the same across all servers, right? If the C: is the slowest drive, doesn't that become the bottleneck?), and I'm also using code that basically generates a bunch of random data of a fixed size and comparing runtimes. Is this the correct approach? Is there something else I should be doing? How many times should I be running these benchmark tests to make sure that it isn't a fluke?
Thanks for your help!

I would test by doing the things that you normally do. If you run larger merges, then you're basically talking I/O; so just make a very large dataset, write it out, read it in, etc., and perform the same test on the other machine. Perform each test a few times each in a fresh SAS session.
Further, it sounds like you need to make sure the new server can handle multiple concurrent sessions. You can simulate this in part by submitting many connections from one computer by using SAS/CONNECT; that allows you to start multiple concurrent sessions. Then do so, perhaps I would create a script that starts a local SAS session, signs on to the server and rsubmits a job of a normal difficulty that takes maybe 5 to 10 minutes, and then run that script 20 times concurrently (you can script this or just double click a .bat file 20 times). See how it handles it as compared to the other server.

On SAS 9.2 and later you can use the %SASHOME%/SASFoundation/9.x/sasiotest.exe utility. You will find some further Guidance in Support note 51659.
Edit:
A similiar utility can be downloaded for Unix Plattform, details in Support note 51660.

Why bother using Apache or Nginx, etc?

I've been assigned a project which requires me to add some HTML page serving. This embedded system (running Linux CentOS 6.3) has some extra juice available, but also already has numerous responsibilities.
I considered Apache but tossed it due to bloat, I looked into Nginx but am now shying from that too. It just seems that I'm getting way more 'functionality' and as a result, more CPU usage than I need.
Can someone enlighten me as to why I wouldn't just implement the HTTP protocol myself using async sockets?
My specific needs are:
Receive and decode GETs and POSTs.
Send CSS, JS and JPG files as requested.
Output header, cookie, head and body data based upon the decode of the GETs/POSTs.
Given that I don't need the myriad things these webservers offer, am I being naive in assuming this course of doing it myself? What would you suggest or warn against?

Basically, you use a web server because then you get the functionality you want in a form that's already been tested, is more reliable than your first code is likely to be, and is supported by a large community of others. If Apache and nginx are too heavyweight for you (although nginx is pretty much characterized by how lightweight it is for heavy loads) and especially if the load you expect is very light, then look around for other options.
Wiki has a whole page of comparisons of lightweight web servers.

An easy trap to fall into: thinking "I don't need all the functionality in Product X, I'll just write my own with just the functionality that I need" only to end up reimplementing Product X entirely, one newly-discovered requirement at a time.
I sort of doubt that an embedded system that can run CentOS okay is so resource-starved that it can't run Nginx comfortably (or even Apache, which people run on the Raspberry Pi just fine with appropriate configuration tweaks), given reasonable assumptions about how many pages you are actually serving. I ran it on a Pentium 266 with something like 256MB of RAM serving a few simple PHP apps that served roughly a page every two seconds, with no issues. As I recall, it's fairly modular, so you can just choose not to load the functionality you don't think you need. And, later, when your requirements change and you find out you do need it, you can just plug it back in :)
If you are really and truly concerned about resource consumption, look into web servers designed for embedded applications. I hear Cherokee is quite nice. Mongoose looks promising as well.

Go further you can, I began with this http://www.w3.org/Protocols/HTTP/HTTP2.html

Choice of storage and caching

I hope the title is chosen well enough to ask this question.
Feel free to edit if not and please accept my apologies.
I am currently laying out an application that is interacting with the web.
Explanation of the basic flow of the program:
The user is entering a UserID into my program, which is then used to access multiple xml-files over the web:
http://example.org/user/userid/?xml=1
This file contains several ID's of products the user owns in a DRM-System. This list is then used to access stats and informations about the users interaction with the product:
http://example.org/user/appid/stats/?xml=1
This also contains links to various images which are specific to that application. And those may change at any time and need to be downloaded for display in the app.
This is where the horror starts, at least for me :D.
1.) How do I store that information on the PC of the user?
I thought about using a directory for the userid, then subfolders with the appid to cache images and the xml-files to load them on demand. I also thought about using a zipfile while using the same structure.
Or would one rather use a local db like sqlite for that?
Average Number of Applications might be around ~100-300 and stats and images per app from basically 5-700.
2.) When should I refresh the content?
The bad thing is, the website from where this data is downloaded, or rather the xmls, do not contain any timestamps when it was refreshed/changed the last time. So I would need to hash all the files and compare them in the moment the user is accessing that data, which can take an inifite amount of time, because it is webbased. Okay, there are timeouts, but I would need to block the access to the content until the data is either downloaded and processed or the timeout occurs. In both cases, the application would not be accessible for a short or maybe even long time and I want to avoid that. I could let the user do the refresh manually when he needs it, but then I hoped there are some better methods for that.
Especially with the above mentioned numbers of apps and stuff.
Thanks for reading and all of that and please feel free to ask if I forgot to explain something.

It's probably worth using a DB since it saves you messing around with file formats for structured data. Remember to delete and rebuild it from time to time (or make sure old stuff is thoroughly removed and compact it from time to time, but it's probably easier to start again, since it's just a cache).
If the web service gives you no clues when to reload, then you'll just have to decide for yourself, but do be sure to check the HTTP headers for any caching instructions as well as the XML data[*]. Decide a reasonable staleness for data (the amount of time a user spends staring at the results is a absolute minimum, since they'll see results that stale no matter what you do). Whenever you download anything, record what date/time you downloaded it. Flush old data from the cache.
To prevent long delays refreshing data, you could:
visually indicate that the data is stale, but display it anyway and replace it once you've refreshed.
allow staler data when the user has a lot of stuff visible, than you do when they're just looking at a small amount of stuff. So, you'll "do nothing" while waiting for a small amount of stuff, but not while waiting for a large amount of stuff.
run a background task that does nothing other than expiring old stuff out of the cache and reloading it. The main app always displays the best available, however old that is.
Or some combination of tactics.
[*] Come to think of it, if the web server is providing reasonable caching instructions, then it might be simplest to forget about any sort of storage or caching in your app. Just grab the XML files and display them, but grab them via a caching web proxy that you've integrated into your app. I don't know what proxies make this easy - you can compile Squid yourself (of course), but I don't know whether you can link it into another app without modifying it yourself.

Porting REST API connection code to Android NDK in c++

I have an Android application in the market which connects and send POST and GET queries to a REST API, and then stores the results in a DB which are then queries and displayed in an appropriate manner in the application.
I'm interested in speeding up the application and have noticed quite a lot of lag between the time of receiving the data back from the api and the data being ready to use. I'd like to investigate if and how I can write similar code in c++ using the NDK to connect to the REST API, process the results and store in a DB or raise an error. I've no previous c++ experience and need to know firstly if I can access the same DB in the C++ as the Java, secondly if there are any other caveats which I should be aware of?
Also I guess I should ask - is it worth doing this? Will I notice any difference?
Any links to similar code, or an overview of where I should look to get started in c++ would be greatly appreciated.

I'm doing the EXACT same thing, and trust me: if you have no previous C++ experience, this might be a bit too costly for little benefit.
In my case, after some profiling, I reordered things around and had an initial jump in performance only by dropping DOM and using SAX. All the rest is only making things marginally better, like processing the response while packets are still being transmitted (i.e. not wait for the full response to start processing), and multiplexing requests on the same thread instead of starting a new thread for each.
What you should be looking for in Google is POSIX sockets, HTTP and REST codes, if you wish to do it all by hand. A better option might be using CURL or something similar for the Socket/HTTP part. I did it all myself, but only because I have already done this a few times.

how to perform profiling for a website?

I currently have a django site, and it's kind of slow, so I want to understand what's going on. How can I profile it so to differentiate between:
effect of the network
effect of the hosting I'm using
effect of the javascript
effect of the server side execution (python code) and sql access.
any other effect I am not considering due to the massive headache I happen to have tonight.
Of course, for some of them I can use firebug, but some effects are correlated (e.g. javascript could appear slow because it's doing slow network access)
Thanks

client side:
check with firebug if/which page components take long to load, and how long the browser needs to render the page after loading is completed. If everything is fast but rendering takes its time, then probably your html/css/js is the problem, otherwise it's server side.
server side (i assume you sit on some unix-alike server):
check the web server with a small static content (a small gif or a little html page), using apache bench (ab, part of the apache webserver package) or httperf, the server should be able to answerat least 100 requests per second (of course this depends heavily on the size of your test content, webserver type, hardware and other stuff, so dont take that 100 to seriously). if that looks good,
test django with ab or httperf on a "static view" (one that doesnt use a database object), if thats slow it's a hint that you need more cpu power. check cpu utilization on the server with top. if thats ok, the problem might be in the way the web server executes the python code
if serving semi-static content is ok, your problem might be the database or IO-bound. Database problems are a wide field, here is some general advice:
check i/o throughput with iostat. if you see lot's of writes then you have get a better disc subsystem, faster raid, SSD hard drives .. or optimize your application to write less.
if its lots of reads, the host might not have enough ram dedicated as file system buffer, or your database queries might not be optimized
if i/o looks ok, then the database might be not be suited for your workload or not correctly configured. logging slow queries and monitoring database activity, locks etc might give you some idea
if you let us know what hardware/software you use i might be able to give more detailed advice
edit/PS: forgot one thing: of course your app might have a bad design and does lots of unnecessary/inefficient things ...

Take a look at the Django debug toolbar - that'll help you with the server side code (e.g. what database queries ran and how long they took); and is generally a great resource for Django development.
The other non-Django specific bits you could profile with yslow.

There are various tools, but problems like this are not hard to find because they are big.
You have a problem, and when you remove it, you will experience a speedup. Suppose that speedup is some factor, like 2x. That means the program is spending 50% of its time waiting for the slow part. What I do is just stop it a few times and see what it's waiting for. In this case, I would see the problem 50% of the times I stop it.
First I would do this on the client side. If I see that the 50% is spent waiting for the server, then I would try stopping it on the server side. Then if I see it is waiting for SQL queries, I could look at those.
What I'm almost certain to find out is that more work is being requested than is actually needed. It is not usually something esoteric like a "hotspot" or an "algorithm". It is usually something dumb, like doing multiple queries when one would have been sufficient, so as to avoid having to write the code to save the result from the first query.
Here's an example.

First things first; make sure you know which pages are slow. You might be surprised. I recommend django_dumpslow.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js