Internet connection speed vs. Programming language speed for HTTP Requests? - c++

I know how to program in Python but I am also interested in learning C++. I have heard that it is much faster than python and for the programs I am writing currently, I would prefer them to run as quickly and efficiently as possible. I know that a lot of that comes from just writing good code but I was also wondering if using another language, such as C++, would help.
While I was pondering this, I realized that since most of my programs will be mainly using the internet (as in implementing Google APIs and using the information from them to submit data to other websites) then maybe the speed of the language doesn't matter if the speed of my internet connection is always going to be relatively the same. I have two ways I am connecting to the internet: Selenium (or some kind of automated browser) for things that require a browser, and just HTTP requests.
How much difference would I see between python and a different language even though the major focus of my programs is on the internet?
Thanks.

Scenarios
The main benefit you would get from using a language that is compiled to machine code is that you can do lots of byte- and bit-level magic: for example, modifying image data, transforming audio, or analysing indices of a genomic sequence database.
Typical tasks
When serving web pages, you typically have problems of a completely different sort: you will be loading a resource from the hard disk and serving it directly if it's an image or audio, or you will be executing different transformation steps on a text resource until it becomes the final HTML document. The latter will use template engines, database queries, and so on.
If you look at that, you can see that most of the work, say 90-99%, is pretty high-level stuff. In Python you will use an API that has been optimized by many, many users for performance (meaning: time and space). "Open a file" will be almost as fast in Python as it is in C, and so is reading from it and serving it to some socket. Transforming text data could be a bit faster in C++ than it is in Python, but... how fast does it have to be? A user is very likely willing to wait 200 ms, isn't he? And that is a lot of time for a nice high-level template engine to transform a bit of text.
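To get a feel for the 200 ms budget, here is a minimal sketch that times a small template rendering step using only the standard library. The template text, the row count, and the function name render_page are made up for illustration; a real template engine and real data will behave differently, so treat the result as a rough indication and measure your own application.

```python
import time
from string import Template

# Hypothetical templates; names and sizes are assumptions for illustration.
ROW = Template("<tr><td>$name</td><td>$price</td></tr>")
PAGE = Template("<html><body><table>$rows</table></body></html>")


def render_page(n_rows=1000):
    """Render a small HTML table and return (html, seconds spent rendering)."""
    start = time.perf_counter()
    rows = "".join(
        ROW.substitute(name=f"item{i}", price=f"{i * 1.5:.2f}")
        for i in range(n_rows)
    )
    html = PAGE.substitute(rows=rows)
    return html, time.perf_counter() - start


html, elapsed = render_page()
print(f"rendered {len(html)} bytes in {elapsed * 1000:.1f} ms")
```

If the printed time is a small fraction of the time a request spends waiting on the network or a database, rewriting this step in C++ is unlikely to be noticeable to the user.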
What C++ and Python can do for you
A typical Python web service is much faster to write and easier to deploy than a server written in C++. If you did it in C++, you would first need to handle sockets and connections, and for those you would either use an existing library or write your own handling. If you use an existing library (which I strongly recommend), you are basically not doing anything differently than Python does. If you write your own handling, there are many, many low-level things you can get wrong that will burn the performance you wish for. No, that is not an option.
If you need speed, and Python plus the server and template framework is not enough, you should rethink your architectural approach. Then take a look at the C10k problem and write tiny pieces in C. (C10k is a very hot topic, too.) But I cannot see many reasons not to use a high-level language like Python if you are only looking for performance in a medium-complex web-serving application.
Summary: The difference
If you are just serving files from the hard drive, I guess your Python program would even be faster than your hand-crafted C++ server. If you use a framework written in C or C++ and just drop in your static pages, I guess you get a boost of something like 2-5 fold over Python. Then again, if your web application is a bit more complex than serving static content, I estimate that the difference will diminish very quickly and you will get a 1-2 fold speed gain at most.
It's not all about speed...
One note about another difference between C++ and Python that one should not forget: since C++ is really compiled and not as dynamic as Python, you would gain a lot of static error analysis by using C++. Writing correct code is always difficult, but it can be done in both C++ and Python with good tests and static analysis; the latter is simpler in C++ (my opinion). If that is an issue for you, you may want to think again, but you asked about speed.

Related

Why is using more than one language in application server projects? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
After a while browsing the source code of big projects, especially application servers like this one, I have come to understand that these projects are not developed in a single language. Many of them use Python as a secondary language.
Now I have four questions:
Why is more than one language used?
Why is Python often used as the secondary language?
Why is Python not used to develop all parts of these projects, and why do they still use C/C++?
And which parts of a project should be developed in Python, and which in C/C++?
Hard and soft layers
Programming language designs tend to trade off between "high-level" features, which enhance programmer productivity at the cost of speed, and "low-level" features, which require a great deal of programmer effort but produce very fast code.
Therefore it sometimes makes sense to use two languages on a project:
Write 90% of the code in an expressive, high level language which is easy to write and maintain.
Write the 10% of performance-critical code in a low-level language which is harder to write, but allows for comprehensive optimisation.
c2wiki calls this the HardAndSoftLayers pattern:
By virtue of the first rule of optimization, go ahead and write most of your code in the highest level language you can find.
By virtue of the third rule of optimization, when you must, use a profiler and find the slow parts of your program. Take those parts, and write them in a lower-level language.
For reference, the rules of optimisation are:
First Rule Of Optimization - Don't.
Second Rule Of Optimization - Don't... yet.
Profile Before Optimizing
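As a minimal sketch of "profile before optimizing", the standard-library cProfile module can show which functions dominate the run time before anything is rewritten in a lower-level language. The functions slow_sum, fast_part and main are invented for this example.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately slow: sums squares in a pure-Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_part():
    """Cheap work that should barely show up in the profile."""
    return sum(range(100))


def main():
    fast_part()
    return slow_sum(200_000)


profiler = cProfile.Profile()
profiler.enable()
result = main()
profiler.disable()

# Sort by cumulative time and capture the top entries as text.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

In a report like this, only the functions near the top are candidates for the "hard" layer; everything else can stay in the high-level language.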
The rule is pretty simple: developers choose the language(s) based more or less on the following criteria:
their familiarity with it
how easily you can do the task using that language
how well suited the language is to the specific task
Today most development done in such multilingual environments involves huge solutions, where different components need to communicate, exchange data, or simply do work that comprises more than one step. It is easier to write the communication, data interpretation, or whatever wrapping is necessary in a language such as Python and then leave the real-time and speed-critical work to some lower-level language that compiles directly to machine code, without the need for an interpreter.
Let's dig a little bit deeper.
How familiar developers are with a programming language depends on the background of each developer. If they are given a free choice, obviously they will pick the language they know best, unless there is lobbying from someone else... usually higher in the management chain. Python is not necessarily the language of choice; Python is simply an easy language to use and learn, which is well suited for most tasks. Our project has not a bit of Python in it, only tons of Ruby code, because the main developer liked Ruby at that time, so we're stuck with it.
If you know more than one programming language, you know that each of them does the same thing differently. For example, creating a socket, connecting to a server, reading the data and printing it out is just a few lines of Erlang code, but it takes a lot more to do it in C++ (for example...). So again, if you have a task you know how to solve easily in a specific language, you are going to stick to it. People are lazy; they don't necessarily learn new stuff unless needed.
Obviously you are not going to write a device driver in Python, and it is much easier to create a complete web service with Java than with plain C... but you would still need the part of the solution that does the hardware-close work. When you have a task, you carefully measure the requirements and implications and wisely choose the language you will do it in, because you will be stuck with it forever.
1. Sometimes Python is not good enough. Computer vision, image or sound processing, and crunching tons of data are not really what Python is good at. Other languages like C or C++ are really good in those fields.
2. Suppose your primary language is Java, and you want to glue other languages into one project. That is where Python comes in. Python is a well-known glue language. You can use ctypes, SWIG, Jython, IronPython, or other methods to bind multiple languages together.
3. I guess I answered this question in point 1.
4. Need speed? Go for C or C++. Care more about productivity? Go with Python.
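As a small illustration of Python as a glue language, the sketch below calls the C library function strlen through the standard-library ctypes module. It assumes a POSIX system (Linux or macOS), where ctypes.CDLL(None) exposes the symbols of the C library already loaded into the process; on Windows the library would have to be loaded differently.

```python
import ctypes

# Assumption: POSIX system. CDLL(None) gives access to symbols that are
# already loaded into the current process, including the C standard library.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"hello")
print(length)  # 5
```

The same mechanism works for your own shared library written in C; declaring argtypes and restype up front avoids silent type mismatches at the boundary.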
Without referring to the project you linked, I'll give you my 50c on why the company I work for uses Python quite often in its projects.
Primarily, we have no Python code relating to the software solution itself. All our Python code either assists with development, machine setup, or deployment of common framework tools for testing, or, for the most part, does code generation.
Why is more than one language used?
No project we work on has only one language, when looking at all our enterprise level solutions or large scale implementations.
This is mostly due to the fact that our tiers are written in languages that provide best performance and usability at each level separately.
For instance, C++ for speedy core back-end services, and C#/.NET for a rapidly developed, good UI on the front end.
Why is Python often used as the secondary language?
Personally, apart for the reasons I explained above, we don't make use of python 'secondary often'. We use C++/C# as the most common pair, but depending on the platform, might be other pairs.
Why is Python not used to develop all parts of these projects, and why do they still use C/C++?
Python is great for quick solutions and for doing things you wish your shell could do. This largely involves file management, etc.
C++ is perhaps the fastest compiled language, which makes it a good fit for core, heavily used operations.
Based on that, and the fact that the market has more knowledge and experience in C++ (for many reasons), C++ is the more popular choice.
And which parts of a project should be developed in Python, and which in C/C++?
I believe I may have already addressed that above.
-
I hope I could help, please remember this is only my personal opinion and by no means should this be taken as a fact.

Speed - embedding python in c++ or extending python with c++

I have some big mysql databases with data for calculations and some parts where I need to get data from external websites.
I have used Python for the whole thing until now, but what shall I say: it's not a speedster.
Now I'm thinking about mixing Python with C++ using Boost::Python and Python C API.
The question I've got now is: what is the better way to get some speed.
Shall I extend Python with some C++ code, or shall I embed Python code into a C++ program?
I will surely get some speed increase by using C++ code for the calculating parts, and I think that calling the Python interpreter inside a C application will not be better, because the Python interpreter will run the whole time. I would also have to wrap Python libraries like mysqldb or urllib3 to have a nice way to work with them inside C++.
So what would you suggest is the better way to go: extending or embedding?
( I love the python language, but I'm also familiar with c++ and respect it for speed )
Update:
So I switched some parts from Python to C++ and used multithreading (the real kind) in my C modules, and my program now needs 30 minutes instead of 7 hours :))))
In principle, I agree with the first two answers. Anything coming from disk or across a network connection is likely to be a bigger bottleneck than the application.
All the research of the last 50 years indicates that people often have inaccurate intuition about system performance issues. So IMHO, you really need to gather some evidence by measuring what is actually happening, then choose a solution based on that evidence.
To try to confirm what is causing the slow performance, measure the system and user time of your application (e.g. time python prog.py), and measure the load on the machine.
If the application is maxing out the CPU, and most of that time is spent in the application (user time), then there may be a case for using a more efficient technology for the application.
But if the CPU is not maxed, or the application spends most of its time in the system (system time), and not in the application (user time), then it is unlikely that changing the application programming technology will help significantly. (This is an example of Amdahl's Law http://en.wikipedia.org/wiki/Amdahl%27s_law)
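Amdahl's law can be written as a one-line formula: if a fraction p of the run time is made s times faster, the overall speedup is 1 / ((1 - p) + p / s). The sketch below evaluates it for two hypothetical cases; the fractions and the 50x factor are invented numbers, not measurements.

```python
def amdahl_speedup(fraction_improved, factor):
    """Overall speedup when `fraction_improved` of the run time becomes `factor` times faster."""
    if not 0.0 <= fraction_improved <= 1.0:
        raise ValueError("fraction_improved must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction_improved) + fraction_improved / factor)


# Hypothetical: only 10% of the time is spent in application code.
mostly_waiting = amdahl_speedup(0.10, 50)
# Hypothetical: 90% of the time is spent in application code.
mostly_computing = amdahl_speedup(0.90, 50)

print(f"{mostly_waiting:.2f}x vs {mostly_computing:.2f}x")
```

With these made-up numbers, a 50x faster language yields roughly an 11% overall improvement in the first case and roughly 8.5x in the second, which is why the measurement has to come first.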
You may also need to measure the performance of your database server, and maybe your network connection, to identify the source of the bottleneck, but start with the easiest part.
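The same user/system/wall-clock comparison can be made from inside a Python program with os.times() and time.perf_counter(). This is a minimal sketch: cpu_bound and io_bound are invented stand-ins, and the sleep only simulates waiting on a database or a remote website.

```python
import os
import time


def measure(func, *args):
    """Run func(*args) and return (wall, user, system) seconds for this process."""
    wall_start = time.perf_counter()
    cpu_start = os.times()
    func(*args)
    cpu_end = os.times()
    wall = time.perf_counter() - wall_start
    return wall, cpu_end.user - cpu_start.user, cpu_end.system - cpu_start.system


def cpu_bound():
    return sum(i * i for i in range(300_000))


def io_bound():
    time.sleep(0.2)  # stands in for waiting on a database or a remote website


for name, func in (("cpu_bound", cpu_bound), ("io_bound", io_bound)):
    wall, user, system = measure(func)
    print(f"{name}: wall={wall:.3f}s user={user:.3f}s system={system:.3f}s")
```

When wall-clock time is much larger than user plus system time, the program is mostly waiting, and a faster language will not help that part.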
In my opinion, in your case it makes no sense to embed Python in C++, while the reverse could be beneficial.
In most programs, the performance problems are very localized, which means that you should rewrite the problematic code in C++ only where it makes sense, leaving Python for the rest.
This gives you the best of both worlds: the speed of C++ where you need it, and the ease of use and flexibility of Python everywhere else. What is also great is that you can do this step by step, replacing the slow code paths one by one, always leaving you with the whole application in a usable (and testable!) state.
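One way to keep the application usable at every step is to import the compiled version of a function when it exists and fall back to the pure-Python version otherwise. In this sketch the extension module name _fast_stats and the function mean_of_squares are hypothetical; the compiled module could be built with Boost.Python or the Python C API.

```python
try:
    # Hypothetical compiled extension; not a real package.
    from _fast_stats import mean_of_squares
except ImportError:
    def mean_of_squares(values):
        """Pure-Python fallback with the same signature as the compiled version."""
        values = list(values)
        if not values:
            raise ValueError("values must not be empty")
        return sum(v * v for v in values) / len(values)


print(mean_of_squares([1, 2, 3]))
```

Because both versions share one signature, the same tests can run against either, and the rest of the Python code does not change when the C++ version is added.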
The reverse wouldn't make sense: you'd have to rewrite almost all the code, sacrificing the flexibility of the Python structure.
Still, as always when talking about performance, measure before acting: if your bottleneck is not CPU- or memory-bound, switching to C++ isn't likely to produce much of an advantage.

What's a good and safe language for drawing intensive particle systems over a long period of time?

Another one of my rather ambiguous questions today, sorry.
Currently I have written some half-decent software that has a 'roll your own' RESTful client, which pulls data from Twitter. This data is then visualized with a number of particle systems using openFrameworks (a framework for C++).
My plan was to run the software indefinitely on my VPS and build some kind of front-end GUI allowing users to explore the pretty particles and so on. Between the JSON library I am using, C/C++, openFrameworks, and freaking Xcode 4, I have produced way too many SIGABRT and GDB errors to care for. I have gone to the ends of the virtual world to fix them, and rewrote everything over and over. I even managed to SIGABRT the openFrameworks draw circle method, HAH!
(TL;DR starts here) OK, so anyway, I am starting from scratch, looking for a powerful language that can crunch maths, blast through a good set of particles, and run well over very long periods of time. Right now I am thinking about Haskell; any ideas?
Thanks in advance all!
Haskell's (or more specifically GHC's) number crunching speed is approaching that of C++ but it's a little way behind. However, it's certainly not terrible, and Haskell's advantages in parallelism may become important. That is, if you write it in straight Haskell first, there's a good chance that it'll be easy to refactor it to run in parallel now or in the future. That isn't so true of C++.
The 'vector' package (on Hackage) would be a good choice for arrays suitable for number crunching. It supports mutable arrays in case that sort of approach is needed. However, if you're prepared to go more on the bleeding edge and your algorithm can be parallelized, you might want to look at the 'repa' package, and for extreme performance on a GPU, take a look at 'Accelerate' (which works but is still categorized as experimental).
The crashes you mention sound like they could be an indication of a bit of complexity in your problem. Where Haskell does well is in managing the complexity of... well, anything. So, if the problem is complex, then Haskell will serve you very well.
The foreign function interface in Haskell is well designed, though you will need to write C glue between Haskell and C++. So, that's another option for your number crunching.
For the web interface, take a look at 'yesod' which is seeing very active development and advertises itself as doing RESTful.
AFAIK, number crunching speed is not Haskell's strongest point - it's a highly abstract language, far from the 'metal'; its strength in a numeric processing context lies in the "mathiness" of its semantics - Haskell code often reads much like a Mathematical proof, and many of its concepts are borrowed from various fields of Mathematics.
For plain old number crunching, C++ is probably still your best choice, as it allows you to stay close to the hardware and optimize tight loops at the machine level, while offering higher-level programming constructs to manage complexity.
OTOH, if you have a library in place for the heavy lifting, and you merely need to write the glue to make the various parts work together, then go with whatever you're most comfortable with - python, C#, java, haskell, C++, ... - as long as they have bindings for all your libraries, you're good. If you don't have a library, then you might also consider writing the performance critical parts in C, and then pull them into your favorite high-level language - this is trivial in C++, slightly harder in python or haskell, and pretty damn inconvenient in java.

Will web development in C++ CGI really give a huge performance gain?

I'm asking the question after reading this article
http://stevehanov.ca/blog/index.php?id=95
Also, isn't there a penalty for using CGI instead of FastCGI?
Update: why do some people claim, as in one answer, "that you get 20-30% performance improvement"? Is it a pure guess, or does this number come from a solid benchmark? I have looked at HipHop, and its performance gain is more on the scale of 10 times.
I've done webdev in a few languages and frameworks, including python, php, and perl. I host them myself and my biggest sites get around 20k hits a day.
Any language and framework that has reasonable speed can be scaled up to take 20k hits a day just by throwing resources at it. Some take more resources than others. (Plone, Joomla. I'm looking at you).
My Witty sites (none in production yet) take a lot more pounding with siege (from memory, around 5000% more) than, for example, my Python sites. I.e., when I hit them as hard as I can with siege, the Witty sites serve a lot more pages per second.
I know it's not a true general test though.
Other speed advantages that witty gives you:
Multi threading
If you deploy with the built-in web server (behind HAProxy, for example) and make your app multi-threaded, it'll use a lot less memory than, say, a Perl or PHP app.
Generally with PHP and Perl apps, you'll have Apache fire up a process for each incoming connection, and each process loads the whole PHP interpreter, all the code, variables, objects and whatnot. With heavy frameworks like Joomla and WordPress (depending on the number of plugins), each process can get pretty humongous in memory consumption.
With the Wt app, each session loads a WApplication instance (a C++ object) and its whole tree of widgets and stuff. But the memory the code uses stays the same, no matter how many connections there are.
The inbuilt Web2.0 ness
Generally, traditional apps are still built around the old 'HTTP request comes in' .. 'we serve a page' .. 'done' style of things. I know they are adding more and more AJAXy kinds of things all the time.
With Wt, it defaults to using WebSockets where possible, to only update the part of the page that needs updating. It falls back to standard AJAX, then if that's not supported http requests. With the AJAX and WebSockets enabled clients, the same WApplication C++ object is continually used .. so no speed is lost in setting up a new session and all that.
In response to the 'C++ is too hard for webdev'
C++ does have a bit of a learning curve. In the mid nineties we did websites in Java j2ee. That was considered commercially viable back then, and was a super duper pain to develop in, but it did have a good advantage of encouraging good documentation and coding practices.
With scripting websites, it's easy to take shortcuts and not realize they're there. For example one 8 year old perl site I worked on had some code duplicated and nobody noticed. Each time it showed a list of products, it was running the same SQL query twice.
With a C++ site, I think it'd have less chance because, in the perl site, there wasn't that much programming structure (like functions), it was just perl and embedded html. In C++ you'd likely have methods with names and end up with a name clash.
Types
One time, there was a method that took an int identifier; later on we changed it to a UUID string. The Python code was great, we didn't think we needed to change it; it ran fine. However, there was a little line buried deep down that had a different effect when you passed it a string. It was a very hard bug to track down, and it corrupted the database. (Luckily only on dev and test machines.)
C++ would certainly have complained a lot, and forced us to rewrite the functions involved and not be lazy buggers.
With C++ and Java, the compiler raises errors and warnings for a lot of those sorts of mistakes for you.
I find unit testing is generally not as completely necessary with C++ apps (don't shoot me), compared to scripting language apps. This is due to the language enforcing a lot of stuff that you'd normally put in a unit test for say a python app.
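The kind of mistake described in the Types story can be sketched as follows. This is a made-up example and not the code from that project: a function written for an int keeps "working" when it is handed a string, because the * operator means something different for each type. A type annotation (which a static checker such as mypy can verify) plus a runtime guard recovers some of what a C++ compiler would enforce.

```python
def next_batch_key(identifier):
    """Unannotated: arithmetic for an int, silent string repetition for a str."""
    return identifier * 2


def next_batch_key_checked(identifier: int) -> int:
    """Same logic with an annotation for static checkers and a runtime guard."""
    if not isinstance(identifier, int):
        raise TypeError(f"expected int, got {type(identifier).__name__}")
    return identifier * 2


print(next_batch_key(21))    # 42
print(next_batch_key("ab"))  # abab, and no error is raised
```

The unchecked version is the dangerous one: nothing fails at the call site, and the wrong value travels on until it reaches something like a database write.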
Summary
From my experience so far .. Wt does take longer to develop stuff in than existing frameworks .. mainly because the existing frameworks have a lot more out of the box stuff there. However it is easier to make extremely customized apps in Wt than say Wordpress imho.
People I've spoken with who've moved from PHP to Wt (a C++ web framework) have reported significant improvements. From the small applications I've created using Wt to learn it, I've seen it run faster than the same kind of PHP applications I created. Take the information for what you will, but I'm sold.
This reminds me of how 20-30 years ago people were pitting Assembly against C, and then 10-20 years ago C against C++. Of course C++ will be faster than PHP/Rails, but it'll take 5x more effort to build a maintainable and scalable application.
The point is that you get a 20-30% performance improvement while sacrificing your development resources. Would you rather have your app work 30% faster or have 1/2 of the features implemented?
Most web applications are network-bound instead of processor-bound. Writing your application in C++ instead of a higher-level language doesn't make much sense unless you're doing really heavy computation. Also, writing correct C++ programs is difficult. It will take longer to write the application and it is more likely that the program will fail in spectacular ways due to misused pointers, memory errors, undefined behavior, etc. In general, I would say it is not worth it.
Whenever you eliminate a layer of interpretive or OS abstraction, you are bound to get some performance gain. That being said, the language or technology itself does not automatically mean all your problems are solved. I've fixed C++ code that took many hours to process a relatively simple set of records. The problem was in the implementation, and the fix was not related to the language's features or limitations.
Assuming things are all implemented correctly, you're sure to get better performance. The problem will be in finding the bugs. One of the problems with C++ is that many developers are currently "trained" or accustomed to having a lot of details related to memory management hidden behind objects. This appears to eliminate the need to consider things like, "What can happen if I pass this pointer around to several threads?" Sometimes it works well, but not always. You still have some subtleties of the language that you need to consider, regardless of how the objects hide the nasty details.
In my experience, you'll need several seasoned C++ developers watching over the code to be able to keep the bugs and memory leaks from getting out of hand.
I'm certainly not sold on this. If you want a performance gain over PHP why not use a Java (or better yet Scala) framework? These are much better for web development, have nice, relatively easy to use frameworks and avoid a lot of the headaches of C++. I've always seen one of the main pluses of web-development (and most modern non-scientific/high performance applications) as being able to avoid the headaches that come along with C/C++ development.

C++ Server-Side-Scripting

I have come across a lot of advice against using C++ for server-side scripting (SSS), recommending so-called interpreted languages like Perl and PHP instead. But I need the advanced OO features and flexibility of C++ to ensure scalable and more manageable code.
I have tried many internet articles and searches, and none were helpful, to the point that I still have no idea whether it is possible to write server-side scripts in C++ and, if so, how.
I have thought of couple ideas, including writing a web-server in C++ and responding accordingly after parsing the HTTP request. But it would be re-inventing the wheel and I'll end up deviating from my main project and dedicating a lot of work to ensure a functional-cum-secure HTTP server.
I have also considered PHP extensions but again the approach also comes with its own baggage and overhead.
My questions are:
Is it possible to program SSS in C++?
If yes, then what are the approaches at my disposal.
Thanks!
Ignoring, for the moment, the advisability of using C++ for SSS, your first choice would probably be Wt. Contrary to the implications in some of the other answers: no, development time is not likely to increase by 10x (or anywhere close to it). And no, you're not missing all the nice infrastructure features you'd expect in things like PHP, Perl or Python either.
In fact, my own experience is rather the opposite: while PHP (for example) makes it pretty easy to get a web site up and running fairly quickly, producing a web site that's really stable, secure, and responsive is a whole different story. With Wt, rather the opposite seems to be the case (at least in my, admittedly limited, experience). Getting the initial site up and running will probably take a little longer -- but about as soon as it looks, acts, and feels the way you want, it's likely to need only rather minor tweaks to be ready for public use.
Getting back to the advisability question: developing in C++ may be a bit more complex than in some languages that are more common in the SSS market -- but it's still a piece of cake compared to doing security well. If somebody has even the slightest difficulty writing C++ (e.g., tracking and freeing memory when it's no longer needed), I definitely don't want them getting close to the code for my web site.
I wouldn't recommend it, but you can certainly write CGI scripts in C++ (or in C, or in FORTRAN). But why bother? Languages like PHP do a much better job more easily, and they seem to scale well for some pretty major sites.
CGI is the "standard" way to have C or C++ code handling web requests, but you might also look into writing a module that gets linked into the web server at runtime. Google for "apache module API" (if using Apache) or "IIS module" (if using IIS).
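CGI itself is a language-neutral contract: the web server passes request data in environment variables such as QUERY_STRING, and the program writes headers, a blank line, and a body to standard output. The sketch below shows that contract in Python only to keep it short; a C++ program would read the same variables and write the same bytes. The function name build_response and the name parameter are assumptions for illustration.

```python
import os
import sys
from urllib.parse import parse_qs


def build_response(environ):
    """Build a complete CGI response (headers, blank line, body) from CGI variables."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    body = f"Hello, {name}!\n"
    headers = "Content-Type: text/plain; charset=utf-8\r\n\r\n"
    return headers + body


if __name__ == "__main__":
    # The web server sets QUERY_STRING and reads the response from stdout.
    sys.stdout.write(build_response(os.environ))
```

Note that plain CGI starts a new process for every request, which is the overhead that FastCGI and in-server modules are designed to avoid.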
Can you afford 10x as much development time? All the infrastructure-ish bits that you take for granted in PHP, Perl, and Python are non-existent or much harder to use in C++.
I see only two valid reasons to do this:
1. You only have C++ on your platform.
2. The server really has very high performance needs that would benefit from problem specific optimizations.
You can write a CGI application in C++ using an appropriate framework (like this one). But I'd recommend just going with Perl or PHP. It will save you a lot of time. Those tools are just better suited for this kind of job.
EDIT: corrected the link
I couldn't understand your exact requirements (license, etc) but this might be what you are looking for http://cppcms.sourceforge.net.