C++ interacting with a dynamic webpage? - c++

I was thinking recently about what projects I could start that would be of use to me and this came up. I post on various forums a daily updated journal entry that is the same for each forum. I also keep a log of the journal entries as individual docx files on my hard drive. I figured it would be great if I could create a program that would be given an input docx file and then post its contents as a new reply to all the daily journal threads on the forums that I have.
I am well versed in C++ for college-style programming (algorithms, programming competitions, science-based assignments and such), but not at all experienced with practical applications. My first question to get me started with this new idea of mine is whether there are any C++ libraries that allow interaction with a dynamic webpage like I described.
Thanks Much,
Michael

That problem can be approached as simply as using cURL (or a similar library) to GET pages and POST form data, or it could be as complicated as writing a Firefox XPCOM extension.

Any interaction with a web page (posting a reply, tweeting, or searching) is typically either a POST or a GET HTTP request. As mentioned by greyfade, to construct such requests you would use cURL or something like Asio from the Boost libraries.
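As a rough illustration of what "POST form data" means in practice: the body of a typical form POST is a string of URL-encoded key=value pairs joined by `&`. The sketch below only builds that string using the standard library; the field names in the usage note are hypothetical, and a real forum will expect its own fields (plus login cookies or tokens). The resulting string is the kind of value you would hand to an HTTP library, for example via libcurl's `CURLOPT_POSTFIELDS` option.

```cpp
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Percent-encode one value for an application/x-www-form-urlencoded body.
// Letters, digits, '-', '_' and '.' pass through, a space becomes '+',
// and every other byte becomes %XX (a deliberately conservative choice).
std::string form_encode(const std::string& value) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : value) {
        if (std::isalnum(c) || c == '-' || c == '_' || c == '.') {
            out += static_cast<char>(c);
        } else if (c == ' ') {
            out += '+';
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Join key/value pairs into the body of a form POST.
std::string build_form_body(
    const std::vector<std::pair<std::string, std::string>>& fields) {
    std::string body;
    for (const auto& field : fields) {
        if (!body.empty()) body += '&';
        body += form_encode(field.first) + '=' + form_encode(field.second);
    }
    return body;
}
```

For example, `build_form_body({{"message", "Day 1: ran 5k"}, {"thread", "42"}})` yields `message=Day+1%3A+ran+5k&thread=42`.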

Related

Best way of logging a user in C++

I am trying to get into C++ programming so apologise if this is a bit of a stupid question.
I am attempting to create a program in C++ that is linked to a website via the database; that's all sorted. In this program, the user must log in to be able to use its features, and I've also managed to do this fine. My question is, what is the best way of storing that user's session so I can refer to their username, display that user's settings from the database, etc.?
I am unsure, but I don't think C++ has session options like in PHP, so I cannot do it that way. I did some googling before I posted this and spent all night trying to find a solution, but I found nothing.
My knowledge of C++ is slim and this may sound like a more complicated or unnecessary route to take, but I was thinking of perhaps creating a txt file when the user logs in, storing that user's username, and then reading it when I need to refer to that username for queries and such. When the user logs out or closes the program, it deletes the file. Is that stupid? Forgive me if it is.
Is there better way to go about this?
Thanks for your time!
EDIT
I read your comments. If it needs to be a stand-alone application, like some sort of client, you could take a look at the C++ libraries I mentioned, but I'd use a higher-level language (Java or C# have good documentation and there are many tutorials for creating GUIs, if that's what you're looking for; I think even Python would make a good candidate).
If you really must use C++, your best bet would be to use an existing library to implement your web solution. POCO includes an HTTP server framework, and a library for sockets and other forms of low-level network programming. Boost ASIO can also serve your purposes. But this is hardly something I'd recommend to start learning programming, or C++ for that matter.
If you want to learn about web programming, then you should definitely take a look at other languages. PHP or ASP.NET come to mind. As you learn, you'll most likely also end up writing some form of JavaScript. You can find a lot of info out there; just Google for tutorials. A site to get started is w3Schools, but any site with tutorials will do. Good luck!

Magic Software (uniPaaS)- get data from webservice

Can someone show me a simple example of how to get data from web service on Magic software.
Assume that I have WSDL file and the function I need called "getData()"
Well, it's been a while since you asked, and hopefully you found an answer by now, but I have not perused this site before. Anyway, the documentation that comes with Magic XPA includes a manual called "Mastering Magic". Chapter 34 is a step-by-step "how to" for consuming web services.
We (some of the users) also have meetings weekly and you are welcome to join us, to get live walkthroughs for debugging details:
http://unipaasusers.blogspot.com/p/meeting-schedule.html
By and large, web services are very easy in Magic. Being able to automatically define XML files or XML BLOBS as data sources saves a lot of time. The problems tend to occur when there are syntax issues with the XML, which can be "fun"!

Very Simple C++ Web Crawler/Spider?

I am trying to do a very simple web crawler/spider app in C++. I have been searching using Google for a simple one to understand the concept. I found this:
spider_simpleCrawler
However, it is complicated to understand for me, since I started learning C++ about 1 month ago.
This is, for example, what I'm trying to do:
Enter the URL: www.example.com (I will use bash->wget, to get the contents/source code),
Look for, maybe "a href" link, and then store in some data file.
Is there a simpler tutorial or guide on the Internet?
All right, I'll try to point you in the right direction. Conceptually, a webcrawler is pretty simple. It revolves around a FIFO queue data structure which stores pending URLs. C++ has a built-in queue structure in the standard library, std::queue, which you can use to store URLs as strings.
The basic algorithm is pretty straightforward:
1. Begin with a base URL that you select, and place it on the top of your queue
2. Pop the URL at the top of the queue and download it
3. Parse the downloaded HTML file and extract all links
4. Insert each extracted link into the queue
5. Go to step 2, or stop once you reach some specified limit
Now, I said that a webcrawler is conceptually simple, but implementing it is not so simple. As you can see from the above algorithm, you'll need: an HTTP networking library to allow you to download URLs, and a good HTML parser that will let you extract links. You mentioned you could use wget to download pages. That simplifies things somewhat, but you still need to actually parse the downloaded HTML docs. Parsing HTML correctly is a non-trivial task. A simple string search for <a href= will only work sometimes. However, if this is just a toy program that you're using to familiarize yourself with C++, a simple string search may suffice for your purposes. Otherwise, you need to use a serious HTML parsing library.
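For a toy program, the simple string search mentioned above might look like the sketch below. It only recognises double-quoted `href="..."` attributes, so it misses single-quoted or unquoted links and knows nothing about comments or scripts; `extract_links` is just an illustrative name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Naive link extraction by plain string search for href="...".
// Good enough for experiments, not a substitute for a real HTML parser.
std::vector<std::string> extract_links(const std::string& html) {
    const std::string marker = "href=\"";
    std::vector<std::string> links;
    std::size_t pos = 0;
    while ((pos = html.find(marker, pos)) != std::string::npos) {
        std::size_t start = pos + marker.size();  // first character of the URL
        std::size_t end = html.find('"', start);  // closing quote
        if (end == std::string::npos) break;      // unterminated attribute
        links.push_back(html.substr(start, end - start));
        pos = end + 1;                            // continue after this attribute
    }
    return links;
}
```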
There are also other considerations you need to take into account when writing a webcrawler, such as politeness. People will be pissed and possibly ban your IP if you attempt to download too many pages, too quickly, from the same host. So you may need to implement some sort of policy where your webcrawler waits for a short period before downloading each site. You also need some mechanism to avoid downloading the same URL again, obey the robots exclusion protocol, avoid crawler traps, etc... All these details add up to make actually implementing a robust webcrawler not such a simple thing.
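One possible way to express the "wait before hitting the same host again" policy is to remember when each host was last contacted. The class below is a sketch with made-up names (`PolitenessPolicy`, `wait_time`, `record_request`); it only computes how long to wait and leaves the actual sleeping, robots.txt handling, and trap detection to the caller.

```cpp
#include <chrono>
#include <map>
#include <string>

// Tracks the last request time per host and reports how long a crawler
// should still wait before contacting that host again.
class PolitenessPolicy {
public:
    using Clock = std::chrono::steady_clock;

    explicit PolitenessPolicy(std::chrono::milliseconds min_delay)
        : min_delay_(min_delay) {}

    // Remaining wait for host at time now; zero if the host may be contacted.
    std::chrono::milliseconds wait_time(const std::string& host,
                                        Clock::time_point now) const {
        auto it = last_request_.find(host);
        if (it == last_request_.end()) return std::chrono::milliseconds(0);
        auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
            now - it->second);
        return elapsed >= min_delay_ ? std::chrono::milliseconds(0)
                                     : min_delay_ - elapsed;
    }

    // Call this right after sending a request to host.
    void record_request(const std::string& host, Clock::time_point now) {
        last_request_[host] = now;
    }

private:
    std::chrono::milliseconds min_delay_;
    std::map<std::string, Clock::time_point> last_request_;
};
```

A crawler would call `wait_time(host, Clock::now())`, sleep for the returned duration if it is non-zero, send the request, and then call `record_request`.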
That said, I agree with larsmans in the comments. A webcrawler isn't the greatest way to learn C++. Also, C++ isn't the greatest language to write a webcrawler in. The raw-performance and low-level access you get in C++ is useless when writing a program like a webcrawler, which spends most of its time waiting for URLs to resolve and download. A higher-level scripting language like Python or something is better suited for this task, in my opinion.
Check this web crawler and indexer written in C++: Mitza web crawler
The code can be used as a reference. It is clean and provides a good start for
coding a webcrawler. Sequence diagrams can be found on the pages linked above.
A web-crawler has the following components in it:
Downloading an HTML file
Extracting links from it
Pushing all the links into a queue
{web indexing and ranking if necessary}
Repeating this with the front element of the queue
This one has it all: Web-Crawler.
It would be very helpful for beginners who want a complete understanding of a web crawler, along with the concepts of multithreading and web ranking.

Django: create an internal twitter like wall

I'm building a multi-user system and I'm creating a somewhat experimental idea for the users to interact.
The site is for professional actors so they can post up their profile and so casting directors can find them. All that is going fine.
What I now want to do is create a wall/Twitter-style group area where people can post short messages just like on Twitter.
I'm developing all this in Django and while I have a really good Django developer working on the site, I've decided to take on this part myself. I'm relatively new to django, I have 10 years PHP/Java experience.
I've set up the basics of posting a message and parsing urls etc. What I want to do now is create a reply to and a direct message feature.
Are there any other projects out there that have done something similar and could help me avoid re-inventing the wheel completely?
Also on a general note as an idea, any suggestions as to what to do different considering my environment and audience.
Check out Trillr (including considering binning any work you've done and just including this wholesale, open-source licence permitting).

Any way to display C++ on a webpage?

Is there a relatively easy way to display the output of a C++ program on a webpage? And I don't mean manually, in other words, you see it on a webpage as it runs not as in I make a code tag and write it in myself.
EDIT: Just so everybody can get this clear I am going to post this up here. I am NOT trying to make a webpage in C++. Please excuse me if this sounds spiteful or anything but I am getting a lot of answers relating to that.
Step one, get yourself a server-side language. Be that PHP, ASP, Python, Ruby, whatever. Get it set up so you can serve it.
Step two, find your language's exec equivalent. Practically all of them have one. It'll let you run a command as if it were from the command line, usually with arguments, and capture the output. Here's PHP's:
http://php.net/manual/en/function.exec.php
Of course, if you're passing user-input as arguments, sanitise!
I've just seen that you accepted Scott's answer. I usually wouldn't chase up a SO thread so persistently but I fear you're about to make a mistake that you'll come to regret down the line. Giving direct access to your program and its own built-in server is a terrible idea for two reasons:
You waste a day implementing this built-in server and then getting it to persist and testing it
More importantly, you've just opened up another attack vector into your server. When it comes to security, keep it simple.
You're far better having your C++ app running behind another (mature) server side language as all the work is done for you and it can filter the input to keep things safe.
You could write a CGI app in C++, or you could use an existing web server language to execute the command and send the output to the client.
You want to use Witty.
Wt (pronounced 'witty') is a C++ library for developing interactive web applications.
The API is widget-centric and similar to desktop GUI APIs. To the developer, it offers complete abstraction of any web-specific implementation details, including event handling, graphics support, graceful degradation (or progressive enhancement), and pretty URLs.
Unlike many page-based frameworks, Wt was designed for creating stateful applications that are at the same time highly interactive (leveraging techniques such as AJAX to their fullest) and accessible (supporting plain HTML browsers), using automatic graceful degradation or progressive enhancement.
The library comes with an application server that acts as a stand-alone web server or integrates through FastCGI with other web servers.
I am not sure this is what you are looking for, but you may want CGI. You may also want to look at this SO question; C++ may not be the best language for what you want to do.
Based on the questions you posted, writing a web app like the one you want is no simple task. What I would recommend is to use some other library (this is one I found with a quick Google search) to get a web console on your server, and deny execute permission to the user it runs under on every folder except the one where your app is installed.
This is still a risky method if you don't set up the security correctly, but it is the easiest way to make the application interactive without digging too deeply into existing libraries.
EDIT --
The "best" solution is to learn AJAX and have your program post its own pages with it but, like I said, it will not be easy.
It sounds like you want something like a telnet session embedded in a webpage. A quick google turns up many Java telnet apps, though I'm not qualified to evaluate which would be most ideal to embed in html.
You would set up the login script on the host machine to run your c++ app and the user would interact with it through the shell window. Note though that this will only work for pure command line apps. If you want to use a GUI app in this way, then you should look into remote desktop software or VNC.
It may be worth looking into Adobe's "Alchemy" project on Adobe Labs.
This may help you with what you're trying to achieve.
:)
Are you looking for something like what codepad.org does? I believe they explain how they did it here.
There is a library called C++ Server Pages, part of Poco. I used it for one of my college projects and it's pretty good. There is also good documentation to get started with; you can find it here: http://pocoproject.org/docs/