Embedding Mongoose Web Server in C++

I just embedded the Mongoose Web Server into my C++ DLL (it is just a single header and is recommended in most of the Stack Overflow threads), and I have it up and running properly with the very minimal example code.
However, I am having a rough time finding any tutorials or examples on configuring the very basic necessities of a web server. I need to figure out the following:
1) How to allow directory browsing
2) Is it possible to restrict download speeds on the files?
3) Is it possible to have a dynamic list of IP addresses allowed to download files?
4) How to allow the download of specific file extensions (.bz2 in this case) ANSWERED
5) How to bind to a specific IP address ANSWERED
Most of the information I have found is in regards to using the pre-compiled binary release, so I am a bit stumped right now. Any help would be fantastic!

1) "enable_directory_listing" option
2) Not built into Mongoose (at least not the version I have, which is about 6 months old). [EDIT:] Newer versions of Mongoose support throttling download speed. From the manual...
Limit download speed for clients. throttle is a comma-separated list
of key=value pairs, where key could be:
* limit speed for all connections
x.x.x.x/mask limit speed for specified subnet
uri_prefix_pattern limit speed for given URIs
The value is a floating-point number of bytes per second, optionally
followed by a k or m character, meaning kilobytes and megabytes
respectively. A limit of 0 means unlimited rate. The last matching
rule wins. Examples:
*=1k,10.0.0.0/8=0 limit all accesses to 1 kilobyte per second,
but give connections from 10.0.0.0/8 subnet
unlimited speed
/downloads/=5k limit accesses to all URIs in `/downloads/` to
5 kilobytes per second. All other accesses are unlimited
3) "access_control_list" option. In the code accept_new_connection calls check_acl that compares the client's IP to a list of IPs to accept and/or ignore. From the manual...
Specify access control list (ACL). ACL is a comma separated list of IP
subnets, each subnet is prepended by '-' or '+' sign. Plus means
allow, minus means deny. If subnet mask is omitted, like "-1.2.3.4",
then it means single IP address. Mask may vary from 0 to 32 inclusive.
On each request, the full list is traversed, and the last match wins. Default
setting is to allow all. For example, to allow only the 192.168/16 subnet
to connect, run "mongoose -0.0.0.0/0,+192.168/16". Default: ""
http://code.google.com/p/mongoose/wiki/MongooseManual
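Putting points 2 and 3 together, here is a minimal sketch of how the "throttle" and "access_control_list" options could be passed at startup. It assumes the option-array style used further down in this thread and the classic mg_start(callback, user_data, options) entry point from that era of Mongoose; the subnet and rate values are just the illustrative ones from the manual excerpts above.
const char *options[] =
{
    "document_root",       "C:/",
    "listening_ports",     "127.0.0.1:8080",
    // Illustrative values taken from the manual excerpts above:
    "throttle",            "*=1k,10.0.0.0/8=0",
    "access_control_list", "-0.0.0.0/0,+192.168/16",
    NULL
};
// struct mg_context *ctx = mg_start(&callback, NULL, options);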

Of course, as soon as I give up and post, I find most of the answers were right in front of my face. Here are the options for them:
const char *options[] =
{
    "document_root",    "C:/",
    "listening_ports",  "127.0.0.1:8080",
    "extra_mime_types", ".bz2=text/plain",
    NULL
};
However, I am still not sure how to enable directory browsing. Right now, my callback function is just the basic one from the example (as seen below). What would I need to do to get the files listed?
static void *callback(enum mg_event event, struct mg_connection *conn, const struct mg_request_info *request_info)
{
    if (event == MG_NEW_REQUEST)
    {
        // Echo requested URI back to the client
        mg_printf(conn, "HTTP/1.1 200 OK\r\n"
                        "Content-Type: text/plain\r\n\r\n"
                        "%s", request_info->uri);
        return ""; // Mark as processed
    }
    else
    {
        return NULL;
    }
}
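As for the remaining directory-browsing question, a likely cause (a guess, not verified against this exact Mongoose version) is that the example callback marks every MG_NEW_REQUEST as processed by returning a non-NULL value, so Mongoose never gets a chance to serve files or directory listings itself. A minimal sketch of the change would be to add the "enable_directory_listing" option mentioned earlier (the "yes"/"no" value is an assumption based on Mongoose's other boolean options) and to return NULL for requests you do not want to handle:
const char *options[] =
{
    "document_root",            "C:/",
    "listening_ports",          "127.0.0.1:8080",
    "extra_mime_types",         ".bz2=text/plain",
    "enable_directory_listing", "yes",   // assumed yes/no value
    NULL
};

static void *callback(enum mg_event event, struct mg_connection *conn, const struct mg_request_info *request_info)
{
    (void) event; (void) conn; (void) request_info;
    // Returning NULL tells Mongoose the event was not handled here,
    // so it falls back to its built-in file serving and directory listing.
    return NULL;
}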

Related

EPP Server SSL_Read hang after greeting

I am having strange problems with SSL_read/SSL_write against an EPP server.
After connecting, I read the greeting message successfully:
bytes = SSL_read(ssl, buf, sizeof(buf)); // get reply & decrypt
buf[bytes] = 0;
ball+= bytes;
cc = getInt(buf);
printf("header: %x\n",cc);
printf("Received: \"%s\"\n",buf+4);
The first 4 bytes are 00, 00, 09, EB, and I read 2539 bytes of greeting message.
After that, all operations like hello or login hang on SSL_read():
xml= "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?><eppxmlns=\"urn:ietf:params:xml:ns:epp-1.0\"><hello/></epp>";
char bb[1000] = {0};
makeChar(strlen(xml)+4, bb);
memcpy(bb+4, xml, strlen(xml)+4);
bytes = SSL_write(ssl,xml,strlen(xml)+4);
usleep(500000); //sleep 0.5 sec
memset(buf, 0, 1024);
printf("read starting.\n");
bytes = SSL_read(ssl, buf, 1024); //always hang here
buf[bytes]=0;
printf("%d : %s", bytes, buf);
I am confused. I have read the RFC documentation but I cannot find the answer.
In the EPP documentation, they said: "In order to verify the identity of the secure server you will need the 'Verisign Class 3 Public Primary Certification Authority' root certificate available free from www.verisign.com".
Is it important?
Yes. As outlined in RFC 5734, "Extensible Provisioning Protocol (EPP) Transport over TCP", the whole security of an EPP exchange rests on three properties:
an access list based on IP address
TLS communication and verification of certificates (mutually, which is why you, as registrar, i.e. the client in the EPP exchange, often have to send the registry in advance the certificate you will use)
the EPP credentials used in the <login> command.
Failure to properly secure the connection can mean:
you, as registrar, sending confidential information (your own EPP login, various details on domains you sponsor or not, including <authInfo> values, etc.) to a third party that is not the registry
and in reverse, someone impersonating you in the eyes of the registry and performing operations whose burden, financial (for all domains bought) and legal, falls on you.
But even in the general TLS case, if you as a client want to be sure you are connected to the server you think you are, you need to verify its certificate.
Besides trivial things (dates, etc.), the certificate:
should at least be signed by a CA you trust (your choice whom you trust)
and/or is a specific certificate with specific fingerprint/serial and other characteristics (but you will have to maintain that when the other party changes its certificate)
and/or matches DNS TLSA records
In short, if you are new to EPP, TLS, and C/C++ (as you state yourself in your other question about the Verisign certificate), I strongly recommend you do not try to do all of this yourself at such a low level. For example, you should never manipulate XML as you do above; it shouldn't be a string, and there are libraries to properly parse and generate XML documents. You should use an EPP library that handles most of this for you. Your registry may provide an "SDK" that you can use; you should ask them.
PS: your read is probably hanging because you are not sending the payload in the correct framing (again, something an EPP library will do for you). You need to send the length, as 4 bytes (computed after converting your string to bytes using the UTF-8 encoding), and then the payload itself; I am not sure that is what your code does. Your reading part is also wrong: you should first read 4 bytes from the server, which give you the length (note that they will not necessarily arrive in a single packet, so one SSL_read might not return all 4 of them and you need a loop). Once you know the length of the payload, you can set up proper buffers, if needed, and detect reliably when you have received everything.
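To illustrate the framing this answer describes, here is a minimal, untested sketch assuming an already-connected OpenSSL SSL* handle; per RFC 5734 the 4-byte header carries the total length of header plus payload, in network byte order. The helper names are hypothetical.
#include <openssl/ssl.h>
#include <arpa/inet.h>
#include <cstring>
#include <string>
#include <vector>
#include <stdexcept>

// Read exactly `len` bytes, looping because SSL_read may return short reads.
static void read_exact(SSL *ssl, unsigned char *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        int n = SSL_read(ssl, buf + got, static_cast<int>(len - got));
        if (n <= 0)
            throw std::runtime_error("SSL_read failed or connection closed");
        got += static_cast<size_t>(n);
    }
}

static void write_all(SSL *ssl, const unsigned char *buf, size_t len)
{
    size_t sent = 0;
    while (sent < len) {
        int n = SSL_write(ssl, buf + sent, static_cast<int>(len - sent));
        if (n <= 0)
            throw std::runtime_error("SSL_write failed");
        sent += static_cast<size_t>(n);
    }
}

// Send one EPP data unit: 4-byte header (payload length + 4) followed by the XML bytes.
void epp_send(SSL *ssl, const std::string &xml)
{
    uint32_t total = htonl(static_cast<uint32_t>(xml.size() + 4));
    unsigned char header[4];
    std::memcpy(header, &total, 4);
    write_all(ssl, header, 4);
    write_all(ssl, reinterpret_cast<const unsigned char *>(xml.data()), xml.size());
}

// Receive one EPP data unit: read the 4-byte header first, then exactly the payload.
std::string epp_recv(SSL *ssl)
{
    unsigned char header[4];
    read_exact(ssl, header, 4);
    uint32_t total;
    std::memcpy(&total, header, 4);
    total = ntohl(total);
    if (total < 4)
        throw std::runtime_error("invalid EPP length header");
    std::vector<unsigned char> payload(total - 4);
    read_exact(ssl, payload.data(), payload.size());
    return std::string(payload.begin(), payload.end());
}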

Need kind of 'interval ranges' computations

I need a library for my C++ program,
but the problem is that I don't know the name of the data type I want.
I have an NPAPI plugin (I know this API is deprecated and removed from modern browsers) which issues
HTTP range requests to a server. The requests are asynchronous and the data may arrive in any order, in chunks of any size,
so I need to track the ranges I have already requested from the server.
For example, if initially I requested bytes [10-20] (inclusively) and then requested [30-40], the data type I need should keep this as two intervals:
[10-20],[30-40]
But if I request [21-29], or even [15-35], it should be merged into one interval:
[10-20],[30-40] + [15-35] = [10-40]
I also need subtraction, for when a requested block arrives:
[10-40] - [20-30] = [10-19],[31-40]
(requested - arrived = what we are still waiting for)
I had a look at the boost::numeric::interval library, but at first glance it is too big for this task (1583 files, 13 MB of sources after './dist/bin/bcp numeric/interval ~/boost').
Also, GNU ddrescue has some similar arithmetic inside, but the code there isn't a library; it is coupled too tightly to the application's specifics.
UPDATE:
Here is what I've found on my way:
A container for integer intervals, such as RangeSet, for C++
https://en.wikipedia.org/wiki/Interval_tree
Boost.ICL
NCBI C++ Toolkit, CIntervalTree
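Of the options listed above, Boost.ICL maps most directly onto the merge/subtract behaviour described. A minimal sketch (interval bounds taken from the examples above; exact printed notation may differ):
#include <boost/icl/interval_set.hpp>
#include <iostream>

int main()
{
    boost::icl::interval_set<int> wanted;   // byte ranges requested but not yet received

    // Request [10-20] and [30-40] (closed, inclusive intervals).
    wanted += boost::icl::discrete_interval<int>::closed(10, 20);
    wanted += boost::icl::discrete_interval<int>::closed(30, 40);

    // Requesting [15-35] overlaps both, so everything merges into [10-40].
    wanted += boost::icl::discrete_interval<int>::closed(15, 35);

    // A chunk [20-30] arrives: subtract it, leaving [10-19] and [31-40].
    wanted -= boost::icl::discrete_interval<int>::closed(20, 30);

    std::cout << wanted << '\n';   // prints something like {[10,19][31,40]}
}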

Aerospike error: All batch queues are full

I am running an Aerospike cluster in Google Cloud. Following the recommendation in this post, I updated to the latest version (3.11.1.1) and re-created all servers. In fact, this change caused my 5 servers to operate at a much lower CPU load (it was around 75% before; now it is around 20%, as shown in the graph below).
Because of this low load, I decided to reduce the cluster size to 4 servers. When I did this, my application started to receive the following error:
All batch queues are full
I found this discussion about the topic, recommending changing the parameters batch-index-threads and batch-max-unused-buffers with the command
asadm -e "asinfo -v 'set-config:context=service;batch-index-threads=NEW_VALUE'"
I tried many values of batch-index-threads (2, 4, 8, 16), and also changed the batch-max-unused-buffers param, but nothing solved the problem. I keep receiving the All batch queues are full error.
Here is the relevant information from my aerospike.conf:
service {
    user root
    group root
    paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
    paxos-recovery-policy auto-reset-master
    pidfile /var/run/aerospike/asd.pid
    service-threads 32
    transaction-queues 32
    transaction-threads-per-queue 4
    batch-index-threads 40
    proto-fd-max 15000
    batch-max-requests 30000
    replication-fire-and-forget true
}
I use 300GB SSD disks on these servers.
A quick note which may or may not pertain to you:
A common mistake we have seen in the past is that developers decide to use 'batch get' as a general purpose 'get' for single and multiple record requests. The single record get will perform better for single record requests.
It's possible that you are being constrained by the network between the clients and servers. Reducing from 5 to 4 nodes reduced the aggregate pipe. In addition, removing a node will start cluster migrations which adds additional network load.
I would look at the batch-max-buffer-per-queue config parameter.
Maximum number of 128KB response buffers allowed in each batch index
queue. If all batch index queues are full, new batch requests are
rejected.
In conjunction with raising this value from the default of 255, you will also want to raise batch-max-unused-buffers to at least batch-index-threads x batch-max-buffer-per-queue + 1. If you do not, new buffers will be created and destroyed constantly, because the number of free (unused) buffers is smaller than the number in use. The moment the batch response is served, the system will try to trim the buffers back down to the max-unused number. You will see this reflected in the batch_index_created_buffers metric rising constantly.
Be aware that you need to have enough DRAM for this. For example if you raise the batch-max-buffer-per-queue to 320 you will consume
40 (`batch-index-threads`) x 320 (`batch-max-buffer-per-queue`) x 128K = 1600MB
For the sake of performance the batch-max-unused-buffers should be set to 13000 which will have a max memory consumption of 1625MB (1.59GB) per-node.
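As a worked check of that rule of thumb, using the settings above: 40 (batch-index-threads) x 320 (batch-max-buffer-per-queue) + 1 = 12,801 unused buffers minimum, which rounds up to the suggested 13,000; and 13,000 x 128 KB = 1,664,000 KB ≈ 1,625 MB, matching the per-node figure quoted.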

Is there a maximum concurrency for AWS s3 multipart uploads?

Referring to the docs, you can specify the number of concurrent connections when pushing large files to Amazon Web Services S3 using the multipart uploader. While it does say the concurrency defaults to 5, it does not specify a maximum, or whether the size of each chunk is derived from total file size / concurrency.
I trawled the source code and the comment is pretty much the same as the docs:
Set the concurrency level to use when uploading parts. This affects
how many parts are uploaded in parallel. You must use a local file as
your data source when using a concurrency greater than 1
So my functional build looks like this (the vars are defined, by the way; this is just condensed for the example):
use Aws\Common\Exception\MultipartUploadException;
use Aws\S3\Model\MultipartUpload\UploadBuilder;

$uploader = UploadBuilder::newInstance()
    ->setClient($client)
    ->setSource($file)
    ->setBucket($bucket)
    ->setKey($file)
    ->setConcurrency(30)
    ->setOption('CacheControl', 'max-age=3600')
    ->build();
It works great, except a 200 MB file takes 9 minutes to upload... with 30 concurrent connections? That seems suspicious to me, so I upped the concurrency to 100 and the upload time was 8.5 minutes. Such a small difference could just be the connection and not the code.
So my question is whether there is a concurrency maximum, what it is, and whether you can specify the size of the chunks or chunk size is calculated automatically. My goal is to get a 500 MB file to transfer to AWS S3 within 5 minutes, however much I have to optimize to get there.
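As a rough sanity check on those numbers (my arithmetic, not from the original exchange): 200 MB in 9 minutes is about 0.37 MB/s, roughly 3 Mbit/s, and the 500 MB-in-5-minutes goal needs a sustained ~1.7 MB/s (about 13 Mbit/s), which points at bandwidth rather than concurrency as the bottleneck. Also, if the SDK is splitting into parts near the S3 minimum of 5 MB, a 200 MB upload has only about 40 parts, which would explain why raising concurrency from 30 to 100 changed almost nothing.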
Looking through the source code, it looks like 10,000 is the maximum number of concurrent connections. There is no automatic calculation of chunk size based on concurrent connections, but you can set it yourself if needed for whatever reason.
I set the chunk size to 10 MB with 20 concurrent connections and it seems to work fine. On a real server I got a 100 MB file to transfer in 23 seconds, much better than the 3 1/2 to 4 minutes it was getting in the dev environments. Interesting, but them's the stats, should anyone else come across this same issue.
This is what my builder ended up being:
$uploader = UploadBuilder::newInstance()
    ->setClient($client)
    ->setSource($file)
    ->setBucket($bucket)
    ->setKey($file)
    ->setConcurrency(20)
    ->setMinPartSize(10485760)
    ->setOption('CacheControl', 'max-age=3600')
    ->build();
I may need to up that max cache, but as of now this works acceptably. The key was moving the processing code to the server and not relying on the weakness of my dev environments, no matter how powerful the machine or how good the internet connection.
You can abort the process during upload, halting all operations and aborting the upload at any point. You can also set the concurrency and minimum part size:
$uploader = UploadBuilder::newInstance()
    ->setClient($client)
    ->setSource('/path/to/large/file.mov')
    ->setBucket('mybucket')
    ->setKey('my-object-key')
    ->setConcurrency(3)
    ->setMinPartSize(10485760)
    ->setOption('CacheControl', 'max-age=3600')
    ->build();

try {
    $uploader->upload();
    echo "Upload complete.\n";
} catch (MultipartUploadException $e) {
    $uploader->abort();
    echo "Upload failed.\n";
}

C program to find UP and DOWN machines?

Suppose I have a scenario like the one below:
There are about 225 computers with the following ranges of IP addresses and hostnames:
PC-LAB      IP ADDRESS RANGE            HOSTNAME RANGE
PC-LAB1     10.11.2.01 - 10.11.2.30     ccl1pc01 - ccl1pc30
PC-LAB2     10.11.3.01 - 10.11.3.30     ccl2pc01 - ccl2pc30
PC-LAB3     10.11.4.01 - 10.11.4.45     ccl3pc01 - ccl3pc45
PC-LAB4     10.11.5.01 - 10.11.5.50     ccl4pc01 - ccl4pc50
PC-LAB5     10.13.6.01 - 10.13.6.65     ccl5pc01 - ccl5pc65
I want to write a program (in C / C++) that takes the above IP address and hostname ranges as input and creates two separate files, one with the 225 IP addresses and another with the 225 hostnames.
The program should then figure out which of these machines are up and which are down, and create two more files, one containing the hostnames and IP addresses of the systems which are UP and another for those which are DOWN.
E.g.
FILE1.down
Hostname IP address
ccl1pc10 10.8.2.10
ccl5pc25 10.11.5.25
Note: if any Ubuntu command simplifies this work, we can certainly use it in the program!
Look at fping
Run this command, sit back and relax:
fping ccl{1..5}pc{01..65}
This will print a list of:
All hosts that have a DNS name/IP record and are up
All hosts that have a DNS name/IP record but are down
All hosts that are not in the DNS (I guess you can ignore these)
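Building on that suggestion, here is a minimal C++ sketch that runs fping over a host list and splits the results into UP and DOWN files. It assumes fping's usual "<host> is alive" / "<host> is unreachable" output lines; the host names and output file names are just illustrative.
#include <cstdio>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

int main()
{
    // Illustrative host list; in practice generate it from the lab ranges above.
    std::vector<std::string> hosts = {"ccl1pc01", "ccl1pc02", "ccl5pc65"};

    std::ostringstream cmd;
    cmd << "fping";
    for (const auto &h : hosts)
        cmd << ' ' << h;
    cmd << " 2>&1";   // capture stderr too, where fping reports some failures

    FILE *pipe = popen(cmd.str().c_str(), "r");
    if (!pipe) {
        std::cerr << "failed to run fping\n";
        return 1;
    }

    std::ofstream up("FILE1.up"), down("FILE1.down");
    char line[512];
    while (std::fgets(line, sizeof line, pipe)) {
        std::string s(line);
        if (s.find(" is alive") != std::string::npos)
            up << s;                // reachable hosts
        else if (s.find(" is unreachable") != std::string::npos)
            down << s;              // known hosts that did not answer
    }
    pclose(pipe);
    return 0;
}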
Check out Nmap. You might need to create a small wrapper to handle the input and output in the format you want, but it should do exactly what you need.
Is this homework, i.e. do you have to do it programmatically in C?
Otherwise there are a dozen already-existing monitoring frameworks, with several already in Ubuntu: Munin, Nagios, Zabbix, ...