Realtime TCP/IP stack - c++

I want to program (as efficiently as possible) a TCP/IP communication stack in C or C++. It really must run as fast as possible.
Does anyone have a good example or suggestion of where to start?

This is not meant as an insult, but the people who developed the stacks for the well-established operating systems have been doing this for years. This is what they do. Unless you are in the business, I suggest you look at a different approach.
The different approach being: pick a stack that has decent performance (I hear that the latest TCP/IP stack in Solaris is nifty), then tune the hell out of it (there are lots of different flags and settings you can tune). If that fails to meet your needs, consider hardware solutions such as TCP offloading.
Writing your own stack means you have to be confident that you can beat perhaps thousands of man-years' worth of effort in this field.
If this is for self-development and learning, I suggest something simple like the source code for Minix; it may have a simple-to-understand stack.
My two cents.

This is a huge task. I would recommend the Contiki operating system as a possible starting point. It has a TCP/IP stack.

As Steve points out in the comments you do need quite a bit of experience to do this well. So rather than jumping directly to your end goal I recommend these possible steps:
Write a reliable transport using UDP as a normal user-land protocol.
Write a custom protocol using raw sockets in user-land.
Write a kernel-level protocol module/driver.
Write your stack on an FPGA network card.
Linux is a good option as the details you need are easily accessible and documented.
And oh yeah, stop as soon as you realize you won't likely outperform the Linux kernel.
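As a taste of the first step in the list above (a reliable transport on top of UDP), here is a minimal sketch of the receive-side bookkeeping for a stop-and-wait scheme. The names Packet, StopAndWaitReceiver and on_packet are made up for illustration, and the actual sendto/recvfrom calls are left out so the logic can be exercised without a network.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical wire format for this sketch: a sequence number plus a payload.
struct Packet {
    std::uint32_t seq;
    std::string payload;
};

// Receiver half of a stop-and-wait protocol: accepts packets strictly in
// order and tells the caller which sequence number to acknowledge.
class StopAndWaitReceiver {
public:
    // Returns the sequence number to ACK. In-order packets are delivered;
    // duplicates (retransmissions) and out-of-order packets are dropped and
    // the last accepted packet is acknowledged again. An ACK of 0 means
    // "nothing accepted yet".
    std::uint32_t on_packet(const Packet& p) {
        if (p.seq == expected_) {
            delivered_.push_back(p.payload);
            ++expected_;
        }
        return expected_ - 1;
    }

    const std::vector<std::string>& delivered() const { return delivered_; }

private:
    std::uint32_t expected_ = 1;  // sequence numbers start at 1 in this sketch
    std::vector<std::string> delivered_;
};
```

The sender side would keep retransmitting packet N on a timer until it sees an ACK for N. That is the part where you start to appreciate how much work TCP does for you.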

This may be worth looking at:
Implementing a High Performance Object Oriented TCP/IP Protocol Stack
Thesis for the Degree of Master of Science Peter Kjellerstedt and
Henrik Baard

lwIP, a lightweight TCP/IP stack, is the best place to start learning about TCP/IP stacks:
git clone git://git.savannah.nongnu.org/lwip.git

How to make a GameBoy / GameBoy Advance Emulator? [duplicate]

How do emulators work? When I see NES/SNES or C64 emulators, it astounds me.
Do you have to emulate the processor of those machines by interpreting its particular assembly instructions? What else goes into it? How are they typically designed?
Can you give any advice for someone interested in writing an emulator (particularly a game system)?
Emulation is a multi-faceted area. Here are the basic ideas and functional components. I'm going to break it into pieces and then fill in the details via edits. Many of the things I'm going to describe will require knowledge of the inner workings of processors -- assembly knowledge is necessary. If I'm a bit too vague on certain things, please ask questions so I can continue to improve this answer.
Basic idea:
Emulation works by handling the behavior of the processor and the individual components. You build each individual piece of the system and then connect the pieces much like wires do in hardware.
Processor emulation:
There are three ways of handling processor emulation:
Interpretation
Dynamic recompilation
Static recompilation
With all of these paths, you have the same overall goal: execute a piece of code to modify processor state and interact with 'hardware'. Processor state is a conglomeration of the processor registers, interrupt handlers, etc for a given processor target. For the 6502, you'd have a number of 8-bit integers representing registers: A, X, Y, P, and S; you'd also have a 16-bit PC register.
With interpretation, you start at the IP (instruction pointer -- also called PC, program counter) and read the instruction from memory. Your code parses this instruction and uses this information to alter processor state as specified by your processor. The core problem with interpretation is that it's very slow; each time you handle a given instruction, you have to decode it and perform the requisite operation.
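To make the interpretation loop concrete, here is a minimal sketch of a fetch/decode/execute loop for a tiny subset of the 6502 (LDA immediate, TAX, INX, BRK). The Cpu struct and the load, set_zn and run helpers are names invented for this example, and BRK is simply treated as "stop", which is not what real hardware does.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Processor state: the 8-bit registers, the 16-bit PC and 64 KiB of memory.
struct Cpu {
    std::uint8_t a = 0, x = 0, y = 0, p = 0, s = 0xFD;
    std::uint16_t pc = 0;
    std::array<std::uint8_t, 65536> mem{};
};

// Copy a program into memory at addr.
inline void load(Cpu& c, std::uint16_t addr, const std::vector<std::uint8_t>& prog) {
    for (std::size_t i = 0; i < prog.size(); ++i)
        c.mem[(addr + i) & 0xFFFF] = prog[i];
}

// Update the zero (0x02) and negative (0x80) flags in P from a result.
inline void set_zn(Cpu& c, std::uint8_t v) {
    c.p = static_cast<std::uint8_t>((c.p & ~0x82) | (v == 0 ? 0x02 : 0) | (v & 0x80));
}

// Fetch, decode and execute until BRK. Returns the number of instructions run.
inline int run(Cpu& c) {
    int executed = 0;
    for (;;) {
        std::uint8_t opcode = c.mem[c.pc++];                    // fetch
        ++executed;
        switch (opcode) {                                       // decode + execute
        case 0xA9: c.a = c.mem[c.pc++]; set_zn(c, c.a); break;  // LDA #imm
        case 0xAA: c.x = c.a; set_zn(c, c.x); break;            // TAX
        case 0xE8: ++c.x; set_zn(c, c.x); break;                // INX
        case 0x00: return executed;                             // BRK: stop (sketch only)
        default: throw std::runtime_error("unimplemented opcode");
        }
    }
}
```

Loading the bytes A9 05 AA E8 00 and calling run leaves A = 5 and X = 6. Note that every pass through the loop pays for the fetch and the switch again, which is exactly the overhead described above.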
With dynamic recompilation, you iterate over the code much like interpretation, but instead of just executing opcodes, you build up a list of operations. Once you reach a branch instruction, you compile this list of operations to machine code for your host platform, then you cache this compiled code and execute it. Then when you hit a given instruction group again, you only have to execute the code from the cache. (BTW, most people don't actually make a list of instructions but compile them to machine code on the fly -- this makes it more difficult to optimize, but that's out of the scope of this answer, unless enough people are interested)
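Here is a rough sketch of the translate-then-cache idea. To keep it portable, it does not emit real host machine code: each guest instruction is turned into a C++ closure, and a block is a list of closures keyed by the guest address it starts at. State, Block and Recompiler are hypothetical names, only LDX immediate, DEX, BNE and BRK are handled, only the zero flag is modeled, and cache invalidation for self-modifying code is ignored entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <unordered_map>
#include <vector>

struct State {
    std::uint8_t x = 0;
    bool z = false;  // zero flag only, for brevity
    std::uint16_t pc = 0;
    std::vector<std::uint8_t> mem = std::vector<std::uint8_t>(65536, 0);
};

// A translated block. Closures stand in for emitted host machine code.
using Op = std::function<void(State&)>;
struct Block {
    std::vector<Op> ops;
    bool halts = false;
};

class Recompiler {
public:
    int translations = 0;  // how many blocks had to be translated

    // Run from s.pc until a block ending in BRK has executed.
    void run(State& s) {
        bool halted = false;
        while (!halted) {
            auto it = cache_.find(s.pc);
            if (it == cache_.end())  // not seen before: translate and cache
                it = cache_.emplace(s.pc, translate(s, s.pc)).first;
            for (const Op& op : it->second.ops) op(s);  // execute cached block
            halted = it->second.halts;
        }
    }

private:
    // Decode guest instructions starting at pc until a branch ends the block.
    Block translate(const State& s, std::uint16_t pc) {
        ++translations;
        Block b;
        for (;;) {
            std::uint8_t opcode = s.mem[pc++];
            switch (opcode) {
            case 0xA2: {  // LDX #imm
                std::uint8_t imm = s.mem[pc++];
                b.ops.push_back([imm](State& st) { st.x = imm; st.z = (imm == 0); });
                break;
            }
            case 0xCA:  // DEX
                b.ops.push_back([](State& st) { --st.x; st.z = (st.x == 0); });
                break;
            case 0xD0: {  // BNE rel: a branch ends the block
                auto off = static_cast<std::int8_t>(s.mem[pc++]);
                std::uint16_t fall = pc;
                auto taken = static_cast<std::uint16_t>(pc + off);
                b.ops.push_back([fall, taken](State& st) { st.pc = st.z ? fall : taken; });
                return b;
            }
            case 0x00:  // BRK: treated as "halt" in this sketch
                b.halts = true;
                return b;
            default:
                throw std::runtime_error("unimplemented opcode");
            }
        }
    }

    std::unordered_map<std::uint16_t, Block> cache_;
};
```

For a countdown loop such as A2 03 CA D0 FD 00, the loop body is translated once and then replayed from the cache on later iterations. A real dynarec gets its speed from the same reuse, just with native code instead of closures.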
With static recompilation, you do the same as in dynamic recompilation, but you follow branches. You end up building a chunk of code that represents all of the code in the program, which can then be executed with no further interference. This would be a great mechanism if it weren't for the following problems:
Code that isn't in the program to begin with (e.g. compressed, encrypted, generated/modified at runtime, etc) won't be recompiled, so it won't run
It's been proven that finding all the code in a given binary is equivalent to the Halting problem
These combine to make static recompilation completely infeasible in 99% of cases. For more information, Michael Steil has done some great research into static recompilation -- the best I've seen.
The other side to processor emulation is the way in which you interact with hardware. This really has two sides:
Processor timing
Interrupt handling
Processor timing:
Certain platforms -- especially older consoles like the NES, SNES, etc -- require your emulator to have strict timing to be completely compatible. With the NES, you have the PPU (picture processing unit), which requires that the CPU put pixels into its memory at precise moments. If you use interpretation, you can easily count cycles and emulate proper timing; with dynamic/static recompilation, things are a /lot/ more complex.
Interrupt handling:
Interrupts are the primary mechanism by which the CPU communicates with hardware. Generally, your hardware components will tell the CPU which interrupts they care about. This is pretty straightforward -- when your code throws a given interrupt, you look at the interrupt handler table and call the proper callback.
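A minimal sketch of such a handler table, assuming interrupts are identified by an 8-bit number and that devices register plain callbacks. InterruptController, subscribe and raise are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Maps interrupt numbers to the callbacks that emulated devices registered.
class InterruptController {
public:
    using Handler = std::function<void()>;

    // A device declares interest in an interrupt.
    void subscribe(std::uint8_t irq, Handler h) {
        table_[irq].push_back(std::move(h));
    }

    // Called by the CPU core when the guest code raises an interrupt.
    // Returns how many handlers ran.
    std::size_t raise(std::uint8_t irq) {
        auto it = table_.find(irq);
        if (it == table_.end()) return 0;  // nobody cares about this one
        for (auto& h : it->second) h();
        return it->second.size();
    }

private:
    std::map<std::uint8_t, std::vector<Handler>> table_;
};
```

A timer device, for instance, would call subscribe once at startup, and the CPU core would call raise from the opcode that triggers a software interrupt.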
Hardware emulation:
There are two sides to emulating a given hardware device:
Emulating the functionality of the device
Emulating the actual device interfaces
Take the case of a hard-drive. The functionality is emulated by creating the backing storage, read/write/format routines, etc. This part is generally very straightforward.
The actual interface of the device is a bit more complex. This is generally some combination of memory mapped registers (e.g. parts of memory that the device watches for changes to do signaling) and interrupts. For a hard-drive, you may have a memory mapped area where you place read commands, writes, etc, then read this data back.
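As an illustration of memory-mapped registers, here is a small sketch of a bus that routes reads and writes either to plain RAM or to a device that claimed an address range. The Device interface, the ToyDisk register layout (offset 0 selects a position, offset 1 reads the byte there) and the 0xD000 base address are all invented for the example and do not correspond to any real controller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Interface every emulated device exposes to the bus.
struct Device {
    virtual ~Device() = default;
    virtual std::uint8_t read(std::uint16_t offset) = 0;
    virtual void write(std::uint16_t offset, std::uint8_t value) = 0;
};

// Toy "disk": the backing storage is the functionality, the two registers
// are the interface the guest sees.
class ToyDisk : public Device {
public:
    explicit ToyDisk(std::vector<std::uint8_t> image) : image_(std::move(image)) {}

    std::uint8_t read(std::uint16_t offset) override {
        // Offset 1: data register, returns the byte at the selected position.
        if (offset == 1 && pos_ < image_.size()) return image_[pos_];
        return 0xFF;  // anything else reads as 0xFF in this sketch
    }
    void write(std::uint16_t offset, std::uint8_t value) override {
        if (offset == 0) pos_ = value;  // Offset 0: position-select register
    }

private:
    std::vector<std::uint8_t> image_;
    std::size_t pos_ = 0;
};

class Bus {
public:
    // Route [base, base + size) to a device instead of RAM.
    void map(std::uint16_t base, std::uint16_t size, Device* dev) {
        regions_.push_back(Region{base, size, dev});
    }
    std::uint8_t read(std::uint16_t addr) {
        if (Region* r = find(addr))
            return r->dev->read(static_cast<std::uint16_t>(addr - r->base));
        return ram_[addr];
    }
    void write(std::uint16_t addr, std::uint8_t value) {
        if (Region* r = find(addr))
            r->dev->write(static_cast<std::uint16_t>(addr - r->base), value);
        else
            ram_[addr] = value;
    }

private:
    struct Region {
        std::uint16_t base;
        std::uint16_t size;
        Device* dev;
    };
    Region* find(std::uint16_t addr) {
        for (auto& r : regions_)
            if (addr >= r.base && addr - r.base < r.size) return &r;
        return nullptr;
    }
    std::vector<Region> regions_;
    std::vector<std::uint8_t> ram_ = std::vector<std::uint8_t>(65536, 0);
};
```

The CPU core only ever talks to the Bus, so it does not need to know whether a given address is RAM or a device register.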
I'd go into more detail, but there are a million ways you can go with it. If you have any specific questions here, feel free to ask and I'll add the info.
Resources:
I think I've given a pretty good intro here, but there are a ton of additional areas. I'm more than happy to help with any questions; I've been very vague in most of this simply due to the immense complexity.
Obligatory Wikipedia links:
Emulator
Dynamic recompilation
General emulation resources:
Zophar -- This is where I got my start with emulation, first downloading emulators and eventually plundering their immense archives of documentation. This is the absolute best resource you can possibly have.
NGEmu -- Not many direct resources, but their forums are unbeatable.
RomHacking.net -- The documents section contains resources regarding machine architecture for popular consoles
Emulator projects to reference:
IronBabel -- This is an emulation platform for .NET, written in Nemerle and recompiles code to C# on the fly. Disclaimer: This is my project, so pardon the shameless plug.
BSnes -- An awesome SNES emulator with the goal of cycle-perfect accuracy.
MAME -- The arcade emulator. Great reference.
6502asm.com -- This is a JavaScript 6502 emulator with a cool little forum.
dynarec'd 6502asm -- This is a little hack I did over a day or two. I took the existing emulator from 6502asm.com and changed it to dynamically recompile the code to JavaScript for massive speed increases.
Processor recompilation references:
The research into static recompilation done by Michael Steil (referenced above) culminated in this paper and you can find source and such here.
Addendum:
It's been well over a year since this answer was submitted and with all the attention it's been getting, I figured it's time to update some things.
Perhaps the most exciting thing in emulation right now is libcpu, started by the aforementioned Michael Steil. It's a library intended to support a large number of CPU cores, which use LLVM for recompilation (static and dynamic!). It's got huge potential, and I think it'll do great things for emulation.
emu-docs has also been brought to my attention, which houses a great repository of system documentation, which is very useful for emulation purposes. I haven't spent much time there, but it looks like they have a lot of great resources.
I'm glad this post has been helpful, and I'm hoping I can get off my arse and finish up my book on the subject by the end of the year/early next year.
A guy named Victor Moya del Barrio wrote his thesis on this topic. A lot of good information on 152 pages. You can download the PDF here.
If you don't want to register with scribd, you can google for the PDF title, "Study of the techniques for emulation programming". There are a couple of different sources for the PDF.
Emulation may seem daunting but is actually quite a bit easier than simulating.
Any processor typically has a well-written specification that describes states, interactions, etc.
If you did not care about performance at all, then you could easily emulate most older processors using very elegant object oriented programs. For example, an X86 processor would need something to maintain the state of registers (easy), something to maintain the state of memory (easy), and something that would take each incoming command and apply it to the current state of the machine. If you really wanted accuracy, you would also emulate memory translations, caching, etc., but that is doable.
In fact, many microchip and CPU manufacturers test programs against an emulator of the chip and then against the chip itself, which helps them find out if there are issues in the specifications of the chip, or in the actual implementation of the chip in hardware. For example, it is possible to write a chip specification that would result in deadlocks, and when a deadlock occurs in the hardware it's important to see if it can be reproduced in the specification, since that indicates a greater problem than something in the chip implementation.
Of course, emulators for video games usually care about performance so they don't use naive implementations, and they also include code that interfaces with the host system's OS, for example to use drawing and sound.
Considering the very slow performance of old video games (NES/SNES, etc.), emulation is quite easy on modern systems. In fact, it's even more amazing that you could just download a set of every SNES game ever or any Atari 2600 game ever, considering that when these systems were popular having free access to every cartridge would have been a dream come true.
I know that this question is a bit old, but I would like to add something to the discussion. Most of the answers here center around emulators interpreting the machine instructions of the systems they emulate.
However, there is a very well-known exception to this called "UltraHLE" (Wikipedia article). UltraHLE, one of the most famous emulators ever created, emulated commercial Nintendo 64 games (with decent performance on home computers) at a time when it was widely considered impossible to do so. As a matter of fact, Nintendo was still producing new titles for the Nintendo 64 when UltraHLE was created!
For the first time, I saw articles about emulators in print magazines where before, I had only seen them discussed on the web.
The concept of UltraHLE was to make possible the impossible by emulating C library calls instead of machine level calls.
Something worth taking a look at is Imran Nazar's attempt at writing a Gameboy emulator in JavaScript.
Having created my own emulator of the BBC Microcomputer of the 80s (type VBeeb into Google), there are a number of things to know.
You're not emulating the real thing as such, that would be a replica. Instead, you're emulating State. A good example is a calculator, the real thing has buttons, screen, case etc. But to emulate a calculator you only need to emulate whether buttons are up or down, which segments of LCD are on, etc. Basically, a set of numbers representing all the possible combinations of things that can change in a calculator.
You only need the interface of the emulator to appear and behave like the real thing. The more convincing this is the closer the emulation is. What goes on behind the scenes can be anything you like. But, for ease of writing an emulator, there is a mental mapping that happens between the real system, i.e. chips, displays, keyboards, circuit boards, and the abstract computer code.
To emulate a computer system, it's easiest to break it up into smaller chunks and emulate those chunks individually. Then string the whole lot together for the finished product. Much like a set of black boxes with inputs and outputs, which lends itself beautifully to object oriented programming. You can further subdivide these chunks to make life easier.
Practically speaking, you're generally looking to write for speed and fidelity of emulation. This is because software on the target system will (may) run more slowly than the original hardware on the source system. That may constrain the choice of programming language, compilers, target system etc.
Further to that, you have to circumscribe what you're prepared to emulate. For example, it's not necessary to emulate the voltage state of transistors in a microprocessor, but it's probably necessary to emulate the state of the register set of the microprocessor.
Generally speaking, the finer the level of detail of emulation, the more fidelity you'll get to the original system.
Finally, information for older systems may be incomplete or non-existent. So getting hold of original equipment is essential, or at least prising apart another good emulator that someone else has written!
Yes, you have to interpret the whole binary machine code mess "by hand". Not only that, most of the time you also have to simulate some exotic hardware that doesn't have an equivalent on the target machine.
The simple approach is to interpret the instructions one-by-one. That works well, but it's slow. A faster approach is recompilation - translating the source machine code to target machine code. This is more complicated, as most instructions will not map one-to-one. Instead you will have to make elaborate work-arounds that involve additional code. But in the end it's much faster. Most modern emulators do this.
When you develop an emulator you are interpreting the processor assembly that the system is working on (Z80, 8080, PS CPU, etc.).
You also need to emulate all peripherals that the system has (video output, controller).
You should start by writing emulators for simple systems like the good old Game Boy (which uses a Z80-like processor, if I am not mistaken) or the C64.
Emulators are very hard to create since there are many hacks (as in unusual effects), timing issues, etc. that you need to simulate.
For an example of this, see http://queue.acm.org/detail.cfm?id=1755886.
That will also show you why you ‘need’ a multi-GHz CPU for emulating a 1MHz one.
Also check out Darek Mihocka's Emulators.com for great advice on instruction-level optimization for JITs, and many other goodies on building efficient emulators.
I've never done anything so fancy as to emulate a game console, but I did take a course once where the assignment was to write an emulator for the machine described in Andrew Tanenbaum's Structured Computer Organization. That was fun and gave me a lot of aha moments. You might want to pick that book up before diving into writing a real emulator.
Advice on emulating a real system or your own thing?
I can say that emulators work by emulating the ENTIRE hardware. Maybe not down to the circuit (as in moving bits around like the hardware would do; moving the byte is the end result, so copying the byte is fine). Emulators are very hard to create since there are many hacks (as in unusual effects), timing issues, etc. that you need to simulate. If one (input) piece is wrong, the entire system can go down or at best have a bug/glitch.
The Shared Source Device Emulator contains buildable source code to a PocketPC/Smartphone emulator (Requires Visual Studio, runs on Windows). I worked on V1 and V2 of the binary release.
It tackles many emulation issues:
- efficient address translation from guest virtual to guest physical to host virtual
- JIT compilation of guest code
- simulation of peripheral devices such as network adapters, touchscreen and audio
- UI integration, for host keyboard and mouse
- save/restore of state, for simulation of resume from low-power mode
To add to the answer provided by @Cody Brocious:
In the context of virtualization, where you are emulating a new system (CPU, I/O, etc.) for a virtual machine, we can see the following categories of emulators.
Interpretation: Bochs is an example of an interpreter. It is an x86 PC emulator that takes each instruction from the guest system and translates it into another set of instructions (of the host ISA) to produce the intended effect. Yes, it is very slow; it doesn't cache anything, so every instruction goes through the same cycle.
Dynamic emulator: QEMU is a dynamic emulator. It translates guest instructions on the fly and caches the results. The best part is that it executes as many instructions as possible directly on the host system so that emulation is faster. Also, as mentioned by Cody, it divides the code into blocks (each a single flow of execution).
Static emulator: As far as I know, there are no static emulators that are helpful in virtualization.
How I would start with emulation:
1. Get books on low-level programming; you'll need them for the "pretend" operating system of the Nintendo, Game Boy, and so on.
2. Get books on emulation specifically, and maybe OS development (you won't be making an OS, but it's the closest thing to it).
3. Look at some open source emulators, especially ones for the system you want to emulate.
4. Copy snippets of the more complex code into your IDE/compiler. This will save you writing out long code. This is what I do for OS development, using a Linux distro.
I wrote an article about emulating the Chip-8 system in JavaScript.
It's a great place to start as the system isn't very complicated, but you still learn how opcodes, the stack, registers, etc work.
I will be writing a longer guide soon for the NES.

Monitor kernel registry changes

Could people please give me pointers (no pun intended) for topics I will need to research in order to be able to do this? I'm not really an expert on Windows, however I'm very quick at picking up new concepts.
I saw the process monitor program which Mark Russinovich and Bryce Cogswell wrote:
http://technet.microsoft.com/en-gb/sysinternals/bb896645
which can look at everything happening registry key-wise within the kernel. I've been able to do this sort of thing using C# and user-level registry accesses in the past, but I couldn't reach the kernel using the wrapper suite I got from CodeProject.
Can people please help me with regard to where I should start? I guess I'm asking more for help on the Windows/OS aspect of this.
Reason for doing this:
(I'm more of a Java than C++ programmer, however I want to get into the latter. The best way to learn is to do something which interests you, so as I'm interested in real-time applications, this is the cheapest one I could think of (without having to pay for data).)
For kernel-mode, take a look at CmRegisterCallback.
I believe Process Monitor uses the Event Tracing for Windows functions, however; see, for example, EtwRegister.
Writing a kernel-mode driver to intercept registry reads/writes is extremely difficult. If you just want to see both user and kernel-mode registry accesses, the best way to do so is via a real-time ETW trace listener. With this, you get all of the monitoring you want, without the terrifying proposition of modifying a running kernel. Mark doesn't use this because at the time it didn't exist, but nowadays I'm sure he'd recommend you do this instead. If you're familiar with DTrace, ETW is Windows' closest equivalent (it's as performant as DTrace, but not nearly as user-friendly or scriptable).
Check out http://blogs.msdn.com/b/matt_pietrek/archive/2005/03/23/401080.aspx for an intro to ETW, and here's a question on SO related to real-time ETW consumers: How do I register as a real-time ETW consumer for NT Kernel Events?

discrete event simulators for C++

I am currently looking for a discrete event simulator written for C++. I did not find much on the web written specifically in OO-style; there are some, but outdated. Some others, such as Opnet, Omnet and ns3 are way too complicated for what I need to do. And besides, I need to simulate agent-based algorithms capable of simulating systems of thousands of nodes.
Does anybody know anything suitable for my needs?
Others have good direct answers, but I'm going to suggest an alternative. If I understand you right, you want a system in C++ or such where you can post events that fire in the future, and code is run when those events fire.
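For reference, the core of "post events that fire in the future" is small enough to sketch in plain C++ with a priority queue ordered by timestamp. This is only an illustration of the idea, not a replacement for the libraries mentioned elsewhere in this thread. Simulator, schedule and run are names chosen for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

class Simulator {
public:
    using Action = std::function<void()>;

    // Current simulated time.
    double now() const { return now_; }

    // Schedule action to fire delay time units from the current time.
    void schedule(double delay, Action action) {
        queue_.push(Event{now_ + delay, next_id_++, std::move(action)});
    }

    // Pop events in time order until none are left.
    void run() {
        while (!queue_.empty()) {
            Event ev = queue_.top();  // copy, since top() returns a const reference
            queue_.pop();
            now_ = ev.time;           // jump the clock straight to the event
            ev.action();              // may schedule further events
        }
    }

private:
    struct Event {
        double time;
        std::uint64_t id;  // tie-breaker: equal times fire in posting order
        Action action;
    };
    struct Later {
        bool operator()(const Event& a, const Event& b) const {
            if (a.time != b.time) return a.time > b.time;
            return a.id > b.id;
        }
    };
    std::priority_queue<Event, std::vector<Event>, Later> queue_;
    double now_ = 0.0;
    std::uint64_t next_id_ = 0;
};
```

An agent-based model would typically have each agent's handler call schedule again for its next wake-up, so thousands of nodes just mean thousands of entries in one queue.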
I had a project to do like this, and I started out trying to write such an event system in C++ and then quickly realized I had a better solution.
Have you considered writing your program in behavioral Verilog? That may seem strange to write software in a hardware description language, but a Verilog simulator is an event-based system underneath, and behavioral Verilog is a very convenient way to express events, timing, triggers, etc. There is a free Verilog simulator (which is what I used) called Icarus Verilog. If you're not using Ubuntu or some Linux distro with Icarus already in a package, building from source is straightforward.
I would recommend taking a second look at OmNet++. At first sight it may look quite complex, but if you look at it in more detail you will find that most of the complexity is in the network add-on (the INET Framework). Unless you are going to do a detailed network simulation, you do not need INET.
Using the OmNet++ core is not especially difficult, and it may be simpler than other similar tools.
You may want to have a look at an intro.
One of the things that makes OmNet++ attractive to me is its scalability. It is possible to run large simulations on a desktop. Besides, it is possible to scale the same simulation to a cluster without rewriting the code.
You should consider SystemC, although I'd also recommend taking a second look at OmNet++.
We use SIMLIB at my school. It is a very fast, easy-to-understand, object-oriented, discrete and continuous simulator. It might look outdated, but it is still maintained.
There is CSIM from Mesquite Software which supports developing models in C, C++ and Java. However, it is paid-commercial, AFAIK.
Take a look at GBL library. It's written in modern C++ and even supports C++0x features like move semantics and lambda functions. It offers several modeling mechanisms: synchronous and asynchronous event handlers, preemptive threads, and fibers. You can create purely behavioral, cycle accurate, and real-time models, or any mixture of those.

Linux IPC - Multiple writers, single reader

I have never written any IPC C++ on Linux before.
My problem is that I will have multiple clients (writers), and a single server (reader). All of these will be on the same machine. The writers will deliver chunks of data (a string/struct) to the reader. The reader will then read them in FIFO and do something with them.
The types of IPC on Linux are either Pipes or Sockets/Message Queues as far as I can tell.
I was just wondering if someone could recommend me a path to go down. I'm leaning towards sockets, but I have no real basis for that. Is there anything I should read/understand before embarking on this journey?
Thanks
The main issue you should consider is what kind of data you are passing, as this will in part determine your options. This comes down to whether your data is bounded or not. If it isn't bounded, then something stream-oriented like FIFOs or sockets is appropriate; if it is, then you might make better use of things like MQs or shared memory. Since you mention both strings and structs it is hard to say what is appropriate in your case, though if your strings are bounded within some reasonable maximum you can use anything with some minor fiddling.
The second is speed. There is never a completely correct answer for this, but from fastest to slowest it generally goes something like: shared memory, MQs, FIFOs, domain sockets, network sockets.
The third is ease of use. Shared memory is the biggest PITA since you have to handle your own synchronization. Pipes are easy so long as your message lengths stay below PIPE_BUF size. The OS handles most of your headaches with MQs. Sockets are easy enough but you have the setup boilerplate.
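To illustrate the pipe option for the multiple-writers, single-reader case, here is a sketch in which several child processes write fixed-size structs into one shared pipe and the parent reads them back. It relies on the POSIX guarantee that a write of at most PIPE_BUF bytes to a pipe is atomic, so messages from different writers do not interleave. Message and run_demo are names invented for the example, and error handling is kept to a bare minimum.

```cpp
#include <cassert>
#include <climits>  // PIPE_BUF
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

struct Message {
    int writer_id;
    int value;
};
static_assert(sizeof(Message) <= PIPE_BUF, "writes must fit in PIPE_BUF to be atomic");

// Forks writers child processes that each send per_writer messages through
// one shared pipe. Returns everything the parent (the reader) received.
std::vector<Message> run_demo(int writers, int per_writer) {
    int fds[2];
    if (pipe(fds) != 0) return {};

    for (int w = 0; w < writers; ++w) {
        pid_t pid = fork();
        if (pid < 0) break;  // could not fork: carry on with fewer writers
        if (pid == 0) {      // child: a writer
            close(fds[0]);
            for (int i = 0; i < per_writer; ++i) {
                Message m{w, i};
                // One write() per message keeps each message in one piece.
                if (write(fds[1], &m, sizeof m) != static_cast<ssize_t>(sizeof m))
                    _exit(1);
            }
            _exit(0);
        }
    }
    close(fds[1]);  // the reader must close its write end to ever see EOF

    std::vector<Message> received;
    Message m;
    // read() returns 0 once every writer has exited and the pipe is drained.
    while (read(fds[0], &m, sizeof m) == static_cast<ssize_t>(sizeof m))
        received.push_back(m);
    close(fds[0]);
    while (wait(nullptr) > 0) {}  // reap the children
    return received;
}
```

Messages from different writers arrive in whatever order the scheduler produces, but each writer's own messages stay in FIFO order. For unrelated processes you would use a named FIFO (mkfifo) instead of an inherited pipe; the read and write calls stay the same.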
Lastly, several of the IPC mechanisms have both POSIX and SYSV variants. Generally POSIX is the way to go unless the SYSV type has some feature you really need or want.
EDIT: Count0's answer reminded me that you might be interested in something more abstract and higher level. In addition to ACE you can look at Poco. And, of course, no SO answer is complete if it doesn't mention Boost somewhere.
System V IPC is somewhat fiddly to use but it is a mature, robust technology. Message queues would probably do what you want and support atomic queuing/de-queuing.
Sockets are easy to use and also support communication over a network. However, they do not do any queuing, so you would have to write the queue management code within your server. Using sockets with C++ is not vastly different to using them with C. There are plenty of guides to this on the net and books such as Stevens' 'Unix Network Programming (vol 1)' that cover this topic in some depth.
A good place to get your feet wet is this sockets tutorial.
You'll then need to bone up on threads and mutexes.
With the above you're all set to start playing ;-)
Though you've not asked for books, and because the answers above are so good, I'm only going to suggest you get your hands on copies of these two tomes:
UNIX Network Programming, Volume 2, Second Edition: Interprocess Communications, W. Richard Stevens
Advanced Programming in the UNIX Environment, Second Edition, W. Richard Stevens and Stephen A. Rago
There are inevitable ins & outs with this kind of coding, these two books will help you through whatever confusion you encounter.
Try taking a look at ACE (Adaptive Communication Environment). The ACE libraries are freely available, very mature and cross-platform. Unfortunately, good documentation is not; I would recommend this book to look for a good solution. You might take a look at this tutorial to get a feel for the patterns (at the end of the document). ACE uses a bunch of patterns to deal very successfully and efficiently with those problems, especially in a networked context, so it should be a good place to start looking for patterns and methods to use.
In particular, ACE_Task used with ACE_Message_Queue allows you to do what you need.

Implementing Semaphores, locks and condition variables

I wanted to know how to go about implementing semaphores, locks and condition variables in C/C++. I am learning OS concepts but want to get around to implementing the concepts in C.
Any tutorials?
Semaphores, locks, condition variables etc. are operating system concepts and must typically be implemented in terms of features of the operating system kernel. It is therefore not generally possible to study them in isolation - you need to consider the kernel code too. Probably the best way of doing this is to take a look at the Linux Kernel, with the help of a book such as Understanding The Linux Kernel.
Semaphore at the very simplest is just a counter you can add and subtract from with a single atomic operation. Wikipedia has an easy to understand explanation that pretty much covers your question about them:
http://en.wikipedia.org/wiki/Semaphore_(programming)
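To show the "counter with atomic add and subtract" idea in code, here is a minimal counting semaphore built on top of a mutex and a condition variable from the C++11 standard library. This is a user-space sketch: it leans on std::mutex and std::condition_variable, which are themselves implemented with help from the operating system, so it shows the logic and not how a kernel would do it. The class and method names are chosen for the example.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

class Semaphore {
public:
    explicit Semaphore(int initial) : count_(initial) {}

    // P / wait / down: block until the counter is positive, then decrement.
    void acquire() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

    // Non-blocking variant: returns false instead of waiting.
    bool try_acquire() {
        std::lock_guard<std::mutex> lock(m_);
        if (count_ == 0) return false;
        --count_;
        return true;
    }

    // V / signal / up: increment and wake one waiter, if any.
    void release() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++count_;
        }
        cv_.notify_one();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
};
```

With an initial count of 1 this behaves like a lock; with a count of N it lets at most N threads into a section at once, which is the usual way to guard a pool of N resources.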
A good starting point for learning OS concepts is probably Andrew Tanenbaum's "Modern Operating Systems". He also has another book on his own OS (Minix), which is called "Operating Systems: Design and Implementation" which goes more into detail about coding. You should be able to find those books in your local library.
Related topics you might want to look up to get the grip on how and why to use semaphores: race conditions, synchronization, multithreading, consumer-producer-problem.
At the ground level, if you want to implement that sort of thing, you're going to need to use assembly language. C and C++ simply don't expose the sort of features necessary to write concurrent code --- except by using libraries, which use assembler in their implementation.
The minix stuff is pretty good. A simpler example is the MicroC/OS stuff. It comes with a textbook that goes into good detail, all the source is there. It has the basic elements there and the code is small enough that you can understand it in a relatively short period of time.
http://www.micrium.com/products/rtos/kernel/rtos.html
http://en.wikipedia.org/wiki/MicroC/OS-II
Another thing you can do, is make a faked out OS in an application on linux. I did this by setting up the basic tick with an itimer, then swapping threads around with the function call swapcontext (man 2 swapcontext) which will save the regs on the stack. That gets the ugly stuff out of the way and you are left to implement the semaphores/mutexes/timers and all that. It was quite fun.
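A minimal sketch of the context-switching part of that approach, assuming Linux/glibc where getcontext, makecontext and swapcontext are available from ucontext.h. To keep it short it is cooperative (each task yields explicitly) and leaves out the itimer tick; task, run_tasks and the 64 KiB stack size are choices made for the example.

```cpp
#include <cassert>
#include <ucontext.h>
#include <vector>

namespace {
ucontext_t main_ctx;        // the "scheduler" context
ucontext_t task_ctx[2];     // one saved context per task
std::vector<int> trace;     // records which task ran, in order
char stacks[2][64 * 1024];  // one private stack per task

// Each task records its id, yields back to the scheduler, and repeats.
void task(int id) {
    for (int i = 0; i < 2; ++i) {
        trace.push_back(id);
        swapcontext(&task_ctx[id], &main_ctx);  // yield
    }
}
}  // namespace

// Round-robin between two tasks; returns the execution trace.
std::vector<int> run_tasks() {
    trace.clear();
    for (int id = 0; id < 2; ++id) {
        getcontext(&task_ctx[id]);
        task_ctx[id].uc_stack.ss_sp = stacks[id];
        task_ctx[id].uc_stack.ss_size = sizeof stacks[id];
        task_ctx[id].uc_link = &main_ctx;  // where to go if task() returns
        makecontext(&task_ctx[id], reinterpret_cast<void (*)()>(task), 1, id);
    }
    for (int round = 0; round < 2; ++round)
        for (int id = 0; id < 2; ++id)
            swapcontext(&main_ctx, &task_ctx[id]);  // "schedule" task id
    return trace;
}
```

Calling run_tasks yields the trace 0, 1, 0, 1: each task resumes exactly where it left off. Once switching works, a semaphore in this toy OS is just a counter plus a list of tasks that the scheduler should skip until the counter goes positive.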
Despite what some of the posts say, assembler is not required. A knowledge of it will always help. It never hurts to understand how the internals/compilers/etc work when you are writing even high level applications.
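As one illustration that no hand-written assembler is needed, here is a sketch of a spinlock using std::atomic_flag, which has been part of the standard library since C++11 (that detail is an addition, not something the answers in this thread mention). The compiler turns test_and_set into the appropriate atomic instruction for the target. SpinLock is a name chosen for the example, and a busy-waiting lock like this is only sensible for very short critical sections.

```cpp
#include <atomic>
#include <cassert>

class SpinLock {
public:
    void lock() {
        // test_and_set atomically sets the flag and returns its old value;
        // spin until we are the one who flipped it from clear to set.
        while (flag_.test_and_set(std::memory_order_acquire)) {
        }
    }

    // Single attempt: returns true if the lock was taken.
    bool try_lock() { return !flag_.test_and_set(std::memory_order_acquire); }

    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};
```

Because it provides lock and unlock, this class also works with std::lock_guard.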
For a basic understanding, you can refer to the book Operating System Concepts by Avi Silberschatz, Peter Baer Galvin and Greg Gagne, which is really good.
You can also visit Dave Marshall's site for some support. Refer to the Semaphore section there.
Funny, Stevens' book is one of the classic texts describing synchronisation primitives and their uses. He certainly seems to think they can be used to control inter-process communication. I tend to agree with him. Networking, no; IPC, yes. Most certainly yes.
You can catch up with a lot of IPC (Inter-Process Communication) books that explain the ins and outs of what you need. There is one classic book: Unix Network Programming: Interprocess Communications by Richard Stevens. You will get all you need. :)