boost::filesystem:path detect that two paths share same physical drive - c++

Background
I have some complex application which can consume lots of disc space (~10TB). To prevent unheated errors which comes from disc full scenarios my application has some logic which mages stored data.
Currently runs on Windows platform, but it is ported to Linux.
Problem
It is possible that two kinds of data are stored on different physical drive. Depending on that business logic is a bit different. Now on Windows physical drive can be identified by boost::filesystem::path::root_path() (it is not perfect, but good enough in my scenarios), but on other platforms this logic falls apart, since root_path() is always empty().
Question
I'm looking for some multi-platform solution (boost preferably) to detect if two paths are sharing same physical drive.
If there is not such thing I will have to use platform specific API and I prefer to avoid that.

I think your best bet is taking a step back and rethink your approach:
If your OS and file system support it, try creating a hard link. Now you know relatively reliably whether they are the same file system. (Unfortunately, using network file systems and the like one can still avoid the OS knowing the file systems are really the same.)
Knowing whether it is the same hard drive finally seems rather pointless for preventing stuffing too much crap onto it, even if it is interesting for throughput, and probably needs OS specific handling.
And if you know the paths should be the same, creating a test file lets you avoid any kind of flawed simulation and just let the system figure it out for you.

Related

Writing raw data "signature" to disk without corrupting filesystem

I am trying to create a program that will write a series of 10-30 letters/numbers to a disk in raw format (not to a file that the OS will read). Perhaps to make my attempt clearer, if you were to open the disk in a hex editor, you would see the 10-30 letters/numbers but a file manager such as Windows Explorer would not see it (because the data is not a file).
My goal is to be able to "sign" a disk with a series of characters and to be able to read and write that "signature" in my program. I understand NTFS signs its partitions with a NTFS flag as do other file systems and I have to be careful to not write my signature to any of those critical parts.
Are there any libraries in C++/C that could help me write at a low level to a disk and how will I know a safe sector to start writing my signature to? To narrow this down, it only needs to be able to write to NTFS, FAT, FAT32, FAT16 and exFAT file systems and run on Windows. Any links or references are greatly appreciated!
Edit: After some research, USB drives allow only 1 partition without applying hacking tricks that would unfold further problems for the user. This rules out the "partition idea" unfortunately.
First, as the commenters said, you should look at why you're trying to do this, and see if it's really a good idea. Most apps which try to circumvent the normal UI the user has for using his/her computer are "bad", in various ways.
That said, you could try finding a well-known file which will always be on the system and has some slack in the block size for the disk, and write to the slack. I don't think most filesystems would care about extra data in the slack, and it would probably even be copied if the file happens to be relocated (more efficient to copy the whole block at the disk level).
Just a thought; dunno how feasible it would be, but seems like it could work in theory.
Though I think this is generally a pretty poor idea, the obvious way to do it would be to mark a cluster as "bad", then use it for your own purposes.
Problems with that:
Marking it as bad is non-trivial (on NTFS bad clusters are stored in a file named something like $BadClus, but it's not accessible to user code (and I'm not even sure it's accessible to a device driver either).
There are various programs to scan for (and attempt to repair) bad clusters/sectors. Since we don't even believe this one is really bad, almost any of these that works at all will find that it's good and put it back into use.
Most of the reasons people think of doing things like this (like tying a particular software installation to a particular computer) are pretty silly anyway.
You'd have to scan through all the "bad" sectors to see if any of them contained your signature.
This is very dangerous, however, zero-fill programs do the same thing so you can google how to wipe your hard drive with zero's in C++.
The hard part is finding a place you KNOW is unused and won't be used.

Crossplatform library for uniquely identifying the machine my app is currently running on?

So I have next situation - shared file system, over N alike machines. My app is run on all of them. I need to understand on which machine my app runs in each instance - some unique ID... Is there such thing, is it possible to emulate it? Is there any crossplatform library that would help with that?
There are two concerns here, security and stability of your matching.
Hardware characteristics are a good place to start. Things like MAC address, CPU, hdd identifiers.
These things theoretically can change. If a hdd failed you probably would lose whatever configuration you had on the system as well. I could see a system that sent a hash of each characteristic separately work alright. If 4 out of 5 matched, you could probably guess that their network card caught on fire and it was replaced.
If you just need a head count, you may not even be interested that this new system with a different signature used to be another one.
Usually, people aren't too concerned with security with these systems; they just want to track resources on a network. If someone wanted to spoof the hardware identifiers they could. For simple cases, I would look into an installer that registered a salted identifier. If you really need something terribly secure you might start looking at commercial products (or ask another question about the security aspects specifically).
Both of these are error prone obviously. I'm not sure you should even fully automate it in those cases. Think about a case where network cards were behaving weird and you swapped them with another machine.
Human eyes are pretty good, let an administrator use them. At worst, they can probably figure things out with a quick email. Just give them enough information to make an informed decision when something does go wrong. Really, if you just log everything a human should be able to recreate the scenario and make a decision. Most of these things won't change daily. There is more work when hardware fails, not every day.

How to make a GameBoy / GameBoy Advance Emulator? [duplicate]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
How do emulators work? When I see NES/SNES or C64 emulators, it astounds me.
Do you have to emulate the processor of those machines by interpreting its particular assembly instructions? What else goes into it? How are they typically designed?
Can you give any advice for someone interested in writing an emulator (particularly a game system)?
Emulation is a multi-faceted area. Here are the basic ideas and functional components. I'm going to break it into pieces and then fill in the details via edits. Many of the things I'm going to describe will require knowledge of the inner workings of processors -- assembly knowledge is necessary. If I'm a bit too vague on certain things, please ask questions so I can continue to improve this answer.
Basic idea:
Emulation works by handling the behavior of the processor and the individual components. You build each individual piece of the system and then connect the pieces much like wires do in hardware.
Processor emulation:
There are three ways of handling processor emulation:
Interpretation
Dynamic recompilation
Static recompilation
With all of these paths, you have the same overall goal: execute a piece of code to modify processor state and interact with 'hardware'. Processor state is a conglomeration of the processor registers, interrupt handlers, etc for a given processor target. For the 6502, you'd have a number of 8-bit integers representing registers: A, X, Y, P, and S; you'd also have a 16-bit PC register.
With interpretation, you start at the IP (instruction pointer -- also called PC, program counter) and read the instruction from memory. Your code parses this instruction and uses this information to alter processor state as specified by your processor. The core problem with interpretation is that it's very slow; each time you handle a given instruction, you have to decode it and perform the requisite operation.
With dynamic recompilation, you iterate over the code much like interpretation, but instead of just executing opcodes, you build up a list of operations. Once you reach a branch instruction, you compile this list of operations to machine code for your host platform, then you cache this compiled code and execute it. Then when you hit a given instruction group again, you only have to execute the code from the cache. (BTW, most people don't actually make a list of instructions but compile them to machine code on the fly -- this makes it more difficult to optimize, but that's out of the scope of this answer, unless enough people are interested)
With static recompilation, you do the same as in dynamic recompilation, but you follow branches. You end up building a chunk of code that represents all of the code in the program, which can then be executed with no further interference. This would be a great mechanism if it weren't for the following problems:
Code that isn't in the program to begin with (e.g. compressed, encrypted, generated/modified at runtime, etc) won't be recompiled, so it won't run
It's been proven that finding all the code in a given binary is equivalent to the Halting problem
These combine to make static recompilation completely infeasible in 99% of cases. For more information, Michael Steil has done some great research into static recompilation -- the best I've seen.
The other side to processor emulation is the way in which you interact with hardware. This really has two sides:
Processor timing
Interrupt handling
Processor timing:
Certain platforms -- especially older consoles like the NES, SNES, etc -- require your emulator to have strict timing to be completely compatible. With the NES, you have the PPU (pixel processing unit) which requires that the CPU put pixels into its memory at precise moments. If you use interpretation, you can easily count cycles and emulate proper timing; with dynamic/static recompilation, things are a /lot/ more complex.
Interrupt handling:
Interrupts are the primary mechanism that the CPU communicates with hardware. Generally, your hardware components will tell the CPU what interrupts it cares about. This is pretty straightforward -- when your code throws a given interrupt, you look at the interrupt handler table and call the proper callback.
Hardware emulation:
There are two sides to emulating a given hardware device:
Emulating the functionality of the device
Emulating the actual device interfaces
Take the case of a hard-drive. The functionality is emulated by creating the backing storage, read/write/format routines, etc. This part is generally very straightforward.
The actual interface of the device is a bit more complex. This is generally some combination of memory mapped registers (e.g. parts of memory that the device watches for changes to do signaling) and interrupts. For a hard-drive, you may have a memory mapped area where you place read commands, writes, etc, then read this data back.
I'd go into more detail, but there are a million ways you can go with it. If you have any specific questions here, feel free to ask and I'll add the info.
Resources:
I think I've given a pretty good intro here, but there are a ton of additional areas. I'm more than happy to help with any questions; I've been very vague in most of this simply due to the immense complexity.
Obligatory Wikipedia links:
Emulator
Dynamic recompilation
General emulation resources:
Zophar -- This is where I got my start with emulation, first downloading emulators and eventually plundering their immense archives of documentation. This is the absolute best resource you can possibly have.
NGEmu -- Not many direct resources, but their forums are unbeatable.
RomHacking.net -- The documents section contains resources regarding machine architecture for popular consoles
Emulator projects to reference:
IronBabel -- This is an emulation platform for .NET, written in Nemerle and recompiles code to C# on the fly. Disclaimer: This is my project, so pardon the shameless plug.
BSnes -- An awesome SNES emulator with the goal of cycle-perfect accuracy.
MAME -- The arcade emulator. Great reference.
6502asm.com -- This is a JavaScript 6502 emulator with a cool little forum.
dynarec'd 6502asm -- This is a little hack I did over a day or two. I took the existing emulator from 6502asm.com and changed it to dynamically recompile the code to JavaScript for massive speed increases.
Processor recompilation references:
The research into static recompilation done by Michael Steil (referenced above) culminated in this paper and you can find source and such here.
Addendum:
It's been well over a year since this answer was submitted and with all the attention it's been getting, I figured it's time to update some things.
Perhaps the most exciting thing in emulation right now is libcpu, started by the aforementioned Michael Steil. It's a library intended to support a large number of CPU cores, which use LLVM for recompilation (static and dynamic!). It's got huge potential, and I think it'll do great things for emulation.
emu-docs has also been brought to my attention, which houses a great repository of system documentation, which is very useful for emulation purposes. I haven't spent much time there, but it looks like they have a lot of great resources.
I'm glad this post has been helpful, and I'm hoping I can get off my arse and finish up my book on the subject by the end of the year/early next year.
A guy named Victor Moya del Barrio wrote his thesis on this topic. A lot of good information on 152 pages. You can download the PDF here.
If you don't want to register with scribd, you can google for the PDF title, "Study of the techniques for emulation programming". There are a couple of different sources for the PDF.
Emulation may seem daunting but is actually quite easier than simulating.
Any processor typically has a well-written specification that describes states, interactions, etc.
If you did not care about performance at all, then you could easily emulate most older processors using very elegant object oriented programs. For example, an X86 processor would need something to maintain the state of registers (easy), something to maintain the state of memory (easy), and something that would take each incoming command and apply it to the current state of the machine. If you really wanted accuracy, you would also emulate memory translations, caching, etc., but that is doable.
In fact, many microchip and CPU manufacturers test programs against an emulator of the chip and then against the chip itself, which helps them find out if there are issues in the specifications of the chip, or in the actual implementation of the chip in hardware. For example, it is possible to write a chip specification that would result in deadlocks, and when a deadline occurs in the hardware it's important to see if it could be reproduced in the specification since that indicates a greater problem than something in the chip implementation.
Of course, emulators for video games usually care about performance so they don't use naive implementations, and they also include code that interfaces with the host system's OS, for example to use drawing and sound.
Considering the very slow performance of old video games (NES/SNES, etc.), emulation is quite easy on modern systems. In fact, it's even more amazing that you could just download a set of every SNES game ever or any Atari 2600 game ever, considering that when these systems were popular having free access to every cartridge would have been a dream come true.
I know that this question is a bit old, but I would like to add something to the discussion. Most of the answers here center around emulators interpreting the machine instructions of the systems they emulate.
However, there is a very well-known exception to this called "UltraHLE" (WIKIpedia article). UltraHLE, one of the most famous emulators ever created, emulated commercial Nintendo 64 games (with decent performance on home computers) at a time when it was widely considered impossible to do so. As a matter of fact, Nintendo was still producing new titles for the Nintendo 64 when UltraHLE was created!
For the first time, I saw articles about emulators in print magazines where before, I had only seen them discussed on the web.
The concept of UltraHLE was to make possible the impossible by emulating C library calls instead of machine level calls.
Something worth taking a look at is Imran Nazar's attempt at writing a Gameboy emulator in JavaScript.
Having created my own emulator of the BBC Microcomputer of the 80s (type VBeeb into Google), there are a number of things to know.
You're not emulating the real thing as such, that would be a replica. Instead, you're emulating State. A good example is a calculator, the real thing has buttons, screen, case etc. But to emulate a calculator you only need to emulate whether buttons are up or down, which segments of LCD are on, etc. Basically, a set of numbers representing all the possible combinations of things that can change in a calculator.
You only need the interface of the emulator to appear and behave like the real thing. The more convincing this is the closer the emulation is. What goes on behind the scenes can be anything you like. But, for ease of writing an emulator, there is a mental mapping that happens between the real system, i.e. chips, displays, keyboards, circuit boards, and the abstract computer code.
To emulate a computer system, it's easiest to break it up into smaller chunks and emulate those chunks individually. Then string the whole lot together for the finished product. Much like a set of black boxes with inputs and outputs, which lends itself beautifully to object oriented programming. You can further subdivide these chunks to make life easier.
Practically speaking, you're generally looking to write for speed and fidelity of emulation. This is because software on the target system will (may) run more slowly than the original hardware on the source system. That may constrain the choice of programming language, compilers, target system etc.
Further to that you have to circumscribe what you're prepared to emulate, for example its not necessary to emulate the voltage state of transistors in a microprocessor, but its probably necessary to emulate the state of the register set of the microprocessor.
Generally speaking the smaller the level of detail of emulation, the more fidelity you'll get to the original system.
Finally, information for older systems may be incomplete or non-existent. So getting hold of original equipment is essential, or at least prising apart another good emulator that someone else has written!
Yes, you have to interpret the whole binary machine code mess "by hand". Not only that, most of the time you also have to simulate some exotic hardware that doesn't have an equivalent on the target machine.
The simple approach is to interpret the instructions one-by-one. That works well, but it's slow. A faster approach is recompilation - translating the source machine code to target machine code. This is more complicated, as most instructions will not map one-to-one. Instead you will have to make elaborate work-arounds that involve additional code. But in the end it's much faster. Most modern emulators do this.
When you develop an emulator you are interpreting the processor assembly that the system is working on (Z80, 8080, PS CPU, etc.).
You also need to emulate all peripherals that the system has (video output, controller).
You should start writing emulators for the simpe systems like the good old Game Boy (that use a Z80 processor, am I not not mistaking) OR for C64.
Emulator are very hard to create since there are many hacks (as in unusual
effects), timing issues, etc that you need to simulate.
For an example of this, see http://queue.acm.org/detail.cfm?id=1755886.
That will also show you why you ‘need’ a multi-GHz CPU for emulating a 1MHz one.
Also check out Darek Mihocka's Emulators.com for great advice on instruction-level optimization for JITs, and many other goodies on building efficient emulators.
I've never done anything so fancy as to emulate a game console but I did take a course once where the assignment was to write an emulator for the machine described in Andrew Tanenbaums Structured Computer Organization. That was fun an gave me a lot of aha moments. You might want to pick that book up before diving in to writing a real emulator.
Advice on emulating a real system or your own thing?
I can say that emulators work by emulating the ENTIRE hardware. Maybe not down to the circuit (as moving bits around like the HW would do. Moving the byte is the end result so copying the byte is fine). Emulator are very hard to create since there are many hacks (as in unusual effects), timing issues, etc that you need to simulate. If one (input) piece is wrong the entire system can do down or at best have a bug/glitch.
The Shared Source Device Emulator contains buildable source code to a PocketPC/Smartphone emulator (Requires Visual Studio, runs on Windows). I worked on V1 and V2 of the binary release.
It tackles many emulation issues:
- efficient address translation from guest virtual to guest physical to host virtual
- JIT compilation of guest code
- simulation of peripheral devices such as network adapters, touchscreen and audio
- UI integration, for host keyboard and mouse
- save/restore of state, for simulation of resume from low-power mode
To add the answer provided by #Cody Brocious
In the context of virtualization where you are emulating a new system(CPU , I/O etc ) to a virtual machine we can see the following categories of emulators.
Interpretation: bochs is an example of interpreter , it is a x86 PC emulator,it takes each instruction from guest system translates it in another set of instruction( of the host ISA) to produce the intended effect.Yes it is very slow , it doesn't cache anything so every instruction goes through the same cycle.
Dynamic emalator: Qemu is a dynamic emulator. It does on the fly translation of guest instruction also caches results.The best part is that executes as many instructions as possible directly on the host system so that emulation is faster. Also as mentioned by Cody, it divides the code into blocks ( 1 single flow of execution).
Static emulator: As far I know there are no static emulator that can be helpful in virtualization.
How I would start emulation.
1.Get books based around low level programming, you'll need it for the "pretend" operating system of the Nintendo...game boy...
2.Get books on emulation specifically, and maybe os development. (you won't be making an os, but the closest to it.
3.look at some open source emulators, especially ones of the system you want to make an emulator for.
4.copy snippets of the more complex code into your IDE/compliler. This will save you writing out long code. This is what I do for os development, use a district of linux
I wrote an article about emulating the Chip-8 system in JavaScript.
It's a great place to start as the system isn't very complicated, but you still learn how opcodes, the stack, registers, etc work.
I will be writing a longer guide soon for the NES.

Are there specialities within the embedded fields

I've started learning embedded and its 2 main languages (c and c++). But I'm starting to realize that despite the simple learning requirements, embedded is a whole world in and of itself. And once you deal with real projects, you start to realize that you need to learn more "stuff" specific to the hardware used in the device you're working on. This is an issue that rarely came up with the software-only projects I currently work on.
Is it possible to fragment this field into sub-fields? I'm thinking that those with experience in the field may have noticed that some types of projects are different from other types, which has led them to maybe maybe come up with their own categories. For example, when you run into a project, you may think to yourself that it's "outside your field"? Does that happen to you? and if so, what would you call your sub-field or what other sub-fields have you encountered?
Here are a few sub-specialities I can think of:
Assembly Language Specialist
Yep. You need to know C and C++. But some people also specialize in assembly. These are the experts that are called up to port a RTOS to a new chip, or to squeeze every drop of performance from a highly constrained embedded system (usually to save $$ per unit).
This person probably is not needed that much these days... but.. yet still critical from time to time.
Device Driver Specialist
comfortable living between a real OS or RTOS and a piece of hardware. This person is usually comfortable with lab tools like o-scopes or logic analyzers, thinking in "hex", and understanding the critical nature of timing with HW. This person reads device data sheets for fun at night, and gets excited about creating the perfect porting driver for some new device.
DSP Specialist
Digital Signal Processing seems to be its own sub-specialty of embedded, although perhaps a software engineer may not know the exact algorithm details, and may only be implementing what a system or electrical engineer requires. However, understanding sampling rate theory, FFTs, and some foundational elements from "DSP" is handy and maybe required. And you still generally must be very aware of timing and your target hardware's restrictions ( sampling rate, noise, bits per sample, etc).
Control Theory Specialist
Perhaps the same issue as with DSP: a system or electrical engineer may provide the detailed specs. But, then again, familiarity with various motors, sensors, and other controllers handled by a microcontroller, would be great. Throw in a Bode plot, some Laplace transforms or two and some higher math skills... that couldn't hurt too much!
Networking Specialist
basically the same as the PC world "networking". Many embedded devices are adding networking connectivity features these days. TCP/IP sockets, http, etc good to know and understand how to use in a resource constrained device. Throw in USB and Bluetooth for good measure.
UI Specialist
more and more embedded devices include 2D graphics, and now more include 3D graphics thanks to the influence of iPhones, etc. Even though these are still "fat" systems by other embedded device standards, they are still limited. Just read a bit in the Android Development Guide, and you will realize that you still must consider responsiveness, performance, etc, even in a high end cell phone.
http://developer.android.com/guide/practices/design/performance.html
And then, of course, every industry is a specialization unto itself. Consumer Electronics, Military, Avionics, Robotics, Industrial Machines, Medical Devices, etc...
Have fun and good luck!
Yes, there certainly are several sub-fields. I don't think I can list them all from the top of my head, but the way I see it, there are at least 3 big sub-divisions, and from there, they are further sub-divided. There are micro-controllers, micro-processors and sand-boxed/VMs. For example, using a 16bit micro-controller in a drive-by-wire would be an example of the first, a set-top-box like TiVo would be and example of the second, and iPhones and Androids are the latter.
Micro-controllers are very limited, and usually can't even be programmed in C++. Most of them either has no OS running, or, the most expensive ones, have an RTOS. Set-top-boxes and any ARM/MIPS/SuperH4/Broadcom chips are much more like a PC, in that they have a linux distribution running in them and you can find most of the same facilities as a PC, and if you can't find one, cross-compiling to it is usually simple. The sand-boxed guys, are well, sand-boxed; so it is exactly what it sonds, usually the SDK isolates you from the hardware and you don't really get the 'full embedded experience'.
Sure, for example, there are many operating systems in use in the embedded world. Working with embedded Linux is very different than working with a bare micro controller.
"Learning embedded" sounds impossible to me. I do some work on headless linux computers controlling large machinery - which can be referred to as embedded (but it's not much different to programming any other computer, bar a few hardware interfaces). That's totally different to a phone, and totally different to an air conditioner or home automation system.
Control systems and mobile devices would be two categories of 'embedded' - but I'm sure there are plenty more.
I work on embedded linux on Mobile devices, and its whole lot different from a full flegded Ubuntu image where i write my code and cross compile it for the mobile device.
First of all a embedded system is stripped down to meet the bare requirements of the device, very much unlike the traditional desktop operating system where you can have as many functionalities/libraries etc.
The memory constraints also are a major part of a embedded system. Hence all the programs/applications have to be written inorder to fit into the architecture. This may not be much of a concern on a traditional system.
Basically my point is to emphasize that working on embedded cannot be summed up into a few lines as each have a different purpose.
However programming keeping in view the overall architecture may help you gain confidence if you can fit into a project or not.
PS: I may not be good in categorizing which is what the question expects, however this is my bit on embedded systems.
Lots of good answers already to this question. I think you need to decide what the word embedded software means to you and/or what you want it to mean. Maybe your definition isnt really embedded. My definition means no operating system. And that will probably upset many embedded software engineers, but the experienced ones like ones that have already answered will certainly understand our variations in definition and why. I think they would call me a microcontroller specialist, and that is certainly true, but I spend most of my time on full speed processors with gobs of memory and rom and I/O, networking, etc. I am the guy that brings the hardware up the first time, flushes out board and chip bugs, then hands it off to what most would call the embedded software engineers. I am an electrical engineer by training and software engineer by trade, so I straddle the line.
It is very possible, and not uncommon that you could remain in the C/C++ embedded world, never have to read a datasheet or schematic, all you would do is call api's that someone else has created. There is a large and increasingly larger market for that as what used to be (my definition of) true embedded, or rtos based embedded (which is often api calls and not the full experience) to this linux embedded thing that has exploded. There is nothing wrong with it, it is fairly close to the experience of developing code for a desktop, but you have to try just a little harder for reliable code since it may be flash/rom based and they may not want to have weekly/monthly updates to units in the field. Ideally never update, but that is also becoming more rare.
The rtos/embedded linux api based embedded is and can still be a different experience than what I call application programming. You may still want or need to read a datasheet or schematic, you may still need to know assembler for the target platform.
I like all of the answers thus far to this question, I guess we are struggling to understand what you are really asking or what you are really looking for in life, add to that what we enjoy about our choices and you get this mix of answers.
I see a few groups, there is certainly the good old true embedded microcontroller stuff, but even that is turning into libraries and apis instead of on the metal, look at the arduino community and stellaris and a bunch of others. I spend a lot of my time in board bring up and test, you have to know a fair amount about the whole system hardware, registers, schematic, etc. Have to know enough assembler both to boot the thing out of reset as well as debug things by staring at disassembly dumps and looking for signs of life in the I/O or on memory busses, etc. If lucky you will get to work on chip design as well and get to watch your instructions execute in simulation. The next group is bootloader/operating system. The hardware working well enough at this point, chip boots, memory appears to work, rom is there. This team writes the production boot code and gets the product from power up into the embedded system, rtos, linux, vxworks, bsd, whatever. this is a talent in and of itself, toolchain, root file system, etc. The next group is the masses, the software engineers that write the applications for that operating system, now some will be reading datasheets, schematics, etc, writing device drivers or apis for others to use, and the highest level may be someone that is all application level programming, the api and sdk calls, some of which may be company developed some may be purchased or other.
Bottom line: Absolutly, there are specialties within embedded. Are you going to know everything? NO, maybe 20 years ago, likely 40 years ago, not today the field is too big and wide. What is the best things you can do for yourself in this field? Learn assembler for a few different instruction sets. The popular ones, arm definitely, thumb version of arm, maybe mips or powerpc or others. If you lean toward microcontrollers, learn (arm, thumb,) avr, pic (blah), msp430, maybe 8051. Read some data sheets, microcontrollers can teach you this even if that is not the field you want, tons of sub $50 development/eval boards (sparkfun.com for example) that give data sheets, simple schematics, assembler, C, etc. If you are a software guy, learn to speak hardware guy, software and hardware folks do not speak the same language, if you can avoid picking sides and stay neutral and speak both languages you will help yourself, your career and whomever you work for and with. Despite any personal views you may have about endians or bit or byte numbering, you are likely to have to deal with some screwy things, and speak to customers/coworkers that can only deal with octal (yeah really) or only deal with the msbit of anything being zero. I recommend looking into verilog and maybe vhdl. At least in a readable sense, not necessarily create it from scratch. If you can already program and know C it is very readable. Depending on the job and the coworkers the verilog and the schematic may be your only documentation you use to write your software. If you cant do it they may replace you with someone who can (rather than get the hardware folks to document their stuff).

Machine ID for Mac OS?

I need to calculate a machine id for computers running MacOS, but I don't know where to retrieve the informations - stuff like HDD serial numbers etc. The main requirement for my particular application is that the user mustn't be able to spoof it. Before you start laughing, I know that's far fetched, but at the very least, the spoofing method must require a reboot.
The best solution would be one in C/C++, but I'll take Objective-C if there's no other way. The über-best solution would not need root privileges.
Any ideas? Thanks.
Erik's suggestion of system_profiler (and its underlying, but undocumented SystemProfiler.framework) is your best hope. Your underlying requirement is not possible, and any solution without hardware support will be pretty quickly hackable. But you can build a reasonable level of obfuscation using system_profiler and/or SystemProfiler.framework.
I'm not sure your actual requirements here, but these posts may be useful:
Store an encryption key in Keychain while application installation process (this was related to network authentication, which sounds like your issue)
Obfuscating Cocoa (this was more around copy-protection, which may not be your issue)
I'll repeat here what I said in the first posting: It is not possible, period, not possible, to securely ensure that only your client can talk to your server. If that is your underlying requirement, it is not a solvable problem. I will expand that by saying it's not possible to construct your program such that people can't take out any check you put in, so if the goal is licensing, that also is not a completely solvable problem. The second post above discusses how to think about that problem, though, from a business rather than engineering point of view.
EDIT: Regarding your request to require a reboot, remember that Mac OS X has kernel extensions. By loading a kernel extension, it is always possible to modify how the system sees itself at runtime without a reboot. In principle, this would be a Mac rootkit, which is not fundamentally any more complex than a Linux rootkit. You need to carefully consider who your attacker is, but if your attackers include Mac kernel hackers (which is not an insignificant group), then even a reboot requirement is not plausible. This isn't to say that you can't make spoofing annoying for the majority of users. It's just always possible by a reasonably competent attacker. This is true on all modern OSes; there's nothing special here about Mac.
The tool /usr/sbin/system_profiler can provide you with a list of serial numbers for various hardware components. You might consider using those values as text to generate an md5 hash or something similar.
How about getting the MAC ID of a network card attached to a computer using ifconfig?