TRACE32 PRACTICE script: how to use DATA.SET

What does the following command mean? What does EA mean?
&HEAD=0x146BF94C
DATA.SET EA:&HEAD+0x4 %LONG DATA.LONG(EA:&HEAD+0x4)&0xFFFFFF

The command Data.Set writes raw data to your target's memory at the given address.
The command follows this schema:
  Data.Set <address> <access width> <data>
where
<address> has the form <access class>:<address offset>, where the "access class" is a group of letters specifying which memory is accessed and in which way.
<access width> is %Byte for 8-bit, %Word for 16-bit, %Long for 32-bit or %Quad for 64-bit
<data> is the data you actually want to write.
For the "access class" check the chapter Access Classes in your Processor Architecture Manual (menu → Help → Processor Architecture Manual). The types of available access classes vary from the used processor architecture. (e.g. different classes for ARM and PowerPC)
The "access class" EA: means:
Access the memory while the CPU is running (E).
Access the memory via absolute (physical) memory addresses (A) bypassing the MMU.
Finally, the data (<data>) you want to write to memory can be a fixed value (e.g. 0x42) or calculated via an expression (e.g. 0x40+0x02). Such an expression can also use so-called "PRACTICE functions". The function used in your example is Data.Long(<address>), which reads a 32-bit value from the given address.
(Note: Expressions may not contain blanks.)
And then you have a macro &HEAD, which contains the string "0x146BF94C". This means that any &HEAD appearing in a later command gets replaced by the content of the macro, similar to the C preprocessor.
Thus, your commands
&HEAD=0x146BF94C
DATA.SET EA:&HEAD+0x4 %LONG DATA.LONG(EA:&HEAD+0x4)&0xFFFFFF
have the same meaning as
Data.Set EA:0x146BF950 %LONG Data.Long(EA:0x146BF950)&0x00FFFFFF
and that is actually a read-modify-write of the 32-bit value at address EA:0x146BF950: the value is read from memory, its upper 8 bits are cleared (e.g. 0xAB123456 becomes 0x00123456), and then the result is written back to the same memory location.
It has (almost) the same meaning as the C expression
*((volatile uint32_t*) 0x146BF950) &= 0x00FFFFFF;
It is only "almost" the same because the C expression would not bypass the MMU, while your Data.Set command does, thanks to the "A" in the access class of the addresses.

Related

Trace32 CMM script: understanding the Data.Set command

What does the following command mean?
sYmbol.NEW _VectorTable 0x34000000
sYmbol.NEW _ResetVector 0x34000020
sYmbol.NEW _InitialSP 0x34000100
Data.Set EAXI:_VectorTable %Long _InitialSP _ResetVector+1
The command Data.Set writes data values to your target's memory. The syntax of the command is
Data.Set <address>|<address_range> [<access_width>] {value(s)}
The <address> to which the data is written has the form:
<access_class>:<address_offset>
A full address, just the address offset, or the values you want to write can also be represented by debug symbols. These symbols are usually the variables, function names, and labels defined in your target application and are declared to the debugger by loading the target application's ELF file.
In this case, however, the symbols are declared manually in the debugger via the command sYmbol.NEW.
Anyway: by replacing the symbols with their values in the command Data.Set EAXI:_VectorTable %Long _InitialSP _ResetVector+1 we get the command
Data.Set EAXI:0x34000000 %Long 0x34000100 0x34000021
So what does this command actually do?
The access-width specifier %Long indicates that 32-bit values should be written. As a result, the address increments automatically by 4 for each specified data value.
The value 0x34000100 is written to address EAXI:0x34000000
The value 0x34000021 is written to address EAXI:0x34000004
The <access_class> "EAXI" indicates that the debugger should access the address 0x34000000 directly via the AXI bus (Advanced eXtensible Interface). By writing directly to the AXI bus, you bypass your target's CPU core (bypassing any MMU, MPU or caches). The leading 'E' of the access class EAXI indicates that the write operation may also performed while the CPU core is running (or considered to be running (e.g. in Prepare mode)). The description of all possible access classes is specific to the target's core-architecture and thus, you can find the description in the debugger's "Target Architecture Manual".
And what does this exactly mean for your target and the application running on it?
Well, I don't know your chip or SoC (nor do I know your application).
But from the data I see, I guess that you are debugging a chip with an ARM architecture, probably a Cortex-M. Your chip's boot ROM seems to start at address 0x34000000, while your actual application's start-up code starts at 0x34000020 (maybe at symbol _start).
For Cortex-M cores, offset 0 of the vector table (in the boot ROM) must hold the initial value of the stack pointer, while offset 4 must hold the initial value of the program counter. In your case the program counter is initialized with 0x34000021. Why 0x34000021 and not 0x34000020? Because your start-up code is probably encoded as ARM Thumb code (Cortex-M cores can only execute Thumb code). By setting the least significant bit of the initial program counter value to 1, the core knows that it should decode Thumb instructions. (Not setting the least significant bit to 1 on a Cortex-M will cause an exception.)
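For illustration, here is a minimal sketch of the same two words expressed in C (GNU ARM toolchain assumed; the section name ".vectors" is hypothetical, and the two values are taken from the question):

#include <stdint.h>

/* First two entries of a Cortex-M vector table, matching the Data.Set command above. */
__attribute__((section(".vectors")))
const uint32_t vector_table[2] = {
    0x34000100u,      /* offset 0: initial stack pointer (_InitialSP) */
    0x34000020u | 1u, /* offset 4: reset vector (_ResetVector), Thumb bit set */
};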

Why inline assembly and optimization generate a problem with rbit [duplicate]

Recently I checked the instruction set of an ARM Cortex-M3 processor.
For example:
ADD <Rd>, <Rn>, <Rm>
What do those abbreviations mean, exactly?
I guess they denote different kinds of addressing, like direct or relative addressing.
But what exactly?
Thanks!
Operands of the form <Rx> refer to general-purpose registers, i.e. r0-r15 (or accepted aliases like sp, pc, etc.).
I'm not sure if it's ever called out specifically anywhere but there is a general pattern of "d" meaning destination, "t" meaning target, "n" meaning the first operand or base register, "m" meaning the second operand, and occasionally "a" meaning an accumulator. Hence why you might spot designations like <Rdn> (in the destructive two-operand instructions), or <Rt>, <Rt2> (where a 64-bit value is held across a pair of GP registers). This is consistent across the other types of register too, e.g. VADD.F32 <Sd>, <Sn>, <Sm>.
They are just there to denote registers, with the lowercase letter separating them for the sake of explanation. Rd is the destination, but Rn, Rm, etc. are just any registers you can use. It's the only way to tell which is which when explaining something like "Rd equals Rn bitwise-ANDed with Rm", since you can't use numbers.
They could be Rx, Ry etc, or Ra, Rb... as well.
Basics:
Rd is the destination, Rn and Rm are sources. They're all general-purpose integer registers; FP would use Sd / Sn / Sm or Dd / Dn / Dm for single or double.
ARM syntax puts the destination(s) on the left, before read-only source operands.
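As a sketch of what that looks like in practice (GCC inline assembly for an ARM target; the function name is illustrative), %0 plays the role of Rd, %1 of Rn and %2 of Rm:

#include <stdint.h>

/* ADD <Rd>, <Rn>, <Rm>: destination on the left, sources on the right. */
uint32_t add_demo(uint32_t n, uint32_t m) {
    uint32_t d;
    __asm__("add %0, %1, %2"   /* Rd = Rn + Rm */
            : "=r"(d)          /* Rd: output register */
            : "r"(n), "r"(m)); /* Rn, Rm: source registers */
    return d;
}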
See Notlikethat's answer for more. Some small additions to that:
t: in this post, an ARM employee comments that "t" might mean "transfer" instead of "target".
Since t generally appears in memory instructions like LDR and STR, I understand that it means "transfer to/from memory", e.g. in the ARMv8 ARM (issue F.a):
LDR <Xt>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}]
STR <Xt>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}]
where the t is the source/destination of memory reads and writes.
This is also further suggested in the register descriptions of the STR and LDXR instructions:
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field.
The description of the LDR instruction, however, says "loaded":
<Xt> Is the 64-bit name of the general-purpose register to be loaded, encoded in the "Rt" field.
This terminology is especially meaningful because ARM is RISC-y, so there are relatively few instructions that do memory I/O, and they tend to do just that (unlike x86, where instructions commonly add and store to memory in one step).
t1 and t2: these are used for memory instructions that load/store two values at once, e.g. the ARMv8 LDP/STP:
LDP <Xt1>, <Xt2>, [<Xn|SP>], #<imm>
STP <Xt1>, <Xt2>, [<Xn|SP>, #<imm>]!
n and m are just commonly used integer variable/index names in mathematics.
s:
the STXR instruction stores to memory from Xt (like STR), but it also returns a second value (whether the write succeeded) in Ws:
STXR <Ws>, <Xt>, [<Xn|SP>{,#0}]
so presumably s was chosen because it comes before t like m comes before n.
Some ARMv7/AArch32 instructions can take a shift amount in a register, and Rs is the name given to that register, e.g.:
ORR{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <shift> <Rs>
I couldn't easily find AArch64 ones.
If it were documented, "Chapter C2 About the A64 Instruction Descriptions" would have been a good place for the information, but it's not there.

Data Alignment: Reason for restriction on memory address being multiple of data type size

I understand the general concept of data alignment, but what I do not understand is the restriction on the memory address value, which is forced to be a multiple of the size of the underlying data type.
This answer explains the data alignment well.
Quote:
Let's look at a memory map:
+----+
|0000|
|0001|
+----+
|0002|
|0003|
+----+
|0004|
|0005|
+----+
| .. |
At each address there is a byte which can be accessed individually. But words can only be fetched at even addresses. So if we read a word at 0000, we read the bytes at 0000 and 0001. But if we want to read the word at position 0001, we need two read accesses. First 0000,0001 and then 0002,0003 and we only keep 0001,0002.
Question:
Assuming it's true, why must "words can only be fetched at even addresses" be true? Can't the memory/stack pointer point to 0001 in the example and then read a word of information starting there?
We know the machine can read memory in blocks of 2 bytes with one read action (in the example, [0000, 0001] or [0002, 0003]). So if my address register points to 0001 (an odd address instead of an even one), can't I read 2 bytes from there (i.e. 0001 and 0002) directly in one read action?
The assumption behind that statement is not necessarily true. I don't want to reiterate the answer you linked to, which describes the reasons for using and strongly preferring aligned access, but there are architectures that do support unaligned memory access: ARM, for example (check out this SO answer).
But your question, I think, really comes down to the hardware architecture, specifically the data bus design and the accompanying instruction set that engineers at various silicon manufacturers have designed.
Some Cortex-M cores explicitly allow you to configure the CPU to trigger an exception on unaligned access via a UsageFault register setting, which means you can "utilize" unaligned memory access in rare use-cases.
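As a minimal sketch of that configuration for a Cortex-M3/M4 (the register address 0xE000ED14 and bit position 3 come from the ARMv7-M documentation; the macro names are my own):

#include <stdint.h>

#define SCB_CCR         (*(volatile uint32_t *)0xE000ED14u) /* Configuration and Control Register */
#define CCR_UNALIGN_TRP (1u << 3)                           /* trap on unaligned accesses */

void enable_unaligned_trap(void) {
    SCB_CCR |= CCR_UNALIGN_TRP; /* unaligned accesses now raise a UsageFault */
}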
Usually a processor's internal addresses point to whole words. This is because you don't want your (simple) processor to be able to address a word at an arbitrary byte (or, even worse, bit) boundary, because:
You waste addressable memory: assuming the biggest address your processor can process is the maximum value of its word size, you can multiply that by the word size to calculate the amount of storage you can address (each unique address points to a full word). For example, a 16-bit machine that addresses words can reach 2^16 × 2 bytes = 128 KiB, versus 64 KiB if each address named a byte. The "address" I'm talking about here does not necessarily look like the address stored in a pointer of a higher-level programming language: such a pointer addresses individual bytes, and the compiler or interpreter translates it into the corresponding assembly instructions (discarding unwanted bytes from the loaded word).
A word loaded from memory could be anything: a value, or the next instruction of the program running on your processor. The previously loaded word often indicates what the following word is used for: another instruction (e.g. an arithmetic operation, or a load or store) possibly followed by operands (values or addresses). Put simply, being able to address unaligned words would complicate a processor considerably.
Assuming it's true, why must "words can only be fetched at even addresses" be true?
The memory actually stores words. The processor actually addresses the memory in words, and fetches a word at a time.
When you fetch a byte, it actually fetches a word, then ignores either the first half or the second half.
On 32-bit processors, fetching a byte fetches a 32-bit word and ignores three quarters of it; fetching a 16-bit word on a 32-bit processor ignores half the word.
If the 16-bit word you want to fetch (on a 16-bit processor) isn't aligned, then the processor has to fetch two words, take half of each word and then re-combine them. So even on processor designs where it works, it's often slower.
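A sketch of that recombination in C (assuming a little-endian 16-bit machine where mem is word-addressed, and we want the unaligned word spanning byte addresses 0001 and 0002):

#include <stdint.h>

uint16_t fetch_unaligned_word(const uint16_t *mem) {
    uint16_t w0 = mem[0]; /* aligned fetch: bytes 0000-0001 */
    uint16_t w1 = mem[1]; /* aligned fetch: bytes 0002-0003 */
    /* keep byte 0001 (high half of w0) and byte 0002 (low half of w1) */
    return (uint16_t)((w0 >> 8) | (w1 << 8));
}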
A lot of processor designs don't bother - either they just won't allow it, or they force the operating system to handle it (which is very slow).
(Not all types of processors work this way - e.g. 8-bit processors usually fetch a byte at a time)
Can't the memory/stack pointer point to 0001 in the example and then read a word of information starting there?
If the processor supports it, yes.

(Lower level of C++) When using "cout" on a piece of data, where does it go before being displayed on screen?

Specifically talking about the C++ part of the code here: [LINK]
(intel x86, .cpp & .asm hybrid program.)
From dealing with char/string pointers in .asm I know it uses the dl/dx registers for their storage before display (in the case of functions 2h and 9h).
How is it in the case when the data (specifically, a floating-point value) gets sent to the C++ portion of the hybrid and is then treated with cout?
Where is that value stored before the cout converts it into a string to be displayed? (Is it a register, or a stack, or something else?)
The lower-level parts of C++ are platform dependent. For example, consider reading a character from the keyboard: some platforms don't have keyboards; some platforms send messages when a character arrives; others wait (polling the input port).
Let's talk one level down from the high level language.
For cin, the underlying level reads characters from the input buffer. If the buffer is empty, the underlying layer reads characters from the standard input and stores them into a buffer until an end-of-line character is detected.
Note: there are methods to bypass this layer, still using C++.
In many OS based platforms, the C++ libraries eventually call an OS function to fetch a single character. In Linux, the OS delegates this request to a driver. The driver has the responsibility of reading the character from the hardware and returning it. The driver is the piece of code that gets the character from the keyboard.
There are exceptions to this path, for example piping. With piping, the OS redirects the requests from standard input to a file or device, depending on command line.
Where is that value stored before the cout converts it into a string to be displayed? (Is it a register, or a stack, or something else?)
The compiler calls a function that converts the internal representation of a floating point variable into a textual representation. This textual representation is sent to the underlying cout function, character by character; or as a pointer to a string. The textual representation can reside almost anywhere: stack, heap, cache, etc. It really doesn't make a difference. Most processor registers are too small to contain all the characters in a textual representation of a floating point number.
The floating-point value may be stored in a register, on the stack, or elsewhere before being passed to the conversion function; it depends on the compiler's optimization level and the conversion function's API. The compiler will try to use the most efficient storage.
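A sketch of the idea in C++ (not the actual implementation of any particular standard library): the textual form ends up in an ordinary buffer, here on the stack, before it is handed to the stream.

#include <cstdio>
#include <iostream>

void print_double(double v) {
    char buf[32];                            /* stack storage for the textual form */
    std::snprintf(buf, sizeof buf, "%g", v); /* convert the binary float to text */
    std::cout << buf << '\n';                /* characters go into the stream's buffer */
}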

Binary: How the Processor Distinguishes Between Two Variable Types of the Same Byte Size

I'm trying to figure out how the computer distinguishes between two variable types that have the same byte size.
If I have a variable that is one byte in size, how is the computer able to tell that it is a character rather than a Boolean? Or a character rather than half of a short integer?
The processor doesn't know. The compiler does, and generates the appropriate instructions for the processor to execute to manipulate bytes in memory in the appropriate manner, but to the processor itself a byte of data is a byte of data and it could be anything.
The language gives meaning to these things, but it's an abstraction the processor isn't really aware of.
The computer is not able to do that. The compiler is. You use the char or bool keyword to declare a variable and the compiler produces code that makes the computer treat the memory occupied by that variable in a way that makes sense for that particular type.
A 32-bit integer, for example, takes up 4 bytes in memory. To increment it, the CPU has an instruction that says "increment the 32-bit integer at this address". That's what the compiler produces, and the CPU blindly executes it. It doesn't care whether the address is correct or what binary data is located there.
The size of the instruction for incrementing the variable is another matter. It may very well be another 4 or so bytes, but instructions (code) are stored separately from data. Many instructions in a program may deal with the same memory location. The size of the instructions cannot be formally specified beforehand, because optimizations may change the number of instructions used for a given operation. The only way to tell is to compile your program and look at the generated assembly code (the instructions).
Also, take a look at unions in C. They let you use the same memory location for different data types. The compiler lets you do that and produces code for it but you have to know what you're doing.
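A small illustration in standard C++ (using memcpy rather than a union, since reading the inactive member of a union is not well-defined in C++): the same four bytes, treated first as an integer, then as a float.

#include <cstdint>
#include <cstring>
#include <iostream>

int main() {
    std::uint32_t bits = 0x42280000u; /* four bytes of raw data */
    float f;
    std::memcpy(&f, &bits, sizeof f); /* reinterpret them as an IEEE-754 float */
    std::cout << f << '\n';           /* prints 42: same bytes, different meaning */
    return 0;
}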
Because you specify the type. C++ is a strongly typed language. You can't write $x = 10. :)
It knows
char c = 0;
is a char because of... well, the char keyword.
The computer only sees 1 and 0. You are in command of what the variable contains.
You can also cast that data into whatever you want.
char foo = 'a';
if ((bool)foo) // true: 'a' is non-zero
{
int sumA = (unsigned char)foo + (unsigned char)foo;
// sumA == (97 + 97)
}
Also look into casting to view the same memory location as different data types. This can be done on anything from a single char to entire structs.
In general, it can't. Look at the restrictions on dynamic_cast<>, which tries to do exactly that. dynamic_cast only works in the special case of objects derived from polymorphic base classes, because such objects (and only those) carry extra type information. Chars and ints do not carry this information, so you can't use dynamic_cast on them.
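A quick illustration of that restriction (hypothetical class names):

#include <iostream>

struct Base { virtual ~Base() = default; }; /* polymorphic: carries run-time type info */
struct Derived : Base {};

int main() {
    Base *b = new Derived;
    if (dynamic_cast<Derived *>(b)) /* allowed: Base is polymorphic */
        std::cout << "b really points to a Derived\n";
    delete b;
    /* dynamic_cast on an int* would not even compile:
       plain ints carry no run-time type information. */
    return 0;
}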