In the Gforth OpenGL tutorial I found a code snippet for drawing a triangle on the graphics screen in Forth:
: DrawGLScene
GL_COLOR_BUFFER_BIT GL_DEPTH_BUFFER_BIT OR gl-clear
gl-load-identity
-1.5e 0e -6e gl-translate-f
GL_TRIANGLES gl-begin
0e 2e 0e gl-vertex-3f
-1e -1e 0e gl-vertex-3f
1e -1e 0e gl-vertex-3f
gl-end
sdl-gl-swap-buffers
fps-frames 1+ to fps-frames
Display-FPS
TRUE
;
If I change one of the coordinates, for example from "2e" to "1e", the shape of the resulting object changes. But how can I draw a single line instead of a triangle? Is this possible with OpenGL and Gforth too?
The code snippet you're showing uses the old-and-busted Fixed Function Pipeline (FFP). I know Forth, but I don't know the OpenGL bindings, so let's stick with the FFP for the moment. Try this, which draws two lines:
: DrawGLScene
GL_COLOR_BUFFER_BIT GL_DEPTH_BUFFER_BIT OR gl-clear
gl-load-identity
-1.5e 0e -6e gl-translate-f
GL_LINES gl-begin
0e 2e 0e gl-vertex-3f
-1e -1e 0e gl-vertex-3f
1e 2e 0e gl-vertex-3f
0e -1e 0e gl-vertex-3f
gl-end
sdl-gl-swap-buffers
fps-frames 1+ to fps-frames
Display-FPS
TRUE
;
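With GL_LINES, OpenGL treats each consecutive pair of vertices between gl-begin and gl-end as one independent segment, so two vertices give a single line and four give two. The pairing can be sketched in Python. The helper name gl_lines_segments is hypothetical and is not part of OpenGL or the Gforth bindings; it only mimics the grouping:

```python
# Hypothetical helper, not an OpenGL or Gforth call: it only mimics how
# GL_LINES groups the vertices submitted between glBegin and glEnd.
def gl_lines_segments(vertices):
    """Pair vertices as (v0, v1), (v2, v3), ...; a trailing odd vertex is ignored."""
    usable = len(vertices) - len(vertices) % 2
    return [(vertices[i], vertices[i + 1]) for i in range(0, usable, 2)]

# The four vertices from the answer above.
vertices = [
    (0.0, 2.0, 0.0), (-1.0, -1.0, 0.0),
    (1.0, 2.0, 0.0), (0.0, -1.0, 0.0),
]

segments = gl_lines_segments(vertices)
for start, end in segments:
    print(start, "->", end)

# Keeping only the first two vertices yields exactly one line.
single = gl_lines_segments(vertices[:2])
```

For a single line in the Forth word, keep only the first two gl-vertex-3f calls.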
I need to use a regex to match part of a hexadecimal string, but that part is random. Let me try to explain:
I have this hex data:
70 75 62 71 00 7e 00 01 4c 00 06 72 61 6e 64 6f 6d 74 00 1c 4c 6a 2f 73 2f 6e 64 6f 6d 3b 78 70 77 25 00 00 00 20 f2 90 c2 91 c4 c4 ca 91 c0 c0 ca 91 94 cb c5 97 90 c5 90 c2 90 96 c7 ca 91 91 93 94 c6 c5 c6 cb c0 78
I need to match only the f2 in that case. But that is not always the case. Each data will be different. The only thing that is always the same is the '00 00 00' part and the '78' at the end. All the rest is random.
I managed to make the following regex:
/(?=00 00 00).+?(?=78)/
The output is:
00 00 00 20 f2 90 c2 91 c4 c4 ca 91 c0 c0 ca 91 94 cb c5 97 90 c5 90 c2 90 96 c7 ca 91 91 93 94 c6 c5 c6 cb c0
But I don't know how to build a regex that takes only the 'f2' (reminder: it is not always going to be f2).
Any thoughts?
Given the explanation in this comment, the regex that you need is:
(?<=00 00 00 [0-9a-f]{2} )[0-9a-f]{2}
Providing the first input string from the question, this regex matches f2 (no spaces around it).
How it works:
(?<=            # start of a positive lookbehind
  00 00 00      # match the exact string "00 00 00 " (including the trailing space)
  [0-9a-f]      # match one hex digit (lowercase only)
  {2}           # match the previous item twice (i.e. two hex digits)
                # there is a space before ")"
)               # end of the lookbehind
[0-9a-f]{2}     # match two hex digits
The positive lookbehind works like a non-capturing group, but it is not part of the match. It says that the matching part ([0-9a-f]{2}) matches only if it is preceded by a match of the lookbehind expression.
The matching part of the expression is [0-9a-f]{2} (i.e. two hex digits).
You need to add the i flag, or whatever flag your regex engine uses for case-insensitive matching, so that the a-f part of the regex also matches A-F. If you cannot (or do not want to) provide this flag, you can write [0-9A-Fa-f] everywhere instead.
If your regex engine does not support lookbehind you can get the same result using capturing groups:
00 00 00 [0-9a-f]{2} ([0-9a-f]{2})
Applied on the same input, this regex matches 00 00 00 20 f2 and its first (and only) capturing group matches f2.
Update
If it is important that the input string contains 78 somewhere after the matching part, then add (?=(?: [0-9a-f]{2})* 78) to the first regex:
(?<=00 00 00 [0-9a-f]{2} )[0-9a-f]{2}(?=(?: [0-9a-f]{2})* 78)
(?= introduces a positive lookahead. It behaves like a lookbehind, but it must come after the matching part of the regex, and it is verified against the part of the string located after the match.
(?: starts a non-capturing group.
The [0-9a-f]{2} followed or preceded by a space in the lookahead and lookbehind ensures that the entire matching string is composed only of two-digit hex numbers separated by spaces. You can use .* instead, but that will match anything, even text that does not follow the format of two-digit hex numbers.
For the version without lookaheads or lookbehinds, add (?: [0-9a-f]{2})* 78 at the end of the regex:
00 00 00 [0-9a-f]{2} ([0-9a-f]{2})(?: [0-9a-f]{2})* 78
The regex matches the entire string starting with 00 00 00 and ending with 78 and the first capturing group matches the second number after 00 00 00 (your target).
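To try both variants, here is a minimal Python sketch. It assumes Python's re module (which accepts this lookbehind because it has a fixed width), writes the hex class as [0-9a-f] throughout, and uses re.IGNORECASE as the "ignore case" flag. The variable names are illustrative only:

```python
import re

# Input string from the question, split across lines for readability.
s = ('70 75 62 71 00 7e 00 01 4c 00 06 72 61 6e 64 6f 6d 74 00 1c 4c 6a 2f '
     '73 2f 6e 64 6f 6d 3b 78 70 77 25 00 00 00 20 f2 90 c2 91 c4 c4 ca 91 '
     'c0 c0 ca 91 94 cb c5 97 90 c5 90 c2 90 96 c7 ca 91 91 93 94 c6 c5 c6 '
     'cb c0 78')

# Variant 1: lookbehind + lookahead, the match itself is the target byte.
lookaround = re.compile(
    r'(?<=00 00 00 [0-9a-f]{2} )[0-9a-f]{2}(?=(?: [0-9a-f]{2})* 78)',
    re.IGNORECASE)

# Variant 2: no lookarounds, the target byte is in capturing group 1.
capturing = re.compile(
    r'00 00 00 [0-9a-f]{2} ([0-9a-f]{2})(?: [0-9a-f]{2})* 78',
    re.IGNORECASE)

m1 = lookaround.search(s)
m2 = capturing.search(s)
print(m1.group(0))  # the whole match is the target
print(m2.group(1))  # the target is the first group
```

Both searches return the second byte after 00 00 00; with the first variant it is the whole match, with the second it is group 1.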
Is the f2 surrounded by asterisks?
Without asterisks:
00 00 00 [a-f0-9]+ (?<hexits>[a-f0-9]+).+78
With asterisks:
\*(?<hexits>[a-f0-9]+)\*
You can use the following regex to match the second hexadecimal value after "00 00 00": /00 00 00 [0-9A-Fa-f]{2} ([0-9A-Fa-f]{2})/. The first [0-9A-Fa-f]{2} skips the byte directly after the zeros (20 here); the value you want is in the first capturing group.
Here is a demo:
import re
s = '70 75 62 71 00 7e 00 01 4c 00 06 72 61 6e 64 6f 6d 74 00 1c 4c 6a 2f 73 2f 6e 64 6f 6d 3b 78 70 77 25 00 00 00 20 f2 90 c2 91 c4 c4 ca 91 c0 c0 ca 91 94 cb c5 97 90 c5 90 c2 90 96 c7 ca 91 91 93 94 c6 c5 c6 cb c0 78'
match = re.search(r'00 00 00 [0-9A-Fa-f]{2} ([0-9A-Fa-f]{2})', s)
if match:
    print(match.group(1))
The output will be:
f2
You don't really need a regex for that. Get the offset of 3 bytes of zero in a row and take the 4th one after it:
s = '70 75 62 71 00 7e 00 01 4c 00 06 72 61 6e 64 6f 6d 74 00 1c 4c 6a 2f 73 2f 6e 64 6f 6d 3b 78 70 77 25 00 00 00 20 f2 90 c2 91 c4 c4 ca 91 c0 c0 ca 91 94 cb c5 97 90 c5 90 c2 90 96 c7 ca 91 91 93 94 c6 c5 c6 cb c0 78'
s2 = '01 02 03 00 00 00 05 06 07'
def locate(s):
    data = bytes.fromhex(s)
    offset = data.find(bytes([0,0,0]))
    return data[offset + 4]
print(f'{locate(s):02X}')
print(f'{locate(s2):02X}')
Output:
F2
06
You could also extract the "f2" string directly from the string:
offset = s.index('00 00 00')
print(s[offset + 12 : offset + 14]) # 'f2'
I have a file in UTF-16 (or UCS-2, doesn't really matter since it is UTF-16 LE as far as I know) encoding which I have downloaded from here: http://www.humancomp.org
I'd like to read the contents of that file into a std::wstring, which is my first problem: I haven't been able to read the file correctly yet. The data I read always seems to be messed up.
Secondly, I'd like to compare the read std::wstring to a const wchar_t* string literal. And here, I am experiencing my second problem: How do I specify the wchar_t content via hex values?
The file which I want to turn into a const wchar_t* string literal has the following bytes (copied out of a hex editor)
FE FF 05 31 05 65 05 81 05 65 05 70 05 6B 00 20 05 6B 05 74 00 20 05 6C 05 61 05 7E 00 20 00 3F 05 82 05 72 05 6B 05 65 00 20 05 6C 05 61 05 7E 05 61 05 80 05 61 05 80 00 2C 00 0D 00 0A 05 3F 05 75 05 61 05 65 05 62 05 7D 00 20 05 79 05 7F 05 61 05 75 05 6B 00 20 05 6F 05 61 05 7D 05 6F 05 61 05 6E 05 6B 00 20 05 74 05 70 05 63 05 6B 05 65 00 2E 00 2E 00 2E 00 0D 00 0A 05 31 05 75 05 65 05 7A 05 70 05 7D 00 20 05 6F 00 3F 05 82 05 66 05 70 05 6B 00 20 05 74 05 70 05 6F 05 65 00 20 05 6B 05 65 05 6E 00 20 00 3F 05 61 05 7E 05 61 05 7F 05 80 00 2C 00 0D 00 0A 05 31 05 75 05 65 05 7A 05 70 05 7D 00 20 05 6F 00 3F 05 82 05 66 05 70 05 6B 00 20 00 3F 05 61 05 7E 05 61 05 7F 05 61 05 6C 00 20 05 74 05 70 05 6F 05 6B 05 65 05 89
Of course, I can't initialize the string literal with that. I tried to turn it into hex values and apply a reinterpret_cast to get a const wchar_t*
reinterpret_cast<const wchar_t*>("\xFE\xFF\x05\x31\x05\x65\x05\x81\x05\x65\x05\x70\x05\x6B\x00\x20\x05\x6B\x05\x74\x00\x20\x05\x6C\x05\x61\x05\x7E\x00\x20\x00\x3F\x05\x82\x05\x72\x05\x6B\x05\x65\x00\x20\x05\x6C\x05\x61\x05\x7E\x05\x61\x05\x80\x05\x61\x05\x80\x00\x2C\x00\x0D\x00\x0A\x05\x3F\x05\x75\x05\x61\x05\x65\x05\x62\x05\x7D\x00\x20\x05\x79\x05\x7F\x05\x61\x05\x75\x05\x6B\x00\x20\x05\x6F\x05\x61\x05\x7D\x05\x6F\x05\x61\x05\x6E\x05\x6B\x00\x20\x05\x74\x05\x70\x05\x63\x05\x6B\x05\x65\x00\x2E\x00\x2E\x00\x2E\x00\x0D\x00\x0A\x05\x31\x05\x75\x05\x65\x05\x7A\x05\x70\x05\x7D\x00\x20\x05\x6F\x00\x3F\x05\x82\x05\x66\x05\x70\x05\x6B\x00\x20\x05\x74\x05\x70\x05\x6F\x05\x65\x00\x20\x05\x6B\x05\x65\x05\x6E\x00\x20\x00\x3F\x05\x61\x05\x7E\x05\x61\x05\x7F\x05\x80\x00\x2C\x00\x0D\x00\x0A\x05\x31\x05\x75\x05\x65\x05\x7A\x05\x70\x05\x7D\x00\x20\x05\x6F\x00\x3F\x05\x82\x05\x66\x05\x70\x05\x6B\x00\x20\x00\x3F\x05\x61\x05\x7E\x05\x61\x05\x7F\x05\x61\x05\x6C\x00\x20\x05\x74\x05\x70\x05\x6F\x05\x6B\x05\x65\x05\x89");
but this doesn't work. It gives me bogus data.
I've also tried to create a wchar_t string literal directly:
L"\xFEFF\x0531\x0565\x0581\x0565\x0570\x056B\x0020\x056B\x0574\x0020\x056C\x0561\x057E\x0020\x003F\x0582\x0572\x056B\x0565\x0020\x056C\x0561\x057E\x0561\x0580\x0561\x0580\x002C\x000D\x000A\x053F\x0575\x0561\x0565\x0562\x057D\x0020\x0579\x057F\x0561\x0575\x056B\x0020\x056F\x0561\x057D\x056F\x0561\x056E\x056B\x0020\x0574\x0570\x0563\x056B\x0565\x002E\x002E\x002E\x000D\x000A\x0531\x0575\x0565\x057A\x0570\x057D\x0020\x056F\x003F\x0582\x0566\x0570\x056B\x0020\x0574\x0570\x056F\x0565\x0020\x056B\x0565\x056E\x0020\x003F\x0561\x057E\x0561\x057F\x0580\x002C\x000D\x000A\x0531\x0575\x0565\x057A\x0570\x057D\x0020\x056F\x003F\x0582\x0566\x0570\x056B\x0020\x003F\x0561\x057E\x0561\x057F\x0561\x056C\x0020\x0574\x0570\x056F\x056B\x0565\x0589"
This, again, ends up in bogus data. I'm not even sure if this is the correct way of specifying wchar_t data - combining 2 bytes?
Here is the solution which was achieved with the help of the comment by Remy Lebeau:
// BOM: \xFEFF
auto utf16raw = L"\x0531\x0565\x0581\x0565\x0570\x056B\x0020\x056B\x0574\x0020\x056C\x0561\x057E\x0020\x003F\x0582\x0572\x056B";
std::wstring utf16str{utf16raw};
The BOM must be left out of the string.
The UTF-16 string, utf16str, can be converted into a UTF-8 encoded string (and vice versa) with the UTF-8 CPP library available on SourceForge, for instance.
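One possible contributor to the bogus data in the reinterpret_cast attempt is byte order: the hex dump starts with FE FF, which is the big-endian byte order mark, while wchar_t on Windows is stored little-endian. A small Python sketch (Python is used here only as a convenient way to inspect the bytes; the variable names are illustrative) shows the difference on the first few bytes of the dump:

```python
# First bytes of the hex dump from the question: BOM FE FF, then six characters.
raw = bytes.fromhex("FE FF 05 31 05 65 05 81 05 65 05 70 05 6B")

# The generic "utf-16" codec uses the BOM to pick the byte order and drops it.
text = raw.decode("utf-16")

# Reading the same payload (without the BOM) as little-endian swaps the two
# bytes of every code unit, which is what a raw cast on x86 would see.
swapped = raw[2:].decode("utf-16-le")

print([hex(ord(c)) for c in text])
print([hex(ord(c)) for c in swapped])
```

The first list holds the code points used in the working literal (0x531, 0x565, ...); the second holds byte-swapped values such as 0x3105, which are unrelated characters.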
I ran into this error and I don't quite understand what happens here.
I hope the code snippet is sufficient to make my point.
I am modifying a callback to inject my own data into a response to a server.
Basically the callstack looks as follows:
mainRoutine(browseFunction(browseInternal), myBrowseFunction))
^ takes response ^ sends response ^ catches and modifies response
and sends callback
So what happens:
I have a server and a few hundred static nodes to be read. Now I want to support a highly dynamic messaging system. Creating a node takes 300kb, so that is not an option, as there are tens of thousands of them, created and deleted within seconds. Therefore I inject the message into the response and pretend a node was read.
So much for the theory. This system worked in other context already, so there is no doubt the server can handle the fake response...
Some code follows. It is written in C++, but because the server stack is in C, new and delete are not available. All variables are initialized and filled with sensible values, as far as possible.
volatile int pNoOfNodesToAppend = 5;
Boolean xAdapter::Browse(BaseNode *pNode, BrowseContext* pBrowseCtx, int i)
{
[... some initializations....]
BrowseResult* pBrowseResult = &pResponse->Results[i];
int NoOfReferences = pBrowseResult->NoOfReferences + pNoOfNodesToAppend;
pResponse->NoOfResults = NoOfReferences;
// Version one
ReferenceDescription* refDesc = reinterpret_cast <ReferenceDescription *>(realloc(
pBrowseResult->References,
sizeof(OpcUa_ReferenceDescription) * NoOfReferences));
// Version two: I tried this just out of curiosity, to see whether copying "by hand" would cause the program to crash because not enough memory was allocated - but no problem there.
/*
ReferenceDescription* refDesc = reinterpret_cast <ReferenceDescription *>(malloc( NoOfReferences * sizeof(ReferenceDescription)));
for (int k = 0; k < NoOfReferences; k++)
{
memcpy(&pBrowseResult->References[0], &pBrowseResult->References[k], sizeof(ReferenceDescription));
}
*/
int size = _msize(refDesc);
pBrowseResult->NoOfReferences = NoOfReferences;
if (refDesc != NULL)
{
pBrowseResult->References = refDesc;
}
else
{
return False;
/* Errorhandling ... */
}
[Fill with data... check for errors, handle errors]
return True;
}
I know this code looks cumbersome, but most of it cannot be done more simply because of the underlying stack, which gives me a hard time casting types back and forth and contains lots and lots of structures.
This code compiles and runs fine, but once the callback is sent it crashes with an access violation at ABABABAB, which, as I found out, is a magic number used by the Microsoft debug heap to mark guard bytes around HeapAlloc() memory (4 bytes before and after).
See here: Magic_debug_values
Edit: This section is solved. I was just too blind to realize that we are talking about HEX here and thus too dumb to calculate my numbers correctly. So consider it unworthy of reading except for understanding of comments.
What really gives me headache is the memsize of the allocated new array.
NoOfReferences: 6
sizeof(ReferenceDescription) 0x00000080 unsigned int
(NoOfReferences * sizeof(ReferenceDescription)) 0x00000300 unsigned long
sizeof(*refDesc) 0x00000080 unsigned int //pointer to first element of array
_msize says:
size of (*refDesc) 0x00000300 int
Now WHY is the size of the newly allocated space 300? If my mind is not playing tricks on me then 6*80 is 480, even if there were 8 guard bits around every single element it would still be 72*6 > 300 bit. Anyway the system proceeds normally.
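As the edit note above says, the debugger values are hexadecimal, so the arithmetic has to be done in hex as well. A quick check, using Python purely as a calculator (the variable names are illustrative):

```python
no_of_references = 6
sizeof_reference_description = 0x80  # 128 bytes, shown by the debugger as 0x00000080

total = no_of_references * sizeof_reference_description
print(hex(total), total)

# Reading 0x80 as decimal 80 is what produces the misleading 480.
misread = no_of_references * 80
print(misread)
```

So the 0x300 reported by _msize is exactly 6 elements of 0x80 bytes each, i.e. 768 bytes in decimal.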
Now in the next chunk of code the structures in the array are filled with useful data and handed back to the Response structure.
The callback is sent, the server goes back to ServerMain() and then crashes with a first-chance and unhandled exception:
Unhandled exception at 0x5f95ed6a in demoserver.exe: 0xC0000005:
Access violation reading location 0xabababab.
Memory
0x5F95ED6A f3 a5 ff 24 95 84 ee 95 5f 90 8b c7 ba 03 00 00 00 83 e9 04 72 ó¥ÿ$..î._..Ǻ....ƒé.r
0x5F95ED7F 0c 83 e0 03 03 c8 ff 24 85 98 ed 95 5f ff 24 8d 94 ee 95 5f 90 .ƒà..Èÿ$.˜í._ÿ$.”î._.
0x5F95ED94 ff 24 8d 18 ee 95 5f 90 a8 ed 95 5f d4 ed 95 5f f8 ed 95 5f 23 ÿ$..î._.¨í._Ôí._øí._#
0x5F95EDA9 d1 8a 06 88 07 8a 46 01 88 47 01 8a 46 02 c1 e9 02 88 47 02 83 ÑŠ.ˆ.ŠF.ˆG.ŠF.Áé.ˆG.ƒ
0x5F95EDBE c6 03 83 c7 03 83 f9 08 72 cc f3 a5 ff 24 95 84 ee 95 5f 8d 49 Æ.ƒÇ.ƒù.rÌó¥ÿ$..î._.I
0x5F95EDD3 00 23 d1 8a 06 88 07 8a 46 01 c1 e9 02 88 47 01 83 c6 02 83 c7 .#ÑŠ.ˆ.ŠF.Áé.ˆG.ƒÆ.ƒÇ
0x5F95EDE8 02 83 f9 08 72 a6 f3 a5 ff 24 95 84 ee 95 5f 90 23 d1 8a 06 88 .ƒù.r¦ó¥ÿ$..î._.#ÑŠ.ˆ
0x5F95EDFD 07 83 c6 01 c1 e9 02 83 c7 01 83 f9 08 72 88 f3 a5 ff 24 95 84 .ƒÆ.Áé.ƒÇ.ƒù.rˆó¥ÿ$..
0x5F95EE12 ee 95 5f 8d 49 00 7b ee 95 5f 68 ee 95 5f 60 ee 95 5f 58 ee 95 î._.I.{î._hî._`î._Xî.
0x5F95EE27 5f 50 ee 95 5f 48 ee 95 5f 40 ee 95 5f 38 ee 95 5f 8b 44 8e e4 _Pî._Hî._#î._8î._.DŽä
0x5F95EE3C 89 44 8f e4 8b 44 8e e8 89 44 8f e8 8b 44 8e ec 89 44 8f ec 8b .D.ä.DŽè.D.è.DŽì.D.ì.
0x5F95EE51 44 8e f0 89 44 8f f0 8b 44 8e f4 89 44 8f f4 8b 44 8e f8 89 44 DŽð.D.ð.DŽô.D.ô.DŽø.D
0x5F95EE66 8f f8 8b 44 8e fc 89 44 8f fc 8d 04 8d 00 00 00 00 03 f0 03 f8 .ø.DŽü.D.ü........ð.ø
0x5F95EE7B ff 24 95 84 ee 95 5f 8b ff 94 ee 95 5f 9c ee 95 5f a8 ee 95 5f ÿ$..î._.ÿ”î._œî._¨î._
0x5F95EE90 bc ee 95 5f 8b 45 08 5e 5f c9 c3 90 8a 06 88 07 8b 45 08 5e 5f .î._.E.^_ÉÃ.Š.ˆ..E.^_
0x5F95EEA5 c9 c3 90 8a 06 88 07 8a 46 01 88 47 01 8b 45 08 5e 5f c9 c3 8d ÉÃ.Š.ˆ.ŠF.ˆG..E.^_ÉÃ.
So the mistake was found. The problem was neither the allocation nor the reassignment of the array, but rather the fact that the API didn't behave as expected and marshalled several callbacks. Trying to add mine by appending it crashed and caused the exception, as it was not meant to be done that way. (The solution and structure would be too complex to post here.)
Thank you for your time and hints anyway, I learned a lot while chasing the errors!
I'm working with a legacy application and I'm trying to work out the difference between applications compiled with Multi byte character set and Not Set under the Character Set option.
I understand that compiling with Multi byte character set defines _MBCS which allows multi byte character set code pages to be used, and using Not set doesn't define _MBCS, in which case only single byte character set code pages are allowed.
In the case that Not Set is used, I'm assuming then that we can only use the single byte character set code pages found on this page: http://msdn.microsoft.com/en-gb/goglobal/bb964654.aspx
Therefore, am I correct in thinking that if Not Set is used, the application won't be able to encode and write or read far eastern languages since they are defined in double byte character set code pages (and of course Unicode)?
Following on from this, if Multi byte character set is defined, are both single and multi byte character set code pages available, or only multi byte character set code pages? I'm guessing it must be both for European languages to be supported.
Thanks,
Andy
Further Reading
The answers on these pages didn't answer my question, but helped in my understanding:
About the "Character set" option in visual studio 2010
Research
So, just as working research... With my locale set as Japanese
Effect on hard coded strings
char *foo = "Jap text: テスト";
wchar_t *bar = L"Jap text: テスト";
Compiling with Unicode
*foo = 4a 61 70 20 74 65 78 74 3a 20 83 65 83 58 83 67 == Shift-Jis (Code page 932)
*bar = 4a 00 61 00 70 00 20 00 74 00 65 00 78 00 74 00 3a 00 20 00 c6 30 b9 30 c8 30 == UTF-16 or UCS-2
Compiling with Multi byte character set
*foo = 4a 61 70 20 74 65 78 74 3a 20 83 65 83 58 83 67 == Shift-Jis (Code page 932)
*bar = 4a 00 61 00 70 00 20 00 74 00 65 00 78 00 74 00 3a 00 20 00 c6 30 b9 30 c8 30 == UTF-16 or UCS-2
Compiling with Not Set
*foo = 4a 61 70 20 74 65 78 74 3a 20 83 65 83 58 83 67 == Shift-Jis (Code page 932)
*bar = 4a 00 61 00 70 00 20 00 74 00 65 00 78 00 74 00 3a 00 20 00 c6 30 b9 30 c8 30 == UTF-16 or UCS-2
Conclusion:
The character encoding doesn't have any effect on hard coded strings. Although defining chars as above seems to use the Locale defined codepage and wchar_t seems to use either UCS-2 or UTF-16.
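The two byte dumps can be reproduced outside Visual Studio. A minimal Python sketch, assuming Python's cp932 codec as a stand-in for Windows code page 932 and utf-16-le for what a wchar_t string holds on Windows:

```python
text = "Jap text: テスト"

# Narrow (char) literal under a Japanese locale: code page 932 / Shift-JIS.
narrow = text.encode("cp932")

# Wide (wchar_t) literal on Windows: UTF-16, little-endian, no BOM.
wide = text.encode("utf-16-le")

print(narrow.hex(" "))
print(wide.hex(" "))
```

The printed bytes match the *foo and *bar dumps above, which fits the observation that the literal's type, not the Character Set option, decides the encoding.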
Using encoded strings in W/A versions of Win32 APIs
So, using the following code:
char *foo = "C:\\Temp\\テスト\\テa.txt";
wchar_t *bar = L"C:\\Temp\\テスト\\テw.txt";
CreateFileA(foo, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
CreateFileW(bar, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
Compiling with Unicode
Result: Both files are created
Compiling with Multi byte character set
Result: Both files are created
Compiling with Not set
Result: Both files are created
Conclusion:
Both the A and W version of the API expect the same encoding regardless of the character set chosen. From this, perhaps we can assume that all the Character Set option does is switch between the version of the API. So the A version always expects strings in the encoding of the current code page and the W version always expects UTF-16 or UCS-2.
Opening files using W and A Win32 APIs
So using the following code:
char filea[MAX_PATH] = {0};
OPENFILENAMEA ofna = {0};
ofna.lStructSize = sizeof ( ofna );
ofna.hwndOwner = NULL ;
ofna.lpstrFile = filea ;
ofna.nMaxFile = MAX_PATH;
ofna.lpstrFilter = "All\0*.*\0Text\0*.TXT\0";
ofna.nFilterIndex =1;
ofna.lpstrFileTitle = NULL ;
ofna.nMaxFileTitle = 0 ;
ofna.lpstrInitialDir=NULL ;
ofna.Flags = OFN_PATHMUSTEXIST|OFN_FILEMUSTEXIST ;
wchar_t filew[MAX_PATH] = {0};
OPENFILENAMEW ofnw = {0};
ofnw.lStructSize = sizeof ( ofnw );
ofnw.hwndOwner = NULL ;
ofnw.lpstrFile = filew ;
ofnw.nMaxFile = MAX_PATH;
ofnw.lpstrFilter = L"All\0*.*\0Text\0*.TXT\0";
ofnw.nFilterIndex =1;
ofnw.lpstrFileTitle = NULL;
ofnw.nMaxFileTitle = 0 ;
ofnw.lpstrInitialDir=NULL ;
ofnw.Flags = OFN_PATHMUSTEXIST|OFN_FILEMUSTEXIST ;
GetOpenFileNameA(&ofna);
GetOpenFileNameW(&ofnw);
and selecting either:
C:\Temp\テスト\テopena.txt
C:\Temp\テスト\テopenw.txt
Yields:
When compiled with Unicode
*filea = 43 3a 5c 54 65 6d 70 5c 83 65 83 58 83 67 5c 83 65 6f 70 65 6e 61 2e 74 78 74 == Shift-Jis (Code page 932)
*filew = 43 00 3a 00 5c 00 54 00 65 00 6d 00 70 00 5c 00 c6 30 b9 30 c8 30 5c 00 c6 30 6f 00 70 00 65 00 6e 00 77 00 2e 00 74 00 78 00 74 00 == UTF-16 or UCS-2
When compiled with Multi byte character set
*filea = 43 3a 5c 54 65 6d 70 5c 83 65 83 58 83 67 5c 83 65 6f 70 65 6e 61 2e 74 78 74 == Shift-Jis (Code page 932)
*filew = 43 00 3a 00 5c 00 54 00 65 00 6d 00 70 00 5c 00 c6 30 b9 30 c8 30 5c 00 c6 30 6f 00 70 00 65 00 6e 00 77 00 2e 00 74 00 78 00 74 00 == UTF-16 or UCS-2
When compiled with Not Set
*filea = 43 3a 5c 54 65 6d 70 5c 83 65 83 58 83 67 5c 83 65 6f 70 65 6e 61 2e 74 78 74 == Shift-Jis (Code page 932)
*filew = 43 00 3a 00 5c 00 54 00 65 00 6d 00 70 00 5c 00 c6 30 b9 30 c8 30 5c 00 c6 30 6f 00 70 00 65 00 6e 00 77 00 2e 00 74 00 78 00 74 00 == UTF-16 or UCS-2
Conclusion:
Again, the Character Set setting doesn't have a bearing on the behaviour of the Win32 API. The A version always seems to return a string with the encoding of the active code page and the W one always returns UTF-16 or UCS-2. I can actually see this explained a bit in this great answer: https://stackoverflow.com/a/3299860/187100.
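As a cross-check, the two buffers can be decoded back into paths. This is a sketch in Python, again assuming its cp932 codec corresponds to the active code page 932 and utf-16-le to the W API's encoding:

```python
# Buffer filled by GetOpenFileNameA, copied from the dump above.
filea = bytes.fromhex(
    "43 3a 5c 54 65 6d 70 5c 83 65 83 58 83 67 5c 83 65 6f 70 65 6e 61 2e 74 78 74")

# Buffer filled by GetOpenFileNameW, copied from the dump above.
filew = bytes.fromhex(
    "43 00 3a 00 5c 00 54 00 65 00 6d 00 70 00 5c 00 c6 30 b9 30 c8 30 5c 00 "
    "c6 30 6f 00 70 00 65 00 6e 00 77 00 2e 00 74 00 78 00 74 00")

path_a = filea.decode("cp932")      # A API: active code page
path_w = filew.decode("utf-16-le")  # W API: UTF-16
print(path_a)
print(path_w)
```

Each buffer decodes cleanly only with its own encoding, which is the point of the conclusion above.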
Ultimate Conclusion
Hans appears to be correct when he says that the define doesn't really have any magic to it, beyond changing the Win32 APIs to use either W or A. Therefore, I can't really see any difference between Not Set and Multi byte character set.
No, that's not really the way it works. The only thing that happens is that the macro gets defined, it doesn't otherwise have a magic effect on the compiler. It is very rare to actually write code that uses #ifdef _MBCS to test this macro.
You almost always leave it up to a helper function to make the conversion, like WideCharToMultiByte(), OLE2A() or wcstombs(). These are conversion functions that always consider multi-byte encodings, as guided by the code page. _MBCS is a historical accident, relevant only 25+ years ago when multi-byte encodings were not common yet. Much like using a non-Unicode encoding is a historical artifact these days as well.
In the reference it is stated that:
By definition, the ASCII character set is a subset of all
multibyte-character sets. In many multibyte character sets, each
character in the range 0x00 – 0x7F is identical to the character that
has the same value in the ASCII character set. For example, in both
ASCII and MBCS character strings, the 1-byte NULL character ('\0') has
value 0x00 and indicates the terminating null character.
As you guessed, with _MBCS enabled Visual Studio still supports the single-byte ASCII character set.
In a second reference, single-byte character sets seem to be supported even if we enable _MBCS:
MBCS/Unicode portability: Using the Tchar.h header file, you can build
single-byte, MBCS, and Unicode applications from the same sources.
Tchar.h defines macros prefixed with _tcs , which map to str, _mbs, or
wcs functions, as appropriate. To build MBCS, define the symbol _MBCS.
To build Unicode, define the symbol _UNICODE. By default, _MBCS is
defined for MFC applications. For more information, see Generic-Text
Mappings in Tchar.h.
I'm working on understanding and drawing my own DLL for PDF417 (2d barcodes). Anyhow, the actual drawing of the file is perfect, and in correct boundaries of 32 bits (as monochrome result). At the time of writing the data, the following is a memory dump as copied from C++ Visual Studio memory dump of the pointer to the bmp buffer. Each row is properly allocated to 36 wide before the next row.
Sorry about the wordwrap in the post, but my output was intended to be the same 36 bytes wide as the memory dump so you could better see the distortion.
The current drawing is 273 pixels wide by 12 pixels high, monochrome...
00 ab a8 61 d7 18 ed 18 f7 a3 89 1c dd 70 86 f5 f7 1a 20 91 3b c9 27 e7 67 12 1c 68 ae 3c b7 3e 02 eb 00 00
00 ab a8 61 d7 18 ed 18 f7 a3 89 1c dd 70 86 f5 f7 1a 20 91 3b c9 27 e7 67 12 1c 68 ae 3c b7 3e 02 eb 00 00
00 ab a8 61 d7 18 ed 18 f7 a3 89 1c dd 70 86 f5 f7 1a 20 91 3b c9 27 e7 67 12 1c 68 ae 3c b7 3e 02 eb 00 00
00 ab 81 4b ca 07 6b 9c 11 40 9a e6 0c 76 0a fc a3 33 70 bb 30 55 87 e9 c4 10 58 d9 ea 0d 48 3e 02 eb 00 00
00 ab 81 4b ca 07 6b 9c 11 40 9a e6 0c 76 0a fc a3 33 70 bb 30 55 87 e9 c4 10 58 d9 ea 0d 48 3e 02 eb 00 00
00 ab 81 4b ca 07 6b 9c 11 40 9a e6 0c 76 0a fc a3 33 70 bb 30 55 87 e9 c4 10 58 d9 ea 0d 48 3e 02 eb 00 00
00 ab 85 7e d0 29 e8 14 f4 0a 7a 05 3c 37 ba 86 87 04 db b6 09 dc a0 62 fc d1 31 79 bc 5c 0a 8e 02 eb 00 00
00 ab 85 7e d0 29 e8 14 f4 0a 7a 05 3c 37 ba 86 87 04 db b6 09 dc a0 62 fc d1 31 79 bc 5c 0a 8e 02 eb 00 00
00 ab 85 7e d0 29 e8 14 f4 0a 7a 05 3c 37 ba 86 87 04 db b6 09 dc a0 62 fc d1 31 79 bc 5c 0a 8e 02 eb 00 00
00 ab 85 43 c5 30 e2 26 70 4a 1a f3 e4 4d ce 2a 3f 79 cd bc e6 de 73 6f 39 b7 9c db ce 6d 5f be 02 eb 00 00
00 ab 85 43 c5 30 e2 26 70 4a 1a f3 e4 4d ce 2a 3f 79 cd bc e6 de 73 6f 39 b7 9c db ce 6d 5f be 02 eb 00 00
00 ab 85 43 c5 30 e2 26 70 4a 1a f3 e4 4d ce 2a 3f 79 cd bc e6 de 73 6f 39 b7 9c db ce 6d 5f be 02 eb 00 00
Here is the code to WRITE the file out -- verbatim immediately at the time of the memory dump from above
FILE *stream;
if( fopen_s( &stream, cSaveToFile, "w+" ) == 0 )
{
fwrite( &bmfh, 1, (UINT)sizeof(BITMAPFILEHEADER), stream );
fwrite( &bmi, 1, (UINT)sizeof(BITMAPINFO), stream );
fwrite( &RGBWhite, 1, (UINT)sizeof(RGBQUAD), stream );
fwrite( ppvBits, 1, (UINT)bmi.bmiHeader.biSizeImage, stream );
fclose( stream );
}
Here's what ACTUALLY Gets written to the file.
00 ab a8 61 d7 18 ed 18 f7 a3 89 1c dd 70 86 f5 f7 1a 20 91 3b c9 27 e7 67 12 1c 68 ae 3c b7 3e 02 eb 00 00
00 ab a8 61 d7 18 ed 18 f7 a3 89 1c dd 70 86 f5 f7 1a 20 91 3b c9 27 e7 67 12 1c 68 ae 3c b7 3e 02 eb 00 00
00 ab a8 61 d7 18 ed 18 f7 a3 89 1c dd 70 86 f5 f7 1a 20 91 3b c9 27 e7 67 12 1c 68 ae 3c b7 3e 02 eb 00 00
00 ab 81 4b ca 07 6b 9c 11 40 9a e6 0c 76 0d 0a fc a3 33 70 bb 30 55 87 e9 c4 10 58 d9 ea 0d 48 3e 02 eb 00
00 00 ab 81 4b ca 07 6b 9c 11 40 9a e6 0c 76 0d 0a fc a3 33 70 bb 30 55 87 e9 c4 10 58 d9 ea 0d 48 3e 02 eb
00 00 00 ab 81 4b ca 07 6b 9c 11 40 9a e6 0c 76 0d 0a fc a3 33 70 bb 30 55 87 e9 c4 10 58 d9 ea 0d 48 3e 02
eb 00 00 00 ab 85 7e d0 29 e8 14 f4 0d 0a 7a 05 3c 37 ba 86 87 04 db b6 09 dc a0 62 fc d1 31 79 bc 5c 0d 0a
8e 02 eb 00 00 00 ab 85 7e d0 29 e8 14 f4 0d 0a 7a 05 3c 37 ba 86 87 04 db b6 09 dc a0 62 fc d1 31 79 bc 5c
0d 0a 8e 02 eb 00 00 00 ab 85 7e d0 29 e8 14 f4 0d 0a 7a 05 3c 37 ba 86 87 04 db b6 09 dc a0 62 fc d1 31 79
bc 5c 0d 0a 8e 02 eb 00 00 00 ab 85 43 c5 30 e2 26 70 4a 1a f3 e4 4d ce 2a 3f 79 cd bc e6 de 73 6f 39 b7 9c
db ce 6d 5f be 02 eb 00 00 00 ab 85 43 c5 30 e2 26 70 4a 1a f3 e4 4d ce 2a 3f 79 cd bc e6 de 73 6f 39 b7 9c
db ce 6d 5f be 02 eb 00 00 00 ab 85 43 c5 30 e2 26 70 4a 1a f3 e4 4d ce 2a 3f 79 cd bc e6 de 73 6f 39 b7 9c
db ce 6d 5f be 02 eb 00 00
Notice the start of the distortion with the "0d" in the result from reading the file back in the 4th line, about the 15th byte over... Then, there are a few more staggered around which in total, skew the image off by 9 bytes worth...
Obviously, the drawing portion is working ok as everything remains properly aligned in memory for the 12 lines.
Shouldn't you open the file in a compound mode i.e. writable & binary as in wb+?
Notice the start of the distortion with the "0d"
That's ASCII code for Carriage Return (CR) -- added on some OSes with newline (where a newline is actually a sequence of CR/LF). This should go away once you start writing the output in binary mode.
Your code looks neat otherwise. Cheers!
Your 0x0A (\n) gets converted to the DOS format 0x0D 0x0A (\r\n) because you're writing the file in text mode. Switch to binary mode.
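The size of the skew can be checked against the memory dump from the question. The Python sketch below does not call the C runtime; as an explicit assumption, it imitates the text-mode translation with a plain bytes.replace() of every 0x0A by 0x0D 0x0A, applied to the four distinct rows of the dump (each appears three times):

```python
ROW_BYTES = 36

# The four distinct 36-byte rows from the memory dump; each occurs 3 times.
rows_hex = [
    "00 ab a8 61 d7 18 ed 18 f7 a3 89 1c dd 70 86 f5 f7 1a 20 91 3b c9 27 e7 67 12 1c 68 ae 3c b7 3e 02 eb 00 00",
    "00 ab 81 4b ca 07 6b 9c 11 40 9a e6 0c 76 0a fc a3 33 70 bb 30 55 87 e9 c4 10 58 d9 ea 0d 48 3e 02 eb 00 00",
    "00 ab 85 7e d0 29 e8 14 f4 0a 7a 05 3c 37 ba 86 87 04 db b6 09 dc a0 62 fc d1 31 79 bc 5c 0a 8e 02 eb 00 00",
    "00 ab 85 43 c5 30 e2 26 70 4a 1a f3 e4 4d ce 2a 3f 79 cd bc e6 de 73 6f 39 b7 9c db ce 6d 5f be 02 eb 00 00",
]
image = b"".join(bytes.fromhex(r) for r in rows_hex for _ in range(3))

# Assumption: this replace() stands in for what a text-mode stream ("w+")
# does on Windows, where every LF byte is written as CR LF.
text_mode_output = image.replace(b"\x0a", b"\x0d\x0a")

extra = len(text_mode_output) - len(image)
first_lf_row, first_lf_col = divmod(image.index(b"\x0a"), ROW_BYTES)

print(len(image), len(text_mode_output), extra)
print(first_lf_row, first_lf_col)  # zero-based row and column of the first 0x0A
```

The first 0x0A sits in the fourth row at the fifteenth byte, and the pixel data contains nine 0x0A bytes in total, which matches the 9-byte skew described in the question. Opening the stream with "wb" or "wb+" writes the buffer unchanged.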
I actually just did a similar thing in Java (printing BMP data to a thermal receipt printer). There are a couple of things I want to share with you:
Raw bitmap image data is not the same as the BMP image format from Microsoft. The MS bitmap has about 54 bytes of header information before any image data. (I spent a day or two working on this before I realized the difference.)
Bitmap image data reads left to right, top to bottom, with the most significant bit on the left.
Make sure the barcode image has a bit depth of 1. This means 1 bit = 1 pixel. Hexadecimal "ab" is 10101011 in binary; those 8 pixels will be filled in accordingly.
If you have a barcode 36 bytes wide, the barcode resolution is 288 x 12, not 273 x 12 (36 * 8 = 288).
The image data should be 432 bytes in size (12 rows of 36 bytes).
I don't know what this means:
Anyhow, the actual drawing of the file is perfect, and in correct boundaries of 32 bits (as monochrome result).
Monochrome means it's either one color or another. The pixel (think bit) is either filled in or it isn't.
Hope this helps