We have an MFC application that handles data with either MS-Access, Oracle or SQL-Server.
For a specific treatment, we have to use database transactions.
On Oracle or SQL-Server, everything is ok, but in MS-Access, we get the "File sharing lock count exceeded. Increase MaxLocksPerFile registry entry." message, as described here http://support.microsoft.com/kb/815281
The question is: what is the maximum sensible value that I can put there?
I was thinking about setting this value programmatically at application startup...
Thanks
After doing some research, I didn't see any article mentioning a maximum value.
Theoretically, the value being a DWORD, we could put up to FFFFFFFF (hex) or 4294967295 (decimal), but I'm not sure if it's a coherent value.
I'm trying to use Kafka Streams to aggregate some attributes of people.
I have a Kafka Streams test like this:
val factory = new ConsumerRecordFactory[Array[Byte], Character]("input", new ByteArraySerializer(), new CharacterSerializer())
var i = 0
while (i != 5) {
  testDriver.pipeInput(
    factory.create("input",
      Character(123, 12), 15 * 10000L))
  i += 1
}
val output = testDriver.readOutput....
I'm trying to group the values by key like this:
streamBuilder.stream[Array[Byte], Character](inputKafkaTopic)
  .filter((key, _) => key == null)
  .mapValues(character => PersonInfos(character.id, character.id2, character.age)) // case class
  .groupBy((_, value) => CharacterInfos(value.id, value.id2)) // case class
  .count().toStream.print(Printed.toSysOut[CharacterInfos, Long])
When I run the code, I get this:
[KTABLE-TOSTREAM-0000000012]: CharacterInfos(123,12), 1
[KTABLE-TOSTREAM-0000000012]: CharacterInfos(123,12), 2
[KTABLE-TOSTREAM-0000000012]: CharacterInfos(123,12), 3
[KTABLE-TOSTREAM-0000000012]: CharacterInfos(123,12), 4
[KTABLE-TOSTREAM-0000000012]: CharacterInfos(123,12), 5
Why am I getting 5 rows instead of just one line with CharacterInfos and the count?
Doesn't groupBy just change the key?
If you use the TopologyTestDriver, caching is effectively disabled and thus every input record will always produce an output record. This is by design, because caching implies non-deterministic behavior, which makes it very hard to write an actual unit test.
If you deploy the code in a real application, the behavior will be different and caching will reduce the output load -- which intermediate results you will get, is not defined (ie, non-deterministic); compare Michael Noll's answer.
For your unit test, it should actually not really matter, and you can either test for all output records (ie, all intermediate results), or put all output records into a key-value Map and only test for the last emitted record per key (if you don't care about the intermediate results) in the test.
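The "last emitted record per key" check can be sketched as follows. This is plain Python rather than the Kafka Streams test API: the output records are modelled as (key, value) tuples shaped like the printed output in the question, and the helper name `last_value_per_key` is made up for illustration.

```python
def last_value_per_key(records):
    """Collapse a sequence of (key, value) updates to the final value per key."""
    latest = {}
    for key, value in records:
        # Later updates for the same key overwrite earlier ones.
        latest[key] = value
    return latest


# Five intermediate updates for one key, as in the question's output.
emitted = [(("CharacterInfos", 123, 12), n) for n in range(1, 6)]
final_counts = last_value_per_key(emitted)
```

A test can then assert on `final_counts` alone and ignore how many intermediate updates were emitted.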
Furthermore, you could use the suppress() operator to get fine-grained control over what output messages you get. Unlike caching, suppress() is fully deterministic, so writing a unit test works well. However, note that suppress() is event-time driven, and thus, if you stop sending new records, time does not advance and suppress() does not emit data. For unit testing, this is important to consider, because you might need to send some additional "dummy" data to trigger the output you actually want to test for. For more details on suppress() check out this blog post: https://www.confluent.io/blog/kafka-streams-take-on-watermarks-and-triggers
Update: I didn't spot the line in the example code that refers to the TopologyTestDriver in Kafka Streams. My answer below is for the 'normal' KStreams application behavior, whereas the TopologyTestDriver behaves differently. See the answer by Matthias J. Sax for the latter.
This is expected behavior. Somewhat simplified, Kafka Streams emits by default a new output record as soon as a new input record was received.
When you are aggregating (here: counting) the input data, then the aggregation result will be updated (and thus a new output record produced) as soon as new input was received for the aggregation.
input record 1 ---> new output record with count=1
input record 2 ---> new output record with count=2
...
input record 5 ---> new output record with count=5
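A toy model of this emit-on-update behavior, written in plain Python and not using the Kafka Streams API, might look like the sketch below. The function name `count_with_updates` is invented for the example; it only illustrates that every input record changes the running count and therefore yields one output record.

```python
from collections import defaultdict


def count_with_updates(keys):
    """Return one (key, running_count) update per input record."""
    counts = defaultdict(int)
    updates = []
    for key in keys:
        counts[key] += 1
        # The aggregate changed, so a new output record is emitted.
        updates.append((key, counts[key]))
    return updates


updates = count_with_updates(["CharacterInfos(123,12)"] * 5)
```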
What to do about it: You can reduce the number of 'intermediate' outputs through configuring the size of the so-called record caches as well as the setting of the commit.interval.ms parameter. See Memory Management. However, how much reduction you will be seeing depends not only on these settings but also on the characteristics of your input data, and because of that the extent of the reduction may also vary over time (think: could be 90% in the first hour of data, 76% in the second hour of data, etc.). That is, the reduction process is deterministic, but the resulting reduction amount is difficult to predict from the outside.
Note: When doing windowed aggregations (like windowed counts) you can also use the Suppress() API so that the number of intermediate updates is not only reduced, but there will only ever be a single output per window. However, in your use case/code the aggregation is not windowed, so you cannot use the Suppress API.
To help you understand why the setup is this way: You must keep in mind that a streaming system generally operates on unbounded streams of data, which means the system doesn't know 'when it has received all the input data'. So even the term 'intermediate outputs' is actually misleading: at the time the second input record was received, for example, the system believes that the result of the (non-windowed) aggregation is '2' -- it's the correct result to the best of its knowledge at this point in time. It cannot predict whether (or when) another input record might arrive.
For windowed aggregations (where Suppress is supported) this is a bit easier, because the window size defines a boundary for the input data of a given window. Here, the Suppress() API allows you to make a trade-off decision between better latency but with multiple outputs per window (default behavior, Suppress disabled) and longer latency but you'll get only a single output per window (Suppress enabled). In the latter case, if you have 1h windows, you will not see any output for a given window until 1h later, so to speak. For some use cases this is acceptable, for others it is not.
I'm trying to develop a function that refreshes the token model in Django REST framework. They seem to use binascii.hexlify(os.urandom(32)).decode() to generate a unique token for every user. How does this line ensure that the generated token will always be unique? Suppose I want to refresh the content of a token every 10 months: will binascii.hexlify(os.urandom(32)).decode() generate a unique key that is not used by any current user, or do I need to check whether it is already in use?
help(os.urandom) says:
Return a bytes object containing random bytes suitable for cryptographic use.
On Linux this will use the /dev/urandom character device, which is designed to be cryptographically secure. The only time it could fail to do so would be the very early stage of boot, when the entropy pool is not initialized yet [1]. But once it's initialized and seeded from the previous seed, device drivers and so on, you get cryptographic-grade randomness.
Also check man 4 urandom.
[1] The getrandom(2) system call is there for these cases; unlike reading from /dev/urandom, it blocks.
binascii.hexlify(os.urandom(32)).decode():
os.urandom(32) returns 32 bytes of random data
binascii.hexlify returns the hex representation of those bytes
as the return value of hexlify is bytes, we need to decode it to get a string
So as the original random bytes are being retrieved from os.urandom this should be (cryptographically) secure randomness.
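On the uniqueness part of the question: nothing in this line guarantees uniqueness, it is only overwhelmingly likely, because 32 random bytes are 256 bits. A minimal sketch that puts a number on it using the standard birthday bound is shown below; the function names `generate_key` and `collision_upper_bound` are made up for the example.

```python
import binascii
import os


def generate_key():
    """Same expression as in the question: 32 random bytes as 64 hex chars."""
    return binascii.hexlify(os.urandom(32)).decode()


def collision_upper_bound(num_tokens, bits=256):
    """Birthday bound: P(any two tokens equal) <= n*(n-1)/2 / 2**bits."""
    return num_tokens * (num_tokens - 1) / 2 / 2**bits


token = generate_key()
# Upper bound on a collision among a billion tokens issued over time.
bound = collision_upper_bound(10**9)
```

With a bound this small, an explicit "is it already used" lookup adds little; a uniqueness constraint on the stored column, if the model has one, would still catch a duplicate in the practically impossible case.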
Given a Windows socket, I want to determine which values it is using for the TCP keepalive idle time and the TCP keepalive interval time (roughly equivalent to the TCP_KEEPIDLE and TCP_KEEPINTVL settings on Berkeley sockets).
I see that you can set these values using a WSAIoctl call (see http://msdn.microsoft.com/en-us/library/windows/desktop/dd877220%28v=vs.85%29.aspx ). However, there does not appear to be any API for reading their current values. I tried calling WSAIoctl with a populated output parameter but NULL input parameter, like this:
DWORD bytes_returned;
struct tcp_keepalive keepalive_opts;
int rv = WSAIoctl(socket, SIO_KEEPALIVE_VALS, NULL, 0, &keepalive_opts, sizeof(keepalive_opts), &bytes_returned, NULL, NULL);
But this returns WSAEFAULT ("The system detected an invalid pointer address in attempting to use a pointer argument in a call.").
I could call WSAIoctl with both an input and an output parameter, but I don't want to set the values, I just want to read them. And as far as I can tell, providing any non-NULL input parameter would cause the parameters to be set to whatever values happen to be in that memory space (defined by the struct tcp_keepalive; again see http://msdn.microsoft.com/en-us/library/windows/desktop/dd877220%28v=vs.85%29.aspx ).
The above also highlights another problem with not knowing what the current values are: I can't set just one of the keepalive idle time or the keepalive interval time - I must blow away both (unknown) values at the same time since they're both members of the struct I'm required to provide.
I know that I could assume things about what values are set based on Windows documentation, but I'd rather not assume. I see that http://technet.microsoft.com/en-us/library/bb726981.aspx#EDAA defines KeepAliveInterval and KeepAliveTime default values. However, the Parameters folder in my Windows 7 registry does not contain either of those keys, so I really have to rely on the documentation being 100% correct here (to know the default values a socket will receive), which is much worse than programmatically retrieving them (even retrieving them from the registry might be ok, but the above experience shows I can't).
Is there any way to get the current TCP keepalive idle time and the TCP keepalive interval time values for a Windows socket?
Unlike TCP_KEEPIDLE and TCP_KEEPINTVL, which can be used with getsockopt(), there is no way to read the current SIO_KEEPALIVE_VALS values for a socket, only to set them.
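For contrast, the Berkeley-sockets side of that statement can be sketched in Python. This assumes a platform whose socket module exposes TCP_KEEPIDLE and TCP_KEEPINTVL (Linux does); where the constants are missing, the hypothetical helper `read_keepalive_settings` simply returns None. It does not read SIO_KEEPALIVE_VALS.

```python
import socket


def read_keepalive_settings(sock):
    """Return (idle_seconds, interval_seconds), or None if unsupported here."""
    if not (hasattr(socket, "TCP_KEEPIDLE") and hasattr(socket, "TCP_KEEPINTVL")):
        return None
    # getsockopt reads the current per-socket values without changing them.
    idle = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE)
    interval = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL)
    return idle, interval


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    settings = read_keepalive_settings(sock)
```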
I know I can find other answers about this on SO, but I want clarifications from somebody who really knows MPEG-1/MPEG-2 (or MP3, obviously).
The start of an MPEG-1/2 frame is 12 set bits starting at a byte boundary, so bytes ff f*, where * is any nibble. Those 12 bits are called a sync word. This is a useful characteristic to find the start of a frame in any MPEG-1/2 stream.
My first question is: formally, can a false sync word be found or not in the payload of an MPEG-1/2 frame, outside its header?
If so, here's my second question: why does the sync word mechanism even exist then? If we cannot make sure that we found a new frame when reading fff, what is the purpose of this sync word?
Please do not even consider ID3 in your answer; I already know about sync words that can be found in ID3v2 payloads, but that's well documented.
I worked on MPEG-2 streams, more precisely Transport Streams (TS): I guess we can find similarities.
A TS is composed of Transport Packets, which have a header, starting with a sync byte 0x47.
We can also find 0x47 within the payload of a TP, but we know that it is not a sync byte because it is not aligned (TPs have a fixed size of 188 bytes).
The sync word gives an entry point to someone who looks at the stream, and allows a program to synchronize its processing with the stream, hence the name.
It also allows fast browsing and parsing of the stream: in a TS you can jump from one packet to another (inspect header, check sync byte, skip 188 bytes and so on).
Finally, it is a safety measure that helps you spot errors (in the stream during transmission, for example, or in the process if a bug caused a bad alignment).
These arguments are about TS, but I think the same goes for your case: finding a sync word within a payload should not be an issue because you should always be able to distinguish payload and header, most of the time with length information (either because the size is fixed, as in TP, or because you have a TLV format).
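A minimal sketch of the alignment check described above, assuming only what this answer states (188-byte packets that start with 0x47). The stream is synthetic and the helper names `is_aligned`, `find_first_packet` and `make_packet` are invented for the example; real TS headers carry more fields than a single sync byte.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def is_aligned(stream, offset=0, packets_to_check=5):
    """True if 0x47 sits at offset and at each following 188-byte step."""
    for i in range(packets_to_check):
        pos = offset + i * TS_PACKET_SIZE
        if pos >= len(stream) or stream[pos] != SYNC_BYTE:
            return False
    return True


def find_first_packet(stream, packets_to_check=5):
    """Return the first offset at which the stream looks packet-aligned."""
    for offset in range(min(TS_PACKET_SIZE, len(stream))):
        if is_aligned(stream, offset, packets_to_check):
            return offset
    return None


def make_packet(i):
    """Synthetic packet: sync byte plus a stray 0x47 somewhere in the payload."""
    payload = bytearray(TS_PACKET_SIZE - 1)
    payload[10 + i] = SYNC_BYTE  # different position in each packet
    return bytes([SYNC_BYTE]) + bytes(payload)


# Three junk bytes (the first one happens to be 0x47), then six packets.
stream = b"\x47\x00\x01" + b"".join(make_packet(i) for i in range(6))
first = find_first_packet(stream)
```

The stray 0x47 bytes are rejected because they do not repeat every 188 bytes, which is the alignment argument made above.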
"can a false sync word be found or not in the payload of an MPEG-1/2 frame, outside its header?"
According to this, "frame sync can be easily (and very frequently) found in any binary file." See the section titled "MPEG Audio Frame Header"
I confirmed this with an .mp3 song that I chose at random (stripped of ID3 tags). It had 5193 sync words, of which only 4898 were found to be valid (using code too long to be included here).
>>> f = open('notag.mp3', 'rb')
>>> r=f.read()
>>> r.count(b'\xff\xfb')
5193
"why does the sync word mechanism even exist then? If we cannot make sure that we found a new frame when reading fff, what is the purpose of this sync word?"
We can be (relatively) sure if we are checking the rest of the frame header, and not just the sync word. There are bits following the sync which can be used to:
identify a false positive or
give you useful info
With .mp3, you have to use those useful bits to calculate the size of the frame. By skipping ahead <frame-size> bytes before looking for the next sync word, you avoid any false syncs that may be present in the payload. See the section titled "How to calculate frame length" in that same link.
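A rough sketch of that header check and skip-ahead, restricted to MPEG-1 Layer III (the ff fb case counted above). The bitrate and sample-rate tables and the 144 * bitrate / sample_rate + padding formula are the usual ones for that layer; the function names are made up, and free-format bitrates, CRC handling and other MPEG versions or layers are deliberately left out.

```python
# MPEG-1 Layer III only; index 0 (free format) and 15 (invalid) are rejected.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]


def frame_length(header):
    """Frame length in bytes, or None if this is not a plausible header."""
    if len(header) < 4:
        return None
    if header[0] != 0xFF or (header[1] & 0xF0) != 0xF0:
        return None  # 12-bit sync word missing
    version_bit = (header[1] >> 3) & 0x1  # 1 = MPEG-1
    layer_bits = (header[1] >> 1) & 0x3   # 0b01 = Layer III
    if version_bit != 1 or layer_bits != 0b01:
        return None
    bitrate = BITRATES_KBPS[(header[2] >> 4) & 0xF]
    sample_rate = SAMPLE_RATES_HZ[(header[2] >> 2) & 0x3]
    if bitrate is None or sample_rate is None:
        return None  # sync word present, but the header bits are invalid
    padding = (header[2] >> 1) & 0x1
    return 144 * bitrate * 1000 // sample_rate + padding


def walk_frames(data, start=0):
    """Jump from header to header, never scanning inside a payload."""
    offsets = []
    pos = start
    while pos + 4 <= len(data):
        length = frame_length(data[pos:pos + 4])
        if length is None:
            break
        offsets.append(pos)
        pos += length
    return offsets


# Synthetic 128 kbps / 44.1 kHz frames with a stray ff fb in each payload.
header = bytes([0xFF, 0xFB, 0x90, 0x00])
frame = header + b"\x00" * 100 + b"\xff\xfb" + b"\x00" * 311
data = frame * 3
offsets = walk_frames(data)
```

Here `data.count(b'\xff\xfb')` sees six candidates, while walking by frame length visits only the three real headers.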
I want to implement a progress bar in my C++ Windows application when downloading a file using WinHTTP. Any idea how to do this? It looks as though WinHttpSetStatusCallback is what I want to use, but I don't see what notification to look for... or how to get the "percent downloaded"...
Help!
Thanks!
Per the docs:
WINHTTP_CALLBACK_STATUS_DATA_AVAILABLE
Data is available to be retrieved with
WinHttpReadData. The
lpvStatusInformation parameter points
to a DWORD that contains the number of
bytes of data available. The
dwStatusInformationLength parameter
itself is 4 (the size of a DWORD).
and
WINHTTP_CALLBACK_STATUS_READ_COMPLETE
Data was successfully read from the
server. The lpvStatusInformation
parameter contains a pointer to the
buffer specified in the call to
WinHttpReadData. The
dwStatusInformationLength parameter
contains the number of bytes read.
There may be other relevant notifications, but these two seem to be the key ones. Getting "percent" is not necessarily trivial because you may not know how much data you're getting (not all downloads have content-length set...); you can get the headers with:
WINHTTP_CALLBACK_STATUS_HEADERS_AVAILABLE
The response header has been received
and is available with
WinHttpQueryHeaders. The
lpvStatusInformation parameter is
NULL.
and if Content-Length IS available then the percentage can be computed by keeping track of the total number of bytes at each "data available" notification, otherwise your guess is as good as mine;-).
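The bookkeeping itself is small. Below is a language-neutral sketch in Python that models only the arithmetic, not the WinHTTP calls: the class name `DownloadProgress` and its method are invented for the example, `content_length` stands for the value parsed from the Content-Length header (or None if absent), and `on_read_complete` would be fed the byte count reported by each completed read.

```python
from typing import Optional


class DownloadProgress:
    """Tracks bytes received and reports a percentage when the total is known."""

    def __init__(self, content_length: Optional[int]):
        self.content_length = content_length
        self.bytes_read = 0

    def on_read_complete(self, num_bytes: int) -> Optional[float]:
        """Add one chunk; return percent done, or None if the total is unknown."""
        self.bytes_read += num_bytes
        if not self.content_length:
            return None  # no Content-Length: only a byte counter is possible
        return min(100.0, 100.0 * self.bytes_read / self.content_length)


progress = DownloadProgress(content_length=1000)
first = progress.on_read_complete(250)
second = progress.on_read_complete(750)
```

When the total is unknown, showing the running byte count or an indeterminate bar is about all that can be done.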