Apache Arrow Flight: Releasing the flight stream that was created by GetFlightInfo - apache-arrow

According to the Arrow Flight protocol definition, a client (consumer) can ask the server to generate a flight stream for a specified descriptor via GetFlightInfo. The flight stream then remains available for a duration defined by the server (a Flight service).
But there seems to be no RPC message that 'releases' the flight stream that GetFlightInfo generated.
Since the client has no standard way to know or control the duration of the flight stream availability, it is impossible to implement a reliable client application.
And since the server has no standard way to know when the client is done with the flight stream, it is impossible to implement efficient flight stream management.
Of course, the duration can be published and a release method can be implemented in a non-standard way by a client and server that know each other, but a general client (like a BI application) that uses a standard wrapper (for example Apache Arrow Flight SQL, let alone a wrapper of a wrapper such as the Apache Arrow Flight SQL JDBC driver) is out of luck.
Is there any standard way for a client (consumer) to release the flight stream that GetFlightInfo generated? If not, why did the designers choose not to support that feature?

Related

What (standard) protocols offer piecemeal access to streams?

I have an application in which I need to access large files remotely in a piecemeal fashion. I will know the start offset, but - having read some prefix of the file from that position onwards, I will establish another new offset, and will want to read next from this new position - crucially - having suffered the minimum possible latency.
I've considered using HTTP, posting a request detailing the offset at which to start a transfer, but I don't want to either specify a transfer size (a size too small would lead to low throughput; a size too large would lead to unacceptable latency) or drop an open connection, as that incurs a latency penalty on reconnection.
I've considered 'rolling my own' with TCP/UDP and sockets - but it feels as if this approach involves re-inventing the wheel. UDP might promise lowest latency, but I am not in a position to trade reliability for lower-latency.
I would be very interested to be pointed towards any standards (proposals, RFCs - etc.) about protocols to tackle this mode of access to data. Perhaps there's a good approach developed already in the context of cloud storage?
What you want, I believe, is a variation of the FTP protocol (RFC 959: https://www.rfc-editor.org/rfc/rfc959). I don't think there's any established protocol standard for exactly what you want to do, but FTP is very close. It uses two connections, a "control connection" and a "data connection". The control connection carries commands from client to server and status messages back, and the data connection is used separately to transfer the data. It sounds like this is the kind of system you need to set up.
The main thing you want to do that's different is to be able to seek to and transfer data from arbitrary offsets in your file which can be easily accomplished via custom commands. Depending on your setup, you may be able to grab existing open source implementations of FTP clients and servers and just add your custom commands.
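A minimal sketch of what such a custom command could look like on the server side. The `SEEK <offset>` control line is a made-up command for illustration (it is not defined by RFC 959), and a `std::istringstream` stands in for the remote file. The helper names `parse_seek`, `apply_seek` and `next_chunk` are likewise hypothetical.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Parses a hypothetical control line of the form "SEEK <offset>".
// Returns true and stores the offset on success.
bool parse_seek(const std::string& line, std::streamoff& offset) {
    std::istringstream in(line);
    std::string verb;
    long long value = 0;
    if (!(in >> verb >> value) || verb != "SEEK" || value < 0) return false;
    offset = static_cast<std::streamoff>(value);
    return true;
}

// Repositions the data source so the next chunk is read from `offset`.
bool apply_seek(std::istream& data, std::streamoff offset) {
    data.clear();  // drop a previous EOF state so seeking works again
    data.seekg(offset, std::ios::beg);
    return static_cast<bool>(data);
}

// Reads up to `max_bytes` from the current position; an empty result means EOF.
std::string next_chunk(std::istream& data, std::size_t max_bytes) {
    std::string chunk(max_bytes, '\0');
    data.read(&chunk[0], static_cast<std::streamsize>(max_bytes));
    chunk.resize(static_cast<std::size_t>(data.gcount()));
    return chunk;
}
```

The server would keep calling `next_chunk` and writing the result to the data connection until a new `SEEK` arrives on the control connection, so the client never has to commit to a transfer size up front.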

C++ how to accept server-push data?

My situation: I would like to create a hobby project for improving my C++ involving real-time/latency programming.
I have decided I will write a small Java program which will send lots of random stock prices to a client, where the client will be written in C++ and accept all the prices.
I do not want the C++ client to have to poll/have a while loop which continuously checks for data even if there is none.
What options do I have for this? If it's easier to accomplish having a C++ server then that is not a problem.
I presume for starters I will have to use the boost ASIO package for networking?
I will be doing this on windows 7.
Why not just have the Java server accept connections and then wait for some duration of time, e.g. 10 seconds? If data becomes available within that time, send it and close the connection.
Then the C++ client can have a thread which opens a connection whenever the previous one has completed.
That should give quite low latency without creating connections very often when there is no new data.
This is basically the Comet web programming model, which is used for many applications.
Think about how a web server receives data. When a URL is accessed, the data is pushed to the server. The server need not poll the client (or indeed know anything about the client other than that it's a service pushing bytes towards it).
You could use a Java servlet to accept the data over HTTP and write the code in this fashion. Similarly, boost::asio has a server example that should get you started. Under the hood, you could enable persistent HTTP so that the connections aren't opened / closed frequently. This'll make the coding model much simpler.
"I do not want the C++ client to have to poll/have a while loop which continuously checks for data"
Someone HAS to.
Need not be you. I've never used boost ASIO, but it might provide a callback registration. If yes, then just register a callback function of yours with boost, boost would do the waiting and give you a call back when it gets some data.
Other option is of course that you use some functions which are synchronous. Like (not a real function) Socket.read() which blocks the thread until there is data in the socket or it's closed. But in this case you're dedicating a thread of your own.
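To illustrate the blocking style, here is a small sketch in which the waiting is done by the library rather than by a hand-written polling loop. It uses an in-process queue instead of a socket, so it only shows the waiting pattern, not the networking. `PriceQueue` and its members are names invented for this example.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Thread-safe queue: the consumer sleeps inside the wait until a producer pushes.
class PriceQueue {
public:
    // Called by the receiving side (e.g. a network thread) when a price arrives.
    void push(std::string tick) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ticks_.push_back(std::move(tick));
        }
        ready_.notify_one();
    }

    // Blocks for at most `timeout`; returns false if nothing arrived in time.
    bool pop(std::string& out, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!ready_.wait_for(lock, timeout, [this] { return !ticks_.empty(); })) {
            return false;
        }
        out = std::move(ticks_.front());
        ticks_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::string> ticks_;
};
```

The thread calling `pop` uses no CPU while it waits, which is the same trade-off as a blocking socket read: no busy loop, but one thread is dedicated to waiting.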
--edit--
About the communication itself. Just pick any IPC mechanism (sockets/pipes/files/...), someone already described one I think. Once you send the data, the data itself is "encoded" and "decoded" by you, so you can create your own protocol. E.g. "%%<STOCK_NAME>=<STOCK_PRICE>##", where "%%", "=" and "##" are markers for the start, middle and end that you add on the sender side and remove on the receiver side to get the stock name and price.
You can develop the protocol further based on your needs. Like you can also send buy/sell recommendation or, text alert msgs with major stock exchange news. As long as your client and server understand how the data is "encoded" you're good.
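A rough sketch of that framing, assuming stock names and prices never contain the marker characters themselves. The function names `encode_quote` and `decode_quotes` are invented for this example. The decoder keeps an incomplete frame in the buffer, since a stream socket may deliver a message in pieces.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Wraps one quote as "%%<NAME>=<PRICE>##".
std::string encode_quote(const std::string& name, const std::string& price) {
    return "%%" + name + "=" + price + "##";
}

// Extracts every complete frame from `buffer` and erases the consumed bytes,
// so a partial frame at the end is kept until more data arrives.
std::vector<std::pair<std::string, std::string>> decode_quotes(std::string& buffer) {
    std::vector<std::pair<std::string, std::string>> quotes;
    std::size_t consumed = 0;
    for (;;) {
        const std::size_t start = buffer.find("%%", consumed);
        if (start == std::string::npos) break;
        const std::size_t end = buffer.find("##", start + 2);
        if (end == std::string::npos) break;  // frame incomplete so far
        const std::string body = buffer.substr(start + 2, end - start - 2);
        const std::size_t eq = body.find('=');
        if (eq != std::string::npos) {  // frames without '=' are skipped
            quotes.emplace_back(body.substr(0, eq), body.substr(eq + 1));
        }
        consumed = end + 2;
    }
    buffer.erase(0, consumed);
    return quotes;
}
```

The receiver appends whatever it reads from the socket to `buffer` and calls `decode_quotes` after each read.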
Finally, if you want to secure the communication (and say you're not using some secure layer such as SSL), then you can encrypt the data. But that's a different chapter. :)
HTH

Options for inter-service one-way communication

I'm searching for different options for implementing communication between a service and other services/applications.
What I would like to do:
I have a service that is constantly running, polling a device connected to a serial port. At certain points, this service should send a message to interested clients containing data retrieved from the device. Data is uncomplicated, most likely just a single string.
Ideally, the clients would not have to subscribe to receive these messages, which leads me to some sort of event 'broadcast' setup (similar to Windows events). The message sending process should not block, and does not need a response from any clients (or that there even are any clients for that matter).
I've been reading about IPC (COM in particular) and windows events, but am yet to come across something that really fits with what I want to do.
So is this possible? If so, what technologies should I be using? If not, what are some viable communication alternatives?
Here's the particulars of the setup:
Windows 2000/XP environments
'Server' service is a windows service, using VC++2005
Clients would vary, but always be in the windows environment (usual clients would be VC++6 windows services, VB6 applications)
Any help would be appreciated!
Windows supports broadcasting messages, check here. You can SendMessage to HWND_BROADCAST from the service, and receive it in each client.
There are a number of ways to do a broadcast system, but you'll have to either give up reliability (i.e., accept that some messages may be lost) or use a proper subscription system.
If you're willing to give up reliability, you can create a shared memory segment and a named manual-reset event object. When a new message arrives, write it to the shared memory segment, signal the event object, then close the event object and create a new one with a different name (the name should be in the shmem segment somewhere). Clients open the shmem segment, find the current event object, wait for it to be signaled, then read off the message and the name of the new event object.
With this option, you must take care to handle correctly the case of a client reading while the shmem segment is being updated. One way to do this is to have two sequence number fields in the shmem segment: one is updated before the new message is written, one after. Clients read the second sequence number prior to reading the message, then re-read both sequence numbers after, and check that they are all equal (and discard the message and retry after a delay if they are not). Be sure to place memory barriers around accesses to these sequence numbers to ensure that neither the compiler nor the CPU reorders them!
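The two-sequence-number check can be sketched as follows. This is a portable illustration using `std::atomic` in ordinary process memory as a stand-in for the shared memory segment (the Win32 file-mapping and event calls are left out), it assumes a single writer, and all names (`MessageSlot`, `publish`, `try_read`) are made up for the example.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// In-process stand-in for the shared memory segment (one writer, many readers).
struct MessageSlot {
    static constexpr std::size_t kCapacity = 64;
    std::atomic<std::uint32_t> seq_begin{0};  // updated before a message is written
    std::atomic<std::uint32_t> seq_end{0};    // updated after it is fully written
    std::atomic<std::uint32_t> length{0};
    std::array<std::atomic<char>, kCapacity> bytes{};
};

// Writer side: bump seq_begin, write the payload, then bump seq_end.
bool publish(MessageSlot& slot, const std::string& msg) {
    if (msg.size() > MessageSlot::kCapacity) return false;
    const std::uint32_t next = slot.seq_begin.load(std::memory_order_relaxed) + 1;
    slot.seq_begin.store(next, std::memory_order_relaxed);
    // Barrier: a reader that sees any new payload byte also sees the new seq_begin.
    std::atomic_thread_fence(std::memory_order_release);
    slot.length.store(static_cast<std::uint32_t>(msg.size()), std::memory_order_relaxed);
    for (std::size_t i = 0; i < msg.size(); ++i) {
        slot.bytes[i].store(msg[i], std::memory_order_relaxed);
    }
    slot.seq_end.store(next, std::memory_order_release);
    return true;
}

// Reader side: returns false if nothing was published yet or a write overlapped
// the read; the caller should then retry after a short delay.
bool try_read(const MessageSlot& slot, std::string& out) {
    const std::uint32_t end_before = slot.seq_end.load(std::memory_order_acquire);
    if (end_before == 0) return false;  // 0 means "no message yet" in this sketch
    const std::uint32_t n = slot.length.load(std::memory_order_relaxed);
    if (n > MessageSlot::kCapacity) return false;
    std::string copy(n, '\0');
    for (std::size_t i = 0; i < n; ++i) {
        copy[i] = slot.bytes[i].load(std::memory_order_relaxed);
    }
    std::atomic_thread_fence(std::memory_order_acquire);
    const std::uint32_t begin_after = slot.seq_begin.load(std::memory_order_relaxed);
    const std::uint32_t end_after = slot.seq_end.load(std::memory_order_relaxed);
    if (begin_after != end_before || end_after != end_before) return false;
    out = std::move(copy);
    return true;
}
```

Counter wrap-around and multiple writers are not handled here; a real implementation would also need to place the structure in the mapped segment rather than in process-local memory.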
Of course, this is all a bit hairy. Named pipes are a lot simpler, but a subscription (of a sort) is required. The server calls CreateNamedPipe, then accepts connections with ConnectNamedPipe. Clients use CreateFile to connect to the server's pipe. The server then just loops to send data (using WriteFile) to all of its clients. Note that you will need to create an additional instance of the pipe using CreateNamedPipe each time you accept a connection. An example of a named pipe server can be found here: http://msdn.microsoft.com/en-us/library/aa365588(v=vs.85).aspx

Recommendations on multiple types of games server

I've already developed some online games (like chess, checkers, risk clone) using server side programming (PHP and C++) and Flash (for the GUI). Now, I'd like to develop some kind of game portal (like www.mytopia.com). In order to do so, I must decide what is a good way to structure my server logic.
At first I thought of programming separate game servers for each game. That way, each game would be an isolated program that opens a specific port to the client. I also thought of creating different servers for each game room (each game room allows 100 clients connected at the same time). Of course I'd use a database to link everything (highscores, etc.).
On reflection, I don't think that is the best way to structure a game portal server. I'm reading about thread programming and I think that is the best way to do it. So I thought of having a connection thread that listens only for new client connections (that way every type of game client connects on a single port), validates the client (login) and then transfers the client to the specific game thread (chess thread, checkers thread, etc.). I'll be using select (or variants) to handle the asynchronous clients (I guess "one thread per client" is not suitable here). This structure seems to be the best, but how do I communicate between threads? I've read about race conditions and global scope variables, so one solution is to have a global clients array (vector or map) that must be locked by the connection thread or game thread every time it is changed (new connection, logout, state change, etc.). Is that right?
Has anyone worked on anything like this? Any recommendations?
Thanks very much
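For reference, the locked client table described in the question could look roughly like this. It is only a sketch of the idea as stated, with invented names (`ClientRegistry`, `ClientState`). Every access takes the same mutex, so the connection thread and the game threads never touch the map concurrently.

```cpp
#include <cstddef>
#include <mutex>
#include <unordered_map>

// Hypothetical per-client state; a real portal would track more than this.
enum class ClientState { LoggedIn, InChess, InCheckers };

class ClientRegistry {
public:
    // Called on login or whenever a client moves to another game thread.
    void upsert(int client_id, ClientState state) {
        std::lock_guard<std::mutex> lock(mutex_);
        clients_[client_id] = state;
    }

    // Called on logout or disconnect; returns false if the client was unknown.
    bool remove(int client_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        return clients_.erase(client_id) > 0;
    }

    // Copies the state out under the lock instead of handing out a reference.
    bool find(int client_id, ClientState& state) const {
        std::lock_guard<std::mutex> lock(mutex_);
        const auto it = clients_.find(client_id);
        if (it == clients_.end()) return false;
        state = it->second;
        return true;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return clients_.size();
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<int, ClientState> clients_;
};
```

Keeping the mutex private and returning copies means callers cannot forget to lock, at the cost of one lock acquisition per lookup.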
A portal needs to be robust, scalable and extensible so that you can cope with larger audiences, more games/servers being added, etc. A good place to start is to look into the way MMOs and distributed systems are designed. This might help too: http://onlinegametechniques.blogspot.com/
Personally, I'd centralise the users by having an authentication server, then a separate game server for each game that validates users against the authentication server.
If you use threads you might have an easier time sharing data but you'll have to be more careful about security for exactly the same reason. That of course doesn't address MT issues in general.
TBH I've been doing a VoIP system where the server can send out many streams and the client can listen to many streams. The best architecture I've come up with so far is just to bind to a single port and use sendto and recvfrom to handle communications. If I receive a valid connect packet from a client on a new address then I add the client to an internal list and begin sending audio data to them. The packet receive and response management (RRM) all happens in one thread. The audio, as it becomes ready, then gets sent to all the clients from the audio thread. The clients respond saying they received the audio and that gets handled on the RRM thread. If the client fails to respond for longer than 30 seconds then I send a disconnect and remove the client from my internal list. I don't need to be particularly fault tolerant.
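The internal client list with its 30-second timeout might be sketched like this. The socket calls are left out, the class and method names are invented for the example, and the current time is passed in explicitly so the logic can be exercised without waiting.

```cpp
#include <chrono>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Clock = std::chrono::steady_clock;

class ClientTable {
public:
    explicit ClientTable(Clock::duration timeout) : timeout_(timeout) {}

    // Called for a valid connect packet or an acknowledgement from `address`.
    void touch(const std::string& address, Clock::time_point now) {
        last_seen_[address] = now;
    }

    // Removes clients that have been silent for longer than the timeout and
    // returns their addresses so the caller can send each a disconnect packet.
    std::vector<std::string> expire(Clock::time_point now) {
        std::vector<std::string> dropped;
        for (auto it = last_seen_.begin(); it != last_seen_.end();) {
            if (now - it->second > timeout_) {
                dropped.push_back(it->first);
                it = last_seen_.erase(it);
            } else {
                ++it;
            }
        }
        return dropped;
    }

    std::size_t size() const { return last_seen_.size(); }

private:
    Clock::duration timeout_;
    std::map<std::string, Clock::time_point> last_seen_;
};
```

In the design above only the RRM thread would call `touch` and `expire`; if the audio thread also reads the table, access would need a lock.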
As for how to do this in a games situation, my main thought was to send a set of impulse vectors (the current one and 'n' previous ones). This way, if the client moves out of sync, it can check how far out of sync it is by checking the last few impulses it should have received for a given object. If they don't correspond to what it's got, then it can either correct itself or, if it is too far out of sync, ask for a game state reset. The idea is to try to avoid doing a full game state reset, as it is going to be quite an expensive thing to do.
Obviously each packet would be hashed so the client can check the validity of incoming packets but it also allows for the client to ignore an invalid packet and still get the info it needs in the next update and thus helping prevent the state reset.
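As a small illustration of the packet check, the sketch below appends a 32-bit FNV-1a hash to each payload and verifies it on receipt. FNV-1a is chosen here only because it is short to write down; the answer does not say which hash was meant, and FNV-1a detects accidental corruption but offers no protection against deliberate tampering. The function names are invented.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// 32-bit FNV-1a hash over the payload bytes.
std::uint32_t fnv1a(const std::string& data) {
    std::uint32_t hash = 2166136261u;  // FNV offset basis
    for (unsigned char byte : data) {
        hash ^= byte;
        hash *= 16777619u;  // FNV prime
    }
    return hash;
}

// Appends the hash as four big-endian bytes.
std::string seal_packet(const std::string& payload) {
    const std::uint32_t hash = fnv1a(payload);
    std::string packet = payload;
    for (int shift = 24; shift >= 0; shift -= 8) {
        packet.push_back(static_cast<char>((hash >> shift) & 0xFFu));
    }
    return packet;
}

// Returns false for a truncated or corrupted packet, which the client can
// simply ignore and wait for the next update.
bool open_packet(const std::string& packet, std::string& payload) {
    if (packet.size() < 4) return false;
    const std::string body = packet.substr(0, packet.size() - 4);
    std::uint32_t received = 0;
    for (std::size_t i = packet.size() - 4; i < packet.size(); ++i) {
        received = (received << 8) | static_cast<unsigned char>(packet[i]);
    }
    if (received != fnv1a(body)) return false;
    payload = body;
    return true;
}
```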
On top of that, it's worth doing things like keeping an eye on where the client is. There is no point in sending updates to a client when the client is looking in the other direction or there is something in the way (i.e. the client can't see the object it's being told about). This also limits the effectiveness of a wallhack that sniffs the incoming packets. Obviously you have to start sending things a tad before the object becomes visible, however, or you will get things popping into existence at inconvenient moments.
Anyway ... that's just some random thoughts. I have to add that I've never actually written a multiplayer engine for a game, so I hope my musings help ya a bit :)

Middleware with generic communication media layer

Greetings all,
I'm trying to implement middleware (a driver) for an embedded device with a generic communication media layer. I'm not sure what the best way to do it is, so I'm seeking advice from more experienced stackoverflow users :). Basically we've got devices around the country communicating with our servers (or with a PDA/laptop used in the field). The usual form of communication is over TCP/IP, but it could also be USB, RF dongle, IR, etc. The plan is to have an object corresponding to each of these devices, handling the proprietary protocol on one side and requests/responses from other internal systems on the other.
The thing is how to create something generic in between the media and the handling objects. I had a play around with the TCP dispatcher using boost.asio but trying to create something generic seems like a nightmare :). Has anybody tried to do something like that? What is the best way to do it?
Example: A device connects to our Linux server. A new middleware instance is created (on the server) which announces itself to one of the running services (details are not important). The service is responsible for making sure that the device's time is synchronized. So it asks the middleware what the device's time is, the driver translates the request into the device's language (protocol) and sends the message, the device responds and the driver again translates the reply for the service. This might seem like overkill for such a simple request, but imagine there are more complex requests which the driver must translate, and there are also several versions of the device which use different protocols but would use the same time sync service. The goal is to abstract the devices through the middleware so that the same service can communicate with all of them.
Another example: we find out that remote communications with the device are down. So we send somebody out with a PDA, and he connects to the device using a USB cable. He starts up the application, which has the same functionality as the timesync service. Again a middleware instance is created (on the PDA) to translate communication between the application and the device, this time using only USB/serial media, not TCP/IP as in the previous example.
I hope it makes more sense now :)
Cheers,
Tom
"The thing is how to create something generic in between the media and the handling objects. I had a play around with the TCP dispatcher using boost.asio but trying to create something generic seems like a nightmare :). Has anybody tried to do something like that? What is the best way to do it?"
I haven't used Boost, but the way I usually handled that kind of problem was to create a Device base class which the server interacts with, and then subclassed it for each device type, and made the subclasses deal with the device oddness. That way, the Device class becomes a definition of your protocol. Also, the Device class would need to be portable, but the subclasses would not.
If you had to get fancier than that, you could use the Factory pattern to create the actual subclassed objects.
As far as actually communicating, I'd see if I could just run one process per Device. If you have to have more than one Device per process, on Linux I'd just use select() and its friends to manage I/O between the various Device instances. I don't know how to do that on Windows; its select only works for sockets, not serial ports or other file-like things.
Other things that come to mind that might be useful include dbus and the MPI (Message Passing Interface) library, though they aren't complete solutions for your problem (dbus doesn't do inter-computer communications, IIRC).
Does this help at all?
EDIT: Needed a formatted response to Tom's reply...
Does your device class contain the communication specific parts? Because that's the thing I wanted to avoid.
The subclasses contain the communication specific parts. That's the whole point of using subclasses here; the generic stuff goes in the base class, and the specifics go in the subclass.
I was thinking about something like this: Say there is a dispatcher specific for media used which creates Connection object for each connection (media specific), Device obj. would be created as well but just a generic one and the Connection would pass the incoming data to Device and the Device would pass the responses back to Connection.
I think that may be a bit complex, and you're expecting a generic Device to deal with a specific Connection, which can get hard to maintain fast.
What I'd recommend is a Device subclass specifically for handling that type of Connection which takes the Connection from the dispatcher and owns it until the connection closes. Then your manager can talk to the generic Device and the Connection can mess with the specifics.
An example: Say you have a temperature sensor USB thingamajig. You have some dispatcher that catches the "USB thing plugged in" signal. When it sees the USB thing plugged in:
1. Dispatcher creates a USBTemperatureThingConnection.
2. Dispatcher creates a USBTemperatureDevice, which is a subclass of Device, giving the USBTemperatureThingConnection to the USBTemperatureDevice as a constructor parameter.
3. USBTemperatureDevice::USBTemperatureDevice(USBTemperatureThingConnection* conn) goes and sets up whatever it needs locally to finish setting up the Connection, then sends a message to the Device Manager saying it has set itself up.
4. Some time later, the Device Manager wants to set the time on all devices. So it iterates through its list of devices and calls the generic (maybe even abstract) Device::SetTime(const struct timespec_t&) method on each of them.
5. When it gets to your temperature device, it calls USBTemperatureDevice::SetTime(const struct timespec_t&), since USBTemperatureDevice overrode the one in Device (which was either abstract, i.e. virtual void SetTime(const struct timespec_t&) = 0; or a no-op, i.e. virtual void SetTime(const struct timespec_t&) {}, so you don't have to override it for devices that can't set time). USBTemperatureDevice::SetTime(const struct timespec_t&) does whatever USB Temperature sensor-specific things are needed, using the USBTemperatureThingConnection, to get the time set.
6. Some time later, the device might send back a "Time Set Result" message, saying if it worked or not. That comes in on the USBTemperatureThingConnection, which wakes up your thread and you need to deal with it. So your USBTemperatureDevice::DealWithMessageFromSensor() method (which only exists in USBTemperatureDevice) dives into the message contents and figures out if the time setting worked or not. It then takes that result, turns it into a value defined in enum Device::ResultCode and calls Device::TimeSetComplete(ResultCode result), which records the result, sets a flag (bool Device::timeComplete) saying the result is in, and then hits a Semaphore or Condition to wake up the Device Manager and get it to check all the Devices, in case it was blocked waiting for all the devices to finish setting time before continuing.
I have no idea what that pattern is called. If pressed, I'd say "subclassing", or "object-oriented design" if I felt grumpy. The "middleware" is the Device class, the DeviceManager, and all their underlings. The application then just talks to the Device Manager, or at most to the generic Device interface of a specific device.
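A compressed sketch of the walkthrough above. Several things are simplified or assumed: `DeviceTime` stands in for the `timespec_t` used in the text, the connection is a fake that merely records what was sent, the wire strings `SETTIME` and `TIME OK` are invented, the device owns its connection through a `std::unique_ptr` rather than a raw pointer, and the semaphore/condition wake-up of the Device Manager is left out.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the timespec_t used in the text.
struct DeviceTime { long long seconds; };

// Generic interface: the only thing the Device Manager knows about.
class Device {
public:
    enum class ResultCode { Pending, Ok, Failed };
    virtual ~Device() {}
    virtual void SetTime(const DeviceTime& when) = 0;
    void TimeSetComplete(ResultCode result) { result_ = result; }
    ResultCode time_result() const { return result_; }

private:
    ResultCode result_ = ResultCode::Pending;
};

// Hypothetical media-specific connection; a real one would wrap a USB handle.
class USBTemperatureThingConnection {
public:
    void Send(const std::string& frame) { sent.push_back(frame); }
    std::vector<std::string> sent;  // frames "written" to the device
};

// Device-specific subclass: owns its connection and speaks the device protocol.
class USBTemperatureDevice : public Device {
public:
    explicit USBTemperatureDevice(std::unique_ptr<USBTemperatureThingConnection> conn)
        : conn_(std::move(conn)) {}

    void SetTime(const DeviceTime& when) override {
        conn_->Send("SETTIME " + std::to_string(when.seconds));
    }

    // Translates the device's reply into the generic result code.
    void DealWithMessageFromSensor(const std::string& message) {
        TimeSetComplete(message == "TIME OK" ? ResultCode::Ok : ResultCode::Failed);
    }

    const USBTemperatureThingConnection& connection() const { return *conn_; }

private:
    std::unique_ptr<USBTemperatureThingConnection> conn_;
};

// Talks to every device through the generic interface only.
class DeviceManager {
public:
    void Add(std::shared_ptr<Device> device) { devices_.push_back(std::move(device)); }
    void SetTimeOnAll(const DeviceTime& when) {
        for (auto& device : devices_) device->SetTime(when);
    }

private:
    std::vector<std::shared_ptr<Device>> devices_;
};
```

A TCP-connected device would be another `Device` subclass with its own connection type; `DeviceManager` would not change.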
Btw. Factory pattern was planned, each object would run in separate thread :)
Good to hear.
I'm assuming by TCP/IP you mean remote nodes, and by USB, etc. the local devices connected to the same physical box. I think what I'm missing in your explanation is the part that announces the new local devices to the server process (i.e. the analog of a listening socket). Again, assuming something along the lines of Linux uevent, I would start with the following structure:
Controller: knows correct time, manages event sources, reacts to events
Event source: produces "new/delete device" events, knows its device class
  - server socket
  - local device monitor
  - etc.
Device class: encapsulates device-specific logic and manages/enumerates devices
  - remote device type (connected socket)
  - USB device type
  - USB device version X.Y.Z type
  - etc.
The high-level protocol is very simple: on receipt of a "new device" event, query the "Device class" for the time from the given device, then update the time on the device. The "Device class" is the driver/translator/bridge that implements the conversion from the query/update interface to device-specific commands (network exchange for remote nodes). It also holds a list of its devices.
This should easily map to a class diagram. Was there something else that I missed?