Lets assume that I have an input DataStream and want to implement some functionality that requires "memory" so I need ProcessFunction that gives me access to state. Is it possible to do it straight to the DataStream or the only way is to keyBy the initial stream and work in keyed-context?
I'm thinking that one solution would be to keyBy the stream with a hardcoded unique key so the whole input stream ends up in the same group. Then technically I have a KeyedStream and I can normally use keyed state, like I'm showing below with keyBy(x->1). But is this a good solution?
DataStream<Integer> inputStream = env.fromSource(...)
DataStream<Integer> outputStream = inputStream
.keyBy(x -> 1)
.process(...) //I've got acess to state heree
As I understand that's not a common usecase because the main purpose of flink is to partition the stream, process them seperately and then merge the results. In my scenario thats exactly what I'm doing, but the problem is that the merge step requires state to produce the final "global" result. What I actually want to do is something like this:
DataStream<Integer> inputStream = env.fromElements(1,2,3,4,5,6,7,8,9)
//two groups: group1=[1,2,3,4] & group2=[5,6,7,8,9]
DataStream<Integer> partialResult = inputStream
.keyBy(val -> val/5)
.process(<..stateful processing..>)
//Can't do statefull processing here because partialResult is not a KeyedStream
DataStream<Integer> outputStream = partialResult
.process(<..statefull processing..>)
outputStream.print();
But Flink doesnt seem to allow me do the final "merge partial results operation" because I can't get access to state in process function as partialResult is not a KeyedStream.
I'm beginner to flink so I hope what I'm writing makes sense.
In general I can say that I haven't found a good way to do the "merging" step, especially when it comes to complex logic.
Hope someone can give me some info, tips or correct me if I'm missing something
Thank you for your time
Is "keyBy the stream with a hardcoded unique key" a good idea? Well, normally no, since it forces all data to flow through a single sub-task, so you get no benefit from the full parallelism in your Flink cluster.
If you want to get a global result (e.g. the "best" 3 results, from any results generated in the preceding step) then yes, you'll have to run all records through a single sub-task. So you could have a fixed key value, and use a global window. But note (as the docs state) you need to come up with some kind of "trigger condition", otherwise with a streaming workflow you never know when you really have the best N results, and thus you'd never emit any final result.
I need to build a message flow where I am supposed to trigger a message(which is combination of numbers and alphabets simply a Id no) to a URL through HTTP request node and get a flat file in response(EDI Data). I have coded and I want to test whether my flow is working.
For this I need to create a mock service and construct a flat file. My question is how I can construct the response. Any proper code to write flat files(EDI data) so that I can get that as response and make sure my flow is working.
This is how exactly my response should look like:
EDI_BA40 125566 INVOIC LS 0036566
.......... and so on
Please provide some solution so that I can receive this in my response or something that I can do to test my flow so that there would be no error when I move my code next environment(QAL/PRD).
I've been playing around with Akka Streams and get the idea of creating Flows and wiring them together using FlowGraphs.
I know this part of Akka is still under development so some things may not be finished and some other bits may change, but is it possible to create a FlowGraph that isn't "complete" - i.e. isn't attached to a Sink - and pass it around to different parts of my code to be extended by adding Flow's to it and finally completed by adding a Sink?
Basically, I'd like to be able to compose FlowGraphs but don't understand how... Especially if a FlowGraph has split a stream by using a Broadcast.
Thanks
The next week (December) will be documentation writing for us, so I hope this will help you to get into akka streams more easily! Having that said, here's a quick answer:
Basically you need a PartialFlowGraph instead of FlowGraph. In those we allow the usage of UndefinedSink and UndefinedSource which you can then"attach" afterwards. In your case, we also provide a simple helper builder to create graphs which have exactly one "missing" sink – those can be treated exactly as if it was a Source, see below:
// for akka-streams 1.0-M1
val source = Source() { implicit b ⇒
// prepare an undefined sink, which can be relpaced by a proper sink afterwards
val sink = UndefinedSink[Int]
// build your processing graph
Source(1 to 10) ~> sink
// return the undefined sink which you mean to "fill in" afterwards
sink
}
// use the partial graph (source) multiple times, each time with a different sink
source.runWith(Sink.ignore)
source.runWith(Sink.foreach(x ⇒ println(x)))
Hope this helps!
We have a set of Coldfusion applications that all extended various parts of an application base. I'll provide a bit of code and then explain the issues we are having and see if anyone can shed light on the best way to trouble shoot this:
In our "OnRequestStart" in the app.cfc we have the following line to initiate a user:
if(!structKeyExists(SESSION, 'user'))
SESSION.user = CreateObject("component","cfcs.ds_user");
Then in the ds_user.cfc we call it like so:
component extends="cas.cas_user" displayname="basic_user"{
The application and all its parts run just like they should. However, in a seeming random manner after a while, the application will crash and I have to restart ColdFusion Service to get it running again. The error I get is:
Could not find the ColdFusion component or interface cas.cas_user.
So, for whatever reason after a while, my application decides it cannot find the path to the parent component. The mapping for that cfc is in the application.cfc at the top as so:
THIS.mappings["/cas"] = "#ReplaceNoCase(currpath,ListToArray(THIS.name,'_')[1],'cas30')#assets\cfcs\";
I want to be sure to say this, the application works perfectly as designed for a random amount of time and then it cannot find the parent component and will not find it again until I restart the ColdFusion Service on the server.
I figure this is somehow a memory leak or something, but I have no idea where to start looking to troubleshoot the issue. We have 6 or so other applications that are extended in the same way and work fine and never crash, but this one does.
EDIT: To be more clear on the mappings. Our applications are located:
root.com/app1
root.com/app2
We created mappings to grab cfcs from app2 while in app1 using the method above. The method, while I believe sort of strange, does work in all of our applications.
EDIT: The correct mappings that display for a while are:
/cfcs - D:\www\app1\assets\cfcs
/templates - D:\www\app1\assets\templates
/cas - D:\www\app2\assets\cfcs
/common - D:\www\app3\assets\common_elements
However once the Application goes in "crashed mode", the dump reveals the mappings are as follows:
/cfcs - D:\www\app1\assets\cfcs
/templates - D:\www\app1\assets\templates
/cas - D:\www\app1\assets\cfcs
/common - D:\www\app1\assets\common_elements
And here is how those mappings are defined at the start of the Application.cfc:
currpath = GetDirectoryFromPath(GetCurrentTemplatePath());
THIS.mappings["/templates"] = "#currpath#assets\templates";
THIS.mappings["/cfcs"] = "#currpath#assets\cfcs";
THIS.mappings["/common"] = "#ReplaceNoCase(currpath,ListToArray(THIS.name,'_')[1],'gum')#assets\common_elements\";
THIS.mappings["/cas"] = "#ReplaceNoCase(currpath,ListToArray(THIS.name,'_')[1],'cas30')#assets\cfcs\";
THIS.name = digisign_CAAAFACBFDFFE or
name_var = (arrayLen(meta_array) >= 2) ? meta_array[arrayLen(meta_array) - 1] & '_' : 'root_';
THIS.name = name_var & right(reReplace(hash(getCurrentTemplatePath()), "[^a-zA-Z]","","all"), 64 - len(name_var));
Where could it be failing. It seems the replace statement isn't working and therefore the appname in the path is not being changed from app1 to app2 when setting the mappings. is it possible this is related to this error we are currently working through: http://forums.adobe.com/message/4657868#4657868 We have yet to apply the Update 4 patch on production. However this problem we believe was happening before CF10. And while we have this issue, it only cropped up recently. This application in question has been crashing like this for a long time.
EDIT:
1. I guess when I say "crash" I mean the application gets into a state, where it will not declare the mappings correctly until I restart Coldfusion. I assume the error in our code causes the crash.
2.This is usually where the issue occurs, when doing this check of the SESSION.user var. I believe it has happened as well, it decides it cannot find our datasource. This is rare.
3. At first I thought yes, but actually no, not that many. Throughout our apps we have several names for common mappings. cas common cfcs templates etc. However D:\www\cas is where the application domain.com/cas30 is located. However a legacy version of that app is located at domain.com/cas. The mapping /cas should go to D:\www\cas30\assets\cfcs and works.
4.We have a dev setup and this never happens. (I assumed it was a load issue which is why it doesn't happen on dev). However, our dev environment is structred as so:
D:\www\deva\app1
D:\www\deva\app2
D:\www\devb\app1
D:\www\devb\app2
What we do (which I think is stupid) is we have a file located not in the same dir as the current app. This file is called application_base.cfc. All of the application.cfcs in the other applications extend from this application_base.cfc. They are not extended from other Application.cfc files. (hope that makes sense) In application_base is a init, onrequeststart, and an onerror. I'll post the App.cfc below. Also some setting are read from XML files both in the application base (to determine environment stuff) and at the application level. However we thought that might be causing the issue so the previous developer removed the xml file at the application level.
6.Yes. I'll post the app.cfc and the appbase.cfc so you can view both.
By reinitialize you mean call onapplicationstart or something. Not that I know of.
A few applications we have do:
currpath = GetDirectoryFromPath(GetCurrentTemplatePath());
app_path = ListToArray(currpath,'\');
THIS.name = app_path[ArrayLen(app_path)];
This one does:
meta_array = ListToArray(GetMetaData(this).name,'.');
name_var = (arrayLen(meta_array) >= 2) ? meta_array[arrayLen(meta_array) - 1] & '_' : 'root_';
THIS.name = name_var & right(reReplace(hash(getCurrentTemplatePath()), "[^a-zA-Z]","","all"), 64 - len(name_var));
A few others do this as well. Not sure if it was two different developers or something, but that is the way it is.
Once the app fails, it fails until I restart coldfusion. The app requires login from the domain.com/app page, so (not saying it cant change from request to request) but the request location is always the same where it's failing.
God I wish it wasn't this complex. I recently pulled our current CMS off of alot of this crazy stuff, but we have 7 or 8 applications that are so intertwined with each other and designed to work in dev/prod environments with different paths, its sometimes hard to tell what I can remove and what I can't.
I thought I tried dumping the applicationname from our error handler, but I thought it didn't work unless passed in. I passed through the mappings so I could see them which is how I know digisign is not changing to cas30 like it should in "crash" mode.
I think all the dynamic mappings were so the original developer could just use the same app.cfc template without changing anything. He liked to do stuff like var a = (b) ? (a-c) ? a-f+b : (a+b) ? d : d; : a; h; crap with no comments so it sometimes hard to just read the damn code let alone debug it.
EDIT
I feel like this issue and stackoverflow.com/q/14300915/1229594 issue may be related. I've posted some more details here as well: forums.adobe.com/message/5022377#5022377
First things first: why are you initialising session-oriented stuff in onREQUESTStart()? If you inited that in onSessionStart(), you'd not need to check for its existence every request, which - whilst trivial - is unnecessary overhead, and is simply the wrong code in the wrong place.
Secondly... you quote your error, but don't say where it's happening. Is it happening in that line in onRequestStart()?
If so, do me a favour: put a try/catch around it, and within that write the value of this.mappings to a log file, as well as the value of currPath. How is the value of that variable being derived, btw?
That said, I think if you just put that session.user init code in the right place, it'll solve your problem.
NB: frame this problem as almost certainly not a memory leak (ie: ColdFusion's fault), but your code doing something you did not anticipate (so... err... your fault ;-). This will help focus better on finding the problem. I'm not having a go at you, but "where is my code wrong" is a better approach than "it's probably something else". And more likely to be correct ;-)
Oh... and what version of CF are you on?
Take a look at this and see if it's relevant to your problem.
https://github.com/Mach-II/Mach-II-Framework/wiki/Application-Specific-Mapping-Workaround
If not, then it could have something to do with application specific mappings of the same name, on the same CF server, with those applications having different application names.
Some questions:
Are you assuming the crash is being caused by the code error, or that the code error is occurring because of the crash?
Is the instantiation of the session user the only line of code that you see these path errors?
Do you have any physical directories in your app that have the same name as the mapping names?
Does this occur in any other environments (dev/test)? Is this a clustered environment?
Are there multiple Application.cfc files extending this same Application.cfc?
Is there any code that is directly calling Application.cfc methods?
Are there any bits of code that cause the application to reinitialize itself?
What is determining the meta_array that is being used to derive the application name?
A few observations:
It seems to me that the application name is getting changed or that some other application is overwriting with the same name. This doesn't seem far-fetched as there's an awful lot of dynamic naming going on here. Starting with the application name, it's dependent on the current template's physical location, which could be different from request to request, depending on how the app routes requests. If the current template varies, the application name will vary, and cause the other app-specific mappings to change, which would cause a cascading effect to all the other mappings that use the app name to determine the physical location of those mappings.
Which begs the question: Why is all this dynamic evaluation of the application name and mapping locations even necessary? Can it be simplified or hard-coded? Can you instead use a server mapping? If it doesn't have to be this complex, simplifying it to its barest essentials will help troubleshooting and may clear up the issue entirely.
Finally, can you verify that the application name during normal operation is the same application name being referenced when the errors are occurring?
If they are different, then something is causing the application to execute within a different context (see my initial questions above for clues). A sudden change in the application name would invalidate any existing sessions and force the session user instantiation code to re-run. And because the user component paths are based in part on the application name, the paths may no longer be correct.
But if the application names are the same between normal operation and crash mode, then most likely the currpath variable is being affected by some part of the application being executed in a different physical path than expected. Since currpath is directly used in determining the rest of the mappings, that could certainly explain why an unexpected path could cause the component to go missing.
Because there are so many variables going into deriving these names, you would be well served to log those variables during normal operation and during crash mode. You'll want to see
GetCurrentTemplatePath()
GetDirectoryFromPath(GetCurrentTemplatePath())
THIS.name
meta_array
THIS.mappings
I suspect you'll find something significantly different in these variables when operating normally and when the crash/errors are occurring, and that difference should lead you closer to the answer.
This is probably a very simple question but I just can't seem to figure it out.
I am writing a Javascript app to retrieve layer information from a WFS server using a GetCapabilities request using GeoExt. GetCapabilities returns information about the WFS server -- the server's name, who runs it, etc., in addition to information on the data layers it has on offer.
My basic code looks like this:
var store = new GeoExt.data.WFSCapabilitiesStore({ url: serverURL });
store.on('load', successFunction);
store.on('exception', failureFunction);
store.load();
This works as expected, and when the loading completes, successFunction is called.
successFunction looks like this:
successFunction = function(dataProxy, records, options) {
doSomeStuff();
}
dataProxy is a Ext.data.DataProxy object, records is a list of records, one for each layer on the WFS server, and options is empty.
And here is where I'm stuck: In this function, I can get access to all the layer information regarding data offered by the server. But I also want to extract the server information that is contained in the XML fetched during the store.load() (see below). But I can't figure out how to get it out of the dataProxy object, where I'm sure it must be squirreled away.
Any ideas?
The fields I want are contained in this snippet:
<ows:ServiceIdentification>
<ows:Title>G_WIS_testIvago</ows:Title>
<ows:Abstract/>
<ows:Keywords>
<ows:Keyword/>
</ows:Keywords>
<ows:ServiceType>WFS</ows:ServiceType>
<ows:ServiceTypeVersion>1.1.0</ows:ServiceTypeVersion>
<ows:Fees/>
<ows:AccessConstraints/>
Apparently,GeoExt currently discards the server information, undermining the entire premise of my question.
Here is a code snippet that can be used to tell GeoExt to grab it. I did not write this code, but have tested it, and found it works well for me:
https://github.com/opengeo/gxp/blob/master/src/script/plugins/WMSSource.js#L37