At one time I had a theory that instantiating objects on every request rather than having them reside in the Application scope was a huge memory hog. As my knowledge of ColdFusion has grown over the years, I don't think I really understood how CF deals with classes in the "black box" of the CF framework, so I'm going to ask this for community correction or confirmation.
I'm just going to throw out what I think is happening:
A CFC is compiled into a class, and each method within that CFC is compiled into its own class.
Those classes will reside in (PermGen) memory and can be written to disk based on CF administrator settings.
When a new object is created or template requested, the source code is hashed and compared to the hash stored with the compiled class.
If there is a match, it will use the compiled class in memory
If the compiled class doesn't exist, it will compile from source
If the compiled class exists, but the hash doesn't match, it will recompile.
As an aside, whenever you enable trusted cache, ColdFusion will no longer hash the source to check for differences and will continue to use the compiled class in memory.
Whenever you create a new object, you get a new pointer to the compiled class and its methods' classes and any runtime events occur in the pseudo-constructor. Edit: At this point, I'm referring to using createObject and having any "loose" code outside of functions run. When I say pointer, I mean the reference to memory allocated for the object's scopes (this, variables, function variables).
If you request an init, then the constructor runs. The memory consumed at this point is just your new reference and any variables set in the pseudo-constructor and constructor. You are not actually taking up memory for a copy of the entire class. Edit: For this step I'm referring to using the new operator or chaining your createObject().init() old school.
This eliminates a huge fallacy that I, personally, might have heard over the years that instantiating large objects in every request is a massive memory hog (due to having a copy of the class rather than just a reference). Please note that I am not in favor of this, the singleton pattern is amazing. I'm just trying to confirm what is going on under the hood to prevent chasing down red herrings in legacy code.
Edit: Thanks for the input everyone, this was a really helpful Q/A for me.
I've been developing CF for 14 years and I've never heard anyone claim that creating CFC instances on each request consumed memory due to class compilation. At the Java level, your CFML code is compiled directly to bytecode and stored as Java classes in memory and on disk. Java classes are not stored in the heap, but rather in the permanent generation, which is not (usually) a collected memory space. You can create as many instances of that CFC as you like and no more perm gen space will be used; however, heap space will be allocated to store the instance data for that CFC for the duration of its existence. Note, open source Railo does not use separate classes for methods.
Now, if you create a very large number of CFC instances (or any variable, for that matter), that will create a lot of cruft in your heap's young generations. As long as hard references are not held after the request finishes, those objects will be cleared from the heap when the next minor garbage collection runs. This isn't necessarily a bad thing, but heap sizes and GC pauses should always be taken into account when performance tuning an application.
Now, there are reasons to persist CFC instances, either as a singleton pattern or for the duration of a session, request, etc. One reason is the overhead of actual object creation. This often involves disk I/O to check last modified times. Object creation has become significantly faster since the old days, but is still pretty far behind native Java if you're going to be creating thousands of instances. The other main reason is for your objects to maintain state over the life of the application/session/request, such as a shopping cart stored in session while the user shops.
And for completeness, I'll attempt to address your points categorically:
For Adobe CF yes, for Railo, methods are inner classes
Yes.
Actually, I don't believe there is any hashing involved. It's all based on the datetime last modified on the source file.
Yes, but again, no hashing; it just skips the disk I/O to check the last modified datetime.
I don't think "pointer" is the right term as that implies the Java classes actually live in the heap. CF uses a custom URL classloader to load the class for the template and then an INSTANCE of that class is created and stored in the heap. I can understand how this may be confusing as CFML has no concept of "class". Everything is simply an instance or doesn't exist at all. I'm not sure what you mean by "runtime events occur[ing] in the pseudo-constructor".
To be clear, the JAVA constructor already ran the instant you created the CFC. The CF constructor may be optional, but it has zero bearing on the memory consumed by the CFC instance. Again, I think you're getting unnecessarily hung up on the pseudo-constructor as well. That's just loose code inside the component that runs when it is created and has no bearing on memory allocated in the heap. The Java class is never copied, it is just the template for the instance.
I've taken over some legacy C++ code (written in C++03) which is for an application that runs on an RTOS. While browsing the codebase, I came across a construct like this:
...
new UserDebug(); ///<User debug commands.
...
Where the allocation done using new isn't stored anywhere so I looked a bit deeper and found this
class UserDebug
{
public:
///Constructor
UserDebug()
{
new AdvancedDebug();
new CameraCommand();
new CameraSOG();
new DebugCommandTest();
new DebugCommand();
// 30 more new objects like this
};
virtual ~UserDebug(){};
};
I dug deeper into each of the class definitions and implementations mentioned and couldn't find any reference to delete anywhere.
This code was written by the principal software engineer (who has left our company).
Can anyone shed some light on why you would want to do something like this, and how it works?
Thanks
If you look into the constructors of those classes you’ll see that they have interesting side effects, either registering themselves with some manager class or storing themselves in static/global pointer variables á la singletons.
I don’t like that they’ve chosen to do things that way - it violates the Principle of Least Surprise - but it isn’t really a problem. The memory for the objects is probably (but not necessarily) leaked, but they’re probably meant to exist for the lifetime of the executable so no big deal.
(It’s also possible that they have custom operator news which do something even odder, like constructing into preallocated static/global storage, though that’s only somewhat relevant to the ‘why’.)
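To make the "registering themselves with some manager" idea concrete, here is a minimal sketch of what such a constructor side effect could look like. It is not the original codebase: DebugCommandBase, registry() and findCommand() are hypothetical names invented for illustration, and only CameraCommand and UserDebug are borrowed from the question. It sticks to C++03 features.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical base class: every command registers itself when constructed.
class DebugCommandBase {
public:
    explicit DebugCommandBase(const std::string& name) : name_(name) {
        // Guard against two commands claiming the same name.
        assert(registry().count(name_) == 0);
        // Side effect: the registry now holds the pointer that the
        // "new X();" statement appears to throw away.
        registry()[name_] = this;
    }
    virtual ~DebugCommandBase() {}
    virtual int run() = 0;

    // Function-local static, so the map exists before its first use.
    static std::map<std::string, DebugCommandBase*>& registry() {
        static std::map<std::string, DebugCommandBase*> commands;
        return commands;
    }

private:
    std::string name_;
};

class CameraCommand : public DebugCommandBase {
public:
    CameraCommand() : DebugCommandBase("camera") {}
    virtual int run() { return 1; }  // placeholder behaviour
};

class UserDebug {
public:
    UserDebug() {
        new CameraCommand();  // never deleted; lives as long as the program
    }
};

// Hypothetical lookup helper: returns 0 when no command has that name.
DebugCommandBase* findCommand(const std::string& name) {
    std::map<std::string, DebugCommandBase*>& all = DebugCommandBase::registry();
    std::map<std::string, DebugCommandBase*>::iterator it = all.find(name);
    if (it == all.end()) {
        return 0;
    }
    return it->second;
}
```

After a UserDebug has been constructed, findCommand("camera") returns the object that was allocated with the apparently discarded new, which is why no delete is needed for objects meant to live until the program exits.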
If these objects are created once, they might be expected to live for the lifetime of the application (similar to singletons) and thus should never be deleted.
Another way to capture the pointer is through an overloaded operator new, either global or class-specific. Check whether any overloads implement some sort of garbage collection.
I am developing a large, complex model (mostly simple math, primarily algebra, but lots of calculations). I did my initial bid assuming I'd have to do everything once, but now the scope has expanded such that I need to run the entire model multiple times (on the same set of underlying assumptions but with a different dataset).
Based on my initial needs, I created a bunch of classes, then created dynamic instances of those classes in my main function, passing them by reference to each function as I went. That way, at the end of the main function, I can do all the necessary reporting / output once all of the functions have run. My question is about how to now modify my main function to allow for multiple iterations. A few sample bits of code below, followed by my question(s):
// Sample class declaration (in main)
vector<PropLevelFinancials> PLF;
// Sample function call (functions in other header files)
ProcessPropFinancials(vector<PropLevelFinancials>& PLF);
// Reporting at the end of main (functions in other header files)
OutputSummaryReport(vector<PropLevelFinancials>& PLF, other objects);
// What I need to do next
// Clear/Delete PLF and other objects, iterate through model again
I am quite happy with the speed and result of my current program, so don't need a whole lot of input on that front (although always welcome suggestions).
How should I implement the ability to cycle through multiple datasets (I obviously know how to do the loop, my question is about memory management)? Speed is critical. I want to essentially delete the existing instance of the class object I have created (PLF), and then run everything again on a new instance of the object(s). Is this the type of situation where I should use "new" and "delete" to manage the iterations? Would that change my function calls as outlined above? If I wanted to avoid using new and delete (stay on the stack), what are my options?
No. Do not, ever, use new and delete without a highly exceptional cause.
std::vector<T> offers a clear() member you can use. I suggest you implement a similar one for your own classes if that is what you need. Or you can simply do PLF = std::vector<T>();, which would work a bit better for your own UDTs without modification, assuming that you wrote them according to the most basic C++ guidelines.
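As a rough sketch of the clear() approach, assuming a much simplified PropLevelFinancials: the struct fields, the dataset parameter, SummarizeNetIncome and RunAllDatasets below are all made up for illustration, and only the names PLF and ProcessPropFinancials come from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the asker's class.
struct PropLevelFinancials {
    double revenue;
    double expenses;
};

// Hypothetical processing step: fills PLF from one dataset.
void ProcessPropFinancials(std::vector<PropLevelFinancials>& PLF,
                           const std::vector<double>& dataset) {
    for (std::size_t i = 0; i < dataset.size(); ++i) {
        PropLevelFinancials p;
        p.revenue = dataset[i];
        p.expenses = dataset[i] * 0.5;  // placeholder calculation
        PLF.push_back(p);
    }
}

// Hypothetical reporting step: total revenue minus total expenses.
double SummarizeNetIncome(const std::vector<PropLevelFinancials>& PLF) {
    double net = 0.0;
    for (std::size_t i = 0; i < PLF.size(); ++i) {
        net += PLF[i].revenue - PLF[i].expenses;
    }
    return net;
}

// Runs the model once per dataset, reusing a single vector with no new/delete.
std::vector<double> RunAllDatasets(
        const std::vector<std::vector<double> >& datasets) {
    std::vector<double> summaries;
    std::vector<PropLevelFinancials> PLF;  // automatic storage, freed on return
    for (std::size_t d = 0; d < datasets.size(); ++d) {
        PLF.clear();  // destroys the old elements; the vector is empty again
        assert(PLF.empty());
        ProcessPropFinancials(PLF, datasets[d]);
        summaries.push_back(SummarizeNetIncome(PLF));
    }
    return summaries;
}
```

The existing function signatures that take the vector by reference do not need to change. In practice clear() usually leaves the allocated capacity in place, so later iterations tend to avoid reallocating, which should help when speed matters.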
Is it safe to store a CFC object in the REQUEST scope to be accessed later? Right now, our sites load up navigation data at least twice, possibly three times if they use our breadcrumbs feature. Some times, this data can vary, however, most of the time, three separate calls end up being made to grab the same exact navigation data...
So, I was thinking after the first load, save the navigation data in the REQUEST scope in some sort of struct, and in subsequent calls, just check to see if that data is already there, and if so, just use what is stored rather than re-creating it again. I know this would be accessing a shared scope outside of a contained object, which is probably not good practice, but in the end could shave off half of our page load times...
I know it can be done, however, we have had problems with the server recently, some of it possibly being memory leaks from how we use/store certain things, so was wondering if this was safe to do...
Either the variables or request scope would be suitable for your purpose; however, it would be more advisable to modify the functions that require access to this variable to accept your cached variable as an argument. With regard to CFCs, it could be passed in the init() method and stored for use by the methods within that CFC (assuming you initialise it).
By relying on a global variable (even one restricted to current request) you are potentially just causing difficulties for yourself down the line, which would be solved by ensuring the methods are more encapsulated.
As mentioned in my comments earlier, ColdFusion - When to use the "request" scope? is worth a quick read as it has relevant information in the answers.
Yes. The only request that has access to the REQUEST scope is the current request.
I'm attempting to implement a Save/Load feature into my small game. To accomplish this I have a central class that stores all the important variables of the game such as position, etc. I then save this class as binary data to a file. Then simply load it back for the loading function. This seems to work MOST of the time, but if I change certain things then try to do a save/load the program will crash with memory access violations. So, are classes guaranteed to have the same structure in memory on every run of the program or can the data be arranged at random like a struct?
Response to Jesus - I mean the data inside the class, so that if I save the class to disk, when I load it back, will everything fit nicely back.
Save
fout.write((char*) &game, sizeof(Game));
Load
fin.read((char*) &game, sizeof(Game));
Your approach is extremely fragile. With many restrictions, it can work. These restrictions are not worth subjecting your users (or yourself!) to in typical cases.
Some Restrictions:
Never refer to external memory (e.g. a pointer or reference)
Forbid ABI changes/differences. Common case: memory layout and natural alignment will vary between 32-bit and 64-bit builds. The user will need a new 'game' for each ABI.
Not endian compatible.
Altering your type's layouts will break your game. Changing your compiler options can do this.
You're basically limited to POD data.
Use offsets instead of pointers to refer to internal data (This reference would be in contiguous memory).
Therefore, you can safely use this approach in extremely limited situations -- that typically applies only to components of a system, rather than the entire state of the game.
Since this is tagged C++, "boost - Serialization" would be a good starting point. It's well tested and abstracts many of the complexities for you.
Even if this would work, just don't do it. Define a file format at the byte-level and write sensible 'convert to file format' and 'convert from file format' functions. You'll actually know the format of the file. You'll be able to extend it. Newer versions of the program will be able to read files from older versions. And you'll be able to update your platform, build tools, and classes without fear of causing your program to crash.
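For illustration, a byte-level format with explicit conversion functions might look roughly like the sketch below. Everything in it is an assumption rather than the asker's real design: the SaveData fields, the "SAV1" magic value, the version number, and the helper names putU32, getU32, toFileFormat and fromFileFormat. It fixes the byte order (little-endian) and field order by hand instead of relying on the compiler's struct layout.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <vector>

// Hypothetical save data; a real Game class will have other fields.
struct SaveData {
    uint32_t level;
    int32_t  playerX;
    int32_t  playerY;
};

typedef std::vector<unsigned char> Bytes;

static const uint32_t kMagic = 0x31564153u;   // bytes 'S','A','V','1' on disk
static const uint32_t kVersion = 1;
static const std::size_t kFileSize = 5 * 4;   // five 32-bit fields

// Appends v as four bytes, least significant byte first.
void putU32(Bytes& out, uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<unsigned char>((v >> shift) & 0xFFu));
    }
}

// Reads four little-endian bytes starting at pos and advances pos.
// The caller must guarantee that pos + 4 <= in.size().
uint32_t getU32(const Bytes& in, std::size_t& pos) {
    uint32_t v = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        v |= static_cast<uint32_t>(in[pos++]) << shift;
    }
    return v;
}

// "Convert to file format": magic, version, then each field in a fixed order.
Bytes toFileFormat(const SaveData& s) {
    Bytes out;
    putU32(out, kMagic);
    putU32(out, kVersion);
    putU32(out, s.level);
    putU32(out, static_cast<uint32_t>(s.playerX));
    putU32(out, static_cast<uint32_t>(s.playerY));
    assert(out.size() == kFileSize);
    return out;
}

// "Convert from file format": rejects wrong size, magic, or version.
bool fromFileFormat(const Bytes& in, SaveData& s) {
    if (in.size() != kFileSize) return false;
    std::size_t pos = 0;
    if (getU32(in, pos) != kMagic) return false;
    if (getU32(in, pos) != kVersion) return false;
    s.level = getU32(in, pos);
    // Converting back to signed assumes a two's complement platform.
    s.playerX = static_cast<int32_t>(getU32(in, pos));
    s.playerY = static_cast<int32_t>(getU32(in, pos));
    return true;
}
```

The resulting bytes can be written with the same fout.write call as before, passing &bytes[0] and bytes.size(). A later version of the program could bump the version number and branch on it while loading, so older save files stay readable.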
Yes, classes and structures will have the same layout in memory every time your program runs, although I can't say whether the standard enforces this. The machine code generated by C++ compilers uses "hard-coded" offsets to access type fields, so they are fixed. Realistically, the layout will only change if you modify the C++ class definition (field sizes, order, virtual methods, etc.), compile with a different compiler or change compiler options.
As long as the type is POD and without pointer fields, it should be safe to simply dump it to a file and read it back with the exact same program. However, because of the above-mentioned concerns, this approach is quite inflexible with regard to versioning and interoperability.
[edit]
To respond to your own edit, do not do this with your "Game" object! It certainly has pointers to other objects, and those objects will not exist anymore in memory or will be elsewhere when you'll reload your file.
You might want to take a look at this.
Classes are not guaranteed to have the same structure in memory as pointers can point to different locations in memory each time a class is created.
However, without posting code it is difficult to say with certainty where the problem is.
I know this is strange but I'm just having fun.
I am attempting to transmit a std::map (instantiated using placement new in a fixed region of memory) between two processes via a socket between two machines: Master and Slave. The map I'm using has this typedef:
// A vector of Page objects
typedef
std::vector<Page*,
PageTableAllocator<Page*> >
PageVectorType;
// A mapping of binary 'ip address' to a PageVector
typedef
std::map<uint32_t,
PageVectorType*,
std::less<uint32_t>,
PageTableAllocator<std::pair<uint32_t, PageVectorType*> > >
PageTableType;
The PageTableAllocator<T> class is responsible for allocating whatever memory the STL containers may want/need into a fixed location in memory. E.g., all Page objects and STL internal structures are being instantiated in this fixed memory region. This ensures that both the std::map object and the allocator are both placed in a fixed region of memory. I've used GDB to make sure the map and allocator behave correctly (all memory used is in the fixed region, nothing ever goes on the application's normal heap).
Assuming Master starts up and initializes all of its STL structures and the special memory region, the following happens. Slave starts, prints out its version of the page table, then looks for a Master. Slave finds a Master, deletes its version of the page table, copies Master's version of the page table (and the special memory region), and successfully prints out Master's version of the page table. From what prodding I've done in GDB, I can perform many read-only operations.
When trying to add to the newly copied PageTableType object, Slave faults in the allocator's void construct (pointer p, const T& value) method. The value passed in as p points to an already allocated area of memory (as per Master's version of the std::map).
I don't know anything about C++ object structure, but I'm guessing that object state from Slave's version of the PageTableType must be hanging around even after I replace all of the memory that the PageTableType and its allocator used. My question is whether this is a valid concern. Does C++ maintain some sort of object state outside of the area of memory that object was instantiated in?
All of the objects used in the map are non-POD. Same is true for the allocator.
To answer your specific question:
Does C++ maintain some sort of object state outside of the area of memory that object was instantiated in?
The answer is no. There are no other data structures set up to "track" objects or anything of the sort. C++ uses an explicit memory allocation model, so if you choose to be responsible for allocation and deallocation, then you have complete control.
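A small sketch of that point, using a hypothetical trivially copyable Counter struct (not one of the asker's types) and assuming a C++11 compiler for alignas and static_assert: the object is built with placement new in one raw buffer, its bytes are copied to a second buffer, and the copy carries the complete state because there is nothing else to carry.

```cpp
#include <cassert>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical example type. Byte-wise copying is only valid for
// trivially copyable types like this one.
struct Counter {
    int hits;
    int misses;
};

static_assert(std::is_trivially_copyable<Counter>::value,
              "memcpy is only valid for trivially copyable types");

// Builds a Counter in one raw region, copies its bytes into a second
// region, and returns hits + misses as read back from the copy.
int sumAfterByteCopy(int hits, int misses) {
    alignas(Counter) unsigned char regionA[sizeof(Counter)];
    alignas(Counter) unsigned char regionB[sizeof(Counter)];

    Counter* a = new (regionA) Counter();  // placement new into region A
    a->hits = hits;
    a->misses = misses;

    Counter* b = new (regionB) Counter();  // start a lifetime in region B
    std::memcpy(b, a, sizeof(Counter));    // copy the whole state, byte for byte

    // No bookkeeping exists outside these two buffers.
    assert(std::memcmp(regionA, regionB, sizeof(Counter)) == 0);
    return b->hits + b->misses;
}
```

This only demonstrates the "no hidden state" claim for a trivially copyable type. A std::map is different in one important respect: its nodes store raw pointers to other nodes, and those pointers are absolute addresses, so a byte copy of the region is only meaningful if every address stored inside it is still valid in the receiving process.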
I suspect there's something wrong in your code somewhere, but since you believe the code is correct you're inventing some other reason why your code might be failing, and following that path instead. I would pull back, and carefully examine everything about the way your code is working right now, and see if you can determine the problem. Although the STL classes are complex (especially std::map), they're ultimately just code and there is no hidden magic in there.