Too many queries when preparing test data on Salesforce - unit-testing

I encountered the error 'Too many SOQL queries: 101' when deploying my Apex code to production. After some investigation, I found it was caused by the queries in the triggers on the User and Account objects. When I prepare test data and insert it into the database, the triggers fire and run many queries. A sample of my test code looks like this (this is just a sample; in reality there are more objects I need to prepare). Besides that, none of the queries in my triggers are redundant, so I can't change the triggers to reduce the number of queries.
@isTest(SeeAllData=true)
static void myTestMethod() {
    User user = testFixture.prepareUser();
    insert user;  // 57 queries
    Account acc = testFixture.prepareAccount();
    insert acc;   // 50 queries
    Test.startTest();
    // my test code
    Test.stopTest();
}
So my question is: how can I avoid this error? I have already searched a lot, and I don't have any queries inside loops. I also tried using Test.startTest() to reset the query counter, but that still doesn't solve the problem. Is there a way to disable the triggers while preparing the data, or any other way to solve this?
I tried my best to express my problem; if it is still not clear, please tell me.
Any ideas are appreciated.
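Regarding disabling the triggers while preparing the data: one commonly used pattern is a static "bypass" flag that the triggers check before doing any work. This is only a sketch and assumes the triggers can be edited; the TriggerBypass class and the trigger shown here are hypothetical, not from the original code:

    public class TriggerBypass {
        public static Boolean disabled = false;
    }

    trigger AccountTrigger on Account (before insert, before update) {
        if (TriggerBypass.disabled) {
            return; // skip the trigger's SOQL queries while test data is being prepared
        }
        // ... normal trigger logic and its queries ...
    }

In the test method the flag is set to true around the fixture inserts and reset to false before Test.startTest(), so the code under test still runs with the triggers active.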

Related

Apollo pagination and cache-and-network fetch policy

The Apollo docs suggest implementing the merge function in a field policy when adding pagination logic:
    merge(existing = [], incoming) {
      return [...existing, ...incoming];
    }
However, when I use the 'cache-and-network' fetch policy for the query, it first loads data from the cache and then goes out to the network, appending the incoming data to the existing list. So if the incoming data is the same as what was already in the cache, every item ends up in the cache twice.
What is the correct way to solve this? Can I differentiate between an initial load and a fetchMore request in the merge function? The merge function should obviously behave differently for an initial fetch, which should overwrite what was loaded from the cache, and for a pagination fetchMore.
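If the list is offset-based, one way to make merge idempotent (a sketch in the spirit of Apollo's offsetLimitPagination helper; the feed field name is an assumption) is to write incoming items at their offset instead of blindly appending, so a network refetch of the first page simply overwrites the cached copy rather than duplicating it:

    feed: {
      keyArgs: false,
      merge(existing = [], incoming, { args }) {
        // args holds the variables of the request (initial load or fetchMore)
        const offset = args?.offset ?? 0;
        const merged = existing.slice(0);
        for (let i = 0; i < incoming.length; ++i) {
          merged[offset + i] = incoming[i];
        }
        return merged;
      },
    },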
In case anyone stumbles into this issue, the solution as of Apollo Client 3 is:
    fetchPolicy: 'cache-and-network',
    nextFetchPolicy: 'cache-first',
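For reference, a sketch of how those two options are passed to a query (the GET_FEED document and its variables are hypothetical):

    const { data, fetchMore } = useQuery(GET_FEED, {
      variables: { offset: 0, limit: 20 },
      fetchPolicy: 'cache-and-network', // first execution: read cache, then refresh from network
      nextFetchPolicy: 'cache-first',   // subsequent executions read from the cache only
    });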

Django: A good way of inserting or updating a large list of objects

It looks like I have a problem working with a large dataset. Let's say I have the following case:
1) A large CSV (200K+ lines) with products (each one with a unique SKU) is sent in a POST request.
2) For each item I need to check whether it is already in the database (I was using get_or_create).
3) If the object exists I update it; otherwise I create a new entity in the database.
The code looks like this:
    for csv_item in csv_products:
        # get_or_create returns a (object, created) tuple
        product, created = Products.objects.get_or_create(sku=csv_item.sku)
        product.price = csv_item.price  # here I update all fields from csv_item
        product.save()
The issue
This code puts an unbelievable load on the database. The number of queries is huge, and so is the execution time...
What I have tried:
1) Django's bulk_create method. Why it did not work well: I need not only to create but also to update, and checking whether each object exists as a first step breaks bulk_create...
2) Controlling transactions manually by sending commit/rollback. I'm not sure about this one... it tends to give me integrity errors.
Please advise.
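One approach worth considering (a sketch, not from the original post; the Products model and its sku/price fields are taken from the question, and bulk_update requires Django 2.2 or later) is to resolve the existing rows in a single query and then split the work into one bulk insert and one bulk update:

    # Load the products that already exist, keyed by SKU, in one query.
    incoming = {item.sku: item for item in csv_products}
    existing = {p.sku: p for p in Products.objects.filter(sku__in=incoming.keys())}

    to_create, to_update = [], []
    for sku, csv_item in incoming.items():
        if sku in existing:
            product = existing[sku]
            product.price = csv_item.price  # copy over the other fields as needed
            to_update.append(product)
        else:
            to_create.append(Products(sku=sku, price=csv_item.price))

    Products.objects.bulk_create(to_create, batch_size=1000)
    Products.objects.bulk_update(to_update, ['price'], batch_size=1000)

This turns 200K+ get_or_create/save round trips into one SELECT plus a handful of batched INSERT and UPDATE statements. On newer Django versions (4.1+), bulk_create with update_conflicts=True can collapse this further into a single upsert pass.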

Overcoming querying limitations in Couchbase

We recently made a shift from relational (MySQL) to NoSQL (Couchbase). Basically it is a back-end for a social mobile game. We were facing a lot of problems scaling our back-end to handle the increasing number of users. With MySQL, loading a user took a long time because of the many joins across multiple tables. We saw a huge improvement after moving to Couchbase, especially when loading data, since most of it is kept in a single document.
On the downside, Couchbase also seems to have a lot of limitations as far as querying is concerned. Couchbase's alternative to SQL queries is views. While we managed to handle most of our queries using map-reduce, we are really having a hard time figuring out how to handle time-based queries. For example, we need to filter users based on a timestamp attribute; we only want a user in the view if its time is less than the current time:
    if (user.time < new Date().getTime() / 1000)
What happens is that once a user's time is set to some future time, it gets excluded from this view, which is the desired behaviour, but it never gets added back to the view unless we update it - a document only gets re-indexed in a view when it is updated.
Our solution right now is to load the first x user documents and then check the time in our application. Sorting is done on the user.time attribute, so we get those users whose time is less than or near the current time. But I am not sure whether this will actually work in a live environment. Ideally we would like to avoid these kinds of checks at the application level.
There are also cases, e.g. matchmaking, where we need to check multiple time-based attributes. Our current strategy doesn't work there, and we frequently get documents from the view that fail these checks when they are done in the application. I would really appreciate it if someone who has already tackled similar problems could share their experience. Thanks in advance.
Update:
We tried using range queries, which work for only one key. As I said, in most cases we have multiple time-based keys, meaning multiple ranges, which does not work.
If you use new Date().getTime() inside a view function, you will always get the time at which the view was indexed - which is exactly why, as you said, "it never gets added back to view unless we update it".
There are two ways:
Bad way (don't do this in production): query the view with the stale=false parameter. That forces the view to update before it returns results, but view indexing is a slow process, especially if you have more than 1 million records.
Good way: use range requests. You just need to emit your date in the map function, as the key or as part of a complex key, and use a range request. You can see one example here or here (also, if you want to use DateTime in Couchbase, this example will be more useful). Or just look at my example below:
I.e. you will have docs like:
    doc = {
        "id": 1,
        "type": "doctype",
        "timestamp": 123456,   // document update or creation time
        "data": "lalala"
    }
For those docs the map function will look like:
    map = function(doc, meta) {
        if (doc.type === "doctype") {
            emit(doc.timestamp, null);
        }
    }
And now, to get recently "updated" docs, you need to query this view with the params:
    startKey="dateTimeNowFromApp"
    endKey="{}"
    descending=true
Note that startKey and endKey are swapped because I used descending order. Here is also a link to the documentation about the key types that Couchbase supports.
I have also found a link to a question that may help.

EclipseLink JPA: Can I run multiple queries from one builder?

I have a method that builds and runs a Criteria query. The query does what I want it to, specifically it filters (and sorts) records based on user input.
Also, the query size is restricted to the number of records on the screen. This is important because the data table can be potentially very large.
However, if filters are applied, I want to count the number of records that would be returned if the query were not limited. So this means running two queries: one to fetch the records, and one to count the records in the overall set. It looks like this:
    public List<Log> runQuery(TableQueryParameters tqp) {
        // get the builder, query, and root
        CriteriaBuilder builder = em.getCriteriaBuilder();
        CriteriaQuery<Log> query = builder.createQuery(Log.class);
        Root<Log> root = query.from(Log.class);
        // build the requested filters
        Predicate filter = null;
        for (TableQueryParameters.FilterTerm ft : tqp.getFilterTerms()) {
            // this section runs through the user input and constructs the
            // predicate
        }
        if (filter != null) query.where(filter);
        // attach the requested ordering
        List<Order> orders = new ArrayList<Order>();
        for (TableQueryParameters.SortTerm st : tqp.getActiveSortTerms()) {
            // this section constructs the Order objects
        }
        if (!orders.isEmpty()) query.orderBy(orders);
        // run the query
        TypedQuery<Log> typedQuery = em.createQuery(query);
        typedQuery.setFirstResult((int) tqp.getStartRecord());
        typedQuery.setMaxResults(tqp.getPageSize());
        List<Log> list = typedQuery.getResultList();
        // if we need the result size, fetch it now
        if (tqp.isNeedResultSize()) {
            CriteriaQuery<Long> countQuery = builder.createQuery(Long.class);
            countQuery.select(builder.count(countQuery.from(Log.class)));
            if (filter != null) countQuery.where(filter);
            tqp.setResultSize(em.createQuery(countQuery).getSingleResult().intValue());
        }
        return list;
    }
As a result, I call createQuery twice on the same CriteriaBuilder and I share the Predicate object (filter) between both of them. When I run the second query, I sometimes get the following message:
    Exception [EclipseLink-6089] (Eclipse Persistence Services - 2.2.0.v20110202-r8913):
    org.eclipse.persistence.exceptions.QueryException
    Exception Description: The expression has not been initialized correctly. Only a single
    ExpressionBuilder should be used for a query. For parallel expressions, the query class
    must be provided to the ExpressionBuilder constructor, and the query's ExpressionBuilder
    must always be on the left side of the expression.
    Expression: [ Base com.myqwip.database.Log] Query: ReportQuery(referenceClass=Log )
        at org.eclipse.persistence.exceptions.QueryException.noExpressionBuilderFound(QueryException.java:874)
        at org.eclipse.persistence.expressions.ExpressionBuilder.getDescriptor(ExpressionBuilder.java:195)
        at org.eclipse.persistence.internal.expressions.DataExpression.getMapping(DataExpression.java:214)
Can someone tell me why this error shows up intermittently, and what I should do to fix this?
Short answer to the question: yes, you can, but only sequentially.
In the method above, you start creating the first query, then start creating the second, then execute the second, then execute the first.
I had the exact same problem. I don't know why it's intermittent, though.
In other words, you start creating your first query, and before having finished it, you start creating and executing another.
Hibernate doesn't complain, but EclipseLink doesn't like it.
If you start with the count query, execute it, and then create and execute the other query (which is what you've done by splitting it into two methods), EclipseLink won't complain.
see https://issues.jboss.org/browse/SEAMSECURITY-91
It looks like this posting isn't going to draw much more response, so I will answer it with how I resolved the problem.
Ultimately I ended up breaking my runQuery() method into two methods: runQuery(), which fetches the records, and runQueryCount(), which fetches the count of records without the sort parameters. Each method has its own call to em.getCriteriaBuilder(). I have no idea what effect that has on the EntityManager, but the problem has not appeared since.
Also, the DAO object that has these methods used to be @ApplicationScoped. It now has no declared scope, so it is constructed on demand by the various @RequestScoped and @ConversationScoped beans that use it. I don't know whether this has any effect on the problem, but since it has not appeared since, I will use this as my code pattern from now on. Suggestions welcome.
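For illustration, a minimal sketch of what the separate count method can look like, with its own CriteriaBuilder and its own predicate construction (the filter-building loop is elided here, and the Log/TableQueryParameters names are taken from the question):

    public long runQueryCount(TableQueryParameters tqp) {
        // a fresh builder, so nothing is shared with the paged query
        CriteriaBuilder builder = em.getCriteriaBuilder();
        CriteriaQuery<Long> countQuery = builder.createQuery(Long.class);
        Root<Log> root = countQuery.from(Log.class);
        countQuery.select(builder.count(root));

        // rebuild the filter against this builder/root instead of reusing the
        // Predicate that was built for the paged query
        Predicate filter = null;
        for (TableQueryParameters.FilterTerm ft : tqp.getFilterTerms()) {
            // construct the predicate with 'builder' and 'root' here
        }
        if (filter != null) {
            countQuery.where(filter);
        }
        return em.createQuery(countQuery).getSingleResult();
    }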

How to delete all database data with NHibernate?

Is it possible to delete all data in the database using NHibernate? I want to do that before starting each unit test. Currently I drop my database and create it again, but this is not an acceptable solution for me.
==========================================================
OK, here are the results. I am testing this on a PostgreSQL database. I will test CreateSchema (1), DanP's solution (2) and apollodude217's solution (3). I run the tests 5 times with each method and take the average time.
Round 1 - 10 tests
(1) - ~26 sec
(2) - 9.0 sec
(3) - 9.3 sec
Round 2 - 100 tests
(1) - Come on, I will not do that on my machine
(2) - 12.6 sec
(3) - 18.6 sec
I don't think it is necessary to run more tests than that.
I'm using the SchemaExport class and recreate the schema before each test. This is almost like dropping the database, but it only drops and recreates the tables. I assume that deleting all data from each table is not faster than this; it could even be slower.
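As a rough sketch of that approach (assuming an already-built NHibernate Configuration instance named configuration; the exact SchemaExport overloads vary a little between NHibernate versions, so check the signature in yours):

    // SchemaExport lives in the NHibernate.Tool.hbm2ddl namespace.
    // Execute(useStdOut, execute, justDrop): don't echo the script, do run it, don't only drop.
    new SchemaExport(configuration).Execute(false, true, false);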
Our unit tests usually run against an in-memory SQLite database, which is very fast. This database exists only as long as the connection is open, so the whole database is recreated for each test. We switch to SQL Server by changing the build configuration.
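A sketch of what such a test configuration can look like (property names are from NHibernate.Cfg.Environment; note that the in-memory database vanishes as soon as the connection is closed, so the same connection must stay open for the whole test):

    // Configuration lives in NHibernate.Cfg.
    var cfg = new Configuration()
        .SetProperty(NHibernate.Cfg.Environment.Dialect, "NHibernate.Dialect.SQLiteDialect")
        .SetProperty(NHibernate.Cfg.Environment.ConnectionDriver, "NHibernate.Driver.SQLite20Driver")
        .SetProperty(NHibernate.Cfg.Environment.ConnectionString, "Data Source=:memory:;Version=3;New=True")
        .SetProperty(NHibernate.Cfg.Environment.ReleaseConnections, "on_close");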
Personally, I use a stored procedure to do this, but it may be possible with Executable HQL (see this post for more details: http://fabiomaulo.blogspot.com/2009/05/nh21-executable-hql.html )
Something along the lines of session.Delete("from object");
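A minimal sketch of the executable-HQL variant (the entity names are hypothetical, and the deletes have to be ordered so that child rows are removed before their parents to keep foreign keys happy):

    using (var tx = session.BeginTransaction())
    {
        // DML-style HQL: one DELETE statement per mapped class, without loading entities into memory.
        session.CreateQuery("delete from ChildEntity").ExecuteUpdate();
        session.CreateQuery("delete from ParentEntity").ExecuteUpdate();
        tx.Commit();
    }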
I do not claim this is faster, but you can do something like this for each mapped class:
    // untested
    var entities = MySession.CreateCriteria(typeof(MappedClass)).List<MappedClass>();
    foreach (var entity in entities)
        MySession.Delete(entity); // please optimize
This (alone) will not work in at least 2 cases:
When there is data that must be in your database when the app starts up.
When you have a type where the identity property's unsaved-value is "any".
A good alternative is having a backup of the initial DB state and restoring it when starting tests (this can be complex or not, depending on the DB)
Re-creating the database is a good choice, especially for unit testing. If the creation script is too slow you could take a backup of the database and use it to restore the DB to an initial state before each test.
The alternative would be to write a script that drops all foreign keys in the database and then deletes/truncates all tables. This would not reset any auto-generated IDs or sequences, however. It doesn't seem like an elegant solution, and it is definitely more time-consuming.
In any case, this is not something that should be done through an ORM - any ORM, not just NHibernate.
Why do you reject the re-creation option? What are your requirements? Is the schema too complex? Does someone else design the database? Do you want to avoid file fragmentation?
Another solution might be to create a stored procedure that wipes the data, and run it first in your test set-up or initialization method.
However, I am not sure whether this is quicker than the other methods, since we don't know the size of the database or the number of rows likely to be deleted. Also, I would not recommend deploying this stored procedure to the live server, for safety reasons!
HTH