TestCafe concurrency throwing weird errors

I have a suite of tests that, when running without concurrency, all pass fine. When I start running them with concurrency, they start to fall apart. I do not believe that the tests have inter-dependencies that are making them fail. When the test fails, it looks like this:
1) A JavaScript error occurred on
"https://advancedaccount.wistia.io/stats/medias/lz45f2dspl#social".
Repeat test actions in the browser and check the console for errors.
If you see this error, it means that the tested website caused it. You can fix it or disable tracking JavaScript errors in TestCafe. To do the latter, enable the
"--skip-js-errors" option.
If this error does not occur, please write a new issue at:
"https://github.com/DevExpress/testcafe/issues/new?template=bug-report.md".
JavaScript error details:
undefined:
No stack trace available
Browser: Chrome 73.0.3683 / Linux 0.0.0
Screenshot: /mnt/artifacts/screenshots/Media - Social Stats/142_errors/1.png
497 | .expect(socialStatsPage.youtube.likes.textContent).contains('5')
498 | .expect(socialStatsPage.youtube.shares.textContent).contains('100')
499 | .expect(socialStatsPage.youtube.views.textContent).contains('20')
500 | .expect(socialStatsPage.cardError.withProps('vendorName', 'facebook').exists).ok()
501 | .expect(socialStatsPage.cardError.withProps('vendorName', 'facebook').textContent).contains('It looks like your credentials might be outdated')
> 502 | .click(socialStatsPage.reauthorizeAccountLink)
503 | .expect(getLocation()).contains('www.facebook.com');
504 | });
505 |
506 | test
507 |   .requestHooks(mockFacebookExpired)('Non-account owner expired token', async (t) => {
at click (/usr/src/app/testcafe/tests/media/socialStats.js:502:8)
2) Unhandled promise rejection:
{ code: 'E1', isTestCafeError: true,
  callsite: CallsiteRecord {
    filename: '/usr/src/app/testcafe/tests/media/socialStats.js',
    lineNum: 501, callsiteFrameIdx: 6,
    stackFrames: [ [CallSite], [CallSite], [CallSite], [CallSite], [CallSite], [CallSite], [CallSite], CallSite {}, [CallSite], [CallSite], [CallSite], [CallSite] ],
    isV8Frames: true },
  errStack: 'undefined:\n No stack trace available',
  pageDestUrl: 'https://advancedaccount.wistia.io/stats/medias/lz45f2dspl#social' }
Browser: Chrome 73.0.3683 / Linux 0.0.0
It also seems that when this happens, many if not all of the other concurrent browser instances crash with the same unhandled promise rejection diagnostic and nothing related to their own tests at all.
I have no idea where to start with debugging this. Any help would be great, because I would like to get concurrency working. This occurs locally on my Mac as well as in our CI setup.
What's also really odd is that when I look at the screenshot for the failed test, everything looks fine, so I can't figure out why it failed in the first place. Like I said, this test (and all the others) passes just fine without the concurrency flag.
Any tips would be much appreciated. I am using TestCafé v1.2.1.

Related

Twisted SSH - session execCommand implementation

Good day. I apologize for asking about obvious things: I write PHP, and I know Python at the "I started learning this yesterday" level. I've already spent a few days on this, but to no avail.
I downloaded the Twisted SSH server example for version 20.3 from https://docs.twistedmatrix.com/en/twisted-20.3.0/conch/examples/. Line 162 has an execCommand method that I need to implement to make it work, but then I noticed the comment in this method: "We don't support command execution sessions". Hence the question: does this comment apply only to the example, or to the Twisted library entirely? That is, is it possible to implement this method so that the example server works as I need?
More information (I don't think this is required to answer the question above).
Why do I need it? I'm trying to put together an environment for writing functional tests (there would be no such problems with unit tests, I guess). Our API uses an SSH client (phpseclib's SSH2) in 30%+ of its endpoints. Whatever I did, I only ever got three kinds of results, depending on how I implemented this method: (result: success, response: "" - empty); (result: success, response: "1"); (result: failed, response: "Unable to fulfill channel request at… SSH2.php:3853"). Those were for the SSH2 client. In the third (error) case, the server logs this in the terminal:
[SSHServerTransport, 0,127.0.0.1] Got remote error, code 11 reason: ""
[SSHServerTransport, 0,127.0.0.1] connection lost
I just found that this works:
def execCommand(self, protocol, cmd):
    # Send the command's output back to the client (bytes, not str, on Python 3),
    # then signal EOF so the client knows the "command" has finished.
    protocol.write(b'Some text to return')
    protocol.session.conn.sendEOF(protocol.session)
If I don't send EOF, the client throws a timeout error.
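For phpseclib's exec() to see the command finish cleanly, it can also help to report an exit status before closing the channel. Below is a minimal sketch of a fuller execCommand, assuming the session class from the 20.3 example; the exit-status request mirrors how twisted.conch.ssh.session reports a process exit, but treat the details as assumptions to verify against your Twisted version:
import struct

# Inside the example's session class (ExampleSession in the 20.3 example):
def execCommand(self, protocol, cmd):
    # 'protocol' is an SSHSessionProcessProtocol, 'cmd' is the raw command bytes.
    protocol.write(b'Some text to return\n')

    # Report a zero exit status so clients such as phpseclib's exec() see success
    # (assumption: same request twisted.conch.ssh.session sends on process exit).
    protocol.session.conn.sendRequest(
        protocol.session, b'exit-status', struct.pack('>L', 0))

    # Signal EOF and close the channel; without the EOF the client times out.
    protocol.session.conn.sendEOF(protocol.session)
    protocol.session.conn.sendClose(protocol.session)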

Memory crash on sending 100MB+ file to S3 on chrome

I'm currently using JavaScript to upload some video files to S3. The process works for files <100MB, but at ~100MB and above on Chrome I run into an error (it works on Safari). I am using ManagedUpload in this example, which should be doing a multipart upload in the background.
Code snippet:
...
let upload = new AWS.S3.ManagedUpload({
  params: {
    Bucket: 'my-bucket',
    Key: videoFileName,
    Body: videoHere,
    ACL: "public-read"
  }
});
upload.promise();
...
Chrome crashes with the error RESULT_CODE_INVALID_CMDLINE_URL, DevTools crashes, and in Chrome's terminal logs I get this:
[5573:0x3000000000] 27692 ms: Scavenge 567.7 (585.5) -> 567.7 (585.5) MB, 23.8 / 0.0 ms (average mu = 0.995, current mu = 0.768) allocation failure
[5573:0x3000000000] 28253 ms: Mark-sweep 854.6 (872.4) -> 609.4 (627.1) MB, 235.8 / 0.0 ms (+ 2.3 ms in 2 steps since start of marking, biggest step 1.4 ms, walltime since start of marking 799 ms) (average mu = 0.940, current mu = 0.797) allocation fa
<--- JS stacktrace --->
[5573:775:0705/140126.808951:FATAL:memory.cc(38)] Out of memory. size=0
[0705/140126.813085:WARNING:process_memory_mac.cc(93)] mach_vm_read(0x7ffee4199000, 0x2000): (os/kern) invalid address (1)
[0705/140126.880084:WARNING:system_snapshot_mac.cc(42)] sysctlbyname kern.nx: No such file or directory (2)
I've also tried an HTTP PUT; both approaches work for smaller files, but once the files get bigger they both crash.
Any ideas? I've been through tons of SO posts and AWS docs, but nothing has helped with this issue yet.
Edit: I've filed the issue with Chrome; it seems like an actual bug. I will update this post when I have an answer.
This issue came from loading the big file into memory (several times), which crashed Chrome before it even had a chance to upload.
The fix was to use URL.createObjectURL (a URL pointing to the file) instead of FileReader.readAsDataURL (which holds the entire file contents), and, when sending the file to your API, to use const newFile = new File([await fetch(objectURL).then(req => req.blob())], 'example.mp4', {type: 'video/mp4'});
This worked for me because I had been doing many conversions to get the readAsDataURL result into the file type I wanted; this way I use much less memory.

Jetty 8.1 flooding the log file with "Dispatched Failed" messages

We are using Jetty 8.1 as an embedded HTTP server. Under overload conditions the server sometimes starts flooding the log file with these messages:
warn: java.util.concurrent.RejectedExecutionException
warn: Dispatched Failed! SCEP#76107610{l(...)<->r(...),d=false,open=true,ishut=false,oshut=false,rb=false,wb=false,w=true,i=1r}...
The same message is repeated thousands of times, and the amount of logging appears to slow down the whole system. The messages themselves are fine; our request handler is just too slow to process the requests in time. But the huge number of repeated messages actually makes things worse and makes it harder for the system to recover from the overload.
So, my question is: is this a normal behaviour, or are we doing something wrong?
Here is how we set up the server:
Server server = new Server();
SelectChannelConnector connector = new SelectChannelConnector();
connector.setAcceptQueueSize( 10 );
server.setConnectors( new Connector[]{ connector } );
server.setThreadPool( new ExecutorThreadPool( 32, 32, 60, TimeUnit.SECONDS,
new ArrayBlockingQueue<Runnable>( 10 )));
The SelectChannelEndPoint is the origin of this log message.
To stop seeing it, just set the named logger org.eclipse.jetty.io.nio.SelectChannelEndPoint to LEVEL=OFF.
Now as for why you see it, that is more interesting to the developers of Jetty. Can you detail what specific version of Jetty you are using and also what specific JVM you are using?

Are there any tools for viewing a "pass/fail" history of unit tests with respect to SCM commits?

This seems like such a no-brainer that I'm almost sure something like this must exist. I just don't know where to find it. On the other hand, maybe there are technical reasons this is impossible, and I'm just not seeing them.
But basically, it seems to me it would be very helpful if, given a particular unit test, one could (with the assistance of a CI server like Jenkins) view a history of the commits that affected the red/green status of that test. So, say I want to see such a history for unit test X; I might see a history looking like this:
Revision | Date | Test X Status
-------------------------------------
123 | 2011-03-20 | Failed
120 | 2011-03-19 | Passed
119 | 2011-03-19 | Failed
112 | 2011-03-16 | Passed
111 | 2011-03-16 | Pending
Hopefully that makes sense: what I'd see would basically be a filtered list of commits—only those that had some effect on the outcome of the particular unit test in question (X).
Does a tool like this exist (anywhere—so, as a standalone tool, as a component of some collaboration software, as a plugin for Eclipse, Visual Studio, etc.)?
"Use the REST, Luke."
This quick-and-dirty Bash script works with Bamboo; I tested it against the Spring Framework CI server's REST API:
echo "Revision Date Test X Status"
echo "-------------------------------------------------------------"
url=https://build.springsource.org/rest/api/latest/result
for buildNumber in {1000..980}
do
  curl -qs ${url}/SPR-TRUNKSNAPSHOT-${buildNumber} \
    | sed 's/^.*state="\(.*\)" key.*buildCompletedTime.\(.*\)..buildCompletedTime.*vcsRevisionKey.\(.*\)..vcsRevisionKey.*$/\3\t\t\2\t\1/'
  echo
done
The ugliest part is parsing the XML with sed (side note: it's a pity the Linux shell does not provide built-in XPath/XSLT command-line tools; come on, it's the 21st century!), but it works:
Revision Date Test X Status
-------------------------------------------------------------
4086 2011-03-13T01:09:13.319-08:00 Successful
4083 2011-03-12T01:05:49.145-08:00 Successful
4081 2011-03-11T01:04:46.949-08:00 Successful
4074 2011-03-10T01:09:11.003-08:00 Successful
4069 2011-03-09T01:10:17.766-08:00 Successful
4069 2011-03-08T01:09:34.492-08:00 Successful
4069 2011-03-07T06:43:51.054-08:00 Successful
4068 2011-03-07T03:50:41.909-08:00 Failed
4068 2011-03-07T00:53:55.523-08:00 Failed
4060 2011-03-06T01:06:50.758-08:00 Failed
4060 2011-03-05T01:08:35.477-08:00 Successful
4057 2011-03-04T01:08:52.870-08:00 Successful
4056 2011-03-03T01:10:00.473-08:00 Successful
4056 2011-03-02T01:09:15.679-08:00 Successful
4055 2011-03-01T01:13:19.069-08:00 Successful
4051 2011-02-28T01:08:32.165-08:00 Successful
4050 2011-02-27T00:59:33.392-08:00 Successful
4050 2011-02-26T01:15:01.113-08:00 Successful
4036 2011-02-25T01:09:35.420-08:00 Successful
4032 2011-02-24T01:13:29.997-08:00 Successful
4030 2011-02-23T00:56:51.656-08:00 Failed
Jenkins also has REST support, so it shouldn't take you more than 30 minutes to rewrite my code.
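For example, here is a rough Python sketch of the same idea against Jenkins' JSON API. The server URL and job name are placeholders, and the exact fields (changeSet, commitId, timestamp, result) vary with your SCM plugin, so treat the details as assumptions; per-test status is available from the same build under .../testReport/api/json.
import json
import urllib.error
import urllib.request
from datetime import datetime

JENKINS = 'https://ci.example.com'   # placeholder Jenkins server
JOB = 'my-project'                   # placeholder job name

def fetch_json(url):
    """Download and parse one Jenkins JSON API endpoint."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

print('Revision        Date                 Build status')
print('--------------------------------------------------')

for build_number in range(1000, 980, -1):
    try:
        build = fetch_json(f'{JENKINS}/job/{JOB}/{build_number}/api/json')
    except urllib.error.HTTPError:
        continue  # that build number does not exist

    # SCM revisions that went into this build (empty for "no change" builds).
    revisions = [item.get('commitId', '?')
                 for item in build.get('changeSet', {}).get('items', [])] or ['-']
    date = datetime.fromtimestamp(build['timestamp'] / 1000).isoformat()
    print(','.join(revisions), date, build.get('result'))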

Debugging livelock in Django/Postgresql

I run a moderately popular web app on Django with Apache2, mod_python, and PostgreSQL 8.3 using the postgresql_psycopg2 database backend. I'm experiencing occasional livelock, identifiable by an apache2 process that continually consumes 99% of the CPU for several minutes or more.
I ran strace -p PID on the apache2 process and found that it was continually repeating these system calls:
sendto(25, "Q\0\0\0SSELECT (1) AS \"a\" FROM \"account_profile\" WHERE \"account_profile\".\"id\" = 66201 \0", 84, 0, NULL, 0) = 84
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
poll([{fd=25, events=POLLIN|POLLERR, revents=POLLIN}], 1, -1) = 1
recvfrom(25, "E\0\0\0\210SERROR\0C25P02\0Mcurrent transaction is aborted, commands ignored until end of transaction block\0Fpostgres.c\0L906\0Rexec_simple_query\0\0Z\0\0\0\5E", 16384, 0, NULL, NULL) = 143
This exact fragment repeats continually in the trace, and had been running for over 10 minutes before I finally killed the apache2 process. (Note: I edited this to replace my previous strace fragment with a new one that shows the full string contents rather than truncated ones.)
My interpretation of the above is that Django is attempting to do an existence check on my account_profile table, but at some earlier point (before I started the trace) something went wrong (an SQL parse error? a referential integrity or uniqueness constraint violation? who knows?), and now PostgreSQL is returning the error "current transaction is aborted". For some reason, instead of raising an exception and giving up, it just keeps retrying.
One possibility is that this is being triggered in a call to Profile.objects.get_or_create. This is the model class that maps to the account_profile table. Perhaps there is something in get_or_create that is designed to catch too broad a set of exceptions and retry? From the web server logs, it appears that this livelock might have occurred as a result of a double-click on the POST button in my site's registration form.
This condition has occurred a couple of times over the past few days on the live site, and results in a significant slowdown until I intervene, so pretty much anything other than infinite deadlock would be an improvement! :)
This turned out to be entirely my fault. I found the spot where the select (1) as 'a' statement seemed to originate (in django/models/base.py) and hacked it to log a traceback, which pointed clearly at my code.
I had some code that makes up a unique email "key" for each Profile. These keys are randomly generated, and because there is some possibility of overlap, I run the save in a try/except inside a while loop. My assumption was that the database's unique constraint would cause the save to fail if the key was not unique, and that I'd then be able to try again.
Unfortunately, in PostgreSQL you cannot simply try again after an integrity error. You have to issue a COMMIT or ROLLBACK (even if you're in autocommit mode, apparently) before you can try again. So I had an infinite loop of failing save attempts in which I was ignoring the error message.
Now I catch a more specific exception (django.db.IntegrityError) and allow only a limited number of attempts so that the loop cannot be infinite.
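In code, the fixed loop looks roughly like the sketch below. It uses the modern transaction.atomic API rather than what was available back then, and the app/module path, the generate_email_key helper, the email_key field, and the attempt limit are placeholders rather than my actual code:
from django.db import IntegrityError, transaction

from account.models import Profile  # the model mapped to account_profile (placeholder import path)

MAX_ATTEMPTS = 10  # bound the retries so the loop can never spin forever

def create_profile(user):
    for _ in range(MAX_ATTEMPTS):
        key = generate_email_key()  # hypothetical random-key helper
        try:
            # atomic() wraps the INSERT in a savepoint, so an IntegrityError
            # is rolled back cleanly and the connection stays usable.
            with transaction.atomic():
                return Profile.objects.create(user=user, email_key=key)
        except IntegrityError:
            # The key collided with an existing row; generate a new one and
            # retry instead of reusing the aborted transaction.
            continue
    raise RuntimeError('could not generate a unique email key')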
Thanks to everyone for viewing/answering.
Your analysis sounds pretty good. Clearly it's not picking up the fact that the transaction is aborted. I suggest you report this as a bug to the Django project...