Symfony2 unit testing: PDOException: SQLSTATE[00000] [1040] Too many connections - unit-testing

When I run the unit tests for my application, the first tests succeed, but at around test 100 they start to fail with a PDOException ("Too many connections"). I have already searched for this problem but have not been able to solve it.
My config is as follows:
<phpunit
backupGlobals = "false"
backupStaticAttributes = "false"
colors = "true"
convertErrorsToExceptions = "true"
convertNoticesToExceptions = "true"
convertWarningsToExceptions = "true"
processIsolation = "false"
stopOnFailure = "false"
syntaxCheck = "false"
bootstrap = "bootstrap.php.cache" >
If I change processIsolation to "true", every test fails with an error (E):
Caused by ErrorException: unserialize(): Error at offset 0 of 79 bytes
To address that, I tried setting "detect_unicode = Off" in the php.ini file.
If I run tests in smaller batches, like with "--group something", all tests are successful.
Can someone help me solve the issue when running all the tests at once? I really want to get rid of the PDOException.
Thanks in advance!

You should increase the maximum number of concurrent connections in your DB server.
If you're using MySQL, edit /etc/mysql/my.cnf and set the max_connections parameter to the number of concurrent connections you need. Then restart the MySQL server.
Keep in mind: in theory the physical limits are very high, but if your queries cause high CPU load or memory consumption, your DB server can eat up the resources needed by other processes. You could run out of memory, or the system could become overloaded.
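For example, the relevant my.cnf section could look like this (500 is only an illustrative value; size it to how many connections your test suite actually opens):
[mysqld]
max_connections = 500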

For people who are having the same issue, here are more specific steps for configuring the my.cnf file.
If you are sure you are editing the right my.cnf, put max_connections = 500 (the default is 151) in the [mysqld] section. Don't put it in the [client] section.
To make sure you are editing the right my.cnf when you have multiple mysqld installations (for example from Homebrew or XAMPP), locate the right mysqld. For XAMPP, run /Applications/XAMPP/xamppfiles/sbin/mysqld --verbose --help | grep -A 1 "Default options" and you will get something like this:
Default options are read from the following files in the given order:
/Applications/XAMPP/xamppfiles/etc/xampp/my.cnf /Applications/XAMPP/xamppfiles/etc/my.cnf ~/.my.cnf
Normally it's at /Applications/XAMPP/xamppfiles/etc/my.cnf.
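After restarting MySQL you can verify which limit the running server actually uses, and how many connections are currently open, with standard MySQL statements:
SHOW VARIABLES LIKE 'max_connections';
SHOW STATUS LIKE 'Threads_connected';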

Related

HTCondor - Partitionable slot not working

I am following the "Center for High Throughput Computing" and "Introduction to Configuration" tutorials on the HTCondor website to set up a partitionable slot. Before any configuration I run
condor_status
and get the following output.
I update the file 00-minicondor in /etc/condor/config.d by adding the following lines at the end of the file.
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=4
SLOT_TYPE_1_PARTITIONABLE = TRUE
and reconfigure
sudo condor_reconfig
Now with
condor_status
I get this output as expected. Now, I run the following command to check everything is fine
condor_status -af Name Slotype Cpus
and find slot1@ip-172-31-54-214.ec2.internal undefined 1 instead of slot1@ip-172-31-54-214.ec2.internal Partitionable 4 61295, which is what I would expect. Moreover, when I try to submit a job that asks for more than 1 CPU, it never gets resources allocated (it stays waiting forever), even though it should.
I don't know if I made some mistake during the installation process or what could be happening. I would really appreciate any help!
EXTRA INFO: If it is of any help, I installed HTCondor with the command
curl -fsSL https://get.htcondor.org | sudo /bin/bash -s -- --no-dry-run
on Ubuntu 18.04 running on an old p2.xlarge instance (it has 4 cores).
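For illustration, a minimal submit description asking for more than one CPU looks roughly like this (the executable and resource values here are placeholders, not my actual job):
executable     = /bin/sleep
arguments      = 300
request_cpus   = 2
request_memory = 1GB
log            = test.log
queue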
UPDATE: After rebooting the whole thing it seems to be working. I can now send jobs with different CPUs requests and it will start them properly.
The only issue that persists, I would say, is that the memory allocation is not displayed properly; for example:
But in reality it is allocating enough memory for the job (in this case around 12 GB).
If I run again
condor_status -af Name Slotype Cpus
I still get output I am not supposed to get, but at least it now shows the correct number of CPUs (even if the slot type still shows as undefined).
What is the output of condor_q -better when the job is idle?

How to execute command on multiple servers for executing a command

I have a set of servers (150 of them, used for logging) and a command (to get disk space). How can I execute this command on each server?
If the script takes 1 minute to get the report from a single server, how can I produce a report for all the servers every 10 minutes?
use strict;
use warnings;
use Net::SSH::Perl;
use Filesys::DiskSpace;

# I have almost more than 100 servers...
my %hosts = (
    'localhost' => {
        user     => "z",
        password => "qumquat",
    },
    '129.221.63.205' => {
        user     => "z",
        password => "aardvark",
    },
);

# file system /home or /dev/sda5
my $dir = "/home";
my $cmd = "df $dir";

foreach my $host (keys %hosts) {
    # protocol '2,1' means: try SSH-2 first, fall back to SSH-1
    my $ssh = Net::SSH::Perl->new($host, port => 22, debug => 1, protocol => '2,1');
    $ssh->login($hosts{$host}{user}, $hosts{$host}{password});
    my ($out) = $ssh->cmd($cmd);
    print "$out\n";
}
It has to send output of disk space for each server
Is there a reason this needs to be done in Perl? There is an existing tool, dsh, which provides precisely this functionality of using ssh to run a shell command on multiple hosts and report the output from each. It also has the ability, with the -c (concurrent) switch to run the command at the same time on all hosts rather than waiting for each one to complete before going on to the next, which you would need if you want to monitor 150 machines every 10 minutes, but it takes 1 minute to check each host.
To use dsh, first create a file in ~/.dsh/group/ containing a list of your servers. I'll put mine in ~/.dsh/group/test-group with the content:
galera-1
galera-2
galera-3
Then I can run the command
dsh -g test-group -c 'df -h /'
And get back the result:
galera-3: Filesystem Size Used Avail Use% Mounted on
galera-3: /dev/mapper/debian-system 140G 36G 99G 27% /
galera-1: Filesystem Size Used Avail Use% Mounted on
galera-1: /dev/mapper/debian-system 140G 29G 106G 22% /
galera-2: Filesystem Size Used Avail Use% Mounted on
galera-2: /dev/mapper/debian-system 140G 26G 109G 20% /
(They're out of order because I used -c, so the command was sent to all three servers at once and the results were printed in the order the responses were received. Without -c, they would appear in the same order the servers are listed in the group file, but then it would wait for each response before connecting to the next server.)
But, really, with the talk of repeating this check every 10 minutes, it sounds like what you really want is a proper monitoring system such as Icinga (a high-performance fork of the better-known Nagios), rather than just a way to run commands remotely on multiple machines (which is what dsh provides). Unfortunately, configuring an Icinga monitoring system is too involved for me to provide an example here, but I can tell you that monitoring disk space is one of the checks that are included and enabled by default when using it.
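That said, if you just need the raw dsh report on a schedule in the meantime, the 10-minute cadence can come from cron; a rough sketch, with a hypothetical log path:
*/10 * * * * dsh -g test-group -c 'df -h /' >> /var/log/disk-report.log 2>&1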
There is a ready-made tool called Ansible for exactly this purpose. There you can define your list of servers, group them, and execute commands on all of them.
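As a rough sketch (assuming an inventory file named hosts that lists your 150 servers), an ad-hoc run could look like this:
ansible all -i hosts -m command -a "df -h /home"
Scheduling it every 10 minutes would then just be a matter of running that command (or a small playbook) from cron.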

AppFabric unable to create a DataCache (LMTRepopulationJob FAILS)

Well, first of all, I am learning SharePoint 2013 and I have been following a few tutorials. So far I have set up a farm and everything seems to be working properly, except for this error that is logged to the event viewer every 5 minutes:
The Execute method of job definition
Microsoft.Office.Server.UserProfiles.LMTRepopulationJob (ID
1e573155-b7f6-441b-919b-53b2f05770f7) threw an exception. More
information is included below.
Unexpected exception in FeedCacheService.BulkLMTUpdate: Unable to
create a DataCache. SPDistributedCache is probably down..
I found out that this is a job that is configured to execute every 5 minutes
But regarding the assumption that the SPDistributedCache is probably down, I already verified it and it is running
As you can see, it is actually running. I also checked the cache hosts via the SharePoint PowerShell (Get-CacheHost and Get-CacheClusterHealth), and all still seems fine.
Yet when I execute Get-Cache I get only the default cache, and from what I have read there should be other caches listed, such as:
DistributedAccessCache_XXXXXXXXXXXXXXXXXXXXXXXXX
DistributedBouncerCache_XXXXXXXXXXXXXXXXXXXXXXXX
DistributedSearchCache_XXXXXXXXXXXXXXXXXXXXXXXXX
DistributedServerToAppServerAccessTokenCache_XXXXXXX
DistributedViewStateCache_XXXXXXXXXXXXXXXXXXXXXXX
Among others, which I think should probably include DataCache.
So far I have tried a few workarounds, without success:
Restart-Service AppFabricCachingService
Remove-SPDistributedCacheServiceInstance
Add-SPDistributedCacheServiceInstance
Restart-CacheCluster
I even tried this script, which seems to work in many cases to repair the AppFabric Caching Service:
$SPFarm = Get-SPFarm
$cacheClusterName = "SPDistributedCacheCluster_" + $SPFarm.Id.ToString()
$cacheClusterManager = [Microsoft.SharePoint.DistributedCaching.Utilities.SPDistributedCacheClusterInfoManager]::Local
$cacheClusterInfo = $cacheClusterManager.GetSPDistributedCacheClusterInfo($cacheClusterName);
$instanceName ="SPDistributedCacheService Name=AppFabricCachingService"
$serviceInstance = Get-SPServiceInstance | ? {($_.Service.Tostring()) -eq $instanceName -and ($_.Server.Name) -eq $env:computername}
$serviceInstance.Delete()
Add-SPDistributedCacheServiceInstance
$cacheClusterInfo.CacheHostsInfoCollection
Well, if anyone has any suggestions, I would appreciate it very much. Thank you in advance!
This is a generic error message, meaning that the real issue isn't known (hence the word "probably").
I believe the key to solving this problem, when it is not the "probably", is to look in the ULS log for the events that occurred just before it. Events of type "Unexpected" do not appear in the event log and are often seen right before a generic error like this one.
In many cases you might see something like "File not found". This usually means that the noted file is not in the assembly cache (GAC). Since the Distributed Cache uses AppFabric, which lives outside of SharePoint, the only way for SharePoint to find its files is to look in the assembly cache. The SharePoint prerequisite installer should have put the files there, but it might have failed, or maybe someone uninstalled AppFabric and reinstalled it manually, which would have removed the files from the assembly cache without putting them back.
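One way to sanity-check that side (this is an assumption-laden sketch: it assumes the AppFabric caching assemblies use the usual Microsoft.ApplicationServer.Caching.* names and are registered in the .NET 4 GAC) is to list them from PowerShell:
# List AppFabric caching assemblies registered in the .NET 4 assembly cache
Get-ChildItem "$env:windir\Microsoft.NET\assembly\GAC_MSIL" -Filter "Microsoft.ApplicationServer.Caching*" | Select-Object Name
If nothing comes back, re-running the SharePoint prerequisite installer (or repairing the AppFabric installation) would be the next thing to look at.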
Before Restart-CacheCluster you could specify the connection to your SharePoint database (the catalog name may differ):
Use-CacheCluster -ConnectionString "Data Source=(SharePoint DB Server)\(Optional Instance);Initial Catalog=CacheClusterConfigurationDB;Integrated Security=True" -ProviderType System.Data.SqlClient
NOTE: This change is not permanent.
NOTE 2: If you don't have a named instance on the DB server, just put the name of your server without the "\".
If you don't have a catalog, you could follow this script
Remove-Cache default
New-Cache SharePointCache
Get-CacheConfig SharePointCache
Set-CacheConfig SharePointCache -NotificationsEnabled True
New-CacheCluster -Provider System.Data.SqlClient -ConnectionString "Data Source=(SharePoint DB Server)\(Optional Instance);Initial Catalog=CacheClusterConfigurationDB;Integrated Security=True" -Size Small
Register-CacheHost -Provider System.Data.SqlClient -ConnectionString "Data Source=(SharePoint DB Server)\(Optional Instance);Initial Catalog=CacheClusterConfigurationDB;Integrated Security=True" -Account "Domain\spservices_account" -CachePort 22233 -ClusterPort 22234 -ArbitrationPort 22235 -ReplicationPort 22236 -HostName [Name_of_your_server]
Add-CacheHost -Provider System.Data.SqlClient -ConnectionString "Data Source=(SharePoint DB Server)\(Optional Instance);Initial Catalog=CacheClusterConfigurationDB;Integrated Security=True" -Account "Domain\spservices_account"
Add-CacheAdmin -Provider System.Data.SqlClient -ConnectionString "Data Source=(SharePoint DB Server)\(Optional Instance);Initial Catalog=CacheClusterConfigurationDB;Integrated Security=True"
Use-CacheCluster
You could specify or check your database configuration in regedit
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\AppFabric\V1.0\Configuration
Look for ConnectionString string value, and set your connection string
Data Source=(SharePoint DB Server)\(Optional Instance);Initial Catalog=CacheClusterConfigurationDB;Integrated Security=True
To query the status of the server you could use:
Get-SPServiceInstance | ? {($_.service.tostring()) -eq "SPDistributedCacheService Name=AppFabricCachingService"} | select Server, Status
Get-SPServer | ? {($_.ServiceInstances | % TypeName) -contains "Distributed Cache"} | % Address
Get-AFCache | Format-Table -AutoSize
Get-CacheHost
Additional:
If you need to change your service account, you could do this procedure:
$f = Get-SPFarm
$svc = $f.Services | ? {$_.Name -eq "AppFabricCachingService"}
$acc = Get-SPManagedAccount -Identity "Domain\spservices_account"
$svc.ProcessIdentity.CurrentIdentityType = "SpecificUser"
$svc.ProcessIdentity.ManagedAccount = $acc
$svc.ProcessIdentity.Update()
$svc.ProcessIdentity.Deploy()
Have you changed the distributed cache account from the farm account?
What build number are you on?
Is this a single-server farm?
Off the top of my head, the only thing left is this:
Grant-CacheAllowedClientAccount -Account "domain\ProfileserviceWebAppIdentity"
I would do an iisreset and restart the OWSTIMER service after you run this command.
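For reference, that restart could look like this from an elevated prompt (SPTimerV4 is the usual service name behind OWSTIMER.EXE on SharePoint 2013; verify the name on your server):
iisreset
Restart-Service -Name SPTimerV4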

django+uwsgi huge excessive memory usage issue

I have a Django+uWSGI based website. Some of the tables have almost 1 million rows.
After some use of the website, the VIRT memory used by the uWSGI process reaches almost 20 GB... almost killing my server...
Could someone tell me what may be causing this problem? Is it because my tables are too big? (Unlikely; Pinterest has much more data.) For now I have had to use reload-on-as = 10024 and reload-on-rss = 4800 to kill the workers every few minutes... it is painful...
Any help?
Here is my uwsgi.ini file
[uwsgi]
chdir = xxx
module = xxx.wsgi
master = true
processes = 2
socket =127.0.0.1:8004
chmod-socket = 664
no-orphans = true
#limit-as=256
reload-on-as= 10024
reload-on-rss= 4800
max-requests=250
uid = www-data
gid = www-data
#chmod-socket = 777
chown-socket = www-data
# clear environment on exit
vacuum = true
After some digging on Stack Overflow and Google, here is the solution.
Read this on how Django memory works and why it keeps going up.
Read this on Django app profiling.
Then I figured out that the major parameter to set in uwsgi.ini is max-requests. Originally I had set it to 2000; now it is 50, so uWSGI respawns workers before their memory grows too much.
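In uwsgi.ini that is just the following line (50 is simply the value that worked for my workload, not a universal recommendation):
max-requests = 50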
Then I tried to figure out which request was pulling huge query results from the database. I ended up finding this little line:
amount=sum(x.amount for x in Project.objects.all())
The Project table has over 1 million complex entries, so this occupied a huge amount of memory... Since I commented this out, everything runs smoothly.
So it is good to understand how Django queries work with the database.
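For what it's worth, a sum like that can usually be pushed into the database instead of loading every row into Python; a rough sketch using Django's aggregation API (assuming Project.amount is a numeric field):
from django.db.models import Sum

# Let the database compute the total instead of instantiating ~1 million model objects
amount = Project.objects.aggregate(total=Sum('amount'))['total']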
(Sorry I don't have enough reputation to comment - so apologies if this answer doesn't help in your case)
I had the same issue running Django on uWSGI/nginx, with uWSGI controlled via supervisor. The uwsgi-supervisor process started using lots of memory and consuming 100% CPU, so the only option was to repeatedly restart uWSGI.
Turned out the solution was to set up logging in the uwsgi.ini file:
logto = /var/log/uwsgi.log
There is some discussion on this here: https://github.com/unbit/uwsgi/issues/296

Django: Gracefully restart nginx + fastcgi sites to reflect code changes?

Common situation: I have a client on my server who may update some of the code in his Python project. He can ssh into his shell and pull from his repository and all is fine -- but the code is held in memory (as far as I know), so I need to actually kill the fastcgi process and restart it for the code change to take effect.
I know I can gracefully restart fcgi, but I don't want to have to do this manually. I want my client to update the code and, within 5 minutes or so, have the new code running under the fcgi process.
Thanks
First off, if uptime is important to you, I'd suggest letting the client do it. It can be as simple as giving him a command called deploy-code. With your approach, if there is an error in the code, fixing it requires another 10-minute turnaround (read: downtime), assuming he gets it correct.
That said, if you actually want to do this, you should create a daemon which looks for files modified within the last 5 minutes. If it detects one, it executes the restart command.
Code might look something like:
import os
import time

CODE_DIR = '/tmp/foo'

while True:
    restarted = False
    time.sleep(5 * 60)
    for root, dirs, files in os.walk(CODE_DIR):
        if restarted:
            break
        for filename in files:
            if restarted:
                break
            updated_on = os.path.getmtime(os.path.join(root, filename))
            current_time = time.time()
            if current_time - updated_on <= 6 * 60:  # 6 min
                # 6 min could offer false negatives, but that's better
                # than false positives
                restarted = True
                print("We should execute the restart command here.")