I want to start a minikube cluster on a specific network/network adapter in VirtualBox, so that I can launch other VMs in the same network, like below:
+-------+     +-------+     +----------------+
|       |     |       |     |                |
|  VM2  |     |  VM1  |     |    Minikube    |
|       |     |       |     |    Cluster     |
|       |     |       |     |                |
+---+---+     +---+---+     +------------+---+
    |             |                      |
    |             |                      |
    |      +------+------------+         |
    +------+                   |         |
           |  192.168.10.0/24  +---------+
           +-------------------+
But I don't see many networking options in the minikube start CLI.
Is it possible to start minikube like that, or is there any trick to set it up as above?
When it comes to adjusting networking with minikube start, you can use the following option:
--host-only-cidr string The CIDR to be used for the minikube VM (only supported with Virtualbox driver) (default "192.168.99.1/24")
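For example, to put the minikube VM's host-only network on the 192.168.10.0/24 range from the diagram above, something like the following should work (a sketch; verify the flags against your minikube version, and note the value is the host's address within the CIDR):
# start minikube on a custom host-only CIDR (VirtualBox driver)
minikube start --vm-driver=virtualbox --host-only-cidr="192.168.10.1/24"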
As you can see in the table here, the default NAT option doesn't give you access to the Minikube Cluster VM either from the host or from other guests (VMs), but you can additionally set up port forwarding, which is well described in this article.
Although the mentioned minikube start doesn't support many options for modifying the networking of the default VM, you can easily change it by adding an additional bridged adapter once the Minikube VM is created, using the VirtualBox GUI or the vboxmanage command-line tool, as some users suggest here and here.
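As a sketch of the vboxmanage route (the VM name "minikube" is the default; eth0 is a placeholder for your host's interface, and the VM must be powered off while you modify it):
# stop the VM, attach a bridged adapter as NIC 3, then start it again
minikube stop
VBoxManage modifyvm "minikube" --nic3 bridged --bridgeadapter3 eth0
minikube start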
I have checked again; the minikube cluster is attached to 2 networks:
NAT
Host-Only Network (vboxnet1)
Since it is already connected to an adapter, I can attach the VM to the existing adapter and use it like below:
+--------+     +---------------------+
|        |     |       Minikube      |
|        |     |                     |
|   VM   |     |   eth1       eth0   |
|        |     |    +           +    |
|        |     +---------------------+
+---+----+          |           |
    |               |           |
    |               |           |
    |      +--------v-------+   |
    |      |                |   v
    +----->+    vboxnet1    |  NAT
           |192.168.99.0/24 |
           |                |
           +----------------+
Any other suggestions are welcome.
Background:
We are creating a SaaS app using a Vue front-end, Django/DRF backend, and PostgreSQL, all running in a Docker environment. The benchmarks below were run on our local dev machines.
The process to register a new "owner" is rather complex. It does the following:
Create tenant and schema
Run migrations (done in the create schema process)
Create MinIO bucket
Load "production" fixtures
Run sync_permissions
Create an owner instance in the newly created schema
We are seeing some significant differences in processing times for some of the above steps when running the registration process in different ways. In trying to figure out our issue, we have tried the following four methods to invoke the registration process:
from the Vue front-end hitting the API endpoint
from a REST client (Talend)
from the APIBrowser (provided by DRF)
(in some cases) via manage.py
We tried it from the REST client to try to eliminate Vue as the culprit, but we got similar times between Vue and the REST client.
We also saw similar times between the APIBrowser and the manage.py method, so in the tables below, we are comparing Talend to APIBrowser (or manage.py).
The issue:
Here are the processing times (in seconds) for several of the steps listed above:
|---------------------|--------|------------|--------|
| Process             | Talend | APIBrowser | Factor |
|---------------------|--------|------------|--------|
| Create Tenant       | 11.853 |      1.185 |   10.0 |
|---------------------|--------|------------|--------|
| Create MinIO Bucket |  0.386 |      0.273 |    1.4 |
|---------------------|--------|------------|--------|
| Load Fixtures       |  0.926 |      0.215 |    4.3 |
|---------------------|--------|------------|--------|
| Sync Permissions    | 61.115 |      5.390 |   11.3 |
|---------------------|--------|------------|--------|
| Overall             | 74.280 |      7.053 |   10.5 |
|---------------------|--------|------------|--------|
In both cases (Talend and APIBrowser), it is running the exact same code. We don't understand why the REST client method takes more than 10 times as long as running from APIBrowser.
We then tried to get down to finer detail in our benchmark timing. We focused on the first step and quickly noticed that the process of running migrate_schemas was the issue. Here's a list of processing times (in seconds) for each migration file it processed. This time, we ran the second pass via manage.py instead of APIBrowser, but as mentioned previously, those times were comparable.
|---------------------|--------|-----------|--------|
| Migration file      | Talend | manage.py | Factor |
|---------------------|--------|-----------|--------|
| activity_log.0001   |  0.133 |     0.013 |   10.2 |
| countries.0001      |  0.086 |     0.013 |    6.6 |
| contenttypes.0001   |  0.178 |     0.022 |    8.1 |
| contenttypes.0002   |  0.159 |     0.033 |    4.8 |
| auth.0001           |  0.530 |     0.092 |    5.8 |
| auth.0002           |  0.124 |     0.022 |    5.6 |
| auth.0003           |  0.090 |     0.023 |    3.9 |
| auth.0004           |  0.097 |     0.027 |    3.6 |
| auth.0005           |  0.126 |     0.016 |    7.9 |
| auth.0006           |  0.079 |     0.006 |   13.2 |
| auth.0007           |  0.079 |     0.011 |    7.2 |
| auth.0008           |  0.100 |     0.011 |    9.1 |
| auth.0009           |  0.085 |     0.014 |    6.1 |
| auth.0010           |  0.121 |     0.015 |    8.1 |
| auth.0011           |  0.087 |     0.018 |    4.8 |
| users.0001          |  0.871 |     0.115 |    7.6 |
| admin.0001          |  0.270 |     0.035 |    7.7 |
| admin.0002          |  0.093 |     0.022 |    4.2 |
| admin.0003          |  0.091 |     0.024 |    3.8 |
| authtoken.0001      |  0.193 |     0.036 |    5.4 |
| authtoken.0002      |  0.395 |     0.090 |    4.4 |
| clients.0001        |  0.537 |     0.082 |    6.5 |
| clients.0002        |  0.519 |     0.145 |    3.6 |
| projects.0001       |  0.475 |     0.062 |    7.7 |
| projects.0002       |  0.293 |     0.062 |    4.7 |
| sessions.0001       |  0.191 |     0.023 |    8.3 |
| tasks.0001          |  0.241 |     0.122 |    2.0 |
| tenants.0001        |  0.086 |     0.017 |    5.1 |
|---------------------|--------|-----------|--------|
| Total time:         | 10.404 |     1.618 |    6.4 |
|---------------------|--------|-----------|--------|
Our Theory:
We think it must have something to do with Talend (and Vue) initiating the process from a different domain (as will be the case when the site is live), whereas in the case of APIBrowser, it starts from the actual endpoint (i.e. the same domain) that the endpoint is defined on.
That means, in our local environment, running from Vue, we are on local.dev and it hits the local.api endpoint. But running from APIBrowser, we go directly to local.api, then fill in the data on the form and POST it.
Our theory is that it must be affecting how files are accessed. The migrate_schemas process has to open many .py files. And the worst culprit, SyncPermissions, processes many .yaml files in which we have defined the default permission structure used by each tenant. I should point out that the LoadFixtures process also opens external .yaml files, but in that case it only has one file to process, so the difference is minimized.
It may be like the difference between opening an image file in code vs. a template showing an image via HTML. In the HTML version, it's essentially another request on the server - which surely takes longer than programmatically opening an image on disk.
What we don't understand is why opening files in these processes would be affected by the two methods of initiating the process. Obviously, since the site will have to run in Vue, having the registration process take 70 seconds when we know it could be done in only 7 seconds is unacceptable.
Note:
I realize it is the norm here in SO to include code for the process in question, but in this case, both processes are running the exact same code - which is why I decided not to post several hundred lines of code here.
Edit (in response to @Iain Shelvington)
The process starts in the post() method of TenantRegister view:
class TenantRegister(APIView):
    def post(self, request, *args, **kwargs):
        ...
        tenant_data = request.data.pop('tenant', dict())
        tenant_serializer = TenantSaveSerializer(data=tenant_data)
        tenant_serializer.is_valid(raise_exception=True)
        tenant = tenant_serializer.create(tenant_serializer.validated_data)
        ...
...which calls the create() method of TenantSaveSerializer:
class TenantSaveSerializer(serializers.ModelSerializer):
    class Meta:
        model = Tenant
        fields = '__all__'

    def create(self, validated_data):
        ...
        tenant = Tenant.objects.create(**validated_data)
        ...
        if has_schema and tenant.auto_create_schema:
            try:
                tenant.create_schema(check_if_exists=True, verbosity=self.verbosity)
                post_schema_sync.send(sender=Tenant, tenant=tenant)
            except Exception:
                # We failed creating the schema, delete what
                # was created and re-raise the exception.
                tenant.delete(force_drop=True)
                raise
        else:
            # Although we are not using the schema functions directly,
            # the signal might be registered by a listener.
            schema_needs_to_be_sync.send(sender=Tenant, tenant=self)
        return tenant
...which calls the create_schema() method on the Tenant model instance:
def create_schema(self, check_if_exists=False, sync_schema=True,
                  verbosity=1):
    connection = connections[get_tenant_database_alias()]
    cursor = connection.cursor()

    # Create the schema.
    cursor.execute('CREATE SCHEMA "%s"' % self.schema_name)
    call_command(
        'migrate_schemas',
        tenant=True,
        schema_name=self.schema_name,
        interactive=False,
        verbosity=verbosity)
    connection.set_schema_to_public()
    return True
As for the timing of each migration, my colleague did those measurements. I believe he said he just set verbosity to a higher value and the migrate_schemas process produced the timed output.
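For reference, those timings can likely be reproduced by raising the verbosity of the call shown above; Django's migrate prints per-migration elapsed times at verbosity 2 and higher, and migrate_schemas passes the verbosity through. A sketch (the schema name is a placeholder):
from django.core.management import call_command

# Same call as in create_schema() above, with verbosity raised to 2 so that
# each applied migration is reported with its elapsed time.
call_command(
    'migrate_schemas',
    tenant=True,
    schema_name='<schema_name>',  # placeholder for the tenant's schema
    interactive=False,
    verbosity=2)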
I am trying to format the AWS CLI table output so that it shows as a 'nice' formatted table in markdown in Typora, GitHub md files, etc.
For example, the original table-formatted output from the AWS CLI command
$ aws ec2 describe-subnets --query "Subnets[*].{CIDR:CidrBlock,Name:Tags[?Key=='Name']|[0].Value,AZ:AvailabilityZone}" --output table
is
----------------------------------------------------------------------
|                          DescribeSubnets                           |
+------------+------------------+------------------------------------+
|     AZ     |       CIDR       |                Name                |
+------------+------------------+------------------------------------+
|  eu-west-3c|  10.1.103.0/24   |  vpc-acme-test-public-eu-west-3c   |
|  eu-west-3b|  172.31.16.0/20  |  None                              |
|  eu-west-3a|  10.1.101.0/24   |  vpc-acme-test-public-eu-west-3a   |
|  eu-west-3c|  10.1.3.0/24     |  vpc-acme-test-private-eu-west-3c  |
|  eu-west-3b|  10.1.2.0/24     |  vpc-acme-test-private-eu-west-3b  |
|  eu-west-3a|  172.31.0.0/20   |  None                              |
|  eu-west-3c|  172.31.32.0/20  |  None                              |
|  eu-west-3a|  10.1.1.0/24     |  vpc-acme-test-private-eu-west-3a  |
+------------+------------------+------------------------------------+
Based on assorted markdown tutorials and tests, output that would render properly as a table in Typora and GitHub is something like:
|     AZ     |       CIDR       |                Name                |
|------------|------------------|------------------------------------|
|  eu-west-3c|  10.1.103.0/24   |  vpc-acme-test-public-eu-west-3c   |
|  eu-west-3b|  172.31.16.0/20  |  None                              |
|  eu-west-3a|  10.1.101.0/24   |  vpc-acme-test-public-eu-west-3a   |
|  eu-west-3c|  10.1.3.0/24     |  vpc-acme-test-private-eu-west-3c  |
|  eu-west-3b|  10.1.2.0/24     |  vpc-acme-test-private-eu-west-3b  |
|  eu-west-3a|  172.31.0.0/20   |  None                              |
|  eu-west-3c|  172.31.32.0/20  |  None                              |
|  eu-west-3a|  10.1.1.0/24     |  vpc-acme-test-private-eu-west-3a  |
(The text above does not render as a table on Stack Overflow. [Screenshot: the same table rendered as a proper table in Typora.])
I could not find any AWS CLI option for this, but the following Unix-like series of filters does the job.
Pipe the output of the AWS command to:
sed 's/+/|/g' | tail -n +4 | head -n -1
This replaces every + with |, drops the first three lines (the outer border, the title, and the top border of the header row), and drops the trailing border line.
The full CLI command is:
$ aws ec2 describe-subnets --query "Subnets[*].{CIDR:CidrBlock,Name:Tags[?Key=='Name']|[0].Value,AZ:AvailabilityZone}" --output table | sed 's/+/|/g' | tail -n +4 | head -n -1
Other suggestions welcome!
I am trying to build a data collection pipeline on top of AWS services. The overall architecture is given below.
In summary, the system should get events from API Gateway (1) (one request for each event) and the data should be written to Kinesis (2).
I am expecting ~100k events per second. My question is related to KPL usage in Lambda functions. In step 2 I am planning to write a Lambda function with the KPL to write events to Kinesis with high throughput. But I am not sure this is possible, as API Gateway calls the Lambda function separately for each event.
Is it possible/reasonable to use the KPL in such an architecture, or should I use the Kinesis Put API instead?
        1                        2                        3                        4
+----------------+       +----------------+       +----------------+       +----------------+
|                |       |                |       |                |       |                |
|                |       |                |       |                |       |                |
|   AWS API GW   +-----> |   AWS Lambda   +-----> |  AWS Kinesis   +-----> |   AWS Lambda   |
|                |       |  Function with |       |    Streams     |       |                |
|                |       |      KPL       |       |                |       |                |
|                |       |                |       |                |       |                |
+----------------+       +----------------+       +----------------+       +-----+----+-----+
                                                                                 |    |
                                                                                 |    |
                                                                                 |    |
                                                                                 |    |
                                                                                 |    |
        5                                                                        |    |             6
+----------------+                                                               |    |      +----------------+
|                |                                                               |    |      |                |
|                |                                                               |    |      |                |
|     AWS S3     <---------------------------------------------------------------+    +----> |  AWS Redshift  |
|                |                                                                            |                |
|                |                                                                            |                |
|                |                                                                            |                |
+----------------+                                                                            +----------------+
I am also thinking about writing directly to S3 instead of calling a Lambda function from API Gateway. If the first architecture is not reasonable this may be a solution, but in that case I will have a delay until the data is written to Kinesis.
        1                        2                        3                        4                        5
+----------------+       +----------------+       +----------------+       +----------------+       +----------------+
|                |       |                |       |                |       |                |       |                |
|                |       |                |       |                |       |                |       |                |
|   AWS API GW   +-----> |   AWS Lambda   +-----> |   AWS Lambda   +-----> |  AWS Kinesis   +-----> |   AWS Lambda   |
|                |       |  to write data |       |  Function with |       |    Streams     |       |                |
|                |       |     to S3      |       |      KPL       |       |                |       |                |
|                |       |                |       |                |       |                |       |                |
+----------------+       +----------------+       +----------------+       +----------------+       +-----+----+-----+
                                                                                                          |    |
                                                                                                          |    |
                                                                                                          |    |
                                                                                                          |    |
                                                                                                          |    |
        6                                                                                                 |    |             7
+----------------+                                                                                        |    |      +----------------+
|                |                                                                                        |    |      |                |
|                |                                                                                        |    |      |                |
|     AWS S3     <----------------------------------------------------------------------------------------+    +----> |  AWS Redshift  |
|                |                                                                                                     |                |
|                |                                                                                                     |                |
|                |                                                                                                     |                |
+----------------+                                                                                                     +----------------+
I do not think using the KPL is the right choice here. The key concept of the KPL is that records get collected at the client and are then sent as a batch operation to Kinesis. Since Lambdas are stateless per invocation, it would be rather difficult to store the records for aggregation (before sending them to Kinesis).
I think you should have a look at the following AWS article, which explains how you can directly connect API Gateway to Kinesis. This way, you can avoid the extra Lambda which just forwards your request.
Create an API Gateway API as a Kinesis Proxy
Obviously, if your data coming through AWS API Gateway corresponds to one Kinesis Data Streams record, it makes no sense to use the KPL, as pointed out by Jens. In this case you can make a direct call to the Kinesis API without using Lambda. Alternatively, you may do some additional processing in Lambda and send the data through PutRecord (not PutRecords, which is used by the KPL). Your code in Java would look like this:
AmazonKinesisClientBuilder clientBuilder = AmazonKinesisClientBuilder.standard();
clientBuilder.setRegion(REGION);
clientBuilder.setCredentials(new DefaultAWSCredentialsProviderChain());
clientBuilder.setClientConfiguration(new ClientConfiguration());
AmazonKinesis kinesisClient = clientBuilder.build();
...
// then later, on each record
PutRecordRequest putRecordRequest = new PutRecordRequest();
putRecordRequest.setStreamName(STREAM_NAME);
putRecordRequest.setData(data);
putRecordRequest.setPartitionKey(daasEvent.getAnonymizedId());
putRecordRequest.setExplicitHashKey(Utils.randomExplicitHashKey());
putRecordRequest.setSequenceNumberForOrdering(sequenceNumberOfPreviousRecord);
PutRecordResult putRecordResult = kinesisClient.putRecord(putRecordRequest);
sequenceNumberOfPreviousRecord = putRecordResult.getSequenceNumber();
However, there may be cases where using the KPL from Lambda makes sense, for example when the data sent to AWS API Gateway contains multiple individual records which will be sent to one or multiple streams. In those cases the benefits of the KPL (see https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-concepts.html) are still valid, but you have to be aware of the specifics of running it in Lambda, most notably the "issue" pointed out here: https://github.com/awslabs/amazon-kinesis-producer/issues/143. You have to call
kinesisProducer.flushSync()
at the end of the insertions, which also worked for me.
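For completeness, a minimal sketch of that pattern (STREAM_NAME, partitionKey and the records list are illustrative, not from the code above):
KinesisProducer kinesisProducer = new KinesisProducer(
        new KinesisProducerConfiguration().setRegion(REGION));

for (ByteBuffer data : records) {
    // addUserRecord() is asynchronous; the KPL buffers and aggregates records
    kinesisProducer.addUserRecord(STREAM_NAME, partitionKey, data);
}

// Block until the KPL's internal buffer is drained; without this the Lambda
// may be frozen before the background process has actually sent the records.
kinesisProducer.flushSync();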
I'm using Doctrine 2 with my ZF2 project, but I'm getting some weird problems with my server's CPU and memory, and my server simply crashes.
I'm getting a lot of queries in the Sleep state, and they don't seem to get cleaned up.
mysql> show processlist;
+---------+--------------+-----------+--------------+---------+------+-------+------+
| Id      | User         | Host      | db           | Command | Time | State | Info |
+---------+--------------+-----------+--------------+---------+------+-------+------+
| 2832346 | leechprotect | localhost | leechprotect | Sleep   |  197 |       | NULL |
| 2832629 | db_user      | localhost | db_exemple   | Sleep   |    3 |       | NULL |
| 2832643 | db_user      | localhost | db_exemple   | Sleep   |    3 |       | NULL |
| 2832646 | db_user      | localhost | db_exemple   | Sleep   |    3 |       | NULL |
| 2832664 | db_user      | localhost | db_exemple   | Sleep   |  154 |       | NULL |
| 2832666 | db_user      | localhost | db_exemple   | Sleep   |  153 |       | NULL |
| 2832669 | db_user      | localhost | db_exemple   | Sleep   |  152 |       | NULL |
| 2832674 | db_user      | localhost | db_exemple   | Sleep   |    7 |       | NULL |
| 2832681 | db_user      | localhost | db_exemple   | Sleep   |    1 |       | NULL |
| 2832683 | db_user      | localhost | db_exemple   | Sleep   |    4 |       | NULL |
| 2832690 | db_user      | localhost | db_exemple   | Sleep   |  149 |       | NULL |
(.......)
Also, it seems the PHP GC is not cleaning all the objects from memory, or even killing processes.
Is there a way to disable the cache system? Would it improve the use of my resources?
Most of my queries are similar to:
$query = $this->createQueryBuilder('i');
$query->innerJoin('\Application\Relation', 'r', 'WITH', 'r.child = i.id');
$query->innerJoin('\Application\Taxonomy', 't', 'WITH', 't.id = r.taxonomy');
$query->where('t.type = :type')->setParameter('type', $relation);
$query->groupBy('i.id');
$items = $query->getQuery()->getResult(2);
Thanks in advance.
First, check MySQL's wait_timeout variable. From the documentation:
Wait_timeout : The number of seconds the server waits for activity on
a noninteractive connection before closing it.
In the normal flow (when not using persistent connections), PHP closes the connection automatically after script execution. To ensure there are no sleeping threads, simply close the connection at the end of your script:
$entityManager->getConnection()->close();
If these queries are running in a big while/for loop, you might want to read the Doctrine 2 batch processing documentation.
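In short, the batch-processing pattern looks like this (a sketch; $entityManager and $items are placeholders):
$batchSize = 20;
foreach ($items as $i => $item) {
    $entityManager->persist($item);
    if (($i % $batchSize) === 0) {
        $entityManager->flush();  // execute the queued SQL
        $entityManager->clear();  // detach managed entities so memory can be freed
    }
}
$entityManager->flush();
$entityManager->clear();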
I am trying to get a list of hosts connected to a MySQL server. How can I get this?
What should I do after connecting to the MySQL server?
Code snippets will really help.
Also, what's the best API to use to connect to MySQL using C++?
One way you could do it is to execute the query show processlist, which will give you a table with Id, User, Host, db, Command, Time, State and Info columns. Remember that your show processlist query will be part of the output.
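As for C++, a minimal sketch using the classic MySQL C API (libmysqlclient) could look like this; host, user and password are placeholders:
#include <mysql.h>
#include <cstdio>

int main() {
    MYSQL *conn = mysql_init(nullptr);
    if (!mysql_real_connect(conn, "localhost", "user", "password",
                            nullptr, 0, nullptr, 0)) {
        std::fprintf(stderr, "connect failed: %s\n", mysql_error(conn));
        return 1;
    }
    if (mysql_query(conn, "SHOW PROCESSLIST") != 0) {
        std::fprintf(stderr, "query failed: %s\n", mysql_error(conn));
        mysql_close(conn);
        return 1;
    }
    MYSQL_RES *res = mysql_store_result(conn);
    MYSQL_ROW row;
    while ((row = mysql_fetch_row(res)) != nullptr)
        std::printf("%s\n", row[2] ? row[2] : "NULL");  // column 2 is Host
    mysql_free_result(res);
    mysql_close(conn);
    return 0;
}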
You can try this query: select distinct host from information_schema.processlist;
For example, there are multiple connections from 10.9.0.10 and one local connection.
mysql> select distinct host from information_schema.processlist;
+-----------------+
| host            |
+-----------------+
| 10.9.0.10:63668 |
| 10.9.0.10:63670 |
| 10.9.0.10:63664 |
| 10.9.0.10:63663 |
| 10.9.0.10:63666 |
| 10.9.0.10:63672 |
| 10.9.0.10:63665 |
| 10.9.0.10:63671 |
| 10.9.0.10:63669 |
| 10.9.0.10:63667 |
| localhost       |
|                 |
+-----------------+
12 rows in set (0,00 sec)
If you want only hosts (not different connections), you can try something like this: select distinct substring_index(host,':',1) from information_schema.processlist;
Example:
mysql> select distinct substring_index(host,':',1) from information_schema.processlist;
+-----------------------------+
| substring_index(host,':',1) |
+-----------------------------+
| 10.9.0.10                   |
| localhost                   |
|                             |
+-----------------------------+
3 rows in set (0,00 sec)
You can see that MySQL shows me one empty row; this is normal (I have a daemon process):
mysql> select distinct substring_index(host,':',1),`command` from information_schema.processlist;
+-----------------------------+---------+
| substring_index(host,':',1) | command |
+-----------------------------+---------+
| 10.9.0.10                   | Sleep   |
| localhost                   | Query   |
|                             | Daemon  |
+-----------------------------+---------+
You can remove it with where `command`!="Daemon" or where `host`!=''.
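For example:
select distinct substring_index(host,':',1) from information_schema.processlist where `command` != 'Daemon';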
And here is a good link with a query which also counts connections per host and shows which users are connected: http://blog.shlomoid.com/2011/08/how-to-easily-see-whos-connected-to.html