Redmine API get time entries report

Redmine API can serve time entries in JSON.
$ curl -X GET \
'https://my_redmine/time_entries.json' \
-H 'X-Redmine-API-Key: my_api_key'
{"time_entries":[{"id":14212,"project":{"id":73,"name":"Project 1"},"issue":{"id":5488},"user":{"id":5,"name":"John SMITH"},"activity":{"id":8,"name":"Grouillot"},"hours":8.0,"comments":"dsgsdsdh","spent_on":"2020-03-09","created_on":"2020-01-07T14:03:58Z","updated_on":"2020-01-07T14:33:52Z", ...
Now I need to get them aggregated, and I'd rather use the built-in report aggregation feature than do it myself after fetching all the required time entries.
Let's say we have these time entries:
+------+----------+-------+------------+
| user | issue_id | hours | spent_on   |
+------+----------+-------+------------+
|    1 |       42 |     4 | 2020-01-01 |
|    1 |       43 |     4 | 2020-01-01 |
|    2 |       42 |     8 | 2020-01-01 |
+------+----------+-------+------------+
If I want them aggregated by user, here is what I need to get:
+------+-------+------------+
| user | hours | spent_on   |
+------+-------+------------+
|    1 |     8 | 2020-01-01 |
|    2 |     8 | 2020-01-01 |
+------+-------+------------+
There doesn't seem to be any way to do that in the current REST API.
But since a report can be generated as CSV with this kind of query:
https://my_redmine/time_entries/report.csv?columns=day&criteria%5B%5D=user
I naively thought we could use it like:
$ curl -X GET \
'https://my_redmine/time_entries/report.json?columns=day&criteria%5B%5D=user' \
-H 'Accept: application/json' \
-H 'X-Redmine-API-Key: my_api_key'
But no luck: it returns a redirect to an HTML login form.
Even fetching the CSV with the API key or basic auth doesn't seem possible.
Would anyone have any idea how to get the time entries aggregated with a REST HTTP query?
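As a workaround, while waiting for a proper API answer, one can fetch the raw time entries and aggregate them client-side. Here is a minimal sketch in Python; the base URL and API key are the placeholders from above, and it assumes the standard limit/offset pagination of Redmine list endpoints:
import requests
from collections import defaultdict

BASE_URL = "https://my_redmine"   # placeholder, as in the question
API_KEY = "my_api_key"            # placeholder
HEADERS = {"X-Redmine-API-Key": API_KEY}

def fetch_time_entries(**params):
    """Page through /time_entries.json and yield every entry."""
    offset = 0
    while True:
        resp = requests.get(
            f"{BASE_URL}/time_entries.json",
            headers=HEADERS,
            params={"limit": 100, "offset": offset, **params},
        )
        resp.raise_for_status()
        entries = resp.json()["time_entries"]
        yield from entries
        if len(entries) < 100:   # last page reached
            break
        offset += 100

# Aggregate hours by (user, spent_on), mimicking a report grouped by user.
totals = defaultdict(float)
for entry in fetch_time_entries():
    totals[(entry["user"]["id"], entry["spent_on"])] += entry["hours"]

for (user_id, spent_on), hours in sorted(totals.items()):
    print(user_id, hours, spent_on)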

Related

Django: Periodic increment to all records in one field

Is there a more efficient way of incrementing a field on all records every hour than running a task that loops through every record at set time intervals and updates each one individually?
For example, User_profile Model:
username | coins_bought | coins_free | coins_spent
Amadeus  |            0 |          0 |           0   <-- new user has 0 coins throughout
Ludwig   |            5 |          5 |           3
Elise    |           21 |          9 |          12   <-- old user with prior activity
1 hr later:
username | coins_bought | coins_free | coins_spent
Amadeus  |            0 |        0+1 |           0
Ludwig   |            5 |        5+1 |           3
Elise    |           21 |        9+1 |          12
5 hr later:
username | coins_bought | coins_free | coins_spent
Amadeus  |            0 |          5 |           0
Ludwig   |            5 |         10 |           3
Elise    |           21 |         14 |          12
In this example, users can buy coins or wait 1 hour until they all receive a free coin they can use on the web app. I can't make this feature client-side, because it's not a mobile app and client-side caching is easy to corrupt.
Edit: I found the solution; if anyone else is stuck on this, view this link.
You can run an update on the whole queryset at a set time interval, like this:
from django.db.models import F

Model.objects.all().update(same_field=F('same_field') + 1)
One option would be to use an update query which could be run from a scheduled task controlled by Celery. Celery's documentation is pretty good on this: https://docs.celeryproject.org/en/stable/userguide/periodic-tasks.html
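For illustration, here is a minimal sketch of that setup with Celery beat, assuming a hypothetical coins app with a UserProfile model holding the coins_free field (all names are placeholders, not taken from the question):
# coins/tasks.py -- the hourly job is a single UPDATE, no per-row Python loop
from celery import shared_task
from django.db.models import F

from coins.models import UserProfile  # hypothetical app/model


@shared_task
def grant_hourly_coin():
    # One SQL UPDATE that increments coins_free on every row.
    UserProfile.objects.all().update(coins_free=F('coins_free') + 1)


# myproject/celery.py -- register the schedule with Celery beat
from celery import Celery
from celery.schedules import crontab

app = Celery('myproject')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()

app.conf.beat_schedule = {
    'grant-hourly-coin': {
        'task': 'coins.tasks.grant_hourly_coin',
        'schedule': crontab(minute=0),  # top of every hour
    },
}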

What causes processing time differences running REST endpoint from different sources?

Background:
We are creating a SaaS app using a Vue front-end, a Django/DRF backend and PostgreSQL, all running in a Docker environment. The benchmarks below were run on our local dev machines.
The process to register a new "owner" is rather complex. It does the following:
Create tenant and schema
Run migrations (done in the create schema process)
Create MinIO bucket
Load "production" fixtures
Run sync_permissions
Create an owner instance in the newly created schema
We are seeing some significant differences in processing times for some of the above steps running the registration process in different ways. In trying to figure out our issue, we have tried the following four methods to invoke the registration process:
from the Vue front-end hitting the API endpoint
from a REST client (Talend)
from the APIBrowser (provided by DRF)
(in some cases) via manage.py
We tried it from the REST client to try to eliminate Vue as the culprit, but we got similar times between Vue and the REST client.
We also saw similar times between the APIBrowser and the manage.py method, so in the tables below, we are comparing Talend to APIBrowser (or manage.py).
The issue:
Here are the processing times (in seconds) for several of the steps listed above:
|---------------------|--------|------------|--------|
| Process             | Talend | APIBrowser | Factor |
|---------------------|--------|------------|--------|
| Create Tenant       | 11.853 |      1.185 |   10.0 |
| Create MinIO Bucket |  0.386 |      0.273 |    1.4 |
| Load Fixtures       |  0.926 |      0.215 |    4.3 |
| Sync Permissions    | 61.115 |      5.390 |   11.3 |
|---------------------|--------|------------|--------|
| Overall             | 74.280 |      7.053 |   10.5 |
|---------------------|--------|------------|--------|
In both cases (Talend and APIBrowser), it is running the exact same code. We don't understand why the REST client method takes more than 10 times as long as running from APIBrowser.
We then tried to get down to finer detail in our benchmark timing. We focused on the first step and quickly noticed that it was the process of running migrate_schemas that was the issue. Here's a list of processing times for each migration file it processed. This time, we ran the second pass via manage.py instead of APIBrowser, but as mentioned previously, those times were comparable.
|---------------------|--------|-----------|--------|
| Migration file      | Talend | manage.py | Factor |
|---------------------|--------|-----------|--------|
| activity_log.0001   |  0.133 |     0.013 |   10.2 |
| countries.0001      |  0.086 |     0.013 |    6.6 |
| contenttypes.0001   |  0.178 |     0.022 |    8.1 |
| contenttypes.0002   |  0.159 |     0.033 |    4.8 |
| auth.0001           |  0.530 |     0.092 |    5.8 |
| auth.0002           |  0.124 |     0.022 |    5.6 |
| auth.0003           |  0.090 |     0.023 |    3.9 |
| auth.0004           |  0.097 |     0.027 |    3.6 |
| auth.0005           |  0.126 |     0.016 |    7.9 |
| auth.0006           |  0.079 |     0.006 |   13.2 |
| auth.0007           |  0.079 |     0.011 |    7.2 |
| auth.0008           |  0.100 |     0.011 |    9.1 |
| auth.0009           |  0.085 |     0.014 |    6.1 |
| auth.0010           |  0.121 |     0.015 |    8.1 |
| auth.0011           |  0.087 |     0.018 |    4.8 |
| users.0001          |  0.871 |     0.115 |    7.6 |
| admin.0001          |  0.270 |     0.035 |    7.7 |
| admin.0002          |  0.093 |     0.022 |    4.2 |
| admin.0003          |  0.091 |     0.024 |    3.8 |
| authtoken.0001      |  0.193 |     0.036 |    5.4 |
| authtoken.0002      |  0.395 |     0.090 |    4.4 |
| clients.0001        |  0.537 |     0.082 |    6.5 |
| clients.0002        |  0.519 |     0.145 |    3.6 |
| projects.0001       |  0.475 |     0.062 |    7.7 |
| projects.0002       |  0.293 |     0.062 |    4.7 |
| sessions.0001       |  0.191 |     0.023 |    8.3 |
| tasks.0001          |  0.241 |     0.122 |    2.0 |
| tenants.0001        |  0.086 |     0.017 |    5.1 |
|---------------------|--------|-----------|--------|
| Total time:         | 10.404 |     1.618 |    6.4 |
|---------------------|--------|-----------|--------|
Our Theory:
We think it must have something to do with Talend (and Vue) initiating the process from a different domain (as will be the case when the site is live), whereas the APIBrowser starts from the actual endpoint (i.e. the same domain) that the endpoint is defined on.
That means, in our local environment, running from Vue, we are on local.dev and it hits the local.api endpoint. But running from APIBrowser, we go directly to local.api, then fill in the data on the form and POST it.
Our theory is that it must be affecting how files are accessed. The migrate_schemas process has to open many .py files. And the worst culprit, SyncPermissions, is processing many .yaml files where we have defined our default permission structure utilized by each tenant. I should point out that the LoadFixtures process also opens external .yaml files, but in this case, it only has one file to process, so the difference is minimized.
It may be like the difference between opening an image file in code vs. a template showing an image via HTML. In the HTML version, it's essentially another request on the server - which surely takes longer than programmatically opening an image on disk.
What we don't understand is why opening files in these processes would be affected by the two methods of initiating the process. Obviously, since the site will have to run in Vue, having the registration process take 70 seconds when we know it could be done in only 7 seconds is unacceptable.
Note:
I realize it is the norm here in SO to include code for the process in question, but in this case, both processes are running the exact same code - which is why I decided not to post several hundred lines of code here.
Edit (in response to @Iain Shelvington):
The process starts in the post() method of TenantRegister view:
class TenantRegister(APIView):
    def post(self, request, *args, **kwargs):
        ...
        tenant_data = request.data.pop('tenant', dict())
        tenant_serializer = TenantSaveSerializer(data=tenant_data)
        tenant_serializer.is_valid(raise_exception=True)
        tenant = tenant_serializer.create(tenant_serializer.validated_data)
        ...
...which calls the create() method of TenantSaveSerializer:
class TenantSaveSerializer(serializers.ModelSerializer):
    class Meta:
        model = Tenant
        fields = '__all__'

    def create(self, validated_data):
        ...
        tenant = Tenant.objects.create(**validated_data)
        ...
        if has_schema and tenant.auto_create_schema:
            try:
                tenant.create_schema(check_if_exists=True, verbosity=self.verbosity)
                post_schema_sync.send(sender=Tenant, tenant=tenant)
            except Exception:
                # We failed creating the schema, delete what
                # was created and re-raise the exception.
                tenant.delete(force_drop=True)
                raise
        else:
            # Although we are not using the schema functions directly,
            # the signal might be registered by a listener.
            schema_needs_to_be_sync.send(sender=Tenant, tenant=self)
        return tenant
...which calls the create_schema() method on the Tenant model instance:
def create_schema(self, check_if_exists=False, sync_schema=True,
                  verbosity=1):
    connection = connections[get_tenant_database_alias()]
    cursor = connection.cursor()

    # Create the schema.
    cursor.execute('CREATE SCHEMA "%s"' % self.schema_name)
    call_command(
        'migrate_schemas',
        tenant=True,
        schema_name=self.schema_name,
        interactive=False,
        verbosity=verbosity)
    connection.set_schema_to_public()
    return True
As for the timing of each migration, my colleague measured those. I believe he said he just set verbosity to a higher value and the migrate_schemas process printed the timed output.
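For reference, per-step numbers like the ones above can also be collected with a simple timing context manager wrapped around each registration step. This is only an illustrative sketch; the step functions named in the comments are placeholders, not the actual code:
import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    """Print how long the wrapped block took, in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        print(f"{label}: {time.perf_counter() - start:.3f}s")

# Hypothetical usage inside the registration view or management command:
# with timed("Create Tenant"):
#     create_tenant_and_schema(tenant_data)
# with timed("Sync Permissions"):
#     sync_permissions(tenant)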

query to give workflow statistics like source count, target count, start time and end time of each session

I have one workflow which contains five sessions. I am looking for a query using the Informatica repository tables/views which gives me output like the one below. I have not been able to come up with a query that gives the desired result.
workflow-name | session-name | source-count | target-count | session-start-time | session-end-time
If you have access to the repository metadata tables, then you can use the query below.
Metadata Tables used in query:
OPB_SESS_TASK_LOG
OPB_TASK_INST_RUN
OPB_WFLOW_RUN
Here the Repository user is INFA_REP, and workflow name is wf_emp_load.
SELECT w.WORKFLOW_NAME,
       t.INSTANCE_NAME,
       s.SRC_SUCCESS_ROWS,
       s.TARG_SUCCESS_ROWS,
       t.START_TIME,
       t.END_TIME
  FROM INFA_REP.OPB_SESS_TASK_LOG s
       INNER JOIN INFA_REP.OPB_TASK_INST_RUN t
          ON s.INSTANCE_ID = t.INSTANCE_ID
         AND s.WORKFLOW_RUN_ID = t.WORKFLOW_RUN_ID
       INNER JOIN INFA_REP.OPB_WFLOW_RUN w
          ON w.WORKFLOW_RUN_ID = t.WORKFLOW_RUN_ID
 WHERE w.WORKFLOW_RUN_ID =
       (SELECT MAX(WORKFLOW_RUN_ID)
          FROM INFA_REP.OPB_WFLOW_RUN
         WHERE WORKFLOW_NAME = 'wf_emp_load')
 ORDER BY t.START_TIME
Output
+---------------+---------------+------------------+-------------------+--------------------+--------------------+
| WORKFLOW_NAME | INSTANCE_NAME | SRC_SUCCESS_ROWS | TARG_SUCCESS_ROWS | START_TIME         | END_TIME           |
+---------------+---------------+------------------+-------------------+--------------------+--------------------+
| wf_emp_load   | s_emp_load    |               14 |                14 | 10-JUN-18 18:31:24 | 10-JUN-18 18:31:26 |
| wf_emp_load   | s_emp_revert  |               14 |                14 | 10-JUN-18 18:31:27 | 10-JUN-18 18:31:28 |
+---------------+---------------+------------------+-------------------+--------------------+--------------------+
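If you need these statistics programmatically rather than from a SQL client, the same query can be run from a short script. The sketch below assumes the repository database is Oracle and the cx_Oracle driver is installed; the credentials and DSN are placeholders:
import cx_Oracle

QUERY = """
SELECT w.WORKFLOW_NAME, t.INSTANCE_NAME,
       s.SRC_SUCCESS_ROWS, s.TARG_SUCCESS_ROWS,
       t.START_TIME, t.END_TIME
  FROM INFA_REP.OPB_SESS_TASK_LOG s
  JOIN INFA_REP.OPB_TASK_INST_RUN t
    ON s.INSTANCE_ID = t.INSTANCE_ID
   AND s.WORKFLOW_RUN_ID = t.WORKFLOW_RUN_ID
  JOIN INFA_REP.OPB_WFLOW_RUN w
    ON w.WORKFLOW_RUN_ID = t.WORKFLOW_RUN_ID
 WHERE w.WORKFLOW_RUN_ID = (SELECT MAX(WORKFLOW_RUN_ID)
                              FROM INFA_REP.OPB_WFLOW_RUN
                             WHERE WORKFLOW_NAME = :wf_name)
 ORDER BY t.START_TIME
"""

# Placeholder credentials and DSN for the repository database.
conn = cx_Oracle.connect("repo_user", "repo_password", "repo_host/orcl")
cur = conn.cursor()
cur.execute(QUERY, wf_name="wf_emp_load")
for row in cur:
    print(row)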

how to classify the whole data set in weka

I've got a supervised data set with 6836 instances, and I need to know the predictions of my model for all the instances, not only for a test set.
I followed a train-test (2/3-1/3) approach to get my TPR and FPR rates, and I've got the predictions for my test set (1/3), but I need the predictions for all 6836 instances.
How can I do it?
Thanks!
In the Classify tab in the Weka Explorer there should be a button that says 'More options...'. If you go in there, you should be able to output predictions as plain text. If you use cross-validation rather than a percentage split, you will get predictions for all instances in a table like this:
+-------+--------+-----------+-------+------------+
| inst# | actual | predicted | error | prediction |
+-------+--------+-----------+-------+------------+
|     1 | 2:no   | 1:yes     |   +   |      0.926 |
|     2 | 1:yes  | 1:yes     |       |      0.825 |
|     1 | 2:no   | 1:yes     |   +   |      0.636 |
|     2 | 1:yes  | 1:yes     |       |      0.808 |
|   ... | ...    | ...       |  ...  |        ... |
+-------+--------+-----------+-------+------------+
If you don't want to do cross-validation, you can also create a data set containing all your data (training + test) and supply it as the test set. Then you can go to 'More options...' and output the predictions as Campino already answered.

How to get a list of hosts connected to a mysql server

I am trying to get a list of hosts connected to a MySQL server. How can I get this?
What should I do after connecting to the MySQL server?
Code snippets will really help.
Also, what's the best API to use to connect to MySQL from C++?
One way you could do it is to execute the query show processlist, which will give you a table with Id, User, Host, db, Command, Time, State and Info columns. Remember that your show processlist query will be part of the output.
You can try this query: select distinct host from information_schema.processlist;
For example, there are multiple connections from 10.9.0.10 and one local connection.
mysql> select distinct host from information_schema.processlist;
+-----------------+
| host            |
+-----------------+
| 10.9.0.10:63668 |
| 10.9.0.10:63670 |
| 10.9.0.10:63664 |
| 10.9.0.10:63663 |
| 10.9.0.10:63666 |
| 10.9.0.10:63672 |
| 10.9.0.10:63665 |
| 10.9.0.10:63671 |
| 10.9.0.10:63669 |
| 10.9.0.10:63667 |
| localhost       |
|                 |
+-----------------+
12 rows in set (0,00 sec)
If you want only hosts (not different connections), you can try something like this: select distinct substring_index(host,':',1) from information_schema.processlist;
Example:
mysql> select distinct substring_index(host,':',1) from information_schema.processlist;
+-----------------------------+
| substring_index(host,':',1) |
+-----------------------------+
| 10.9.0.10                   |
| localhost                   |
|                             |
+-----------------------------+
3 rows in set (0,00 sec)
You can see that MySQL shows me one empty row; this is normal (I have a daemon process):
mysql> select distinct substring_index(host,':',1),`command` from information_schema.processlist;
+-----------------------------+---------+
| substring_index(host,':',1) | command |
+-----------------------------+---------+
| 10.9.0.10                   | Sleep   |
| localhost                   | Query   |
|                             | Daemon  |
+-----------------------------+---------+
You can remove it with where `command`!="Daemon" or where `host`!=''
And here is a good link with a query that also counts connections per host and shows which users are connected: http://blog.shlomoid.com/2011/08/how-to-easily-see-whos-connected-to.html
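The question mentions C++, but as a language-agnostic illustration, here is a minimal sketch of running the same processlist query from code, assuming the mysql-connector-python driver; the host and credentials are placeholders. (For C++, MySQL Connector/C++ lets you issue the same query.)
import mysql.connector  # pip install mysql-connector-python

# Placeholder connection parameters.
conn = mysql.connector.connect(host="127.0.0.1", user="root", password="secret")
cur = conn.cursor()

# Distinct client hosts, with the ephemeral port stripped and daemon rows excluded.
cur.execute(
    "SELECT DISTINCT SUBSTRING_INDEX(host, ':', 1) "
    "FROM information_schema.processlist "
    "WHERE host <> ''"
)
for (host,) in cur.fetchall():
    print(host)

cur.close()
conn.close()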