How can I load only changed data in PowerBI?

I have weekly user login data for a program. I need to set up my query to only append rows for the people who logged in this week. The query is pulled from a folder containing each week's export in a separate csv file.
Example of files with pull date, user name, and last login:
Pull date: 2019-08-09
Mufasa 08-08
Simba 08-08
Nala 08-07
Timon 07-15
Pumba 06-03
Pull date: 2019-08-16
Mufasa 08-14
Simba 08-13
Nala 08-12
Timon 07-15
Pumba 06-03
Pull date: 2019-08-23
Mufasa 08-23
Simba 08-13
Nala 08-12
Timon 07-15
Pumba 06-03
What I want to see in PowerBI is this:
Mufasa 08-08
Mufasa 08-14
Mufasa 08-23
Simba 08-08
Simba 08-13
Nala 08-07
Nala 08-12
Timon 07-15
Pumba 06-03
I don't think incremental refresh is what I want, but I'm willing to learn more. I want my appended query to contain only the rows shown above, not just to refresh faster. I may be misunderstanding how incremental refresh works, though.
Would a parameter on last login work, and if so, how would I set that up?
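To put it another way, the result I'm after is just the distinct user / last-login pairs across all of the weekly files. Outside Power BI, the same transformation would be something like this (a sketch assuming the exports contain only the plain user and last-login rows shown above, in a hypothetical weekly_exports folder):
# combine every weekly export and keep each distinct user + last-login row once
cat weekly_exports/*.csv | sort -u
In Power Query terms I assume that corresponds to appending the files and then removing duplicates over the user and last-login columns, rather than to incremental refresh.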

Related

Django refuses to use DB view

I have the following scenario:
I have defined the right view in the database, taking care that the view is named according to the django conventions
I have made sure that my model is not managed by django. The migration created is accordingly defined with managed=False
The DB view is working fine by itself.
When triggering the API endpoint, two strange things happen:
the request to the database fails with:
ERROR: relation "consumption_recentconsumption" does not exist at character 673
(I have logging enabled at the postgres level, and copy-pasting the exact same request into a db console client works, without modifications whatsoever)
the request to the DB gets retried lots of times (more than 30?). Why is this happening? Is there a django setting to control this? (I am sending the request to the API just once, manually with curl)
EDIT
This is my model:
class RecentConsumption(models.Model):
    name = models.CharField(max_length=100)
    ...

    class Meta:
        managed = False
This is the SQL statement, as generated by django and sent to the db:
SELECT "consumption_recentconsumption"."id", "consumption_recentconsumption"."name", ... FROM "consumption_recentconsumption" LIMIT 21;
As I mentioned, this fails through django, but works fine when run directly against the db.
EDIT2
Logs from postgres, when running the sql directly:
2018-12-13 11:12:02.954 UTC [66] LOG: execute <unnamed>: SAVEPOINT JDBC_SAVEPOINT_4
2018-12-13 11:12:02.955 UTC [66] LOG: execute <unnamed>: SELECT "consumption_recentconsumption"."id", "consumption_recentconsumption"."name", "consumption_recentconsumption"."date", "consumption_recentconsumption"."psc", "consumption_recentconsumption"."material", "consumption_recentconsumption"."system", "consumption_recentconsumption"."env", "consumption_recentconsumption"."objs", "consumption_recentconsumption"."size", "consumption_recentconsumption"."used", "consumption_recentconsumption"."location", "consumption_recentconsumption"."WWN", "consumption_recentconsumption"."hosts", "consumption_recentconsumption"."pool_name", "consumption_recentconsumption"."storage_name", "consumption_recentconsumption"."server" FROM "consumption_recentconsumption" LIMIT 21
2018-12-13 11:12:10.038 UTC [66] LOG: execute <unnamed>: RELEASE SAVEPOINT JDBC_SAVEPOINT_4
Logs from postgres when running through django (repeated more than 30 times):
2018-12-13 11:13:50.782 UTC [75] LOG: statement: SELECT "consumption_recentconsumption"."id", "consumption_recentconsumption"."name", "consumption_recentconsumption"."date", "consumption_recentconsumption"."psc", "consumption_recentconsumption"."material", "consumption_recentconsumption"."system", "consumption_recentconsumption"."env", "consumption_recentconsumption"."objs", "consumption_recentconsumption"."size", "consumption_recentconsumption"."used", "consumption_recentconsumption"."location", "consumption_recentconsumption"."WWN", "consumption_recentconsumption"."hosts", "consumption_recentconsumption"."pool_name", "consumption_recentconsumption"."storage_name", "consumption_recentconsumption"."server" FROM "consumption_recentconsumption" LIMIT 21
2018-12-13 11:13:50.783 UTC [75] ERROR: relation "consumption_recentconsumption" does not exist at character 673
2018-12-13 11:13:50.783 UTC [75] STATEMENT: SELECT "consumption_recentconsumption"."id", "consumption_recentconsumption"."name", "consumption_recentconsumption"."date", "consumption_recentconsumption"."psc", "consumption_recentconsumption"."material", "consumption_recentconsumption"."system", "consumption_recentconsumption"."env", "consumption_recentconsumption"."objs", "consumption_recentconsumption"."size", "consumption_recentconsumption"."used", "consumption_recentconsumption"."location", "consumption_recentconsumption"."WWN", "consumption_recentconsumption"."hosts", "consumption_recentconsumption"."pool_name", "consumption_recentconsumption"."storage_name", "consumption_recentconsumption"."server" FROM "consumption_recentconsumption" LIMIT 21
Answering myself, in case this helps somebody in the future.
I am running the CREATE VIEW command in PyCharm, which seems to use a transaction for all operations. That means that the view is available within the db session in PyCharm (since it uses the transaction for all requests), but not from outside. The django app is running in the console, and does not see the view.
The solution is just to commit the transaction in PyCharm, to make it fully visible.
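For instance, running the statement through psql, which commits each statement immediately by default, would have made the view visible to the Django app right away (the database name and SQL file below are placeholders, not from the original setup):
# apply the view definition from an autocommitting psql session instead of PyCharm's console
psql -d mydb -f create_recent_consumption_view.sql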
The final solution is to create the view via django migrations.

How to print initial letter of committer in git log?

I have prepared an alias to get a short log report in git
# excerpt from ~/.gitconfig
[alias]
lg = log --all --oneline --graph --decorate --pretty='%C(auto)%h %Cgreen%ai %C(reset)%C(auto)%s %d'
git lg generates one nice line per commit, but without information on the user:
* 623beff 2016-11-14 14:18:36 +0100 extended plotstyle option and automatic colors
But I want to see the initial letters of the committer's real name (the full name is sometimes too long) in each line:
* 623beff 2016-11-14 14:18:36 +0100 (J.S.) extended plotstyle option and automatic colors
How can I get this result?
There is a way to get at least the first letter of the first name, using %<(3,trunc)%cN:
git log --all --oneline --graph --decorate --pretty='%C(auto)%h %Cgreen%ai %C(reset)%C(auto)(%<(3,trunc)%cN) %s %d'
output:
* 8759307 2009-01-15 16:11:48 +0000 (S..) Remove spurious code trying to tag a branch root before the mark was created. (HEAD -> master, origin/master, origin/HEAD)
* 939f999 2008-12-11 13:41:37 +0000 (S..) When just writing output file, do not try to devise lock target with no repository.
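If you want proper initials such as (J.S.) instead of a truncated name, one option is to post-process the committer name outside of --pretty. This is only a sketch (it drops the colours, and uses %x09 tabs purely as field separators for awk):
git log --all --graph --pretty='%h %ai%x09%cN%x09%s%d' |
awk -F'\t' '{
    if (NF < 3) { print; next }          # graph-only connector lines: pass through
    n = split($2, parts, " ")            # split the committer name into words
    initials = ""
    for (i = 1; i <= n; i++) initials = initials substr(parts[i], 1, 1) "."
    print $1 " (" initials ") " $3       # reassemble the line with the initials
}'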

How to prefix a piece of text in github "git log" using a shell script

I need to turn the commit lines (the text) from the git log command into links in an email, so the recipient can click on a link and go directly to the change.
I receive a long list containing lines with the text:
commit some_long_string_of_hexadecimals
and I need to transform this into:
commit https://github.com/account/repo/commit/some_long_string_of_hexadecimals
The log I am receiving contains an arbitrary number of these entries, so I need the script to do this for every instance of (some_long_string_of_hexadecimals).
Here are a few example log statements:
commit a98a897a67896a987698a769786a987a6987697a6
Author: Some Person <some#email.com>
Date: Thu Sep 29 09:48:52 2016 +0200
long message describing change.
commit a98a897a67896a987698a769786a987a6987697a6
Author: Some Person <some#email.com>
Date: Thu Sep 29 09:48:52 2016 +0200
more description
I'd like it to look like this:
commit https://github.com/account/repo/commit/a98a897a67896a987698a769786a987a6987697a6
Author: Some Person <some#email.com>
Date: Thu Sep 29 09:48:52 2016 +0200
added handling of running tests from within a docker container
How do I achieve this using a shell command?
Thanks in advance.
awk '$1 == "commit" {$2 = "https://github.com/account/repo/commit/" $2} 1'
check if field 1 equals "commit"
if so, prepend to field 2
if line matched, print modified line, else print line as is
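For example, to rewrite a whole log and save it for the email (the account/repo part of the URL is a placeholder for your own repository):
git log | awk '$1 == "commit" {$2 = "https://github.com/account/repo/commit/" $2} 1' > log_with_links.txt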

CloudWatch logs acting weird

I have two log files with multi-line log statements. Both of them have the same datetime format at the beginning of each log statement. The configuration looks like this:
state_file = /var/lib/awslogs/agent-state
[/opt/logdir/log1.0]
datetime_format = %Y-%m-%d %H:%M:%S
file = /opt/logdir/log1.0
log_stream_name = /opt/logdir/logs/log1.0
initial_position = start_of_file
multi_line_start_pattern = {datetime_format}
log_group_name = my.log.group
[/opt/logdir/log2-console.log]
datetime_format = %Y-%m-%d %H:%M:%S
file = /opt/logdir/log2-console.log
log_stream_name = /opt/logdir/log2-console.log
initial_position = start_of_file
multi_line_start_pattern = {datetime_format}
log_group_name = my.log.group
The CloudWatch Logs agent is sending the log1.0 logs correctly to my log group on CloudWatch; however, it's not sending the logs for log2-console.log.
awslogs.log says:
2016-11-15 08:11:41,308 - cwlogs.push.batch - WARNING - 3593 - Thread-4 - Skip event: {'timestamp': 1479196444000, 'start_position': 42330916L, 'end_position': 42331504L}, reason: timestamp is more than 2 hours in future.
2016-11-15 08:11:41,308 - cwlogs.push.batch - WARNING - 3593 - Thread-4 - Skip event: {'timestamp': 1479196451000, 'start_position': 42331504L, 'end_position': 42332092L}, reason: timestamp is more than 2 hours in future.
The server time is correct, though. Another weird thing is that the line numbers mentioned in start_position and end_position do not exist in the actual log file being pushed.
Anyone else experiencing this issue?
I was able to fix this.
The state of awslogs was broken. The state is stored in a sqlite database in /var/awslogs/state/agent-state. You can access it via
sudo sqlite3 /var/awslogs/state/agent-state
sudo is needed to have write access.
List all streams with
select * from stream_state;
Look up your log stream and note the source_id which is part of a json data structure in the v column.
Then, list all records with this source_id (in my case it was 7675f84405fcb8fe5b6bb14eaa0c4bfd) in the push_state table
select * from push_state where k="7675f84405fcb8fe5b6bb14eaa0c4bfd";
The resulting record has a JSON data structure in the v column which contains a batch_timestamp. This batch_timestamp seems to be wrong: it was in the past, and any log entries more than 2 hours newer than it were not processed anymore.
The solution is to update this record. Copy the v column, replace the batch_timestamp with the current timestamp and update with something like
update push_state set v='... insert new value here ...' where k='7675f84405fcb8fe5b6bb14eaa0c4bfd';
Restart the service with
sudo /etc/init.d/awslogs restart
I hope it works for you!
We had the same issue, and the following steps fixed it.
If log groups are not updating with the latest events, run these steps (consolidated as shell commands below):
Stop the awslogs service.
Delete the file /var/awslogs/state/agent-state.
Update the /var/awslogs/etc/awslogs.conf configuration from hostname to instance ID, e.g.:
log_stream_name = {hostname} to log_stream_name = {instance_id}
Start the awslogs service.
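A rough consolidation of those steps as shell commands (paths and service name follow the default awslogs layout; adjust them if your installation differs):
sudo service awslogs stop
sudo rm /var/awslogs/state/agent-state
# switch the stream name from {hostname} to {instance_id} in the agent config
sudo sed -i 's/log_stream_name = {hostname}/log_stream_name = {instance_id}/' /var/awslogs/etc/awslogs.conf
sudo service awslogs start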
I was able to resolve this issue on Amazon Linux by:
sudo yum reinstall awslogs
sudo service awslogs restart
This method retained my config files in /var/awslogs/, though you may wish to back them up before a reinstall.
Note: In my troubleshooting, I had also deleted my Log Group via the AWS Console. The restart fully reloaded all historical logs, but at the present timestamp, which is of less value. I'm unsure whether deleting the Log Group was necessary for this method to work. You might want to look at setting the initial_position config to end_of_file before you restart.
I found the reason. The time zone in my Docker container was inconsistent with the time zone of my host computer. After setting the two time zones to be consistent, the problem was solved.

Git: Getting commits with a ticket number

Okay, so I'm trying to find out if a ticket has been included in my release branch. The tickets are all built out of a project id and an id number, e.g. (PRO-123). I've tried this command:
git log --date=short --format="%h: %ad (%cn) %s" --abbrev-commit --grep='[A-Z]+-[0-9]+'
But it's not returning anything. If I take away the --grep part, there are loads of matches to the pattern. For instance:
a6fdcd0: 2016-03-16 (ajfaraday) Merge remote-tracking branch 'origin/develop_5.2_customer' into release_5.2_customer
85d107a: 2016-03-16 (username) Merge pull request #477 from myapp/fix_CST-827_outline_method_in_use_check
6024bda: 2016-03-16 (Andrew Faraday) Merge pull request #473 from myapp/fix_CST-810_soap_container_create_bounds
eec2a61: 2016-03-16 (ajfaraday) added missing stubs
c03b3cb: 2016-03-15 (username) Merge pull request #472 from myapp/fix_CST-490_options_are_clickable_for_user_without_module_admin_rights
728539b: 2016-03-15 (username) Merge pull request #474 from myapp/fix_CST-873_hidden_error_on_pev_validation
4a11dd7: 2016-03-15 (username) Merge pull request #475 from myapp/fix_CST-854_copy_process_version_project_element_values
4a5af44: 2016-03-15 (ajfaraday) CST-854: fixed in-use check for methods
What am I doing wrong?
Okay, I think I've found the problem. It's a minor difference between regex flavours (I'm usually writing them in Ruby): git log's --grep uses basic regular expressions by default, where + is not a quantifier, so [A-Z]+ wasn't matching, while [A-Z]* works fine because * is special even in basic regexes. This line does what I wanted:
git log --date=short --format="%h: %ad (%cn) %s" --abbrev-commit --grep="[A-Z]*-[0-9]*"
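If you'd rather keep the + quantifier, git log also accepts --extended-regexp (or -E), which makes --grep use extended regular expressions:
git log --date=short --format="%h: %ad (%cn) %s" --abbrev-commit --extended-regexp --grep='[A-Z]+-[0-9]+'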