Increasing SQLite SELECT performance - c++

I have a program that does some math in an SQL query. There are hundreds of thousands of rows (device measurements) in an SQLite table, and using this query, the application breaks these measurements into groups of, for example, 10000 records and calculates the average for each group. It then returns the average value for each of these groups.
The query looks like this:
SELECT strftime('%s',Min(Stamp)) AS DateTimeStamp,
AVG(P) AS MeasuredValue,
((100 * (strftime('%s', [Stamp]) - 1334580095)) /
(1336504574 - 1334580095)) AS SubIntervalNumber
FROM LogValues
WHERE ((DeviceID=1) AND (Stamp >= datetime(1334580095, 'unixepoch')) AND
(Stamp <= datetime(1336504574, 'unixepoch')))
GROUP BY ((100 * (strftime('%s', [Stamp]) - 1334580095)) /
(1336504574 - 1334580095)) ORDER BY MIN(Stamp)
The numbers in this request are substituted by my application with some values.
I don't know if I can optimize this query any further (if anyone could help me do so, I'd really appreciate it).
This SQL query can be executed using the SQLite command line shell (sqlite3.exe). On my Intel Core i5 machine it takes 4 seconds to complete (there are 100000 records in the database that are being processed).
Now, if I write a C program using the sqlite3.h C interface, I wait 14 seconds for exactly the same query to complete. The program spends these 14 seconds in the first sqlite3_step() call (all following sqlite3_step() calls return immediately).
From the SQLite download page I downloaded the source code of the SQLite command line shell and built it using Visual Studio 2008. I ran it and executed the query. Again 14 seconds.
So why does the prebuilt command line tool downloaded from the SQLite website take only 4 seconds, while the same tool built by me takes more than three times as long?
I am running 64-bit Windows. The prebuilt tool is an x86 process. It also does not seem to be multicore optimized: in Task Manager, during query execution, I can see only one core busy for both my own build and the prebuilt SQLite shell.
Is there any way I could make my C program execute this query as fast as the prebuilt command line tool does?
Thanks!
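To make the bucketing in the query above easier to follow, here is a minimal sketch using Python's built-in sqlite3 module, with the numbers bound as named parameters instead of being substituted into the SQL text. The table layout (DeviceID, Stamp stored as text, P), the in-memory database, the sample rows, and the function name bucket_averages are assumptions for illustration. It only shows what the query computes and does not address the speed difference between the two builds.

```python
import sqlite3

# Same shape as the query in the question, with named parameters.
QUERY = """
SELECT strftime('%s', MIN(Stamp)) AS DateTimeStamp,
       AVG(P) AS MeasuredValue,
       (:buckets * (strftime('%s', Stamp) - :t0)) / (:t1 - :t0) AS SubIntervalNumber
FROM LogValues
WHERE DeviceID = :device
  AND Stamp >= datetime(:t0, 'unixepoch')
  AND Stamp <= datetime(:t1, 'unixepoch')
GROUP BY SubIntervalNumber
ORDER BY MIN(Stamp)
"""


def bucket_averages(conn, device, t0, t1, buckets):
    """Return (first_timestamp, average, bucket_number) for each sub-interval."""
    params = {"device": device, "t0": t0, "t1": t1, "buckets": buckets}
    return [(int(stamp), avg, bucket)
            for stamp, avg, bucket in conn.execute(QUERY, params)]


# Hypothetical sample data: five measurements inside a 100-second window.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LogValues (DeviceID INTEGER, Stamp TEXT, P REAL)")
t0 = 1334580095
samples = [(0, 1.0), (10, 3.0), (30, 5.0), (60, 7.0), (99, 9.0)]
conn.executemany(
    "INSERT INTO LogValues VALUES (1, datetime(?, 'unixepoch'), ?)",
    [(t0 + offset, value) for offset, value in samples],
)

rows = bucket_averages(conn, device=1, t0=t0, t1=t0 + 100, buckets=4)
for row in rows:
    print(row)
```

Because the division is an integer division, a row stamped exactly at the end of the window lands in an extra bucket numbered `buckets`.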

Related

HTCondor - Partitionable slot not working

I am following the tutorial from the Center for High Throughput Computing and the Introduction to Configuration on the HTCondor website to set up a partitionable slot. Before any configuration I run
condor_status
and get the following output.
I update the file 00-minicondor in /etc/condor/config.d by adding the following lines at the end of the file.
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=4
SLOT_TYPE_1_PARTITIONABLE = TRUE
and reconfigure
sudo condor_reconfig
Now with
condor_status
I get this output as expected. Now, I run the following command to check everything is fine
condor_status -af Name Slotype Cpus
and find slot1#ip-172-31-54-214.ec2.internal undefined 1 instead of slot1#ip-172-31-54-214.ec2.internal Partitionable 4 61295, which is what I would expect. Moreover, when I try to submit a job that asks for more than 1 CPU, it does not allocate space for it as it should (the job stays waiting forever).
I don't know if I made some mistake during the installation process or what could be happening. I would really appreciate any help!
EXTRA INFO: In case it helps, I installed HTCondor with the command
curl -fsSL https://get.htcondor.org | sudo /bin/bash -s -- --no-dry-run
on Ubuntu 18.04 running on an old p2.xlarge instance (it has 4 cores).
UPDATE: After rebooting the whole thing it seems to be working. I can now send jobs with different CPUs requests and it will start them properly.
The only issue I would say persists is that Memory allocation is not showing properly, for example:
But in reality it is allocating enough memory for the job (in this case around 12 GB).
If I run again
condor_status -af Name Slotype Cpus
I still get something I am not supposed to
But at least it is showing the correct number of CPUs (even if it just says undefined).
What is the output of condor_q -better when the job is idle?

Parallel run on multiple devices using Appium with Python

Please help me work out how to run a single script on multiple devices in parallel using Python.
I have started two different Appium servers using Selenium Grid, but I'm not able to write the code to start the different drivers on two devices and run the script in parallel using Python.
It's better to prepare a separate file for the values you are going to use, and a separate code file where you define your test cases and keywords.
Here is an example of a values file:
Device:
  SamsungA7:
    Device_name: 111354d3  # device ID
    server: http://localhost:4723/wd/hub  # Appium server URL
    appPackage: com.android.contacts  # package name of your application
    appActivity: com.android.contacts.activities.PeopleActivity  # activity of your application
    platfrom: 6.0  # platform version of your device
    automation: Appium  # Appium is used in automationName instead of Uiautomator for devices on Android version 4.4
Here is an example for the Code file:
* Settings *
Test Setup Sum of two numbers a+b
Test Teardown Set the Default Values
Suite setup Set value
* Variables *
Default Values
Value of A 1
Value of B 1
* Test Cases *
[Setup] Sum of first two numbers should be 6
Enter first value 5
Enter second value 1
5+1
* Test Cases *
[Setup] Sum of Second two numbers should be 11
Enter sum of first value 6
Enter second value 5
6+5
* Keywords *
Test TearDown
Set the Default values
Note: The code file should be in the .robot format, and you can save the values file as YAML, plain text (Notepad), or JSON.
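For the parallel part of the question, here is a minimal sketch in plain Python of running one script against several devices at the same time with a thread pool. It deliberately does not call the Appium client: make_driver is a hypothetical factory that, in a real run, would open an Appium session against config["server"], and FakeDriver, SecondDevice, and its port are stand-ins so the sketch runs without any devices attached.

```python
from concurrent.futures import ThreadPoolExecutor

# Device entries modeled on the values file above; the second one is hypothetical.
DEVICES = {
    "SamsungA7": {"server": "http://localhost:4723/wd/hub", "Device_name": "111354d3"},
    "SecondDevice": {"server": "http://localhost:4724/wd/hub", "Device_name": "hypothetical-id"},
}


def run_on_devices(devices, make_driver, script):
    """Run `script(driver)` once per device, each in its own thread."""
    def worker(item):
        name, config = item
        driver = make_driver(config)   # one driver (session) per device
        try:
            return name, script(driver)
        finally:
            driver.quit()              # always release the session

    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        return dict(pool.map(worker, devices.items()))


class FakeDriver:
    """Stand-in for a real driver so the example is self-contained."""
    def __init__(self, config):
        self.server = config["server"]

    def quit(self):
        pass


results = run_on_devices(DEVICES, FakeDriver, lambda driver: "ran against " + driver.server)
print(results)
```

Each device needs its own Appium server (or its own port), which is why every entry carries a separate server URL.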

While loop implementation in Pentaho Kettle

I need guidance on implementing a WHILE loop with Kettle/PDI. The scenario is:
(1) I have some data in a table (maybe thousands or even millions of rows) to be validated against a remote server.
(2) Read the rows and look them up on the remote server. I use a Modified Java Script step for this, as the remote server lookup validation is defined in an external Java JAR file. (I can use the "Change number of copies to start..." option on the Modified Java Script step and set it to 5 or 10.)
(3) Update the result in the database table. There will be 50 to 60% connection failures in each session.
(4) Repeat steps 1 to 3 until all rows are updated to success.
(5) Stop looping on the Nth cycle, to avoid very long or infinite looping; N may be 5 or 10.
How do I design such a WHILE loop in Pentaho Kettle?
Have you seen this link? It gives a pretty well detailed explanation of how to implement a while loop.
You need a parent job with a sub-transformation that checks the condition and returns a variable to the job indicating whether to abort or to continue.
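The control flow that the parent job has to implement can be sketched outside Kettle. The Python below is not PDI code; it only illustrates the loop condition from the question (repeat until every row succeeds or N cycles have run). The names validate_until_done and flaky_validate and the failure pattern are assumptions for illustration.

```python
from collections import Counter


def validate_until_done(pending, validate, max_cycles):
    """Retry failed rows until none are left or max_cycles is reached."""
    cycles = 0
    while pending and cycles < max_cycles:
        # Keep only the rows whose validation failed in this cycle.
        pending = [row for row in pending if not validate(row)]
        cycles += 1
    return pending, cycles


# Hypothetical remote check: row r succeeds on its (r % 3 + 1)-th attempt,
# which imitates intermittent connection failures deterministically.
attempts = Counter()


def flaky_validate(row):
    attempts[row] += 1
    return attempts[row] > row % 3


remaining, cycles = validate_until_done(list(range(6)), flaky_validate, max_cycles=5)
print(remaining, cycles)
```

In Kettle terms, the list of pending rows corresponds to the rows still not marked as success in the table, and the cycle counter corresponds to the variable the job checks before starting the transformation again.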

Limiting the lifetime of a file in Python

Hello there,
I'm looking for a way to limit the lifetime of a file in Python, that is, to make a file which will be automatically deleted 5 minutes after creation.
Problem:
I have a Django based webpage which has a service that generates plots (from user submitted input data) which are shown on the web page as .png images. The images get stored on the disk upon creation.
Image files are created on a per-session basis, should only be available for a limited time after the user has seen them, and should be deleted 5 minutes after they have been created.
Possible solutions:
I've looked at Python's tempfile module, but that is not what I need, because the user should be able to return to the page containing the image without waiting for it to be generated again. In other words, the file shouldn't be destroyed as soon as it is closed.
The other way that comes to mind is to call some sort of external bash script which would delete files older than 5 minutes.
Does anybody know a preferred way of doing this?
Ideas can also include changing the logic of showing/generating the image files.
You should write a Django custom management command to delete old files that you can then call from cron.
If you want no files older than 5 minutes, then you need to call it every 5 minutes of course. And yes, it would run unnecessarily when there are no users, but that shouldn't worry you too much.
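As a rough sketch of what such a command could do on each run, the function below deletes every regular file in a directory whose modification time is more than 5 minutes old. The function name purge_old_files, the file names, and the temporary directory used for the demonstration are assumptions; in a Django project the function would presumably be called from the management command's handle() method.

```python
import os
import tempfile
import time
from pathlib import Path


def purge_old_files(directory, max_age_seconds=300, now=None):
    """Delete regular files older than max_age_seconds; return their names."""
    now = time.time() if now is None else now
    removed = []
    for path in Path(directory).iterdir():
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)


# Demonstration in a throwaway directory: one stale plot, one fresh plot.
with tempfile.TemporaryDirectory() as plot_dir:
    old, fresh = Path(plot_dir, "old.png"), Path(plot_dir, "fresh.png")
    old.write_bytes(b"")
    fresh.write_bytes(b"")
    ten_minutes_ago = time.time() - 600
    os.utime(old, (ten_minutes_ago, ten_minutes_ago))  # backdate the stale file

    removed = purge_old_files(plot_dir)
    remaining = sorted(p.name for p in Path(plot_dir).iterdir())

print(removed, remaining)
```

With a 5-minute cron interval, a file can live for up to about 10 minutes before it is removed, so run the job more often if the limit has to be tighter.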
OK, that might be a good approach, I guess.
You can write a script that checks your directory, deletes outdated files, and picks the oldest of the remaining files. Calculate how much time has passed since that file was created, and from that the time remaining until it must be deleted. Then call sleep with that remaining time. When the sleep ends and the next loop iteration begins, there will be at least one file to delete. If there are no files in the directory, set the sleep time to 5 minutes.
That way you ensure that each file is deleted almost exactly 5 minutes after creation, but when lots of files are created in quick succession, the sleep time decreases greatly and your function checks the directory more and more often. To avoid that, add some latency to the sleep before starting the next loop; for example, if the oldest file is 4 minutes old, you can sleep for 60+30 seconds (adding 30 seconds to every calculated sleep time).
An example:
import os
import time

DIRECTORY = '/path/to/your/directory'
MAX_AGE = 300  # seconds (5 minutes)

def clearDirectory():
    while True:
        _time_list = []
        _now = time.time()
        for _name in os.listdir(DIRECTORY):
            _f = os.path.join(DIRECTORY, _name)  # listdir returns bare names
            if os.path.isfile(_f):
                _f_time = os.path.getmtime(_f)  # last modification time
                if _now - _f_time >= MAX_AGE:
                    os.remove(_f)  # delete outdated file
                else:
                    _time_list.append(_f_time)  # remember the time of a surviving file
        # after checking all files, sleep until the oldest surviving file expires;
        # if no files are left, sleep for the full 5 minutes
        _sleep_time = MAX_AGE - (_now - min(_time_list)) if _time_list else MAX_AGE
        time.sleep(_sleep_time)
But as I said, if files are created often, it is better to add some latency to the sleep time:
time.sleep(_sleep_time + 30) # sleep 30 seconds more so some other files might be outdated during that time too...
Also, read the documentation of os.path.getmtime for details.

Programmatically getting system boot up time in c++ (windows)

So quite simply, the question is how to get the system boot up time in windows with c/c++.
Searching for this hasn't got me any answer, I have only found a really hacky approach which is reading a file timestamp ( needless to say, I abandoned reading that halfway ).
Another approach that I found was actually reading windows diagnostics logged events? Supposedly that has last boot up time.
Does anyone know how to do this (with hopefully not too many ugly hacks)?
GetTickCount64 "retrieves the number of milliseconds that have elapsed since the system was started."
Once you know how long the system has been running, it is simply a matter of subtracting this duration from the current time to determine when it was booted. For example, using the C++11 chrono library (supported by Visual C++ 2012):
auto uptime = std::chrono::milliseconds(GetTickCount64());
auto boot_time = std::chrono::system_clock::now() - uptime;
You can also use WMI to get the precise time of boot. WMI is not for the faint of heart, but it will get you what you are looking for.
The information in question is on the Win32_OperatingSystem object under the LastBootUpTime property. You can examine other properties using WMI Tools.
Edit:
You can also get this information from the command line if you prefer.
wmic OS Get LastBootUpTime
As an example in C# it would look like the following (Using C++ it is rather verbose):
using System.Management;  // requires a reference to the System.Management assembly

static void Main(string[] args)
{
    // Create a query for OS objects
    SelectQuery query = new SelectQuery("Win32_OperatingSystem", "Status=\"OK\"");
    // Initialize an object searcher with this query
    ManagementObjectSearcher searcher = new ManagementObjectSearcher(query);
    string dtString;
    // Get the resulting collection and loop through it
    foreach (ManagementObject envVar in searcher.Get())
        dtString = envVar["LastBootUpTime"].ToString();
}
The "System Up Time" performance counter on the "System" object is another source. It's available programmatically using the PDH Helper methods. It is, however, not robust to sleep/hibernate cycles so is probably not much better than GetTickCount()/GetTickCount64().
Reading the counter returns a 64-bit FILETIME value, the number of 100-ns ticks since the Windows epoch (1601-01-01 00:00:00 UTC). You can also see the value the counter returns by reading the WMI table that exposes the raw values used to compute it. (Read it programmatically using COM, or from the command line with wmic:)
wmic path Win32_PerfRawData_PerfOS_System get systemuptime
That query produces 132558992761256000 for me, corresponding to Saturday, January 23, 2021 6:14:36 PM UTC.
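The conversion from that raw value to a calendar date is plain arithmetic on the 100-ns tick count. A minimal sketch in Python (the function name filetime_to_datetime is an assumption for illustration):

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-ns ticks since 1601-01-01 00:00:00 UTC.
WINDOWS_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(ticks):
    """Convert a 64-bit FILETIME tick count to an aware UTC datetime."""
    # 10 ticks per microsecond; integer division drops the last 100 ns digit.
    return WINDOWS_EPOCH + timedelta(microseconds=ticks // 10)


boot_time = filetime_to_datetime(132558992761256000)
print(boot_time.isoformat())
```

Integer arithmetic is used on purpose: the tick count has more significant digits than a float can represent exactly.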
You can use the PerfFormattedData equivalent to get a floating point number of seconds, or read that from the command line in wmic or query the counter in PowerShell:
Get-Counter -Counter '\system\system up time'
This returns an uptime of 427.0152 seconds.
I also implemented each of the other 3 answers and have some observations that may help those trying to choose a method.
Using GetTickCount64 and subtracting from current time
The fastest method, clocking in at 0.112 ms.
Does not produce a unique/consistent value at the 100-ns resolution of its arguments, as it is dependent on clock ticks. Returned values are all within 1/64 of a second of each other.
Requires Vista or newer. If your application/library must support older Windows versions, XP's 32-bit counter (GetTickCount) rolls over at ~49 days and can't be used for this approach.
Using WMI query of the LastBootUpTime field of Win32_OperatingSystem
Took 84 ms using COM, 202 ms using wmic command line.
Produces a consistent value as a CIM_DATETIME string
WMI class requires Vista or newer.
Reading Event Log
The slowest method, taking 229 ms
Produces a consistent value in units of seconds (Unix time)
Works on Windows 2000 or newer.
As pointed out by Jonathan Gilbert in the comments, is not guaranteed to produce a result.
The methods also produced different timestamps:
UpTime: 1558758098843 = 2019-05-25 04:21:38 UTC (sometimes :37)
WMI: 20190524222528.665400-420 = 2019-05-25 05:25:28 UTC
Event Log: 1558693023 = 2019-05-24 10:17:03 UTC
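To compare the three raw values on a common scale, each one can be converted to UTC. The sketch below assumes the CIM_DATETIME layout yyyymmddHHMMSS.mmmmmm followed by a sign and a three-digit UTC offset in minutes; the helper names from_unix_ms and from_cim_datetime are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_unix_ms(milliseconds):
    """Unix time in milliseconds to an aware UTC datetime."""
    return UNIX_EPOCH + timedelta(milliseconds=milliseconds)


def from_cim_datetime(value):
    """Parse a CIM_DATETIME string such as 20190524222528.665400-420."""
    stamp, sign, offset = value[:21], value[21], value[22:]
    local = datetime.strptime(stamp, "%Y%m%d%H%M%S.%f")
    minutes = int(offset) if sign == "+" else -int(offset)
    local = local.replace(tzinfo=timezone(timedelta(minutes=minutes)))
    return local.astimezone(timezone.utc)


uptime_boot = from_unix_ms(1558758098843)              # GetTickCount64 method
wmi_boot = from_cim_datetime("20190524222528.665400-420")  # WMI method
event_log_boot = from_unix_ms(1558693023 * 1000)       # Event Log method (seconds)

for label, value in (("UpTime", uptime_boot), ("WMI", wmi_boot), ("Event Log", event_log_boot)):
    print(label, value.isoformat())
```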
Conclusion:
The Event Log method is compatible with older Windows versions and produces a consistent timestamp in Unix time that's unaffected by sleep/hibernate cycles, but it is also the slowest. Given that this is unlikely to be run in a loop, this may be an acceptable performance impact. However, using this approach still requires handling the situation where the Event Log reaches capacity and deletes older messages, potentially using one of the other options as a backup.
C++ Boost used to use WMI LastBootUpTime but switched, in version 1.54, to checking the system event log, and apparently for a good reason:
ABI breaking: Changed bootstamp function in Windows to use EventLog service start time as system bootup time. Previously used LastBootupTime from WMI was unstable with time synchronization and hibernation and unusable in practice. If you really need to obtain pre Boost 1.54 behaviour define BOOST_INTERPROCESS_BOOTSTAMP_IS_LASTBOOTUPTIME from command line or detail/workaround.hpp.
Check out boost/interprocess/detail/win32_api.hpp, around line 2201, the implementation of the function inline bool get_last_bootup_time(std::string &stamp) for an example. (I'm looking at version 1.60, if you want to match line numbers.)
Just in case Boost ever dies somehow and my pointing you to Boost doesn't help (yeah right), the function you'll want is mainly ReadEventLogA and the event ID to look for ("Event Log Started" according to Boost comments) is apparently 6005.
I haven't played with this much, but I personally think the best way is probably going to be to query the start time of the "System" process. On Windows, the kernel allocates a process on startup for its own purposes (surprisingly, a quick Google search doesn't easily uncover what its actual purposes are, though I'm sure the information is out there). This process is called simply "System" in the Task Manager, and always has PID 4 on current Windows versions (apparently NT 4 and Windows 2000 may have used PID 8 for it). This process never exits as long as the system is running, and in my testing behaves like a full-fledged process as far as its metadata is concerned. From my testing, it looks like even non-elevated users can open a handle to PID 4, requesting PROCESS_QUERY_LIMITED_INFORMATION, and the resulting handle can be used with GetProcessTimes, which will fill in the lpCreationTime with the UTC timestamp of the time the process started. As far as I can tell, there isn't any meaningful way in which Windows is running before the System process is running, so this timestamp is pretty much exactly when Windows started up.
#include <iostream>
#include <iomanip>
#include <memory>       // unique_ptr
#include <type_traits>  // remove_pointer
#include <windows.h>

using namespace std;

int main()
{
    // Open the "System" process (PID 4); the handle is closed automatically.
    unique_ptr<remove_pointer<HANDLE>::type, decltype(&::CloseHandle)> hProcess(
        ::OpenProcess(
            PROCESS_QUERY_LIMITED_INFORMATION,
            FALSE, // bInheritHandle
            4),    // dwProcessId
        ::CloseHandle);

    FILETIME creationTimeStamp, exitTimeStamp, kernelTimeUsed, userTimeUsed;
    FILETIME creationTimeStampLocal;
    SYSTEMTIME creationTimeStampSystem;

    if (::GetProcessTimes(hProcess.get(), &creationTimeStamp, &exitTimeStamp, &kernelTimeUsed, &userTimeUsed)
        && ::FileTimeToLocalFileTime(&creationTimeStamp, &creationTimeStampLocal)
        && ::FileTimeToSystemTime(&creationTimeStampLocal, &creationTimeStampSystem))
    {
        // Full 100-ns tick count, used for the fractional seconds below
        __int64 ticks =
            ((__int64)creationTimeStampLocal.dwHighDateTime) << 32 |
            creationTimeStampLocal.dwLowDateTime;

        wios saved(NULL);
        saved.copyfmt(wcout);

        // setw applies only to the next field, so it is repeated for each one
        wcout << setfill(L'0')
            << setw(4) << creationTimeStampSystem.wYear << L'-'
            << setw(2) << creationTimeStampSystem.wMonth << L'-'
            << setw(2) << creationTimeStampSystem.wDay << L' '
            << setw(2) << creationTimeStampSystem.wHour << L':'
            << setw(2) << creationTimeStampSystem.wMinute << L':'
            << setw(2) << creationTimeStampSystem.wSecond << L'.'
            << setw(7) << (ticks % 10000000)
            << endl;

        wcout.copyfmt(saved);
    }
}
Comparison for my current boot:
system_clock::now() - milliseconds(GetTickCount64()):
2020-07-18 17:36:41.3284297
2020-07-18 17:36:41.3209437
2020-07-18 17:36:41.3134106
2020-07-18 17:36:41.3225148
2020-07-18 17:36:41.3145312
(result varies from call to call because system_clock::now() and ::GetTickCount64() don't run at exactly the same time and don't have the same precision)
wmic OS Get LastBootUpTime
2020-07-18 17:36:41.512344
Event Log
No result because the event log entry doesn't exist at this time on my system (earliest event is from July 23)
GetProcessTimes on PID 4:
2020-07-18 17:36:48.0424863
It's a few seconds different from the other methods, but I can't think of any way that it is wrong per se, because, if the System process wasn't running yet, was the system actually booted?