cfthread with heavy DB activity - ColdFusion

When we enter a ticket in the system, there is an insert into a group of 5-6 tables (based on some rules). That works fine when a single ticket is added manually.
Now we have a requirement where we will get the ticket feed from an external source (most probably an XML or maybe a TXT file). It can have any number of tickets, up to 500.
If I have to leverage cfthread, say 10 threads of 50 tickets each, how can I prevent it from causing issues on the DB? Ultimately, every one of those tickets will be inserting data into the same DB tables.
As each thread will be completely independent, wouldn't it create a queue on the DB (and maybe a deadlock too)?
Environment: CF2016, SQL Server 2014

Related

How to achieve consistent read across multiple SELECT using AWS RDS DataService (Aurora Serverless)

I'm not sure how to achieve consistent reads across multiple SELECT queries.
I need to run several SELECT queries and make sure that, between them, no UPDATE, DELETE or CREATE has altered the overall consistency. The best case for me would be something non-blocking, of course.
I'm using MySQL 5.6 with InnoDB and the default REPEATABLE READ isolation level.
The problem is that when I'm using the RDS DataService beginTransaction with several executeStatement calls (with the provided transactionId), I'm NOT getting the full result at the end when calling commitTransaction.
The commitTransaction only provides me with a { transactionStatus: 'Transaction Committed' }.
I don't understand, isn't the commitTransaction function supposed to give me the whole (of my many SELECT) dataset result?
Instead, even with a transactionId, each executeStatement is returning me an individual result... This behaviour is obviously NOT consistent.
With SELECTs in one transaction under REPEATABLE READ you should see the same data and not see any changes made by other transactions. Yes, data can be modified by other transactions, but while inside a transaction you operate on a read view and can't see those changes. So it is consistent.
To make sure that no data is actually changed between the SELECTs, the only way is to lock the tables/rows, e.g. with SELECT ... FOR UPDATE - but that should not be necessary in your case.
Transactions should be short and fast, and locking tables / preventing updates while some long-running chain of SELECTs runs is obviously not an option.
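For illustration, here is a minimal sketch of that locking approach (the orders table and its columns are made up, not from the question); FOR UPDATE keeps the selected rows locked until the transaction ends:

-- Sketch only: table and column names are hypothetical.
START TRANSACTION;

-- Locks the matching rows; other transactions that try to UPDATE, DELETE
-- or SELECT ... FOR UPDATE the same rows will block until this COMMIT.
SELECT id, status
FROM orders
WHERE customer_id = 42
FOR UPDATE;

-- ... further SELECTs / UPDATEs against the locked rows ...

COMMIT; -- releases the row locks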
Queries issued against the database run at the time they are issued. The results of queries stay uncommitted until commit. A query may be blocked if it targets a resource another transaction has acquired a lock for. A query may fail if another transaction modified a resource, resulting in a conflict.
Transaction isolation determines how the effects of this and other transactions happening at the same moment are handled (see the Wikipedia article on isolation levels).
With isolation level REPEATABLE READ (which, by the way, Aurora Replicas for Aurora MySQL always use for operations on InnoDB tables) you operate on a read view of the database and see only data committed before the BEGIN of the transaction.
This means that SELECTs in one transaction will see the same data, even if changes were committed by other transactions in the meantime.
By comparison, with transaction isolation level READ COMMITTED, subsequent SELECTs in one transaction may see different data that was committed between them by other transactions.
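As a minimal sketch of that read-view behaviour (the tickets table is made up): every SELECT inside the transaction reads from the same snapshot, so the second count matches the first even if another session commits an insert in between.

-- Session A (REPEATABLE READ is the InnoDB default)
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;

SELECT COUNT(*) FROM tickets;   -- returns, say, 100

-- Meanwhile, session B runs and commits:
--   INSERT INTO tickets (...) VALUES (...);

SELECT COUNT(*) FROM tickets;   -- still 100: same read view

COMMIT;

SELECT COUNT(*) FROM tickets;   -- now 101, a new snapshot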

API Gateway generating 11 SQL queries per second on REG_LOG

We have sysdig running on our WSO2 API Gateway machine and we notice that it fires a large number of SQL queries to the database for a minute, then waits a minute and repeats.
Every minute it goes wild, then waits a minute and goes wild again with a query of the following format:
SELECT REG_PATH, REG_USER_ID, REG_LOGGED_TIME, REG_ACTION, REG_ACTION_DATA
FROM REG_LOG
WHERE REG_LOGGED_TIME>'2016-02-29 09:57:54'
AND REG_LOGGED_TIME<'2016-03-02 11:43:59.959' AND REG_TENANT_ID=-1234
There is no load on the server. What is causing this? What can we do to avoid this?
[Screenshot: sysdig output for the API Gateway process]
This particular query is the result of the registry indexing task that runs in the background. The REG_LOG table is queried periodically to retrieve the latest registry actions. The indexing task cannot be stopped; however, you can configure its frequency through the following parameter in registry.xml. See [1] for more information.
indexingFrequencyInSeconds
If this table has filled up, you can clean the data with a simple SQL query. However, when deleting the records, be careful not to delete all the data: the latest record for each resource path should be left in the REG_LOG table, since reindexing requires at least one reference to each resource path.
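As an illustration only (not an official WSO2 cleanup script), something along these lines keeps the newest REG_LOG row per resource path and deletes the rest. It assumes the default schema's REG_LOG_ID column, so verify the column names against your database and take a backup first:

-- Illustrative sketch; back up REG_LOG before running anything like this.
-- Keeps the most recent entry per REG_PATH and deletes the older ones.
-- In a multi-tenant setup, group by REG_PATH and REG_TENANT_ID instead.
DELETE FROM REG_LOG
WHERE REG_LOG_ID NOT IN (
    SELECT keep_id FROM (
        SELECT MAX(REG_LOG_ID) AS keep_id
        FROM REG_LOG
        GROUP BY REG_PATH
    ) latest
);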
Also, if required, you can take a dump of the data before clearing up the REG_LOG table, in case you do not want to lose old records. Hope this answer provides the information you require.
[1] - https://docs.wso2.com/display/Governance510/Configuration+for+Indexing

z/OS CICS DB2 COBOL program to process database entries concurrently

I have a DB2 table containing a large number of records to be sent out to an external system via MQ. There is a column in the table containing the record status (sent or pending to be sent).
I wrote a scheduler program to continually check whether there are records in the table that are "pending to be sent". If so, the program sends the pending records out and updates the status accordingly.
That scheduler will be started in multiple transactions, so I am expecting multiple instances of the same program to be running concurrently.
My question is: how do I prevent the same records from being picked up and sent by multiple schedulers at the same time?
I was told to use a cursor with row-level locks, but I am not sure how this works.
Remarks: I am working with CICS COBOL in a z/OS environment.
I think you have a design problem. We accomplish something similar to what you are trying to do by having a trigger on the DB2 table which sends an MQ message to a queue that is defined to trigger a CICS transaction.
In your case, you can probably dispense with CICS altogether and just do as #BillWoodger suggests: send the message when you set the pending flag.
One way to do this is as follows:
1) Determine the clustering index for the large DB2 table.
2) Have the different instances of the program each look at a different portion of this clustering index. E.g. if the clustering index is on a numeric ID field that is unique, like Account ID, and the ID size is Integer 9, then have instance one look at account ID ranges from 000000000 to 099999999, instance two look at account ID ranges from 100000000 to 199999999, and so on.
This way you can declare your cursor WITH HOLD and perform updates and commits as needed.
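As a rough sketch of that range-splitting idea (TICKET_TABLE, ACCOUNT_ID, PAYLOAD and the 'P' status value are hypothetical names, not from the question), each program instance would open the same cursor with a different ID range:

-- Instance 1 covers 0-99999999; instance 2 would use
-- 100000000-199999999, and so on.
DECLARE PENDING_CUR CURSOR WITH HOLD FOR
    SELECT ACCOUNT_ID, PAYLOAD
    FROM TICKET_TABLE
    WHERE STATUS = 'P'
      AND ACCOUNT_ID BETWEEN 0 AND 99999999
    FOR UPDATE OF STATUS;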
CICS will coordinate SQL transactions with DB2 for you. Each of the CICS transactions you run will be able to select and lock rows for update, and DB2 can coordinate between all of them and prevent multiple transactions from picking up the same records if you do two things.
When you read the rows that qualify, use a SELECT ... FOR UPDATE type operation; this will lock every row you retrieve and prevent other concurrent transactions from accessing the same ones (it also requires you to BIND with row-level locks unless you want full pages locked; see your DBA about the options based on row size).
Before you release the records or end the CICS transaction, you must do something to flag those records as "sent" so that other, waiting, concurrent transactions do not grab them and send them again. This could be as simple as adding a sent Y/N column to the table and adding "AND sent <> 'Y'" to your SELECT's WHERE clause. After you have sent the records, do an UPDATE on those records and set sent = 'Y'. Depending on your row data, you could perhaps use something else, like the time sent; it just needs to be something that excludes the row from reselection.
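A minimal sketch of that pattern (OUTBOUND_MSG, MSG_ID, MSG_BODY and the SENT flag are hypothetical names); each transaction locks the rows it reads, sends them, flags them, and commits:

-- Row-level locking is assumed via the package/plan BIND options.
DECLARE SEND_CUR CURSOR FOR
    SELECT MSG_ID, MSG_BODY
    FROM OUTBOUND_MSG
    WHERE SENT <> 'Y'
    FOR UPDATE OF SENT;

-- (OPEN the cursor and FETCH each row in the COBOL program.)
-- For each row fetched: put the message on the MQ queue, then
UPDATE OUTBOUND_MSG
SET SENT = 'Y'
WHERE CURRENT OF SEND_CUR;

-- COMMIT at the end releases the row locks so waiting transactions can proceed.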

Django App on Heroku

I've been struggling with an issue where I believe my account has been shut down due to having too large a table? Correct me if I'm wrong.
=== HEROKU_POSTGRESQL_OL (DATABASE_URL)
Plan: Dev
Status: available
Connections: 0
PG Version: 9.1.8
Created: 2013-01-06 18:23 UTC
Data Size: 11.8 MB
Tables: 15
Rows: 24814/10000 (Write access revoked)
Fork/Follow: Unsupported
I tried running
heroku pg:psql HEROKU_POSTGRESQL_OL
to look at the tables, but how do I determine, from inside psql, which table has too many rows and is flooding my database?
Once I do determine which table this is, can I just go to heroku run manage.py shell and call Model_with_too_many_rows.delete.all(), and my account will no longer be shut down? Are there other steps that must be taken for the smaller DB to register with Heroku so that my write access will be restored?
Sorry if these questions are trivial, but my understanding of SQL is limited.
EDIT: I also believe that there was a time when my database was flooded with entries, but I have since deleted them. Is there any command I can run to resize the database so it acknowledges that the number of rows has been reduced? Or does Heroku do this automatically?
There may be a smarter way to check row counts by table, but I use the pg-extras plugin and run heroku pg:index_usage.
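If you'd rather do it from inside psql, a query against PostgreSQL's statistics views lists tables by approximate row count (the numbers are estimates, but accurate enough to find the offending table):

-- Approximate live row counts per table, largest first.
SELECT relname AS table_name,
       n_live_tup AS approx_rows
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;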
You will regain write access to your database within ~5 minutes of getting back down below the 10k row limit; Heroku checks this and restores access automatically.

Cronjob: Web Service query

I have a cronjob that runs every hour and parses 150,000+ records. Each record is summarized individually in a MySQL table. I use two web services to retrieve the user information:
User demographics (IP, country, city, etc.)
Phone information (whether landline or cell phone, and if cell phone, which carrier)
Every time I get a record I check whether I already have its information, and if not I call these web services. After tracing my code I found that each of these calls takes 2 to 4 seconds, which makes my cronjob very slow, and I can't compile statistics on time.
Is there a way to make these web services faster?
Thanks
Simple: get the data locally and use Melissa Data:
for IP: http://w10.melissadata.com/dqt/websmart/ip-locator.htm
for phone: http://www.melissadata.com/fonedata.html
You can also cache the results using memcache or APC, which will make it faster since you do not have to request the data from the API or the database every time.
A couple of ideas... if the same users are returning, caching the data in another table would be very helpful... you would only look it up once and have it for returning users. Upon re-reading the question it looks like you are doing that.
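For anyone building that lookup cache from scratch, a minimal MySQL sketch might look like this (table and column names are made up); you check the cache first and only call the web service on a miss:

-- Hypothetical cache table: one row per IP, filled on the first lookup.
CREATE TABLE ip_lookup_cache (
    ip VARCHAR(45) PRIMARY KEY,
    country VARCHAR(64),
    city VARCHAR(64),
    fetched_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- On a cache miss, call the web service, then store the result:
INSERT INTO ip_lookup_cache (ip, country, city)
VALUES ('203.0.113.7', 'CA', 'Montreal')
ON DUPLICATE KEY UPDATE country = VALUES(country), city = VALUES(city);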
Another option would be to spawn new threads when you need to do the look-ups. This could be a new thread for each request, or, if this is not feasible, you could have n service threads ready to do the look-ups and update the results.