Limiting the lifetime of a file in Python - django

Helo there,
I'm looking for a way to limit a lifetime of a file in Python, that is, to make a file which will be auto deleted after 5 minutes after creation.
Problem:
I have a Django based webpage which has a service that generates plots (from user submitted input data) which are showed on the web page as .png images. The images get stored on the disk upon creation.
Image files are created per session basis, and should only be available a limited time after the user has seen them, and should be deleted 5 minutes after they have been created.
Possible solutions:
I've looked at Python tempfile, but that is not what I need, because the user should have to be able to return to the page containing the image without waiting for it to be generated again. In other words it shouldn't be destroyed as soon as it is closed
The other way that comes in mind is to call some sort of an external bash script which would delete files older than 5 minutes.
Does anybody know a preferred way doing this?
Ideas can also include changing the logic of showing/generating the image files.

You should write a Django custom management command to delete old files that you can then call from cron.
If you want no files older than 5 minutes, then you need to call it every 5 minutes of course. And yes, it would run unnecessarily when there are no users, but that shouln't worry you too much.

Ok that might be a good approach i guess...
You can write a script that checks your directory and delete outdated files, and choose the oldest file from the un-deleted files. Calculate how much time had passed since that file is created and calculate the remaining time to deletion of that file. Then call sleep function with remaining time. When sleep time ends and another loop begins, there will be (at least) one file to be deleted. If there is no files in the directory, set sleep time to 5 minutes.
In that way you will ensure that each file will be deleted exactly 5 minutes later, but when there are lots of files created simultaneously, sleep time will decrease greatly and your function will begin to check each file more and more often. To aviod that you add a proper latency to sleep function before starting another loop, like, if the oldest file is 4 minutes old, you can set sleep to 60+30 seconds (adding all time calculations 30 seconds).
An example:
from datetime import datetime
import time
import os
def clearDirectory():
while True:
_time_list = []
_now = time.mktime(datetime.now().timetuple())
for _f in os.listdir('/path/to/your/directory'):
if os.path.isfile(_f):
_f_time = os.path.getmtime(_f) #get file creation/modification time
if _now - _f_time < 300:
os.remove(_f) # delete outdated file
else:
_time_list.append(_f_time) # add time info to list
# after check all files, choose the oldest file creation time from list
_sleep_time = (_now - min(_time_list)) if _time_list else 300 #if _time_list is empty, set sleep time as 300 seconds, else calculate it based on the oldest file creation time
time.sleep(_sleep_time)
But as i said, if files are created oftenly, it is better to set a latency for sleep time
time.sleep(_sleep_time + 30) # sleep 30 seconds more so some other files might be outdated during that time too...
Also, it is better to read getmtime function for details.

Related

Vertex AI 504 Errors in batch job - How to fix/troubleshoot

We have a Vertex AI model that takes a relatively long time to return a prediction.
When hitting the model endpoint with one instance, things work fine. But batch jobs of size say 1000 instances end up with around 150 504 errors (upstream request timeout). (We actually need to send batches of 65K but I'm troubleshooting with 1000).
I tried increasing the number of replicas assuming that the # of instances handed to the model would be (1000/# of replicas) but that doesn't seem to be the case.
I then read that the default batch size is 64 and so tried decreasing the batch size to 4 like this from the python code that creates the batch job:
model_parameters = dict(batch_size=4)
def run_batch_prediction_job(vertex_config):
aiplatform.init(
project=vertex_config.vertex_project, location=vertex_config.location
)
model = aiplatform.Model(vertex_config.model_resource_name)
model_params = dict(batch_size=4)
batch_params = dict(
job_display_name=vertex_config.job_display_name,
gcs_source=vertex_config.gcs_source,
gcs_destination_prefix=vertex_config.gcs_destination,
machine_type=vertex_config.machine_type,
accelerator_count=vertex_config.accelerator_count,
accelerator_type=vertex_config.accelerator_type,
starting_replica_count=replica_count,
max_replica_count=replica_count,
sync=vertex_config.sync,
model_parameters=model_params
)
batch_prediction_job = model.batch_predict(**batch_params)
batch_prediction_job.wait()
return batch_prediction_job
I've also tried increasing the machine type to n1-high-cpu-16 and that helped somewhat but I'm not sure I understand how batches are sent to replicas?
Is there another way to decrease the number of instances sent to the model?
Or is there a way to increase the timeout?
Is there log output I can use to help figure this out?
Thanks
Answering your follow up question above.
Is that timeout for a single instance request or a batch request. Also, is it in seconds?
This is a timeout for the batch job creation request.
The timeout is in seconds, according to create_batch_prediction_job() timeout refers to rpc timeout. If we trace the code we will end up here and eventually to gapic where timeout is properly described.
timeout (float): The amount of time in seconds to wait for the RPC
to complete. Note that if ``retry`` is used, this timeout
applies to each individual attempt and the overall time it
takes for this method to complete may be longer. If
unspecified, the the default timeout in the client
configuration is used. If ``None``, then the RPC method will
not time out.
What I could suggest is to stick with whatever is working for your prediction model. If ever adding the timeout will improve your model might as well build on it along with your initial solution where you used a machine with a higher spec. You can also try using a machine with higher memory like the n1-highmem-* family.

Nextjs export timeout configuration

I am building a website with NextJS that takes some time to build. It has to create a big dictionary, so when I run next dev it takes around 2 minutes to build.
The issue is, when I run next export to get a static version of the website there is a timeout problem, because the build takes (as I said before), 2 minutes, whihc exceeds the 60 seconds limit pre-configured in next.
In the NEXT documentation: https://nextjs.org/docs/messages/static-page-generation-timeout it explains that you can increase the timeout limit, whose default is 60 seconds: "Increase the timeout by changing the staticPageGenerationTimeout configuration option (default 60 in seconds)."
However it does not specify WHERE you can set that configuration option. In next.config.json? in package.json?
I could not find this information anywhere, and my blind tries of putting this parameter in some of the files mentioned before did not work out at all. So, Does anybody know how to set the timeout of next export? Thank you in advance.
They were a bit more clear in the basic-features/data-fetching part of the docs that it should be placed in the next.config.js
I added this to mine and it worked (got rid of the Error: Collecting page data for /path/[pk] is still timing out after 2 attempts. See more info here https://nextjs.org/docs/messages/page-data-collection-timeout build error):
// next.config.js
module.exports = {
// time in seconds of no pages generating during static
// generation before timing out
staticPageGenerationTimeout: 1000,
}

Scheduling reset every 24 hours at midnight

I have a counter "numberOrders" and i want to reset it everyday at midnight, to know how many orders I get in one day, what I have right now is this:
val system = akka.actor.ActorSystem("system")
system.scheduler.schedule(86400000 milliseconds, 0 milliseconds){(numberOrders = 0)}
This piece of code is inside a def which is called every time i get a new order, so want it does is: reset numberOrders after 24hours from the first order or from every order, I'm not really sure if every time there's a new order is going to reset after 24 hours, which is not what I want. I want to rest the variable everyday at midnight, any idea? Thanks!
To further increase pushy's answer. Since you might not always be sure when the site started and if you want to be exactly sure it runs at midnight you can do the following
val system = akka.actor.ActorSystem("system")
val wait = (24 hours).toMillis - System.currentTimeMillis
system.scheduler.schedule(Duration.apply(wait, MILLISECONDS), 24 hours, orderActor, ResetCounterMessage)
Might not be the tidiest of solutions but it does the job.
As schedule supports repeated executions, you could just set the interval parameter to 24 hours, the initial delay to the amount of time between now and midnight, and initiate the code at startup. You seem to be creating a new actorSystem every time you get an order right now, that does not seem quite right, and you would be rid of that as well.
Also I would suggest using the schedule method which sends messages to actors instead. This way the actor that processes the order could keep count, and if it receives a ResetCounter message it would simply reset the counter. You could simply write:
system.scheduler.schedule(x seconds, 24 hours, orderActor, ResetCounterMessage)
when you start up your actor system initially, and be done with it.

Log Slow Pages Taking Longer Than [n] Seconds In ColdFusion with Details

(ACF9)
Unless there's an option I'm missing, the "Log Slow Pages Taking Longer Than [n] Seconds" setting isn't useful for front-controller based sites (e.g., Model-Glue, FW/1, Fusebox, Mach-II, etc.).
For instance, in a Mura/Framework-One site, I just end up with:
"Warning","jrpp-186","04/25/13","15:26:36",,"Thread: jrpp-186, processing template: /home/mysite/public_html_cms/wwwroot/index.cfm, completed in 11 seconds, exceeding the 10 second warning limit"
"Warning","jrpp-196","04/25/13","15:27:11",,"Thread: jrpp-196, processing template: /home/mysite/public_html_cms/wwwroot/index.cfm, completed in 59 seconds, exceeding the 10 second warning limit"
"Warning","jrpp-214","04/25/13","15:28:56",,"Thread: jrpp-214, processing template: /home/mysite/public_html_cms/wwwroot/index.cfm, completed in 32 seconds, exceeding the 10 second warning limit"
"Warning","jrpp-134","04/25/13","15:31:53",,"Thread: jrpp-134, processing template: /home/mysite/public_html_cms/wwwroot/index.cfm, completed in 11 seconds, exceeding the 10 second warning limit"
Is there some way to get query string or post details in there, or is there another way to get what I'm after?
You can easily add some logging to your application for any requests that take longer than 10 seconds.
In onRequestStart():
request.startTime = getTickCount();
In onRequestEnd():
request.endTime = getTickCount();
if (request.endTime - request.startTime > 10000){
writeLog(cgi.QUERY_STRING);
}
If you're writing a Mach-II, FW/1 or ColdBox application, it's trivial to write a "plugin" that runs on every request which captures the URL or FORM variables passed in the request and stores that in a simple database table or log file. (You can even capture session.userID or IP address or whatever you may need.) If you're capturing to a database table, you'll probably not want any indexes to optimize for performance and you'll need to rotate that table so you're not trying to do high-speed inserts on a table with tens of millions of rows.
In Mach-II, you'd write a plugin.
In FW/1, you'd put a call to a controller which handles this into setupRequest() in your application.cfc.
In ColdBox, you'd write an interceptor.
The idea is that the log just tells you what pages arw xonsostently slow sp ypu can do your own performance tuning.
Turn on debugging for further details for a start.

Django: Gracefully restart nginx + fastcgi sites to reflect code changes?

Common situation: I have a client on my server who may update some of the code in his python project. He can ssh into his shell and pull from his repository and all is fine -- but the code is stored in memory (as far as I know) so I need to actually kill the fastcgi process and restart it to have the code change.
I know I can gracefully restart fcgi but I don't want to have to manually do this. I want my client to update the code, and within 5 minutes or whatever, to have the new code running under the fcgi process.
Thanks
First off, if uptime is important to you, I'd suggest making the client do it. It can be as simple as giving him a command called deploy-code. Using your method, if there is an error in their code, your method requires a 10 minute turnaround (read: downtime) for fixing it, assuming he gets it correct.
That said, if you actually want to do this, you should create a daemon which will look for files modified within the last 5 minutes. If it detects one, it will execute the reboot command.
Code might look something like:
import os, time
CODE_DIR = '/tmp/foo'
while True:
if restarted = True:
restarted = False
time.sleep(5*60)
for root, dirs, files in os.walk(CODE_DIR):
if restarted=True:
break
for filename in files:
if restared=True:
break
updated_on = os.path.getmtime(os.path.join(root, filename))
current_time = time.time()
if current_time - updated_on <= 6 * 60: # 6 min
# 6 min could offer false negatives, but that's better
# than false positives
restarted = True
print "We should execute the restart command here."