I have an IronPython script that gathers some info from WMI. One of the items I'm trying to gather is LastBootUpTime from Win32_OperatingSystem. I'm able to get the info using:
import clr
clr.AddReference('System.Management.Automation')
from System.Management.Automation import (
    PSMethod, RunspaceInvoke
)
RUNSPACE = RunspaceInvoke()

def wmi(query):
    # one dict of property name -> value per returned PowerShell object
    return [dict([(prop.Name, prop.Value) for prop in psobj.Properties]) for psobj in RUNSPACE.Invoke(query)]

def to_ascii(s):
    # ignore non-ascii chars
    return s.encode('ascii','ignore')

operating_system = wmi('Get-WmiObject Win32_OperatingSystem -Namespace "root\CIMV2"')[0]
last_boot = to_ascii(operating_system.get('LastBootUpTime'))
print last_boot
The result is as follows
Is there a way in IronPython to convert this "timestamp" to a more friendly format?
Use the methods of the ManagementDateTimeConverter class to convert the value to a .NET object. That field in particular is a datetime, so you'll want to use ToDateTime(). You'll just need to add a reference to the System.Management assembly.
import clr
clr.AddReference('System.Management')
from System.Management import ManagementDateTimeConverter
print ManagementDateTimeConverter.ToDateTime(last_boot)
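For readers without access to System.Management, the same conversion can be sketched in plain Python 3. This assumes the value uses the usual DMTF/CIM layout (yyyymmddHHMMSS.mmmmmm followed by a signed three-digit UTC offset in minutes). The function name parse_dmtf_datetime and the sample value are hypothetical, since the actual output was not included in the question.

```python
from datetime import datetime, timedelta, timezone


def parse_dmtf_datetime(value):
    """Parse a DMTF/CIM datetime such as '20130315083042.375199-300'."""
    # The first 21 characters hold the date, time and microseconds.
    naive = datetime.strptime(value[:21], "%Y%m%d%H%M%S.%f")
    # The remaining sign plus three digits give the UTC offset in minutes.
    offset_minutes = int(value[21:25])
    return naive.replace(tzinfo=timezone(timedelta(minutes=offset_minutes)))


# Hypothetical sample value, not taken from the question.
sample = "20130315083042.375199-300"
print(parse_dmtf_datetime(sample).isoformat())
```

Wildcard fields (asterisks) that the DMTF format allows are not handled here; inside IronPython the ManagementDateTimeConverter route above is the simpler choice.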
I've got a DateTimeField(auto_now_add=True) in my model. However, I wish to update this field, and the format I'm receiving from my API is a UNIX timestamp. Can I somehow convert the format I receive from my API to the correct one? (eg 1640206232 to 2021-12-22 20:50:32).
Using postgres as my db if that matters..
Yes. As mentioned by @Sevy in his comment, you can use the following:
from datetime import datetime, timezone
timestamp = 1640206232
# tz must be a tzinfo object, not a string; use another tzinfo for a different timezone
dt_object = datetime.fromtimestamp(timestamp, tz=timezone.utc)
print("dt_object =", dt_object)
print("type(dt_object) =", type(dt_object))
More info and how to convert the other way can be found here:
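A minimal round-trip sketch, using only the standard library, shows both directions of the conversion. The helper names from_unix and to_unix are made up for illustration. The resulting timezone-aware datetime is the kind of value that could then be assigned to the model field before saving.

```python
from datetime import datetime, timezone


def from_unix(timestamp):
    """Convert a UNIX timestamp (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def to_unix(dt):
    """Convert a timezone-aware datetime back to whole UNIX seconds."""
    return int(dt.timestamp())


dt = from_unix(1640206232)
print(dt.strftime("%Y-%m-%d %H:%M:%S"))  # 2021-12-22 20:50:32
print(to_unix(dt))                       # 1640206232
```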
I have a one to many relation between session and camp. Now I have to get the max and min dates of all camps combined for a particular session.
I am able to do it like this:
sess = Session.objects.last()
max_min_dates = sess.camp.aggregate(Min('start_date'), Max('end_date'))
But if I try to send this from HttpResponse then I am getting this error:
TypeError: Object of type 'date' is not JSON serializable
So I need to send the formatted date values in that. How can I modify the above code to get the same?
The default encoder for json.dumps() does not support dates (i.e. it can't convert a date into a str). You can use Django's encoder instead, which supports a few more data types; see the link for more info.
Django Ref
import json
from django.core.serializers.json import DjangoJSONEncoder
json_str = json.dumps(max_min_dates, cls=DjangoJSONEncoder)
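For comparison, a similar result can be sketched without Django by passing a default hook to json.dumps(). The function name encode_dates and the sample dates are assumptions for illustration; the dictionary keys follow the names Django generates for Min('start_date') and Max('end_date').

```python
import json
from datetime import date, datetime


def encode_dates(value):
    """Fallback for json.dumps: turn date/datetime objects into ISO 8601 strings."""
    if isinstance(value, (date, datetime)):
        return value.isoformat()
    raise TypeError("Object of type %s is not JSON serializable" % type(value).__name__)


# Hypothetical aggregate result, shaped like the output of aggregate() above.
max_min_dates = {
    "start_date__min": date(2021, 1, 5),
    "end_date__max": date(2021, 3, 20),
}

json_str = json.dumps(max_min_dates, default=encode_dates)
print(json_str)
```

The hook is only called for objects the default encoder cannot handle, so ordinary strings and numbers pass through unchanged.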
I've written a simple script to extract data from a site. The script works as expected, but I'm not pleased with the output format.
Here is my code
class ArticleSpider(Spider):
    name = "article"
    allowed_domains = ["example.com"]
    start_urls = (
        # start URLs are missing from the original post
    )

    def parse(self, response):
        next_selector = response.xpath('//a[@class="next"]/@href')
        url = next_selector[1].extract()
        # url is like "tag/1/page/2"
        yield Request(urlparse.urljoin("http://example.com", url))
        item_selector = response.xpath('//h3/a/@href')
        for url in item_selector.extract():
            yield Request(urlparse.urljoin("http://example.com", url),
                          callback=self.parse_article)

    def parse_article(self, response):
        item = ItemLoader(item=Article(), response=response)
        # here i extract title of every article
        item.add_xpath('title', '//h1[@class="title"]/text()')
        return item.load_item()
I'm not pleased with the output, something like:
[scrapy] DEBUG: Scraped from <200 http://example.com/tag/1/article_name>
{'title': [u'\xa0"\u0412\u041e\u041e\u0411\u0429\u0415-\u0422\u041e \u0421\u0412\u041e\u0411\u041e\u0414\u0410 \u0417\u0410\u041a\u0410\u041d\u0427\u0418\u0412\u0410\u0415\u0422\u0421\u042f"']}
I think I need to use custom ItemLoader class but I don't know how. Need your help.
TL;DR I need to convert text, scraped by Scrapy from unicode to utf-8
As you can see below, this isn't much of a Scrapy issue but more of Python itself. It could also marginally be called an issue :)
$ scrapy shell http://censor.net.ua/resonance/267150/voobscheto_svoboda_zakanchivaetsya
In [7]: print response.xpath('//h1/text()').extract_first()
In [8]: response.xpath('//h1/text()').extract_first()
Out[8]: u'\xa0"\u0412\u041e\u041e\u0411\u0429\u0415-\u0422\u041e \u0421\u0412\u041e\u0411\u041e\u0414\u0410 \u0417\u0410\u041a\u0410\u041d\u0427\u0418\u0412\u0410\u0415\u0422\u0421\u042f"'
What you see is two different representations of the same thing - a unicode string.
What I would suggest is run crawls with -L INFO or add LOG_LEVEL='INFO' to your settings.py in order to not show this output in the console.
One annoying thing is that when you save as JSON, you get escaped unicode JSON e.g.
$ scrapy crawl example -L INFO -o a.jl
gives you:
$ cat a.jl
{"title": "\u00a0\"\u0412\u041e\u041e\u0411\u0429\u0415-\u0422\u041e \u0421\u0412\u041e\u0411\u041e\u0414\u0410 \u0417\u0410\u041a\u0410\u041d\u0427\u0418\u0412\u0410\u0415\u0422\u0421\u042f\""}
This is correct, but it takes more space, and most applications handle non-escaped JSON equally well.
Adding a few lines in your settings.py can change this behaviour:
from scrapy.exporters import JsonLinesItemExporter

class MyJsonLinesItemExporter(JsonLinesItemExporter):
    def __init__(self, file, **kwargs):
        # ensure_ascii=False writes non-ASCII characters as-is instead of \uXXXX escapes
        super(MyJsonLinesItemExporter, self).__init__(file, ensure_ascii=False, **kwargs)

FEED_EXPORTERS = {
    'jsonlines': 'myproject.settings.MyJsonLinesItemExporter',
    'jl': 'myproject.settings.MyJsonLinesItemExporter',
}
Essentially, all we do is set ensure_ascii=False for the default JSON Item Exporters. This prevents escaping. I wish there was an easier way to pass arguments to exporters, but I can't see any since they are initialized with their default arguments around here. Anyway, now your JSON file has:
$ cat a.jl
which is better-looking, equally valid and more compact.
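The effect of ensure_ascii can be seen with the standard library's json module alone, independent of Scrapy. The sketch below uses a shortened, made-up title and simply compares the two encodings of the same item.

```python
import json

# Shortened sample item for illustration only.
item = {"title": "\u0412\u041e\u041e\u0411\u0429\u0415-\u0422\u041e"}

escaped = json.dumps(item)                        # default: ensure_ascii=True
unescaped = json.dumps(item, ensure_ascii=False)  # keeps the Cyrillic characters

print(escaped)
print(unescaped)

# Both strings decode to exactly the same data.
print(json.loads(escaped) == json.loads(unescaped))
```

When writing the unescaped form to a file, open the file with an explicit encoding such as UTF-8 so the characters are stored correctly.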
There are 2 independent issues affecting the display of unicode strings.
If you return a list of strings, the output file will have issues with them because it uses the ascii codec by default to serialize list elements. You can work around this as shown below, but it's more appropriate to use extract_first() as suggested by @neverlastn.
class Article(Item):
    title = Field(serializer=lambda x: u', '.join(x))
The default implementation of the repr() method serializes unicode strings to their escaped form (\uxxxx). You can change this behaviour by overriding this method in your item class:
class Article(Item):
    def __repr__(self):
        data = self.copy()
        for k in data.keys():
            if type(data[k]) is unicode:
                data[k] = data[k].encode('utf-8')
        # call the parent __repr__ on the encoded copy
        return super(Article, data).__repr__()
I have a django based app with haystack and whoosh search engine. I want to provide an accent and special character independent search so that I can find indexed data with special characters also by using words without special chars:
Indexed is:
Search term:
I've written a specific FoldingWhooshSearchBackend which uses a StemmingAnalyzer and a CharsetFilter(accent_map) as described in the following document:
However, the search still doesn't work as expected, i.e. I cannot search with 'cafe' and find 'café'. I've looked into the search index using:
from whoosh.index import open_dir
ix = open_dir('myservice/settings/whoosh_index')
searcher = ix.searcher()
for doc in searcher.documents():
    print doc
The special characters are still in the index.
Do I have to do something additional? Is it about changing the index template?
You have to write Haystack SearchIndex classes for your models. That's how you can prepare model data for the search index.
Example of myapp/search_index.py:
from django.template.defaultfilters import slugify
from haystack import site
from haystack import indexes

class UserProfileIndex(indexes.SearchIndex):
    text = indexes.CharField(document=True)

    def prepare_text(self, obj):
        data = [obj.get_full_name(), obj.user.email, obj.phone]
        original = ' '.join(data)
        # slugify strips accents, so both forms end up in the index
        slugified = slugify(original)
        return ' '.join([original, slugified])

site.register(UserProfile, UserProfileIndex)
If a user has the name café, you will find his profile with both search terms, café and cafe.
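The idea of indexing both the original and an accent-free form can be sketched with the standard library alone. This is not how slugify or Whoosh's accent_map are implemented; it is a simplified illustration based on Unicode decomposition, and the helper names fold_accents and index_text are hypothetical.

```python
import unicodedata


def fold_accents(text):
    """Strip combining marks so that accented letters fall back to their base letter."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def index_text(original):
    """Return the text to index: the original followed by its accent-free form."""
    return " ".join([original, fold_accents(original)])


print(fold_accents("caf\u00e9"))
print(index_text("caf\u00e9"))
```

Characters that have no Unicode decomposition (for example ø or ß) are left unchanged by this approach, which is one reason a dedicated mapping such as accent_map can be the better fit.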
I think the best approach is to let Haystack create the schema for maximum forwards compatibility, and then hack the CharsetFilter in.
This code is working for me with Haystack 2.4.0 and Whoosh 2.7.0:
from haystack.backends.whoosh_backend import WhooshEngine, WhooshSearchBackend
from whoosh.analysis import CharsetFilter, StemmingAnalyzer
from whoosh.support.charset import accent_map
from whoosh.fields import TEXT

class FoldingWhooshSearchBackend(WhooshSearchBackend):
    def build_schema(self, fields):
        schema = super(FoldingWhooshSearchBackend, self).build_schema(fields)
        # schema is a (content_field_name, Schema) tuple; patch every TEXT field
        for name, field in schema[1].items():
            if isinstance(field, TEXT):
                field.analyzer = StemmingAnalyzer() | CharsetFilter(accent_map)
        return schema

class FoldingWhooshEngine(WhooshEngine):
    backend = FoldingWhooshSearchBackend
I run ownCloud on my webspace for a shared calendar. Now I'm looking for a suitable python library to get read only access to the calendar. I want to put some information of the calendar on an intranet website.
I have tried http://trac.calendarserver.org/wiki/CalDAVClientLibrary but it always returns a NotImplementedError with the query command, so my guess is that the query command doesn't work well with the given library.
What library could I use instead?
I recommend the library, caldav.
Read-only access works really well with this library and looks straightforward to me. It does the whole job of getting calendars and reading events, returning them in the iCalendar format. More information about the caldav library can also be found in the documentation.
import caldav
client = caldav.DAVClient(<caldav-url>, username=<username>,
                          password=<password>)
principal = client.principal()
for calendar in principal.calendars():
    for event in calendar.events():
        # raw iCalendar text of the event
        ical_text = event.data
From here on you can use the icalendar library to read specific fields such as the type (e.g. event, todo, alarm), name, times, etc. A good starting point may be this question.
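To make the shape of that iCalendar text concrete, here is a dependency-free sketch that pulls simple fields out of VEVENT blocks. It is only an illustration of the format, not a replacement for the icalendar library: it ignores nested components such as alarms, quoted parameters and value escaping. The helper names and the sample calendar are made up.

```python
def unfold(ical_text):
    """Join folded lines: a line starting with a space or tab continues the previous one."""
    lines = []
    for raw in ical_text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def read_events(ical_text):
    """Return one dict per VEVENT, mapping property names to raw string values."""
    events, current = [], None
    for line in unfold(ical_text):
        if line == "BEGIN:VEVENT":
            current = {}
        elif line == "END:VEVENT":
            if current is not None:
                events.append(current)
            current = None
        elif current is not None and ":" in line:
            name_and_params, value = line.split(":", 1)
            # Drop parameters such as ;TZID=... and keep only the property name.
            current[name_and_params.split(";", 1)[0].upper()] = value
    return events


# Made-up sample data standing in for event.data from the snippet above.
SAMPLE = "\r\n".join([
    "BEGIN:VCALENDAR",
    "BEGIN:VEVENT",
    "SUMMARY:Team meeting",
    "DTSTART;TZID=Europe/Berlin:20240102T100000",
    "DESCRIPTION:Quarterly planning and",
    "  review",
    "END:VEVENT",
    "END:VCALENDAR",
])

for event in read_events(SAMPLE):
    print(event["SUMMARY"], event["DTSTART"])
```

For anything beyond a quick look at the data, the icalendar library handles time zones, typed values and nested components properly.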
I wrote this code a few months ago to fetch data from CalDAV and present it on my website.
I have converted the data into JSON format, but you can do whatever you want with it.
I have added some print statements so you can see the output; you can remove them in production.
from datetime import datetime
import json
from pytz import UTC # timezone
import caldav
from icalendar import Calendar, Event

# CalDAV info: replace these placeholders with your own server details
url = "<caldav-url>"
userN = "<username>"
passW = "<password>"
client = caldav.DAVClient(url=url, username=userN, password=passW)
principal = client.principal()
calendars = principal.calendars()

eventSummary = []
eventDescription = []
eventDateStart = []
eventdateEnd = []
eventTimeStart = []
eventTimeEnd = []

if len(calendars) > 0:
    calendar = calendars[0]
    print ("Using calendar", calendar)
    results = calendar.events()
    for eventraw in results:
        event = Calendar.from_ical(eventraw.data)
        for component in event.walk():
            if component.name == "VEVENT":
                print (component.get('summary'))
                print (component.get('description'))
                startDate = component.get('dtstart')
                print (startDate.dt.strftime('%m/%d/%Y %H:%M'))
                endDate = component.get('dtend')
                print (endDate.dt.strftime('%m/%d/%Y %H:%M'))
                dateStamp = component.get('dtstamp')
                print (dateStamp.dt.strftime('%m/%d/%Y %H:%M'))
                print ('')
                # Collect the values used for the JSON output below
                eventSummary.append(str(component.get('summary')))
                eventDescription.append(str(component.get('description')))
                eventDateStart.append(startDate.dt.strftime('%m/%d/%Y'))
                eventdateEnd.append(endDate.dt.strftime('%m/%d/%Y'))
                eventTimeStart.append(startDate.dt.strftime('%H:%M'))
                eventTimeEnd.append(endDate.dt.strftime('%H:%M'))

# Modify or change these values based on your CalDAV
# Converting to JSON (first event only; skipped when no event was found)
if eventSummary:
    data = [{ 'Events Summary':eventSummary[0], 'Event Description':eventDescription[0],'Event Start date':eventDateStart[0], 'Event End date':eventdateEnd[0], 'At:':eventTimeStart[0], 'Until':eventTimeEnd[0]}]
    data_string = json.dumps(data)
    print ('JSON:', data_string)
pyOwnCloud could be the right thing for you. I haven't tried it, but it should provide a command-line interface/API for reading the calendars.
You probably want to provide more details about how you are actually using the API, but in case the query command is indeed not implemented, there is a list of other Python libraries at the CalConnect website (archived version; the original link is dead now).