Timestamp regexp in Elasticsearch

My goal is to create an alert in ElastAlert for this scenario: no events have occurred between midnight and 2 am (on any date). The problem is how to write an Elasticsearch query that matches any date but a specific time of day, because you cannot use regexp or wildcard on a timestamp field of type 'date'. Any suggestions?
This code returns "Parse failure":
"range": {
  "timestamp": {
    "gte": "20[0-9]{2}-[0-9]{2}-[0-9]{2}T00:00:00.000Z",
    "lt": "20[0-9]{2}-[0-9]{2}-[0-9]{2}T02:00:00.000Z"
  }
}

Handling it in a custom rule is ideal.
I wrote the following to do the same kind of filtering:
Note that the dependencies used (dateutil, elastalert.util) are already bundled with the ElastAlert framework.
import dateutil.parser
from ruletypes import RuleType
# elastalert.util includes useful utility functions
# such as converting from timestamp to datetime obj
from util import ts_to_dt

# Modified version of http://elastalert.readthedocs.io/en/latest/recipes/adding_rules.html#tutorial
# to catch events happening outside a certain time range
class OutOfTimeRangeRule(RuleType):
    """ Match if input time is outside the given range """

    # Time range specified by including the following properties in the rule:
    required_options = set(['time_start', 'time_end'])

    # add_data will be called each time Elasticsearch is queried.
    # data is a list of documents from Elasticsearch, sorted by timestamp,
    # including all the fields that the config specifies with "include"
    def add_data(self, data):
        for document in data:
            # Convert the timestamp to a time object
            login_time = document['@timestamp'].time()
            # Convert time_start and time_end to time objects
            time_start = dateutil.parser.parse(self.rules['time_start']).time()
            time_end = dateutil.parser.parse(self.rules['time_end']).time()
            # If time is outside office hours
            if login_time < time_start or login_time > time_end:
                # To add a match, use self.add_match
                self.add_match(document)

    # The results of get_match_str will appear in the alert text
    def get_match_str(self, match):
        return "logged in outside %s and %s" % (self.rules['time_start'], self.rules['time_end'])

    def garbage_collect(self, timestamp):
        pass

I didn't have permission to write custom rules, so my solution was to change the Logstash configuration: I added a field hour_of_day whose value is derived from the timestamp. This lets us create a flatline rule with a filter like this:
filter:
- query:
    query_string:
      query: "hour_of_day: 0 OR hour_of_day: 1"
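For reference, the derived field boils down to taking the hour component of each event's timestamp. Here is a minimal Python sketch of that derivation, assuming ISO 8601 timestamps in UTC like the ones in the question. The function names hour_of_day and in_quiet_window are made up for illustration and are not part of Logstash or ElastAlert.

```python
from datetime import datetime


def hour_of_day(timestamp):
    """Return the hour (0-23) of an ISO 8601 timestamp such as 2016-05-01T01:30:00.000Z."""
    # Older Python versions reject a trailing "Z" in fromisoformat(), so normalise it first.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00")).hour


def in_quiet_window(timestamp, start_hour=0, end_hour=2):
    """True if the event falls in [start_hour, end_hour), here midnight to 2 am."""
    return start_hour <= hour_of_day(timestamp) < end_hour


print(in_quiet_window("2016-05-01T01:30:00.000Z"))
print(in_quiet_window("2016-05-01T02:00:00.000Z"))
```

The half-open interval mirrors the gte/lt pair in the original range query, so an event at exactly 02:00 is not counted.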


Pulling data from datastore and converting it in Json in python(Google Appengine)

I am creating an application using Google App Engine in which I fetch data from a website and store it in my database (Datastore). Whenever a user hits my application URL as "application_url?name=xyz&city=abc", I fetch the data from the DB and want to show it as JSON. Right now I am using a filter to fetch data based on the name and city, but I get [] as output. I don't know how to get the data out. My code looks like this:
class MainHandler(webapp2.RequestHandler):
    def get(self):
        commodityname = self.request.get('veg',"Not supplied")
        market = self.request.get('market',"No market found with this name")
        self.response.write(commodityname)
        self.response.write(market)
        query = commoditydata.all()
        logging.info(commodityname)
        query.filter('commodity = ', commodityname)
        result = query.fetch(limit = 1)
        logging.info(result)
and the db structure for "commoditydata" table is
class commoditydata(db.Model):
    commodity = db.StringProperty()
    market = db.StringProperty()
    arrival = db.StringProperty()
    variety = db.StringProperty()
    minprice = db.StringProperty()
    maxprice = db.StringProperty()
    modalprice = db.StringProperty()
    reporteddate = db.DateTimeProperty(auto_now_add = True)
Can anyone tell me how to get data from the DB using name and market and convert it to JSON? Getting the data from the DB is the higher priority. Any suggestions would be of great use.
If you are starting with a new app, I would suggest to use the NDB API rather than the old DB API. Your code would look almost the same though.
As far as I can tell from your code sample, the query should give you results as far as the HTTP query parameters from the request would match entity objects in the datastore.
I can think of some possible reasons for the empty result:
you only think the output is empty, because you use write() too early; app-engine doesn't support streaming of response, you must write everything in one go and you should do this after you queried the datastore
the properties you are filtering are not indexed (yet) in the datastore, at least not for the entities you were looking for
the filters are just not matching anything (check the log for the values you got from the request)
your query uses a namespace different from where the data was stored in (but this is unlikely if you haven't explicitly set namespaces anywhere)
In the Cloud Developer Console you can query your datastore and even apply filters, so you can see the results without writing actual code.
Go to https://console.developers.google.com
On the left side, select Storage > Cloud Datastore > Query
Select the namespace (default should be fine)
Select the kind "commoditydata"
Add filters with example values you expect from the request and see how many results you get
Also look into Monitoring > Log which together with your logging.info() calls is really helpful to better understand what is going on during a request.
The conversion to JSON is rather easy once you have your data. In your request handler, create an empty list that will hold dictionaries. For each object you get from the query result, build a dict with one key per property you want to send, set each value to what you got from the datastore, and append the dict to the list. At the end, dump the list as a JSON string.
import json

class MainHandler(webapp2.RequestHandler):
    def get(self):
        commodityname = self.request.get('veg')
        market = self.request.get('market')
        # request.get() returns an empty string for a missing parameter
        if not commodityname and not market:
            self.response.out.write("Please supply filters!")
            # the request is complete after this:
            return
        # everything ok, try query:
        query = commoditydata.all()
        logging.info(commodityname)
        query.filter('commodity = ', commodityname)
        result = query.fetch(limit = 1)
        logging.info(result)
        # now build the JSON payload for the response
        dicts = []
        for match in result:
            # datetime values are not JSON serializable, so send an ISO 8601 string
            dicts.append({'market': match.market, 'reporteddate': match.reporteddate.isoformat()})
        # set the appropriate header of the response:
        self.response.headers['Content-Type'] = 'application/json; charset=utf-8'
        # convert everything into a JSON string
        jsonString = json.dumps(dicts)
        self.response.out.write(jsonString)
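One thing to watch for when dumping datastore entities: json.dumps cannot serialise datetime values such as reporteddate out of the box. A minimal sketch of one way around that, using the standard `default` hook of json.dumps. The helper name to_json and the sample row are invented for illustration and do not come from the question's data.

```python
import json
from datetime import datetime


def to_json(rows):
    """Serialise a list of dicts, converting datetime values to ISO 8601 strings."""
    def encode(value):
        # json.dumps calls this only for objects it cannot serialise itself
        if isinstance(value, datetime):
            return value.isoformat()
        raise TypeError("Cannot serialise %r" % (value,))
    return json.dumps(rows, default=encode, sort_keys=True)


# Made-up sample row standing in for a datastore entity
rows = [{"market": "Pune", "reporteddate": datetime(2015, 4, 6, 12, 30)}]
print(to_json(rows))
```

Raising TypeError for unknown types keeps the usual json.dumps behaviour for anything the hook does not recognise.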

BigQuery results run via python in Google Cloud don't match results running on MAC

I have a Python app that runs a query on BigQuery and appends the results to a file. I've run this on a Mac workstation (Yosemite) and on a GC instance (ubuntu 14.1), and the floating point results differ. How can I make them the same? The Python environments are the same on both.
run on google cloud instance
1120224,2015-04-06,23989,866,55159.71274162368,0.04923989554019882,0.021414467106578683,0.03609987911125933,63.69481840834143
54897577,2015-04-06,1188089,43462,2802473.708558333,0.051049132980100984,0.021641920553251377,0.03658143455582873,64.4810111950286
run on mac workstation
1120224,2015-04-06,23989,866,55159.712741623654,0.049239895540198794,0.021414467106578683,0.03609987911125933,63.694818408341405
54897577,2015-04-06,1188089,43462,2802473.708558335,0.05104913298010102,0.021641920553251377,0.03658143455582873,64.48101119502864
import sys
import pdb
import json
from collections import OrderedDict
from csv import DictWriter
from pprint import pprint
from apiclient import discovery
# exceptions caught in runSyncQuery
from apiclient.errors import HttpError
from oauth2client import tools
from oauth2client.client import AccessTokenRefreshError
import functools
import argparse
import httplib2
import time
from subprocess import call
def authenticate_SERVICE_ACCOUNT(service_acct_email, private_key_path):
    """ Generic authentication through a service account.
    Args:
        service_acct_email: The service account email associated
            with the private key
        private_key_path: The path to the private key file
    """
    from oauth2client.client import SignedJwtAssertionCredentials
    with open(private_key_path, 'rb') as pk_file:
        key = pk_file.read()
    credentials = SignedJwtAssertionCredentials(
        service_acct_email,
        key,
        scope='https://www.googleapis.com/auth/bigquery')
    http = httplib2.Http()
    auth_http = credentials.authorize(http)
    return discovery.build('bigquery', 'v2', http=auth_http)
def create_query(number_of_days_ago):
    """ Create a query
    Args:
        number_of_days_ago: Default value of 1 gets yesterday's data
    """
    q = 'SELECT xxxxxxxxxx'
    return q
def translate_row(row, schema):
    """Apply the given schema to the given BigQuery data row.
    Args:
        row: A single BigQuery row to transform.
        schema: The BigQuery table schema to apply to the row, specifically
            the list of field dicts.
    Returns:
        Dict containing keys that match the schema and values that match
        the row.
    Adapted from bigquery client
    https://github.com/tylertreat/BigQuery-Python/blob/master/bigquery/client.py
    """
    log = {}
    #pdb.set_trace()
    # Match each schema column with its associated row value
    for index, col_dict in enumerate(schema):
        col_name = col_dict['name']
        row_value = row['f'][index]['v']
        if row_value is None:
            log[col_name] = None
            continue
        # Cast the value for some types
        if col_dict['type'] == 'INTEGER':
            row_value = int(row_value)
        elif col_dict['type'] == 'FLOAT':
            row_value = float(row_value)
        elif col_dict['type'] == 'BOOLEAN':
            row_value = row_value in ('True', 'true', 'TRUE')
        log[col_name] = row_value
    return log
def extractResult(queryReply):
    """ Extract a result from the query reply. Uses schema and rows to translate.
    Args:
        queryReply: the object returned by bigquery
    """
    #pdb.set_trace()
    result = []
    schema = queryReply.get('schema', {'fields': None})['fields']
    rows = queryReply.get('rows',[])
    for row in rows:
        result.append(translate_row(row, schema))
    return result
def writeToCsv(results, filename, ordered_fieldnames, withHeader=True):
    """ Create a csv file from a list of rows.
    Args:
        results: list of rows of data (first row is assumed to be a header)
        ordered_fieldnames: a dict with names of fields in order desired - names must exist in results header
        withHeader: a boolean to indicate whether to write out header -
            Set to false if you are going to append data to existing csv
    """
    try:
        # the with block closes the file even if writing fails
        with open(filename, "w") as the_file:
            writer = DictWriter(the_file, fieldnames=ordered_fieldnames)
            if withHeader:
                writer.writeheader()
            writer.writerows(results)
    except:
        print("Unexpected error: %s" % sys.exc_info()[0])
        raise
def runSyncQuery(client, projectId, query, timeout=0):
    results = []
    try:
        print('timeout:%d' % timeout)
        jobCollection = client.jobs()
        queryData = {'query': query,
                     'timeoutMs': timeout}
        queryReply = jobCollection.query(projectId=projectId,
                                         body=queryData).execute()
        jobReference = queryReply['jobReference']
        # Timeout exceeded: keep polling until the job is complete.
        while not queryReply['jobComplete']:
            print('Job not yet complete...')
            queryReply = jobCollection.getQueryResults(
                projectId=jobReference['projectId'],
                jobId=jobReference['jobId'],
                timeoutMs=timeout).execute()
        # If the result has rows, collect the rows in the reply.
        # currentRow counts the rows fetched so far (it was undefined before).
        currentRow = 0
        if 'rows' in queryReply:
            #pdb.set_trace();
            results.extend(extractResult(queryReply))
            currentRow = len(queryReply['rows'])
        # Loop through each remaining page of data
        while 'rows' in queryReply and currentRow < int(queryReply['totalRows']):
            queryReply = jobCollection.getQueryResults(
                projectId=jobReference['projectId'],
                jobId=jobReference['jobId'],
                startIndex=currentRow).execute()
            if 'rows' in queryReply:
                results.extend(extractResult(queryReply))
                currentRow += len(queryReply['rows'])
    except AccessTokenRefreshError:
        print("The credentials have been revoked or expired, please re-run "
              "the application to re-authorize")
    except HttpError as err:
        print('Error in runSyncQuery:')
        pprint(err.content)
    except Exception as err:
        print('Undefined error: %s' % err)
    return results
# Main
if __name__ == '__main__':
    # Name of file
    FILE_NAME = "results.csv"
    # Default prior number of days to run query
    NUMBER_OF_DAYS = "1"
    # BigQuery project id as listed in the Google Developers Console.
    PROJECT_ID = 'xxxxxx'
    # Service account email address as listed in the Google Developers Console.
    SERVICE_ACCOUNT = 'xxxxxx@developer.gserviceaccount.com'
    KEY = "/usr/local/xxxxxxxx"
    query = create_query(NUMBER_OF_DAYS)
    # Authenticate
    client = authenticate_SERVICE_ACCOUNT(SERVICE_ACCOUNT, KEY)
    # Get query results
    results = runSyncQuery(client, PROJECT_ID, query, timeout=0)
    #pdb.set_trace();
    # Write results to csv without header
    ordered_fieldnames = OrderedDict([('f_split',None),('m_members',None),('f_day',None),('visitors',None),('purchasers',None),('demand',None), ('dmd_per_mem',None),('visitors_per_mem',None),('purchasers_per_visitor',None),('dmd_per_purchaser',None)])
    writeToCsv(results, FILE_NAME, ordered_fieldnames, False)
    # Backup current data
    backupfilename = "data_bk-" + time.strftime("%y-%m-%d") + ".csv"
    call(['cp','../data/data.csv',backupfilename])
    # Concatenate new results to data
    with open("../data/data.csv", "ab") as outfile:
        with open("results.csv","rb") as infile:
            line = infile.read()
            outfile.write(line)
You mention that these come from aggregate sums of floating point data. As Felipe mentioned, floating point is awkward; it violates some of the mathematical identities that we tend to assume.
In this case, the associative property is the one that bites us. That is, usually (A+B)+C == A+(B+C). However, in floating point math, this isn't the case. Each operation is an approximation; you can see this better if you wrap with an 'approx' function: approx(approx(A+B) + C) is clearly different from approx(A + approx(B+C)).
If you think about how bigquery computes aggregates, it builds an execution tree, and computes the value to be aggregated at the leaves of the tree. As those answers are ready, they're passed back up to the higher levels of the tree and aggregated (let's say they're added). The "when they're ready" part makes it non-deterministic.
A node may get results back in the order A, B, C the first time and C, A, B the second time. This means that the grouping of the additions will change, since you'll get approx(approx(A + B) + C) the first time and approx(approx(C + A) + B) the second time. Note that since we're dealing with ordering, it may look like the commutative property is the problematic one, but it isn't; A+B in floating point math is the same as B+A. The problem is really that you're adding partial results, and that addition isn't associative.
Floating point math has all sorts of nasty properties and should usually be avoided if you rely on precision.
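To make the associativity point concrete, here is a small Python sketch, not tied to BigQuery in any way. It shows that regrouping or reordering a sum changes the result, that math.fsum is far less sensitive to ordering (on typical IEEE-754 builds), and that rounding to a fixed number of significant digits before writing the CSV makes the two outputs from the question agree. The helper name stable and the choice of 10 digits are assumptions for illustration.

```python
import math

a, b, c = 0.1, 0.2, 0.3
# Same three numbers, different grouping, different rounded result
print((a + b) + c == a + (b + c))

# Same three numbers, different order: the small term is lost in one ordering
values = [1e16, 1.0, -1e16]
reordered = [1e16, -1e16, 1.0]
print(sum(values), sum(reordered))

# math.fsum tracks exact partial sums, so both orderings agree here
print(math.fsum(values), math.fsum(reordered))


def stable(x, digits=10):
    """Round x to a fixed number of significant digits before comparing or storing."""
    return float("%.*g" % (digits, x))


# The two values reported in the question for the same row
print(stable(55159.71274162368) == stable(55159.712741623654))
```

Rounding on output does not make the aggregation deterministic; it only hides differences that are smaller than the precision you actually need.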
Assume floating point is non-deterministic:
https://randomascii.wordpress.com/2013/07/16/floating-point-determinism/
“the IEEE standard does not guarantee that the same program will
deliver identical results on all conforming systems.”

How to Add a Column in DynamoDB

Is there a way to add a new column to existing table in DynamoDB in Amazon's AWS?
Google didn't help,
UpdateTable Query in http://docs.aws.amazon.com/cli/latest/reference/dynamodb/update-table.html?highlight=update%20table doesn't have any information related to adding a new column.
DynamoDB does not require schema definition, and so there is no such thing as a "column". You can just add a new item with a new attribute.
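As a rough illustration of adding an attribute to an existing item, the sketch below builds the arguments for boto3's Table.update_item with a SET expression. The helper name add_attribute_kwargs, the table name, the key invoiceId and the attribute status are all hypothetical; the actual call is left as a comment because it needs AWS credentials.

```python
def add_attribute_kwargs(key, name, value):
    """Build keyword arguments for boto3's Table.update_item that set one attribute."""
    return {
        "Key": key,
        # Placeholders avoid clashes with DynamoDB reserved words
        "UpdateExpression": "SET #attr = :val",
        "ExpressionAttributeNames": {"#attr": name},
        "ExpressionAttributeValues": {":val": value},
        "ReturnValues": "UPDATED_NEW",
    }


kwargs = add_attribute_kwargs({"invoiceId": "inv-001"}, "status", "paid")
print(kwargs["UpdateExpression"])

# With credentials configured, this could be sent as:
#   import boto3
#   table = boto3.resource("dynamodb").Table("Invoices")  # hypothetical table
#   table.update_item(**kwargs)
```

Only the item addressed by Key gains the attribute; other items in the table are left untouched, which is what "no columns" means in practice.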
Well, let's not get dragged away in the semantic discussion about the difference between "fields" and "columns". The word "column" does remind us of relational databases, which dynamodb is not. In essence that means that dynamodb does not have foreign keys.
Dynamodb does have "primary partition keys" and "index partition keys" though, just as with relational databases. (Although there are other strategies as well)
You do need to respect those keys when you add data. But aside from those requirements, you don't have to predefine your fields (except for those partition keys mentioned earlier).
Assuming that you are new to this, some additional good practices:
Add a numeric field to each record to store a timestamp in epoch seconds. DynamoDB's optional Time to Live (TTL) feature for cleaning up old items requires a numeric attribute of this kind in your data.
DynamoDB has no native date type, so you have to store dates as numeric fields or as strings. Given the previous remark, you may prefer a numeric type for them.
Don't store big documents in it, because there is a maximum item size of 400 KB, and a single BatchGetItem call returns at most 16 MB (Query and Scan return at most 1 MB per page). Fortunately, AWS has S3 storage and other kinds of databases (e.g. DocumentDB).
There are many strategies for table keys:
If you only declare the partition-key, then it acts like a primary key (e.g. partition-key=invoiceId). That's fine.
If your object has a parent reference. (e.g. invoices have a customer), then you probably want to add a sort-key. (e.g. partition-key=customerId;sort-key=invoiceId) Together they behave like a composed key. The advantage is that you can do a lookup using both keys, or just using the partition-key. (e.g. request a specific invoice for a specific customer, or all invoices for that customer)
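The two key strategies above can be sketched as the KeySchema structure that DynamoDB's CreateTable API expects, plus the key condition strings for the two kinds of lookup. The helper names key_schema and key_condition are made up for this example; customerId and invoiceId are the names used above.

```python
def key_schema(partition_key, sort_key=None):
    """Return a KeySchema list: HASH is the partition key, RANGE the optional sort key."""
    schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    if sort_key:
        schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
    return schema


def key_condition(partition_key, sort_key=None):
    """Return a KeyConditionExpression string for a query on one or both keys."""
    condition = "%s = :pk" % partition_key
    if sort_key:
        condition += " AND %s = :sk" % sort_key
    return condition


# Partition key only: behaves like a primary key
print(key_schema("invoiceId"))
# Partition key plus sort key: behaves like a composite key
print(key_schema("customerId", "invoiceId"))
# All invoices for a customer, or one specific invoice
print(key_condition("customerId"))
print(key_condition("customerId", "invoiceId"))
```

A query must always include an equality condition on the partition key; the sort key condition is what narrows the result to a single invoice.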
I installed NoSQL Workbench, connected to an existing DynamoDB table, and tried to update an existing item by adding a new attribute.
In my case, Workbench only let me add a new attribute with one of these types: "SS", "NS", "BS" (String Set, Number Set, Binary Set).
In Workbench, we can generate code for the chosen operation.
I scanned my DynamoDB table and added the new attribute with type "SS" to each item. Then I scanned again and changed the newly added attribute to type "S", in order to create a global secondary index (GSI) with the primary key "pk2NewAttr".
NoSQL Workbench related video - https://www.youtube.com/watch?v=Xn12QSNa4RE&feature=youtu.be&t=3666
Example in Python "how to scan all dynamodb Table" - https://gist.github.com/pgolding
You can achieve the same by doing the following,
Open Dynamodb and click on the tables option in the left sidebar menu.
Search your table by name and click on your table
Now select the orange button named Explore table items
Scroll down and Click on Create item
Now you will see an editor with JSON Value, Click on Form button on the right side to add a new column and its type.
Note: this will insert 1 new record and you can see now the new column as well.
A way to add a new column to existing table in DynamoDB in Amazon's AWS:
We can store the values in DynamoDb in 2 ways,
(i) With an RDBMS-like structure, we can add a new column by running the same write command that created the existing records, with the new column included in the item. DynamoDB allows some records/rows to have values for certain columns while other records have no value for them.
(ii) With a NoSQL-like structure, we store a JSON string within a single column to hold all the attributes we need. Here we generate the JSON string with the new attribute added and insert it into the same column.
This script will either delete a record, or add a ttl field. You might want to tailor it to your column name and remove the delete stuff.
Usage:
usage: add_ttl.py [-h] --profile PROFILE [-d] [--force_delete_all] [-v] [-q]

Procedurally modify DynamoDB

optional arguments:
  -h, --help          show this help message and exit
  --profile PROFILE   AWS profile name
  -d, --dryrun        Dry run, take no action
  --force_delete_all  Delete all records, including valid, unexpired
  -v, --verbose       set loglevel to DEBUG
  -q, --quiet         set loglevel to ERROR
Script:
#!/usr/bin/env python3
# pylint:disable=duplicate-code
import argparse
import logging
import sys
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from functools import cached_property
from typing import Dict, Optional
import boto3
from dateutil.parser import isoparse
from tqdm import tqdm
LOGGER = logging.getLogger(__name__)
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"
LOG_FORMAT = "[%(asctime)s] %(levelname)s:%(name)s:%(message)s"
def setup_logging(loglevel=None, date_format=None, log_format=None):
    """Setup basic logging.
    Args:
        loglevel (int): minimum loglevel for emitting messages
    """
    logging.basicConfig(
        level=loglevel or logging.INFO,
        stream=sys.stdout,
        format=log_format or LOG_FORMAT,
        datefmt=date_format or DATE_FORMAT,
    )
def parse_args():
    """
    Extract the CLI arguments from argparse
    """
    parser = argparse.ArgumentParser(description="Procedurally modify DynamoDB")
    parser.add_argument(
        "--profile",
        help="AWS profile name",
        required=True,
    )
    parser.add_argument(
        "-d",
        "--dryrun",
        action="store_true",
        default=False,
        help="Dry run, take no action",
    )
    parser.add_argument(
        "--force_delete_all",
        action="store_true",
        default=False,
        help="Delete all records, including valid, unexpired",
    )
    parser.add_argument(
        "-v",
        "--verbose",
        dest="loglevel",
        help="set loglevel to DEBUG",
        action="store_const",
        const=logging.DEBUG,
    )
    parser.add_argument(
        "-q",
        "--quiet",
        dest="loglevel",
        help="set loglevel to ERROR",
        action="store_const",
        const=logging.ERROR,
    )
    return parser.parse_args()
def query_yes_no(question, default="yes"):
    """Ask a yes/no question via input() and return their answer.
    "question" is a string that is presented to the user.
    "default" is the presumed answer if the user just hits <Enter>.
        It must be "yes" (the default), "no" or None (meaning
        an answer is required of the user).
    The "answer" return value is True for "yes" or False for "no".
    """
    valid = {"yes": True, "y": True, "ye": True, "no": False, "n": False}
    if default is None:
        prompt = " [y/n] "
    elif default == "yes":
        prompt = " [Y/n] "
    elif default == "no":
        prompt = " [y/N] "
    else:
        raise ValueError("invalid default answer: '%s'" % default)
    while True:
        sys.stdout.write(question + prompt)
        choice = input().lower()
        if default is not None and choice == "":
            return valid[default]
        if choice in valid:
            return valid[choice]
        sys.stdout.write("Please respond with 'yes' or 'no' " "(or 'y' or 'n').\n")
@dataclass
class Table:
    """Class that wraps dynamodb and simplifies pagination as well as counting."""

    region_name: str
    table_name: str
    _counter: Optional[Counter] = None

    def __str__(self):
        out = "\n" + ("=" * 80) + "\n"
        for key, value in self.counter.items():
            out += "{:<20} {:<2}\n".format(key, value)
        return out

    def str_table(self):
        keys = list(self.counter.keys())
        # Set the names of the columns.
        fmt = "{:<20} " * len(keys)
        return f"\n\n{fmt}\n".format(*keys) + f"{fmt}\n".format(
            *list(self.counter.values())
        )

    @cached_property
    def counter(self):
        if not self._counter:
            self._counter = Counter()
        return self._counter

    @cached_property
    def client(self):
        return boto3.client("dynamodb", region_name=self.region_name)

    @cached_property
    def table(self):
        dynamodb = boto3.resource("dynamodb", region_name=self.region_name)
        return dynamodb.Table(self.table_name)

    @property
    def items(self):
        # scan() returns one page at a time; follow LastEvaluatedKey to get the rest
        response = self.table.scan()
        self.counter["Fetched Pages"] += 1
        data = response["Items"]
        with tqdm(desc="Fetching pages") as pbar:
            while "LastEvaluatedKey" in response:
                response = self.table.scan(
                    ExclusiveStartKey=response["LastEvaluatedKey"]
                )
                self.counter["Fetched Pages"] += 1
                data.extend(response["Items"])
                pbar.update(500)
        self.counter["Fetched Items"] = len(data)
        return data

    @cached_property
    def item_count(self):
        response = self.client.describe_table(TableName=self.table_name)
        count = int(response["Table"]["ItemCount"])
        self.counter["Total Rows"] = count
        return count
def delete_item(table, item):
    return table.table.delete_item(
        Key={
            "tim_id": item["tim_id"],
        }
    )
def update_item(table: Table, item: Dict, ttl: int):
    return table.table.update_item(
        Key={"tim_id": item["tim_id"]},
        UpdateExpression="set #t=:t",
        ExpressionAttributeNames={
            "#t": "ttl",
        },
        ExpressionAttributeValues={
            ":t": ttl,
        },
        ReturnValues="UPDATED_NEW",
    )
def main():
    setup_logging()
    args = parse_args()
    if not query_yes_no(
        f"Performing batch operations with {args.profile}; is this correct?"
    ):
        sys.exit(1)
    sys.stdout.write(f"Setting up connection with {args.profile}\n")
    boto3.setup_default_session(profile_name=args.profile)
    table = Table(region_name="us-west-2", table_name="TimManager")
    # timezone-aware "now" in UTC (utcnow() returns a naive datetime)
    now = datetime.now(timezone.utc).replace(microsecond=0)
    buffer = timedelta(days=7)
    # #TODO list comprehension
    to_update = []
    to_delete = []
    for item in tqdm(table.items, desc="Inspecting items"):
        ttl_dt = isoparse(item["delivery_stop_time"])
        if ttl_dt > now - buffer and not args.force_delete_all:
            to_update.append(item)
        else:
            to_delete.append(item)
    table.counter["Identified for update"] = len(to_update)
    table.counter["Identified for delete"] = len(to_delete)
    table.counter["Performed Update"] = 0
    table.counter["Performed Delete"] = 0
    if to_update and query_yes_no(
        f"Located {len(to_update)} records to update with {args.profile}"
    ):
        for item in tqdm(to_update, desc="Updating items"):
            if not args.dryrun:
                ttl_dt = isoparse(item["delivery_stop_time"])
                response = update_item(table, item, int((ttl_dt + buffer).timestamp()))
                if response.get("ResponseMetadata", {}).get("HTTPStatusCode") == 200:
                    table.counter["Performed Update"] += 1
    if to_delete and query_yes_no(
        f"Located {len(to_delete)} records to delete with {args.profile}"
    ):
        for item in tqdm(to_delete, desc="Deleting items"):
            if not args.dryrun:
                response = delete_item(table, item)
                # count a delete only once, after it succeeded
                if response.get("ResponseMetadata", {}).get("HTTPStatusCode") == 200:
                    table.counter["Performed Delete"] += 1
    sys.stdout.write(str(table))
if __name__ == "__main__":
    main()

Python library to access a CalDAV server

I run ownCloud on my webspace for a shared calendar. Now I'm looking for a suitable python library to get read only access to the calendar. I want to put some information of the calendar on an intranet website.
I have tried http://trac.calendarserver.org/wiki/CalDAVClientLibrary but it always returns a NotImplementedError with the query command, so my guess is that the query command doesn't work well with the given library.
What library could I use instead?
I recommend the library, caldav.
Read-only is working really well with this library and looks straight-forward to me. It will do the whole job of getting calendars and reading events, returning them in the iCalendar format. More information about the caldav library can also be obtained in the documentation.
import caldav
# replace the placeholder strings with your own server details
client = caldav.DAVClient(url="<caldav-url>", username="<username>",
                          password="<password>")
principal = client.principal()
for calendar in principal.calendars():
    for event in calendar.events():
        ical_text = event.data
From here on you can use the icalendar library to read specific fields such as the type (e.g. event, todo, alarm), name, times, etc. A good starting point may be this question.
I wrote this code a few months ago to fetch data from CalDAV and present it on my website.
I convert the data to JSON, but you can do whatever you want with it.
I have added some print calls so you can see the output; remove them in production.
from datetime import datetime
import json
from pytz import UTC # timezone
import caldav
from icalendar import Calendar, Event
# CalDAV info
url = "YOUR CALDAV URL"
userN = "YOUR CALDAV USERNAME"
passW = "YOUR CALDAV PASSWORD"
client = caldav.DAVClient(url=url, username=userN, password=passW)
principal = client.principal()
calendars = principal.calendars()
if len(calendars) > 0:
    calendar = calendars[0]
    print ("Using calendar", calendar)
    results = calendar.events()
    eventSummary = []
    eventDescription = []
    eventDateStart = []
    eventdateEnd = []
    eventTimeStart = []
    eventTimeEnd = []
    for eventraw in results:
        # .data holds the raw iCalendar text of the event
        event = Calendar.from_ical(eventraw.data)
        for component in event.walk():
            if component.name == "VEVENT":
                print (component.get('summary'))
                eventSummary.append(component.get('summary'))
                print (component.get('description'))
                eventDescription.append(component.get('description'))
                startDate = component.get('dtstart')
                print (startDate.dt.strftime('%m/%d/%Y %H:%M'))
                eventDateStart.append(startDate.dt.strftime('%m/%d/%Y'))
                eventTimeStart.append(startDate.dt.strftime('%H:%M'))
                endDate = component.get('dtend')
                print (endDate.dt.strftime('%m/%d/%Y %H:%M'))
                eventdateEnd.append(endDate.dt.strftime('%m/%d/%Y'))
                eventTimeEnd.append(endDate.dt.strftime('%H:%M'))
                dateStamp = component.get('dtstamp')
                print (dateStamp.dt.strftime('%m/%d/%Y %H:%M'))
                print ('')
    # Modify or change these values based on your CalDAV
    # Converting to JSON (only the first event is used here)
    data = [{ 'Events Summary':eventSummary[0], 'Event Description':eventDescription[0],'Event Start date':eventDateStart[0], 'Event End date':eventdateEnd[0], 'At:':eventTimeStart[0], 'Until':eventTimeEnd[0]}]
    data_string = json.dumps(data)
    print ('JSON:', data_string)
pyOwnCloud could be the right thing for you. I haven't tried it, but it should provide a CMDline/API for reading the calendars.
You probably want to provide more details about how you are actually using the API, but in case the query command is indeed not implemented, there is a list of other Python libraries at the CalConnect website (archived version, original link is dead now).

Time range query in Mongo db

I'm using django-nonrel and MongoDB to develop an app. I know that an ObjectId starts with a timestamp of the object's creation time, so it's possible to do a time range query based on the _id field.
How can I generate a minimal ObjectId for a given time in Python or Django?
Here is a much more pythonic version of the other answer here provided by OP, along with documentation:
from bson.objectid import ObjectId
import datetime

def datetime_to_objectid(dt):
    # ObjectId is a 12-byte BSON type, constructed using:
    # a 4-byte value representing the seconds since the Unix epoch,
    # a 3-byte machine identifier,
    # a 2-byte process id, and
    # a 3-byte counter, starting with a random value.
    timestamp = int((dt - datetime.datetime(1970,1,1)).total_seconds())
    # zero-pad to 8 hex digits so the timestamp always fills 4 bytes
    time_bytes = format(timestamp, '08x') #4 bytes
    return ObjectId(time_bytes+'00'*8) #+8 bytes
However, starting with version 1.6 of pymongo, it would be much more elegant to do the following:
from bson.objectid import ObjectId
ObjectId.from_datetime(dt)
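To show what such a minimal ObjectId looks like, here is a standard-library-only sketch that builds its 24-character hex form: 8 hex digits of epoch seconds followed by 16 zeros. The function name min_objectid_hex is made up for illustration, and the pymongo usage in the trailing comment is an assumption about how you would apply it to a hypothetical collection.

```python
from datetime import datetime, timezone


def min_objectid_hex(dt):
    """Return the hex string of the smallest ObjectId for dt (must be timezone-aware)."""
    seconds = int(dt.timestamp())
    # 4 timestamp bytes (8 hex digits) + 8 zero bytes (16 hex digits)
    return format(seconds, "08x") + "0" * 16


start = min_objectid_hex(datetime(2012, 1, 5, tzinfo=timezone.utc))
end = min_objectid_hex(datetime(2012, 1, 6, tzinfo=timezone.utc))
print(start, end)

# Hypothetical usage with pymongo, selecting documents created on 2012-01-05:
#   from bson.objectid import ObjectId
#   collection.find({"_id": {"$gte": ObjectId(start), "$lt": ObjectId(end)}})
```

Because the timestamp occupies the leading bytes, ordering the hex strings is the same as ordering by creation time at one-second resolution.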
from bson.objectid import ObjectId
import time

def get_minimal_object_id_for_int_timestamp(int_timestamp=None):
    if not int_timestamp:
        int_timestamp=int(time.time())
    return ObjectId(hex(int(int_timestamp))[2:]+'0000000000000000')

def get_int_timestamp_from_time_string(time_string=None):
    # format "YYYY-MM-DD hh:mm:ss" like '2012-01-05 13:01:51'
    if not time_string:
        return int(time.time())
    return int(time.mktime(time.strptime(time_string, '%Y-%m-%d %H:%M:%S')))

def get_minimal_object_id_for_time_string(time_string=None):
    return get_minimal_object_id_for_int_timestamp(get_int_timestamp_from_time_string(time_string=time_string))
I finally found the solution. Hope it helps others.