Is it possible to store string data in an RRD database?
For example, if I want to store:
Employee Name
Employee Address
Employee Phone
Employee MAC Addr
in an RRD database using py-rrdtool. If it is possible, can anyone give me a road map for how to do it?
rrdtool stores time-series numerical data only. Use SQLite, or in Python maybe even something as simple as pickle, to store your text data. If the data is related to an RRD file, store it in a pickle with a different extension right next to the corresponding RRD file.
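As a minimal Python sketch of that idea (the file names and fields here are hypothetical):

    import pickle

    # Hypothetical metadata for the employee whose metrics live in employee_1234.rrd
    metadata = {
        "name": "Jane Doe",
        "address": "1 Example Street",
        "phone": "555-0100",
        "mac_addr": "00:11:22:33:44:55",
    }

    # Store the text data right next to the RRD file, under a different extension.
    with open("employee_1234.meta", "wb") as f:
        pickle.dump(metadata, f)

    # Later, load it back alongside the corresponding RRD file.
    with open("employee_1234.meta", "rb") as f:
        metadata = pickle.load(f)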
I am reading JSON files from GCS and I have to load the data into different BigQuery tables. These files may have multiple records for the same customer with different timestamps, and I have to pick the latest among them for each customer. I am planning to achieve this as below:
Read files
Group by customer id
Apply a DoFn to compare the timestamps of the records in each group and keep only the latest one.
Flatten it, convert to table rows, and insert into BQ.
But I am unable to proceed with step 1. I see GroupByKey.create(), but I am unable to make it use the customer ID as the key.
I am implementing this in Java. Any suggestions would be of great help. Thank you.
Before you GroupByKey you need to have your dataset in key-value pairs. It would be good if you had shown some of your code, but without knowing much, you'd do the following:
PCollection<JsonObject> objects = p.apply(FileIO.read(....)).apply(FormatData...);

// Once we have the data in JsonObjects, we key by customer ID:
PCollection<KV<String, Iterable<JsonObject>>> groupedData =
    objects
        .apply(MapElements
            .into(TypeDescriptors.kvs(TypeDescriptors.strings(),
                                      TypeDescriptor.of(JsonObject.class)))
            .via((JsonObject elm) -> KV.of(elm.getString("customerId"), elm)))
        .apply(GroupByKey.create());
Once that's done, you can check timestamps and discard all but the most recent, as you were planning.
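For illustration, the per-group selection that the DoFn needs to perform boils down to the following (sketched in plain Python for brevity, with hypothetical field names; in your pipeline this logic would live inside a Java DoFn):

    # Keep only the latest record per customer, comparing by timestamp.
    # ISO-8601 timestamps compare correctly as plain strings.
    def latest_record(records):
        return max(records, key=lambda r: r["timestamp"])

    grouped = {
        "cust-1": [
            {"timestamp": "2019-01-01T00:00:00Z", "value": 1},
            {"timestamp": "2019-02-01T00:00:00Z", "value": 2},
        ],
    }
    latest = {cust: latest_record(recs) for cust, recs in grouped.items()}
    # {'cust-1': {'timestamp': '2019-02-01T00:00:00Z', 'value': 2}}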
Note that you will need to set coders, etc.; if you get stuck with that, we can iterate.
As a hint / tip, you can consider this example of a Json Coder.
I need to find the best approach for determining the structure of my Django app's models at runtime, based on the structure of an uploaded CSV file; the models will then be held constant once they are created in Django.
I have come across several questions relating to dynamically creating/altering Django models at runtime. The consensus was that this is bad practice and one should know beforehand what the fields are.
I am creating a site where a user can upload a time-series based CSV file with many columns representing sensor channels. The user must then be able to select a field to plot the corresponding data of that field. The data will be approximately 1 billion rows.
Essentially, I am seeking to code the following steps, but information is scarce and I have never done a job like this before:
User selects a CSV (or DAT) file.
The app then loads in only the header row (these files are > 4 GB).
The header row is split by ",".
I use the results from step 3 to create a table for each channel (column), with the field name the same as the individual header entry for that specific channel.
I then load the corresponding data into the respective tables, and I have my models for my app, which will then not be changed again.
Another option I am considering is creating a model with 10 fields, as I know there will never be more than 10 channels, then reading my CSV into that table when a user loads a file and just leaving the unused fields empty.
Has anyone had experience with similar applications?
That is a lot of records; I have never worked with so many. For performance, the fixed-fields idea sounds best. If you use PostgreSQL you could look at the JSON field, but I don't know its impact at that row count.
For flexible models you could use the EAV pattern, but in my experience that works only for small data sets.
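A minimal sketch of the fixed-fields idea (model, field, and file names are hypothetical, assuming at most 10 channels as you describe):

    import csv
    from django.db import models

    class SensorReading(models.Model):
        # One row per timestamped sample; unused channels stay NULL.
        timestamp = models.DateTimeField(db_index=True)
        channel_1 = models.FloatField(null=True, blank=True)
        channel_2 = models.FloatField(null=True, blank=True)
        # ... channel_3 through channel_9 defined the same way ...
        channel_10 = models.FloatField(null=True, blank=True)

    class ChannelLabel(models.Model):
        # Maps a header entry from the uploaded CSV to one of the fixed columns.
        column_index = models.PositiveSmallIntegerField()  # 1..10
        header_name = models.CharField(max_length=255)

    def read_header(path):
        # Reads only the header row, so files > 4 GB are never fully loaded.
        with open(path, newline="") as f:
            return next(csv.reader(f))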
I am not sure if this relates to "bitwise". I store a music file that can be provided in different formats, e.g. MP3, WAV, MIDI... I need to store the provided formats in the DB. One solution is to create an individual DB field/column for each format, e.g. withMP3, withWav, withMidi... But each time I add one more format, I need to create an extra column.
Is there any standard solution for storing the formats in one field? For example, the first bit stores MP3, the second bit stores WAV... When I add one more file format, it just appends one more bit to the data, with no need to add a new column. I am not sure which topic this question relates to. Hope that someone can help me.
Many thanks!!
Turn that data into its own table (id, format, blob); then you can associate those rows with the rows in the other table via another table. That way the schema is independent of the number of formats.
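A rough sketch of that schema, shown with Python's built-in sqlite3 (table and column names are hypothetical, and the association is simplified here to a direct foreign key):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE song (
            id    INTEGER PRIMARY KEY,
            title TEXT NOT NULL
        );
        -- One row per stored file; a new format needs no schema change.
        CREATE TABLE song_file (
            id      INTEGER PRIMARY KEY,
            song_id INTEGER NOT NULL REFERENCES song(id),
            format  TEXT NOT NULL,    -- e.g. 'mp3', 'wav', 'midi'
            data    BLOB NOT NULL
        );
    """)
    conn.execute("INSERT INTO song (id, title) VALUES (1, 'Example Tune')")
    conn.execute(
        "INSERT INTO song_file (song_id, format, data) VALUES (?, ?, ?)",
        (1, "mp3", b"\x00\x01"),  # placeholder bytes stand in for file data
    )
    formats = [row[0] for row in
               conn.execute("SELECT format FROM song_file WHERE song_id = 1")]
    print(formats)  # ['mp3']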
I'm not sure why you are trying to store this information as fields. I would just store the MIME type; that is normally enough information for a typical database.
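For instance, Python's standard library can derive the MIME type from a file name (the exact strings may vary slightly by platform):

    import mimetypes

    for name in ("song.mp3", "song.wav", "song.mid"):
        print(name, "->", mimetypes.guess_type(name)[0])
    # song.mp3 -> audio/mpeg
    # song.wav -> audio/x-wav
    # song.mid -> audio/midi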
I currently have a Postgres database in which I store data about a photo, along with its location as JSON (using Django). The location is obtained through the GooglePlacesAPI:
https://developers.google.com/places/documentation/search (search for Search Responses for example responses)
So currently, every photo has a location column that contains the JSON information for the place as obtained from the GooglePlacesAPI.
Now I would like to use Postgres' spatial capabilities to query based on the location, but I am not sure how to do that and what schema changes are required. The Postgres documentation seems to indicate that a new table would be required with the location's name, lat, lng, and other information. So does that mean that every location will be saved in a different table and will have a foreign key referencing it?
And so the JSON will need to be essentially flattened to be stored in the table?
If so, is there a recommended table format for storing the location, so that I can take location data from other sources (say Foursquare, FB, etc.) and convert it to the table's format before storing?
Geometry is not special: it's just another data type. So add a geometry column to your existing table. Assuming you have installed and enabled PostGIS 2.x:
ALTER TABLE mytable ADD COLUMN geom geometry(Point,4326);
Then populate the geometry data by extracting the data out of the location JSON column (which really depends on how the data are structured within this amorphous column):
UPDATE mytable SET
  geom = ST_SetSRID(ST_MakePoint((location->>'lng')::numeric,
                                 (location->>'lat')::numeric), 4326);
And that should be a start. Later steps would be to build GiST indexes and do spatial queries with other tables or points of interest. You may also want to consider the geography data type instead of the geometry data type, depending on your needs.
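As a usage sketch once the column is populated (the connection string, table name, and coordinates below are placeholders; assumes the psycopg2 driver), this finds photos within 1 km of a point:

    import psycopg2

    conn = psycopg2.connect("dbname=mydb")  # placeholder connection string
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT id
            FROM mytable
            WHERE ST_DWithin(
                geom::geography,   -- cast so the distance is in metres
                ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography,
                1000)
            """,
            (-122.4194, 37.7749),  # (lng, lat) of the search centre
        )
        nearby_ids = [row[0] for row in cur.fetchall()]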
I have a Microsoft Foundation Class (MFC) CMap object where each instance stores ~160K entries of long data.
I need to store it in an Oracle SQL database.
We decided to save it as a BLOB, since we do not want to create an additional table. We thought about saving it as a local file and pointing the SQL column to that file, but we'd rather just keep it as a BLOB on the server and clear the table every couple of weeks.
The table has a sequential key ID and two date/time columns. I need to add the BLOB column in order to store the CMap object.
Can you recommend a guide for doing this? How do I create a BLOB column in Oracle, and how can I read and write my map object to it? Or should I perhaps use a CLOB?
A CMap cannot be written into a BLOB/CLOB directly, since it contains pointers. First of all, use a CLOB, and serialize the entries into an array/vector (or a delimited string) instead of storing the raw CMap.
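The general pattern, sketched in Python with the python-oracledb driver just for brevity (the table, column, and connection details are hypothetical; in MFC you would build the serialized string from the CMap's key/value pairs yourself):

    import json
    import oracledb  # driver choice is illustrative only

    entries = {"key1": 42, "key2": 7}  # stands in for the CMap's contents
    payload = json.dumps(entries)      # flatten to a plain string for the CLOB

    conn = oracledb.connect(user="scott", password="tiger", dsn="localhost/XEPDB1")
    with conn.cursor() as cur:
        # Assumes: CREATE TABLE cmap_dump (id NUMBER PRIMARY KEY, data CLOB)
        cur.execute("INSERT INTO cmap_dump (id, data) VALUES (:1, :2)",
                    [1, payload])
        conn.commit()

        # Read it back and rebuild the map.
        cur.execute("SELECT data FROM cmap_dump WHERE id = :1", [1])
        (lob,) = cur.fetchone()
        restored = json.loads(lob.read() if hasattr(lob, "read") else lob)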