Populating search results with meta data in Amazon CloudSearch - amazon-web-services

Unfortunately, Amazon CloudSearch does not support nested JSON, meaning that the below document structure is not valid.
[{
"type": "add",
"id": 1,
"fields": {
"company_name": "My Company",
"services": [
{
"id": 123,
"name": "Construction",
"logo": "logo1.png"
},
{
"id": 456,
"name": "Programming",
"logo": "logo2.png"
}
]
}
}]
Basically I cannot nest an array of objects under the services key. In this particular scenario, only the nested name field has to be searchable, so what I could do is the following:
[{
"type": "add",
"id": 1,
"fields": {
"company_name": "My Company",
"services": [ "Construction", "Programming" ]
}
}]
The above JSON is valid, and I can still search for the service names. However, now I have lost some meta data about my services that I need when displaying the search results. Is there any way in which I can add the meta data to the document in Amazon CloudSearch and have it returned with my search results, such that I can use it when displaying the results?
Or do I have to fetch this additional meta data from my database afterwards to populate the search results with the additional data required to display the results? This does not seem feasible because it complicates my code much more than if I could fetch this data straight from CloudSearch. This would also impact the performance of the search, even though I could use caching - but I kind of want to avoid that if possible, because I don't need it for anything else right now.
So my questions are:
Can I somehow add the meta data for services to the CloudSearch documents and have it returned with my search results?
If not, should I then extract this data from my data store upon receiving the search results from CloudSearch?
Do you have any other solutions or ideas? Are there any best practices with this?
Thank you in advance!
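One workaround that might fit, sketched here in Python under some assumptions: keep the searchable names in the services field and store each service's full metadata as a JSON string in a second array field that is only returned with the results, never searched. The field name services_meta and both helper functions are hypothetical, and the sketch assumes the domain has an array-of-strings field with that name configured as returnable.

```python
import json


def flatten_company(company_id, company_name, services):
    """Build a flat 'add' document from a list of nested service objects."""
    return {
        "type": "add",
        "id": company_id,
        "fields": {
            "company_name": company_name,
            # Searchable values only.
            "services": [s["name"] for s in services],
            # Hypothetical return-only field: one JSON string per service.
            "services_meta": [json.dumps(s, sort_keys=True) for s in services],
        },
    }


def services_from_hit(fields):
    """Decode the metadata strings returned with a search hit."""
    return [json.loads(raw) for raw in fields.get("services_meta", [])]


services = [
    {"id": 123, "name": "Construction", "logo": "logo1.png"},
    {"id": 456, "name": "Programming", "logo": "logo2.png"},
]
doc = flatten_company(1, "My Company", services)
print(json.dumps([doc], indent=2))

# Round trip: what the display code would do with a returned hit.
restored = services_from_hit(doc["fields"])
```

The trade-off is that the metadata is duplicated in the index and has to be re-uploaded whenever a logo or name changes, but the result page can be rendered without a second lookup.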

Related

Best way to load 1MM JSON records into AWS Redshift with Kinesis Firehose?

I've got a bunch of JSON records that I want to add to an Amazon Redshift instance from S3, via Kinesis Firehose. It's several hundred files, give or take, that have 1,000 or so records each, and each file looks like the below sample. For my purposes, I don't care about the info entry, at least for now. I have a working Kinesis Firehose service that can update my Redshift DB with the sample stock ticker data, so that part is OK. My questions are (and hopefully this shouldn't actually be split into two different posts):
This is in large part a learning exercise, so if it's overkill for what I'm trying to do, that's OK. If there's a reason it's actually a bad idea, let me know.
If I want to just ignore the info field, do I have to use a Lambda to strip it, or is there a way to do that without one? If so, are there any tricks that wouldn't be the same as writing a script to process from a regular textfile? As I'm typing this I realize I could probably just put info in the DB and never touch it, but if there's a reason not to do that, or a cleaner way than that, I'd appreciate hearing it.
When I have individual manufacturers with a set of features, and there could be dozens of features per manufacturer, does it make sense to make a separate DB table for features, or am I coming at it from a Python dict/Perl hash perspective that doesn't make sense for a SQL DB when I need to tie them back together later?
Sample:
{
"info": {
"generated_on": "2022-08-09 19:25:34",
"version": "v1"
},
"manufacturer": [
{
"name": "Audi",
"id": 1,
"num_features": 2,
"features": [
{
"name": "seat heaters",
"standard": "N",
"cost": 100
},
{
"name": "A/C",
"standard": "Y",
"cost": 0
}
]
},
{
"name": "BMW",
"id": 2,
"num_features": 3,
"features": [
{
"name": "seat heaters",
"standard": "Y",
"cost": 0
},
{
"name": "backup camera",
"standard": "N",
"cost": 500
},
{
"name": "A/C",
"standard": "Y",
"cost": 0
}
]
}
]
}
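For the last two questions, here is a rough Python sketch of one way the file could be reshaped before loading: the info entry is simply not copied, and manufacturers and features become two flat row sets tied together by the manufacturer id. The function names and the manufacturer_id column are assumptions made for illustration, not something Firehose or Redshift requires.

```python
import json


def split_records(payload):
    """Return (manufacturer_rows, feature_rows); the 'info' entry is ignored."""
    manufacturers, features = [], []
    for m in payload.get("manufacturer", []):
        manufacturers.append(
            {"id": m["id"], "name": m["name"], "num_features": m["num_features"]}
        )
        for f in m.get("features", []):
            features.append(
                {
                    # Hypothetical foreign key back to the manufacturer row.
                    "manufacturer_id": m["id"],
                    "name": f["name"],
                    "standard": f["standard"],
                    "cost": f["cost"],
                }
            )
    return manufacturers, features


def to_ndjson(rows):
    """Serialize rows as one JSON object per line."""
    return "\n".join(json.dumps(r) for r in rows)


sample = {
    "info": {"generated_on": "2022-08-09 19:25:34", "version": "v1"},
    "manufacturer": [
        {
            "name": "Audi",
            "id": 1,
            "num_features": 2,
            "features": [
                {"name": "seat heaters", "standard": "N", "cost": 100},
                {"name": "A/C", "standard": "Y", "cost": 0},
            ],
        }
    ],
}

manufacturers, features = split_records(sample)
print(to_ndjson(manufacturers))
print(to_ndjson(features))
```

The same function body could run inside a Lambda transform or as a one-off script over the files; the split itself does not depend on where it runs.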

How to POST logical type decimal values for Avro records to Confluent Rest Proxy?

We are using the Confluent Rest Proxy to communicate with Kafka and need to test a variety of data. We are using the Rest Proxy to allow a vendor to communicate with our Kafka system.
One of our fields in the Avro schema has a logical type of decimal. To keep this simple, let's assume the schema shown here:
{
"fields": [
{
"name": "fieldName",
"type": "string"
},
{
"name": "amount",
"type": {
"logicalType": "decimal",
"precision": 16,
"scale": 2,
"type": "bytes"
}
}
],
"name": "Sample",
"namespace": "com.test.sample",
"type": "record"
}
It's easy enough to write to the topic via a Java producer, using Avro Tools to produce the appropriate class files. But when attempting to use the Rest Proxy, we have to pass values such as this:
{"value_schema_id":132,"records": [{"value":{"fieldName":"Field Name","amount":"\u0001ã"}}]}
This was copied from a record created via the Java producer and then downloaded from the topic. But in the amount field, we'd like to be able to pass a value such as 123.45. We're using Postman for the most part to send data. Is there a way to do this with a logical decimal field and without having to create and serialize the data first to see the representation such as \u0001ã?
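If serializing by hand is acceptable, the string form can be computed outside of Java. The sketch below, in Python, assumes the Rest Proxy expects Avro's JSON encoding for bytes (each byte written as the character with the same code point) and that the decimal is stored as the big-endian two's-complement unscaled value. The helper name decimal_to_avro_json is hypothetical, and the schema's precision of 16 is not checked here.

```python
import json
from decimal import Decimal


def decimal_to_avro_json(value, scale):
    """Encode a Decimal as the JSON string form of an Avro 'bytes' decimal."""
    unscaled = value.scaleb(scale)
    if unscaled != unscaled.to_integral_value():
        raise ValueError(f"{value} has more than {scale} decimal places")
    n = int(unscaled)
    # Minimal big-endian two's-complement length, leaving room for the sign bit.
    length = ((n + (n < 0)).bit_length() + 8) // 8
    raw = n.to_bytes(length, byteorder="big", signed=True)
    # Each byte becomes the character with the same code point (0-255).
    return raw.decode("latin-1")


payload = {
    "value_schema_id": 132,
    "records": [
        {
            "value": {
                "fieldName": "Field Name",
                "amount": decimal_to_avro_json(Decimal("123.45"), 2),
            }
        }
    ],
}
# json.dumps escapes non-ASCII characters, so the body is safe to paste into Postman.
print(json.dumps(payload))
```

As a sanity check, passing Decimal("4.83") with scale 2 yields the same two characters as the "\u0001ã" value copied from the topic above.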

AWS-Console: DynamoDB scan on nested field

I have the below table in DynamoDB:
{
"id": 1,
"user": {
"age": "26",
"email": "testuser#gmail.com",
"name": "test user"
}
}
Using the AWS console, I want to scan for all the records whose email address contains gmail.com.
I am trying this but it is giving no results.
I am new to AWS, not sure what's wrong here. Is it not possible to scan on nested fields?
I've been trying to figure this out myself, but it seems that nested item scans are not supported through the console.
I'm going by this thread, which offers some alternative options via the CLI or SDK: https://forums.aws.amazon.com/thread.jspa?messageID=931016
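For the SDK route, a minimal Python sketch of the scan parameters might look like the following. It assumes user is stored as a map with a string email attribute; the table name "users" and the helper build_email_scan are hypothetical. The attribute names go through placeholders because user is a DynamoDB reserved word.

```python
def build_email_scan(table_name, fragment):
    """Build low-level Scan parameters that filter on the nested user.email."""
    return {
        "TableName": table_name,
        # Document path user.email, written with name placeholders.
        "FilterExpression": "contains(#u.#e, :fragment)",
        "ExpressionAttributeNames": {"#u": "user", "#e": "email"},
        "ExpressionAttributeValues": {":fragment": {"S": fragment}},
    }


params = build_email_scan("users", "gmail.com")  # "users" is a placeholder name
print(params)

# With boto3 installed and credentials configured, the call would be:
#   import boto3
#   response = boto3.client("dynamodb").scan(**params)
#   items = response["Items"]
```

A filter expression is applied after the items are read, so this still scans the whole table; it only narrows what is returned.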

How to set variable Request format in Amazon Api Gateway?

I want to create a model for a request in which part of the structure may vary.
Since I don't have a uniform structure here, how can I define a JSON model for Amazon API Gateway?
Request:
Here the data inside items.{index}.data changes according to type_id. We also can't be sure which item with a particular type_id will come at which {index}. Even the type of items.{index}.data may change.
{
"name":"Jon Doe",
"items": [
{
"type_id":2,
"data": {
"km": 10,
"fuel": 20
}
},
{
"type_id": 5,
"data": [
{
"id":1,
"value":2
},
.....
]
},{
"type_id": 3,
"data": "data goes here"
},
....
]
}
How should I do this?
API Gateway uses JSON schema for model definitions. You can use a union datatype to represent your data object. See this question for an example of such a datatype.
Please note that a data model such as this will pose problems for generating SDKs. If you need SDK support for strictly typed languages, you may want to reconsider this data model.
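As a rough illustration of the union idea, the Python sketch below builds a draft-04 style schema in which data may be an object, an array, or a string. It assumes "union datatype" means a JSON Schema type array; the title VariableRequest is a placeholder, and how API Gateway's validator treats this model should be checked against your own API.

```python
import json

# Hypothetical model for the request shown in the question.
request_model = {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "title": "VariableRequest",
    "type": "object",
    "required": ["name", "items"],
    "properties": {
        "name": {"type": "string"},
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["type_id", "data"],
                "properties": {
                    "type_id": {"type": "integer"},
                    # Union: any one of these JSON types is accepted for data.
                    "data": {"type": ["object", "array", "string"]},
                },
            },
        },
    },
}

# Print the schema so it can be pasted into the model editor.
print(json.dumps(request_model, indent=2))
```

This only constrains the type of data, not its shape per type_id, so the per-type checks would still have to happen in the backend.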

Facebook Objects API Queries

I previously asked some questions about this, but they were quite misleading, so I decided to delete them and create this one.
My object has a custom property; say it is named portal content. In the Facebook Graph API Explorer, the data related to this object is shown like this:
{
"created_time": "2015-11-26T08:42:26+0000",
"title": "Title",
"type": "ns:type",
"data": {
"portalcontent": "portalcontent"
},
"id": "12515125125"
},
{
"created_time": "2015-11-26T08:04:09+0000",
"title": "Title2",
"type": "ns:type",
"id": "412512512512"
},
{
"created_time": "2015-11-25T12:56:03+0000",
"title": "Title3",
"type": "ns:type",
"id": "234124124124"
}
I am trying to query this data using the Facebook Graph API, but I cannot fetch objects based only on the portal content custom property.
So far I have tried:
appid/objects?pretty=0&type=ns:type&limit=100&portalcontent=portalcontent
But it still fetches all objects.
PS: Please PM me or comment on the question to explain why you are downvoting it, and tell me what else I need to add to the question. Downvoting for no reason just annoys people.
There's currently no way to filter results other than those described on the respective endpoint's docs.
See
https://developers.facebook.com/docs/graph-api/reference/application/objects/
https://developers.facebook.com/docs/graph-api/using-graph-api/v2.5
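Given that, one fallback is to fetch the objects as before and filter them on the client. A minimal Python sketch, assuming the response has already been parsed into a list of dictionaries shaped like the sample in the question (the function name filter_by_data_property is hypothetical):

```python
def filter_by_data_property(objects, key, expected):
    """Keep only objects whose custom 'data' section has key == expected."""
    return [o for o in objects if o.get("data", {}).get(key) == expected]


# Objects as shown in the question, already decoded from JSON.
objects = [
    {
        "created_time": "2015-11-26T08:42:26+0000",
        "title": "Title",
        "type": "ns:type",
        "data": {"portalcontent": "portalcontent"},
        "id": "12515125125",
    },
    {
        "created_time": "2015-11-26T08:04:09+0000",
        "title": "Title2",
        "type": "ns:type",
        "id": "412512512512",
    },
]

matches = filter_by_data_property(objects, "portalcontent", "portalcontent")
print([o["id"] for o in matches])
```

Objects without a data section, like the second one here, are simply skipped.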